Description
Hello! I am not sure whether the current implementation is written this way for a specific reason, but on my local copy of lasso I significantly improved the reading speed of binout outputs with a single change to the as_df() method in binout.py:
Change from:
for i, j in enumerate(ids):
    df[str(j)] = data.T[i]
to:
df = pd.DataFrame(data=data, index=time_pdi, columns=[str(j) for j in ids])
The speedup is at least 3x on a small binout (~80 MB) and more than 30x on large databases (multiple GB). As an example, a colleague of mine started reading data before a 1 h meeting, and it was still loading after the meeting. I changed the code and read the data in 7 minutes with the modification above, by which time the old code was not even halfway done. This is especially useful for elout and swforc when there are many elements.
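For anyone who wants to check the equivalence before touching binout.py, here is a minimal standalone sketch of the two approaches on synthetic data. It does not use lasso: the helper names build_by_column and build_at_once are hypothetical, the time index stands in for time_pdi, and data is assumed to have shape (n_timesteps, n_ids), which is what data.T[i] in the original loop implies. The slowdown is probably because every df[...] assignment adds a separate column to the frame, although I have not profiled it.

```python
import numpy as np
import pandas as pd


def build_by_column(data, time, ids):
    """Current approach: add one column per id to an initially empty frame."""
    df = pd.DataFrame(index=pd.Index(time, name="time"))
    for i, j in enumerate(ids):
        df[str(j)] = data.T[i]  # one insertion per id
    return df


def build_at_once(data, time, ids):
    """Proposed approach: pass the whole 2D array to the constructor."""
    return pd.DataFrame(
        data=data,
        index=pd.Index(time, name="time"),
        columns=[str(j) for j in ids],
    )


if __name__ == "__main__":
    # Small synthetic stand-in for a binout variable (assumed layout:
    # rows are timesteps, columns are element ids).
    rng = np.random.default_rng(0)
    time = np.linspace(0.0, 1.0, 4)
    ids = np.arange(1, 6)
    data = rng.random((time.size, ids.size))

    df_loop = build_by_column(data, time, ids)
    df_fast = build_at_once(data, time, ids)

    # Both frames should hold the same values under the same labels.
    print(np.array_equal(df_loop.to_numpy(), df_fast.to_numpy()))
    print(list(df_loop.columns) == list(df_fast.columns))
```

Timing both helpers with a few thousand ids should show the gap; the exact factor will depend on the pandas version and the data size.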