-
Notifications
You must be signed in to change notification settings - Fork 338
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sql #22
base: master
Are you sure you want to change the base?
Sql #22
Conversation
…, added .gitignore for possible config
…subjects to have new use_db flag
Hey @Saqibm128 what's the status of this? Are you waiting on us to merge? |
Yes, I changed some of the methods to read data tables to take an optional parameter called use_db, and added a new argument to the arg parser to use-db |
Looking at this PR this week. Going to try to include in the release. |
@Saqibm128 How much faster is this? |
Faster? Do you mean execution speed? If the SQL server is on the same computer, then it goes about as fast. If it is on a separate computer and the network host of the SQL isn't the localhost, then I guess it would go much slower due to network lag. |
I haven't actually looked at this project in a while, but if execution speed for setting up the directory structure is becoming a huge bottleneck, I might want to look into using parallel threads/processes. |
Thanks for getting back so quick. And yeah I didn't really look at the code before asking and naively assumed you'd taken the joins logic and rewritten that in SQL. My guess would be that, ceteris paribus, a series of group by + join queries would be faster than the current method. But speaking of parallelism, I actually have cobbled together a rather hackish parallel extract_subjects and extract_episodes, would cleaning it up and writing a PR be worthwhile or have folks mostly moved on from actively maintaining things here? |
To be honest, I am not really involved in this project, added the SQL stuff well after they had started, and haven't been paying attention to this for awhile since, so I'm not sure at all. Using joins actually makes a lot of sense, I don't think I was thinking at all about it, surprisingly. If you have a hacky parallel script, go ahead and share! I don't know if it would be accepted but others may still find any code of any quality useful. |
Oh also just remembered they have a gitter chat room. https://gitter.im/YerevaNN/mimic3-benchmarks |
@Harhro94 Now that the 1.0 release is done, should one of us should review this and then make the merge? |
@turambar is there still value in this? should I try to fix merge conflicts, or should I close the pr? |
adds SQL support to project as per #16 and explicitly declare code sections in README.md and dependencies in requirements.txt