Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor covid_hosp auto columns #1203

Conversation

rzats
Copy link
Contributor

@rzats rzats commented Jun 22, 2023

Summary:

Several refactors to covid_hosp files on the auto columns branch:

  • Updates tests to get the number of columns directly from the YAML file - so they won't break when it changes
  • Minimizes Database subclasses in covid_hosp (down to a single property) as the information stored there can now be obtained from the dataset definition YAML file
  • Updates the base Database class to obtain the information in question
  • Removes Network subclasses and condenses their logic into the base class
  • Changes various tests as needed

Prerequisites:

  • Unless it is a documentation hotfix it should be merged against the dev branch group/covid_hosp_auto_columns_2 base
  • Branch is up-to-date with the branch to be merged with, i.e. dev
  • Build is successful
  • Code is cleaned up and formatted

Sorry, something went wrong.

Copy link
Collaborator

@melange396 melange396 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work! it makes me very happy to see redundancies disappear and silly complexities get flattened out!

one more point that didnt fit easily as an in-line comment:

you should remove the argument network from common/utils.py:update_dataset() and use Network instead. then you can change each */update.py to:

from delphi.epidata.acquisition.covid_hosp.common.utils import Utils
from delphi.epidata.acquisition.covid_hosp.XXXXXXXX.database import Database
# maybe even just:  `from .database import Database`  ???

if __name__ == "__main__":
  return Utils.update_dataset(Database)

...and then i would remove Utils.launch_if_main() because it is dumb.

@rzats
Copy link
Contributor Author

rzats commented Jun 26, 2023

@melange396 updated with all the changes you mentioned, re-requesting review!

@rzats rzats requested a review from melange396 June 26, 2023 21:38
Copy link
Collaborator

@melange396 melange396 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its looking pretty good, almost there!

@@ -7,22 +7,8 @@
# first party
from delphi.epidata.acquisition.covid_hosp.common.utils import Utils
from delphi.epidata.acquisition.covid_hosp.facility.database import Database
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this work with the relative import?

Suggested change
from delphi.epidata.acquisition.covid_hosp.facility.database import Database
from .database import Database

then the 3 copies of update.py can be virtually identical, and copied ~verbatim when adding a new dataset. (the comments at the top of these files can probably be reduced to something generic, or (particularly for the "first party" and "entry point" comments) removed entirely.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, shortening those imports fails a couple of tests. I do agree that the "first party / entry point" comments aren't needed, though! It's probably nice to know what the original dataset is called, so I left the docstring comments in.

@rzats
Copy link
Contributor Author

rzats commented Jun 30, 2023

@melange396 updated with a few caveats (see the comments above!)

@rzats rzats requested a review from melange396 June 30, 2023 14:33
Copy link
Collaborator

@melange396 melange396 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the work youve done so far on this is excellent -- we have improved flexibility and modularity, gotten away from some bad practices/paradigms, and removed a whole bunch of needless repetition. on the minus side, its made some other existing design flaws more apparent... but after this round of comments, i think the only thing left is to see how we can factor out more duplicated code in tests/acquisition/covid_hosp/***/test_database.py and integrations/acquisition/covid_hosp/***/test_scenarios.py (which will make it easier to test new datasets that we might add in the future).

@contextmanager
def connect(database_class, mysql_connector_impl=mysql.connector):
def connect(self, mysql_connector_impl=mysql.connector):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we rename this to wrapped_connection() or something? it does connect, but it disconnects also.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this matches Python's naming convention for context managers - compare:

with f = open("/etc/filename"):
   # do stuff with a file, then the context manager closes it

If you'd like to rename the method, I'd do create_connection (like socket.create_connection)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

theres a difference in that open() and socket.create_connection() can be used outside of a context manager's with syntax:

with open('file') as f:
  f.blah()

vs

f = open('file')
f.blah()
f.close()

we dont have that option here, this connect() method must be used as a context manager.

but then again, its just a name... ¯\_(ツ)_/¯

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind renaming this one either :) would prefer a verb rather than noun as a name, though; wrapped_connection feels like a class property (but wrap_connection could work fine!)

@rzats
Copy link
Contributor Author

rzats commented Jul 7, 2023

@melange396 pushed another round of changes, re-requesting review!

@rzats rzats requested a review from melange396 July 7, 2023 20:25
@rzats rzats requested a review from melange396 July 14, 2023 15:15
@rzats rzats requested a review from melange396 July 14, 2023 21:19
@rzats
Copy link
Contributor Author

rzats commented Jul 18, 2023

@melange396 applied the set of changes you requested! (with a few small tweaks to make it compile)

@rzats rzats requested a review from melange396 July 18, 2023 13:55
@rzats rzats requested a review from melange396 July 18, 2023 16:46
@sonarqubecloud
Copy link

SonarCloud Quality Gate failed.    Quality Gate failed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot E 1 Security Hotspot
Code Smell A 9 Code Smells

No Coverage information No Coverage information
0.6% 0.6% Duplication

idea Catch issues before they fail your Quality Gate with our IDE extension sonarlint SonarLint

@melange396
Copy link
Collaborator

i added a commit that renamed a class to TypeUtils because just 'Utils' is not very useful, and also removed some related unused code.

tests are nice and can help catch a lot of problems, but they arent perfect. before we call this done, can you do some actual sample acquisition runs to make sure this all still works to acquire data? we changed a lot, and we mightve missed something... for example, the src/acquisition/covid_hosp/***/update.py jobs wouldve failed if they still included the lines i just removed: from delphi.epidata.acquisition.covid_hosp.common.utils import Utils.

Copy link
Collaborator

@melange396 melange396 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good! Rostyslav did some tests and things seem to be working properly (modulo #1225)

@melange396 melange396 merged commit f62feb8 into group/covid_hosp_auto_columns_2 Jul 24, 2023
@melange396 melange396 deleted the rzatserkovnyi/covid-hosp-auto-refactor branch July 24, 2023 20:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants