This repository extends a Harvard CS109b (Advanced Topics in Data Science) project from Spring 2023 that I completed together with two other students, George Popoola and Brian Ndzuki, and it includes alterations to some of the work they contributed. In the Jupyter Notebook within this repository, the portions of the project (both code and text) completed primarily by them are marked clearly with an orange background and credited by name.
This project was made possible by the work of the Lobachevsky University Electrocardiography Database team, who collected 10-second, 12-lead ECG recordings for 200 patients with differing cardiac pathologies. Because of the dataset's large file size, and so that it stays properly attributed to its original source, it is not packaged in this repository. To run the notebook, download the dataset (v1.0.1) from the Lobachevsky University Electrocardiography Database page on PhysioNet and place the resulting directory inside this repository's directory.
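As a quick sanity check before opening the notebook, the placement step above can be verified with a short script. This is only a sketch: the folder name in DATASET_DIR_NAME is an assumption about what the PhysioNet v1.0.1 download unpacks to, and find_dataset is a hypothetical helper that is not part of the notebook. Adjust the name to match the directory your download actually produced.

```python
from pathlib import Path

# Assumption: folder name created by unpacking the PhysioNet v1.0.1 archive.
# Change this to match the directory your download actually produced.
DATASET_DIR_NAME = "lobachevsky-university-electrocardiography-database-1.0.1"


def find_dataset(repo_root=".", dir_name=DATASET_DIR_NAME):
    """Return the dataset path if it exists directly inside repo_root, else None."""
    candidate = Path(repo_root) / dir_name
    return candidate if candidate.is_dir() else None


if __name__ == "__main__":
    # Run from the repository's top-level directory.
    path = find_dataset()
    if path is None:
        print(f"Dataset not found; expected a folder named {DATASET_DIR_NAME!r} here.")
    else:
        print(f"Dataset found at {path.resolve()}")
```

If the script reports that the dataset is missing, check that the downloaded directory sits directly inside the repository rather than nested one level deeper.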
For a brief overview of the project's goals and results, please view the PDF of our final presentation (project_presentation.pdf); for a more in-depth understanding of how the project was carried out, see the Jupyter Notebook (final_notebook.ipynb).
Lastly, because this was a school project, I had access to cluster computing resources that I can unfortunately no longer use. As a result, despite my best efforts to configure my local installation of TensorFlow, rerunning some of the operations in section 3 and onward has thrown errors, which remain visible in the notebook. I'll do my best to resolve them when I have the chance, but the results shown after those errors should be accurate and reproducible if the notebook is executed in a proper environment.