Main: seminarextract.py
Please note, to run the program you must do the following:
- Set the "dir" and "dataDir" variables in seminarextract.py to the paths of your stanford-ner and nltk-data folders, respectively.
- Download GoogleNews-vectors-negative300.bin and place it in the nltk-data folder.
- Install with pip: gensim, nltk, sner (pprint ships with the Python standard library, so it normally needs no separate install).
- Ensure the NLTK stopwords corpus is downloaded, e.g. by running nltk.download("stopwords") in a Python session.
- Move the required seminar files into the "seminars_training/training" folder.
- Results are written to "seminars_training/training/results".
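
The setup steps above can be checked before running seminarextract.py. The following is only a minimal sketch and is not part of the project: the function name check_setup, its parameters, and the messages are hypothetical, and it uses only the Python standard library. It assumes the folder layout described above and does not verify that the stopwords corpus has been downloaded.

```python
import importlib.util
from pathlib import Path

# Third-party packages listed in the setup notes (pprint is in the standard library).
REQUIRED_MODULES = ("gensim", "nltk", "sner")
VECTORS_FILE = "GoogleNews-vectors-negative300.bin"


def check_setup(stanford_dir, data_dir,
                seminar_dir="seminars_training/training",
                modules=REQUIRED_MODULES):
    """Hypothetical helper: return a list of problems found in the setup.

    An empty list means every check passed. stanford_dir and data_dir are
    assumed to be the same paths assigned to "dir" and "dataDir" in
    seminarextract.py.
    """
    problems = []

    # Each required package must be importable in the current environment.
    for name in modules:
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing Python package: {name}")

    # The stanford-ner folder must exist.
    if not Path(stanford_dir).is_dir():
        problems.append(f"stanford-ner folder not found: {stanford_dir}")

    # The nltk-data folder must exist and contain the word vectors file.
    data = Path(data_dir)
    if not data.is_dir():
        problems.append(f"nltk-data folder not found: {data_dir}")
    elif not (data / VECTORS_FILE).is_file():
        problems.append(f"{VECTORS_FILE} not found in {data_dir}")

    # The seminar input folder must exist.
    if not Path(seminar_dir).is_dir():
        problems.append(f"seminar folder not found: {seminar_dir}")

    return problems
```

Calling check_setup with your own two paths and printing the returned list shows which step, if any, still needs attention.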