Keyword Extraction Project - Models

Extending the project

Adding a new method to the pipeline is straightforward: the project defines a simple API for setting up, training, and testing models.

To add a new model, follow these steps:

Add the model as a new module in the models folder, making sure it meets the requirements described in that folder. Its train and test functions should have the following signatures:

def train(dataset, arguments, lang='dutch'):
    """
    dataset: a list of all documents.
    arguments: the list of command-line arguments passed for this model.
    lang: the language of the documents.
    """

def test(text, arguments, k=5, lang='dutch'):
    """
    text: the text to extract keywords from.
    arguments: the list of command-line arguments passed for this model.
    k: the number of keywords to return.
    lang: the language of the text.
    """

In pipeline.py, import that module from the models folder.

# pipeline.py
from models import new_model

Register the model in pipeline.py: add an argparse argument alongside the existing ones, then append the model to the methods list when that argument is given:

    # line 182
    parser.add_argument(
        "--new_model",
        help="Use the new model",
        nargs='*'
    )

        # line 304
        if args.new_model is not None:
            # Give the entry its own name rather than reusing 'Ensemble'.
            methods.append({'name': 'New model',
                            'train_function': new_model.train,
                            'test_function': new_model.test,
                            'arguments': args.new_model,
                            'k': args.k,
                            'dataset_name': dataset,
                            'match_type': args.matchtype})
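The `is not None` check relies on how argparse handles nargs='*': the attribute is None when the flag is absent and a (possibly empty) list when the flag is present. The standalone sketch below demonstrates that behaviour and then shows one way a pipeline could consume the resulting methods entries. The --k option, the dummy train and test functions, and the collect_methods and run helpers are assumptions made for illustration; the project's actual pipeline.py may be organised differently.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--new_model", help="Use the new model", nargs='*')
    # Assumption: k is exposed as an integer command-line option.
    parser.add_argument("--k", type=int, default=5)
    return parser


def dummy_train(dataset, arguments, lang='dutch'):
    """Stand-in for new_model.train; does nothing."""


def dummy_test(text, arguments, k=5, lang='dutch'):
    """Stand-in for new_model.test; returns the first k words."""
    return text.split()[:k]


def collect_methods(args):
    """Build the list of method entries selected on the command line."""
    methods = []
    if args.new_model is not None:
        methods.append({'name': 'New model',
                        'train_function': dummy_train,
                        'test_function': dummy_test,
                        'arguments': args.new_model,
                        'k': args.k})
    return methods


def run(methods, train_docs, text):
    """Train each selected method, then extract keywords from text."""
    results = {}
    for method in methods:
        method['train_function'](train_docs, method['arguments'])
        results[method['name']] = method['test_function'](
            text, method['arguments'], k=method['k'])
    return results


parser = build_parser()
absent = parser.parse_args([])                       # flag not given
bare = parser.parse_args(["--new_model"])            # flag given, no values
with_values = parser.parse_args(["--new_model", "a", "b", "--k", "2"])
```

Here absent.new_model is None, so the model is skipped, while bare.new_model is an empty list, so the model is still selected and simply receives no extra arguments.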
