Recommend
- Minimizes redundancy and maximizes the diversity of results in text summarization tasks.
- First selects the keyword/keyphrase that is most similar to the document, then repeatedly selects new candidates that are similar to the document but not similar to the keywords/keyphrases already selected.
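The selection rule described above can be sketched in a few lines. This is a minimal illustration, not the repository's actual code: it reuses the `mmr(doc_embedding, candidate_embeddings, words, top_n, diversity)` signature, but the `cosine` helper, the scoring formula `(1 - diversity) * relevance - diversity * redundancy`, and the toy vectors are assumptions made for the example. It uses plain lists so it runs without extra dependencies.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(doc_embedding, candidate_embeddings, words, top_n, diversity):
    """Pick up to top_n words, trading relevance to the document against redundancy."""
    if not words or top_n <= 0:
        return []

    # Relevance of every candidate to the document.
    doc_sim = [cosine(doc_embedding, c) for c in candidate_embeddings]

    # Start with the candidate that is most similar to the document.
    selected = [max(range(len(words)), key=doc_sim.__getitem__)]
    remaining = [i for i in range(len(words)) if i != selected[0]]

    while remaining and len(selected) < top_n:

        def score(i):
            # Redundancy: highest similarity to anything already selected.
            redundancy = max(
                cosine(candidate_embeddings[i], candidate_embeddings[j])
                for j in selected
            )
            return (1 - diversity) * doc_sim[i] - diversity * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return [words[i] for i in selected]


# Toy data (assumed): "cat" and "cats" are near-duplicates, "dog" is different.
doc = [1.0, 0.0]
words = ["cat", "cats", "dog"]
embeddings = [[1.0, 0.1], [1.0, 0.12], [0.6, 0.8]]

diverse = mmr(doc, embeddings, words, top_n=2, diversity=0.7)
relevant_only = mmr(doc, embeddings, words, top_n=2, diversity=0.0)
```

With these toy vectors, a high `diversity` skips the near-duplicate "cats" in favor of "dog", while `diversity=0.0` ranks purely by similarity to the document.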
Install requirements.txt
pip install -r requirements.txt
def key_bert(user_id, database, model_ST, model_W2V, category):
- diversity: the higher the diversity, the more varied the extracted keywords.
- top_n: number of keywords to extract.
def mmr(doc_embedding, candidate_embeddings, words, top_n, diversity):
We use this W2V model to calculate the similarity between the keywords from the user's chat and the categories. It returns max_ctg, the category with the largest similarity above 0.8.
Install konlpy and soynlp
pip3 install konlpy
pip3 install soynlp
category_connect calls key_bert.
def category_connect(...):
    bert_keyword = key_bert(uid, db, model_ST, model_W2V, category)
key_bert gets keyword_mmr to use.
def key_bert(user_id, database, model_ST, model_W2V, category):
    ...
    keyword_mmr = mmr(doc_embedding, candidate_embeddings, candidates, top_n=5, diversity=0.7)
First, split the keywords obtained from KeyBERT on spaces.
Second, check that the words are in the trained model.
Third, set max_score = 0.79999 as the starting value, so that only a similarity of roughly 0.8 or more can become the maximum. A higher score means greater similarity.
Fourth, calculate the similarity using model_W2V.
Finally, return max_ctg, the category with the highest similarity to the user's keywords.
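The five steps above can be sketched as follows. This is an illustration under stated assumptions, not the code inside key_bert: the function name `match_category` is hypothetical, and the model object is assumed to support `word in model` and `model.similarity(a, b)`, as gensim's `KeyedVectors` does. `ToyVectors` is a tiny stand-in with made-up vectors so the example runs without a trained model. Skipping categories that are missing from the vocabulary is also an assumption.

```python
import math


class ToyVectors:
    """Hypothetical stand-in for a trained word-vector model."""

    def __init__(self, vectors):
        self.vectors = vectors

    def __contains__(self, word):
        return word in self.vectors

    def similarity(self, a, b):
        # Cosine similarity between the two stored vectors.
        va, vb = self.vectors[a], self.vectors[b]
        dot = sum(x * y for x, y in zip(va, vb))
        norm_a = math.sqrt(sum(x * x for x in va))
        norm_b = math.sqrt(sum(y * y for y in vb))
        return dot / (norm_a * norm_b)


def match_category(keywords, categories, model_W2V):
    """Return the category most similar to any keyword token, or None."""
    max_score = 0.79999  # step 3: only scores above this can win
    max_ctg = None
    for phrase in keywords:
        for word in phrase.split():  # step 1: split keyphrases on spaces
            if word not in model_W2V:  # step 2: skip out-of-vocabulary words
                continue
            for ctg in categories:
                if ctg not in model_W2V:  # assumption: skip unknown categories
                    continue
                score = model_W2V.similarity(word, ctg)  # step 4
                if score > max_score:
                    max_score, max_ctg = score, ctg
    return max_ctg  # step 5: best category, or None if nothing passed 0.8


# Made-up 2-d vectors purely for demonstration.
toy = ToyVectors({
    "soccer": [1.0, 0.1],
    "sports": [1.0, 0.0],
    "music": [0.0, 1.0],
    "piano": [0.9, 1.0],
})
categories = ["sports", "music"]

hit = match_category(["soccer match"], categories, toy)
below_threshold = match_category(["piano"], categories, toy)
unknown = match_category(["weather"], categories, toy)
```

Here "soccer" is close enough to "sports" to be matched, while "piano" falls below the 0.8 cutoff for both categories and "weather" is not in the vocabulary, so both return `None`. A caller would need to handle that case before using the result as a document ID.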
category_connect stores the max_ctg returned by key_bert in bert_keyword.
def category_connect(...):
    bert_keyword = key_bert(uid, db, model_ST, model_W2V, category)
Update the users list in Firebase under fav > category (bert_keyword).
def KoGPT(...):
    ...
    db.collection("fav").document(bert_keyword).update({"users": firestore.ArrayUnion([email])})
Now the user can enter the new category board.