
Below is a list of suggested projects, but you are free to choose your own topic as well.

  1. Deep fake detection challenge

  2. Textual entailment task

  3. Background news linking task from TREC 2020

4. Hyperpartisan news detection

  5. Reinforcement learning

    • In this project, students can experiment with and learn about various reinforcement learning algorithms in the arena of Atari games from OpenAI Gym.
    • This project will give students the opportunity to explore the challenges and tradeoffs associated with different algorithms.
    • Students are welcome to apply algorithms from well-known libraries, e.g. stable-baselines.
    • Resources:
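As a taste of what these libraries do under the hood, here is a minimal sketch of the core tabular Q-learning update on a toy corridor environment. The environment, hyperparameters, and reward scheme are illustrative assumptions, not part of any Atari task; a real project would use Gym environments and a library such as stable-baselines instead.

```python
import random

# Toy stand-in for an environment: a corridor of N cells.
# The agent starts in cell 0; reaching cell N - 1 ends the episode
# with reward 1, and every other step gives reward 0.
N = 6
ACTIONS = [-1, +1]  # move left / move right

def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    nxt = min(max(state + action, 0), N - 1)
    done = nxt == N - 1
    return nxt, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N)]  # Q[state][action index]
    for _ in range(episodes):
        s, done = 0, False
        for _ in range(200):  # cap episode length
            if rng.random() < eps:
                a = rng.randrange(2)  # explore
            else:
                best = max(q[s])      # exploit, random tie-break
                a = rng.choice([i for i in (0, 1) if q[s][i] == best])
            s2, r, done = step(s, ACTIONS[a])
            # Q-learning update: move Q(s, a) toward the TD target.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
            if done:
                break
    return q

q = q_learning()
# After training, "move right" should be the greedy action in every
# non-terminal state.
policy = [max((0, 1), key=lambda i: q[s][i]) for s in range(N - 1)]
print(policy)
```

The same epsilon-greedy/TD-update loop underlies deep variants like DQN, where the table is replaced by a neural network.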
  6. Stock market analysis: Finding the seasonality and trend

  • In this project you will investigate several stocks of your choosing and compare their seasonality and trend. Try choosing some stocks that are seasonal, e.g. airlines, air conditioning, or heating. Decompose the stocks you choose and discuss their seasonality and trend. Then, using one or more ARIMA models, try to predict the future price of each stock. The data comes from Yahoo Finance and can be accessed through the yahoo-finance Python library (https://pypi.org/project/yahoo-finance/).
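The decomposition step can be sketched without any finance libraries. Below is a minimal additive trend/seasonal split on a synthetic monthly series; the series itself is a made-up stand-in, and in the project you would instead load real closing prices via yahoo-finance and could use e.g. statsmodels for the decomposition and ARIMA fitting.

```python
import numpy as np

# Synthetic stand-in for a monthly price series: an upward trend plus
# a 12-month seasonal pattern plus noise.
rng = np.random.default_rng(0)
months = np.arange(120)                          # 10 years, monthly
trend = 100 + 0.5 * months                       # slow upward drift
seasonal = 5 * np.sin(2 * np.pi * months / 12)   # yearly cycle
price = trend + seasonal + rng.normal(0, 1, size=months.size)

# Classical additive decomposition: estimate the trend with a centered
# moving average, then average the detrended series by month-of-year.
period = 12
ma = np.convolve(price, np.ones(period) / period, mode="valid")
detrended = price[period // 2 : period // 2 + ma.size] - ma
est_seasonal = np.array(
    [detrended[(i - period // 2) % period :: period].mean() for i in range(period)]
)
est_seasonal -= est_seasonal.mean()  # center the seasonal component at 0

print(np.round(est_seasonal, 2))    # recovers the 12-month sine pattern
```

Once the seasonal component is removed, what remains (trend plus residual) is the part an ARIMA model would be fit to.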
  7. Watermarking neural networks

  8. Fake news detection using Snopes and PolitiFact data

  9. Detect claims to fact-check in political debates

    • In this project you will implement various classifiers, using both neural and feature-based techniques, to detect which sentences in political debates should be fact-checked.
    • Dataset from ClaimBuster: https://zenodo.org/record/3609356
    • Evaluate your classifiers using the same metrics as http://ranger.uta.edu/~cli/pubs/2017/claimbuster-kdd17-hassan.pdf (Table 2)
      • sklearn's classification_report provides all of these metrics
      • Merge crowdsourced.csv and groundtruth.csv into one dataset. Use the debates from 1960-2008 for training (the first 27 debates) and those from 2012-2016 for testing (the last 6 debates)
      • Create a baseline model. It should be a fairly simple one, e.g. an SVM, random forest, or logistic regression using TF-IDF or other features of your choice. Aim for a weighted-average F1 of 60% or more
      • Create one or more advanced models (suggestions are given below)
        • Generate more features for the model to use, for example the context around the sentence, sentiment, named entities, etc.
        • A rule-based classifier, for example one that checks whether a sentence contains certain words, tags, statistics, etc.
        • Deep learning (word embeddings, transformer models, etc.)
        • A sub-sentence classifier. Long sentences may contain several claims, so the goal is to mark the span of each claim within a sentence
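A baseline of the kind described above can be sketched in a few lines with sklearn. The sentences below are toy stand-ins, not ClaimBuster data; in the project the train/test texts would come from the merged crowdsourced.csv and groundtruth.csv files, split by debate year as described.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# Toy stand-in data: 1 = check-worthy claim, 0 = not check-worthy.
train_texts = [
    "Unemployment fell to four percent last year",
    "Crime rates have doubled since 2010",
    "Thank you all for being here tonight",
    "Let me tell you a story about my family",
    "We spent 300 billion dollars on defense",
    "I am so glad to see this wonderful crowd",
]
train_labels = [1, 1, 0, 0, 1, 0]
test_texts = [
    "Exports grew by ten percent in 2016",
    "It is an honor to be on this stage",
]
test_labels = [1, 0]

# TF-IDF features + logistic regression: the simple baseline the
# project description asks for.
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
clf = LogisticRegression(max_iter=1000)
clf.fit(vectorizer.fit_transform(train_texts), train_labels)
preds = clf.predict(vectorizer.transform(test_texts))

# classification_report prints per-class precision/recall/F1 and the
# weighted-average F1 used as the target metric.
report = classification_report(test_labels, preds, zero_division=0)
print(report)
```

The advanced models then reuse this train/test split and evaluation, swapping in richer features, rules, or a transformer in place of the TF-IDF + logistic regression pipeline.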

More topics coming!