- Details of GPT-2: https://jalammar.github.io/illustrated-gpt2
- Composed of Transformer decoder layers
- Auto-regressive model
- Like traditional language models such as RNNs
- Outputs one token at a time
- Step 1. After each token is produced, that token is appended to the sequence of inputs
- Step 2. That new sequence then becomes the model's input at the next step (see the sketch after this list)
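The loop below is a minimal sketch of this auto-regressive generation, assuming the Hugging Face `transformers` library and the public `gpt2` checkpoint; the greedy decoding, prompt text, and 20-token budget are illustrative choices rather than part of the original notes.

```python
# Minimal auto-regressive generation loop with GPT-2 (sketch).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer.encode("The Transformer decoder", return_tensors="pt")

with torch.no_grad():
    for _ in range(20):
        logits = model(input_ids).logits                 # (1, seq_len, vocab_size)
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # greedy pick
        # Steps 1-2: append the produced token, then feed the new sequence back in
        input_ids = torch.cat([input_ids, next_token], dim=-1)

print(tokenizer.decode(input_ids[0]))
```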
- KoGPT2: a Korean version of the GPT-2 series (see the loading sketch below)
- Pretrained on Korean Wikipedia, news articles, and other Korean corpora
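A minimal loading sketch, assuming the `skt/kogpt2-base-v2` checkpoint that the SKT-AI/KoGPT2 project publishes on the Hugging Face Hub; the special-token settings follow that repository's documented usage, and the Korean prompt is illustrative.

```python
# Load KoGPT2 and generate a short Korean continuation (sketch).
import torch
from transformers import GPT2LMHeadModel, PreTrainedTokenizerFast

tokenizer = PreTrainedTokenizerFast.from_pretrained(
    "skt/kogpt2-base-v2",
    bos_token="</s>", eos_token="</s>", unk_token="<unk>",
    pad_token="<pad>", mask_token="<mask>",
)
model = GPT2LMHeadModel.from_pretrained("skt/kogpt2-base-v2")
model.eval()

# Illustrative prompt: "오늘 날씨는" ("Today's weather is")
input_ids = tokenizer.encode("오늘 날씨는", return_tensors="pt")
gen_ids = model.generate(
    input_ids,
    max_length=64,
    repetition_penalty=2.0,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(gen_ids[0]))
```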
https://www.kaggle.com/datasets/ninetyninenewton/kr3-korean-restaurant-reviews-with-ratings
@article{GPT-2,
  title={Language Models are Unsupervised Multitask Learners},
  author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya},
  journal={OpenAI blog},
  year={2019}
}
https://github.com/openai/gpt-2
https://github.com/SKT-AI/KoGPT2
https://www.kaggle.com/code/ninetyninenewton/zero-shot-sentiment-classification-using-gpt-2/notebook
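One common way to do zero-shot sentiment classification with GPT-2 (the linked notebook may differ in its details) is to score each candidate label by the log-probability the language model assigns to it as a continuation of a prompt, then pick the higher-scoring label. The prompt wording, labels, and example review below are illustrative assumptions.

```python
# Zero-shot sentiment classification by label log-likelihood (sketch).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def label_logprob(review: str, label: str) -> float:
    """Sum of log-probabilities of the label tokens given the prompt."""
    prompt_ids = tokenizer.encode(f"Review: {review}\nSentiment:", return_tensors="pt")
    label_ids = tokenizer.encode(label, return_tensors="pt")
    input_ids = torch.cat([prompt_ids, label_ids], dim=-1)
    with torch.no_grad():
        log_probs = torch.log_softmax(model(input_ids).logits, dim=-1)
    offset = prompt_ids.shape[1]
    total = 0.0
    for i in range(label_ids.shape[1]):
        # logits at position (offset + i - 1) predict the token at (offset + i)
        total += log_probs[0, offset + i - 1, label_ids[0, i]].item()
    return total

review = "The food was amazing and the staff were friendly."
scores = {lab: label_logprob(review, lab) for lab in [" positive", " negative"]}
print(max(scores, key=scores.get), scores)
```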