
Pretrained Chatbot

General Process & Overview

This project was created for the Congressional App Challenge. LLMs and transformers are an integral part of AI today. After working with numerous types of neural networks, I expanded to transformers for various text-based tasks, from classification to text generation. Additionally, the transformer architecture sidesteps the age-old vanishing gradient problem that plagues recurrent networks.

To create this project, I imported the Qwen3-4B model from Hugging Face. Then, using a dataset that I also found on Hugging Face, I fine-tuned the model, updating its weights and biases.
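
In code, the fine-tuning flow looked roughly like the sketch below. This is a minimal sketch assuming the Hugging Face transformers and datasets libraries; the dataset name is a placeholder, not the one actually used in this project.

from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Load the pretrained base model and its tokenizer from Hugging Face.
model_name = "Qwen/Qwen3-4B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Placeholder dataset with a "text" column; swap in the real dataset name.
dataset = load_dataset("some/chat-dataset", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

# Fine-tune with the standard causal-LM objective (mlm=False).
trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="qwen3-4b-finetuned",
        num_train_epochs=1,
        per_device_train_batch_size=1,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()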

Complications

An LLM/transformer that has been pretrained and posted on Hugging Face has already undergone intense training on extremely large datasets. Unfortunately, fine-tuning such a model on a much smaller dataset is more likely to hinder its performance than help it. This proved evident while I fine-tuned the model: not only did training take an absurd amount of time (15-17 hours), but the result also performed worse than the original pretrained model. Regardless, this project gave me experience fine-tuning a pretrained Hugging Face model, and ideas for how to improve its accuracy in future projects.

Tech Stack

Python · PyTorch · Hugging Face Transformers (Qwen3-4B)

How to Use

git clone https://github.com/aakashvishcoder/Pretrained-Chatbot.git
cd Pretrained-Chatbot
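
Once cloned, you can chat with the model along the lines of the sketch below. This is a minimal inference sketch assuming the transformers library is installed; point model_name at your fine-tuned checkpoint directory if you have one.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Use the base model, or the path to a fine-tuned checkpoint.
model_name = "Qwen/Qwen3-4B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Format a chat turn with the model's chat template and generate a reply.
messages = [{"role": "user", "content": "Hello! What can you do?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))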
