This project is part of the course Topics in Deep Learning (UE17CS338) taken at PES University during my 6th semester (Spring 2020).
GANs, short for Generative Adversarial Networks, are a class of deep learning models. They consist of two neural networks that compete with each other, and this competition improves both networks simultaneously.
GANs are an approach to generative modeling. Conventional generative modeling is unsupervised: a network discovers and learns the patterns and regularities in the data so that it can generate outputs that appear to have been drawn from the original data space. A GAN instead frames this generation task as a supervised learning problem.
A GAN consists of two networks - the Generator, which produces new output, and the Discriminator, which tries to classify output as real (drawn from the training data) or fake (generated). The generator improves by moving from producing pure noise toward producing samples close to the real data distribution. The discriminator improves by becoming better at telling real samples apart from generated ones.
Once the generator and discriminator are sufficiently trained and the discriminator can no longer differentiate the real from the fake, the training process is complete. The generator can then be used independently to generate output.
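The adversarial objective described above can be sketched as the standard GAN losses: the discriminator tries to score real samples near 1 and generated samples near 0, while the generator tries to push the discriminator's score on its samples toward 1 (the non-saturating form). This is a minimal NumPy sketch of the loss computation only, not of the full training loop:

```python
import numpy as np

def d_loss(d_real, d_fake):
    # Discriminator maximizes log D(x) + log(1 - D(G(z))),
    # i.e. minimizes the negative of that quantity.
    return -np.mean(np.log(d_real) + np.log(1.0 - d_fake))

def g_loss(d_fake):
    # Non-saturating generator loss: maximize log D(G(z)).
    return -np.mean(np.log(d_fake))

# A confident, correct discriminator has low loss:
print(d_loss(np.array([0.9]), np.array([0.1])))  # ~0.211
# A generator that fools the discriminator half the time:
print(g_loss(np.array([0.5])))                   # ~0.693
```

In practice each loss is minimized in alternation with an optimizer such as Adam, updating only the corresponding network's weights at each step.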
This project is an implementation of the paper Generative Adversarial Text to Image Synthesis. It is a TensorFlow-based implementation, and the text descriptions are encoded using Skip-Thought Vectors. The image below is a representation of the model architecture.
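In this architecture the caption embedding conditions the generator: the skip-thought vector (4800-dimensional for the combine-skip model) is projected down to a smaller learned representation and concatenated with the noise vector before being fed to the generator. The sketch below shows only the shape bookkeeping of that conditioning step; the dimensions (`z_dim=100`, `proj_dim=128`) follow the paper's setup, and the random projection matrix stands in for a learned layer:

```python
import numpy as np

batch, z_dim, emb_dim, proj_dim = 8, 100, 4800, 128

z = np.random.randn(batch, z_dim)            # noise vector
caption_emb = np.random.randn(batch, emb_dim)  # skip-thought caption embedding
W = np.random.randn(emb_dim, proj_dim) * 0.01  # stand-in for a learned projection

# Project the 4800-d caption embedding down, then concatenate with the noise
projected = np.maximum(0.0, caption_emb @ W)   # nonlinearity applied after projection
g_input = np.concatenate([z, projected], axis=1)
print(g_input.shape)  # (8, 228)
```

The discriminator conditions on the text in a similar way, concatenating the projected embedding with its convolutional image features before the final classification layer.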
Training:
Testing:
Improvements:
- Train for a greater number of epochs to obtain better results.
- Train the model on the MS-COCO dataset to generate more generic images.
- Try different embedding options for captions (other than skip-thought vectors). Also try training the caption-embedding RNN jointly with the GAN-CLS model.
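The GAN-CLS variant mentioned above extends the plain discriminator loss with a matching-aware term: besides {real image, matching text} and {fake image, matching text}, the discriminator also scores {real image, mismatched text} as fake, which forces it to check image-text correspondence rather than image realism alone. A minimal NumPy sketch of that three-term loss, following Algorithm 1 of the paper:

```python
import numpy as np

def gan_cls_d_loss(s_real, s_wrong, s_fake):
    # s_real: scores for real images with matching captions
    # s_wrong: scores for real images with mismatched captions
    # s_fake: scores for generated images with matching captions
    # The two "fake" terms (mismatched text, generated image) are averaged.
    return -np.mean(np.log(s_real)
                    + 0.5 * (np.log(1.0 - s_wrong) + np.log(1.0 - s_fake)))

# A discriminator that correctly rejects both failure modes has low loss:
print(gan_cls_d_loss(np.array([0.9]), np.array([0.1]), np.array([0.1])))  # ~0.211
```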