deep-learning-programming

Assignments for the 'Deep Learning Programming' lecture (autumn 2024).

Each lab contains:

- pre-report: a summary of the relevant papers
- ipynb files: the algorithm implemented in PyTorch
- final report: an analysis of the results

All reports are written in English except the pre-report of Lab1.

We use LaTeX (Overleaf) and the BMVC paper template to write the reports.

index

Lab1: Implementing the VGGNet and ResNet architectures
- VGGNet: Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition, 2015.
- ResNet: Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition, 2015.
Lab2: Object detection by implementing the YOLO architecture
- YOLO: Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. You only look once: Unified, real-time object detection, 2015.
Lab3: Object detection by implementing the Faster R-CNN architecture
- Faster R-CNN: Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks, 2016.
Lab4: Semantic segmentation by implementing Fully Convolutional Networks
- Fully Convolutional Network: Jonathan Long, Evan Shelhamer, and Trevor Darrell. Fully convolutional networks for semantic segmentation, 2015.
- Learning Deconvolution Network: Hyeonwoo Noh, Seunghoon Hong, and Bohyung Han. Learning deconvolution network for semantic segmentation, 2015.
Lab5: Implementing the Vision Transformer (ViT) model
- Vision Transformer: Alexey Dosovitskiy et al. An image is worth 16x16 words: Transformers for image recognition at scale, 2021.
Lab6: Style Transfer
- Style Transfer: L. A. Gatys, A. S. Ecker and M. Bethge, "Image Style Transfer Using Convolutional Neural Networks," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 2414-2423, doi: 10.1109/CVPR.2016.265.
Lab7: Implementing Gradient-weighted Class Activation Mapping (Grad-CAM)
- CAM: Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. Learning deep features for discriminative localization, 2015.
- Grad-CAM: Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. Grad-CAM: Visual explanations from deep networks via gradient-based localization. International Journal of Computer Vision, 128(2):336-359, October 2019. ISSN 1573-1405. doi: 10.1007/s11263-019-01228-7.
Lab8: Implementing CLIP
- CLIP: Alec Radford et al. Learning transferable visual models from natural language supervision, 2021.
  - ref: Simple Implementation of OpenAI CLIP model: A Tutorial
  - code ref: https://github.com/moein-shariatnia/OpenAI-CLIP
Lab9: Stable Diffusion fine-tuning with LoRA
- Stable Diffusion: Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models, 2022.
- LoRA: Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models, 2021.

minchoCoin/deep-learning-programming
