SandalQuest - AI/ML Hackathon Project

ML-Fiesta: AI/ML Hackathon

International Institute of Information Technology (IIIT), Bangalore

Project Overview

The goal is to utilize AI and machine learning to develop a pipeline that processes and understands audio content related to sandalwood cultivation. We are focusing on creating:

An Automatic Speech Recognition (ASR) model for the Kannada language.
A speech-based question-answering system to help users access information from the audio data.

Project Objectives:

Develop an ASR model that accurately recognizes colloquial Kannada speech.
Create a searchable audio database using the ASR output.
Implement a question-answering system allowing users to ask questions via speech input.

Problem Statement

Karnataka is a key region for sandalwood, which holds significant cultural, religious, and economic value in India. However, much of the traditional knowledge around sandalwood cultivation is conveyed informally and captured in audio recordings. These resources are not easily accessible, and there's a need to digitize and preserve this indigenous knowledge. The main challenge is handling colloquial Kannada speech with background noise, as it differs from standard formal language.

Challenges:

Limited digital information on sandalwood cultivation.
Audio recordings often contain noise and informal language.
Standard ASR models struggle with colloquial language.

Scope

The project includes:

Building a Kannada ASR model for colloquial language recognition.
Creating a searchable database by transcribing audio files.
Developing a speech-based question-answering system to query the audio corpus.
Fine-tuning the ASR model using both provided and publicly available Kannada datasets.

Out of Scope:

Processing audio in languages other than Kannada.
Real-time transcription of live streams.
Handling complex multi-turn dialogues.

Dataset Description

The dataset consists of Kannada audio files focused on sandalwood cultivation:

Source: Audio files scraped from YouTube.
Content Type: Informal Kannada speech, possibly with background noise.
Format: Common audio formats like MP3.

Dataset Challenges:

Noisy recordings made in public spaces.
Informal and colloquial language use.
Variations in pronunciation and dialects.

Functional Requirements

Task 1: Speech Recognition

Develop an ASR model for Kannada speech from audio files.
Handle informal and colloquial speech.
Support model fine-tuning with additional datasets.

Key Features:

Kannada speech transcription to text.
Noise reduction and speech enhancement.
Accommodate dialect variations.

Code:

Task 1 Folder

More Details:

Task 1 Details

Task 2: Speech-based Question-Answering System

Allow users to ask questions via speech input.
Convert the spoken question to text using the ASR model.
Search the transcribed audio data for relevant answers.
Return the most relevant audio segment as the answer.

Key Features:

Accurate answer retrieval from speech queries.
Efficient search and indexing of the transcribed corpus.
User-friendly query and response interface.

Code:

Task 2 Folder

More Details:

Task 2 Details

Pipeline Architecture

Technical Requirements

Languages: Python.
Libraries & Frameworks: PyTorch, Whisper (ASR), Hugging Face Transformers.
Database: MongoDB.
Deployment: Google Cloud Platform (GCP) or AWS for scalable processing.
Hardware: GPU servers for training, cloud instances for deployment.
Tools: Google Colab, Jupyter Notebooks, GitHub.

Risks & Mitigation

Risk	Mitigation
Low-quality audio data	Use noise reduction and data augmentation
Poor ASR accuracy on colloquial speech	Fine-tune model with additional colloquial data
High query processing latency	Optimize search algorithms and indexing

Conclusion

This project aims to document and provide access to indigenous knowledge of sandalwood cultivation through advanced ASR and NLP technologies. By building a robust pipeline, we will not only aid conservation but also provide valuable insights for users interested in sandalwood cultivation.

Team Details

Team Name: Code Wizards

Team Members:

Shani Sinojiya (Team Lead / AI/ML & Backend Developer)
Mohammad Anas Africawala (AI/ML Engineer)
Tisha Patel (Full Stack Developer)

Project Links

License

SandalQuest by Shani Sinojiya is licensed under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International

References

Whisper: A Speech Recognition Framework by OpenAI.
Hugging Face Transformers for NLP models.
Google Colab for collaborative coding.
MongoDB for database management.
PyTorch for deep learning models.
GitHub for version control and collaboration.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
Task 1		Task 1
Task 2		Task 2
assets		assets
README.MD		README.MD

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SandalQuest - AI/ML Hackathon Project

ML-Fiesta: AI/ML Hackathon

International Institute of Information Technology (IIIT), Bangalore

Project Overview

Project Objectives:

Problem Statement

Challenges:

Scope

Out of Scope:

Dataset Description

Dataset Challenges:

Functional Requirements

Task 1: Speech Recognition

Key Features:

Code:

More Details:

Task 2: Speech-based Question-Answering System

Key Features:

Code:

More Details:

Pipeline Architecture

Technical Requirements

Risks & Mitigation

Conclusion

Team Details

Team Name: Code Wizards

Team Members:

Project Links

License

References

About

Languages

Shani-Sinojiya/SandalQuest

Folders and files

Latest commit

History

Repository files navigation

SandalQuest - AI/ML Hackathon Project

ML-Fiesta: AI/ML Hackathon

International Institute of Information Technology (IIIT), Bangalore

Project Overview​

Project Objectives:​

Problem Statement​

Challenges:​

Scope

Out of Scope:​

Dataset Description​

Dataset Challenges:

Functional Requirements

Task 1: Speech Recognition

Key Features:

Code:

More Details:

Task 2: Speech-based Question-Answering System

Key Features:

Code:

More Details:

Pipeline Architecture

Technical Requirements

Risks & Mitigation

Conclusion

Team Details

Team Name: Code Wizards

Team Members:

Project Links

License

References

About

Topics

Resources

Stars

Watchers

Forks

Languages

Project Overview

Project Objectives:

Problem Statement

Challenges:

Out of Scope:

Dataset Description