NYPD Conversation Analysis with OpenAI Whisper

Overview

This code uses OpenAI's speech recognition model, Whisper, to transcribe NYPD radio communications and then analyzes the resulting text. The primary objective is to detect potential threats or concerns within the communication channel.

Functionality

The code tokenizes the transcribed text with the Natural Language Toolkit (NLTK) and assigns each message a threat category, such as terrorism, violence, or cybercrime, based on a predefined keyword list.
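A minimal sketch of keyword-based categorization follows. The keyword sets and the `categorize` function are hypothetical, since the project's actual list is not shown here. The sketch uses NLTK's regex-based `wordpunct_tokenize`, which needs no downloaded corpora, and falls back to an equivalent regular expression when NLTK is not installed.

```python
import re

try:
    # Regex-based NLTK tokenizer; unlike word_tokenize it needs no downloaded data.
    from nltk.tokenize import wordpunct_tokenize
except ImportError:
    # Fallback so the sketch still runs where NLTK is not installed.
    def wordpunct_tokenize(text):
        return re.findall(r"\w+|[^\w\s]+", text)

# Hypothetical keyword sets; the project's real list is not shown in this README.
THREAT_KEYWORDS = {
    "TERRORISM": {"bomb", "explosive", "hostage"},
    "VIOLENCE": {"shooting", "stabbing", "assault"},
    "CYBERCRIME": {"hacking", "ransomware", "phishing"},
}


def categorize(text):
    """Return the first category whose keywords appear in text, else 'REGULAR'."""
    tokens = {token.lower() for token in wordpunct_tokenize(text)}
    for category, keywords in THREAT_KEYWORDS.items():
        if tokens & keywords:  # at least one keyword is present
            return category
    return "REGULAR"
```

Exact token matching is the simplest choice; it misses inflected forms such as "bombs", which stemming or lemmatization could address.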

1. Audio Capture (`grabber.py`)

This thread captures the audio stream of the NYPD public channel for the 109th and 111th Precincts from the Broadcastify URL https://broadcastify.cdnstream1.com/27526. Captured audio is stored as mp3 files in the grabbed folder.
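One way such a capture loop could look, using only the standard library. This is a sketch, not the contents of `grabber.py`: the function names, the chunk size, and the `NYPD_{timestamp}.mp3` file name are assumptions.

```python
import time
import urllib.request
from pathlib import Path

STREAM_URL = "https://broadcastify.cdnstream1.com/27526"


def save_chunk(stream, out_dir, max_bytes, timestamp=None):
    """Read up to max_bytes from a file-like stream into out_dir/NYPD_{timestamp}.mp3."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    timestamp = int(time.time()) if timestamp is None else timestamp
    target = out_dir / f"NYPD_{timestamp}.mp3"  # assumed naming scheme
    remaining = max_bytes
    with open(target, "wb") as fh:
        while remaining > 0:
            data = stream.read(min(8192, remaining))
            if not data:  # stream ended
                break
            fh.write(data)
            remaining -= len(data)
    return target


def grab_forever(url=STREAM_URL, out_dir="grabbed", max_bytes=500_000):
    """Cut the live stream into consecutive mp3 files until it ends."""
    with urllib.request.urlopen(url) as stream:
        while True:
            chunk = save_chunk(stream, out_dir, max_bytes)
            if chunk.stat().st_size == 0:  # nothing was read; stop and clean up
                chunk.unlink()
                break
```

Cutting at a fixed byte count is a simplification; a chunk may start or end in the middle of an mp3 frame.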

2. Speech-to-Text Recognition (`recognizer.py`)

Using OpenAI Whisper, this thread converts speech to text. For each mp3 file in the grabbed folder, it creates a corresponding metadata file named NYPD_{timestamp}.json and places it in the recognized folder.
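The transcription step might be sketched as below. `whisper.load_model` and `model.transcribe` are the standard calls of the `openai-whisper` package; the helper names, the `"base"` model size, and the JSON fields are assumptions, not the actual format written by `recognizer.py`.

```python
import json
from pathlib import Path


def metadata_path(mp3_path, recognized_dir="recognized"):
    """Map grabbed/NYPD_{timestamp}.mp3 to recognized/NYPD_{timestamp}.json."""
    return Path(recognized_dir) / (Path(mp3_path).stem + ".json")


def write_metadata(mp3_path, text, recognized_dir="recognized"):
    """Store the transcript next to the name of the audio file it came from."""
    target = metadata_path(mp3_path, recognized_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Hypothetical fields; the real metadata layout is not documented here.
    record = {"audio": Path(mp3_path).name, "text": text.strip()}
    target.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return target


def recognize_all(grabbed_dir="grabbed", recognized_dir="recognized", model_name="base"):
    """Transcribe every mp3 that does not yet have a metadata file."""
    import whisper  # imported lazily: openai-whisper pulls in PyTorch

    model = whisper.load_model(model_name)
    for mp3_path in sorted(Path(grabbed_dir).glob("*.mp3")):
        if metadata_path(mp3_path, recognized_dir).exists():
            continue  # already transcribed
        result = model.transcribe(str(mp3_path))
        write_metadata(mp3_path, result["text"], recognized_dir)
```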

3. Communication Analysis (`analyzer.py`)

This thread uses NLTK to analyze the text content from recognized messages, categorizing them into threat categories. If the category is 'REGULAR', the corresponding mp3 file and its metadata are removed. However, if a threat is identified, both files are moved to the alert folder, signaling the presence of a threat.
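The delete-or-move rule described above could be sketched as follows. The function name `route_message` and its signature are hypothetical; only the 'REGULAR' category and the alert folder come from the description.

```python
import shutil
from pathlib import Path


def route_message(mp3_path, json_path, category, alert_dir="alert"):
    """Delete both files for 'REGULAR', otherwise move them to alert_dir.

    Returns True when the message was treated as an alert.
    """
    files = [Path(mp3_path), Path(json_path)]
    if category == "REGULAR":
        for path in files:
            path.unlink(missing_ok=True)  # requires Python 3.8+
        return False
    alert = Path(alert_dir)
    alert.mkdir(parents=True, exist_ok=True)
    for path in files:
        shutil.move(str(path), str(alert / path.name))
    return True
```

Moving the audio and its metadata together keeps every alert self-contained: the transcript and the recording it came from end up side by side.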

How to Use

Install the required dependencies using requirements.txt.

pip install -r requirements.txt

Then run the script:

python main.py
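The README describes each stage as a thread, but how `main.py` starts them is not shown. One plausible arrangement, with hypothetical names throughout, runs each stage in a loop on its own thread and uses a shared event to stop them all:

```python
import threading
import time


def run_stage(step, stop_event, interval=0.01):
    """Call step() repeatedly until stop_event is set."""
    while not stop_event.is_set():
        step()
        stop_event.wait(interval)  # sleep, but wake early on shutdown


def start_pipeline(steps, stop_event, interval=0.01):
    """Start one daemon thread per named stage and return the threads."""
    threads = []
    for name, step in steps.items():
        thread = threading.Thread(
            target=run_stage,
            args=(step, stop_event, interval),
            name=name,
            daemon=True,
        )
        thread.start()
        threads.append(thread)
    return threads
```

Here `steps` would map names such as "grabber", "recognizer", and "analyzer" to functions that each perform one pass of their stage.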

Troubleshooting

Before running this script, ensure that FFmpeg is properly installed on your system, as OpenAI Whisper relies on it to read mp3 files. Visit the official FFmpeg website and download the appropriate version for your operating system. Then add the folder containing the ffmpeg executable to your PATH so the system can find it.
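A quick way to confirm that FFmpeg is reachable before starting the pipeline is to look it up on the PATH. The helper names below are illustrative and not part of the project.

```python
import shutil


def find_ffmpeg(executable="ffmpeg"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


def require_ffmpeg():
    """Raise a clear error early instead of failing later inside Whisper."""
    path = find_ffmpeg()
    if path is None:
        raise RuntimeError("ffmpeg not found on PATH; install it before running main.py")
    return path
```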

Get in touch

Feel free to ask me on Twitter if you have any questions. Twitter: https://twitter.com/dmytro_sazonov
