157 changes: 157 additions & 0 deletions 2_openai/community_contributions/emmy_news_summariser/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,157 @@
# 🧠 Multimodal Agent News Summarizer

An AI-powered system that **aggregates news**, **summarizes using GPT-4o-mini**, and **creates audio briefings** with **MiniMax TTS**.
Built with the **official OpenAI Agents SDK** for real autonomous decision-making.

---

## 🚀 Features

* 🗞️ **Multi-Source Aggregation** – Fetches and merges RSS feeds from multiple topics
* 🧠 **AI Summarization** – GPT-4o-mini produces concise (≈300 words) audio-optimized briefs
* 🔊 **Text-to-Speech** – MiniMax TTS converts summaries to high-quality MP3
* 🤖 **Autonomous Agents** – Agents decide *when and how* to use tools
* ⚡ **Async/Await** – Fully asynchronous for speed and scalability
* 🎨 **Modern Gradio UI** – Simple blue-themed interface
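
As a rough illustration of the async aggregation idea, the sketch below fetches several feeds concurrently with `asyncio.gather`, skips any feed that fails, and caps the merged list. It is not the project's code: `fetch_feed` here is a stand-in that fabricates three entries per URL instead of making HTTP requests, the `bad:` prefix is a made-up way to simulate a failure, and `return_exceptions=True` is used in place of the try/except found in the project's own fetcher.

```python
import asyncio


async def fetch_feed(url: str) -> list[dict]:
    # Stand-in for a real HTTP fetch: each feed yields three fake entries.
    await asyncio.sleep(0)
    if url.startswith("bad:"):  # hypothetical marker used to simulate a failure
        raise ConnectionError(f"cannot reach {url}")
    return [{"title": f"{url} #{i}"} for i in range(3)]


async def aggregate(urls: list[str], limit: int = 15) -> list[dict]:
    # return_exceptions=True returns a failed feed's exception as a value
    # instead of raising it, so results from the other feeds are kept.
    results = await asyncio.gather(
        *(fetch_feed(u) for u in urls), return_exceptions=True
    )
    articles = [a for r in results if not isinstance(r, BaseException) for a in r]
    return articles[:limit]


if __name__ == "__main__":
    print(len(asyncio.run(aggregate(["a", "bad:x", "b"]))))
```

Because the feeds are awaited together, total latency is roughly that of the slowest feed rather than the sum of all of them.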

---

## 🧩 Architecture

```
User → Orchestrator → Autonomous Agents
↓
[1] News Aggregator → [2] Summarizer → [3] Audio Generator
```

Each agent independently chooses its tool:

| 🧩 Agent | 🧰 Tool | 🎯 Purpose |
| :------------------ | :------------------------- | :------------------------ |
| **Aggregator** | `aggregate_news(topic)` | Fetch & merge articles |
| **Summarizer** | `summarize_articles(json)` | Create engaging briefings |
| **Audio Generator** | `synthesize_speech(text)` | Generate MP3 audio |
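
The data flow between the three stages can be sketched with plain async functions standing in for the agents. Everything below is an assumption made for illustration: the function bodies are stubs with canned data, `run_pipeline` is a hypothetical name, and the order is hard-coded here, whereas in the project each agent decides when to call its tool.

```python
import asyncio
import json


async def aggregate_news(topic: str) -> dict:
    # Stub: the real tool fetches RSS feeds; here we return canned articles.
    return {"articles": [{"title": f"{topic} headline {i}"} for i in range(3)]}


async def summarize_articles(articles_json: str) -> str:
    # Stub: the real tool asks an LLM for a brief; here we just join the titles.
    articles = json.loads(articles_json)["articles"]
    return " ".join(a["title"] for a in articles)


async def synthesize_speech(text: str) -> str:
    # Stub: the real tool calls a TTS API; here we only derive a file name.
    return f"briefing_{len(text.split())}_words.mp3"


async def run_pipeline(topic: str) -> dict:
    # Each stage consumes the previous stage's output.
    news = await aggregate_news(topic)
    summary = await summarize_articles(json.dumps(news))
    audio_path = await synthesize_speech(summary)
    return {"summary": summary, "audio_path": audio_path}


if __name__ == "__main__":
    print(asyncio.run(run_pipeline("tech")))
```

Passing JSON between the first two stages mirrors the `summarize_articles(json)` signature in the table above.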

---

<details>
<summary>⚙️ Installation & Setup</summary>

### Prerequisites

* Python 3.13+
* pip

### 1️⃣ Install

```bash
git clone <repo>
cd news_summariser
python -m venv .venv && source .venv/bin/activate
pip install -e .
```

### 2️⃣ Environment Variables

Create a `.env` file in the root:

```env
OPENAI_API_KEY=your_openai_api_key
MINIMAX_API_KEY=your_minimax_api_key
```
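
A startup check along these lines could catch a missing key before any agent runs. This is only a sketch: `missing_keys` is a hypothetical helper, not part of the project, and it assumes the `.env` values have already been loaded into the process environment (for example by python-dotenv).

```python
import os

REQUIRED_KEYS = ("OPENAI_API_KEY", "MINIMAX_API_KEY")


def missing_keys(env=None) -> list[str]:
    """Return the names of required keys that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
```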

</details>

---

<details>
<summary>▶️ Usage</summary>

```bash
python main.py
```

Then open the Gradio UI (default → [http://127.0.0.1:7860](http://127.0.0.1:7860)).

**Steps**

1. Pick a topic → Tech | World | Business | Politics | Sports
2. Click **Submit** to generate a summary + audio briefing

</details>

---

## 🏗️ Project Structure

```
news_summariser/
├── news_agents/
│   ├── news_aggregator.py
│   ├── summarizer.py
│   └── audio_generator.py
├── orchestrator.py
├── main.py
├── config/
└── README.md
```

---

## 🧠 Tech Stack

| Category | Technologies |
| :------------------- | :--------------------- |
| **LLM** | OpenAI GPT-4o-mini |
| **TTS** | MiniMax API |
| **Framework** | OpenAI Agents SDK |
| **Web UI** | Gradio |
| **Async Runtime**    | aiohttp · aiofiles     |
| **Feeds**            | feedparser             |
| **Config & Logging** | python-dotenv · loguru |

---

## 📊 Example Output

```bash
✓ Fetched 15 articles
✓ Summary created (287 words)
✓ Audio generated: news_summary_20251109.mp3
```

📰 **Text Summary:** Engaging 300-word brief (opening hook → top 3 stories → closing)
🔊 **Audio File:** MP3 briefing ready for listening on the go
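
The audio tool in this contribution reads hex-encoded audio from the `data.audio` field of the TTS response and writes it to a uniquely named MP3. A minimal sketch of that decoding step follows, using only the standard library and a fake payload; `save_audio` and its `out_dir` parameter are hypothetical names introduced here, and the real tool writes the file asynchronously instead.

```python
import tempfile
import uuid
from datetime import datetime
from pathlib import Path


def save_audio(result: dict, out_dir: Path) -> Path:
    """Decode hex audio from a response dict and write it to a unique MP3 path."""
    try:
        audio_hex = result["data"]["audio"]
    except (KeyError, TypeError) as exc:
        raise ValueError(f"Unexpected response format: {result!r}") from exc
    # Timestamp plus a short random suffix avoids file name collisions.
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    path = out_dir / f"news_summary_{timestamp}_{uuid.uuid4().hex[:8]}.mp3"
    path.write_bytes(bytes.fromhex(audio_hex))
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # "494433" is the hex encoding of the ASCII bytes "ID3".
        print(save_audio({"data": {"audio": "494433"}}, Path(tmp)).name)
```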

---

## 💡 Why This Matters

| Traditional Pipelines | This Project (Autonomous Agents) |
| :------------------------ | :--------------------------------------- |
| Hard-coded function calls | Agents decide tools autonomously |
| Fixed sequence | Dynamic reasoning + error recovery |
| Rigid logic | Extensible and maintainable architecture |

**Benefits**

* 🔁 Change behavior by editing instructions (not code)
* 🧩 Easily add new tools or agents
* 🧱 Fewer hard-coded flows → cleaner design

---

## 🔗 Resources

* 📘 [OpenAI Agents SDK Docs](https://openai.github.io/openai-agents-python/)
* 🎧 [MiniMax TTS API](https://www.minimaxi.com/)
* 🎨 [Gradio Docs](https://www.gradio.app/docs)

---

> **Built with the Official OpenAI Agents SDK 🚀**
> Keep your `.env` file secure; it's already ignored by Git.

@@ -0,0 +1,14 @@
"""News agents module for news summarizer application."""

# Our custom agent instances (they import from openai-agents SDK internally)
from .news_aggregator import news_aggregator_agent, aggregate_news
from .summarizer import summarizer_agent
from .audio_generator import audio_generator_agent, synthesize_speech

__all__ = [
    "news_aggregator_agent",
    "summarizer_agent",
    "audio_generator_agent",
    "aggregate_news",
    "synthesize_speech",
]
@@ -0,0 +1,95 @@
"""Audio Generator Agent - Converts text to speech using MiniMax TTS."""

import os
import aiohttp
import aiofiles
from datetime import datetime
from pathlib import Path
from typing import Dict, Any
import uuid

from agents import Agent, function_tool


@function_tool
async def synthesize_speech(
    text: str,
    voice_id: str = "Chinese (Mandarin)_News_Anchor",
    speed: float = 0.95,
    pitch: int = -1
) -> str:
    """Synthesize speech from text using MiniMax TTS.

    Args:
        text: Text to convert to speech
        voice_id: Voice ID to use
        speed: Speech speed (0.5-2.0)
        pitch: Pitch adjustment (-20 to 20)

    Returns:
        Path to the generated audio file
    """
    api_key = os.getenv("MINIMAX_API_KEY")
    if not api_key:
        raise ValueError("MINIMAX_API_KEY must be set in environment variables")

    url = "https://api.minimaxi.chat/v1/t2a_v2"

    # Generate unique filename to avoid collisions
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    unique_id = str(uuid.uuid4())[:8]
    audio_filename = f"news_summary_{timestamp}_{unique_id}.mp3"
    audio_path = Path(audio_filename).absolute()

    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    payload = {
        "model": "speech-02-hd",
        "text": text,
        "voice_setting": {
            "voice_id": voice_id,
            "speed": speed,
            "pitch": pitch,
            "emotion": "neutral"
        }
    }

    # Use aiohttp for async HTTP request
    async with aiohttp.ClientSession() as session:
        async with session.post(url, headers=headers, json=payload) as response:
            if response.status == 200:
                result = await response.json()

                # MiniMax returns hex-encoded audio in result['data']['audio']
                if 'data' in result and 'audio' in result['data']:
                    # Convert hex-encoded audio to bytes
                    audio_data = bytes.fromhex(result['data']['audio'])

                    # Write audio file asynchronously
                    async with aiofiles.open(audio_path, 'wb') as f:
                        await f.write(audio_data)

                    return str(audio_path)
                else:
                    raise Exception(f"Unexpected response format: {result}")
            else:
                error_text = await response.text()
                raise Exception(f"Error: {response.status} - {error_text}")


# Create the Audio Generator Agent (uses gpt-4o-mini, as set in `model` below)
AUDIO_GENERATOR_INSTRUCTIONS = """You are an audio generator agent. Your task is to convert text
summaries into audio files using the MiniMax TTS API.

When given text, use the synthesize_speech tool to generate high-quality Chinese audio.
Return the path to the generated audio file."""

audio_generator_agent = Agent(
    name="Audio Generator",
    instructions=AUDIO_GENERATOR_INSTRUCTIONS,
    tools=[synthesize_speech],
    model="gpt-4o-mini",
)
@@ -0,0 +1,100 @@
"""News Aggregator Agent - Fetches news from RSS feeds."""

import asyncio
import aiohttp
import feedparser
from typing import Dict, List, Any

from agents import Agent, function_tool


FEEDS = {
    'tech': [
        'https://techcrunch.com/feed/',
        'https://www.theverge.com/rss/index.xml',
        'https://feeds.arstechnica.com/arstechnica/index'
    ],
    'world': [
        'http://feeds.bbci.co.uk/news/world/rss.xml',
        'https://rss.nytimes.com/services/xml/rss/nyt/World.xml'
    ],
    'business': [
        'http://feeds.bbci.co.uk/news/business/rss.xml',
        'https://www.cnbc.com/id/100003114/device/rss/rss.html'
    ],
    'politics': [
        'https://rss.nytimes.com/services/xml/rss/nyt/Politics.xml',
        'http://feeds.bbci.co.uk/news/politics/rss.xml'
    ],
    'sports': [
        'https://rss.nytimes.com/services/xml/rss/nyt/Sports.xml',
        'http://feeds.bbci.co.uk/sport/rss.xml'
    ]
}


async def fetch_feed(session: aiohttp.ClientSession, url: str) -> List[Dict[str, str]]:
    """Fetch articles from a single RSS feed.

    Args:
        session: aiohttp client session
        url: RSS feed URL

    Returns:
        List of article dictionaries with title, summary, link, and published date
    """
    try:
        # Pass a ClientTimeout object; a bare number for `timeout` is deprecated in aiohttp
        timeout = aiohttp.ClientTimeout(total=10)
        async with session.get(url, timeout=timeout) as response:
            content = await response.text()
            feed = feedparser.parse(content)
            # Keep only the first three entries of each feed
            return [
                {
                    'title': entry.title,
                    'summary': entry.get('summary', ''),
                    'link': entry.link,
                    'published': entry.get('published', '')
                }
                for entry in feed.entries[:3]
            ]
    except Exception as e:
        # A failing feed is logged and skipped so the other feeds still contribute
        print(f"Error fetching {url}: {e}")
        return []


@function_tool
async def aggregate_news(topic: str, num_sources: int = 5) -> Dict[str, Any]:
    """Aggregate news from RSS feeds concurrently.

    Args:
        topic: News topic (tech, world, business, politics, sports)
        num_sources: Number of RSS sources to fetch from

    Returns:
        Dictionary containing list of articles
    """
    # Normalize the topic so "Tech" or " tech " matches the lowercase keys in FEEDS;
    # unknown topics fall back to the tech feeds
    feed_urls = FEEDS.get(topic.strip().lower(), FEEDS['tech'])[:num_sources]

    async with aiohttp.ClientSession() as session:
        tasks = [fetch_feed(session, url) for url in feed_urls]
        results = await asyncio.gather(*tasks)

    # Flatten results
    articles = [article for feed_articles in results for article in feed_articles]

    return {"articles": articles[:15]}


# Create the News Aggregator Agent (uses gpt-4o-mini, as set in `model` below)
NEWS_AGGREGATOR_INSTRUCTIONS = """You are a news aggregator agent. Your task is to fetch and aggregate
news articles from RSS feeds for a given topic.

When asked to get news on a topic, use the aggregate_news tool with the topic name
(tech, world, business, politics, or sports). The tool will fetch articles from multiple sources
and return a list of the most recent articles."""

news_aggregator_agent = Agent(
    name="News Aggregator",
    instructions=NEWS_AGGREGATOR_INSTRUCTIONS,
    tools=[aggregate_news],
    model="gpt-4o-mini",
)