fix: Use official OpenAI RSS feed for Research posts #62

Open
Jah-yee wants to merge 1 commit into Olshansk:main from Jah-yee:main

@Jah-yee Jah-yee commented Mar 19, 2026

Summary

Replace fragile Selenium scraping with direct RSS parsing from OpenAI's official feed at https://openai.com/news/rss.xml, filtering for 'Research' category.

Problem

The existing script used undetected_chromedriver (Selenium) to scrape the research page, which failed silently and produced an empty feed with no items (#60).

Solution

  • Remove Selenium/ChromeDriver dependency
  • Use requests to fetch the official OpenAI RSS feed
  • Filter items by 'Research' category using BeautifulSoup
  • Sort posts by date (newest first)
  • Handle None dates gracefully
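
A minimal sketch of the approach listed above, not the code in this PR. To stay dependency-free it uses only the standard library (urllib and xml.etree.ElementTree) as a stand-in for requests + BeautifulSoup. It assumes the feed is plain RSS 2.0, with the category in `<category>` elements and the date in `<pubDate>`. The function names (`fetch_feed`, `parse_date`, `research_posts`) are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen
import xml.etree.ElementTree as ET

FEED_URL = "https://openai.com/news/rss.xml"
# Placeholder used only as a sort key for posts that have no date.
FALLBACK_DATE = datetime.min.replace(tzinfo=timezone.utc)


def fetch_feed(url=FEED_URL, timeout=30):
    """Download the raw RSS XML (stdlib stand-in for requests.get)."""
    req = Request(url, headers={"User-Agent": "rss-feed-sketch"})
    with urlopen(req, timeout=timeout) as resp:
        return resp.read()


def parse_date(text):
    """Parse an RFC 822 pubDate string; return None if missing or malformed."""
    if not text:
        return None
    try:
        dt = parsedate_to_datetime(text.strip())
    except (TypeError, ValueError):
        return None
    # Treat naive datetimes as UTC so every value is comparable.
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def research_posts(xml_data, category="Research"):
    """Return items tagged with `category`, newest first, undated items last."""
    root = ET.fromstring(xml_data)
    posts = []
    for item in root.iter("item"):
        cats = [(c.text or "").strip().lower() for c in item.findall("category")]
        if category.lower() not in cats:
            continue
        posts.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
            "date": parse_date(item.findtext("pubDate")),
        })
    # With reverse=True, dated posts (True) come before undated ones (False),
    # and dated posts are ordered from newest to oldest.
    posts.sort(
        key=lambda p: (p["date"] is not None, p["date"] or FALLBACK_DATE),
        reverse=True,
    )
    return posts
```

Under these assumptions, `research_posts(fetch_feed())` would return the filtered list. Keeping the download separate from the parsing lets the filter and sort logic be exercised on a saved copy of the feed without network access.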

Result

The feed now contains 182 research posts instead of being empty.

Testing

Ran the updated script locally and verified:

  • 182 research posts parsed successfully
  • Feed XML validates correctly
  • All items have proper title, link, description, and date
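
The checks above could be automated roughly as follows. This is a hedged sketch, not part of the PR: it assumes the generated feed is RSS 2.0 with `<item>` elements and a `<pubDate>` field, and it tests well-formedness only, not conformance to a schema. The helper name `find_incomplete_items` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Fields each item is expected to carry (assumed RSS 2.0 element names).
REQUIRED_FIELDS = ("title", "link", "description", "pubDate")


def find_incomplete_items(feed_xml):
    """Return (index, missing_fields) for every item lacking a required field.

    ET.fromstring raises xml.etree.ElementTree.ParseError if the XML is not
    well-formed, so reaching the loop already confirms the feed parses.
    """
    root = ET.fromstring(feed_xml)
    problems = []
    for index, item in enumerate(root.iter("item")):
        missing = [
            tag for tag in REQUIRED_FIELDS
            if not (item.findtext(tag) or "").strip()
        ]
        if missing:
            problems.append((index, missing))
    return problems
```

Run against the contents of feed_openai_research.xml, an empty list would mean every item has all four fields.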

Closes #60

Replace fragile Selenium scraping with direct RSS parsing from
https://openai.com/news/rss.xml, filtering for 'Research' category.

Fixes Olshansk#60 - feed_openai_research.xml now contains 182 research posts
instead of being empty.

Changes:
- Remove undetected_chromedriver Selenium dependency
- Use requests + BeautifulSoup for RSS parsing
- Filter items by 'Research' category from official OpenAI feed
- Sort posts by date (newest first)
- Handle None dates gracefully

Development

Successfully merging this pull request may close these issues.

feed_openai_research.xml is empty - no items generated
