Create rss collection with feed syncing #368

Open
wants to merge 12 commits into
base: main
18 changes: 18 additions & 0 deletions .changeset/add-rss-atom-collection.md
@@ -0,0 +1,18 @@
---
"@tanstack/rss-db-collection": patch
---

Add RSS and Atom feed collections for TanStack DB

Introduces `@tanstack/rss-db-collection` package with:

- `rssCollectionOptions()` for RSS 2.0 feeds
- `atomCollectionOptions()` for Atom 1.0 feeds
- Automatic polling with configurable intervals
- Built-in deduplication based on feed item IDs
- Custom transform functions for data normalization
- Full TypeScript support with proper type inference
- Error recovery and robust feed parsing
- HTTP configuration options for headers and timeouts

Both collection types provide seamless integration with TanStack DB's live queries and optimistic mutations, allowing you to sync RSS/Atom feed data and query it alongside other collection types.
81 changes: 79 additions & 2 deletions docs/overview.md
@@ -154,8 +154,9 @@ There are a number of built-in collection types:
1. [`QueryCollection`](#querycollection) to load data into collections using [TanStack Query](https://tanstack.com/query)
2. [`ElectricCollection`](#electriccollection) to sync data into collections using [ElectricSQL](https://electric-sql.com)
3. [`TrailBaseCollection`](#trailbasecollection) to sync data into collections using [TrailBase](https://trailbase.io)
4. [`LocalStorageCollection`](#localstoragecollection) for small amounts of local-only state that syncs across browser tabs
5. [`LocalOnlyCollection`](#localonlycollection) for in-memory client data or UI state
4. [`RSSCollection` and `AtomCollection`](#rsscollection-and-atomcollection) to sync data from RSS and Atom feeds with automatic polling
5. [`LocalStorageCollection`](#localstoragecollection) for small amounts of local-only state that syncs across browser tabs
6. [`LocalOnlyCollection`](#localonlycollection) for in-memory client data or UI state

You can also use:

@@ -297,6 +298,82 @@ This collection requires the following TrailBase-specific options:

A new collection doesn't start syncing until you call `collection.preload()` or you query it.

#### `RSSCollection` and `AtomCollection`

RSS and Atom feeds are widely used syndication formats for publishing frequently updated content like blogs, news, and podcasts. TanStack DB provides dedicated collection types for both RSS 2.0 and Atom 1.0 feeds with automatic polling, deduplication, and type safety.

Use `rssCollectionOptions` for RSS feeds or `atomCollectionOptions` for Atom feeds to sync feed data into collections:

```ts
import { createCollection } from "@tanstack/react-db"
import { rssCollectionOptions, atomCollectionOptions } from "@tanstack/rss-db-collection"

// RSS Collection
export const blogFeed = createCollection(
  rssCollectionOptions({
    id: "blog-posts",
    feedUrl: "https://blog.example.com/rss.xml",
    pollingInterval: 5 * 60 * 1000, // Poll every 5 minutes
    getKey: (item) => item.guid || item.link,
    transform: (item) => ({
      id: item.guid || item.link || '',
      title: item.title || '',
      description: item.description || '',
      link: item.link || '',
      publishedAt: new Date(item.pubDate || Date.now()),
      author: item.author
    }),
    schema: blogPostSchema,
  })
)

// Atom Collection
export const newsFeed = createCollection(
  atomCollectionOptions({
    id: "news-items",
    feedUrl: "https://news.example.com/atom.xml",
    pollingInterval: 10 * 60 * 1000, // Poll every 10 minutes
    getKey: (item) => item.id,
    transform: (item) => ({
      id: item.id || '',
      title: typeof item.title === 'string' ? item.title : item.title?.$text || '',
      description: typeof item.summary === 'string' ? item.summary : item.summary?.$text || '',
      link: typeof item.link === 'string' ? item.link : item.link?.href || '',
      publishedAt: new Date(item.published || item.updated || Date.now()),
      author: typeof item.author === 'object' ? item.author?.name : item.author
    }),
    schema: newsItemSchema,
  })
)
```
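
The Atom `transform` above repeats the same string-or-object check for each text field. One way to factor that out is a small helper, sketched below under the assumption that text fields arrive either as a plain string or as an object with a `$text` property, as the example suggests. `AtomText` and `textOf` are hypothetical names for illustration and are not exports of `@tanstack/rss-db-collection`.

```ts
// Hypothetical helper (not part of the package): normalizes a feed text field
// that may be a plain string, an object carrying `$text`, or missing entirely.
type AtomText = string | { $text?: string } | undefined

function textOf(value: AtomText): string {
  if (typeof value === "string") return value
  // Fall back to an empty string when the field or its `$text` is absent.
  return value?.$text ?? ""
}
```

With such a helper, the `title` and `description` lines in a transform could shrink to `title: textOf(item.title)` and `description: textOf(item.summary)`.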

Both collection types require:

- `feedUrl`: the URL of the RSS or Atom feed to fetch
- `getKey`: returns the unique key for each feed item

Optional configuration includes:

- `pollingInterval`: how often to check for new items (default: 5 minutes)
- `transform`: custom function to normalize feed items to your desired format
- `httpOptions`: custom headers, timeout, and user agent settings
- `startPolling`: whether to begin polling immediately (default: true)
- `maxSeenItems`: maximum number of items to track for deduplication (default: 1000)
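
To make the role of `maxSeenItems` more concrete, here is a minimal sketch of a bounded deduplication tracker. It is not the package's implementation: the `SeenItems` class and its first-in, first-out eviction policy are assumptions chosen only to show how a cap keeps memory bounded while still filtering repeated items.

```ts
// Hypothetical sketch: a bounded "seen keys" tracker with FIFO eviction.
// The class name and eviction policy are assumptions, not the package's code.
class SeenItems {
  // A Set iterates in insertion order, so its first value is the oldest key.
  private readonly seen = new Set<string>()
  private readonly maxSize: number

  constructor(maxSize = 1000) {
    this.maxSize = maxSize
  }

  // Returns true (and records the key) when it is new, false for a duplicate.
  add(key: string): boolean {
    if (this.seen.has(key)) return false
    this.seen.add(key)
    if (this.seen.size > this.maxSize) {
      // Evict the oldest key so the tracker never exceeds maxSize.
      const oldest = this.seen.values().next().value
      if (oldest !== undefined) this.seen.delete(oldest)
    }
    return true
  }

  get count(): number {
    return this.seen.size
  }

  clear(): void {
    this.seen.clear()
  }
}
```

One consequence of any capped tracker like this is that a key evicted from the cache would be treated as new if it showed up again, so the cap should comfortably exceed the number of items a feed serves at once.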

RSS and Atom collections automatically handle feed parsing and item deduplication, and they provide built-in error recovery: polling continues even after network failures or parsing errors.
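
The "keep polling after a failure" behavior can be pictured with a short, self-contained sketch. This is an illustration under stated assumptions, not the package's internals: `startPollingLoop`, `fetchOnce`, and `onError` are hypothetical names, and the real collections may schedule and report errors differently.

```ts
// Hypothetical sketch of a polling loop that survives failed fetches.
// None of these names come from @tanstack/rss-db-collection.
type Poller = { stop: () => void }

function startPollingLoop(
  fetchOnce: () => Promise<void>,
  intervalMs: number,
  onError: (error: unknown) => void = () => {}
): Poller {
  let stopped = false
  let timer: ReturnType<typeof setTimeout> | undefined

  const tick = async (): Promise<void> => {
    try {
      await fetchOnce()
    } catch (error) {
      // Report the failure but keep the loop alive.
      onError(error)
    }
    // Schedule the next run only after this one has settled.
    if (!stopped) timer = setTimeout(tick, intervalMs)
  }

  void tick()

  return {
    stop: () => {
      stopped = true
      if (timer !== undefined) clearTimeout(timer)
    },
  }
}
```

Scheduling the next run with `setTimeout` after the current fetch settles, rather than using `setInterval`, avoids overlapping requests when a fetch takes longer than the interval.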

Collections can be manually refreshed when needed:

```ts
// Manually refresh the feed data
await blogFeed.utils.refresh()

// Clear deduplication cache if needed
blogFeed.utils.clearSeenItems()

// Check how many items have been tracked
console.log(`Tracked items: ${blogFeed.utils.getSeenItemsCount()}`)
```

#### `LocalStorageCollection`

8 changes: 8 additions & 0 deletions packages/db-ivm/src/index.ts
@@ -2,3 +2,11 @@ export * from "./d2.js"
export * from "./multiset.js"
export * from "./operators/index.js"
export * from "./types.js"

// Export additional types and functions that are needed
export type { MultiSetArray } from "./multiset.js"
export { MultiSet } from "./multiset.js"
export type { IStreamBuilder, KeyValue } from "./types.js"
export { RootStreamBuilder } from "./d2.js"
export { orderByWithFractionalIndex } from "./operators/orderBy.js"
export type { JoinType } from "./operators/join.js"