RSS Collector

Overview

Fetches and parses RSS/Atom feeds from 87 sources. Each feed item becomes a signal with title, content, publication date, source handle, and URL.

Schedule

Runs every 4 hours, scheduled by Dagu.

Sources

87 feeds across 14 categories: counter-disinformation (7), government (8), Estonian media (9), Baltic media (14), Finnish media (3), Polish media (2), Russian-language Estonian (5), defense & OSINT (14), Ukrainian media (2), Russian state (7), Russian independent (6), pro-Kremlin (4), mainstream (5), commentary (1).

Full source list: Media Monitoring → RSS feeds.

Processing

  1. Fetch each feed URL, parse XML
  2. Deduplicate by URL against existing signals
  3. Extract title, content (stripped HTML), publication date
  4. Tag with source handle, tier, category, and region from feeds.yaml
  5. Submit to ingest API

Configuration

dagu/config/feeds.yaml: defines all feeds with handle, name, URL, tier, category, and region tags.
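
As an illustration of the per-feed fields listed above, the sketch below validates feed entries after the YAML file has been parsed into a list of dicts. The key spellings (handle, name, url, tier, category, region), the field types, and the names Feed and load_feeds are assumptions made for the example; the actual feeds.yaml schema may differ, for instance if region is a list of tags.

```python
from dataclasses import dataclass

# Assumed key names, taken from the field list in the text above.
REQUIRED_KEYS = ("handle", "name", "url", "tier", "category", "region")


@dataclass(frozen=True)
class Feed:
    """One feed definition (hypothetical shape; types are not enforced)."""
    handle: str
    name: str
    url: str
    tier: int
    category: str
    region: str


def load_feeds(entries):
    """Validate parsed feed entries and return them as Feed objects."""
    feeds, seen_handles = [], set()
    for index, entry in enumerate(entries):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            raise ValueError(f"feed #{index} is missing keys: {', '.join(missing)}")
        feed = Feed(**{key: entry[key] for key in REQUIRED_KEYS})
        # Handles identify the source on every signal, so they should be unique.
        if feed.handle in seen_handles:
            raise ValueError(f"duplicate handle: {feed.handle}")
        seen_handles.add(feed.handle)
        feeds.append(feed)
    return feeds
```

Failing fast on a missing key or a duplicate handle keeps a malformed entry from silently producing untagged or mislabeled signals.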