A tool that ranks daily arXiv postings based on your personal research interests using a small local Large Language Model (LLM).
- Personalized Ranking: Builds a profile from your own papers (via arXiv IDs) and manual interest topics.
- Local LLM: Uses mlx-lm to run a quantized Llama-3 model locally on your Mac (Apple Silicon optimized).
- Daily Updates: Fetches the latest papers from specified arXiv categories (e.g., astro-ph).
- Hybrid Profiling: Combines automated analysis of your past work with explicit manual interests.
- Smart Caching: Caches your profile summary for 30 days (configurable) to save time.
- macOS with Apple Silicon (M1/M2/M3).
- Python 3.10+.
- conda or mamba (recommended).
- Clone the repository (or download the script).
- Create an environment:
mamba create -n arxiv-ranker -c conda-forge \
  python=3.11 \
  feedparser \
  arxiv \
  python-dateutil \
  tqdm \
  pyyaml \
  mlx-lm
mamba activate arxiv-ranker
Edit config.yaml to customize your experience:
categories:
- astro-ph # arXiv categories to watch
lookback_hours: 36 # How far back to fetch papers
max_papers_per_category: 150
# Your own papers (arXiv IDs) to build the profile:
profile_arxiv_ids:
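- "2505.03888" # Quoted so YAML keeps the ID as a string, not a float
- "2406.07620" # Unquoted, the trailing zero would be dropped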
# Manual interests to supplement your paper-based profile
manual_interests:
- "Machine learning for astrophysics"
- "Time-domain astronomy"
local_model: mlx-community/Llama-3.2-3B-Instruct-4bit
output_markdown: ranked_arxiv.md
profile_refresh_days: 30 # Regenerate profile every N days

Run the script:
python rank_arxiv_daily.py
On the first run, it will:
- Download the local LLM (approx. 2-3 GB).
- Fetch your profile papers and generate a summary.
- Fetch daily papers and rank them.
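
The fetch step depends on the lookback_hours setting. As a rough sketch of how such a window could be applied (the function name and the `published` field are assumptions for illustration, not the script's actual code):

```python
from datetime import datetime, timedelta, timezone


def filter_recent(papers, lookback_hours=36, now=None):
    """Keep papers whose 'published' timestamp falls inside the lookback window.

    `papers` is assumed to be a list of dicts with a timezone-aware
    'published' datetime; the real script may store entries differently.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=lookback_hours)
    return [p for p in papers if p["published"] >= cutoff]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Placeholder entries, not real arXiv papers.
    sample = [
        {"title": "Example paper A", "published": now - timedelta(hours=10)},
        {"title": "Example paper B", "published": now - timedelta(hours=48)},
    ]
    for paper in filter_recent(sample, lookback_hours=36, now=now):
        print(paper["title"])
```

A 36-hour default window presumably gives some overlap between consecutive daily runs, so papers announced late in the previous cycle are less likely to be missed.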
Subsequent runs will use the cached profile and model.
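
For intuition, the cache check controlled by profile_refresh_days might look roughly like the following. This is a minimal sketch that assumes freshness is judged by the file's modification time; the helper name is hypothetical and the script's real logic may differ.

```python
import time
from pathlib import Path
from typing import Optional

# Cache path as given in this README; everything else here is illustrative.
CACHE_FILE = Path(".cache_arxiv_ranker/profile_summary.txt")
SECONDS_PER_DAY = 86400


def profile_is_stale(cache_file: Path, refresh_days: int = 30,
                     now: Optional[float] = None) -> bool:
    """Return True when the cached profile is missing or older than refresh_days."""
    if not cache_file.exists():
        return True
    now = time.time() if now is None else now
    age_seconds = now - cache_file.stat().st_mtime
    return age_seconds > refresh_days * SECONDS_PER_DAY


if __name__ == "__main__":
    if profile_is_stale(CACHE_FILE, refresh_days=30):
        print("Profile summary would be regenerated.")
    else:
        print("Cached profile summary would be reused.")
```

Under this assumption, deleting the cache file has the same effect as waiting for the refresh interval to expire.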
If you want to regenerate your profile immediately (e.g., after changing manual_interests):
rm .cache_arxiv_ranker/profile_summary.txt
python rank_arxiv_daily.py
The script generates a Markdown file (default: ranked_arxiv.md) containing:
- A summary of your user profile.
- A ranked list of papers, sorted by relevance score (0-100).
- Rationales for why each paper matches your interests.
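
The three parts of the report listed above could be assembled along these lines. The exact layout of ranked_arxiv.md is not documented here, so the function name, dictionary keys, and output format below are assumptions made for illustration.

```python
def render_report(profile_summary, ranked):
    """Build report text from a profile summary and scored papers.

    `ranked` is assumed to be a list of dicts with 'title', 'score' (0-100),
    and 'rationale' keys; the real script may use a different structure.
    """
    lines = ["Profile summary:", profile_summary, "", "Ranked papers:"]
    # Highest relevance score first.
    ordered = sorted(ranked, key=lambda p: p["score"], reverse=True)
    for position, paper in enumerate(ordered, start=1):
        lines.append(f"{position}. {paper['title']} (score: {paper['score']})")
        lines.append(f"   Rationale: {paper['rationale']}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder entries, not real ranking results.
    sample = [
        {"title": "Example paper A", "score": 40, "rationale": "Loosely related."},
        {"title": "Example paper B", "score": 90, "rationale": "Matches a manual interest."},
    ]
    print(render_report("Example profile text.", sample))
```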