Weekly | Top 7 GitHub Repos | Week 41 - 2025
Noteworthy data-ops & analytics repos that first shipped less than a year ago.
#7. CodeWithCJ/SparkyFitness
SparkyFitness: Built for Families. Powered by AI. Track food, fitness, water, and health — together.
["ai", "artificial-intelligence", "fitness", "fitness-app", "fitness-tracker", "health", "health-coaching", "healthcheck", "self-hosted", "selfhosted"]
This repo was first pushed to GitHub on 2025-06-21. Its license was listed as: Other. Its primary language is TypeScript.
#6. IlyaRice/RAG-Challenge-2
Implementation of my RAG system that won all categories in Enterprise RAG Challenge 2
This repo was first pushed to GitHub on 2025-03-19. Its license was listed as: MIT License. Its primary language is Python.
#5. firecrawl/firecrawl-mcp-server
🔥 Official Firecrawl MCP Server - Adds powerful web scraping to Cursor, Claude and any other LLM clients.
["batch-processing", "claude", "content-extraction", "data-collection", "firecrawl", "firecrawl-ai", "llm-tools", "mcp-server", "model-context-protocol", "search-api", "web-crawler", "web-scraping", "javascript-rendering"]
This repo was first pushed to GitHub on 2024-12-06. Its license was listed as: MIT License. Its primary language is JavaScript.
#4. tdrussell/diffusion-pipe
A pipeline parallel training script for diffusion models.
This repo was first pushed to GitHub on 2024-12-11. Its license was listed as: MIT License. Its primary language is Python.
#3. oxylabs/perplexity-scraper
Perplexity Scraper: Track brand mentions, analyze rankings, and gain competitor intelligence from Perplexity. Get started in minutes.
["ai-scraper", "llm-api", "llm-scraper", "llm-scraping", "perplexity", "perplexity-api", "proxy-scraper", "perplexity-scraper"]
This repo was first pushed to GitHub on 2025-09-07. Its primary language is Java.
#2. facebook/openzl
A novel data compression framework
This repo was first pushed to GitHub on 2025-10-06. Its license was listed as: Other. Its primary language is C/C++.
#1. roboflow/trackers
A unified library for object tracking featuring clean room re-implementations of leading multi-object tracking algorithms
["deep-sort", "multi-object-tracking", "sort"]
This repo was first pushed to GitHub on 2025-04-14. Its license was listed as: Apache License 2.0. Its primary language is Python.



