Monthly | Top 10 GitHub Repos | October 2025
Noteworthy data-ops & analytics repos that first shipped between one and three years ago.
#10. Cinnamon/kotaemon
An open-source RAG-based tool for chatting with your documents.
["chatbot", "llms", "open-source", "rag"]
This repo was first pushed to GitHub on 2024-04-03. Its license was listed as: Apache License 2.0. Its primary language is Python.
#9. adithya-s-k/omniparse
This repo was first pushed to GitHub on 2024-06-10. Its license was listed as: Apache License 2.0. Its primary language is listed as Mixed/Unspecified.
#8. z1069614715/objectdetection_script
Code for several ideas on improving object detection scripts; see readme.md for details.
This repo was first pushed to GitHub on 2023-01-05. Its primary language is Python.
#7. Azure-Samples/azure-search-openai-demo
A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
["azure", "azurecognitivesearch", "chatgpt", "openai", "azureopenai", "azd-templates"]
This repo was first pushed to GitHub on 2023-03-09. Its license was listed as: MIT License. Its primary language is Python.
#6. weaviate/Verba
Retrieval-Augmented Generation (RAG) chatbot powered by Weaviate.
This repo was first pushed to GitHub on 2023-07-28. Its license was listed as: BSD 3-Clause "New" or "Revised" License. Its primary language is Python.
#5. facebookresearch/nougat
Implementation of Nougat: Neural Optical Understanding for Academic Documents.
This repo was first pushed to GitHub on 2023-08-28. Its license was listed as: MIT License. Its primary language is Python.
#4. chidiwilliams/buzz
Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.
["whisper"]
This repo was first pushed to GitHub on 2022-09-24. Its license was listed as: MIT License. Its primary language is Python.
#3. openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
This repo was first pushed to GitHub on 2022-12-15. Its license was listed as: MIT License. Its primary language is Python.
#2. Byaidu/PDFMathTranslate
LaTeX PDF translation with bilingual side-by-side comparison, preserving the layout of formulas and figures.
["chinese", "latex", "pdf", "translation"]
This repo was first pushed to GitHub on 2024-09-06. Its license was listed as: MIT License. Its primary language is Python.
#1. drawdb-io/drawdb
Free, simple, and intuitive online database design tool and SQL generator.
This repo was first pushed to GitHub on 2024-04-06. Its license was listed as: MIT License. Its primary language is JavaScript.



