Showing 155 open source projects for "python data analysis"

  • 1
    Scikit-LLM

    Seamlessly integrate LLMs into scikit-learn

    Seamlessly integrate powerful language models like ChatGPT into scikit-learn for enhanced text analysis tasks. At the moment, the majority of the Scikit-LLM estimators are compatible only with some of the OpenAI models, so a user-provided OpenAI API key is required. Additionally, Scikit-LLM ensures that the obtained response contains a valid label. If it does not, a label is selected randomly (label probabilities are proportional to label occurrences in the training...
    Downloads: 7 This Week
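    The label-validation fallback described for Scikit-LLM can be illustrated with a minimal sketch. This is not Scikit-LLM's actual code: the function name `validate_label` and its parameters are hypothetical, and the sketch only assumes the behavior stated in the description (keep a valid label, otherwise sample one in proportion to its frequency in the training labels).

```python
import random
from collections import Counter


def validate_label(response, training_labels, rng=None):
    """Hypothetical helper, not part of the Scikit-LLM API.

    Returns `response` unchanged if it is one of the labels seen in
    training; otherwise samples a label with probability proportional
    to how often it occurs in `training_labels`.
    """
    if not training_labels:
        raise ValueError("training_labels must not be empty")

    rng = rng or random.Random()
    counts = Counter(training_labels)

    # The model answered with a known label: keep it.
    if response in counts:
        return response

    # Otherwise fall back to a frequency-weighted random choice.
    labels = list(counts)
    weights = [counts[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]
```

    For example, `validate_label("banana", ["spam", "spam", "ham"])` would return "spam" about two times out of three, since "banana" is not a training label.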
  • 2
    Kor

    LLM

    This is a half-baked prototype that “helps” you extract structured data from text using LLMs. Specify the schema of what should be extracted and provide some examples. Kor will generate a prompt, send it to the specified LLM and parse out the output. You might even get results back.
    Downloads: 0 This Week
  • 3
    Qwen2-Audio

    Repo of Qwen2-Audio chat & pretrained large audio language model

    Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio signal inputs (including speech, sounds, etc.) and perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive voice only input) and Audio Analysis (audio + text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound...
    Downloads: 0 This Week
  • 4
    ai-cookbook

    Examples and tutorials to help developers build AI systems

    ...The repository contains examples that demonstrate how to build AI workflows using modern tools such as large language models, autonomous agents, and external APIs. Developers can learn how to construct applications like intelligent assistants, automation pipelines, and AI-powered data analysis tools through step-by-step tutorials and ready-to-run scripts. The code examples are designed to emphasize practical architecture patterns that are commonly used in production environments, helping developers understand how to integrate AI services into software products.
    Downloads: 0 This Week
  • 5
    llmware

    Unified framework for building enterprise RAG pipelines

    llmware is an open source framework designed to simplify the creation of enterprise-grade applications powered by large language models. The platform focuses on building secure and private AI workflows that can run locally on laptops, edge devices, or self-hosted servers without relying exclusively on cloud APIs. It provides a unified interface for constructing retrieval-augmented generation pipelines, agent workflows, and document intelligence applications. One of the framework’s defining...
    Downloads: 2 This Week
  • 6
    LangChain

    ⚡ Building applications with LLMs through composability ⚡

    Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge. This library is aimed at assisting in the development of those types of applications.
    Downloads: 8 This Week
  • 7
    MegaParse

    File Parser optimised for LLM Ingestion with no loss

    MegaParse is a file parser optimized for Large Language Model (LLM) ingestion, ensuring no loss of information. It efficiently parses various document formats, such as PDFs, DOCX, and PPTX, converting them into formats ideal for processing by LLMs. This tool is essential for applications that require accurate and comprehensive data extraction from diverse document types.
    Downloads: 1 This Week
  • 8
    E2M

    E2M converts various file types (doc, docx, epub, html, htm, url

    E2M is a SourceForge mirror of the e2m open-source project, which focuses on providing tools or services designed to convert or process content between different formats or systems. Projects with similar naming conventions typically emphasize automation workflows where input data from one environment is transformed into another representation or output structure. The mirrored repository allows users to access the project’s codebase independently from its original hosting platform while...
    Downloads: 0 This Week
  • 9
    LongBench

    LongBench v2 and LongBench (ACL 25'&24')

    LongBench is a comprehensive benchmark designed to evaluate the ability of large language models to understand and reason over very long textual contexts. Traditional language model benchmarks typically evaluate tasks involving relatively short inputs, which does not reflect many real-world applications such as analyzing large documents or entire code repositories. LongBench addresses this gap by providing datasets that require models to process and reason over long sequences of text across...
    Downloads: 0 This Week
  • 10
    llms-from-scratch-cn

    Build a large language model from scratch with only a Python foundation

    ...Rather than focusing on using pre-trained models through APIs, the project emphasizes understanding the internal mechanisms of modern language models, including tokenization, attention mechanisms, transformer architecture, and training workflows. Through a collection of notebooks, code examples, and translated learning materials, users can explore how to implement components such as multi-head attention, data loaders, and training pipelines using Python and PyTorch.
    Downloads: 0 This Week
  • 11
    OpenDAN

    OpenDAN is an open source Personal AI OS

    OpenDAN is an open-source Personal AI OS that consolidates various AI modules in one place for your personal use. The goal of OpenDAN (Open and Do Anything Now with AI) is to create a Personal AI OS that provides a runtime environment for various AI modules as well as protocols for interoperability between them. With OpenDAN, users can securely collaborate with various AI modules using their private data to create powerful personal AI agents, such as butlers, lawyers, doctors, teachers,...
    Downloads: 5 This Week
  • 12
    SGR Agent Core

    Schema-Guided Reasoning (SGR) has agentic system design

    SGR Agent Core is an open-source framework for building intelligent AI research agents based on a methodology known as Schema-Guided Reasoning (SGR). The framework provides a core library that allows developers to design autonomous agents capable of structured reasoning and complex task execution. Instead of relying solely on free-form prompts, the system organizes reasoning processes around schemas that guide how agents analyze problems, gather information, and generate outputs. This...
    Downloads: 7 This Week
  • 13
    slime LLM

    slime is an LLM post-training framework for RL Scaling

    slime is an open-source large language model (LLM) post-training framework developed to support reinforcement learning (RL)-based scaling and high-performance training workflows for advanced LLMs, blending training and rollout modules into an extensible system. It offers a flexible architecture that connects high-throughput training (e.g., via Megatron-LM) with a customizable data generation pipeline, enabling researchers and engineers to iterate on new RL training paradigms effectively. The...
    Downloads: 7 This Week
  • 14
    Prometheus-Eval

    Evaluate your LLM's response with Prometheus and GPT4

    ...It implements an “LLM-as-a-judge” approach in which a dedicated language model analyzes instruction–response pairs and assigns scores or rankings based on predefined evaluation criteria. The repository includes a Python package that provides a straightforward interface for running evaluations and integrating them into model development pipelines. It also provides training data and utilities for fine-tuning evaluator models so they can assess outputs according to custom scoring rubrics such as helpfulness, accuracy, or style.
    Downloads: 5 This Week
  • 15
    MetaScreener

    AI-powered tool for efficient abstract and PDF screening

    MetaScreener is an open-source AI-assisted tool designed to streamline the screening process in systematic literature reviews and academic research workflows. The system helps researchers analyze large collections of academic abstracts and research papers to determine which studies are relevant for inclusion in evidence synthesis projects. Instead of manually reviewing hundreds or thousands of documents, researchers can use MetaScreener to apply machine learning techniques that assist with...
    Downloads: 4 This Week
  • 16
    GPT Academic

    Research-oriented chatbot framework

    GPT Academic is a research-oriented chatbot framework designed to integrate large language models (LLMs) into academic workflows. It provides tools for structured document processing, citation management, and enhanced interaction with research papers.
    Downloads: 0 This Week
  • 17
    MedicalGPT

    MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training

    MedicalGPT trains medical GPT models using the ChatGPT training pipeline. It implements secondary pre-training, supervised fine-tuning, reward modeling, and reinforcement learning for large medical models.
    Downloads: 9 This Week
  • 18
    OpenLLMetry

    Open-source observability for your LLM application

    The repo contains standard OpenTelemetry instrumentations for LLM providers and Vector DBs, as well as a Traceloop SDK that makes it easy to get started with OpenLLMetry, while still outputting standard OpenTelemetry data that can be connected to your observability stack. If you already have OpenTelemetry instrumented, you can just add any of our instrumentations directly.
    Downloads: 4 This Week
  • 19
    OmAgent

    Build multimodal language agents for fast prototype and production

    OmAgent is an open-source Python framework designed to simplify the development of multimodal language agents that can reason, plan, and interact with different types of data sources. The framework provides abstractions and infrastructure for building AI agents that operate on text, images, video, and audio while maintaining a relatively simple interface for developers.
    Downloads: 7 This Week
  • 20
    Deep Lake

    Data Lake for Deep Learning. Build, manage, and query datasets

    Deep Lake (formerly known as Activeloop Hub) is a data lake for deep learning applications. Our open-source dataset format is optimized for rapid streaming and querying of data while training models at scale, and it includes a simple API for creating, storing, and collaborating on AI datasets of any size. It can be deployed locally or in the cloud, and it enables you to store all of your data in one place, ranging from simple annotations to large videos. Deep Lake is used by Google, Waymo,...
    Downloads: 0 This Week
  • 21
    Index

    The SOTA Open-Source Browser Agent

    Index is an open-source browser automation agent designed to autonomously perform complex tasks across websites by transforming web interfaces into programmable APIs. The system enables developers to instruct an AI agent to interact with web pages using natural language rather than traditional automation scripts. Instead of writing detailed browser automation code, users can describe the desired task and allow the agent to interpret the page structure, interact with elements, and complete...
    Downloads: 4 This Week
  • 22
    NVIDIA Generative AI Examples

    Generative AI reference workflows

    NVIDIA GenerativeAIExamples is an open-source repository that provides practical reference implementations and example workflows for building generative AI applications using NVIDIA’s software ecosystem. The project is designed to help developers accelerate the development of AI applications by providing ready-to-run pipelines, notebooks, and tools that demonstrate how to integrate large language models into real-world systems. The repository includes examples covering topics such as...
    Downloads: 6 This Week
  • 23
    UFO³

    Weaving the Digital Agent Galaxy

    UFO is an open-source framework developed by Microsoft for building intelligent agents that automate interactions with graphical user interfaces on the Windows operating system. The system allows users to issue natural language instructions that are translated into automated actions across multiple desktop applications. Using a dual-agent architecture, the framework analyzes both visual interface elements and system control structures in order to understand how applications should be...
    Downloads: 2 This Week
  • 24
    JamAI Base

    The collaborative spreadsheet for AI

    JamAI Base is an open-source backend platform designed to simplify the development of retrieval-augmented generation systems and AI-driven applications. The platform integrates both a relational database and a vector database into a single embedded architecture, allowing developers to store structured data alongside semantic embeddings. It includes built-in orchestration for large language models, vector search, and reranking pipelines so that AI applications can retrieve relevant...
    Downloads: 5 This Week
  • 25
    II Agent

    A new open-source framework to build and deploy intelligent agents

    II-Agent is an open-source intelligent assistant framework designed to automate complex workflows across multiple domains using large language models and external tools. The platform allows users to interact with multiple AI models within a single environment while connecting those models to external services and knowledge sources. Through a unified interface, users can switch between models, access specialized tools, and execute tasks that require information retrieval, code execution, or...
    Downloads: 3 This Week