Showing 217 open source projects for "sql data generator"

View related business solutions
  • The fastest way to host, scale and get paid on WordPress Icon
    The fastest way to host, scale and get paid on WordPress

    For developers searching for a web hosting solution

    Lightning-fast hosting, AI-assisted site management, and enterprise payments all in one platform designed for agencies and growth-focused businesses.
    Learn More
  • AI-Powered Identity Governance Icon
    AI-Powered Identity Governance

    For IT Teams and MSPs in need of a solution to simplify, optimize and secure their SaaS, file, and device management operations

    Define governance policies, manage access, and optimize licenses with unified visibility across every identity, app, and file.
    Learn More
  • 1
    Synthetic Data Generator

    Synthetic Data Generator

    SDG is a specialized framework

    Synthetic Data Generator is an open-source framework designed to generate high-quality synthetic tabular datasets that replicate the statistical characteristics of real data while avoiding privacy risks. The platform enables developers and data scientists to create artificial datasets that preserve important relationships between variables without containing sensitive personal information.
    Downloads: 13 This Week
    Last Update:
    See Project
  • 2
    SQL Explorer

    SQL Explorer

    Easily share data across your company via SQL queries

    SQL Explorer aims to make the flow of data between people fast, simple, and confusion-free. It is a Django-based application that you can add to an existing Django site, or use as a standalone business intelligence tool. Quickly write and share SQL queries in a simple, usable SQL editor, preview the results in the browser, share links, download CSV, JSON, or Excel files (and even expose queries as API endpoints, if desired), and keep the information flowing! ...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 3
    Dash Data Agent

    Dash Data Agent

    Self-learning data agent that grounds its answers in layers of content

    Dash is a self-learning data agent built by the Agno AI community that generates grounded answers to English queries over structured data by synthesizing SQL and reasoning based on six layers of context, improving automatically with each run. It sidesteps common limitations of simple text-to-SQL agents by incorporating multiple context layers — including schema structure, human annotations, known query patterns, institutional knowledge from docs, machine-discovered error patterns, and live runtime context — to generate SQL queries that are both technically correct and semantically meaningful. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Hello SQL

    Hello SQL

    Spanish-language course repository that teaches fundamentals of SQL

    ...The materials emphasize real-world query writing, schema design basics, and the mental model behind SELECT, JOIN, GROUP BY, and subqueries. Learners progress from setup and connection to hands-on exercises that build confidence with CRUD operations and data modeling. The repository’s structure favors incremental learning, with clear folders, references, and exercises you can run locally. It targets absolute beginners as well as developers from other stacks who want a clean, project-based path into SQL.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Stigg | SaaS Monetization and Entitlements API Icon
    Stigg | SaaS Monetization and Entitlements API

    For developers in need of a tool to launch pricing plans faster and build better buying experiences

    A monetization platform is a standalone middleware that sits between your application and your business applications, as part of the modern enterprise billing stack. Stigg unifies all the APIs and abstractions billing and platform engineers had to build and maintain in-house otherwise. Acting as your centralized source of truth, with a highly scalable and flexible entitlements management, rolling out any pricing and packaging change is now a self-service, risk-free, exercise.
    Learn More
  • 5
    Data Science Interviews

    Data Science Interviews

    Data science interview questions and answers

    Data Science Interviews is an open-source repository that collects common data science interview questions along with community-provided answers and explanations. The project serves as a preparation resource for students, job seekers, and professionals who want to review the technical knowledge required for data science roles. The repository organizes questions into different categories including theoretical machine learning concepts, technical programming questions, and probability or statistics problems. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    AI Data Science Team

    AI Data Science Team

    An AI-powered data science team of agents

    ...The project includes ready-to-use applications that showcase these agents in action, such as an exploratory data analysis copilot that generates reports, a pandas data analyst that combines wrangling and plotting, and SQL database agents that can query business databases and output results directly.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    cracking-the-data-science-interview

    cracking-the-data-science-interview

    A Collection of Cheatsheets, Books, Questions, and Portfolio

    Cracking the Data Science Interview is an open educational repository that collects study materials, resources, and reference links for preparing for data science interviews. The project organizes content across many fundamental areas of data science, including statistics, probability, SQL, machine learning, and deep learning. It includes cheat sheets that summarize important technical concepts commonly discussed during technical interviews.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    Mimesis

    Mimesis

    High-performance fake data generator for Python

    Mimesis is an open source high-performance fake data generator for Python, able to provide data for various purposes in various languages. It's currently the fastest fake data generator for Python, and supports many different data providers that can produce data related to people, food, transportation, internet and many more. Mimesis is really easy to use, with everything you need just an import away.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    Pony ORM

    Pony ORM

    Pony Object Relational Mapper

    Pony ORM is a Python ORM that enables developers to write database queries using generator expressions and Pythonic syntax, making code more readable and intuitive. It automatically translates Python expressions into SQL and supports multiple databases including SQLite, MySQL, PostgreSQL, and Oracle. With an emphasis on simplicity and maintainability, Pony ORM is suitable for both small projects and complex applications.
    Downloads: 5 This Week
    Last Update:
    See Project
  • Cloud-hosted construction project information management for improved communication, and increased efficiency. Icon
    Cloud-hosted construction project information management for improved communication, and increased efficiency.

    Ideal for on-premise project information management.

    Newforma empowers over 4M professionals and 1,500 AECO firms worldwide by revolutionizing Project Information Management. We transform vast amounts of project data into a meticulously organized, easily accessible, and fully searchable resource—all from a single, centralized platform. From pre-construction to years after completion, Newforma ensures you have the critical information you need at every stage of your projects.
    Learn More
  • 10
    Mage.ai

    Mage.ai

    Build, run, and manage data pipelines for integrating data

    Open-source data pipeline tool for transforming and integrating data. The modern replacement for Airflow. Effortlessly integrate and synchronize data from 3rd party sources. Build real-time and batch pipelines to transform data using Python, SQL, and R. Run, monitor, and orchestrate thousands of pipelines without losing sleep. Have you met anyone who said they loved developing in Airflow?
    Downloads: 6 This Week
    Last Update:
    See Project
  • 11
    AI Powered Knowledge Graph Generator

    AI Powered Knowledge Graph Generator

    AI Powered Knowledge Graph Generator

    AI-Powered Knowledge Graph is an open-source project focused on building knowledge graph systems that integrate artificial intelligence and machine learning to represent complex relationships between data entities. Knowledge graphs organize information as networks of nodes and relationships, allowing applications to analyze connections between concepts, datasets, or real-world entities. By incorporating AI techniques such as natural language processing and semantic reasoning, the project...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    SQLAlchemy

    SQLAlchemy

    The Database Toolkit for Python

    SQLAlchemy is the Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL. SQLAlchemy provides a full suite of well known enterprise-level persistence patterns, designed for efficient and high-performing database access, adapted into a simple and Pythonic domain language. An industrial strength ORM, built from the core on the identity map, unit of work, and data mapper patterns.
    Downloads: 145 This Week
    Last Update:
    See Project
  • 13
    Databend

    Databend

    Cloud-native open source data warehouse for analytics and AI queries

    ...This architecture enables cost-efficient storage and elastic scaling for workloads that involve large datasets and complex queries. Databend provides a unified engine capable of handling analytics, vector search, and full-text search within a single platform. Databend supports SQL-based workflows and enables real-time data ingestion, transformation, and analysis through streaming and task orchestration features. With its cloud-native design and distributed architecture, Databend can run both as a self-hosted system or within managed environments to power data analytics, AI workloads, and large-scale data.
    Downloads: 15 This Week
    Last Update:
    See Project
  • 14
    Vanna

    Vanna

    Chat with your SQL database

    Vanna.AI is an AI-powered tool for natural language database querying, enabling users to interact with databases using simple English queries. It converts natural language questions into SQL queries, making data access more intuitive for non-technical users.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 15
    Grafana

    Grafana

    Leading open-source visualization and observability platform

    Grafana OSS is the leading open-source platform for visualization and observability. It enables teams to query, visualize, alert on, and explore telemetry data from multiple sources in a single interface. With support for 100+ data source plugins—including Prometheus, Loki, Elasticsearch, InfluxDB, SQL/NoSQL databases, and OpenTelemetry—Grafana helps teams correlate metrics, logs, and traces across applications and infrastructure. Users can build interactive dashboards with rich visualizations, template variables, and reusable panels to monitor systems and troubleshoot issues in real time. ...
    Downloads: 28 This Week
    Last Update:
    See Project
  • 16
    Bracket

    Bracket

    Selfhosted tournament system

    Bracket is an open-source tool that tracks and manages data access across your PostgreSQL database. It provides visibility into which parts of your codebase are accessing which tables and columns, enabling data governance, security auditing, and architectural insights. Bracket is particularly helpful for growing teams needing better observability in complex applications.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 17
    MindsDB

    MindsDB

    Making Enterprise Data Intelligent and Responsive for AI

    MindsDB is an AI data solution that enables humans, AI, agents, and applications to query data in natural language and SQL, and get highly accurate answers across disparate data sources and types. MindsDB connects to diverse data sources and applications, and unifies petabyte-scale structured and unstructured data. Powered by an industry-first cognitive engine that can operate anywhere (on-prem, VPC, serverless), it empowers both humans and AI with highly informed decision-making capabilities. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 18
    SQLModel

    SQLModel

    SQL databases in Python, designed for simplicity, compatibility

    SQLModel, SQL databases in Python, designed for simplicity, compatibility, and robustness. SQLModel is a library for interacting with SQL databases from Python code, with Python objects. It is designed to be intuitive, easy to use, highly compatible, and robust. SQLModel is based on Python-type annotations, and powered by Pydantic and SQLAlchemy.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 19
    Liger Kernel

    Liger Kernel

    Efficient Triton Kernels for LLM Training

    Liger Kernel is a unified kernel developed by LinkedIn to streamline data science and machine learning workflows across different languages and tools. It provides a consistent interface for running code in various languages (such as Python, R, SQL) within a single Jupyter-like environment, enhancing productivity and collaboration for data scientists working in mixed-language projects.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 20
    Xiyan MCP Server

    Xiyan MCP Server

    A Model Context Protocol (MCP) server

    The XiYan MCP Server is a Model Context Protocol (MCP) server that enables natural language queries to databases, powered by XiYan-SQL, a state-of-the-art text-to-SQL model. It allows users to interact with databases using conversational language, simplifying data retrieval processes. ​
    Downloads: 6 This Week
    Last Update:
    See Project
  • 21
    Vanna 2.0

    Vanna 2.0

    Chat with your SQL database

    Vanna is an open-source Python framework that enables natural language interaction with databases by converting user questions into executable SQL queries using large language models. The framework uses a retrieval-augmented generation architecture that learns from database schemas, documentation, and past query examples to generate accurate queries tailored to a specific dataset. Vanna can be integrated into many environments, including notebooks, web applications, messaging platforms, and data dashboards, making it flexible for analytics and data exploration workflows. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 22
    LlamaParse

    LlamaParse

    Parse files for optimal RAG

    LlamaParse is a GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). Load in 160+ data sources and data formats, from unstructured, and semi-structured, to structured data (API's, PDFs, documents, SQL, etc.) Store and index your data for different use cases. Integrate with 40+ vector stores, document stores, graph stores, and SQL db providers.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 23
    MySQL MCP Server

    MySQL MCP Server

    A Model Context Protocol (MCP) server that enables secure interaction

    The MySQL MCP Server enables secure interaction with MySQL databases, allowing AI assistants to list tables, read data, and execute SQL queries through a controlled interface. It is designed for integration with AI applications like Claude Desktop and should not be run as a standalone Python program. ​
    Downloads: 5 This Week
    Last Update:
    See Project
  • 24
    DeerFlow

    DeerFlow

    Deep Research framework, combining language models with tools

    DeerFlow is an open-source, community-driven “deep research” framework / multi-agent orchestration platform developed by ByteDance. It aims to combine the reasoning power of large language models (LLMs) with automated tool-use — such as web search, web crawling, Python execution, and data processing — to enable complex, end-to-end research workflows. Instead of a monolithic AI assistant, DeerFlow defines multiple specialized agents (e.g. “planner,” “searcher,” “coder,” “report generator”) that collaborate in a structured workflow, allowing tasks like literature reviews, data gathering, data analysis, code execution, and final report generation to be largely automated. ...
    Downloads: 72 This Week
    Last Update:
    See Project
  • 25
    Faker for Python

    Faker for Python

    Python package that generates fake data for you

    Faker is a Python package that generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, fill-in your persistence to stress test it, or anonymize data taken from a production service, Faker is for you. Starting from version 4.0.0, Faker dropped support for Python 2 and from version 5.0.0 only supports Python 3.6 and above. If you still need Python 2 compatibility, please install version 3.0.1 in the meantime, and please consider updating...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
MongoDB Logo MongoDB