Showing 39 open source projects for "big data"

  • 1
    pandas

    Fast, flexible and powerful Python data analysis toolkit

    pandas is a Python data analysis library that provides high-performance, user-friendly data structures and data analysis tools for the Python programming language. It enables you to carry out entire data analysis workflows in Python without having to switch to a more domain-specific language. With pandas, performance, productivity and collaboration in doing data analysis in Python can significantly increase. pandas is continuously being developed to be a fundamental high-level building...
    Downloads: 129 This Week
  • 2
    marimo

    A reactive notebook for Python

    marimo is an open-source reactive notebook for Python: reproducible, git-friendly, executable as a script, and shareable as an app. marimo notebooks are reproducible, extremely interactive, designed for collaboration (git-friendly!), deployable as scripts or apps, and fit for the modern Pythonista. Run one cell and marimo reacts by automatically running affected cells, eliminating the error-prone chore of managing the notebook state. marimo's reactive UI elements, like data frame GUIs and plots,...
    Downloads: 3 This Week
  • 3
    LlamaIndex

    Central interface to connect your LLMs with external data

    LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLMs with external data. LlamaIndex is a simple, flexible interface between your external data and LLMs. It provides the following tools in an easy-to-use fashion. Provides indices over your unstructured and structured data for use with LLMs. These indices help to abstract away common boilerplate and pain points for in-context learning. Dealing with prompt limitations (e.g. 4096 tokens for Davinci) when the context is too big. ...
    Downloads: 4 This Week
  • 4
    Dask

    Parallel computing with task scheduling

    ...It integrates with familiar tools like NumPy, Pandas, and scikit-learn while enabling execution across cores or nodes with minimal code changes. Dask excels at handling large datasets that don’t fit into memory and is widely used in data science, machine learning, and big data pipelines.
    Downloads: 4 This Week
  • 5
    Modin

    Scale your Pandas workflows by changing a single line of code

    Scale your pandas workflow by changing a single line of code. Modin uses Ray, Dask or Unidist to provide an effortless way to speed up your pandas notebooks, scripts, and libraries. Unlike other distributed DataFrame libraries, Modin provides seamless integration and compatibility with existing pandas code. Even using the DataFrame constructor is identical. It is not necessary to know in advance the available hardware resources in order to use Modin. Additionally, it is not necessary to...
    Downloads: 0 This Week
  • 6
    Kinto

    A generic JSON document store with sharing and synchronisation options

    ...Kinto is used at Mozilla and released under the Apache v2 license. It’s hard for frontend developers to respect users' privacy when building applications that work offline, store data remotely and synchronize across devices. Existing solutions either rely on big corporations that crave user data or require a non-trivial amount of time and expertise to set up a new server for every new project. We want to help developers focus on the front, and we don’t want the challenge of storing user data to get in their way. ...
    Downloads: 6 This Week
  • 7
    airda

    airda (Air Data Agent)

    airda (Air Data Agent) is a multi-agent system for data analysis. It can understand data development and data analysis requirements, interpret the data, and generate SQL and Python code for tasks such as data queries, data visualization, and machine learning.
    Downloads: 2 This Week
  • 8
    Lithops

    A multi-cloud framework for big data analytics

    Lithops is an open-source serverless computing framework that enables transparent execution of Python functions across multiple cloud providers and on-prem infrastructure. It abstracts cloud providers like IBM Cloud, AWS, Azure, and Google Cloud into a unified interface and turns your Python functions into scalable, event-driven workloads. Lithops is ideal for data processing, ML inference, and embarrassingly parallel workloads, giving you the power of FaaS (Function-as-a-Service) without...
    Downloads: 3 This Week
  • 9
    Streamlink

    Streamlink is a CLI utility which pipes video streams

    ...The main purpose of Streamlink is to avoid resource-heavy and unoptimized websites, while still allowing the user to enjoy various streamed content. There is also an API available for developers who want access to the stream data. Streamlink is built upon a plugin system that allows support for new services to be easily added. Most of the big streaming services are supported. Streamlink is made up of two parts, a CLI and a library API. See their respective sections for more information on how to use them. The default behavior of Streamlink is to play back streams in the VLC player. ...
    Downloads: 9 This Week
  • 10
    CleanVision

    Automatically find issues in image datasets

    CleanVision automatically detects potential issues in image datasets like images that are: blurry, under/over-exposed, (near) duplicates, etc. This data-centric AI package is a quick first step for any computer vision project to find problems in the dataset, which you want to address before applying machine learning. CleanVision is super simple -- run the same couple lines of Python code to audit any image dataset! The quality of machine learning models hinges on the quality of the data used to train them, but it is hard to manually identify all of the low-quality data in a big dataset. ...
    Downloads: 1 This Week
  • 11
    ChatGLM2-6B

    ChatGLM2-6B: An Open Bilingual Chat LLM

    ChatGLM2-6B is the second-generation Chinese-English conversational LLM from ZhipuAI/Tsinghua. It upgrades the base model with GLM's hybrid pretraining objective, 1.4T tokens of bilingual data, and preference alignment, delivering big gains on MMLU, CEval, GSM8K, and BBH. The context window extends up to 32K (FlashAttention), and Multi-Query Attention improves speed and memory use. The repo includes Python APIs, CLI and web demos, OpenAI-style/FastAPI servers, and quantized checkpoints for lightweight local deployment on GPUs or CPU/MPS.
    Downloads: 0 This Week
  • 12
    All-in-RAG

    Big Model Application Development Practice 1

    All-in-RAG is an open-source educational project designed to teach developers how to build applications using retrieval-augmented generation techniques. The repository provides a structured learning path that covers both theoretical foundations and practical implementation steps for RAG systems. It explains the full development pipeline required to create knowledge-aware AI assistants, including data preparation, document indexing, vector embedding generation, and retrieval strategies. The...
    Downloads: 0 This Week
  • 13
    zpdf

    Zero-copy PDF text extraction library written in Zig

    zpdf is a high-performance PDF text extraction library written in Zig that focuses on speed, low overhead, and modern parsing techniques. It leans heavily on memory-mapped file reading and zero-copy patterns where possible, so it can scan large PDFs without repeatedly copying data around in memory. The library supports streaming extraction using efficient arena allocation, making it well suited for workloads that need to process big documents quickly or in batches. It implements multiple PDF decompression filters and handles common font encoding pathways, which are essential for turning raw PDF content streams into readable text. ...
    Downloads: 0 This Week
  • 14
    gravitino

    Unified metadata lake for data & AI assets.

    Apache Gravitino is a high-performance, geo-distributed, and federated metadata lake. It manages metadata directly in different sources, types, and regions, providing users with unified metadata access for data and AI assets.
    Downloads: 5 This Week
  • 15
    Old File Delete

    Clean up old files with a single click.

    OldFileDelete (OFD) is a lightweight and efficient utility designed for those who value minimalism and order. The app helps you instantly clear selected folders of accumulated digital clutter. Featuring a modern flat design, the interface is intuitive: simply select a folder, specify the number of days, and the program will find and remove outdated files. No complex settings—just cleanliness and speed.
    Downloads: 1 This Week
  • 16
    Make sure to download from the link below and not the big giant button. I'm not sure how to fix that, so if you know!
    Downloads: 16 This Week
  • 17
    IPyPlot

    Fast and efficient plotting of images inside Python Notebooks

    IPyPlot is a small Python package offering fast and efficient plotting of images inside Python Notebooks. It uses IPython with HTML for a faster, richer, and more interactive way of displaying large numbers of images.
    Downloads: 0 This Week
  • 18
    Arctic TimeSeries and Tick store

    High performance datastore for time series and tick data

    Arctic is a time series/DataFrame database that sits atop MongoDB. It serializes a number of data types (e.g. pandas DataFrames, NumPy arrays, and Python objects via pickling) for storage in the MongoDB document model, so you don't have to handle different data types manually. It uses LZ4 compression by default on the client side to get big savings on network and disk usage. It lets you version different stages of an object and snapshot the state (in some ways similar to git), so you can experiment freely and then revert to the snapshot. ...
    Downloads: 1 This Week
  • 19
    ufonet

    UFONet - Denial of Service Toolkit

    UFONet is a set of hacktivist tools that allow launching coordinated DDoS and DoS attacks and combining both in a single offensive. It also works as an encrypted DarkNET to publish and receive content by creating a global client/server network based on a direct-connect P2P architecture. FAQ: https://ufonet.03c8.net/FAQ.html. UFONet-v1.8 [DPh] "DarK-PhAnT0m!" (.zip), md5 = [ c8ab016f6370c8391e2e6f9a7cbe990a ]; UFONet-v1.8...
    Downloads: 13 This Week
  • 20
    Advanced Trigonometry Calculator

    Precision Trigonometry: Advanced Calculator for Complex Math

    Advanced Trigonometry Calculator is equipped with a user-friendly interface that allows for easy input of problems and instant computation. Professionals such as engineers who need to perform advanced trigonometric calculations in their work will find this tool extremely useful. ATC Online Alpha: https://advantrigoncalc.sourceforge.io/atc/ More info by clicking below: https://advantrigoncalc.sourceforge.io/ Advanced Trigonometry Calculator was only and always only developed by...
    Downloads: 13 This Week
  • 21
    Parveshdhull AutoTyper

    A Data Entry Tool for Windows and Linux

    Sometimes we have to write content in programs where copy-paste is not allowed, such as the data entry software Notepad RT. There are many tools available online, but almost all of them provide only trial versions and require a big payment for continued access. Even if they are free, it is not wise to give any third-party software complete access to the keyboard. So I wrote this short, simple Python script that reads content from a text file and then simulates keyboard typing. ...
    Downloads: 1 This Week
  • 22
    The Art of Programming

    A collection of practical tips can be found at the bottom of this page

    The Art of Programming (Second Edition) is a curated collection of programming problems and solutions originally derived from the Microsoft 100 Interview Questions blog series, later refined into a long-running tutorial and ultimately a published book. Created by July, the series began in 2010 and has since evolved into an in-depth exploration of algorithmic thinking, data structures, and coding interview preparation. The repository brings together 42 classic programming problems from the original series, enhanced with detailed explanations, formula derivations, and optimized solutions. In July 2023, work on the second edition was announced, which expands the project with updated content, new problems inspired by recent big-tech interviews, and introductions to modern machine learning techniques such as XGBoost, CNNs, RNNs, and LSTMs. ...
    Downloads: 3 This Week
  • 23
    Blend_My_NFTs

    Easily generate thousands of 3D models, images, and animation NFTs

    Blend_My_NFTs is an open-source, free-to-use Blender add-on that enables you to easily generate thousands of 3D Models, Animations, and Images. This add-on's primary purpose is to aid in the creation of large generative 3D NFT collections. It is the first and easiest 3D NFT generator. Blend_My_NFTs was initially developed to create Cozy Place, an NFT collection by This Cozy Studio Inc. Blend_My_NFTs works with Blender 3.2.2 on Windows 10 or macOS Big Sur 11.6. Linux is supported, however we...
    Downloads: 0 This Week
  • 24
    A set of tools (command line and GUI) to provide a complete digital photo workflow for Unixes. EXIF headers are used as the central information repository, so users may change their software at any time without losing any data.
    Downloads: 6 This Week
  • 25
    SparrowRecSys

    A Deep Learning Recommender System

    SparrowRecSys is an open-source deep learning recommendation system framework designed to demonstrate the architecture and implementation of modern industrial-scale recommender systems. The project integrates multiple machine learning models and data processing pipelines to simulate how real-world recommendation platforms operate. It includes components for offline data processing, feature engineering, model training, real-time data updates, and online recommendation services. SparrowRecSys...
    Downloads: 0 This Week