Showing 74 open source projects for "big data"

  • 1
    data.table

    Extends base R’s data.frame for high-performance data manipulation

    data.table is an R package that extends base R’s data.frame for high-performance data manipulation. It offers concise syntax, high speed, and memory-efficient operations. It supports fast file reading/writing, joins, grouping, reshaping, and updates by reference. It is heavily used in large data workflows, big data in R, production pipelines, etc. Grouping, aggregation, and summarization are extremely efficient, and it can handle very large datasets (hundreds of millions to billions of rows) in memory, if enough memory is available. ...
    Downloads: 0 This Week
  • 2
    FinMind

    Open data: more than 50 kinds of financial data

    In the era of big data, data is the foundation of everything. We collect more than 50 kinds of Taiwan stock-related information and provide download, online analysis, and backtesting. Whatever program you use, you can download data through the API provided by FinMind, or directly from the website. Once the data is available, you can perform statistical analysis, regression analysis, time series analysis, machine learning, and deep learning. ...
    Downloads: 5 This Week
  • 3
    JuiceFS

    JuiceFS is a distributed POSIX file system built on top of Redis

    ...Whether it's a public cloud, private cloud, or hybrid cloud, JuiceFS is available on any cloud of your choice and delivers flexibility, availability, scalability and strong consistency for your data-intensive applications. Purpose-built to serve big data scenarios such as self-driving model training, recommendation engines, and next-generation gene sequencing, JuiceFS specializes in high performance and easier management of tens of billions of files. We bring JuiceFS to developers with the hope that it will be easy to use, reliable, and high-performance, and that it will solve all your file storage problems in a cloud environment.
    Downloads: 34 This Week
  • 4
    XCharts

    A charting and data visualization library for Unity

    A powerful, easy-to-use, parameter-configurable charting and data visualization plug-in for Unity, built on UGUI. It offers visual configuration of parameters, real-time preview of effects, and pure code drawing without additional resources. It supports ten built-in charts such as line chart, column chart, pie chart, radar...
    Downloads: 12 This Week
  • 5
    Apache Bigtop

    Bigtop is an Apache Foundation project for Infrastructure Engineers

    Apache Bigtop is a project focused on building and packaging the Hadoop ecosystem and related big data components. It provides a consistent framework for testing, packaging, and deploying Hadoop distributions, including tools like HDFS, YARN, Spark, Hive, HBase, and more. By maintaining cross-platform builds (RPMs, DEBs, Docker images, and Kubernetes support), Bigtop makes it easier for organizations to deploy big data stacks in different environments. ...
    Downloads: 4 This Week
  • 6
    Vespa

    The open big data serving engine

    Make AI-driven decisions using your data, in real-time. At any scale, with unbeatable performance. Vespa is a full-featured text search engine and supports both regular text search and fast approximate vector search (ANN). This makes it easy to create high-performing search applications at any scale, whether you want to use traditional techniques or a modern vector-based approach. You can even combine both approaches efficiently in the same query, something no other engine can do....
    Downloads: 9 This Week
  • 7
    Apache InLong

    Apache InLong - a one-stop integration framework for massive data

    ...InLong was originally built at Tencent, where it has served online businesses for more than 8 years, supporting massive data reporting services (more than 80 trillion pieces of data per day) in big data scenarios. The platform integrates 5 modules: Ingestion, Convergence, Caching, Sorting, and Management, so that the business only needs to provide data sources, data service quality, data landing clusters, and data landing formats.
    Downloads: 0 This Week
  • 8
    Blue Whale Configuration Platform

    Blue Whale smart cloud configuration platform

    The platform draws on experience supporting hundreds of Tencent businesses and is compatible with a variety of complex system architectures. Born out of operations and maintenance work, it covers the full cycle of business operations: from configuration management to job execution, task scheduling, and monitoring with self-healing, through to big data analysis of operations and maintenance that assists operational decision-making. The open PaaS has a powerful development framework and scheduling engine, as well as a complete operations and maintenance development training system, which helps teams transform and upgrade their operations and maintenance quickly. ...
    Downloads: 1 This Week
  • 9
    .NET for Apache Spark

    A free, open-source, and cross-platform big data analytics framework

    .NET for Apache Spark provides high-performance APIs for using Apache Spark from C# and F#. With these .NET APIs, you can access the most popular DataFrame and Spark SQL aspects of Apache Spark, for working with structured data, and Spark Structured Streaming, for working with streaming data. .NET for Apache Spark is compliant with .NET Standard, a formal specification of .NET APIs that are common across .NET implementations. This means you can use .NET for Apache Spark anywhere you write...
    Downloads: 2 This Week
  • 10
    ElasticJob

    Distributed scheduled job framework

    ElasticJob is a distributed scheduling solution consisting of two separate projects, ElasticJob-Lite and ElasticJob-Cloud. ElasticJob-Lite is a lightweight, decentralized solution that provides distributed task sharding services. ElasticJob-Cloud uses Mesos to manage and isolate resources. Both projects share a unified job API, so developers only need to write code once and can deploy at will. It supports job sharding and high availability in distributed systems. Scale out for throughput and...
    Downloads: 1 This Week
  • 11
    Fishing Funds

    Status bar app displaying funds, market indices, stocks, and virtual currencies

    Display real-time trends of Chinese funds in the menubar. A small status bar application that displays funds, market indices, stocks, and virtual currencies. It is built on Electron, supports macOS, Windows, and Linux clients, and draws its data from Tiantian Fund, Ant Fund, Love Fund, Tencent Securities, Sina Fund, etc. This project draws on electron-react-boilerplate-menubar, which is built on Electron React Boilerplate and menubar.
    Downloads: 8 This Week
  • 12
    testng

    TestNG testing framework

    TestNG is a testing framework inspired by JUnit and NUnit that introduces new functionality to make it more powerful and easier to use. Run your tests in arbitrarily big thread pools with various policies available (all methods in their own thread, one thread per test class, etc.).
    Downloads: 4 This Week
  • 13
    Volcano

    A Cloud Native Batch System (Project under CNCF)

    ...It provides a suite of mechanisms that are commonly required by many classes of batch and elastic workloads, including machine learning/deep learning, bioinformatics/genomics, and other "big data" applications. These types of applications typically run on generalized domain frameworks like TensorFlow, Spark, Ray, PyTorch, MPI, etc., which Volcano integrates with. Volcano builds upon a decade and a half of experience running a wide variety of high-performance workloads at scale using several systems and platforms, combined with best-of-breed ideas and practices from the open-source community. ...
    Downloads: 280 This Week
  • 14
    Bacalhau

    Community-driven, simple, yet powerful framework

    Bacalhau is a decentralized compute platform for running jobs on data stored across distributed networks, like IPFS or Filecoin, without moving the data to centralized cloud environments. It allows developers to run containerized workloads close to where the data lives, reducing latency, cost, and privacy risks. Bacalhau supports various runtime environments and is designed to make decentralized data processing as accessible as traditional cloud computing. It’s especially useful for...
    Downloads: 2 This Week
  • 15
    Apache Spark

    A unified analytics engine for large-scale data processing

    ...With Spark Streaming (microbatches) and Structured Streaming, it delivers low-latency event processing suitable for real-time analytics. The built-in MLlib library provides scalable machine learning algorithms, while GraphX enables graph computations integrated with data pipelines. Spark supports multiple languages—Scala, Java, Python, R—and connects with many storage systems like HDFS, S3, Cassandra, and streaming platforms like Kafka, making it a versatile choice for big data workloads in analytics, ETL, and data science.
    Downloads: 7 This Week
  • 16
    Ptah.sh

    Self-hosted alternative to Heroku

    Ptah.sh is a Fair Source self-hosted deployment platform, an alternative to Heroku/Vercel and other Big Corp software. We believe that indie developers, startups, and small to medium businesses must not suffer from unpredictable billing or bare-metal/VPS configuration. Designed for indie developers and SMBs, Ptah.sh offers the simplest hosting experience.
    Downloads: 3 This Week
  • 17
    applied-ml

    Papers & tech blogs by companies sharing their work on data science

    ...For someone designing—or planning to build—a production ML system, this repo provides patterns, precedents, and lessons learned from firms that operate at big scale.
    Downloads: 0 This Week
  • 18
    CubeFS

    cloud-native file store

    CubeFS is a new-generation cloud-native storage system that supports access protocols such as S3, HDFS, and POSIX, and access is interoperable between protocols. It is widely applicable in scenarios such as big data, AI/LLMs, container platforms, separation of storage and computing for databases and middleware, data sharing and protection, etc. It supports both replica and erasure coding engines, so users can choose flexibly according to business scenarios. ...
    Downloads: 2 This Week
  • 19
    QRCoder

    A pure C# Open Source QR Code implementation

    QRCoder is a simple .NET library, written entirely in C#, which enables you to generate QR codes as defined by ISO/IEC 18004. It has no dependencies on other libraries and is available as .NET Framework and .NET Core PCL versions on NuGet. Feel free to grab or fork the project and make it better! The main target of the QRCoder library is to deliver a small and easy-to-use solution, which has no dependencies...
    Downloads: 18 This Week
  • 20
    MobX

    Simple, scalable state management

    MobX is a battle-tested library that makes state management simple and scalable by transparently applying functional reactive programming (TFRP). Write minimalistic, boilerplate-free code that captures your intent. Trying to update a record field? Use the good old JavaScript assignment. Updating data in an asynchronous process? No special tools are required; the reactivity system will detect all your changes and propagate them out to where they are being used. All changes to and uses of...
    Downloads: 0 This Week
  • 21
    xxHash

    Extremely fast non-cryptographic hash algorithm

    ...It is proposed in four flavors (XXH32, XXH64, XXH3_64bits and XXH3_128bits). The latest variant, XXH3, offers improved performance across the board, especially on small data. It successfully completes the SMHasher test suite which evaluates collision, dispersion and randomness qualities of hash functions. Code is highly portable, and hashes are identical across all platforms (little / big endian). Performance on large data is only one part of the picture. Hashing is also very useful in constructions like hash tables and bloom filters. ...
    Downloads: 1 This Week
  • 22
    Foundatio

    Pluggable foundation blocks for building distributed apps

    Pluggable foundation blocks for building loosely coupled distributed apps. Includes implementations in Redis, Azure, AWS, RabbitMQ and in memory (for development). When building several big cloud applications, we found a lack of great solutions (that's not to say there aren't solutions out there) for many key pieces of building scalable distributed applications while keeping the development experience simple. We wanted to build against abstract interfaces so that we could easily change...
    Downloads: 10 This Week
  • 23
    huihut interview

    A summary of C/C++ technical interview basics

    ...It’s organized to be approachable whether you’re a student preparing for your first internship or an experienced engineer brushing up on fundamentals before a big interview round.
    Downloads: 0 This Week
  • 24
    zpdf

    Zero-copy PDF text extraction library written in Zig

    zpdf is a high-performance PDF text extraction library written in Zig that focuses on speed, low overhead, and modern parsing techniques. It leans heavily on memory-mapped file reading and zero-copy patterns where possible, so it can scan large PDFs without repeatedly copying data around in memory. The library supports streaming extraction using efficient arena allocation, making it well suited for workloads that need to process big documents quickly or in batches. It implements multiple PDF decompression filters and handles common font encoding pathways, which are essential for turning raw PDF content streams into readable text. ...
    Downloads: 0 This Week
  • 25
    Nano Events

    Simple and tiny (107 bytes) event emitter library for JavaScript

    Nano Events is a minimalistic, high-performance event emitter library for JavaScript. Its goal is to provide the simplest possible API to add pub/sub capabilities (emitters and listeners) to any JS object or application, while keeping overhead and bundle size extremely small. Rather than offering many complex features, nanoevents focuses on the core primitives: creating an emitter, subscribing to named events, emitting events with arbitrary data, and unsubscribing. Because of its minimal API...
    Downloads: 0 This Week