big data free download

Showing 31 open source projects for "big data"


1

Apache HBase

Get random, realtime read/write access to your Big Data

Use Apache HBase™ when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables, billions of rows X millions of columns, atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable, as described in "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS. ...

Downloads: 21 This Week

Last Update: 2025-11-14
2

Vespa

The open big data serving engine

Make AI-driven decisions using your data, in real-time. At any scale, with unbeatable performance. Vespa is a full-featured text search engine and supports both regular text search and fast approximate vector search (ANN). This makes it easy to create high-performing search applications at any scale, whether you want to use traditional techniques or a modern vector-based approach. You can even combine both approaches efficiently in the same query, something no other engine can do....

Downloads: 10 This Week

Last Update: 5 days ago
3

HugeGraph

A graph database that supports more than 100 billion data items

...HugeGraph supports fast import of graphs with more than 10 billion vertices and edges, offers millisecond-level OLTP query capability, and can be integrated into big data platforms like Hadoop or Spark for OLAP analysis. The main scenarios of HugeGraph include correlation search, fraud detection, and knowledge graphs. It supports the Gremlin graph query language and a RESTful API, and also provides commonly used graph algorithm APIs. To help users easily implement various queries and analyses, HugeGraph has a full range of accessory tools and supports distributed storage, data replication, horizontal scaling, and many built-in storage engine backends.

Downloads: 0 This Week

Last Update: 2025-11-28
4

GridDB

GridDB is a next-generation open source database

A cyber-physical system collects a variety of data in physical space (the real world), analyzes and converts it into knowledge in cyberspace, and feeds that knowledge back to the real world to revitalize industry and solve social problems. GridDB is an open database that enables real-time processing of vast amounts of time-series data from physical space, which is necessary to realize a cyber-physical system. Multi-model architecture capable of supporting various data stores...

Downloads: 0 This Week

Last Update: 2026-02-18
5

Blue Whale Configuration Platform

Blue Whale smart cloud configuration platform

The platform draws on experience supporting hundreds of Tencent businesses, is compatible with various complex system architectures, and was born out of operation and maintenance work. From configuration management to job execution, task scheduling, and monitoring with self-healing, through to operation and maintenance big data analysis that assists operational decision-making, it covers full-cycle assurance management of business operations. The open PaaS has a powerful development framework and scheduling engine, as well as a complete operation and maintenance development training system, which helps operation and maintenance teams transform and upgrade rapidly. ...

Downloads: 1 This Week

Last Update: 2025-05-30
6

Nebula Graph

A distributed, fast open-source graph database

The graph database built for super large-scale graphs with milliseconds of latency. Optimized SUBGRAPH and FIND PATH for better performance. Optimized query paths to reduce redundant paths and time complexity. Optimized the method to get properties for better performance of MATCH statements. Nebula Graph adopts the Apache 2.0 license, one of the most permissive free software licenses in the world. Free as in freedom, because, under the Apache 2.0 license, you can use, copy, modify and...

Downloads: 0 This Week

Last Update: 2024-05-17
7

Magda

A federated data catalog for all your big and small data

Magda is an open-source data catalog system designed to make datasets easier to find, access, and use. Built for government and enterprise use, it supports harvesting metadata from multiple sources, managing data access policies, and integrating with data APIs. Magda is highly customizable and ideal for building open data portals or internal data discovery tools.

Downloads: 4 This Week

Last Update: 2025-12-24
8

Cloudberry

An advanced and mature open-source MPP database

Apache Cloudberry is an open-source massively parallel processing (MPP) database for large-scale analytics and data warehousing. It is derived from Greenplum Database and built on a PostgreSQL kernel, and it distributes data and query execution across multiple segment nodes.

Downloads: 8 This Week

Last Update: 7 days ago
9

Apache Iceberg

Iceberg is a high-performance format for huge analytic tables. Iceberg brings the reliability and simplicity of SQL tables to big data while making it possible for engines like Spark, Trino, Flink, Presto, Hive, and Impala to safely work with the same tables, at the same time. The core Java library that tracks table snapshots and metadata is complete, but still evolving. Current work is focused on adding row-level deletes and upserts, and integration work with new engines like Flink and Hive. ...

Downloads: 2 This Week

Last Update: 2025-12-22
10

Redash

Connect to any data source, easily visualize and share your data

...It lets you create big, beautiful, and easy-to-digest visualizations on dashboards for better decision-making. Redash supports a multitude of SQL and NoSQL data sources, and can be extended to support even more. Best of all, it’s open source, so you can customize and add features to suit your organization’s needs perfectly.

Downloads: 11 This Week

Last Update: 2026-03-02
11

TimescaleDB

An open-source time-series SQL database optimized for fast ingest

TimescaleDB is the open-source relational database for time-series and analytics. Build powerful data-intensive applications. Become instantly productive with full SQL. Rely on the same PostgreSQL you know, love, and trust. Hyperfunctions make time series easier. Achieve 10-100x faster queries than with vanilla PostgreSQL, InfluxDB, or MongoDB. Write millions of data points per second per node. Horizontally scale to petabytes. Don’t worry about cardinality. Simplify your stack, ask more complex...

Downloads: 60 This Week

Last Update: 5 days ago
12

manticoresearch

Easy to use open source fast database for search

...A modern MPP architecture and smart query parallelization let you fully utilize all your CPU cores to lower response time as much as possible when needed. Powerful and fast full-text search works well for both small and big datasets. Columnar storage support via the Manticore Columnar Library handles bigger datasets (much bigger than can fit in RAM). SQL-first: Manticore's native syntax is SQL. It speaks SQL over HTTP and uses the MySQL protocol (you can use your preferred MySQL client). JSON over HTTP: to provide a more programmatic way to manage your data and schemas, Manticore provides an HTTP JSON protocol. ...

Downloads: 0 This Week

Last Update: 2026-03-27
13

Arctic TimeSeries and Tick store

High performance datastore for time series and tick data

Arctic is a timeseries/dataframe database that sits atop MongoDB. It serializes a number of data types for storage in the Mongo document model, e.g. Pandas DataFrames, NumPy arrays, and Python objects via pickling, so you don't have to handle different datatypes manually. It uses LZ4 compression by default on the client side to get big savings on network and disk usage. It lets you version different stages of an object and snapshot the state (in some ways similar to git), so you can freely experiment and then revert to the snapshot. ...

Downloads: 1 This Week

Last Update: 2024-02-01
14

SnappyData

Memory optimized analytics database, based on Apache Spark

...SnappyData delivers high throughput, low latency, and high concurrency for a unified analytics workload. By fusing an in-memory hybrid database inside Apache Spark, it provides analytic query processing, mutability/transactions, access to virtually all big data sources, and stream processing, all in one unified cluster. One common use case for SnappyData is to provide analytics at interactive speeds over large volumes of data with minimal or no pre-processing of the dataset. For instance, there is often no need to pre-aggregate/reduce or generate cubes over your large data sets for ad-hoc visual analytics. ...

Downloads: 1 This Week

Last Update: 2024-10-15
15

QuickRedis

QuickRedis is a free-forever Redis GUI tool

QuickRedis is a free-forever Redis desktop manager. It supports direct connection, sentinel, and cluster modes, supports multiple languages, handles hundreds of millions of keys, and has an amazing UI. It runs on Windows, Mac OS X, and Linux.

2 Reviews

Downloads: 36 This Week

Last Update: 2022-07-03
16

TensorBase

TensorBase is a new big data warehouse built with modern efforts

...TensorBase is clearly opposed to forking communities, reinventing wheels, or hacking traffic for so-called reputation (like GitHub stars). After some thought, we decided to temporarily leave the general data warehousing field. If you want to learn how a database system can be built, how to apply modern Rust to the high-performance field, or how to embed a lightweight data analysis system into your own larger one, you can still try, ask about, or contribute to TensorBase. The committers are still around the community. We will help you with all kinds of interesting things pursued in the project by us, and maybe by you. ...

Downloads: 0 This Week

Last Update: 2022-07-25
17

Open Source Data Quality and Profiling

World's first open source data quality & data preparation project

...It also had Hadoop (big data) support to move files to and from a Hadoop grid and to create, load, and profile Hive tables. This project is also known as "Aggregate Profiler". A RESTful API for this project is being built (beta version) at https://sourceforge.net/projects/restful-api-for-osdq/ and Apache Spark based data quality is being built at https://sourceforge.net/projects/apache-spark-osdq/

8 Reviews

Downloads: 0 This Week

Last Update: 2021-01-20
18

MyCAT

Active, high-performance open source database middleware

...Regarded as an enterprise-grade MySQL cluster, MyCAT can take the place of an expensive Oracle cluster. MyCAT is also a new type of database that resembles a SQL Server integrated with memory cache technology, NoSQL technology, and HDFS big data. As a modern enterprise database product, MyCAT combines the traditional database with the new distributed data warehouse. In short, MyCAT is a new kind of database middleware. MyCAT's objective is to smoothly migrate existing stand-alone databases and applications to the cloud at low cost and to solve the bottlenecks caused by the rapid growth of data storage and business scale.

Downloads: 4 This Week

Last Update: 2021-06-28
19

FastoRedis

Cross-platform open source Redis DB management tool

FastoRedis (a fork of FastoNoSQL) is a cross-platform open source Redis management tool (i.e. an admin GUI). It uses the same engine that powers Redis's redis-cli shell, so everything you can write in the redis-cli shell you can also write in FastoRedis. The program works on most Linux systems, as well as on Windows, Mac OS X, FreeBSD, and Android platforms, on desktops and embedded devices.

Downloads: 3 This Week

Last Update: 2019-10-25
20

FastoNoSQL

FastoNoSQL is a GUI platform for NoSQL databases.

GUI management and admin tool for Redis, Memcached, SSDB, LevelDB, RocksDB, UnQLite, LMDB, UpscaleDB, and ForestDB.

Downloads: 18 This Week

Last Update: 2019-06-19
21

Redis Desktop Manager

Cross-platform GUI management tool for Redis

Redis Desktop Manager is a fast, open source Redis database management application based on Qt 5. It's available for Windows, Linux, and macOS and offers an easy-to-use GUI to access your Redis DB. With Redis Desktop Manager you can perform basic operations such as viewing keys as a tree, running CRUD operations on keys, and executing commands via a shell. It also supports SSL/TLS encryption, SSH tunnels, and cloud Redis instances such as Amazon ElastiCache, Microsoft Azure Redis Cache, and Redis Labs.

1 Review

Downloads: 0 This Week

Last Update: 2018-10-11
22

Cosmos DB Spark

Apache Spark Connector for Azure Cosmos DB

...It also allows you to easily create a lambda architecture for batch-processing, stream-processing, and a serving layer while being globally replicated and minimizing the latency involved in working with big data.

Downloads: 0 This Week

Last Update: 2023-12-21
23

Voldemort

A distributed key-value storage system

Voldemort is a distributed database that’s an open source clone of Amazon’s Dynamo. It automatically replicates data over multiple servers and automatically partitions it so that each server contains only a subset of the total data. It offers many other features, such as pluggable serialization support, data item versioning, and an SSD-optimized read-write storage engine. Voldemort is not a relational database or an object database. It is essentially a big, distributed, persistent, fault-tolerant hash table. ...

Downloads: 1 This Week

Last Update: 2020-07-16
24

garlic

GaRLiC=Gambas Raw Lines of Code, a Gambas2 code library for use in SMB

Building blocks, small demo apps, and pieces of sample code for Gambas2. Also see the more recent Garlic3 project at https://sourceforge.net/projects/garlic3. Everything here is meant to show how things can be done in Gambas, not to supply a product or a finished program. The source code can easily be run from Gambas2 on Linux. Some small programs that appear here are in real use in a small business, but no warranty is given! SimpleDemo contains more complete apps that can be...

Downloads: 1 This Week

Last Update: 2020-11-24
25

Relation Tags

Source code for using Relation Tags.

...Please read the "readme" file. A binary matrix class like BinMatrix is recommended in order to compute implicit relations quickly enough in a system of bogus tags with big data. The code needs to be compiled with C++11 and the Qt libraries.

Downloads: 0 This Week

Last Update: 2015-08-11