Showing 29 open source projects for "python data analysis"

View related business solutions
  • Online Project Management Platform - Zoho Icon
    Online Project Management Platform - Zoho

    A plan put together with small businesses and startups in mind.

    Zoho Projects is a cloud-based project management solution that helps teams plan, track, collaborate, and achieve project goals.
    Learn More
  • More Bookings. Better Experience. Icon
    More Bookings. Better Experience.

    For tour and activity providers

    The all-in-one solution built to help you stay organised and get more bookings with thousands of connections to online travel agencies (OTAs), resellers and suppliers.
    Learn More
  • 1
    pandas

    pandas

    Fast, flexible and powerful Python data analysis toolkit

    pandas is a Python data analysis library that provides high-performance, user friendly data structures and data analysis tools for the Python programming language. It enables you to carry out entire data analysis workflows in Python without having to switch to a more domain specific language. With pandas, performance, productivity and collaboration in doing data analysis in Python can significantly increase. ...
    Downloads: 129 This Week
    Last Update:
    See Project
  • 2
    FinMind

    FinMind

    Open Data, more than 50 financial data

    In the era of big data, data is the foundation of everything. We collect more than 50 kinds of Taiwan stock related information and provide download, online analysis, and backtesting. Regardless of the program, you can download data through the api provided by FinMind, or you can download data directly from the website. After data is available, statistical analysis, regression analysis, time series analysis, machine learning, and deep learning can be performed. ...
    Downloads: 12 This Week
    Last Update:
    See Project
  • 3
    Apache Doris

    Apache Doris

    MPP-based interactive SQL data warehousing for reporting and analysis

    Apache Doris is a modern MPP analytical database product. It can provide sub-second queries and efficient real-time data analysis. With it's distributed architecture, up to 10PB level datasets will be well supported and easy to operate. Apache Doris can meet various data analysis demands, including history data reports, real-time data analysis, interactive data analysis, and exploratory data analysis. Make your data analysis easier! Support standard SQL language, compatible with MySQL protocol. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    marimo

    marimo

    A reactive notebook for Python

    marimo is an open-source reactive notebook for Python, reproducible, git-friendly, executable as a script, and shareable as an app. marimo notebooks are reproducible, extremely interactive, designed for collaboration (git-friendly!), deployable as scripts or apps, and fit for modern Pythonista. Run one cell and marimo reacts by automatically running affected cells, eliminating the error-prone chore of managing the notebook state. marimo's reactive UI elements, like data frame GUIs and plots, make working with data feel refreshingly fast, futuristic, and intuitive. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • All-in-One Inspection Software Icon
    All-in-One Inspection Software

    flowdit is a connected worker platform tailored for industry needs in commissioning, quality, maintenance, and EHS management.

    Optimize Frontline Operations: Elevate Equipment Uptime, Operational Excellence, and Safety with Connected Teams and Data, Including Issue Capture and Corrective Action.
    Learn More
  • 5
    Apache InLong

    Apache InLong

    Apache InLong - a one-stop integration framework for massive data

    Apache InLong is a one-stop integration framework for massive data that provides automatic, secure and reliable data transmission capabilities. InLong supports both batch and stream data processing at the same time, which offers great power to build data analysis, modeling and other real-time applications based on streaming data. InLong (应龙) is a divine beast in Chinese mythology who guides the river into the sea, and it is regarded as a metaphor of the InLong system for reporting data streams. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    Blue Whale Configuration Platform

    Blue Whale Configuration Platform

    Blue Whale smart cloud configuration platform

    Has accumulated experience in supporting hundreds of Tencent businesses, compatible with various complex system architectures, born in operation and maintenance, and proficient in operation and maintenance. From configuration management to job execution, task scheduling and monitoring self-healing, and then through operation and maintenance big data analysis to assist operational decision-making, it covers the full-cycle assurance management of business operations in a comprehensive manner. The open PaaS has a powerful development framework and scheduling engine, as well as a complete operation and maintenance development training system, which helps the rapid transformation and upgrading of operation and maintenance. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    HugeGraph

    HugeGraph

    A graph database that supports more than 100+ billion data

    ...HugeGraph supports fast import performance in the case of more than 10 billion Vertices and Edges Graph, millisecond-level OLTP query capability, and can be integrated into big data platforms like Hadoop or Spark for OLAP analysis. The main scenarios of HugeGraph include correlation search, fraud detection, and knowledge graph. Not only supports Gremlin graph query language and RESTful API but also provides commonly used graph algorithm APIs. To help users easily implement various queries and analyses, HugeGraph has a full range of accessory tools, such as supporting distributed storage, data replication, scaling horizontally, and supports many built-in backends of storage engines.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    MOA - Massive Online Analysis

    MOA - Massive Online Analysis

    Big Data Stream Analytics Framework.

    A framework for learning from a continuous supply of examples, a data stream. Includes classification, regression, clustering, outlier detection and recommender systems. Related to the WEKA project, also written in Java, while scaling to adaptive large scale machine learning.
    Downloads: 56 This Week
    Last Update:
    See Project
  • 9
    Modin

    Modin

    Scale your Pandas workflows by changing a single line of code

    Scale your pandas workflow by changing a single line of code. Modin uses Ray, Dask or Unidist to provide an effortless way to speed up your pandas notebooks, scripts, and libraries. Unlike other distributed DataFrame libraries, Modin provides seamless integration and compatibility with existing pandas code. Even using the DataFrame constructor is identical. It is not necessary to know in advance the available hardware resources in order to use Modin. Additionally, it is not necessary to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Digital business card + lead capture + contact enrichment Icon
    Digital business card + lead capture + contact enrichment

    Your complete in-person marketing platform

    Share digital business cards, capture leads, and enrich validated contact info - at events, in the field, and beyond. Powered by AI and our proprietary data engine, Popl drives growth for companies around the world, turning every handshake into an opportunity.
    Learn More
  • 10
    Apache RocketMQ

    Apache RocketMQ

    Distributed messaging and streaming platform with low latency

    ...Financial grade transactional message. Built-in fault tolerance and high availability configuration options base on DLedger. A variety of cross language clients, such as Java, C/C++, Python, Go. Pluggable transport protocols, such as TCP, SSL, AIO. Built-in message tracing capability, also support opentracing. Versatile big-data and streaming ecosytem integration. Message retroactivity by time or offset. Reliable FIFO and strict ordered messaging in the same queue. Efficient pull and push consumption model. Million-level message accumulation capacity in a single queue. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11
    gravitino

    gravitino

    Unified metadata lake for data & AI assets.

    Apache Gravitino is a high-performance, geo-distributed, and federated metadata lake. It manages metadata directly in different sources, types, and regions, providing users with unified metadata access for data and AI assets.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 12
    qvge

    qvge

    Qt Visual Graph Editor

    ...Its main goal is to make possible visually edit two-dimensional graphs in a simple and intuitive way. Please note that qvge is not a replacement for such a software like Gephi, Graphvis, Dot, yEd, Dia and so on. It is neither a tool for "big data analysis" nor a math application. It is really just a simple graph editor :)
    Downloads: 11 This Week
    Last Update:
    See Project
  • 13

    Faum

    Fast Autonomous Unsupervised Multidimiensional Classification

    This is the proof-of-concept implementation of the FAUM Clustering method. This implementation was used to perform the published results and is now released in the hope that it will be useful.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    SentimentAnalysis-Rick&Morty

    SentimentAnalysis-Rick&Morty

    Rick & Morty Sentiment Analysis - End-of-Degree Project - UNIR

    The remarkable progress in the field of Big Data has driven the development of new technologies in natural language processing and data analysis. Text mining is a fascinating application of data analysis that extracts relevant information from related writings in different linguistic contexts. And therefore, in natural language processing, sentiment analysis and classification stands out as a key application supported by text mining. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    TensorBase

    TensorBase

    TensorBase is a new big data warehousing with modern efforts

    ...TensorBase has a clear-cut opposition to fork communities, repeat wheels, or hack traffic for so-called reputations (like Github stars). After thoughts, we decided to temporarily leave the general data warehousing field. For people who want to learn how a database system can be built up, or how to apply modern Rust to the high-performance field, or embed a lightweight data analysis system into your own big one. You can still try, ask or contribute to TensorBase. The committers are still around the community. We will help you in all kinds of interesting things pursued in the project by us and maybe you. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    Open Source Data Quality and Profiling

    Open Source Data Quality and Profiling

    World's first open source data quality & data preparation project

    This project is dedicated to open source data quality and data preparation solutions. Data Quality includes profiling, filtering, governance, similarity check, data enrichment alteration, real time alerting, basket analysis, bubble chart Warehouse validation, single customer view etc. defined by Strategy. This tool is developing high performance integrated data management platform which will seamlessly do Data Integration, Data Profiling, Data Quality, Data Preparation, Dummy Data Creation, Meta Data Discovery, Anomaly Discovery, Data Cleansing, Reporting and Analytic. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17

    An introduction to Data Analysis in R

    A guide for learning the basic tools on data anaylisis with R

    An Introduction to Data Analysis in R [Book] A guide for learning the basic tools on data anaylisis: process, visualize and learn from your data using R programming. This repository holds the necessary data sets for the book "An introduction to Data Analysis in R", to be published by Springer series Use R!. The book can be purchased in XXX. The book is meant as an introductory guide to manipulate data sets in the Big Data paradigm. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18

    MarDRe

    MapReduce-based tool to remove duplicate DNA reads

    MarDRe is a de novo MapReduce-based parallel tool to remove duplicate and near-duplicate DNA reads through the clustering of single-end and paired-end sequences from FASTQ/FASTA datasets. This tool allows bioinformatics to avoid the analysis of not necessary reads, reducing the time of subsequent procedures with the dataset. MarDRe is the Big Data counterpart of ParDRe (link above), which employs HPC technologies (i.e., hybrid MPI/multithreading) to reduce runtime on multicore systems. Instead, MarDRe takes advantage of the MapReduce programming model to significantly improve ParDRe performance on distributed systems, especially on cloud-based infrastructures. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    fooltrader

    fooltrader

    Quant framework for stock

    Build a standard data schema, and then implement various connectors to import systems you are familiar with for analysis. fooltrader is a quantitative analysis trading system designed using big data technology, including data capture, cleaning, structuring, calculation, display, backtesting and trading. Its goal is to provide a unified framework for the whole market (stock, futures, bonds, foreign exchange, digital currency, macroeconomics, etc.) for research, backtesting, forecasting, and trading. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20

    paralline

    Big Data tool

    Paralline executes a python function (or lambda function) or a script over each line of huge text files, in parallel processes and aggregates the result to a list.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Vaex

    Vaex

    Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python

    Data science solutions, insights, dashboards, machine learning, deployment. We start at 100GB. Vaex is a high-performance Python library for lazy Out-of-Core data frames (similar to Pandas), to visualize and explore big tabular datasets. It calculates statistics such as mean, sum, count, standard deviation etc, on an N-dimensional grid for more than a billion (10^9) samples/rows per second.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22

    Random Bits Regression

    Random Bits Regression is a strong general predictor.

    ...The fast-speed nature of our method not only allows big data analysis but also enables real-time recognition and predictions. The RBR framework also hints the mechanism of brain function and leads to a "wide learning" hypothesis. We believe that this method will make a great impact and enable many downstream applications.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Universal Java Matrix Package

    Universal Java Matrix Package

    sparse and dense matrix, linear algebra, visualization, big data

    The Universal Java Matrix Package (UJMP) is an open source Java library which provides sparse and dense matrix classes, as well as a large number of calculations for linear algebra such as matrix multiplication or matrix inverse. Operations such as mean, correlation, standard deviation, replacement of missing values or the calculation of mutual information are supported, too. The Universal Java Matrix Package provides various visualization methods, import and export filters for a large...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24

    Chordalysis

    Log-linear analysis (data modelling) for high-dimensional data

    ===== Project moved to https://github.com/fpetitjean/Chordalysis ===== Log-linear analysis is the statistical method used to capture multi-way relationships between variables. However, due to its exponential nature, previous approaches did not allow scale-up to more than a dozen variables. We present here Chordalysis, a log-linear analysis method for big data. Chordalysis exploits recent discoveries in graph theory by representing complex models as compositions of triangular structures, also known as chordal graphs. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    giServer

    giServer

    giServer the easy to use and extensible batch and integration server

    ...Instead of using complex XML configuration files an elaborate GUI for batch job management is included. Some possible usage scenarios are: - Automatic processing of incoming data files - Big Data applications - Process automation - Data Mining/Aggregation applications - Automatic Reporting - Processing and analysis of database records
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next
MongoDB Logo MongoDB