Showing 1131 open source projects for "python data analysis"

  • 1
    PerfView

    PerfView is a CPU and memory performance-analysis tool

    PerfView is a free performance analysis tool that helps isolate CPU and memory-related performance issues. It is a Windows tool, but it also has some support for analyzing data collected on Linux machines. It works for a wide variety of scenarios, but has a number of special features for investigating performance issues in code written for the .NET runtime. If you are unfamiliar with PerfView, there are PerfView video tutorials.
    Downloads: 17 This Week
  • 2
    Recommenders 2023

    Best Practices on Recommendation Systems

    The objective of Recommenders is to assist researchers, developers, and enthusiasts in prototyping, experimenting with, and bringing to production a range of classic and state-of-the-art recommendation systems. Recommenders is a project under the Linux Foundation of AI and Data.
    Downloads: 2 This Week
  • 3
    Deepchecks

    Test Suites for validating ML models & data

    Deepchecks is the leading tool for testing and validating your machine learning models and data, and it enables doing so with minimal effort. Deepchecks accompanies you through various validation and testing needs, such as verifying your data’s integrity, inspecting its distributions, validating data splits, evaluating your model, and comparing different models. While you’re in the research phase, and want to validate your data, find potential methodological problems, and/or validate...
    Downloads: 3 This Week
  • 4
    NetPad

    A cross-platform C# editor and playground

    ...Open NetPad, start coding, hit Run, and see your output immediately. It's that simple. Quickly prototype and test code snippets before incorporating them into your projects. Visualize data interactively for better insights and analysis. Query databases using LINQ or SQL effortlessly. Experiment with new C# features or start learning C# in an intuitive and accessible environment. Create and save your own utility or administration scripts for repeated use.
    Downloads: 13 This Week
  • 5
    Google Cloud Dataflow Template Pipelines

    Cloud Dataflow Google-provided templates for solving data tasks

    ...Its structure shows support for multiple generations of templates, including v1 and v2 implementations, as well as related metadata, YAML assets, plugins, and Python components that support broader template execution and maintenance. This design makes the project more than a sample set, because it acts as the implementation base for official Google-provided templates used in real cloud data workflows.
    Downloads: 6 This Week
  • 6
    BCC (BPF Compiler Collection)

    Tools for BPF-based Linux IO analysis, networking, monitoring, etc.

    BCC is a toolkit that simplifies creating efficient kernel tracing, monitoring, and manipulation programs by leveraging extended Berkeley Packet Filters (eBPF). It includes a rich set of example tools and scripting interfaces in C, Python, and Lua. BCC makes BPF programs easier to write, with kernel instrumentation in C (including a C wrapper around LLVM) and front-ends in Python and Lua. It is suited for many tasks, including performance analysis and network traffic control. With a BPF-specific frontend, one should be able to write in a language and receive feedback from the compiler on the validity as it pertains to a BPF backend. ...
    Downloads: 1 This Week
  • 7
    tsfresh

    Automatic extraction of relevant features from time series

    ...Further, tsfresh is compatible with Python's pandas and scikit-learn APIs, two important packages for data science endeavours in Python. The extracted features can be used to describe or cluster time series based on the extracted characteristics. Further, they can be used to build models that perform classification/regression tasks on the time series. Often the features give new insights into time series and their dynamics.
    Downloads: 4 This Week
  • 8
    spaCy

    Industrial-strength Natural Language Processing (NLP)

    spaCy is a library built on the very latest research for advanced Natural Language Processing (NLP) in Python and Cython. From its inception, it has been designed for real-world applications: building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and so much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks, with...
    Downloads: 76 This Week
  • 9
    peewee

    A small, expressive ORM that supports PostgreSQL, MySQL and SQLite

    ...You can override the default name by specifying a table_name attribute in the inner “Meta” class (alongside the database attribute). To learn more about how Peewee generates table names, refer to the Table Names section. There are lots of field types suitable for storing various types of data. Peewee handles converting between pythonic values and those used by the database, so you can use Python types in your code without having to worry. The real strength of our database is in how it allows us to retrieve data through queries. Relational databases are excellent for making ad-hoc queries. Peewee provides a magical helper fn(), which can be used to call any SQL function.
    Downloads: 5 This Week
  • 10
    ModernGL

    Modern OpenGL binding for Python

    ModernGL is a Python wrapper over OpenGL, designed to simplify the creation of high-performance, modern graphics applications. It provides an intuitive API for rendering 2D and 3D graphics, making it accessible to both beginners and experienced developers. ModernGL is suitable for applications such as games, simulations, and data visualizations.
    Downloads: 3 This Week
  • 11
    Petastorm

    Petastorm library enables single machine or distributed training

    ...It can also be used from pure Python code. A dataset created using Petastorm is stored in Apache Parquet format. On top of a Parquet schema, petastorm also stores higher-level schema information that makes multidimensional arrays into a native part of a petastorm dataset. Petastorm supports extensible data codecs. These enable a user to use one of the standard data compressions (jpeg, png) or implement her own.
    Downloads: 0 This Week
  • 12
    segyio

    Fast Python library for SEGY files

    Segyio is a small LGPL-licensed C library for easy interaction with SEG-Y and Seismic Unix formatted seismic data, with language bindings for Python and Matlab. Segyio is an attempt to create an easy-to-use, embeddable, community-oriented library for seismic applications. Features are added as they are needed; suggestions and contributions of all kinds are very welcome.
    Downloads: 2 This Week
  • 13
    PaddleX

    PaddlePaddle End-to-End Development Toolkit

    PaddleX is a deep learning full-process development tool based on the core framework, development kit, and tool components of Paddle. It has three characteristics: opening up the whole process, integrating industrial practice, and being easy to use and integrate. Image classification and labeling is the most basic and simplest labeling task. Users only need to put pictures belonging to the same category in the same folder. When the model is trained, we need to divide the training set, the...
    Downloads: 6 This Week
  • 14
    OpenTelemetry Collector distributions

    OpenTelemetry Collector Official Releases

    High-quality, ubiquitous, and portable telemetry to enable effective observability. OpenTelemetry is a collection of APIs, SDKs, and tools. Use it to instrument, generate, collect, and export telemetry data (metrics, logs, and traces) to help you analyze your software’s performance and behavior. Create and collect telemetry from your services and software, then forward it to a variety of analysis tools. OpenTelemetry integrates with many popular libraries and frameworks, and supports code-based and zero-code instrumentation.
    Downloads: 99 This Week
  • 15
    Odigos

    Distributed tracing without code changes

    Odigos supports any application written in Java, Python, .NET, Node.js and Go. Historically, compiled languages like Go have been difficult to instrument without code changes. Odigos solves this problem by uniquely leveraging eBPF. Odigos currently supports all the popular managed and open source destinations. By producing data in the OpenTelemetry format, Odigos can be used with any observability tool that supports OTLP.
    Downloads: 31 This Week
  • 16
    Llama Cloud Services

    Knowledge Agents and Management in the Cloud

    Llama Cloud Services is a suite of tools designed to facilitate the integration of large language models (LLMs) into applications. It offers components for parsing, extracting, and reporting on complex documents, streamlining the process of preparing data for LLM consumption.
    Downloads: 0 This Week
  • 17
    SageMaker Training Toolkit

    Train machine learning models within Docker containers

    Train machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. To train a model, you can include your training script and dependencies in a Docker container that runs your training code. A container provides an effectively isolated environment, ensuring a consistent runtime and...
    Downloads: 5 This Week
  • 18
    Cytoscape.js

    Graph theory library for visualization and analysis

    A fully featured graph library written in pure JS. Permissive open source license (MIT) for the core Cytoscape.js library and all first-party extensions. Used in commercial projects and open-source projects in production. Designed for users first, for both front-facing app use cases and developer use cases. Highly optimized. Compatible with all modern browsers, and with legacy browsers that have ES5 and canvas support. ES5 and canvas support are required, and feature detection is used for optional...
    Downloads: 7 This Week
  • 19
    rich

    Rich is a Python library for rich text and beautiful formatting

    ...Rich can be installed in the Python REPL, so that any data structures will be pretty printed and highlighted. As you might expect, this will print "Hello World!" to the terminal. Note that unlike the builtin print function, Rich will word-wrap your text to fit within the terminal width.
    Downloads: 4 This Week
  • 20
    Goose Developer Agent

    Goose is a developer agent that operates from your command line

    ...Guided by you, it can intelligently assess your project's needs, generate the required code or modifications, and implement these changes on its own. Goose can interact with a multitude of tools via external APIs such as Jira, GitHub, Slack, infrastructure and data pipelines, and more. If your task uses a shell command or can be carried out by a Python script, Goose can do it for you too! Like semi-autonomous driving, Goose handles the heavy lifting, allowing you to focus on other priorities. Simply set it on a task and return later to find it completed, boosting your productivity with less manual effort.
    Downloads: 14 This Week
  • 21
    pywinauto

    Windows GUI Automation with Python (based on text properties)

    pywinauto is a set of Python modules to automate the Microsoft Windows GUI. At its simplest it allows you to send mouse and keyboard actions to Windows dialogs and controls, but it has support for more complex actions like getting text data.
    Downloads: 2 This Week
  • 22
    Halide

    A language for fast, portable data-parallel computation

    ...This representation can then be compiled to an object file, or JIT-compiled and run in the same process. Halide also comes with a Python binding, allowing Halide code to be written embedded in Python without C++.
    Downloads: 6 This Week
  • 23
    Mineflayer

    Create Minecraft bots with a powerful and high level JavaScript API

    Create Minecraft bots with a powerful, stable, and high-level JavaScript API, also usable from Python. First time using Node.js? You may want to start with the tutorial. Know Python? Check out some Python examples and try out Mineflayer on Google Colab. Supports Minecraft 1.8, 1.9, 1.10, 1.11, 1.12, 1.13, 1.14, 1.15, 1.16, 1.17 and 1.18. Block knowledge. You can query the world around you. Milliseconds to find any block. Miscellaneous stuff such as knowing your health and whether it is raining. ...
    Downloads: 13 This Week
  • 24
    DrissionPage

    Python based web automation tool. Powerful and elegant

    DrissionPage is a Python-based automation framework that blends the capabilities of Selenium for browser automation with Requests-HTML for fast, headless web data extraction. It enables seamless switching between browser-controlled and headless HTTP sessions within the same interface. Ideal for web scraping, testing, and automation, DrissionPage is lightweight and highly efficient, offering more flexibility than standard Selenium or Requests usage alone.
    Downloads: 5 This Week
  • 25
    XGBoost

    Scalable and Flexible Gradient Boosting

    ...XGBoost works by implementing machine learning algorithms under the Gradient Boosting framework. It also offers parallel tree boosting (GBDT, GBRT or GBM) that can quickly and accurately solve many data science problems. XGBoost can be used from Python, Java, Scala, R, C++ and more. It can run on a single machine, Hadoop, Spark, Dask, Flink and most other distributed environments, and is capable of solving problems beyond billions of examples.
    Downloads: 10 This Week