Showing 190 open source projects for "data"

View related business solutions
  • Monitor production, track downtime and improve OEE. Icon
    Monitor production, track downtime and improve OEE.

    For manufacturing companies interested in OEE monitoring solutions

    Evocon is a visual and user-friendly OEE software that helps manufacturing companies improve productivity and remove waste as they become better.
    Learn More
  • Information Security Made Simple and Affordable | Carbide Icon
    Information Security Made Simple and Affordable | Carbide

    For companies requiring a solution to scale their business without incurring security debt

    Get expert guidance and smart tools to launch or level up your security and compliance efforts without the complexity.
    Learn More
  • 1
    FairScale

    FairScale

    PyTorch extensions for high performance and large scale training

    FairScale is a collection of PyTorch performance and scaling primitives that pioneered many of the ideas now used for large-model training. It introduced Fully Sharded Data Parallel (FSDP) style techniques that shard model parameters, gradients, and optimizer states across ranks to fit bigger models into the same memory budget. The library also provides pipeline parallelism, activation checkpointing, mixed precision, optimizer state sharding (OSS), and auto-wrapping policies that reduce boilerplate in complex distributed setups. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 2
    quantitative

    quantitative

    Quantized transactions python3

    ...The repo is evidently tied to a popular video series (on Bilibili) that reportedly drew substantial attention, suggesting the material is meant to be both educational and hands-on. The README and associated lessons walk the user through implementing algorithms, likely covering data handling, backtesting, and maybe simple trading logic. As an open-source educational resource, it’s designed for Python users interested in automatic trading, algorithmic strategies, and financial data analysis.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Jraph

    Jraph

    A Graph Neural Network Library in Jax

    Jraph (pronounced “giraffe”) is a lightweight JAX library developed by Google DeepMind for building and experimenting with graph neural networks (GNNs). It provides an efficient and flexible framework for representing, manipulating, and training models on graph-structured data. The core of Jraph is the GraphsTuple data structure, which enables users to define graphs with arbitrary node, edge, and global attributes, and to batch variable-sized graphs efficiently for JAX’s just-in-time compilation. The library includes a comprehensive set of utilities for batching, padding, masking, and partitioning graph data, making it ideal for distributed and large-scale GNN experiments. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Fairseq

    Fairseq

    Facebook AI Research Sequence-to-Sequence Toolkit written in Python

    Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. We provide reference implementations of various sequence modeling papers. Recent work by Microsoft and Google has shown that data parallel training can be made significantly more efficient by sharding the model parameters and optimizer state across data parallel workers. These ideas are encapsulated in the new FullyShardedDataParallel (FSDP) wrapper provided by fairscale. Fairseq can be extended through user-supplied plug-ins. Models define the neural network architecture and encapsulate all of the learnable parameters. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • AI-Powered Identity Governance Icon
    AI-Powered Identity Governance

    For IT Teams and MSPs in need of a solution to simplify, optimize and secure their SaaS, file, and device management operations

    Define governance policies, manage access, and optimize licenses with unified visibility across every identity, app, and file.
    Learn More
  • 5
    Whisper Library

    Whisper Library

    Whisper is a file-based time-series database format for Graphite

    Whisper is one of three components within the Graphite project. Whisper is a fixed-size database, similar in design and purpose to RRD (round-robin-database). It provides fast, reliable storage of numeric data over time. Whisper allows for higher resolution (seconds per point) of recent data to degrade into lower resolutions for long-term retention of historical data. Copies data from src in dst, if missing. Unlike whisper-merge, don't overwrite data that's already present in the target file, but instead, only add the missing data (e.g. where the gaps in the target file are). ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 6
    AugLy

    AugLy

    A data augmentations library for audio, image, text, and video

    ...AugLy is a great library to utilize for augmenting your data in model training, or to evaluate the robustness gaps of your model! We designed AugLy to include many specific data augmentations that users perform in real life on internet platforms like Facebook's -- for example making an image into a meme, overlaying text/emojis on images/videos, reposting a screenshot from social media. While AugLy contains more generic data augmentations as well, it will be particularly useful to you if you're working on a problem like copy detection, hate speech detection, etc.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Binarytree

    Binarytree

    Python library for studying Binary Trees

    Binarytree is Python library that lets you generate, visualize, inspect and manipulate binary trees. Skip the tedious work of setting up test data, and dive straight into practicing algorithms. Heaps and BSTs (binary search trees) are also supported. Binarytree supports another representation which is more compact but without the indexing properties. Traverse trees using different algorithms.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Interpret-Text

    Interpret-Text

    State-of-the-art explainers for text-based machine learning models

    A library that incorporates state-of-the-art explainers for text-based machine learning models and visualizes the result with a built-in dashboard. Interpret-Text builds on Interpret, an open source python package for training interpretable models and helping to explain blackbox machine learning systems. We have added extensions to support text models. Interpret-Text incorporates community-developed interpretability techniques for NLP models and a visualization dashboard to view the results....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    TensorFlow Examples

    TensorFlow Examples

    TensorFlow Tutorial and Examples for Beginners (support TF v1 & v2)

    ...For clarity and educational value, each example is accompanied by explanatory comments or markdown cells to illustrate what the code does and why — a design that makes it especially suitable for self-learners or students following along with real data. Besides raw implementations, the repo often shows best practices using higher-level constructs (e.g. dataset pipelines, estimators, layers) which reflect modern TensorFlow workflows rather than only textbook-style code.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Create a personalized AI chatbot for each team in minutes Icon
    Create a personalized AI chatbot for each team in minutes

    Get better, faster answers for your whole team with an AI chatbot trained on your company documents.

    QueryPal is the lifeline your team needs. Our AI chatbot integrates seamlessly with your communication channels, using advanced language understanding to identify and auto-answer repetitive questions — in seconds.
    Learn More
  • 10
    PyTorchVideo

    PyTorchVideo

    A deep learning library for video understanding research

    ...The library includes efficient implementations of state-of-the-art architectures such as SlowFast, X3D, and MViT, optimized for both research prototyping and production inference. It supports video I/O pipelines, data augmentation, distributed training, and mixed precision computation for large-scale experiments. PyTorchVideo also connects seamlessly with other Meta AI tools such as Detectron2 and PyTorch3D for multimodal video analysis. Designed to accelerate research and deployment, it serves as a unified framework for reproducible, high-performance video AI development.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    ReinventCommunity

    ReinventCommunity

    Jupyter Notebook tutorials for REINVENT 3.2

    This repository is a collection of useful jupyter notebooks, code snippets and example JSON files illustrating the use of Reinvent 3.2.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Arraymancer

    Arraymancer

    A fast, ergonomic and portable tensor library in Nim

    Arraymancer is a tensor and deep learning library for the Nim programming language, designed for high-performance numerical computations and machine learning applications.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    Big List of Naughty Strings

    Big List of Naughty Strings

    List of strings which have a high probability of causing issues

    The Big List of Naughty Strings is a community-maintained catalog of “gotcha” inputs that commonly break software, from unusual Unicode to SQL and script injection payloads. It exists so developers and QA engineers can easily test edge cases that normal test data would miss, such as zero-width characters, right-to-left marks, emojis, foreign alphabets, and long or malformed strings. By throwing these strings at forms, APIs, databases, and UIs, teams can discover encoding bugs, sanitizer gaps, rendering issues, and security oversights early. The list is language-agnostic and repository-friendly, meaning you can consume it from CI pipelines or local scripts with minimal setup. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    BMC

    BMC

    Notes on Scientific Computing for Biomechanics

    This repository is a collection of lecture notes and code on scientific computing and data analysis for Biomechanics and Motor Control.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    earthengine-py-notebooks

    earthengine-py-notebooks

    A collection of 360+ Jupyter Python notebook examples

    ...Users can quickly adapt the examples for their own remote sensing, environmental monitoring, or spatial data science projects, and can run the code in environments like Google Colab.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    CNN for Image Retrieval
    ...The repository provides implementations of CNN-based methods to extract feature representations from images and use them for similarity-based retrieval. It focuses on applying deep learning techniques to improve upon traditional handcrafted descriptors by learning features directly from data. The code includes training and evaluation scripts that can be adapted for custom datasets, making it useful for experimenting with retrieval systems in computer vision. By leveraging CNN architectures, the project showcases how learned embeddings can capture semantic similarity across varied images. This resource serves as both an educational reference and a foundation for further exploration in image retrieval research.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    BeaEngine 5

    BeaEngine 5

    BeaEngine disasm project

    BeaEngine is a C library designed to decode instructions from 16-bit, 32-bit and 64-bit intel architectures. It includes standard instructions set and instructions set from FPU, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, VMX, CLMUL, AES, MPX, AVX, AVX2, AVX512 (VEX & EVEX prefixes), CET, BMI1, BMI2, SGX, UINTR, KL, TDX and AMX extensions. If you want to analyze malicious codes and more generally obfuscated codes, BeaEngine sends back a complex structure that describes precisely the...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18
    NLP Architect

    NLP Architect

    A model library for exploring state-of-the-art deep learning

    ...The library is designed to be a tool for model development: data pre-processing, build model, train, validate, infer, save or load a model.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    gradslam

    gradslam

    gradslam is an open source differentiable dense SLAM library

    ...The question of “representation” is central in the context of dense simultaneous localization and mapping (SLAM). Newer learning-based approaches have the potential to leverage data or task performance to directly inform the choice of representation. However, learning representations for SLAM has been an open question, because traditional SLAM systems are not end-to-end differentiable. In this work, we present gradSLAM, a differentiable computational graph take on SLAM. Leveraging the automatic differentiation capabilities of computational graphs, gradSLAM enables the design of SLAM systems that allow for gradient-based learning across each of their components, or the system as a whole.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    fastNLP

    fastNLP

    fastNLP: A Modularized and Extensible NLP Framework

    fastNLP is a lightweight framework for natural language processing (NLP), the goal is to quickly implement NLP tasks and build complex models. A unified Tabular data container simplifies the data preprocessing process. Built-in Loader and Pipe for multiple datasets, eliminating the need for preprocessing code. Various convenient NLP tools, such as Embedding loading (including ELMo and BERT), intermediate data cache, etc.. Provide a variety of neural network components and recurrence models (covering tasks such as Chinese word segmentation, named entity recognition, syntactic analysis, text classification, text matching, metaphor resolution, summarization, etc.). ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Zipline

    Zipline

    Zipline, a Pythonic algorithmic trading library

    ...Zipline is currently used in production as the backtesting and live-trading engine powering Quantopian -- a free, community-centered, hosted platform for building and executing trading strategies. Quantopian also offers a fully managed service for professionals that includes Zipline, Alphalens, Pyfolio, FactSet data, and more. Installing Zipline is slightly more involved than the average Python package. For a development installation (used to develop Zipline itself), create and activate a virtualenv, then run the etc/dev-install script. Please note that Zipline is not a community-led project. Zipline is maintained by the Quantopian engineering team, and we are quite small and often busy.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    MMdnn

    MMdnn

    Tools to help users inter-operate among deep learning frameworks

    MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML. MMdnn is a comprehensive and cross-framework tool to convert, visualize and diagnose deep learning (DL) models. The "MM" stands for model management, and "dnn" is the acronym of deep neural network. We implement a universal converter to convert DL models between frameworks,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Forecasting Best Practices

    Forecasting Best Practices

    Time Series Forecasting Best Practices & Examples

    ...Rather than creating implementations from scratch, we draw from existing state-of-the-art libraries and build additional utilities around processing and featuring the data, optimizing and evaluating models, and scaling up to the cloud. The examples and best practices are provided as Python Jupyter notebooks and R markdown files and a library of utility functions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    ebfformat

    ebfformat

    An Efficient Binary data Format

    ...It is also designed to simplify the programming of input output routines in different programming languages. In a nutshell an EBF file is a collection of data objects. Each data object is specified by a unique name and a single file can have multiple data objects. Each data object is preceded by a meta-data or header which describes the binary data associated with it. Among other things, this header allows the files to be portable across systems with different endianess.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Brand new cheatsheets and handouts

    Brand new cheatsheets and handouts

    Matplotlib 3.1 cheat sheet

    ...It lays out common use cases (plot types, styling, figure configuration, saving/exporting, subplot layout, etc.) in a concise and organized format — often serving as a “cheat sheet” for rapid look-up. For practitioners working on data-heavy projects, dashboards, or research code where plotting is frequent, it helps speed up development by reducing context-switching and documentation navigation overhead. It is especially useful when you know roughly what you want (e.g. “I need a scatter + histogram marginal plot”) but don’t remember the exact Matplotlib call.
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB