Search Results for "java open source" - Page 2

Showing 500 open source projects for "java open source"

Filters: Data Management, Python

1

Datasette

An open source multi-tool for exploring and publishing data

Datasette is a tool for exploring and publishing data. It helps people take data of any shape or size, analyze and explore it, and publish it as an interactive website and accompanying API. Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to share with the world. It is part of a wider ecosystem of tools and plugins dedicated to making working with structured data as productive as...

Downloads: 6 This Week

Last Update: 2025-11-05
2

Population Shift Monitoring

Monitor the stability of a Pandas or Spark dataframe

popmon is a package that allows one to check the stability of a dataset. popmon works with both pandas and spark datasets. popmon creates histograms of features binned in time-slices, and compares the stability of the profiles and distributions of those histograms using statistical tests, both over time and with respect to a reference. It works with numerical, ordinal, and categorical features, and the histograms can be higher-dimensional, e.g. it can also track correlations between any two...

Downloads: 8 This Week

Last Update: 2026-01-09
3

HyperTools

A Python toolbox for gaining geometric insights

HyperTools is a library for visualizing and manipulating high-dimensional data in Python. It is built on top of matplotlib (for plotting), seaborn (for plot styling), and scikit-learn (for data manipulation). Functions for plotting high-dimensional datasets in 2/3D. Static and animated plots. Simple API for customizing plot styles. Set of powerful data manipulation tools including hyperalignment, k-means clustering, normalizing and more. Support for lists of NumPy arrays, Pandas dataframes,...

Downloads: 8 This Week

Last Update: 2026-01-29
4

Dask

Parallel computing with task scheduling

Dask is a Python library for parallel and distributed computing, designed to scale analytics workloads from single machines to large clusters. It integrates with familiar tools like NumPy, Pandas, and scikit-learn while enabling execution across cores or nodes with minimal code changes. Dask excels at handling large datasets that don’t fit into memory and is widely used in data science, machine learning, and big data pipelines.

Downloads: 5 This Week

Last Update: 2026-03-18
5

Mage.ai

Build, run, and manage data pipelines for integrating data

Open-source data pipeline tool for transforming and integrating data. The modern replacement for Airflow. Effortlessly integrate and synchronize data from 3rd party sources. Build real-time and batch pipelines to transform data using Python, SQL, and R. Run, monitor, and orchestrate thousands of pipelines without losing sleep. Have you met anyone who said they loved developing in Airflow?

Downloads: 6 This Week

Last Update: 2026-01-20
6

classic.tplx

A more accurate representation of Jupyter notebooks

A more accurate representation of Jupyter notebooks when converting to PDFs. This template was designed to make converted Jupyter notebooks look (almost) identical to the actual notebook. If something doesn't exist in the original notebook then it doesn't belong in the conversion. As of nbconvert 5.5.0, the majority of these improvements have been merged into nbconvert's default template. Version 3.x of this package will continue to support nbconvert 5.5.0 and lower, whereas in the future...

Downloads: 5 This Week

Last Update: 2025-06-05
7

Bytewax

Python Stream Processing

Bytewax is a Python framework that simplifies event and stream processing. Because Bytewax couples the stream and event processing capabilities of Flink, Spark, and Kafka Streams with the friendly and familiar interface of Python, you can re-use the Python libraries you already know and love. Connect data sources, run stateful transformations, and write to various downstream systems with built-in connectors or existing Python libraries. Bytewax is a Python framework and Rust distributed...

Downloads: 7 This Week

Last Update: 2024-11-25
8

Astropy

Repository for the Astropy core package

The Astropy Project is a community effort to develop a common core package for Astronomy in Python and foster an ecosystem of interoperable astronomy packages. Astropy is a Python library for use in astronomy. Learn Astropy provides a portal to all of the Astropy educational material through a single dynamically searchable web page. It allows you to filter tutorials by keywords, search for filters, and make search queries in tutorials and documentation simultaneously. The Anaconda Python...

Downloads: 6 This Week

Last Update: 2025-11-29
9

Gretel Synthetics

Synthetic data generators for structured and unstructured text

Unlock unlimited possibilities with synthetic data. Share, create, and augment data with cutting-edge generative AI. Generate unlimited data in minutes with synthetic data delivered as-a-service. Synthesize data that is as good as or better than your original dataset, and maintain relationships and statistical insights. Customize privacy settings so that data is always safe while remaining useful for downstream workflows. Ensure data accuracy and privacy confidently with expert-grade reports....

Downloads: 7 This Week

Last Update: 2025-03-17
10

NannyML

Detecting silent model failure. NannyML estimates performance

NannyML is an open-source Python library that allows you to estimate post-deployment model performance (without access to targets), detect data drift, and intelligently link data drift alerts back to changes in model performance. Built for data scientists, NannyML has an easy-to-use interface and interactive visualizations, is completely model-agnostic, and currently supports all tabular classification use cases.

Downloads: 5 This Week

Last Update: 2025-07-12
11

Cookiecutter Data Science

Project structure for doing and sharing data science work

A logical, reasonably standardized, but flexible project structure for doing and sharing data science work. When we think about data analysis, we often think just about the resulting reports, insights, or visualizations. While these end products are generally the main event, it's easy to focus on making the products look nice and ignore the quality of the code that generates them. Because these end products are created programmatically, code quality is still important! And we're not talking...

Downloads: 7 This Week

Last Update: 2025-07-24
12

SDGym

Benchmarking synthetic data generation methods

The Synthetic Data Gym (SDGym) is a benchmarking framework for modeling and generating synthetic data. Measure performance and memory usage across different synthetic data modeling techniques – classical statistics, deep learning and more! The SDGym library integrates with the Synthetic Data Vault ecosystem. You can use any of its synthesizers, datasets or metrics for benchmarking. You can also customize the process to include your own work. Select any of the publicly available datasets from the...

Downloads: 9 This Week

Last Update: 2026-04-01
13

SageMaker Training Toolkit

Train machine learning models within Docker containers

Train machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. To train a model, you can include your training script and dependencies in a Docker container that runs your training code. A container provides an effectively isolated environment, ensuring a consistent runtime and...

Downloads: 7 This Week

Last Update: 2025-09-22
14

marimo

A reactive notebook for Python

marimo is an open-source reactive notebook for Python: reproducible, git-friendly, executable as a script, and shareable as an app. marimo notebooks are reproducible, extremely interactive, designed for collaboration (git-friendly!), deployable as scripts or apps, and fit for the modern Pythonista. Run one cell and marimo reacts by automatically running affected cells, eliminating the error-prone chore of managing the notebook state. marimo's reactive UI elements, like data frame GUIs and plots, make working with data feel refreshingly fast, futuristic, and intuitive. ...

Downloads: 4 This Week

Last Update: 6 days ago
15

gusty

Making DAG construction easier

gusty allows you to control your Airflow DAGs, Task Groups, and Tasks with greater ease. gusty manages collections of tasks, represented as any number of YAML, Python, SQL, Jupyter Notebook, or R Markdown files. A directory of task files is instantly rendered into a DAG by passing a file path to gusty's create_dag function. gusty also manages dependencies (within one DAG) and external dependencies (dependencies on tasks in other DAGs) for each task file you define. All you have to do is...

Downloads: 6 This Week

Last Update: 2025-05-14
16

Elementary

Open-source data observability for analytics engineers

Elementary is an open-source data observability solution for data & analytics engineers. Monitor your dbt project and data in minutes, and be the first to know of data issues. Gain immediate visibility, detect data issues, send actionable alerts, and understand the impact and root cause. Generate a data observability report, then host it or share it with your team.

Downloads: 2 This Week

Last Update: 2 days ago
17

DataChain

AI-data warehouse to enrich, transform and analyze unstructured data

DataChain enables multimodal API calls and local AI inferences to run in parallel over many samples as chained operations. The resulting datasets can be saved, versioned, and sent directly to PyTorch and TensorFlow for training. DataChain can persist features of Python objects returned by AI models, and enables vectorized analytical operations over them. The typical use cases are data curation, LLM analytics and validation, image segmentation, pose detection, and GenAI alignment. DataChain...

Downloads: 4 This Week

Last Update: 4 days ago
18

Mercury

Convert Python notebook to web app and share with non-technical users

Turn Python notebooks into web applications with the open-source Mercury framework. Hide code and add interactive widgets. Non-technical users can tweak widgets and execute the notebook with new parameters. The core of Mercury is Open Source under AGPLv3. We provide Mercury Pro with additional features, dedicated support and a friendly commercial license. Mercury is a perfect tool to convert a Python notebook into an interactive web application and share it with non-programmers. ...

Downloads: 8 This Week

Last Update: 7 days ago
19

Cleanlab

The standard data-centric AI package for data quality and ML

cleanlab helps you clean data and labels by automatically detecting issues in an ML dataset. To facilitate machine learning with messy, real-world data, this data-centric AI package uses your existing models to estimate dataset problems that can be fixed to train even better models. cleanlab cleans your data's labels via state-of-the-art confident learning algorithms, published in this paper and blog. See some of the datasets cleaned with cleanlab at labelerrors.com. This package helps you...

Downloads: 6 This Week

Last Update: 2026-01-13
20

Apache Airflow Provider

Great Expectations Airflow operator

Due to apply_default decorator removal, this version of the provider requires Airflow 2.1.0+. If your Airflow version is earlier than 2.1.0, and you want to install this provider version, first upgrade Airflow to at least version 2.1.0. Otherwise, your Airflow package version will be upgraded automatically, and you will have to manually run airflow db upgrade to complete the migration. This operator currently works with the Great Expectations V3 Batch Request API only. If you would like to use the...

Downloads: 4 This Week

Last Update: 2026-01-28
21

Timesketch

Collaborative forensic timeline analysis

Timesketch is a collaborative forensic timeline analysis platform used to investigate security incidents by turning diverse evidence into a single, searchable chronology. Analysts ingest logs and artifacts from many sources—endpoints, servers, cloud services—and Timesketch normalizes them into events on a unified timeline. Powerful search, aggregations, and saved views help you pivot quickly, highlight anomalies, and preserve investigative steps for later review. The system supports tagging,...

Downloads: 4 This Week

Last Update: 2026-03-26
22

electricityMap

A real-time visualisation of the CO2 emissions of electricity

Real-time visualization of the Greenhouse Gas (in terms of CO2 equivalent) footprint of electricity consumption built with d3.js and mapbox GL. Real-time data is defined as a data source with an hourly (or better) frequency, delayed by less than 2 hours. It should provide a breakdown by generation type. Often fossil fuel generation (coal/gas/oil) is combined under a single heading like 'thermal' or 'conventional'; this is not a problem. Citizens should not be responsible for the emissions...

Downloads: 4 This Week

Last Update: 2026-01-13
23

Great Expectations

Always know what to expect from your data

Great Expectations helps data teams eliminate pipeline debt through data testing, documentation, and profiling. Software developers have long known that testing and documentation are essential for managing complex codebases. Great Expectations brings the same confidence, integrity, and acceleration to data science and data engineering teams. Expectations are assertions for data. They are the workhorse abstraction in Great Expectations, covering all kinds of common data issues. Expectations...

Downloads: 5 This Week

Last Update: 1 day ago
24

geemap

A Python package for interactive geospatial analysis and visualization

A Python package for interactive geospatial analysis and visualization with Google Earth Engine. Geemap is a Python package for geospatial analysis and visualization with Google Earth Engine (GEE), which is a cloud computing platform with a multi-petabyte catalog of satellite imagery and geospatial datasets. During the past few years, GEE has become very popular in the geospatial community and it has empowered numerous environmental applications at local, regional, and global scales. GEE...

Downloads: 5 This Week

Last Update: 2026-03-20
25

PySyft

Data science on data without acquiring a copy

Most software libraries let you compute over the information you own and see inside of machines you control. However, this means that you cannot compute on information without first obtaining (at least partial) ownership of that information. It also means that you cannot compute using machines without first obtaining control over those machines. This is very limiting to human collaboration and systematically drives the centralization of data, because you cannot work with a bunch of data...

Downloads: 5 This Week

Last Update: 2025-02-13