Search Results for "pentaho data integration"

Showing 300 open source projects for "pentaho data integration"

  • 1
    Pentaho

    Pentaho offers a comprehensive data integration and analytics platform.

    Pentaho couples data integration with business analytics in a modern platform to easily access, visualize and explore data that impacts business results. Use it as a full suite or as individual components that are accessible on-premises, in the cloud, or on the go (mobile). Pentaho enables IT and developers to access and integrate data from any source and deliver it to your applications, all from within an intuitive and easy-to-use graphical tool. ...
    Downloads: 1,598 This Week
  • 2
    Spring Data MongoDB

    Provide support to increase developer productivity in Java

    ...The Spring Data MongoDB project provides integration with the MongoDB document database. Key functional areas of Spring Data MongoDB are a POJO-centric model for interacting with a MongoDB Document and easily writing a repository-style data access layer. You do not need to build from source to use Spring Data. Binaries are available in repo.spring.io and accessible from Maven using the Maven configuration noted.
    Downloads: 5 This Week
  • 3
    Apache SeaTunnel

    SeaTunnel is a distributed, high-performance data integration platform

    SeaTunnel is an easy-to-use, high-performance distributed data integration platform that supports real-time synchronization of massive data. It can synchronize tens of billions of records stably and efficiently every day, and has been used in production at nearly 100 companies. There are hundreds of commonly used data sources, many with mutually incompatible versions. With the emergence of new technologies, more data sources are appearing. ...
    Downloads: 9 This Week
  • 4
    Flink CDC

    Flink CDC is a streaming data integration tool

    Apache Flink CDC is a distributed data integration tool that captures data changes in real-time from various databases. It leverages Change Data Capture (CDC) technology to stream data changes into Apache Flink, enabling real-time analytics and data processing. Flink CDC simplifies data pipeline development with its declarative YAML configurations.
    Downloads: 2 This Week
  • 5
    LakeSoul

    An end-to-end, realtime and cloud native Lakehouse framework

    LakeSoul is a high-performance, unified table storage framework for big data lakes, supporting both streaming and batch data in a single format. Built on top of Apache Spark and leveraging Apache Arrow and Parquet, LakeSoul provides ACID transactions, schema evolution, and time travel. It is designed for large-scale data lake architectures that require consistency, efficiency, and easy integration with modern data stacks.
    Downloads: 4 This Week
  • 6
    Apache InLong

    Apache InLong - a one-stop integration framework for massive data

    Apache InLong is a one-stop integration framework for massive data that provides automatic, secure and reliable data transmission capabilities. InLong supports both batch and stream data processing at the same time, which offers great power to build data analysis, modeling and other real-time applications based on streaming data. InLong (应龙) is a divine beast in Chinese mythology who guides the river into the sea, and it is regarded as a metaphor of the InLong system for reporting data streams. ...
    Downloads: 1 This Week
  • 7
    Addax

    Addax is a versatile open-source ETL tool

    Addax is a data integration and ETL (Extract, Transform, Load) tool designed for high-performance data migration tasks. It simplifies the process of moving data between different systems and formats.
    Downloads: 14 This Week
  • 8
    SeedCrackerX

    Minecraft mod designed to reverse-engineer

    SeedCrackerX is a Minecraft mod designed to reverse-engineer and determine a world’s seed by analyzing in-game structures and environmental data. It operates by collecting information from structures such as shipwrecks, temples, and monuments, then using that data to progressively narrow down possible seeds until the correct one is identified. The mod automates much of this process, initiating cracking procedures once sufficient data has been gathered, often requiring only exploration of...
    Downloads: 356 This Week
  • 9
    Canal

    MySQL binlog

    Canal is an open-source project developed by Alibaba that simulates MySQL slave functionality to parse MySQL binlog files. It enables real-time data synchronization and change data capture (CDC) between MySQL and other systems such as Elasticsearch, Kafka, or HBase. Canal is widely used for data integration, replication, and monitoring across distributed systems, offering high performance and low-latency log parsing.
    Downloads: 3 This Week
  • 10
    Datacap

    DataCap is integrated software for data transformation

    Datacap is an open-source data catalog and governance tool that helps organizations manage and document their data assets. It provides metadata management, lineage tracking, and collaboration features to ensure data transparency and quality. Datacap is designed for teams that need a lightweight, self-hosted solution to organize and govern their data ecosystems.
    Downloads: 10 This Week
  • 11
    Apache Avro

    Apache Avro is a data serialization system

    Apache Avro™ is a data serialization system that offers simple integration with dynamic languages. Code generation is not required to read or write data files, nor to use or implement RPC protocols. Code generation is an optional optimization, only worth implementing for statically typed languages. Avro relies on schemas. When Avro data is read, the schema used when writing it is always present.
    Downloads: 3 This Week
  • 12
    RStudio

    RStudio is an integrated development environment (IDE) for R

    RStudio is a powerful, full-featured integrated development environment (IDE) tailored primarily for the R programming language but increasingly supportive of other languages like Python and Julia. It brings together console, editor, plotting, workspace, history, and file-management panes into a unified interface, helping data scientists, statisticians, and analysts to work more productively. The IDE is cross-platform: there are desktop versions for Windows, macOS and Linux, as well as a server version for remote or multi-user deployment via a web browser. In addition to code editing and execution, RStudio offers extensive support for reproducible research via R Markdown, notebooks, and integration with version control systems like Git and SVN. ...
    Downloads: 27 This Week
  • 13
    JimuReport

    Open source drag-and-drop reporting and dashboard builder platform

    ...JimuReport supports traditional report generation, print templates, and modern dashboard visualizations for business intelligence scenarios. JimuReport also includes components for building interactive charts, data tables, and analytical displays that can be used in enterprise applications. It can connect to multiple data sources and retrieve data through SQL queries, APIs, or other structured formats. It can be embedded into Java applications using Spring Boot integration modules.
    Downloads: 7 This Week
  • 14
    keycloak-config-cli

    Import YAML/JSON-formatted configuration files into Keycloak

    keycloak-config-cli is a Keycloak utility to ensure the desired configuration state for a realm based on a JSON/YAML file. The format of the JSON/YAML file is based on the export realm format. Store and handle the configuration files inside git just like normal code. A Keycloak restart isn't required to apply the configuration. The config files are based on the keycloak export files. You can use them to re-import your settings. But keep your files as small as possible. Remove all UUIDs and...
    Downloads: 118 This Week
  • 15
    Siddhi Core Libraries

    Stream Processing and Complex Event Processing Engine

    Fully open source, cloud-native, scalable, micro streaming, and complex event processing system capable of building event-driven applications for use cases such as real-time analytics, data integration, notification management, and adaptive decision-making. Event processing logic can be written using Streaming SQL queries via graphical and source editors, to capture events from diverse data sources, process and analyze them, integrate with multiple services and data stores, and publish output to various endpoints in real time. ...
    Downloads: 3 This Week
  • 16
    Apache Hudi

    Upserts, Deletes And Incremental Processing on Big Data

    Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts Deletes and Incrementals. Hudi manages the storage of large analytical datasets on DFS (Cloud stores, HDFS or any Hadoop FileSystem compatible storage). Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi reimagines slow old-school batch data processing with a powerful new incremental processing framework for low latency minute-level analytics. Hudi provides...
    Downloads: 0 This Week
  • 17
    AKHQ

    Kafka GUI for Apache Kafka to manage topics, topics data, etc.

    Kafka GUI for Apache Kafka to manage topics, topic data, consumer groups, schema registry, Connect and more. It enables your teams to search and explore data in a unified console, while supporting its administration and integration within your ecosystem. It offers a multi-cluster view in a central console, available in multi-cloud environments, and lets users access, search and get insights from your topics, including Live Tail.
    Downloads: 41 This Week
  • 18
    Reactor Core

    Non-Blocking Reactive Foundation for the JVM

    Reactor Core is a foundational library for building reactive applications in Java, providing a powerful API for asynchronous, non-blocking programming.
    Downloads: 7 This Week
  • 19
    FIT Framework

    An enterprise-level AI development framework

    ...The system is built to be extensible, enabling integration with various machine learning libraries and tools, as well as customization for domain-specific tasks.
    Downloads: 7 This Week
  • 20
    Wren Engine

    The Semantic Engine for Model Context Protocol (MCP)

    Wren Engine is a semantic engine designed to empower Model Context Protocol (MCP) clients and AI agents by providing accurate, contextual, and governed access to business data. It serves as a bridge between large language models (LLMs) and enterprise systems, facilitating seamless integration and interaction.
    Downloads: 5 This Week
  • 21
    Stirling-PDF

    Web application that allows you to perform operations on PDF files

    Stirling PDF is a powerful, locally hosted web-based PDF manipulation tool offering a wide range of editing, conversion, and utility features. It allows users to merge, split, compress, convert, OCR, and perform other operations on PDF files directly from a browser without uploading data to third-party servers. The tool is privacy-conscious, self-hostable via Docker, and built with modularity in mind to allow future expansion and integration.
    Downloads: 40 This Week
  • 22
    MyBatis Mapper4

    Mybatis common mapper, easy to use

    This book starts with a simple MyBatis query to build a basic development environment for learning MyBatis. Through a comprehensive sample code and test, the basic usage of adding, deleting, modifying, and checking operations in the MyBatis XML mode and annotation mode is explained, and the application of dynamic SQL in different aspects and the best practice program in the use process are introduced. Provides a wealth of examples for MyBatis advanced mapping, stored procedures, and type...
    Downloads: 2 This Week
  • 23
    Testcontainers Java

    Testcontainers is a Java library that supports JUnit tests

    Testcontainers for Java is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container. Use a containerized instance of a MySQL, PostgreSQL or Oracle database to test your data access layer code for complete compatibility, but without requiring complex setup on developers' machines and safe in the knowledge that your tests will always start with a known DB state. Any other...
    Downloads: 8 This Week
  • 24
    EnvFile

    EnvFile 3.x is a plugin for JetBrains IDEs

    EnvFile is a plugin for JetBrains IDEs that allows you to set environment variables for your run configurations from one or multiple files. Not all run configurations available in IDEA-based IDEs are implemented similarly. Some of them differ significantly. In certain cases (so far, only Gradle has been confirmed) the implementation exposes interfaces to integrate the EnvFile UI but doesn't provide interfaces for it to actually do its work. Luckily, it was possible to make a few assumptions...
    Downloads: 15 This Week
  • 25
    IoTDB

    Apache IoTDB

    Apache IoTDB (Database for Internet of Things) is an IoT native database with high performance for data management and analysis, deployable on the edge and the cloud. Due to its light-weight architecture, high performance and rich feature set together with its deep integration with Apache Hadoop, Spark and Flink, Apache IoTDB can meet the requirements of massive data storage, high-speed data ingestion and complex data analysis in the IoT industrial fields. ...
    Downloads: 4 This Week