Data Integration Tools for Windows

View 114 business solutions

Browse free open source Data Integration tools and projects for Windows below. Use the toggles on the left to filter open source Data Integration tools by OS, license, language, programming language, and project status.

  • The Most Powerful Software Platform for EHSQ and ESG Management Icon
    The Most Powerful Software Platform for EHSQ and ESG Management

    Addresses the needs of small businesses and large global organizations with thousands of users in multiple locations.

    Choose from a complete set of software solutions across EHSQ that address all aspects of top performing Environmental, Health and Safety, and Quality management programs.
    Learn More
  • Skillfully - The future of skills based hiring Icon
    Skillfully - The future of skills based hiring

    Realistic Workplace Simulations that Show Applicant Skills in Action

    Skillfully transforms hiring through AI-powered skill simulations that show you how candidates actually perform before you hire them. Our platform helps companies cut through AI-generated resumes and rehearsed interviews by validating real capabilities in action. Through dynamic job specific simulations and skill-based assessments, companies like Bloomberg and McKinsey have cut screening time by 50% while dramatically improving hire quality.
    Learn More
  • 1
    Pentaho

    Pentaho

    Pentaho offers comprehensive data integration and analytics platform.

    Pentaho couples data integration with business analytics in a modern platform to easily access, visualize and explore data that impacts business results. Use it as a full suite or as individual components that are accessible on-premise, in the cloud, or on-the-go (mobile). Pentaho enables IT and developers to access and integrate data from any source and deliver it to your applications all from within an intuitive and easy to use graphical tool. The Pentaho Enterprise Edition Free Trial can be obtained from https://pentaho.com/download/
    Leader badge
    Downloads: 1,598 This Week
    Last Update:
    See Project
  • 2
    Pentaho Data Integration

    Pentaho Data Integration

    Pentaho Data Integration ( ETL ) a.k.a Kettle

    Pentaho Data Integration uses the Maven framework. Project distribution archive is produced under the assemblies module. Core implementation, database dialog, user interface, PDI engine, PDI engine extensions, PDI core plugins, and integration tests. Maven, version 3+, and Java JDK 1.8 are requisites. Use of the Pentaho checkstyle format (via mvn checkstyle:check and reviewing the report) and developing working Unit Tests helps to ensure that pull requests for bugs and improvements are processed quickly. In addition to the unit tests, there are integration tests that test cross-module operation.
    Downloads: 94 This Week
    Last Update:
    See Project
  • 3
    Dagster

    Dagster

    An orchestration platform for the development, production

    Dagster is an orchestration platform for the development, production, and observation of data assets. Dagster as a productivity platform: With Dagster, you can focus on running tasks, or you can identify the key assets you need to create using a declarative approach. Embrace CI/CD best practices from the get-go: build reusable components, spot data quality issues, and flag bugs early. Dagster as a robust orchestration engine: Put your pipelines into production with a robust multi-tenant, multi-tool engine that scales technically and organizationally. Dagster as a unified control plane: The ‘single plane of glass’ data teams love to use. Rein in the chaos and maintain control over your data as the complexity scales. Centralize your metadata in one tool with built-in observability, diagnostics, cataloging, and lineage. Spot any issues and identify performance improvement opportunities.
    Downloads: 20 This Week
    Last Update:
    See Project
  • 4
    nango

    nango

    A single API for all your integrations.

    Nango is a single API to interact with all other external APIs. It should be the only API you need to integrate to your app. Nango is an open-source solution for integrating third-party APIs with applications, simplifying API authentication, data syncing, and management.
    Downloads: 13 This Week
    Last Update:
    See Project
  • Parasoft: Automated Testing to Deliver Superior Quality Software Icon
    Parasoft: Automated Testing to Deliver Superior Quality Software

    Parasoft provides test automation for every phase of the software development life cycle.

    Parasoft helps organizations continuously deliver high-quality software with its AI-powered software testing platform and automated test solutions. Supporting the embedded, enterprise, and IoT markets, Parasoft’s proven technologies reduce the time, effort, and cost of delivering secure, reliable, and compliant software by integrating everything from deep code analysis and unit testing to web UI and API testing, plus service virtualization and complete code coverage, into the delivery pipeline. Bringing all this together, Parasoft’s award-winning reporting and analytics dashboard provides a centralized view of quality, enabling organizations to deliver with confidence and succeed in today’s most strategic ecosystems and development initiatives—security, safety-critical, Agile, DevOps, and continuous testing.
    Learn More
  • 5
    PHPCI

    PHPCI

    PHPCI is a free and open source continuous integration tool

    PHPCI is a continuous integration (CI) server designed specifically for PHP applications. It automates tasks such as testing, code quality checks, and deployment, helping developers maintain code consistency and detect issues early. PHPCI supports various plugins and tools, including PHPUnit, PHPMD, and Codeception, making it highly customizable for different project needs.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 6
    Common Core Ontologies

    Common Core Ontologies

    The Common Core Ontology Repository

    The Common Core Ontologies (CCO) comprise twelve ontologies that are designed to represent and integrate taxonomies of generic classes and relations across all domains of interest. CCO is a mid-level extension of Basic Formal Ontology (BFO), an upper-level ontology framework widely used to structure and integrate ontologies in the biomedical domain (Arp, et al., 2015). BFO aims to represent the most generic categories of entity and the most generic types of relations that hold between them, by defining a small number of classes and relations. CCO then extends from BFO in the sense that every class in CCO is asserted to be a subclass of some class in BFO, and that CCO adopts the generic relations defined in BFO (e.g., has_part) (Smith and Grenon, 2004). Accordingly, CCO classes and relations are heavily constrained by the BFO framework, from which it inherits much of its basic semantic relationships.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 7
    Airbyte

    Airbyte

    Data integration platform for ELT pipelines from APIs, databases

    We believe that only an open-source solution to data movement can cover the long tail of data sources while empowering data engineers to customize existing connectors. Our ultimate vision is to help you move data from any source to any destination. Airbyte already provides the largest catalog of 300+ connectors for APIs, databases, data warehouses, and data lakes. Moving critical data with Airbyte is as easy and reliable as flipping on a switch. Our teams process more than 300 billion rows each month for ambitious businesses of all sizes. Enable your data engineering teams to focus on projects that are more valuable to your business. Building and maintaining custom connectors have become 5x easier with Airbyte. With an average response rate of 10 minutes or less and a Customer Satisfaction score of 96/100, our team is ready to support your data integration journey all over the world.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 8
    Apache DevLake

    Apache DevLake

    Apache DevLake is an open-source dev data platform

    Apache DevLake is an open-source dev data platform that ingests, analyzes, and visualizes the fragmented data from DevOps tools to extract insights for engineering excellence, developer experience, and community growth. Apache DevLake is designed for developer teams looking to make better sense of their development process and to bring a more data-driven approach to their own practices. You can ask Apache DevLake many questions regarding your development process. Just connect and query. Your Dev Data lives in many silos and tools. DevLake brings them all together to give you a complete view of your Software Development Life Cycle (SDLC). From DORA to scrum retros, DevLake implements metrics effortlessly with prebuilt dashboards supporting common frameworks and goals. DevLake fits teams of all shapes and sizes, and can be readily extended to support new data sources, metrics, and dashboards, with a flexible framework for data collection and transformation.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 9
    Gradle Docker Compose Plugin

    Gradle Docker Compose Plugin

    Simplifies usage of Docker Compose for integration testing

    The Gradle Docker Compose Plugin by Avast integrates Docker Compose lifecycle management into Gradle builds. It allows developers to define and manage Docker containers required for integration testing or local development directly from their Gradle build scripts. This plugin automates the startup and shutdown of services, supports container health checks, and enables tight integration between application code and containerized services, enhancing reproducibility and automation in development pipelines.
    Downloads: 9 This Week
    Last Update:
    See Project
  • Assembled is the only unified platform for staffing and managing your human and AI support team. Icon
    Assembled is the only unified platform for staffing and managing your human and AI support team.

    AI for world-class support operations

    Assembled is the only platform that unifies AI agents and intelligent workforce management to power fast and flexible support operations. Built for scale, we help teams automate over 50% of customer interactions, forecast with 90%+ accuracy, and optimize staffing across in-house and BPO teams. Orchestrate every chat, email, or call, balancing workloads between human and AI agents in real time — without sacrificing quality or control. Trusted by Stripe, Canva, and Robinhood, Assembled transforms support from a cost center into a strategic advantage. Our Workforce and Vendor Management tools connect forecasting, scheduling, and performance for smarter staffing decisions. AI Agents automate conversations across channels with your workflows and brand voice. AI Copilot empowers agents with real-time guidance, suggested replies, and one-click actions for faster, higher-quality resolutions.
    Learn More
  • 10
    Recap

    Recap

    Recap tracks and transform schemas across your whole application

    Recap is a schema language and multi-language toolkit to track and transform schemas across your whole application. Your data passes through web services, databases, message brokers, and object stores. Recap describes these schemas in a single language, regardless of which system your data passes through. Recap schemas can be defined in YAML, TOML, JSON, XML, or any other compatible language.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 11
    KubeRay

    KubeRay

    A toolkit to run Ray applications on Kubernetes

    KubeRay is a powerful, open-source Kubernetes operator that simplifies the deployment and management of Ray applications on Kubernetes. It offers several key components. KubeRay core: This is the official, fully-maintained component of KubeRay that provides three custom resource definitions, RayCluster, RayJob, and RayService. These resources are designed to help you run a wide range of workloads with ease.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 12
    nichenetr

    nichenetr

    NicheNet: predict active ligand-target links between interacting cells

    nichenetr: the R implementation of the NicheNet method. The goal of NicheNet is to study intercellular communication from a computational perspective. NicheNet uses human or mouse gene expression data of interacting cells as input and combines this with a prior model that integrates existing knowledge on ligand-to-target signaling paths. This allows to predict ligand-receptor interactions that might drive gene expression changes in cells of interest. This model of prior information on potential ligand-target links can then be used to infer active ligand-target links between interacting cells. NicheNet prioritizes ligands according to their activity (i.e., how well they predict observed changes in gene expression in the receiver cell) and looks for affected targets with high potential to be regulated by these prioritized ligands.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 13
    PhantomJS-Node

    PhantomJS-Node

    PhantomJS integration module for NodeJS

    PhantomJS-Node is a Node.js bridge to PhantomJS, enabling programmatic control of the headless browser for tasks like web scraping, automated testing, and page rendering.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 14

    Tarunyai

    Tool is in development please donot download.Analytics tool

    This tool is still in development phase please donot download for now. Tarunyai data integration prepares and blends data to create a complete picture of your business that drives actionable insights.
    Downloads: 61 This Week
    Last Update:
    See Project
  • 15
    ChunJun

    ChunJun

    A data integration framework

    ChunJun is a distributed integration framework, and currently is based on Apache Flink. It was initially known as FlinkX and renamed ChunJun on February 22, 2022. It can realize data synchronization and calculation between various heterogeneous data sources. ChunJun has been deployed and running stably in thousands of companies so far. Based on the real-time computing engine--Flink, and supports JSON template and SQL script configuration tasks. The SQL script is compatible with Flink SQL syntax. Supports a variety of heterogeneous data sources, and supports synchronization and calculation of more than 20 data sources such as MySQL, Oracle, SQLServer, Hive, Kudu, etc. Easy to expand, highly flexible, newly expanded data source plugins can integrate with existing data source plugins instantly, plugin developers do not need to care about the code logic of other plugins.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    PANDORA

    PANDORA

    Revolutionizing Biomedical Research with Advanced Machine Learning

    PANDORA is a machine learning (ML) tool that can be used to integrate various data types, including clinical, transcriptome and microbiome data and find connections in large datasets. PANDORA can be easily installed using Docker, a pre-built version of the software can be pulled from DockerHub. In order to run a test instance of PANDORA, users will first need to prepare their local environment by downloading, installing, and configuring Docker. genular is a community behind SIMON an open-source Machine Learning KnowledgeDiscovery software, built by a vibrant community of people just like you! Join us and make SIMON even cooler! Exploratory analysis of machine learning results with the help of many different visualization techniques will give you instant insights into models and data.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    Open Information Integration
    Open Information Integration Tool Suite (Open II) is used by analysts and programmers to accelerate data integration and harmonization across organizations. OpenII has a neutral schema repository for browsing and comparing all sorts of data models. OpenII is built as a Rich Client Platform Application on top of Eclipse 3.x. Developers need to download Eclipse, install the RCP support, the Fatjar plugin and the Delta Pack in one of the 3.x flavors. Release Notes Release Date: Jan 2014 Build Version: 1.0.2666 Notes: 1. Now support for AVRO and HCatalog imports 2. Better support for OWL 3. New OWL and Containing Relationship viewers 4. Added case insensitive option in exact matcher for Harmony
    Downloads: 14 This Week
    Last Update:
    See Project
  • 18
    Metl ETL Data Integration

    Metl ETL Data Integration

    Simple message-based, web-based ETL integration

    Metl is a simple, web-based ETL tool that allows for data integrations including database, files, messaging, and web services. Supports RDBMS, SOAP, HTTP, FTP, SFTP, XML, FIXLEN, CSV, JSON, ZIP, and more. Metl implements scheduled integration tasks without the need for custom coding or heavy infrastructure. It can be deployed in the cloud or in an internal data center, and it was built to allow developers to extend it with custom components.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 19
    XAware Data Integration Project

    XAware Data Integration Project

    Create XML and JSON data services from any data source

    Create services to integrate applications & move data of any type. Build data views across DBMS, SOAP, HTTP/REST, Salesforce, SAP, Microsoft, SharePoint, Text, LDAP, FTP sources to read, write & transfer data. Eclipse designer & run-time engine.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 20
    Open Source Data Quality and Profiling

    Open Source Data Quality and Profiling

    World's first open source data quality & data preparation project

    This project is dedicated to open source data quality and data preparation solutions. Data Quality includes profiling, filtering, governance, similarity check, data enrichment alteration, real time alerting, basket analysis, bubble chart Warehouse validation, single customer view etc. defined by Strategy. This tool is developing high performance integrated data management platform which will seamlessly do Data Integration, Data Profiling, Data Quality, Data Preparation, Dummy Data Creation, Meta Data Discovery, Anomaly Discovery, Data Cleansing, Reporting and Analytic. It also had Hadoop ( Big data ) support to move files to/from Hadoop Grid, Create, Load and Profile Hive Tables. This project is also known as "Aggregate Profiler" Resful API for this project is getting built as (Beta Version) https://sourceforge.net/projects/restful-api-for-osdq/ apache spark based data quality is getting built at https://sourceforge.net/projects/apache-spark-osdq/
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    Framework for text mining, data integration and data analysis. Keywords: ontology and graph alignment, relation mining, warehouse, semantic database integration, bioinformatics, systems biology, microarray, Java.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 22
    CloverDX

    CloverDX

    Design, automate, operate and publish data pipelines at scale

    Please, visit www.cloverdx.com for latest product versions. Data integration platform; can be used to transform/map/manipulate data in batch and near-realtime modes. Suppors various input/output formats (CSV,FIXLEN,Excel,XML,JSON,Parquet, Avro,EDI/X12,HL7,COBOL,LOTUS, etc.). Connects to RDBMS/JMS/Kafka/SOAP/Rest/LDAP/S3/HTTP/FTP/ZIP/TAR. CloverDX offers 100+ specialized components which can be further extended by creation of "macros" - subgraphs - and libraries, shareable with 3rd parties. Simple data manipulation jobs can be created visually. More complex business logic can be implemented using Clover's domain-specific-language CTL, in Java or languages like Python or JavaScript. Through its DataServices functionality, it allows to quickly turn data pipelines into REST API endpoints. The platform allows to easily scale your data job across multiple cores or nodes/machines. Supports Docker/Kubernetes deployments and offers AWS/Azure images in their respective marketplace
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    OPENSUITE - an integration platform to enable process data integration between independently developed business applications.OPENSUITE integration platform takes advantage of the SOA best integration practices to supply the middleware layer functionality
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24

    PDI Data Vault framework

    Data Vault loading automation using Pentaho Data Integration.

    A metadata driven 'tool' to automate loading a designed Data Vault. It consists of a set of Pentaho Data Integration and database objects. Thel Virtual Machine (VMware) is a 64 bit Ubuntu Server 14.04, with MySQL (Percona Server) and PostgreSQL 9.4 as the database flavours and PDI version 5.2 CE. NB: Directory version_2.4 contains the most recent Virtual Machine. The readme.txt contains info about that VM.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    The aim of this project is to publish releases of Pentaho Data Integration not provided by pentaho.org
    Downloads: 2 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • Next
MongoDB Logo MongoDB