Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Updated Aug 5, 2025 - Python
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Available both self-hosted and cloud-hosted.
Turns Data and AI algorithms into production-ready web applications in no time.
An orchestration platform for the development, production, and observation of data assets.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
ingestr is a CLI tool that seamlessly copies data between any databases with a single command.
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
Work with your web service, database, and streaming schemas in a single format.
Powerful RDF Knowledge Graph Generation with RML Mappings
🎼 Integrate multiple high-dimensional datasets with fuzzy k-means and locally linear adjustments.
Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources
An example mini data warehouse for Python project stats; a template for new projects
scikit-fusion: Data fusion via collective latent factor models
First-party data integration solution built for marketing teams to enable audience and conversion onboarding into Google Marketing products (Google Ads, Campaign Manager, Google Analytics).
An Efficient RML-Compliant Engine for Knowledge Graph Construction
A tool for semi-automatic cell type harmonization and integration
A tool facilitating matching for any dataset discovery method. Also, an extensible experiment suite for state-of-the-art schema matching methods.
Prism is the easiest way to develop, orchestrate, and execute data pipelines in Python.
Build complete API integrations with YAML and SQL. Rapid development without vendor lock-in and per-row costs.