Data Pipelines
A data pipeline, sometimes referred to as an ETL pipeline, is a sequence of ETL jobs that work together to transform data so that it can be consumed by one or more data products.
Generally, each ETL job in a data pipeline extracts data from one or more data sources, transforms it for some particular purpose, then loads it into a new data store. Subsequent jobs consume data from that store, transform it, and load the results into their own data stores, and so on.
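As a minimal sketch of this chaining (the job names, file paths, and transformations below are hypothetical, and local JSON files stand in for real data stores), two ETL jobs run in sequence: the first cleans raw records and loads them into an intermediate store, and the second consumes that store, aggregates, and loads a result for a data product to read.

```python
import json
from pathlib import Path

# Hypothetical data stores, represented here as local JSON files for illustration.
RAW_SOURCE = Path("raw_events.json")
CLEAN_STORE = Path("clean_events.json")
REPORT_STORE = Path("daily_report.json")

def etl_clean_events():
    """First ETL job: extract raw events, drop malformed ones, load to a clean store."""
    raw = json.loads(RAW_SOURCE.read_text())                        # extract
    clean = [e for e in raw if "user_id" in e and "amount" in e]    # transform
    CLEAN_STORE.write_text(json.dumps(clean))                       # load

def etl_daily_report():
    """Second ETL job: consume the previous job's store, aggregate, load a report."""
    events = json.loads(CLEAN_STORE.read_text())                    # extract
    totals = {}
    for e in events:                                                 # transform: total per user
        totals[e["user_id"]] = totals.get(e["user_id"], 0) + e["amount"]
    REPORT_STORE.write_text(json.dumps(totals))                      # load

if __name__ == "__main__":
    # Running the jobs in order forms the pipeline; in practice a scheduler or
    # orchestrator (e.g. cron or Airflow) would handle this sequencing.
    etl_clean_events()
    etl_daily_report()
```

The key property of the pipeline is that each job's output store is the next job's input, so the jobs can be developed, scheduled, and rerun independently as long as the intermediate stores keep a stable shape.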
Deeper Knowledge on Data Pipelines
Apache Kafka
A distributed event streaming platform for data pipelines and analytics
Data Products
Ways of making data available
Extract Transform Load (ETL)
Ways to extract, transform, and load data
Apache Spark
A data processing engine for batch processing, stream processing, and machine learning
Broader Topics Related to Data Pipelines
Data Products
Ways of making data available
Data Engineering
Engineering approaches to data management