Apache Spark

Apache Spark is an open-source, extensible, distributed data processing engine suited to big data engineering tasks, including batch processing, stream processing, analytics, and machine learning. It provides APIs for the Python, SQL, Scala, Java, and R programming languages.

Broader Topics Related to Apache Spark

Data Pipelines

Ways of making data available

Data Analysis

The transformation of data to information

Open-Source Software

Useful open-source software projects

Apache Software Foundation (ASF)

Overview of the Apache Software Foundation (ASF)
