A data warehouse is a type of data product consisting of a database that accumulates data from a wide range of sources and stores it for efficient retrieval and analysis. Data warehouses are generally structured for online analytical processing (OLAP), which allows data to be viewed, filtered, and refined along multiple dimensions, the attributes of interest to the business. This is most commonly achieved by implementing a star schema, which uses a relational model to organize data into facts and dimensions.
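To make the fact and dimension split concrete, here is a minimal sketch of a star schema using Python's built-in `sqlite3` module. The table names (`fact_sales`, `dim_product`, `dim_date`), the columns, and the sample rows are all hypothetical, chosen only for illustration. A real warehouse would run on a dedicated analytical database rather than an in-memory SQLite file.

```python
import sqlite3

# Hypothetical star schema: one fact table referencing two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    amount REAL
);
""")
conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(1, "books"), (2, "games")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(10, 2023, "Q1"), (11, 2023, "Q2")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [(1, 10, 20.0), (1, 11, 30.0), (2, 10, 50.0), (2, 11, 5.0)])


def sales_by(dimension_column):
    """Sum the fact measure (amount) grouped by one dimension attribute."""
    # Whitelist the column name because it is interpolated into the SQL text.
    allowed = {"category", "year", "quarter"}
    if dimension_column not in allowed:
        raise ValueError(f"unknown dimension attribute: {dimension_column}")
    query = f"""
        SELECT {dimension_column}, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_product p ON p.product_id = f.product_id
        JOIN dim_date d ON d.date_id = f.date_id
        GROUP BY {dimension_column}
        ORDER BY {dimension_column}
    """
    return conn.execute(query).fetchall()


# The same facts viewed along two different dimensions.
by_category = sales_by("category")
by_quarter = sales_by("quarter")
```

The same fact rows are aggregated along different dimensions without changing the stored data, which is the kind of multi-dimensional view that OLAP is meant to support.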
A typical data warehouse architecture consists of multiple data sources, a staging area, the warehouse itself, and one or more data marts.
Data is extracted, transformed, and loaded from source to destination by ETL processes, which together constitute a data pipeline. Subsets of the data in a data warehouse are sometimes split out into data marts, which are essentially "miniature data warehouses" intended for a specific audience.
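The extract, transform, and load steps can be sketched as three small functions chained into a pipeline. This is only an illustration under stated assumptions: the source rows, the `orders` table, and the `mart_north_orders` view are hypothetical, and the data mart is modeled here as a simple SQL view, which is one of several ways a mart might be built.

```python
import sqlite3

# Hypothetical source records, e.g. rows exported from an operational system.
SOURCE_ROWS = [
    {"order_id": "1", "region": " north ", "amount": "19.50"},
    {"order_id": "2", "region": "South", "amount": "5.25"},
    {"order_id": "3", "region": "NORTH", "amount": "10.00"},
]


def extract():
    """Extract: read raw records from the source."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: cast types and normalize the region text."""
    return [
        (int(r["order_id"]), r["region"].strip().lower(), float(r["amount"]))
        for r in rows
    ]


def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    """Chain the three steps into one pipeline run."""
    load(conn, transform(extract()))


warehouse = sqlite3.connect(":memory:")
run_pipeline(warehouse)

# A data mart sketched as a view exposing one audience's subset of the data.
warehouse.execute(
    "CREATE VIEW mart_north_orders AS "
    "SELECT order_id, amount FROM orders WHERE region = 'north'"
)
north_orders = warehouse.execute(
    "SELECT order_id, amount FROM mart_north_orders ORDER BY order_id"
).fetchall()
```

Because the load step uses `INSERT OR REPLACE` keyed on `order_id`, running the pipeline again with the same source rows does not create duplicates.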
The data staging area is a temporary storage area for source data. It helps to quickly extract and consolidate source data, perform quality checks and cleansing, detect changes, troubleshoot problems, and perform pre-aggregation before the data is transferred to the data warehouse. Staging areas are often ephemeral, though they may be maintained or archived. In modern data warehouse architecture, the staging area is often a data lake.
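The staging tasks described above (quality checks, cleansing, change detection, and pre-aggregation) might look roughly like the following sketch. Everything here is an assumption made for illustration: the record layout, the rule that a row needs a non-empty `city`, and the idea of comparing against a `previous` snapshot keyed by `customer_id`.

```python
# Hypothetical staged batch pulled from a source system.
staged = [
    {"customer_id": 1, "city": "Oslo", "spend": 40.0},
    {"customer_id": 2, "city": None, "spend": 15.0},    # fails the quality check
    {"customer_id": 3, "city": "Bergen", "spend": 25.0},
    {"customer_id": 1, "city": "Oslo", "spend": 40.0},  # duplicate of the first row
]

# Hypothetical snapshot of what the warehouse already holds.
previous = {
    1: {"customer_id": 1, "city": "Oslo", "spend": 40.0},
    3: {"customer_id": 3, "city": "Bergen", "spend": 10.0},
}


def quality_check(rows):
    """Split rows into valid and rejected based on a simple completeness rule."""
    valid = [r for r in rows if r["city"]]
    rejected = [r for r in rows if not r["city"]]
    return valid, rejected


def deduplicate(rows):
    """Cleansing step: keep the first row seen for each customer_id."""
    seen = {}
    for r in rows:
        seen.setdefault(r["customer_id"], r)
    return list(seen.values())


def detect_changes(rows, snapshot):
    """Return rows that are new or differ from the previous snapshot."""
    return [r for r in rows if snapshot.get(r["customer_id"]) != r]


def pre_aggregate(rows):
    """Sum spend per city before handing data to the warehouse."""
    totals = {}
    for r in rows:
        totals[r["city"]] = totals.get(r["city"], 0.0) + r["spend"]
    return totals


valid, rejected = quality_check(staged)
clean = deduplicate(valid)
changed = detect_changes(clean, previous)
spend_by_city = pre_aggregate(clean)
```

Keeping the rejected rows rather than discarding them supports the troubleshooting role of the staging area, since failed records can be inspected before the batch is cleared.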
Deeper Knowledge on Data Warehouses
A columnar data warehouse solution on AWS
Azure Synapse Analytics
An integrated set of data services on Microsoft Azure
A combination of data lakes and data warehouses
Online Analytical Processing (OLAP)
A technique to create views and calculations from multi-dimensional data
Star schemas with normalized dimension tables
Schemas to organize data by facts and dimensions for analysis
Broader Topics Related to Data Warehouses
Ways of making data available
Organized collections of structured data