Architecture: Data Lake

David Liu edited this page Jan 4, 2023 · 8 revisions

Lifecycle

1. Ingest

Batch

  • Storage Transfer Service
  • BigQuery Data Transfer Service
  • Transfer Appliance

Streaming

  • Pub/Sub
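As a rough illustration of the streaming path, the sketch below publishes one JSON event to a Pub/Sub topic. It assumes the `google-cloud-pubsub` client library is installed and credentials are configured. The helper names `encode_event` and `publish_event`, the `source` attribute, and the project and topic IDs are hypothetical and not part of this wiki.

```python
import json


def encode_event(event: dict) -> bytes:
    """Serialize an event to UTF-8 JSON bytes (Pub/Sub payloads are bytes)."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


def publish_event(project_id: str, topic_id: str, event: dict) -> str:
    """Publish one event and return the server-assigned message ID.

    Assumption: google-cloud-pubsub is installed and application default
    credentials are available. The import is local so that encode_event
    stays usable without the library.
    """
    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, topic_id)
    # Attributes must be strings; "source" is a hypothetical attribute name.
    future = publisher.publish(topic_path, encode_event(event), source="example")
    return future.result()  # blocks until the publish is acknowledged


if __name__ == "__main__":
    # Hypothetical project and topic IDs; replace with real ones before running.
    print(publish_event("my-project", "my-topic", {"sensor": "a1", "value": 3}))
```

A downstream consumer, for example a Dataflow job, could then read from a subscription on the same topic.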

2. Store

[Image: storage decision tree]

3. Process and Analyze

Data cleansing and normalization

  • Cloud Dataprep Data Harvest
  • Dataplex ETL
  • Dataflow and Cloud Data Fusion for data ingestion

Warehouse

  • Dataproc and BigQuery

4. Explore and Visualize
