cdc_deltaLake
Category: Automatic programming
Development tool: Jupyter Notebook
File size: 43KB
Downloads: 0
Upload date: 2022-09-07 15:41:40
Uploader: sh-1993
Description: Docker Compose and Google Colab demo for building a CDC (Change Data Capture) pipeline with Delta Lake
File list:
Dockerfile (124, 2021-11-17)
Spark_DeltaLake_Notebook.ipynb (42442, 2021-11-17)
docker-compose.yaml (6666, 2021-11-17)
img (0, 2021-11-17)
img\architecture.jpg (36234, 2021-11-17)
retail.sql (3178, 2021-11-17)
start_docker.sh (180, 2021-11-17)
# PostgreSQL (Debezium) - Kafka - Spark Delta Lake
## Description
This project is a demo for testing a CDC (Change Data Capture) pipeline.
All of the infrastructure is built with Docker.
![Architecture diagram](https://github.com/masfworld/cdc_deltaLake/blob/main/img/architecture.jpg?raw=true)
## Features
- PostgreSQL as the legacy database
- Debezium as the Change Data Capture tool
- Kafka to ingest change events from Debezium
- ksqlDB to transform Avro messages into JSON
- Spark with Delta Lake to process change events from the legacy database
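
To illustrate what the last stage of this pipeline has to do, here is a minimal sketch in plain Python of how Debezium-style change events can be folded into a table keyed by primary key. It is not taken from the notebook. It assumes the JSON messages keep Debezium's standard envelope (`op`, `before`, `after`), which may not hold if the ksqlDB step flattens them. The function name `apply_change_event`, the `id` key column, and the sample rows are hypothetical. In Spark, the same upsert/delete logic would typically be expressed as a Delta Lake merge rather than a dict update.

```python
import json


def apply_change_event(table, raw_event, key="id"):
    """Apply one Debezium-style change event to an in-memory table.

    `table` is a dict mapping primary-key values to row dicts.
    `key` is the assumed primary-key column (hypothetical: "id").
    """
    event = json.loads(raw_event)
    # With schemas enabled, the JSON converter nests the envelope under "payload".
    payload = event.get("payload", event)
    op = payload["op"]
    if op in ("c", "r", "u"):
        # create, snapshot read, update: upsert the new row image
        row = payload["after"]
        table[row[key]] = row
    elif op == "d":
        # delete: remove the row identified by the old row image
        row = payload["before"]
        table.pop(row[key], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")
    return table


# Hypothetical sample events, in the order they would arrive from Kafka.
events = [
    json.dumps({"op": "c", "before": None, "after": {"id": 1, "name": "Ada"}}),
    json.dumps({"op": "c", "before": None, "after": {"id": 2, "name": "Linus"}}),
    json.dumps({"op": "u", "before": {"id": 1, "name": "Ada"},
                "after": {"id": 1, "name": "Ada L."}}),
    json.dumps({"op": "d", "before": {"id": 2, "name": "Linus"}, "after": None}),
]

state = {}
for raw in events:
    apply_change_event(state, raw)

print(state)
```

Replaying the events in order matters: applying the update before the insert, or the delete before the insert, would leave the table in a different state, which is why the events for a given key need to be processed in the order Kafka delivers them.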