Streaming-Pipeline-using-Dataflow download

[File properties]:

File name: Streaming-Pipeline-using-Dataflow

File size: 486KB

File format: ZIP

Updated: 2024-03-29 19:46:54

Python

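Streaming Pipeline using DataFlow (under construction)

This is part of the Introduction to Apache Beam using Python repository. Here we will try to learn the basics of Apache Beam in order to create streaming pipelines, working step by step through how to build one. The complete process is divided into 5 parts: reading the data from Pub/Sub, parsing the data, filtering the data, performing type conversion, data wrangling, deleting unwanted columns, and inserting the data into BigQuery.

Motivation: Over the last two years I have been on a good learning curve, upskilling myself to move into machine learning and cloud computing. This project is a practice project for everything I have learned. It is the first of many more to come.

Libraries/frameworks used
Built with
Cloning the repository

    # clone this repo:
    git clone https://github.com/adityasolanki205/Streaming-Pipeline-using-DataFlow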
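The steps listed in the description (read from Pub/Sub, parse, filter, convert types, wrangle, drop unwanted columns, write to BigQuery) can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the column names, the `STATUS_LABELS` mapping, the filter rule, and every function name are hypothetical, and the record layout is assumed to be a space-separated line. The per-record steps are plain Python functions so they can be tried without a Beam runner; `build_pipeline` shows one way they might be wired into Apache Beam, assuming `apache-beam[gcp]` is installed and the pipeline is started with streaming enabled.

```python
# Hypothetical, simplified schema: the real column layout is not given in the text.
COLUMNS = ["status", "duration_months", "purpose", "credit_amount", "classification"]
INT_COLUMNS = ["duration_months", "credit_amount", "classification"]
UNWANTED_COLUMNS = ["purpose"]
# Hypothetical mapping from coded values to readable labels.
STATUS_LABELS = {"A11": "negative balance", "A14": "no checking account"}


def parse(message: bytes) -> dict:
    """Decode a Pub/Sub payload (bytes) and split it into named string fields."""
    values = message.decode("utf-8").strip().split(" ")
    return dict(zip(COLUMNS, values))


def is_complete(record: dict) -> bool:
    """Filter predicate: keep only records with every expected field non-empty."""
    return len(record) == len(COLUMNS) and all(record.values())


def convert_types(record: dict) -> dict:
    """Cast the numeric fields from str to int."""
    return {k: int(v) if k in INT_COLUMNS else v for k, v in record.items()}


def wrangle(record: dict) -> dict:
    """Replace coded values with readable labels where a label is known."""
    out = dict(record)
    out["status"] = STATUS_LABELS.get(out["status"], out["status"])
    return out


def drop_unwanted(record: dict) -> dict:
    """Remove columns that should not reach the output table."""
    return {k: v for k, v in record.items() if k not in UNWANTED_COLUMNS}


def process_locally(messages):
    """Run the per-record steps on an in-memory list, without a Beam runner."""
    records = (parse(m) for m in messages)
    records = (r for r in records if is_complete(r))
    return [drop_unwanted(wrangle(convert_types(r))) for r in records]


def build_pipeline(pipeline, subscription: str, table: str):
    """Wire the same steps into Apache Beam.

    Assumes apache-beam[gcp] is installed and that `pipeline` was created
    with streaming enabled in its pipeline options.
    """
    import apache_beam as beam  # imported lazily so the helpers above run without Beam

    return (
        pipeline
        | "Read" >> beam.io.ReadFromPubSub(subscription=subscription)
        | "Parse" >> beam.Map(parse)
        | "Filter" >> beam.Filter(is_complete)
        | "ConvertTypes" >> beam.Map(convert_types)
        | "Wrangle" >> beam.Map(wrangle)
        | "DropColumns" >> beam.Map(drop_unwanted)
        | "Write" >> beam.io.WriteToBigQuery(
            table,
            schema=(
                "status:STRING,duration_months:INTEGER,"
                "credit_amount:INTEGER,classification:INTEGER"
            ),
            create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```

Calling `process_locally([b"A11 6 A43 1169 1"])` exercises the record-level logic on a single made-up message. Keeping each step as a small function means the same code can be checked locally and then reused unchanged inside `beam.Map` and `beam.Filter`.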


[File preview]:
Streaming-Pipeline-using-Dataflow-master
----generating_data.py(4KB)
----publish_to_pubsub.py(776B)
----publish_to_pubsub.ipynb(2KB)
----data()
--------german-original.data(78KB)
--------german.data(78KB)
--------.ipynb_checkpoints()
----Book1.xlsx(100KB)
----__pycache__()
--------Test.cpython-37.pyc(4KB)
--------Testing.cpython-37.pyc(4KB)
--------generating_data.cpython-37.pyc(3KB)
----output()
--------simpleoutput.txt-00000-of-00001(78KB)
--------Convert_datatype.txt-00000-of-00001(449KB)
--------beam-temp-testing.txt-865077c8798211eb8de37440bb0a5a10()
--------complete_output.txt-00000-of-00001(470KB)
--------beam-temp-Filtered_Data.txt-c411140676be11eba9e67440bb0a5a10()
--------Delete_Unwanted_Columns.txt-00000-of-00001(470KB)
--------SplitPardo.txt-00000-of-00001(468KB)
--------Filtered_Data.txt-00000-of-00001(462KB)
--------DataWrangle.txt-00000-of-00001(513KB)
--------beam-temp-Filtered_Data.txt-468a3b8076c011eb9d9b7440bb0a5a10()
--------.ipynb_checkpoints()
----streaming-pipeline.py(6KB)
----.ipynb_checkpoints()
--------generating_data-checkpoint.ipynb(6KB)
--------streaming-pipeline-checkpoint.py(6KB)
--------publish_to_pubsub-checkpoint.ipynb(72B)
--------generating_data-checkpoint.py(4KB)
--------Local-checkpoint.py(6KB)
--------README-checkpoint.md(16KB)
--------generating_data_testing-checkpoint.py(4KB)
--------Testing-checkpoint.py(5KB)
----README.md(16KB)
----generating_data_testing.py(4KB)
----generating_data.ipynb(6KB)
