谷歌的数据流和谷歌的数据流有什么区别?

时间:2022-06-18 14:50:18

DataFlow itself has ETL,computation and streaming process why do we need to go for google's Dataproc?

DataFlow本身有ETL,计算和流媒体处理为什么我们需要去谷歌的Dataproc?

1 个解决方案

#1


5  

Google Dataflow is a fully managed and self-optimizing cloud service that lets you use the Apache Beam programming model to write your batch and streaming data processing pipelines. It's integrated with many open source and Google Cloud data sources and sinks.

Google Dataflow是一种完全托管和自我优化的云服务,可让您使用Apache Beam编程模型编写批处理和流数据处理管道。它与许多开源和Google Cloud数据源和接收器集成在一起。

Google Dataproc is a fully managed cloud service for running Apache Hadoop and Apache Spark clusters in a simple cost-effective way. If you have existing data processing pipelines that use Spark, Hive, or Pig this is an quick and easy way to move your pipelines. You can install custom packages, start/stop and scale these clusters at any time. On top Google Dataproc is integrated with many of Google Clouds data services.

Google Dataproc是一种完全托管的云服务,用于以简单经济的方式运行Apache Hadoop和Apache Spark群集。如果您有使用Spark,Hive或Pig的现有数据处理管道,这是一种快速简便的移动管道的方法。您可以随时安装自定义程序包,启动/停止和缩放这些群集。最重要的是,Google Dataproc与许多Google云端数据服务集成在一起。

#1


5  

Google Dataflow is a fully managed and self-optimizing cloud service that lets you use the Apache Beam programming model to write your batch and streaming data processing pipelines. It's integrated with many open source and Google Cloud data sources and sinks.

Google Dataflow是一种完全托管和自我优化的云服务,可让您使用Apache Beam编程模型编写批处理和流数据处理管道。它与许多开源和Google Cloud数据源和接收器集成在一起。

Google Dataproc is a fully managed cloud service for running Apache Hadoop and Apache Spark clusters in a simple cost-effective way. If you have existing data processing pipelines that use Spark, Hive, or Pig this is an quick and easy way to move your pipelines. You can install custom packages, start/stop and scale these clusters at any time. On top Google Dataproc is integrated with many of Google Clouds data services.

Google Dataproc是一种完全托管的云服务,用于以简单经济的方式运行Apache Hadoop和Apache Spark群集。如果您有使用Spark,Hive或Pig的现有数据处理管道,这是一种快速简便的移动管道的方法。您可以随时安装自定义程序包,启动/停止和缩放这些群集。最重要的是,Google Dataproc与许多Google云端数据服务集成在一起。