
How to control the placement of Parallel / GroupBy stages in Google Cloud Dataflow


I'm building an ElasticSearch output for Google Cloud Dataflow (batch pipelines) that is scalable and does not put load on the ES cluster. My idea is to have the nodes of the Dataflow pipeline join the ES cluster and perform the indexing themselves, without putting any additional load on the main ES cluster. To that end, I have a stage in the pipeline that, at the start of each bundle, creates an ES node that joins the cluster and then indexes every item passed to it itself (via routing settings).
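To make the described stage concrete, here is a minimal sketch of the per-bundle lifecycle in plain Python. It is not Dataflow SDK or ElasticSearch code: `FakeEsNode`, `EsIndexingStage`, `run_bundle`, the cluster name, and the small batching buffer are all hypothetical names and assumptions introduced for illustration, and the fake node only records documents in memory.

```python
from typing import List, Optional


class FakeEsNode:
    """Hypothetical stand-in for an ES node that has joined the cluster."""

    def __init__(self, cluster: str) -> None:
        self.cluster = cluster
        self.indexed: List[dict] = []
        self.closed = False

    def bulk_index(self, docs: List[dict]) -> None:
        # A real node would index locally; here we only record the documents.
        self.indexed.extend(docs)

    def close(self) -> None:
        self.closed = True


class EsIndexingStage:
    """Mimics a pipeline stage with start-of-bundle and end-of-bundle hooks."""

    def __init__(self, cluster: str, batch_size: int = 2) -> None:
        self.cluster = cluster
        self.batch_size = batch_size  # assumption: small batches, not in the original
        self.node: Optional[FakeEsNode] = None
        self.buffer: List[dict] = []

    def start_bundle(self) -> None:
        # One node per bundle, created when the bundle starts.
        self.node = FakeEsNode(self.cluster)
        self.buffer = []

    def process(self, doc: dict) -> None:
        self.buffer.append(doc)
        if len(self.buffer) >= self.batch_size:
            self._flush()

    def finish_bundle(self) -> None:
        # Index whatever is left, then leave the cluster.
        self._flush()
        self.node.close()

    def _flush(self) -> None:
        if self.buffer:
            self.node.bulk_index(self.buffer)
            self.buffer = []


def run_bundle(stage: EsIndexingStage, docs: List[dict]) -> FakeEsNode:
    """Drive one bundle through the stage and return the node it used."""
    stage.start_bundle()
    for doc in docs:
        stage.process(doc)
    node = stage.node
    stage.finish_bundle()
    return node


if __name__ == "__main__":
    used = run_bundle(EsIndexingStage("my-cluster"), [{"id": 1}, {"id": 2}, {"id": 3}])
    print(len(used.indexed), used.closed)
```

Because the node is created per bundle, the number and size of bundles the runner produces determines how many nodes join the cluster, which is why the placement of the parallel and GroupBy stages matters here.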



