Storm principles

http://storm.apache.org/releases/current/Understanding-the-parallelism-of-a-Storm-topology.html

1. The three components: worker processes, executors, and tasks

A worker process executes a subset of a topology.
**A worker process belongs to a specific topology** and may run one or more executors for one or more components (spouts or bolts) of this topology.
A running topology consists of many such processes running on many machines within a Storm cluster.
An executor is a thread spawned by a worker process. It may run one or more tasks for the same component (spout or bolt).
A task performs the actual data processing: each spout or bolt that you implement in your code runs as multiple tasks across the cluster. Each task is an instance of exactly one component (spout or bolt).
The number of threads is less than or equal to the number of tasks: #threads ≤ #tasks. By default, the number of tasks is set to be the same as the number of executors, i.e. Storm will run one task per thread.
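
The default and the #threads ≤ #tasks condition can be written down as a small check. This is only a sketch in plain Java with no Storm dependency; the class and method names (`TaskDefaults`, `effectiveTasks`, `holds`) are made up for illustration and are not part of the Storm API.

```java
// Illustrative only: plain Java, no Storm classes involved.
// All names here are hypothetical helpers, not Storm API.
final class TaskDefaults {

    /** Task count for a component: the explicit value if one was set,
     *  otherwise the number of executors (one task per thread). */
    static int effectiveTasks(int executors, Integer explicitTasks) {
        return explicitTasks != null ? explicitTasks : executors;
    }

    /** Checks the condition #threads <= #tasks for one component. */
    static boolean holds(int executors, int tasks) {
        return executors >= 1 && executors <= tasks;
    }

    public static void main(String[] args) {
        int defaultTasks = effectiveTasks(2, null);   // no explicit task count
        System.out.println(defaultTasks);             // prints 2
        System.out.println(holds(2, defaultTasks));   // prints true

        int explicit = effectiveTasks(2, 4);          // 2 executors, 4 tasks
        System.out.println(holds(2, explicit));       // prints true
        System.out.println(holds(6, 4));              // prints false
    }
}
```

With 2 executors and no explicit task count, the component ends up with 2 tasks, one per thread.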


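2. What "parallelism" means here

In Storm's own terminology, "parallelism" refers specifically to the parallelism hint, i.e. the initial number of executors (threads) of a component. The Storm documentation uses the term in a broader sense: you can configure not only the number of executors but also the number of worker processes and the number of tasks of a Storm topology. It specifically calls out when "parallelism" is used in the normal, narrow definition of Storm.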

3. Setting the parallelism

BlueSpout sends its output to GreenBolt, which in turn sends its own output to YellowBolt.


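// Package names as of Storm 1.x and later; BlueSpout, GreenBolt and YellowBolt are your own classes.
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

TopologyBuilder topologyBuilder = new TopologyBuilder();

Config conf = new Config();
conf.setNumWorkers(2); // use two worker processes

topologyBuilder.setSpout("blue-spout", new BlueSpout(), 2); // set parallelism hint to 2

topologyBuilder.setBolt("green-bolt", new GreenBolt(), 2) // parallelism hint: 2 executors
               .setNumTasks(4)                            // 4 tasks, i.e. 2 tasks per executor
               .shuffleGrouping("blue-spout");

topologyBuilder.setBolt("yellow-bolt", new YellowBolt(), 6) // 6 executors, one task each by default
               .shuffleGrouping("green-bolt");

StormSubmitter.submitTopology(
        "mytopology",
        conf,
        topologyBuilder.createTopology()
    );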
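
The numbers in this configuration can be checked with a little arithmetic: the combined parallelism is 2 + 2 + 6 = 10 executors, so each of the 2 worker processes runs 10 / 2 = 5 threads, and the green bolt's 4 tasks are spread over 2 executors, 2 tasks each. Below is a rough sketch of that calculation in plain Java, with no Storm dependency. The class `ParallelismMath` and its methods are hypothetical names used only for this illustration, and the even division is an assumption that happens to hold for these numbers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative arithmetic only; none of this is Storm API.
final class ParallelismMath {

    /** Parallelism hints (executor counts) taken from the example topology. */
    static Map<String, Integer> exampleExecutors() {
        Map<String, Integer> executors = new LinkedHashMap<>();
        executors.put("blue-spout", 2);
        executors.put("green-bolt", 2);
        executors.put("yellow-bolt", 6);
        return executors;
    }

    /** Sum of the executor counts of all components. */
    static int totalExecutors(Map<String, Integer> executors) {
        int total = 0;
        for (int count : executors.values()) {
            total += count;
        }
        return total;
    }

    /** Threads per worker, assuming the executors divide evenly across workers. */
    static int executorsPerWorker(int totalExecutors, int workers) {
        return totalExecutors / workers;
    }

    /** Tasks per executor, assuming the tasks divide evenly across executors. */
    static int tasksPerExecutor(int tasks, int executors) {
        return tasks / executors;
    }

    public static void main(String[] args) {
        int total = totalExecutors(exampleExecutors());
        System.out.println(total);                        // prints 10
        System.out.println(executorsPerWorker(total, 2)); // prints 5
        System.out.println(tasksPerExecutor(4, 2));       // prints 2 (green-bolt)
    }
}
```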