Twitter Storm Source Code Walkthrough (Part 2)

Published: 2024/4/13

Analysis of the topology submission process

Overview

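A storm cluster can be pictured as a factory. nimbus takes orders from the outside and assigns the work: besides accepting orders, it converts them into internal work assignments, so it also plays the role of the dispatch office. A supervisor is middle management, the foreman of a workshop, whose daily job is to wait for the dispatch office to hand down new work. The foreman does not do the work personally; a crew of ordinary workers does it. The supervisor only ever gives them two orders: start work and stop work. Note that stopping work does not mean the worker has finished the job at hand; the worker simply goes into a resting state.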

The topology submission process involves the following roles.

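storm client: submits the user-created topology to nimbus
nimbus: receives the submitted topology through the thrift interface
supervisor: downloads the latest task assignment according to the notifications received through the zk interface, and is responsible for starting workers
worker: runs tasks; each task is either a bolt or a spout
executor: a running thread; all tasks inside one executor are of the same type, that is, the tasks in one thread are either all bolts or all spouts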

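A worker corresponds to a process and an executor corresponds to a thread; one thread can run one or more tasks. Before version 0.8.0 each task mapped to exactly one thread. Version 0.8.0 introduced the executor concept, after which the one-to-one relationship between task and thread no longer holds, and the tasks subtree that used to exist on the zookeeper server disappeared as well. For details on this change see http://storm-project.net/2012/08/02/storm080-released.html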
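The relationship between tasks and executors can be illustrated with a small Python sketch. It assumes an even, contiguous split of task ids across executor threads; the function name partition_tasks is hypothetical and this is not Storm's own code.

```python
def partition_tasks(task_ids, num_executors):
    """Split task ids into num_executors contiguous groups of near-equal size."""
    if num_executors <= 0:
        raise ValueError("num_executors must be positive")
    base, extra = divmod(len(task_ids), num_executors)
    groups, start = [], 0
    for i in range(num_executors):
        # The first `extra` executors each take one additional task.
        size = base + (1 if i < extra else 0)
        groups.append(task_ids[start:start + size])
        start += size
    return groups


# One component with 5 tasks spread over 2 executor threads.
executors = partition_tasks([1, 2, 3, 4, 5], 2)
print(executors)
```

Every task in a group belongs to the same component, which matches the rule that one executor never mixes spout tasks and bolt tasks.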

storm client

The storm client runs the following command to submit a topology to the storm cluster. Assume the jar file is storm-starter-0.0.1-SNAPSHOT-standalone.jar, the main class is storm.starter.ExclamationTopology, and the topology is named exclamationTopology.

#./storm jar $HOME/working/storm-starter/target/storm-starter-0.0.1-SNAPSHOT-standalone.jar storm.starter.ExclamationTopology exclamationTopology

What does this one short line actually mean to the storm client? In front of the source code there are no secrets, so open the source of the storm script invoked above.

def jar(jarfile, klass, *args):
    """Syntax: [storm jar topology-jar-path class ...]

    Runs the main method of class with the specified arguments.
    The storm jars and configs in ~/.storm are put on the classpath.
    The process is configured so that StormSubmitter
    (http://nathanmarz.github.com/storm/doc/backtype/storm/StormSubmitter.html)
    will upload the jar at topology-jar-path when the topology is submitted.
    """
    exec_storm_class(
        klass,
        jvmtype="-client",
        extrajars=[jarfile, USER_CONF_DIR, STORM_DIR + "/bin"],
        args=args,
        jvmopts=["-Dstorm.jar=" + jarfile])

def exec_storm_class(klass, jvmtype="-server", jvmopts=[], extrajars=[], args=[], fork=False):
    global CONFFILE
    all_args = [
        "java", jvmtype, get_config_opts(),
        "-Dstorm.home=" + STORM_DIR,
        "-Djava.library.path=" + confvalue("java.library.path", extrajars),
        "-Dstorm.conf.file=" + CONFFILE,
        "-cp", get_classpath(extrajars),
    ] + jvmopts + [klass] + list(args)
    print "Running: " + " ".join(all_args)
    if fork:
        os.spawnvp(os.P_WAIT, "java", all_args)
    else:
        os.execvp("java", all_args) # replaces the current process and never returns
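To make the effect of the script concrete, here is a simplified Python 3 sketch of the argument list it assembles. The helper build_java_command, the /opt/storm paths and the classpath value are hypothetical placeholders; the real script derives them from its own configuration helpers.

```python
def build_java_command(klass, jarfile, args,
                       storm_dir="/opt/storm",
                       classpath="/opt/storm/storm.jar"):
    """Assemble the argv list that would be handed to os.execvp("java", ...)."""
    # -Dstorm.jar tells StormSubmitter which jar to upload later.
    jvmopts = ["-Dstorm.jar=" + jarfile]
    return (["java", "-client",
             "-Dstorm.home=" + storm_dir,
             "-cp", classpath + ":" + jarfile]
            + jvmopts + [klass] + list(args))


cmd = build_java_command(
    "storm.starter.ExclamationTopology",
    "storm-starter-0.0.1-SNAPSHOT-standalone.jar",
    ["exclamationTopology"])
print(" ".join(cmd))
```

The point to notice is that the topology name ends up as the first argument of the Java main method, while the jar path travels separately as the storm.jar system property.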

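Put simply, exec_storm_class runs the main function of the class passed in. The listing below uses WordCountTopology from storm-starter as the example (rather than the ExclamationTopology named in the command above), so look at how its main function is implemented.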

public static void main(String[] args) throws Exception {
    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("spout", new RandomSentenceSpout(), 5);
    builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
    builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));
    Config conf = new Config();
    conf.setDebug(true);
    if (args != null && args.length > 0) {
        conf.setNumWorkers(3);
        StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
    }
}

On the storm client side the key class, StormSubmitter, now shows itself. submitTopology is what really deserves study.

public static void submitTopology(String name, Map stormConf, StormTopology topology, SubmitOptions opts)
        throws AlreadyAliveException, InvalidTopologyException {
    if(!Utils.isValidConf(stormConf)) {
        throw new IllegalArgumentException("Storm conf is not valid. Must be json-serializable");
    }
    stormConf = new HashMap(stormConf);
    stormConf.putAll(Utils.readCommandLineOpts());
    Map conf = Utils.readStormConfig();
    conf.putAll(stormConf);
    try {
        String serConf = JSONValue.toJSONString(stormConf);
        if(localNimbus!=null) {
            LOG.info("Submitting topology " + name + " in local mode");
            localNimbus.submitTopology(name, null, serConf, topology);
        } else {
            NimbusClient client = NimbusClient.getConfiguredClient(conf);
            if(topologyNameExists(conf, name)) {
                throw new RuntimeException("Topology with name `" + name + "` already exists on cluster");
            }
            submitJar(conf);
            try {
                LOG.info("Submitting topology " + name + " in distributed mode with conf " + serConf);
                if(opts!=null) {
                    client.getClient().submitTopologyWithOpts(name, submittedJar, serConf, topology, opts);
                } else {
                    // this is for backwards compatibility
                    client.getClient().submitTopology(name, submittedJar, serConf, topology);
                }
            } catch(InvalidTopologyException e) {
                LOG.warn("Topology submission exception", e);
                throw e;
            } catch(AlreadyAliveException e) {
                LOG.warn("Topology already alive exception", e);
                throw e;
            } finally {
                client.close();
            }
        }
        LOG.info("Finished submitting topology: " + name);
    } catch(TException e) {
        throw new RuntimeException(e);
    }
}

submitTopology essentially does two things: it uploads the jar file to the storm cluster, and then it notifies the storm cluster that the upload is complete and the named topology can be run.
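The two steps can be sketched in Python against an in-memory stand-in for nimbus. FakeNimbus and submit are hypothetical; the method names follow what I understand the Nimbus service in storm.thrift to declare (beginFileUpload, uploadChunk, finishFileUpload, submitTopology) and should be treated as assumptions rather than as the generated client API.

```python
import io


class FakeNimbus:
    """In-memory stand-in for the thrift Nimbus client (hypothetical)."""

    def __init__(self):
        self.files = {}       # upload location -> bytearray of received data
        self.topologies = {}  # topology name -> uploaded jar location

    def beginFileUpload(self):
        location = "upload-%d.jar" % len(self.files)
        self.files[location] = bytearray()
        return location

    def uploadChunk(self, location, chunk):
        self.files[location] += chunk

    def finishFileUpload(self, location):
        pass  # a real server would close the file here

    def submitTopology(self, name, jar_location, json_conf, topology):
        if name in self.topologies:
            raise RuntimeError("topology already exists: " + name)
        self.topologies[name] = jar_location


def submit(nimbus, name, jar_bytes, json_conf="{}", topology=None, chunk_size=4):
    # Step 1: upload the jar chunk by chunk.
    location = nimbus.beginFileUpload()
    stream = io.BytesIO(jar_bytes)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        nimbus.uploadChunk(location, chunk)
    nimbus.finishFileUpload(location)
    # Step 2: tell nimbus the jar is in place and the topology can start.
    nimbus.submitTopology(name, location, json_conf, topology)
    return location


nimbus = FakeNimbus()
print(submit(nimbus, "exclamationTopology", b"fake jar contents"))
```

The second call carries only the location returned by the first step, so the jar itself never travels inside the submit request.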

First look at submitJar, the function that uploads the jar file. Its call graph is shown in the figure below.

Next comes the call graph for the second step. The figure was drawn with tikz/pgf and rendered as a PDF.

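In both call graphs the functions at the leaves are all declared in storm.thrift. If you have forgotten them by now, look back at the description of storm.thrift in section 1.3. On the client side these functions are all generated automatically by thrift.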

For reasons of space and time it is not covered here, but the source in TopologyBuilder.java is also very important to what the storm client does when it submits a topology.

nimbus

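The storm client sends the jar to nimbus through the thrift interface and then calls the predefined submitTopologyWithOpts to have the uploaded topology processed. How does nimbus receive the file, break the work down step by step, and finally hand it to the supervisors?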

submitTopologyWithOpts

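Everything starts from thrift again. service-handler in nimbus.clj implements the Nimbus interface defined by thrift. The full code is not listed here because it takes too much space; the part to look at is how it implements submitTopologyWithOpts.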

(^void submitTopologyWithOpts
  [this ^String storm-name ^String uploadedJarLocation ^String serializedConf ^StormTopology topology
   ^SubmitOptions submitOptions]
  (try
    (assert (not-nil? submitOptions))
    (validate-topology-name! storm-name)
    (check-storm-active! nimbus storm-name false)
    (.validate ^backtype.storm.nimbus.ITopologyValidator (:validator nimbus)
               storm-name
               (from-json serializedConf)
               topology)
    (swap! (:submitted-count nimbus) inc)
    (let [storm-id (str storm-name "-" @(:submitted-count nimbus) "-" (current-time-secs))
          storm-conf (normalize-conf
                       conf
                       (-> serializedConf
                           from-json
                           (assoc STORM-ID storm-id)
                           (assoc TOPOLOGY-NAME storm-name))
                       topology)
          total-storm-conf (merge conf storm-conf)
          topology (normalize-topology total-storm-conf topology)
          topology (if (total-storm-conf TOPOLOGY-OPTIMIZE)
                     (optimize-topology topology)
                     topology)
          storm-cluster-state (:storm-cluster-state nimbus)]
      (system-topology! total-storm-conf topology) ;; this validates the structure of the topology
      (log-message "Received topology submission for " storm-name " with conf " storm-conf)
      ;; lock protects against multiple topologies being submitted at once and
      ;; cleanup thread killing topology in b/w assignment and starting the topology
      (locking (:submit-lock nimbus)
        (setup-storm-code conf storm-id uploadedJarLocation storm-conf topology)
        (.setup-heartbeats! storm-cluster-state storm-id)
        (let [thrift-status->kw-status {TopologyInitialStatus/INACTIVE :inactive
                                        TopologyInitialStatus/ACTIVE :active}]
          (start-storm nimbus storm-name storm-id (thrift-status->kw-status (.get_initial_status submitOptions))))
        (mk-assignments nimbus)))
    (catch Throwable e
      (log-warn-error e "Topology submission exception. (topology name='" storm-name "')")
      (throw e))))

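The storm cluster creates a directory structure on the zookeeper server. The zookeeper paths are defined in cluster.clj, while config.clj defines the local directories that nimbus and the supervisors use on disk.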

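In plain words, the function above first runs the necessary checks on the submitted topology, including its name and the content and format of what was uploaded, much like the medical exam you take before starting at a new company. Once those checks pass it enters the critical section, and because it is a critical section a lock is taken.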
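The way the storm-id is derived in the listing above, (str storm-name "-" count "-" (current-time-secs)), can be mirrored in a few lines of Python. The validation rule in validate_topology_name is a hypothetical stand-in for validate-topology-name!, not a copy of it.

```python
import re
import time


def validate_topology_name(name):
    # Hypothetical rule: non-empty and free of characters that are awkward in paths.
    if not name or re.search(r'[/.:\\]', name):
        raise ValueError("invalid topology name: %r" % name)


def make_storm_id(name, submitted_count, now_secs=None):
    """Build an id of the form <name>-<submission counter>-<unix seconds>."""
    validate_topology_name(name)
    if now_secs is None:
        now_secs = int(time.time())
    return "%s-%d-%d" % (name, submitted_count, now_secs)


print(make_storm_id("exclamationTopology", 1, 1367000000))
```

Combining a counter with a timestamp means that resubmitting a topology under the same name still produces a fresh id.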

normalize-topology

(defn all-components [^StormTopology topology]
  (apply merge {}
         (for [f thrift/STORM-TOPOLOGY-FIELDS]
           (.getFieldValue topology f))))

Once all the components are listed, the configuration of each component can be read.
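A rough Python equivalent of all-components is shown below. It assumes the topology is a plain dict with the fields spouts, bolts and state_spouts, each mapping a component id to its settings; that dict layout is an illustration only, not the thrift-generated StormTopology class.

```python
def all_components(topology):
    """Merge the per-kind component maps into one id -> component map."""
    merged = {}
    for field in ("spouts", "bolts", "state_spouts"):
        merged.update(topology.get(field, {}))
    return merged


# The word count topology from the client section, reduced to parallelism hints.
word_count = {
    "spouts": {"spout": {"parallelism": 5}},
    "bolts": {"split": {"parallelism": 8}, "count": {"parallelism": 12}},
    "state_spouts": {},
}
for component_id, settings in sorted(all_components(word_count).items()):
    print(component_id, settings["parallelism"])
```

With every component in one map, reading a setting such as the parallelism hint no longer depends on whether the component is a spout or a bolt.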

mk-assignments

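Inside the critical section the key function is mk-assignments. It has two main jobs. The first is to work out how many tasks there are, that is, how many spouts and how many bolts. The second is to take that result and write the assignment through the zookeeper API, so that the supervisors notice there is new work to claim.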

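Take the second job first, since its logic is simple. mk-assignments runs the following code to set the corresponding data in zookeeper so that the supervisors can detect the new work.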

(doseq [[topology-id assignment] new-assignments
        :let [existing-assignment (get existing-assignments topology-id)
              topology-details (.getById topologies topology-id)]]
  (if (= existing-assignment assignment)
    (log-debug "Assignment for " topology-id " hasn't changed")
    (do
      (log-message "Setting new assignment for topology id " topology-id ": " (pr-str assignment))
      (.set-assignment! storm-cluster-state topology-id assignment))))

The call graph is shown in the figure below.
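The write-only-if-changed logic of that loop can be sketched in Python, with a plain dict standing in for the assignments stored in zookeeper. The name apply_assignments and the shape of the assignment values are hypothetical.

```python
def apply_assignments(zk_store, new_assignments):
    """Write only changed assignments into zk_store; return the ids written."""
    written = []
    for topology_id, assignment in new_assignments.items():
        if zk_store.get(topology_id) == assignment:
            continue  # unchanged, so no write and no notification
        zk_store[topology_id] = assignment
        written.append(topology_id)
    return written


store = {"old-1-100": {"executors": 4}}
new = {"old-1-100": {"executors": 4}, "fresh-2-200": {"executors": 2}}
print(apply_assignments(store, new))
```

Skipping unchanged entries matters because each write is what the supervisors react to; rewriting identical data would wake them up for nothing.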

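The first job involves more computation and has to be explained step by step. Its central topic is how tasks are distributed, that is, scheduling.
You may have noticed the directory src/clj/backtype/storm/scheduler, or the scheduler-related options in storm.yaml. So when does the scheduler actually come into play? mk-assignments indirectly calls a function with the odd-looking name compute-new-topology->executor->node+port, and it is inside this function that the scheduler is invoked.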

_ (.schedule (:scheduler nimbus) topologies cluster)
new-scheduler-assignments (.getAssignments cluster)
;; add more information to convert SchedulerAssignment to Assignment
new-topology->executor->node+port (compute-topology->executor->node+port new-scheduler-assignments)]

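The assignments computed by schedule are stored in Cluster.java, which is why new-scheduler-assignments is read from it. With the assignments in hand, the corresponding node and port can be computed, which tells which worker on which supervisor should run the task.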
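As a rough illustration of what an executor to node+port mapping looks like, the Python sketch below spreads executors over the available slots in round-robin order. This is a deliberate simplification and not the algorithm of any scheduler shipped with Storm; the names schedule_round_robin, node-a and node-b are hypothetical.

```python
from itertools import cycle


def schedule_round_robin(executors, slots):
    """Map each executor (start_task, end_task) to a (node, port) slot."""
    if not slots:
        raise ValueError("no free slots available")
    return {executor: slot for executor, slot in zip(executors, cycle(slots))}


executors = [(1, 1), (2, 2), (3, 3)]
slots = [("node-a", 6700), ("node-b", 6700)]
for executor, (node, port) in schedule_round_robin(executors, slots).items():
    print(executor, "->", node, port)
```

Each (node, port) pair identifies one worker process on one supervisor, so the resulting map is exactly the information a supervisor needs to decide which workers to start.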

The directory structure storm creates on the zookeeper server is shown in the figure below.

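Given this directory structure, the questions to answer are these: which directories are written when a topology is submitted? A directory for the newly submitted topology is created under assignments; what is the data structure of what gets written there?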
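One possible shape for that data is sketched below in Python. The field names follow my reading of the Assignment record in Storm of that era (master code directory, node to host, executor to node+port, executor start times) and should be checked against the source; the path helper simply appends the storm-id to the assignments directory described above.

```python
from dataclasses import dataclass, field


@dataclass
class Assignment:
    # Directory on nimbus where the topology's jar and conf were stored.
    master_code_dir: str
    # Supervisor node id -> hostname.
    node_to_host: dict = field(default_factory=dict)
    # (start_task, end_task) -> (node id, port).
    executor_to_node_port: dict = field(default_factory=dict)
    # (start_task, end_task) -> unix seconds when the executor was assigned.
    executor_to_start_time_secs: dict = field(default_factory=dict)


def assignment_path(storm_id):
    """Location of one topology's assignment, relative to storm's zookeeper root."""
    return "/assignments/" + storm_id


a = Assignment(
    master_code_dir="/nimbus/stormdist/exclamationTopology-1-1367000000",
    node_to_host={"node-a": "host-a.example"},
    executor_to_node_port={(1, 1): ("node-a", 6700)},
    executor_to_start_time_secs={(1, 1): 1367000000},
)
print(assignment_path("exclamationTopology-1-1367000000"), a.executor_to_node_port)
```

The sketch only shows the shape of the record; how Storm serializes it before writing it to zookeeper is not covered here.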

supervisor

As soon as a new assignment is written to zookeeper, the callback mk-synchronize-supervisor in the supervisor is woken up and run.

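Its main logic is to read the complete set of new assignments from the zookeeper server and compare it with the assignments already running on this machine to find out which ones are new. The sync-processes function then brings up the workers that run the actual tasks.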

To explain clearly what the supervisor has to do during topology submission, the key is to understand the processing logic of the following two functions.

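mk-synchronize-supervisor: when the contents of the assignments subdirectory on the zookeeper server change, the supervisor receives a notification, and the callback that handles it is mk-synchronize-supervisor. mk-synchronize-supervisor reads every assignment, including those it is not responsible for, along with their details. It then works out which assignments are allocated to this supervisor and which of those are new. Once the new assignments are known, it downloads the jar file from the corresponding directory on nimbus. The user's own processing code is not uploaded to the zookeeper server; it lives on the disk of the machine where nimbus runs.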
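sync-processes: after mk-synchronize-supervisor has finished the assignment-related preprocessing, it hands the actual starting of workers to the event-manager. The event-manager runs in a separate thread, and one of the main functions executed in that thread is sync-processes. sync-processes kills the running workers that are no longer valid for the current assignment, then brings workers up again with the new run parameters.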

(defn mk-synchronize-supervisor [supervisor sync-processes event-manager processes-event-manager]
  (fn this []
    (let [conf (:conf supervisor)
          storm-cluster-state (:storm-cluster-state supervisor)
          ^ISupervisor isupervisor (:isupervisor supervisor)
          ^LocalState local-state (:local-state supervisor)
          sync-callback (fn [& ignored] (.add event-manager this))
          assignments-snapshot (assignments-snapshot storm-cluster-state sync-callback)
          storm-code-map (read-storm-code-locations assignments-snapshot)
          downloaded-storm-ids (set (read-downloaded-storm-ids conf))
          ;; read assignments from zookeeper
          all-assignment (read-assignments
                           assignments-snapshot
                           (:assignment-id supervisor))
          new-assignment (->> all-assignment
                              (filter-key #(.confirmAssigned isupervisor %)))
          ;; tasks that appear in the assignment
          assigned-storm-ids (assigned-storm-ids-from-port-assignments new-assignment)
          existing-assignment (.get local-state LS-LOCAL-ASSIGNMENTS)]
      (log-debug "Synchronizing supervisor")
      (log-debug "Storm code map: " storm-code-map)
      (log-debug "Downloaded storm ids: " downloaded-storm-ids)
      (log-debug "All assignment: " all-assignment)
      (log-debug "New assignment: " new-assignment)
      ;; download code first
      ;; This might take awhile
      ;;   - should this be done separately from usual monitoring?
      ;; should we only download when topology is assigned to this supervisor?
      (doseq [[storm-id master-code-dir] storm-code-map]
        (when (and (not (downloaded-storm-ids storm-id))
                   (assigned-storm-ids storm-id))
          (log-message "Downloading code for storm id "
                       storm-id
                       " from "
                       master-code-
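The supervisor-side decisions can be summarized in a Python sketch: filter the full set of assignments down to this supervisor, find the code that still has to be downloaded, and decide which worker ports to stop and start. The function plan_sync and its data layout are hypothetical, and the rule that a worker is restarted only when its port's assignment differs is an assumption of this sketch.

```python
def plan_sync(assignments, my_id, downloaded_ids, running_ports):
    """Return (storm ids to download, ports to kill, ports to launch).

    assignments:   storm_id -> {(node, port): executors}
    running_ports: port -> (storm_id, executors) for workers already running here
    """
    # Keep only the slots that belong to this supervisor.
    my_ports = {}
    for storm_id, slots in assignments.items():
        for (node, port), executors in slots.items():
            if node == my_id:
                my_ports[port] = (storm_id, executors)

    # Code is fetched only for topologies assigned here and not yet on disk.
    assigned_ids = {storm_id for storm_id, _ in my_ports.values()}
    to_download = assigned_ids - set(downloaded_ids)

    # A running worker is stale if its port is unassigned or assigned differently.
    to_kill = {port for port, current in running_ports.items()
               if my_ports.get(port) != current}
    # A port needs a worker if nothing matching is running on it.
    to_launch = {port for port in my_ports
                 if running_ports.get(port) != my_ports[port]}
    return sorted(to_download), sorted(to_kill), sorted(to_launch)


assignments = {
    "t1-1-100": {("sup-a", 6700): [(1, 2)], ("sup-b", 6700): [(3, 4)]},
    "t2-2-200": {("sup-a", 6701): [(1, 1)]},
}
running = {6700: ("t1-1-100", [(1, 2)]), 6702: ("old-0-50", [(1, 1)])}
print(plan_sync(assignments, "sup-a", {"t1-1-100"}, running))
```

In this example the supervisor sup-a reads assignments for both supervisors, ignores the slot on sup-b, downloads only the topology it does not yet have, and touches only the ports whose assignment changed.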
