Deploying a Spark Cluster with Docker to Train a CNN (with Python Examples)
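This blog is only the author's personal notebook, so some details are bound to be wrong.
Please bear with me; criticism and corrections are welcome.
The post may be thin, but it still took real effort to write.
If you repost it, please include a link to this article. Much appreciated!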
http://blog.csdn.net/cyh_24/article/details/49683221
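Our lab has 4 very powerful servers, each with 8 Tesla GPUs, yet our experiments normally use only one of those GPUs, which is a terrible waste!
So I wanted to use Spark to put all of these GPUs to work. I had heard that Docker is a great tool for deploying environments, so I decided to use Docker to install and deploy a Spark cluster for training CNNs. Setting up the environment is simple in principle, pure grunt work, but anyone who has done it knows how many pitfalls are hiding in there.
This post is a tearful summary of the pitfalls I fell into; I hope it gives you some lessons that help you avoid them.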
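Docker
What is Docker
Docker is an open-source project born in early 2013, originally a side project inside the company dotCloud. Intuitively, Docker can be thought of as a lightweight virtual machine. Docker differs from traditional virtualization in the following way:
Docker virtualizes at the operating-system level and directly reuses the host's operating system, whereas traditional approaches virtualize at the hardware level.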
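A diagram makes the difference between the two more intuitive:
[Figure: traditional virtual machines vs. Docker containers]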
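Why use Docker
As a newer form of virtualization, Docker has many advantages over traditional virtualization.
- Docker containers start in seconds, which is much faster than traditional virtual machines.
- Docker uses system resources very efficiently; a single host can run thousands of Docker containers at the same time.
- Apart from the application running inside it, a container consumes almost no extra system resources, so application performance is high and system overhead is kept small. (With traditional virtual machines, running 10 different applications means starting 10 VMs, whereas Docker only needs to start 10 isolated applications.)
- Create or configure once, then run correctly anywhere.
- Docker containers run on almost any platform, including physical machines, virtual machines, public clouds, private clouds, personal computers, and servers. This compatibility lets users migrate an application directly from one platform to another.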
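In short:
| Feature | Container | Virtual machine |
| --- | --- | --- |
| Startup | Seconds | Minutes |
| Disk usage | Typically MB | Typically GB |
| Performance | Near native | Slower than native |
| Instances per host | Thousands of containers on a single machine | Typically a few dozen |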
Spark
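Spark is a general-purpose parallel framework, similar to Hadoop MapReduce, open-sourced by the UC Berkeley AMP Lab.
Spark has the advantages of Hadoop MapReduce,
but unlike MapReduce, a job's intermediate output can be kept in memory, so it no longer has to be written to and read back from HDFS.
As a result, Spark is better suited to iterative MapReduce-style algorithms such as those used in data mining and machine learning.
I won't go into Spark's internals and applications here; I'll write a separate post about them some other day. For now, all you need to know is that it gives you a way to run your program in a distributed fashion.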
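Elephas (a deep learning library with Spark support)
First, Keras: it is a deep learning library built on Theano. Anyone who has used Theano probably knows that Theano programs are not especially easy to write. Keras is a high-level wrapper around Theano that makes the code much more convenient to write. Here is the code for a CNN model in Keras: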
```python
# Excerpt: the imports and hyperparameters (nb_filters, nb_conv, nb_pool, ...)
# are defined in the complete example later in this post.
model = Sequential()
model.add(Convolution2D(nb_filters, nb_conv, nb_conv,
                        border_mode='full',
                        input_shape=(1, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adadelta')

model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
          show_accuracy=True, verbose=1, validation_data=(X_test, Y_test))
```
Isn't that even simpler than a Caffe configuration file?
Elephas lets Keras programs run on Spark, with almost no changes to the Keras code.
Here is some elephas code (the model is the same one as above):
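The steps elephas needs can be sketched as follows, assembled from the calls used in the complete MNIST example later in this post (`SparkConf`, `SparkContext`, `to_simple_rdd`, `SparkModel`, `train`). The wrapper name `train_on_spark`, the app name, and the default arguments are hypothetical and chosen only for illustration. The signatures are those of the elephas and Keras versions this post was written against, so newer releases may differ.

```python
def train_on_spark(model, X_train, Y_train, master="local[4]",
                   nb_epoch=5, batch_size=128, num_workers=4):
    """Train a compiled Keras model on Spark via elephas (illustrative sketch).

    The imports live inside the function so that merely defining it does not
    require pyspark or elephas to be installed.
    """
    from pyspark import SparkConf, SparkContext
    from elephas.spark_model import SparkModel
    from elephas.utils.rdd_utils import to_simple_rdd

    # Create the Spark context; `master` uses Spark's usual master-URL format.
    conf = SparkConf().setAppName("keras-on-spark").setMaster(master)
    sc = SparkContext(conf=conf)

    # Build an RDD from the numpy features and labels.
    rdd = to_simple_rdd(sc, X_train, Y_train)

    # Wrap the Keras model and train it across the workers.
    spark_model = SparkModel(sc, model)
    spark_model.train(rdd, nb_epoch=nb_epoch, batch_size=batch_size,
                      verbose=0, validation_split=0.1,
                      num_workers=num_workers)
    return spark_model
```

The Keras model itself is untouched; only the training call changes, which is why so little of the original program has to be rewritten.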
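To run it on Spark, just execute the following command:
spark-submit --driver-memory 1G ./your_script.py
That covers the introductions. Next, I'll walk you step by step through using Docker to install and deploy a Spark GPU cluster for distributed CNN training.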
Installing Spark on Docker
Installing Docker online
Ubuntu 14.04 already ships a Docker package, so it can be installed directly.
```bash
$ sudo apt-get update
$ sudo apt-get install -y docker.io
$ sudo ln -sf /usr/bin/docker.io /usr/local/bin/docker
$ sudo sed -i '$acomplete -F _docker docker' /etc/bash_completion.d/docker.io
```
On an older version of Ubuntu, you need to update the kernel first.
```bash
$ sudo apt-get update
$ sudo apt-get install linux-image-generic-lts-raring linux-headers-generic-lts-raring
$ sudo reboot
```
Then repeat the steps above.
After installation, start the Docker service.
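Installing Docker offline
If your machine cannot reach the Internet (like my servers), you can still install Docker from an offline package.
You can download the offline package here: https://get.daocloud.io/docker/builds/Linux/x86_64/docker-latest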
Installing Spark on Docker
SequenceIQ provides a Docker image with Spark already installed; you only need to pull it from Docker Hub.
docker pull sequenceiq/spark:1.5.1
Run it with the following command:
sudo docker run -it sequenceiq/spark:1.5.1 bash
Now test that Spark works.
First use ifconfig to get the IP address (mine is 172.17.0.109), then:
bash-4.1# cd /usr/local/spark
bash-4.1# cp conf/spark-env.sh.template conf/spark-env.sh
bash-4.1# vi conf/spark-env.sh
Add these two lines:
```bash
export SPARK_LOCAL_IP=172.17.0.109
export SPARK_MASTER_IP=172.17.0.109
```
Then start the master and the slave:
bash-4.1# ./sbin/start-master.sh
bash-4.1# ./sbin/start-slave.sh spark://172.17.0.109:7077
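The two `spark-env.sh` settings and the master URL all derive from the same container IP, so they can be generated rather than typed by hand. This is a minimal sketch, assuming the standalone master listens on port 7077 as in the command above; the helper names `spark_env_lines` and `master_url` are hypothetical.

```python
def spark_env_lines(ip):
    """Return the two export lines to append to conf/spark-env.sh."""
    return [
        "export SPARK_LOCAL_IP={0}".format(ip),
        "export SPARK_MASTER_IP={0}".format(ip),
    ]


def master_url(ip, port=7077):
    """Build the spark://host:port URL passed to start-slave.sh."""
    return "spark://{0}:{1}".format(ip, port)


if __name__ == "__main__":
    ip = "172.17.0.109"  # the address reported by ifconfig in this post
    print("\n".join(spark_env_lines(ip)))
    print(master_url(ip))
```

Note the `spark://` scheme: the slave will not find the master if the two slashes are missing.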
Open (your-ip:8080) in a browser to see the status of each Spark node.
Submit an application with spark-submit to try it out:
bash-4.1# ./bin/spark-submit examples/src/main/python/pi.py
You should get a result like this:
15/11/05 02:11:23 INFO scheduler.DAGScheduler: Job 0 finished: reduce at /usr/local/spark-1.5.1-bin-hadoop2.6/examples/src/main/python/pi.py:39, took 1.095643 s
Pi is roughly 3.148900
Congratulations, you have just run a Spark application!
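For intuition about what that job computes: Spark's `pi.py` example estimates π by Monte Carlo sampling, throwing random points at a square and counting how many land inside the inscribed circle. The plain-Python sketch below mirrors that map-then-reduce shape without Spark. The function names, sample count, and seed are illustrative choices, not taken from the example script.

```python
import random
from functools import reduce
from operator import add


def inside_unit_circle(_, rng):
    """Map step: 1 if a random point in [-1, 1] x [-1, 1] is in the unit circle."""
    x = rng.random() * 2 - 1
    y = rng.random() * 2 - 1
    return 1 if x * x + y * y <= 1 else 0


def estimate_pi(n, seed=0):
    """Reduce step: sum the hits, then scale by the square-to-circle area ratio."""
    rng = random.Random(seed)
    hits = reduce(add, (inside_unit_circle(i, rng) for i in range(n)), 0)
    return 4.0 * hits / n


if __name__ == "__main__":
    print("Pi is roughly %f" % estimate_pi(200000))
```

On Spark, the map step runs in parallel over partitions of the sample range and the reduce step combines the per-partition counts.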
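Does everything seem to be going smoothly so far? A spoiler: the hard part is only beginning. Fortunately I have already stepped in every pit, so although it is still a bit of a hassle, you at least get to skip some of the deep ones...
Installing the libraries
elephas requires Python 2.7, but the Python bundled with the Docker image we just installed is version 2.6, so let's upgrade Python first.
Upgrading Python on CentOS
Friendly tip: be sure to install openssl and openssl-devel before compiling Python. Don't ask me how I know.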
```bash
yum install -y zlib-devel bzip2-devel openssl openssl-devel xz-libs wget
```
Installation steps:
```bash
wget http://www.python.org/ftp/python/2.7.8/Python-2.7.8.tar.xz
xz -d Python-2.7.8.tar.xz
tar -xvf Python-2.7.8.tar

# Enter the directory:
cd Python-2.7.8
# Run configure (be sure to add -fPIC; don't ask me how I know):
./configure --prefix=/usr/local CFLAGS=-fPIC
# Build and install:
make
make altinstall
```
Set PATH
```bash
mv /usr/bin/python /usr/bin/python2.6
export PATH="/usr/local/bin:$PATH"
# or
# (make altinstall installs only python2.7, so this symlink is what makes
# plain `python` resolve to 2.7)
ln -s /usr/local/bin/python2.7 /usr/bin/python
# Check the Python version:
python -V
```
Install setuptools
```bash
# Download the package
wget --no-check-certificate https://pypi.python.org/packages/source/s/setuptools/setuptools-1.4.2.tar.gz
# Unpack it:
tar -xvf setuptools-1.4.2.tar.gz
cd setuptools-1.4.2
# Install setuptools with Python 2.7.8
python setup.py install
```
Install pip
```bash
curl https://raw.githubusercontent.com/pypa/pip/master/contrib/get-pip.py | python -
```
Fix yum
```bash
vi /usr/bin/yum
# Change the interpreter that yum uses: edit the first line from
#   #!/usr/bin/python
# to
#   #!/usr/bin/python2.6
# After that, yum works again.
```
Installing Theano, Keras, and elephas
```bash
pip install --upgrade --no-deps git+https://github.com/Theano/Theano.git
pip install keras
pip install elephas
```
Skills unlocked
To summarize briefly, this is the work we have completed so far:
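The setup can be sanity-checked from inside the container with a short script. This is a minimal sketch: it reports the interpreter version and tries to import each library, recording a failure instead of crashing when one is missing. The function name `check_environment` is hypothetical, and the module list simply mirrors the packages installed in this post.

```python
import importlib
import sys


def check_environment(modules=("ssl", "theano", "keras", "elephas", "pyspark")):
    """Return {module name: True/False} depending on whether it imports cleanly."""
    status = {}
    for name in modules:
        try:
            importlib.import_module(name)
            status[name] = True
        except Exception:  # a broken install can raise more than ImportError
            status[name] = False
    return status


if __name__ == "__main__":
    print("Python %s" % sys.version.split()[0])
    # elephas needs Python 2.7; the image ships 2.6 before the upgrade.
    print("Python >= 2.7: %s" % (sys.version_info >= (2, 7)))
    for name, ok in sorted(check_environment().items()):
        print("%-10s %s" % (name, "ok" if ok else "MISSING"))
```

A missing `ssl` module usually means Python was compiled before openssl-devel was installed. `pyspark` may only be importable when the script is launched through spark-submit.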
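And here is what we can already do:
√ If your machine has multiple CPUs (say 24):
You can start a single Docker container and simply use Spark together with elephas to train the CNN in parallel across all 24 CPUs.
√ If your machine has multiple GPUs (say 4):
You can start 4 Docker containers and edit ~/.theanorc inside each one to select a specific GPU, so that training runs in parallel on the 4 GPUs. (You need to install CUDA yourself.)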
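For the multi-GPU case, each container needs its own `~/.theanorc` pointing at a different device. The sketch below generates such a file. It assumes the old Theano CUDA backend, where devices are named `gpu0`, `gpu1`, and so on, and it assumes `floatX = float32` is wanted for GPU work. The helper name `theanorc_for_gpu` is hypothetical.

```python
def theanorc_for_gpu(index):
    """Return the text of a ~/.theanorc that pins Theano to one GPU."""
    if index < 0:
        raise ValueError("GPU index must be non-negative")
    lines = [
        "[global]",
        "device = gpu{0}".format(index),
        "floatX = float32",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # One config per container, e.g. four containers for four GPUs.
    for gpu in range(4):
        print("# container {0}".format(gpu))
        print(theanorc_for_gpu(gpu))
```

Writing the returned text to `~/.theanorc` in container `i` makes every Theano process started there use GPU `i`.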
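Example: training a CNN in parallel on a single machine with multiple CPUs
Let's train the simplest possible network on MNIST handwritten digit recognition. Here is code that can be run directly (download mnist.pkl.gz beforehand):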
```python
from __future__ import absolute_import
from __future__ import print_function
import numpy as np

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.optimizers import SGD, Adam, RMSprop
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.utils import np_utils

from elephas.spark_model import SparkModel
from elephas.utils.rdd_utils import to_simple_rdd

from pyspark import SparkContext, SparkConf

import gzip
import cPickle

APP_NAME = "mnist"
MASTER_IP = 'local[24]'

# Define basic parameters
batch_size = 128
nb_classes = 10
nb_epoch = 5

# input image dimensions
img_rows, img_cols = 28, 28
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
nb_pool = 2
# convolution kernel size
nb_conv = 3

# Load data
f = gzip.open("./mnist.pkl.gz", "rb")
dd = cPickle.load(f)
(X_train, y_train), (X_test, y_test) = dd

X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)

X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train /= 255
X_test /= 255

print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

# Convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

model = Sequential()
model.add(Convolution2D(nb_filters, nb_conv, nb_conv,
                        border_mode='full',
                        input_shape=(1, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adadelta')

## spark
conf = SparkConf().setAppName(APP_NAME).setMaster(MASTER_IP)
sc = SparkContext(conf=conf)

# Build RDD from numpy features and labels
rdd = to_simple_rdd(sc, X_train, Y_train)

# Initialize SparkModel from Keras model and Spark context
spark_model = SparkModel(sc, model)

# Train Spark model
spark_model.train(rdd, nb_epoch=nb_epoch, batch_size=batch_size, verbose=0,
                  validation_split=0.1, num_workers=24)

# Evaluate Spark model by evaluating the underlying model
score = spark_model.get_network().evaluate(X_test, Y_test, show_accuracy=True, verbose=2)
print('Test accuracy:', score[1])
```
<ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li 
style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li><li style="box-sizing: border-box; padding: 0px 5px;">31</li><li style="box-sizing: border-box; padding: 0px 5px;">32</li><li style="box-sizing: border-box; padding: 0px 5px;">33</li><li style="box-sizing: border-box; padding: 0px 5px;">34</li><li style="box-sizing: border-box; padding: 0px 5px;">35</li><li style="box-sizing: border-box; padding: 0px 5px;">36</li><li style="box-sizing: border-box; padding: 0px 5px;">37</li><li style="box-sizing: border-box; padding: 0px 5px;">38</li><li style="box-sizing: border-box; padding: 0px 5px;">39</li><li style="box-sizing: border-box; padding: 0px 5px;">40</li><li style="box-sizing: border-box; padding: 0px 5px;">41</li><li style="box-sizing: border-box; padding: 0px 5px;">42</li><li style="box-sizing: border-box; padding: 0px 5px;">43</li><li style="box-sizing: border-box; padding: 0px 5px;">44</li><li style="box-sizing: border-box; padding: 0px 5px;">45</li><li style="box-sizing: border-box; padding: 0px 5px;">46</li><li style="box-sizing: border-box; padding: 0px 5px;">47</li><li style="box-sizing: border-box; padding: 0px 5px;">48</li><li style="box-sizing: border-box; padding: 0px 5px;">49</li><li style="box-sizing: border-box; padding: 0px 5px;">50</li><li style="box-sizing: border-box; padding: 0px 5px;">51</li><li style="box-sizing: border-box; padding: 0px 5px;">52</li><li style="box-sizing: border-box; padding: 0px 5px;">53</li><li style="box-sizing: border-box; padding: 0px 5px;">54</li><li style="box-sizing: border-box; padding: 0px 5px;">55</li><li style="box-sizing: border-box; padding: 0px 5px;">56</li><li style="box-sizing: border-box; padding: 0px 5px;">57</li><li style="box-sizing: border-box; padding: 0px 5px;">58</li><li style="box-sizing: border-box; padding: 0px 5px;">59</li><li style="box-sizing: border-box; padding: 0px 
5px;">60</li><li style="box-sizing: border-box; padding: 0px 5px;">61</li><li style="box-sizing: border-box; padding: 0px 5px;">62</li><li style="box-sizing: border-box; padding: 0px 5px;">63</li><li style="box-sizing: border-box; padding: 0px 5px;">64</li><li style="box-sizing: border-box; padding: 0px 5px;">65</li><li style="box-sizing: border-box; padding: 0px 5px;">66</li><li style="box-sizing: border-box; padding: 0px 5px;">67</li><li style="box-sizing: border-box; padding: 0px 5px;">68</li><li style="box-sizing: border-box; padding: 0px 5px;">69</li><li style="box-sizing: border-box; padding: 0px 5px;">70</li><li style="box-sizing: border-box; padding: 0px 5px;">71</li><li style="box-sizing: border-box; padding: 0px 5px;">72</li><li style="box-sizing: border-box; padding: 0px 5px;">73</li><li style="box-sizing: border-box; padding: 0px 5px;">74</li><li style="box-sizing: border-box; padding: 0px 5px;">75</li><li style="box-sizing: border-box; padding: 0px 5px;">76</li><li style="box-sizing: border-box; padding: 0px 5px;">77</li><li style="box-sizing: border-box; padding: 0px 5px;">78</li><li style="box-sizing: border-box; padding: 0px 5px;">79</li><li style="box-sizing: border-box; padding: 0px 5px;">80</li><li style="box-sizing: border-box; padding: 0px 5px;">81</li><li style="box-sizing: border-box; padding: 0px 5px;">82</li><li style="box-sizing: border-box; padding: 0px 5px;">83</li><li style="box-sizing: border-box; padding: 0px 5px;">84</li><li style="box-sizing: border-box; padding: 0px 5px;">85</li><li style="box-sizing: border-box; padding: 0px 5px;">86</li><li style="box-sizing: border-box; padding: 0px 5px;">87</li><li style="box-sizing: border-box; padding: 0px 5px;">88</li><li style="box-sizing: border-box; padding: 0px 5px;">89</li><li style="box-sizing: border-box; padding: 0px 5px;">90</li><li style="box-sizing: border-box; padding: 0px 5px;">91</li><li style="box-sizing: border-box; padding: 0px 5px;">92</li></ul>執行以下命令即可運行:
/usr/local/spark/bin/spark-submit mnist_cnn_spark.py
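The script passes `num_workers=24` to `spark_model.train`, so the training pairs built by `to_simple_rdd` end up spread over 24 workers. To get a feel for what that split looks like, here is a minimal pure-Python sketch. It assumes plain contiguous chunking, which may well differ from how Spark actually partitions an RDD, and `pair_and_partition` is a hypothetical helper made up for illustration, not part of Elephas or Spark.

```python
def pair_and_partition(features, labels, num_workers):
    """Pair each sample with its label, then split the pairs into
    num_workers contiguous chunks of near-equal size.

    Hypothetical helper for illustration only (not an Elephas/Spark API).
    """
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    pairs = list(zip(features, labels))
    base, extra = divmod(len(pairs), num_workers)
    partitions = []
    start = 0
    for i in range(num_workers):
        # The first `extra` workers each take one leftover sample.
        size = base + (1 if i < extra else 0)
        partitions.append(pairs[start:start + size])
        start += size
    return partitions


# Toy data: 10 "samples" split across 4 hypothetical workers.
toy_features = [[float(i)] for i in range(10)]
toy_labels = [i % 2 for i in range(10)]
parts = pair_and_partition(toy_features, toy_labels, num_workers=4)
print([len(p) for p in parts])  # prints [3, 3, 2, 2]
```

Each worker only ever sees its own chunk, which is why adding workers shrinks the per-worker workload but does not give a perfectly linear speedup once communication is counted.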
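With 24 slave workers running 5 epochs in parallel, the accuracy and running time were:
Test accuracy: 95.68%
took: 1135s
Without Spark, a rough measurement showed that a single epoch takes about 1800s (so 5 epochs would take about 9000s), which means the Spark run is still roughly 7 to 8 times faster.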
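The arithmetic behind that estimate can be checked in a few lines. This is only a back-of-the-envelope sketch: it assumes the single-machine run would cost the same 1800s for every epoch, and the variable names are made up for illustration. The per-worker efficiency at the end is my own derived figure, not something measured in the experiment.

```python
epochs = 5
seconds_per_epoch_single = 1800  # rough single-machine measurement
spark_total_seconds = 1135       # reported total for 5 epochs on Spark
num_workers = 24

# Assumption: every single-machine epoch costs the same 1800s.
single_total_seconds = epochs * seconds_per_epoch_single
speedup = float(single_total_seconds) / spark_total_seconds
efficiency = speedup / num_workers

print("estimated single-machine time: %d s" % single_total_seconds)
print("estimated speedup: %.1fx" % speedup)
print("per-worker efficiency: %.0f%%" % (efficiency * 100))
```

With these numbers the speedup comes out just under 8x, i.e. each of the 24 workers contributes about a third of an ideal linear speedup.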
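Example: Parallel CNN Training on a Multi-GPU Cluster
I have stepped into far too many pitfalls over the past few days and am honestly worn out!
As for configuring a single-machine multi-GPU setup and a multi-machine multi-GPU cluster, please bear with me for a few more days. Once I have recovered, I will go right back to stepping into pitfalls without hesitation...
For the Chiyan Army, I will be back!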
from: http://blog.csdn.net/cyh_24/article/details/49683221