Using GCP AI Platform Notebooks as Reproducible Data Science Environments
By: Edward Krueger and Douglas Franklin.
In this article, we will cover how to set up a cloud computing instance to run Python with or without Jupyter Notebook. Then we show how to connect that instance to GitHub for a smooth cloud workflow.
We utilize cloud computing instances to get flexible Python and Jupyter environments while maintaining the reproducibility of enterprise data science platforms.
These AI Platform notebooks come configured with many data science and analytics packages, including NumPy, Pandas, Scikit-learn and TensorFlow. Typically, we would discourage the use of bloated virtual machines. However, package bloat on our analytics machine isn’t as much of a problem because we only save the result (model, data, report) for later use. Because we need only this result and the few packages required to run our model, we can disregard the numerous other packages on the VM.
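As a quick sanity check, a minimal sketch along the following lines reports which of the packages mentioned above can be imported on a given instance. The helper name check_packages is our own invention, and it assumes python3 is on the PATH. Note that Scikit-learn is imported under the name sklearn.

```shell
# Report whether each named Python package can be imported.
# check_packages is a hypothetical helper; it assumes python3 is on the PATH.
check_packages() {
  for pkg in "$@"; do
    if python3 -c "import $pkg" >/dev/null 2>&1; then
      echo "$pkg: available"
    else
      echo "$pkg: missing"
    fi
  done
}

# Example:
#   check_packages numpy pandas sklearn tensorflow
```

The check only tells you that an import succeeds; it says nothing about package versions.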
For example, in this Medium article, we push an NLP model to the cloud without having to worry about dependencies.
Note that AI Platform notebooks have all of the client packages for GCP services installed and are already authenticated, which allows easy access to anything within the same GCP project. Additionally, this platform gives us access not just to Jupyter Notebooks, but also to a Python console and a CLI where we can run Bash commands.
Getting a GCP account
Google’s AI Platform Notebooks offer a JupyterLab and Python environment for data scientists and machine learning developers to experiment, develop, and deploy models into production. Users can create instances running JupyterLab that come pre-installed with common packages.
Before we can set up an AI Platform Notebook, we will have to set up an account and billing. Don’t worry: new users get $300 in free credits!
Visit GCP AI Platform and click ‘go to console.’
Be sure to click ‘Enable API’ below to access notebooks.
Figure: Enable API
Once we have billing set up, we can start a project.
Starting up your first GCP AI Platform Notebook Instance
Now we need to select the hardware we want our virtual machine to run on. Be sure to set up the cheapest machine possible if you are testing this out!
Once we have the API enabled, the popup selections will change to those seen below. Click ‘Go to instances page’ to get started.
Figure: Click GO TO INSTANCES PAGE
The instances page might ask you to select ‘Enable API’ another time; be sure to do so. Then click on the ‘New Instance’ button and select ‘Python 2 and 3.’
Figure: Notebook Instances
This will open up an options menu where you’ll input the region you’d like to use. Note that different regions can have different pricing. Once you have a region selected, click ‘Customize’ and select the machine with the least RAM to get the lowest cost. In our case, it is the ‘n1-standard-1’ VM with 3.75GB of RAM.
This instance will only generate fees when it is running and can be easily paused at any time! If needed, you can swap out hardware with the dropdown menus seen below while the instance is paused.
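The article does the stop-and-swap step in the console. As a rough command-line equivalent, the sketch below assumes the notebook instance is visible as an ordinary Compute Engine VM and that the gcloud CLI is installed and authenticated. The helper name resize_instance and its three arguments (instance name, zone, machine type) are placeholders of our own.

```shell
# Stop a VM, change its machine type, and start it again.
# resize_instance is a hypothetical helper; the instance name, zone and
# machine type are supplied by the caller.
resize_instance() {
  instance="$1"
  zone="$2"
  machine_type="$3"
  # The machine type can only be changed while the instance is stopped.
  gcloud compute instances stop "$instance" --zone="$zone" &&
  gcloud compute instances set-machine-type "$instance" \
    --zone="$zone" --machine-type="$machine_type" &&
  gcloud compute instances start "$instance" --zone="$zone"
}

# Example (names are placeholders):
#   resize_instance my-notebook us-central1-a n1-standard-1
```

Running only the stop command is enough to pause billing for the machine's compute time.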
Figure: Selecting a low-cost machine
Now we can use SSH to connect our VM to GitHub to allow us to push and pull to our repositories with ease.
Setting Up SSH
Be aware you will only have to do this once per instance.
Connecting to GitHub with ssh
1. Generate an ssh key by running ssh-keygen and accepting the defaults: leave each prompt blank and press the Enter key. This command generates a private key at ~/.ssh/id_rsa and a public key at ~/.ssh/id_rsa.pub; the public key is what you’ll enter into GitHub.
2. Copy your public key to your clipboard. One way to do this is to run cat ~/.ssh/id_rsa.pub, which prints the public key text to your console, and then copy it with the mouse and keyboard.
Figure: using cat to get our key
3. Go to github.com and sign in.
4. Click your profile image in the top right and then click “Settings.”
5. On the left-hand side, click “SSH and GPG keys.”
6. On the top right, click “New SSH key.”
7. Set the “Title” to whatever you like; a descriptive name will help you identify which computer this key authorizes. Paste the copied key into the “Key” field and press “Add SSH key.”
8. Go back to your terminal on the VM and run eval "$(ssh-agent -s)" to start your ssh authentication agent.
Figure: Steps 8 and 9 adding our ssh-key
9. Run ssh-add to add your private key to the agent so that it can authenticate you against the public key you added to GitHub.
10. Set your git configuration so that GitHub knows who you are by running git config --global user.email you@email.com and git config --global user.name username, where the email and username are those attached to your GitHub account.
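The terminal-side steps above (1, 2, 8, 9 and 10) can be sketched as three small shell functions. This is only an illustration: the function names are hypothetical, the key is created without a passphrase to mirror "accepting the defaults", and the RSA key type is requested explicitly so the file names match the ones used in the steps.

```shell
# Hypothetical helpers covering the terminal-side steps of the list above.

# Steps 1 and 2: create ~/.ssh/id_rsa if it does not exist yet, then print
# the public key so it can be pasted into GitHub (steps 3 to 7).
generate_github_key() {
  key="$HOME/.ssh/id_rsa"
  mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
  [ -f "$key" ] || ssh-keygen -q -t rsa -f "$key" -N "" || return 1
  cat "$key.pub"
}

# Steps 8 and 9: start an agent in the current shell and give it the key.
load_key_into_agent() {
  eval "$(ssh-agent -s)" >/dev/null
  ssh-add "$HOME/.ssh/id_rsa"
}

# Step 10: record the identity git attaches to your commits.
configure_git_identity() {
  git config --global user.email "$1"
  git config --global user.name "$2"
}

# Example (email and username are placeholders):
#   generate_github_key
#   load_key_into_agent
#   configure_git_identity you@email.com username
```

Once the public key has been added on GitHub, running ssh -T git@github.com is one way to confirm that the connection works.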
Now you can git clone any repository you have access to right onto the VM, make changes to the code, and push them back to the repository!
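A minimal sketch of that clone, edit, and push loop might look like the following. The helper name commit_and_push is our own, and the repository URL in the example is a placeholder; the function assumes it is run from inside a cloned repository whose remote is named origin.

```shell
# Stage every change in the current repository, commit it, and push the
# current branch to the remote named origin.
# commit_and_push is a hypothetical helper name.
commit_and_push() {
  message="$1"
  git add -A &&
  git commit -m "$message" &&
  git push origin HEAD
}

# Example (the repository URL is a placeholder):
#   git clone git@github.com:your-username/your-repo.git
#   cd your-repo
#   ...edit files...
#   commit_and_push "Describe the change"
```

Cloning with the git@github.com: form of the URL is what makes git use the SSH key set up earlier.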
Conclusion
We’ve discussed how to set up a cloud computing instance to run Python, Bash, and Jupyter Notebooks and how to connect that instance to GitHub for an easy and secure cloud workflow.
This workflow is great because it is so reproducible! Teams using VMs like this will encounter fewer of the ‘it works on my machine’ bugs. Using ssh to connect the cloud VM to our remote repositories provides a secure connection that protects your data. Additionally, if you want to run code on expensive hardware, you don’t have to buy that hardware! Instead, run what you need and pause your instance to save costs.
We hope this guide has been helpful and that your coding skills are leveling up with us!
Translated from: https://towardsdatascience.com/using-gcp-ai-platform-notebooks-as-reproducible-data-science-environments-964cba32737