How I’d start learning machine learning again (3 years in)
I’m underground, back where it all started. Sitting at the hidden cafe where I first met Mike. I’d been studying in my bedroom for the past 9 months and decided to step out of the cave. Half of me was concerned about having to pay $19 for breakfast (unless it’s Christmas, driving Uber on the weekends isn’t very lucrative), the other half about whether any of this study I’d been doing online meant anything.
In 2017, I left Apple, tried to build a web startup, failed, discovered machine learning, fell in love, signed up to a deep learning course with zero coding experience, emailed the support team asking what the refund policy was, didn’t get a refund, spent the next 3 months handing in the assignments four to six days late, somehow passed, decided to keep going and created my own AI Masters Degree.
9 months into my AI Masters Degree, I met Mike, we had coffee, I told him my grand plan: use AI to help the world move more and eat better. He told me I should meet Cam, I met Cam, I told Cam I’m going to the US, he said why not stay here, come in on Thursday, okay, went in on Thursday for a one-day-a-week internship and two weeks later was offered a role as a junior machine learning engineer at Max Kelsen.
14 months into my machine learning engineer role, I decided to leave and try it on my own. I wrote an article about what I’d learned, Andrei found it, emailed me asking if I wanted to build a beginner-friendly machine learning course, I said yes, we built the course and 6 months in we’ve had the privilege of teaching 27,177 students in 150+ countries.
Add it up and you get about 3 years. About the time my original undergraduate degree was supposed to take (due to several failures, I took 5 years to do a 3-year degree).
So as it stands, I feel like I’ve done a machine learning undergraduate degree.
Someone looking from the outside in might think I know a fair bit about machine learning, and I do. I know a lot more than when I started, but I also know how much I don’t know. That’s the thing with knowledge.
1 year in: The honeymoon phase, also known as the noob gains period. You’re much better than a beginner, perhaps even a little too confident (though this isn’t a bad thing).
2 years in: The “oh, maybe I’m not as good as I thought” phase. Your beginner skills are starting to mature, but now you realise getting better is going to take some effort.
3 years in: The “wow, there’s still so much to learn” phase. You’re not a beginner anymore, but now you know enough to realise how much you don’t know (I’m here).
But enough about me. That’s my story. Yours might be similar or you might be starting out today.
If you’re getting started, this article is for you. If you’re a veteran, you can offer your advice or critique my ideas.
Let’s get into it, shall we?
If you came for a list of courses, you’re in the wrong place
I’ve done a bunch of online courses. I’ve even created my own.
And guess what?
They’re all remixes of the same thing.
Instead of worrying about which course is better than another, find a teacher who excites you.
Learning anything is 10% material and 90% being excited to learn.
How many of your school teachers do you remember?
My guess is, regardless of what they taught, you remember the teacher themselves more than the material. And if you remember the material, it’s because they sparked a fire in you enough for it to be burned into your memory.
What then?
Dabble in a few resources; you’re smart enough to find the best ones. See which ones spark your interest enough to keep going and stick with those.
It isn’t an unpleasant task to learn a skill if the teacher gets you interested in it.
The curse of the engineer (and technology nerd)
Show me an engineer who proclaims her use case of the latest and greatest tools and I’ll show you an amateur.
I’ll confess. I’m guilty. Every new shiny framework which comes out, every new state of the art model, I’m onto it.
Often I’ll catch myself trying to invent a problem to use whatever new tool is on the market. A classic cart before the horse scenario.
A chef’s entire work centres around two tools, the controlled use of fire and a knife.
This is embodied in the best programming advice I’ve ever received: learn the language, not the framework.
If you’re just starting out and can’t count the number of tools you’re learning on one hand, you’re trying to use too many.
“I want to build things”
If you want to build things, such as web applications or mobile applications, learn software engineering before (or at least alongside) machine learning.
Too many models live and die within Jupyter Notebooks.
Why?
Because machine learning is an infrastructure problem (infrastructure means all the things which go around your model so others can use it; the hot new term you’ll want to look up is MLOps).
And deployment, as in getting your models into the hands of others, is hard.
But that’s exactly why I should’ve spent more time there.
If I was starting again today, I’d find a way to deploy every semi-decent model I build (with exceptions for the dozens of experiments leading to the one worth sharing).
How?
Don’t be afraid to make something simple. A basic front-end which someone can interact with is far more interesting than a notebook in a GitHub repo.
No really, how?
Train a model, build a front-end application around it with Streamlit, and get the application working locally (on your computer). Once it’s working, wrap the application with Docker, then deploy the Docker container to Heroku or another cloud provider.
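To make the first two steps concrete, here is a minimal sketch of a Streamlit front-end wrapped around a model. It rests on a few assumptions: the file is saved as `app.py`, Streamlit is installed, and `predict_sentiment` is a hypothetical keyword rule standing in for whatever trained model you actually have.

```python
# app.py: a minimal sketch, not a production app.
# Run locally with: streamlit run app.py
try:
    import streamlit as st  # third-party; assumed installed with `pip install streamlit`
except ImportError:
    st = None  # the prediction logic below still works without Streamlit


def predict_sentiment(text: str) -> str:
    """Hypothetical stand-in for a trained model's predict call."""
    positive = {"good", "great", "love", "excellent"}
    negative = {"bad", "awful", "hate", "terrible"}
    words = text.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if st is not None:
    # The whole front-end: a title, a text box and the model's answer.
    st.title("Sentiment demo")
    user_text = st.text_input("Type a sentence", "I love this")
    if user_text:
        st.write("Prediction:", predict_sentiment(user_text))
```

Keeping the prediction in a plain function, separate from the interface code, means the same function can be tested on its own and reused once the app is wrapped in a container.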
Sure, we’re going against the rule here of using a few too many tools, but pulling this off a few times will get you thinking about what it’s like to get your machine learning model into people’s hands.
Deploying your models will raise the questions you don’t get to ask when your machine learning model lives its life in a Jupyter Notebook, like:
- How long does inference take (the time for your model to make a prediction)?
- How do people interact with it (maybe the data they send to your image classifier is different to your test set, data in the real world changes often)?
- Would someone actually use this?
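The first of these questions can be answered with a few lines of timing code. The sketch below uses only the standard library; `fake_predict` and the sample input are hypothetical placeholders, so swap in your own model call and a realistic input.

```python
import statistics
import time


def fake_predict(pixels):
    """Hypothetical stand-in for model.predict(); replace with your model."""
    return sum(pixels) / len(pixels) > 0.5


def time_inference(predict_fn, sample, n_runs=100):
    """Return (median, worst) latency in milliseconds over n_runs calls."""
    latencies_ms = []
    for _ in range(n_runs):
        start = time.perf_counter()
        predict_fn(sample)
        latencies_ms.append((time.perf_counter() - start) * 1000)
    return statistics.median(latencies_ms), max(latencies_ms)


sample = [0.2] * 10_000  # made-up input roughly the size of a small image
median_ms, worst_ms = time_inference(fake_predict, sample)
print(f"median: {median_ms:.3f} ms, worst: {worst_ms:.3f} ms")
```

Reporting the worst case next to the median is a deliberate choice: the slow calls are the ones a person using your app will notice.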
“I want to do research”
Building things becomes research. You’ll want your models to work faster, better. To achieve this, you’ll need to research alternative ways of doing things. You’ll find yourself reading research papers, replicating them and improving upon them.
I’m often asked, “how much math should I know before I start machine learning?”
To which I usually reply, “how much walking should I know before I go for a run?”
I don’t really say this. I’m usually nicer and say something like, “Can you solve the problem you’re currently working on?” If so, you know enough. If not, learn more.
As a side note, I’ve just ordered the Mathematics for Machine Learning book. I’m going to be spending the next month or two reading it cover to cover. Having read the free text online, I can say it’s more than enough to cover the fundamentals.
Skill before certificates
I’ve got online course certificates coming out of my ass.
I got caught thinking more certificates equals more skills.
I’d burn through lectures at 1.75x speed just to get to the end, pass the automated exam and share my progress online.
I optimised for completing courses instead of creating skills. Because watching someone else explain it was easier than learning how to do it myself.
Idiot.
Here’s the thing. Everything I learned for an exam, I’ve forgotten. Everything I learned through experimenting, I remember.
Now, this isn’t to say online certifications and courses aren’t worth your time. Courses help to build foundational skills. But working on your own projects helps to build specific knowledge (knowledge which can’t be taught).
- Instead of stacking certificates, stack skills (and prove your skill through sharing your work, more on this later).
- Instead of doing more courses, repeat the ones you’ve already done.
- Instead of looking for the newest tools, improve your use of the ones which have been around the longest.
- Instead of looking for more resources, reread the best books on your shelf.
Learning (anything) isn’t linear. It’s better to read the same book twice (as long as it’s got some substance) than to add more to the pile.
I often tell my students that, despite the immense pride I feel when I see someone share a graduation certificate, I’d prefer them not to finish my course and instead take the parts they need and use them for their own work.
Before you add something, ask yourself, “have I sucked the juice out of what I’ve already covered?”
How I’d start again
First of all, more important than any resource is to get rid of the “I can’t learn it” mentality. That’s bullsh*t. You’ve got the internet. You can learn anything.
The internet has given rise to a new kind of hunter-gatherer. And if you decide to take on the challenge you can gather resources to create your own path.
The following path isn’t set either. It’s designed to be a compass rather than a map. And guess what? It’s all accessible online.
Let’s lay some foundations.
Figure: Excerpt from the 2020 Machine Learning Roadmap.
Note: This curriculum is heavily focused on code-first, Python code in particular. It also neglects mobile or embedded device development. However, it contains more than enough resources to get an outstanding grounding in the field.
The beginner path (6–12+ months)
If I was starting again I’d learn far more software engineering practices intertwined with machine learning.
My main goal would be to build more things people could interact with.
The machine learning specific parts would be:
- Machine learning concepts: understand what kind of problems machine learning can and should be used for. Elements of AI is great for this.
- Python: the language itself, along with the machine learning specific frameworks NumPy, pandas, matplotlib and Scikit-Learn. Check out pythonlikeyoumeanit or the official documentation for each of these.
- Machine learning tools: the main one being Jupyter Notebooks.
- Machine learning math: linear algebra from 3Blue1Brown or Khan Academy, matrix manipulation and calculus from Khan Academy, or just read the Mathematics for Machine Learning book.
Alongside these, I’d go through:
- freeCodeCamp: for web development skills.
- CS50 + CS50 artificial intelligence: for foundational computer science and artificial intelligence skills.
- The Missing Part of Your CS Degree: for the parts CS50 misses out on and for coverage of all the tools you’ll end up using here and there anyway.
- Hands-On Machine Learning with Scikit-Learn and TensorFlow Part 1: covers a vast majority of the most useful and time-tested machine learning techniques.
There’s a lot here. So to consolidate my knowledge I’d build 1–2 milestone projects using Streamlit or the web development skills I’d learned from freeCodeCamp. And of course, these would be shared on GitHub.
The advanced path (6–12+ months/ongoing)
Once I’d gotten some foundational machine learning skills, I’d build upon them with the following.
- All of fast.ai’s curriculum(s): practical use cases of many deep learning and machine learning techniques. Watching one fast.ai lecture turned into a solution we built for a client.
- Any of deeplearning.ai’s curriculum(s): choose the one which sparks your interest the most. Complements fast.ai’s practical approach with theory.
- Full-stack deep learning curriculum: this is where you’re going to tie together the machine learning knowledge you’ve got with the web development knowledge you’ve been learning.
- Replicate a research paper (or multiple).
- Hands-On Machine Learning with Scikit-Learn and TensorFlow Part 2: TensorFlow focused, but the concepts bridge to many different applications.
Again, after going through these, I’d consolidate my knowledge by building a project people can interact with.
An example would be a web application powered by a machine learning model.
Example curriculums
Two of the biggest things you pay for with a college degree are accountability and structure.
Good news is, you can get both of these yourself.
I created my own AI Masters Degree as a form of accountability and structure. You can do something similar.
In fact, if I was starting again, I’d follow something more similar to Jason Benn’s How I learned web development, software engineering & ML. It’s similar to mine but includes more software engineering practices.
If you can find a (small) community to learn with others, that’s a big bonus. I’m still not quite sure how to do this.
A billion dollar idea is to develop a platform where people can create their own self-driven curriculums and interact with others who are on similar paths. I say self-driven here because all knowledge is largely self-taught. Rather than hand-feed knowledge, the role of an instructor is instead more to excite, guide and challenge.
Does someone want to build this?
Share your work
Learning and reading is inhaling. Building and creating is exhaling. Don’t hold your breath.
Balance your consumption of materials with creations of your own.
For example, you might spend 6 weeks learning, then 6 weeks putting your knowledge together in a form of shared work.
Your shared work is your new resume.
Where?
GitHub and your own blog. Use the other platforms when needed. For machine learning projects, a runnable Colab notebook is your minimum requirement.
What’s missing?
Everything here is biased by my own experience of graduating from a nutrition degree, spending 9 months studying machine learning in my bedroom whilst driving Uber on the weekends to pay for courses, getting a machine learning job, leaving the job and building a machine learning course.
I have no experience of going to a coding bootcamp or university to learn technological skills, so I can’t compare the differences.
Though, since we’re talking about code and math, it either works or it doesn’t. Knowing this, the content of the materials you choose doesn’t matter as much as how you learn it.
Translated from: https://towardsdatascience.com/how-id-start-learning-machine-learning-again-3-years-in-55c52aaee52a