An Introduction to Machine Learning
Say you are practising basketball on your own and trying to shoot the ball into the hoop. If you miss on the first try, your instinct will probably be to move forward or backward, jump higher or lower, or adjust how you extend your arms. Whatever you change, the goal stays the same: get the ball into the basket. If one approach does not work, you keep trying new tactics until you eventually succeed. This is the idea behind machine learning.
Machine learning is an application of artificial intelligence that gives systems the ability to learn and improve from experience automatically, without being explicitly programmed. It focuses on developing computer programs that can access data and use statistical analysis to predict an output, updating those predictions as new data becomes available (i.e., learning).
“A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.” - Tom Mitchell
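To make the definition concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than part of the definition itself: the task T is deciding whether a number is at or above a hidden cut-off of 50, the experience E is a short hand-picked list of labelled numbers, and the performance measure P is accuracy on the integers 0 to 99.

```python
# Hypothetical setup: the "true" rule the learner is trying to recover (task T).
def true_label(x):
    return 1 if x >= 50 else 0

def fit_threshold(examples):
    """Guess the cut-off as the midpoint between the largest number labelled 0
    and the smallest number labelled 1 seen so far."""
    largest_negative = max(x for x, y in examples if y == 0)
    smallest_positive = min(x for x, y in examples if y == 1)
    return (largest_negative + smallest_positive) / 2

def accuracy(threshold, test_points):
    """Performance measure P: fraction of test points classified correctly."""
    correct = sum((1 if x >= threshold else 0) == true_label(x) for x in test_points)
    return correct / len(test_points)

# Experience E: hand-picked labelled examples, revealed two at a time.
stream = [10, 70, 30, 60, 48, 52]
experience = [(x, true_label(x)) for x in stream]
test_points = range(100)

scores = [accuracy(fit_threshold(experience[:n]), test_points) for n in (2, 4, 6)]
print(scores)
```

With these hand-picked examples the accuracy rises from 0.9 to 0.95 to 1.0 as more labelled examples are seen. Real data is noisier, so improvement is rarely this smooth.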
Classification of Machine Learning
There are various categories of machine learning. They are:
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
Supervised Learning: Here, the system is supplied with previously labelled data so that it can apply what it has learned from those labelled examples to new data and predict future events. It is like someone trying to memorize new facts while checking them against their notes. The learning algorithm can compare its output with the correct, intended output and use the errors to modify the model accordingly. A typical example is spam classification: you already have some emails labelled “spam”, and you classify new emails as spam or not depending on whether they share the qualities of those spam emails. Regression is another type of supervised learning.
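The idea of comparing the output with the intended output and adjusting the model can be sketched with a perceptron, which is just one simple supervised algorithm among many and is chosen here only for brevity. The two features (a count of suspicious words and a count of links) and the four labelled emails are hypothetical.

```python
# Hypothetical training data: (suspicious word count, link count) -> 1 for spam, 0 for not spam.
training_emails = [
    ((3, 2), 1),
    ((2, 3), 1),
    ((0, 1), 0),
    ((1, 0), 0),
]

weights = [0, 0]
bias = 0

def predict(features):
    """Return 1 (spam) if the weighted score is positive, else 0 (not spam)."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0

# Perceptron rule: whenever a prediction is wrong, nudge the weights
# in the direction that would have reduced the error.
for _ in range(10):
    for features, label in training_emails:
        error = label - predict(features)
        if error != 0:
            for i, x in enumerate(features):
                weights[i] += error * x
            bias += error

print(predict((4, 3)), predict((0, 0)))
```

After training, the model labels a new email with many suspicious words and links as spam and an email with none as not spam. This toy data is linearly separable, which is what lets the perceptron settle; real spam filters use far richer features.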
Unsupervised Learning: Here, the system is presented with unlabelled, uncategorized data, and it is left to the algorithm to determine the patterns in the data on its own. The system is not told what the right output is; instead, it explores the data and draws inferences that describe hidden structure in it. Recommendation systems used for marketing automation on the web are often based on this type of learning. Clustering and association are types of unsupervised learning.
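As a rough illustration of association, the sketch below counts which items appear together in shopping baskets and uses those counts to suggest a companion item. The baskets and the `recommend` helper are hypothetical, and real recommendation systems are considerably more involved; the point is only that no labels are supplied.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, unlabelled purchase data: each set is one customer's basket.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
    {"milk", "cereal"},
]

# Count how often each pair of items is bought together.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

def recommend(item):
    """Suggest the item most often bought together with `item`, or None."""
    partners = Counter()
    for (a, b), count in pair_counts.items():
        if a == item:
            partners[b] += count
        elif b == item:
            partners[a] += count
    return partners.most_common(1)[0][0] if partners else None

print(recommend("bread"))
```

Here bread and butter appear together in two baskets, more than any other pairing with bread, so butter is the suggestion.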
Reinforcement Learning: Here, you present the system with examples that lack labels, as in unsupervised learning, but this time each example is accompanied by positive or negative feedback (a reward system) according to the solution the algorithm proposes. It is closely related to dynamic programming and trains algorithms using a system of reward and punishment. This method allows the algorithm, or agent, to automatically determine the ideal behaviour within a specific context in order to maximize its performance. The agent learns by interacting with its environment, and the approach is typically seen when computers learn to play games, optimize their scores, and even outperform human players.
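A small sketch of learning from reward, using tabular Q-learning (one common reinforcement learning algorithm, picked here for illustration). The environment is a made-up corridor of five positions where only reaching the last position earns a reward; the learning rate, discount factor and episode count are arbitrary assumptions.

```python
import random

N_STATES = 5          # hypothetical corridor: positions 0..4, position 4 is the goal
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA = 0.5, 0.9

def step(state, action):
    """Environment: move within the corridor; reward 1.0 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(2)  # explore by acting uniformly at random
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: move the estimate towards reward + discounted best future value.
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q

q = train()
# Greedy policy: in each non-goal position, pick the action with the higher estimated value.
policy = [ACTIONS[row.index(max(row))] for row in q[:-1]]
print(policy)
```

The agent is never told that moving right is correct. It only receives a reward at the goal, yet the learned policy ends up choosing +1 (move right) in every position.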
Choosing the Right Machine Learning Problem
You have collected a bunch of data and want to use machine learning techniques to analyse it. How do you choose the right machine learning problem for your use case? The problem categories we will cover in this article are:
- Classification
- Regression
- Clustering
- Dimensionality reduction
Classification: Use classification when you need to sort your input data into categories or classes. Predicting categories is a very common use case, and the categories could be virtually anything. As in the email example above, is this email “spam” or “not spam”? Should it go to the “inbox” or the “spam” folder? As a financial trader constantly monitoring stock markets, given past information on the market, company performance and stock performance, should you “buy”, “sell” or “hold”? Or say you are working with image data and want to do object recognition: is this a “cat”, a “mouse” or a “dog”? The list is endless, but in every case the output of a classification model is a single category or class.
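To show that a classifier returns exactly one category, here is a minimal nearest-neighbour sketch. Nearest neighbour is only one of many classification methods, and the two features (weight and body length) and all the example values are invented for illustration.

```python
import math

# Hypothetical labelled examples: (weight in kg, body length in cm) -> category.
labelled_animals = [
    ((4.0, 46.0), "cat"),
    ((5.0, 50.0), "cat"),
    ((30.0, 90.0), "dog"),
    ((25.0, 80.0), "dog"),
    ((0.02, 9.0), "mouse"),
    ((0.03, 10.0), "mouse"),
]

def classify(features):
    """Return the category of the labelled example closest to `features`."""
    _, nearest_label = min(
        labelled_animals,
        key=lambda example: math.dist(example[0], features),
    )
    return nearest_label

print(classify((4.5, 48.0)), classify((28.0, 85.0)), classify((0.025, 9.4)))
```

Whatever the input, the output is one label from the known set. In practice the features would usually be rescaled first so that no single measurement dominates the distance.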
Regression: When you want your model to predict continuous numeric values, use a regression model. If you are a financial trader who needs to predict tomorrow’s stock price given current market sentiment and the company’s previous earnings, a regression model is the right tool. You might instead be analysing different cars and want to predict a car’s mileage from its attributes, or trying to predict the price of a house from its location and condition. Once you understand the nature of the problem, it is easier to know what to use.
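The house price case can be sketched with ordinary least squares on a single feature. The sizes and prices below are made up and lie exactly on a straight line so the result is easy to check by hand; real data would be noisy and would normally involve more than one feature.

```python
# Hypothetical data: house size in square metres and price in thousands.
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [130.0, 190.0, 230.0, 270.0]

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(sizes, prices)
predicted = slope * 90.0 + intercept  # predicted price for a 90 square metre house
print(slope, intercept, predicted)
```

Unlike a classifier, the model returns a number on a continuous scale: here a slope of 2, an intercept of 30, and a predicted price of 210 for a house size that was not in the data.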
Clustering: When you have a really large dataset and no idea what is in it, clustering can help you make sense of it. In social media ad targeting, finding users who are interested in a particular field so that you can show them specific ads is an application of clustering. Another is document discovery: you could gather all documents related to armed robbery and see whether you can find patterns across the cases. Clustering lets you discover patterns in the data yourself, in fine detail.
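A minimal sketch of clustering, using k-means with two clusters on one-dimensional data. The numbers are hypothetical (hours per week that each user spends on sports content), and starting the two centres at the smallest and largest values is a simplification chosen to keep the example deterministic.

```python
# Hypothetical data: hours per week each user spends on sports content.
hours = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

def k_means_1d(values, iterations=10):
    """Split `values` into two groups by alternating assignment and averaging."""
    centroids = [min(values), max(values)]  # simple deterministic starting point
    clusters = [[], []]
    for _ in range(iterations):
        # Assignment step: put each value with its nearest centre.
        clusters = [[], []]
        for v in values:
            nearest = min(range(2), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centre to the mean of its group
        # (an empty group keeps its previous centre).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

centroids, clusters = k_means_1d(hours)
print(centroids, clusters)
```

No labels were given, yet the users separate into a light-interest group centred on 2 hours and a heavy-interest group centred on 11 hours, which is the kind of grouping an ad-targeting system could act on.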
Dimensionality Reduction: This is a preprocessing technique used to identify the most informative features in your data. Let’s say you have 500 different variables: which of them are most significant? Which features should you pay more attention to? This is where dimensionality reduction comes into play. It is used to preprocess your data so that you can build more robust machine learning models with better performance, whether they are classification, regression or any other kind. Dimensionality reduction also helps us find latent factors when we have large data and no target values.
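One widely used dimensionality reduction method is principal component analysis (PCA), which the article does not name but which fits the description. The sketch below finds the first principal component with power iteration in plain Python and reduces two features to one. The data is hypothetical and deliberately redundant: the second feature is always twice the first.

```python
# Hypothetical data: two features per row, the second always twice the first.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

def first_principal_component(data, iterations=50):
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    # Sample covariance matrix of the centered features.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by the covariance matrix and normalize.
    # Assumes the data is not constant and the start vector is not orthogonal to the answer.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Project each centered row onto the component: one number per row instead of d.
    projected = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projected

component, reduced = first_principal_component(rows)
print(component, reduced)
```

Because the two features move together, a single direction captures all of the variation, and each row can be described by one number instead of two. With 500 variables the same idea keeps only the few directions that carry most of the variation.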
Conclusion
Machine learning comes into the picture when problems cannot be solved by typical approaches. It enables the analysis of large amounts of data and can deliver faster, more accurate results that help identify profitable opportunities or dangerous risks.
This article is intended only as an introduction to the concept of machine learning. There is a lot more to learn, and you can do it by being willing to learn, making time and finding the right resources online. I hope I have been able to make you want to learn more.
Original article: https://medium.com/@amarachi.anyim00/an-introduction-to-machine-learning-493d16017d9b