Netflix Data Scientist: Applications of Data Science and Machine Learning in Netflix
Data Science, Machine Learning, Technology
Using data science, Netflix has surpassed its competition and now has over 100 million users globally. Data science helps Netflix keep track of all your likes and dislikes to make sure you’re satisfied.
Photo by fauxels from Pexels
The Concept
Data science is a combination of tools, algorithms, and machine learning principles that help extract useful patterns from raw data. Using advanced machine learning algorithms, a data scientist can estimate how likely an event is to occur again. The Internet of Things (IoT) has multiplied the data available to work with, which has made data science one of the most valuable resources for companies today.
The Aim
Netflix has always worked to improve its user interface at every level. Its primary goal is to add contextual awareness to its recommendations, meaning that each recommendation should have sound reasoning behind it. According to DataFlair, two classes of context are relevant to Netflix.
1. Explicit
● Location
● Language
● Time of day
● Device
2. Inferred
● Binge-watching patterns
● Companion
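The two context classes above can be pictured as a small data structure. This is only an illustrative sketch: the class names, field names, and sample values are assumptions made for this example and do not describe Netflix's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExplicitContext:
    """Signals that are directly observable at request time."""
    location: str
    language: str
    time_of_day: str
    device: str


@dataclass(frozen=True)
class InferredContext:
    """Signals that must be estimated from behaviour (may be unknown)."""
    binge_pattern: Optional[str] = None
    companion: Optional[str] = None


@dataclass(frozen=True)
class ViewingContext:
    explicit: ExplicitContext
    inferred: InferredContext


# Hypothetical sample values, for illustration only.
ctx = ViewingContext(
    explicit=ExplicitContext("IN", "en", "evening", "tv"),
    inferred=InferredContext(binge_pattern="weekend", companion=None),
)
print(ctx.explicit.device, ctx.inferred.binge_pattern)
```

Keeping the inferred fields optional reflects that they may not be known for every session.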
Photo by energepic.com from Pexels
The Application
Netflix has used data science to ensure that users get value for their money. With the help of various analytical tools, the streaming giant identifies what users like and are inclined to watch, and directs them towards similar options. A study suggests that recommendations influence more than 80% of all content streamed on Netflix.
Netflix does not use a conventional Hadoop data warehouse. Instead, it stores its data in Amazon S3, which lets it spin up multiple Hadoop clusters for different workloads that all access the same data. It uses Hive for ad hoc queries and analytics, and Pig for ETL (extract, transform, load).
The Data
To begin its analysis, Netflix gathers raw data, from which it plans to extract useful information using data science algorithms. A combination of these algorithms transforms plain numbers into a detailed recommendation plan. For every 5 minutes a user spends scrolling, Netflix can predict more than 40% of their relative selection patterns. Netflix collects, captures, and stores data in several areas.
● Time: The first step is to record the time and date when users stream content. It helps Netflix identify your Sunday-night horror movie plans or your afternoon thriller preferences.
● Searches: All searched titles are automatically stored so that further recommendations can be directed towards them. Let’s say you search “John Wick,” watch the movie, and close Netflix. The next time you open the application, you will very likely find more action movies or more movies starring Keanu Reeves.
● Browsing and scrolling behavior: Netflix also uses advanced analytical programs to identify which movie or TV show you stopped on to read about. It helps Netflix showcase more similar content to catch your eye and get you interested again.
● Pause/Fast-forward: Using data science, Netflix captures the exact moments at which a user pauses or fast-forwards while streaming. It helps identify which kinds of scenes are preferred over others. If you skip the emotional scene of an action movie, the algorithm learns to steer away from emotional movies in future recommendations. But if you re-watch an emotional scene, it adapts accordingly.
● Device used: If you use separate devices to stream different content, this distinction is stored permanently. For example, children watching cartoons on the home TV will not be recommended the movies their parents watch on the iPad, despite using the same account.
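The signals listed above can be thought of as a stream of event records that are later aggregated. The sketch below is a minimal illustration of that idea, showing how per-device play counts would keep TV and iPad preferences apart. The `ViewingEvent` record, its fields, and the sample events are hypothetical and are not taken from Netflix.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ViewingEvent:
    """One logged interaction (hypothetical schema for illustration)."""
    timestamp: datetime
    device: str
    action: str  # e.g. "play", "pause", "fast_forward", "search", "browse"
    title: str
    genre: str


def genre_counts_by_device(events):
    """Count played genres separately for each device."""
    counts = defaultdict(Counter)
    for event in events:
        if event.action == "play":
            counts[event.device][event.genre] += 1
    return counts


def top_genre(events, device):
    """Return the most played genre on a device, or None if nothing was played."""
    counts = genre_counts_by_device(events)[device]
    return counts.most_common(1)[0][0] if counts else None


# Made-up sample events.
events = [
    ViewingEvent(datetime(2020, 5, 3, 16, 0), "tv", "play", "Cartoon 1", "cartoon"),
    ViewingEvent(datetime(2020, 5, 3, 17, 0), "tv", "play", "Cartoon 2", "cartoon"),
    ViewingEvent(datetime(2020, 5, 3, 21, 0), "ipad", "play", "Thriller 1", "thriller"),
    ViewingEvent(datetime(2020, 5, 3, 21, 30), "ipad", "pause", "Thriller 1", "thriller"),
]
print(top_genre(events, "tv"), top_genre(events, "ipad"))
```

Because the counts are keyed by device, the cartoons played on the TV never influence what is surfaced on the iPad, even though both belong to one account.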
Photo by Lukas from Pexels
The Project
Netflix uses data at every level possible. From the moment a user logs in until they log out, it stores all the information it needs, then channels this data into actionable information. The most famous story of Netflix’s marketing is how it purchased the “House of Cards” series. The series, starring Kevin Spacey and directed by David Fincher, was one of its biggest blockbuster hits. Netflix spent more than a hundred million dollars to acquire the series, for several reasons.
● Netflix identified a vast fan base for actor Kevin Spacey, who has acted in movies such as 21 and American Beauty.
● It also checked which movies were trending and popular on its platform. Movies like Fight Club and The Social Network, both directed by the renowned David Fincher, were highly rated and widely viewed by its audience.
● Netflix also looked at the statistics of the British version of the series, which had been released earlier. The UK version was well received by its target audience, which strengthened the case.
● Political drama was one of its most active genres, with titles like Elizabeth I: The Virgin Queen and Winnie Mandela doing the rounds on its website.
Using programmable algorithms, all these factors were linked into a pattern that led Netflix to spend big on House of Cards. The series then became a massive hit and climbed to the #1 position on its trending charts, making this a successful and profitable analysis.
Photo by Bich Tran from Pexels
The Benefits
Why would a company with Netflix’s dominant market position spend its time on data science? The answer is consumer retention: it is crucial to attract new customers while retaining existing ones. Helped by its data analysis tools, users have preferred Netflix over other service providers such as Hotstar and Amazon Prime. Netflix has drawn millions of users to its platform, achieving 20 billion dollars in revenue in 2019.
The Outcome
Netflix gained more than 3.1 million subscribers after the release of House of Cards, most of them streamers in the US. This helped Netflix in several ways.
● Revenue: The newly subscribed users added more than 72.5 million dollars in revenue for Netflix. That was more than 75% of the combined investment Netflix made to air both seasons of the show.
● Word of Mouth: Adding so many users and tending to their needs using data science helped Netflix gain even more popularity globally. It also led to a further wave of users joining through referrals, creating more opportunities for growth.
The Display
Every section on Netflix’s home page is unique to the user’s account. Each section is displayed based on a vast set of collected data, combined to produce the most relevant recommendations.
1. Trending
The Trending section is tailored to the location and preferences of the user. Chris Hemsworth’s Extraction was at the top of the Trending list in India just after its release. Every user in India who had viewed action content or Chris Hemsworth’s movies was recommended Extraction.
Image by Jorge Gryntysz from Pixabay
2. Continue Watching
This section collects the content that a user has begun streaming but left unfinished. The point at which playback was paused or stopped is stored, so that streaming resumes at the exact scene where the user left off.
3. Genre Content
If a user frequently watches action movies, a separate section named “Violent Movies” will be created, containing popular action movies with plenty of violent scenes. If a user watches shows like Money Heist (a top-rated show about a group of thieves in Spain), they will find an additional section named “Risk-Taker and Rule-Breaker TV” on their home page.
4. Because You Watched
There is also a combination section, where all the other data is factored in. Suppose a user watched the movie Polar: a new section called “Because you watched Polar” will be created, containing other movies that share its genre, actors, directors, or producers.
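One simple way to picture a "Because you watched" row is to rank other titles by how many attributes they share with the watched title. The sketch below uses Jaccard similarity over attribute sets; this measure is an assumption chosen for illustration, not something the article attributes to Netflix, and the catalog entries are made up.

```python
def jaccard(a, b):
    """Share of attributes two titles have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def because_you_watched(watched, catalog, top_n=3):
    """Rank other titles by attribute overlap with the watched title."""
    seed = catalog[watched]
    scored = [
        (jaccard(seed, attrs), title)
        for title, attrs in catalog.items()
        if title != watched
    ]
    # Drop titles with nothing in common, then sort by score (ties by title).
    scored = [(score, title) for score, title in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_n]]


# Hypothetical catalog: each title maps to its genre, actor, and director tags.
catalog = {
    "Movie A": {"genre:action", "actor:a1", "director:d1"},
    "Movie B": {"genre:action", "actor:a1", "director:d2"},
    "Movie C": {"genre:action", "actor:a2", "director:d3"},
    "Movie D": {"genre:romance", "actor:a3", "director:d4"},
}
row = because_you_watched("Movie A", catalog)
print(row)
```

Here "Movie B" ranks first because it shares both genre and lead actor with "Movie A", while "Movie D" shares nothing and is left out of the row.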
Netflix aims to make people wonder how it always has a ready-made list that will entertain them. Every pause, scroll, and log-in time is used to improve the user interface as much as possible.
Photo by Stas Knop from Pexels
The Testing
Netflix conducts background testing at scale to understand how well its data-driven recommendations work. The results and statistics from these tests determine whether a set of algorithms should be rolled out across its platform globally.
Personalization Based on Interleaving
Netflix traditionally relied on A/B testing, in which two candidate algorithms were each tested on a separate sample of users. The results depended on how well each algorithm’s recommendations appealed to its sample. This method was subsequently scrapped because it proved impractical.
Netflix then adopted a new testing method: interleaving the output of competing algorithms to decide on the best page-ranking algorithm for improving the user interface. This method benefited the American media service provider in several ways.
Image by Michal Jarmoluk from Pixabay
● Cost-friendly: Interleaving blends the results of two algorithms into one test, which means Netflix carried out two tests for the price of one. Background testing is costly, and this method saved a significant share of that cost.
● Time-saving: Combining two tests into one returns results quickly and frees up time for other work. Since time is money, this is considered a more suitable and profitable choice of testing.
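The article does not say which interleaving scheme Netflix used. As a rough sketch of the general idea, the code below implements team-draft interleaving, a common variant in which two rankings take turns contributing their best unused item and each click is credited to the ranking that supplied the clicked item. The function names and sample titles are assumptions for illustration.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, rng):
    """Blend two rankings; return the blended list and each item's team."""
    rankings = {"A": ranking_a, "B": ranking_b}
    picks = {"A": 0, "B": 0}
    interleaved, team = [], {}
    total = len(set(ranking_a) | set(ranking_b))
    while len(interleaved) < total:
        # The team with fewer picks goes next; a coin flip breaks ties.
        if picks["A"] < picks["B"]:
            side = "A"
        elif picks["B"] < picks["A"]:
            side = "B"
        else:
            side = rng.choice(["A", "B"])
        item = next((x for x in rankings[side] if x not in team), None)
        if item is None:
            # This ranking is exhausted, so the other one supplies the item.
            side = "B" if side == "A" else "A"
            item = next(x for x in rankings[side] if x not in team)
        interleaved.append(item)
        team[item] = side
        picks[side] += 1
    return interleaved, team


def credit_clicks(clicked, team):
    """Count how many clicked items each ranking contributed."""
    wins = {"A": 0, "B": 0}
    for item in clicked:
        if item in team:
            wins[team[item]] += 1
    return wins


rng = random.Random(0)  # seeded only to make the sketch repeatable
mixed, team = team_draft_interleave(["t1", "t2", "t3"], ["t3", "t4", "t1"], rng)
wins = credit_clicks([mixed[0]], team)
print(mixed, wins)
```

Because every user sees items from both algorithms in a single list, one group of users yields a comparison that A/B testing would need two groups for, which matches the cost and time savings described above.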
The Importance
As the world moves into the future, digitization has become the norm for everyone. The number of users coming online keeps growing rapidly. This has created a heated environment of intense competition among media service providers like Netflix and Amazon Prime.
1. Engagement
Data science helps Netflix increase user participation in powerful and creative ways. Analytics creates a virtual rapport between the user and the service provider, and Netflix aims to build on this rapport using its market share advantage.
2. Solution
Netflix aims to use data science as its go-to tool for problem-solving. There are plenty of problems that data science can help with.
● Low reach: Recommendations on Netflix can improve the view count of overlooked content. This helps Netflix keep its audience engaged on its platform.
● Feedback and Ratings: Analytical programs and probability models help Netflix average clusters of user ratings and categorize content by how well it impresses viewers.
● Policy Control: Netflix has a strict policy that discourages multiple people from sharing a single account, while allowing up to five individual profiles per account. Data science is used to monitor the devices that log in to the same account to prevent breaches of this policy.
Video by BUMIPUTRA from Pixabay
● Innovation and Efficiency: The key quality of data science is that it never goes out of fashion. Machine learning continually adapts, using previously stored data to predict future outcomes. For Netflix, efficiency means delivering the right content to the right user.
● Decision making: Gathering data to make decisions is not in itself the mantra for success. The mantra lies in mastering analytics so that the data is used and channeled in the right direction. Netflix has used data science to identify the opportunities and paths available to it.
● Personalization: In a physical market, a consumer can ask for a personalized product, test it, and purchase it. Data science has helped Netflix stretch its range to meet the customized demands of its audience in the same way.
A consumer is satisfied when the right product is available at the right time and place, for the right price. Netflix has made its users’ lives more convenient by putting high-quality, relevant content at their fingertips.
The Conclusion
It all comes down to one question:
Based on the historical actions taken by a user and the data available, what is the most probable video a user will play right now?
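The question above can be read as picking the title with the highest estimated play probability. The sketch below is a deliberately simple stand-in: it weights each title by how often the user has played its genre, with additive smoothing so that unseen genres keep a small chance. The smoothing approach, function names, and sample data are assumptions for illustration, not a description of Netflix's models.

```python
from collections import Counter


def play_probabilities(genre_history, catalog, alpha=1.0):
    """Estimate P(play) per title from how often its genre was played before."""
    counts = Counter(genre_history)
    # Additive smoothing: every title gets at least `alpha` weight.
    weights = {title: counts[genre] + alpha for title, genre in catalog.items()}
    total = sum(weights.values())
    return {title: weight / total for title, weight in weights.items()}


def most_probable(genre_history, catalog):
    """Return the title with the highest estimated play probability."""
    probs = play_probabilities(genre_history, catalog)
    return max(probs, key=probs.get)


# Hypothetical history and catalog.
history = ["action", "action", "thriller"]
catalog = {"Title X": "action", "Title Y": "thriller", "Title Z": "comedy"}
probs = play_probabilities(history, catalog)
best = most_probable(history, catalog)
print(best, probs)
```

With two action plays and one thriller play in the history, the action title receives half of the probability mass and is returned as the most probable next play.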
Photo by Dominika Roseclay from Pexels
The list of recommendations can be prepared within seconds using probability models and analytical programs. Data science has become an integral part of a growing world. It has built the foundation on which companies like Netflix will develop their future. Netflix has minimized its scope for errors, enhanced its user interface, and boosted user engagement.
Once in action, decision-making seems like an easy task, but it requires creative workers using high-end tools to create solutions that adapt across all verticals. Netflix holds a dominant market share and has been called the “HBO of Internet TV.” No platform on the World Wide Web succeeds without a strong foundation. Without data science, companies would be stuck with unfiltered clusters of databases and no clue how to proceed.
Everyone should ask themselves whether data analytics will improve their business. Netflix did it, and so should you.
With all this information at hand, you are hopefully better prepared to become a successful data scientist. I hope this helps, and all the best for your future endeavors! Thanks for reading this article! Leave a comment below if you have any questions.
Translated from: https://medium.com/towards-artificial-intelligence/applications-of-data-science-and-machine-learning-in-netflix-dcdf6abbb194