AI: Algorithmia, 2020 State of Enterprise Machine Learning (translation and commentary)
Contents
2020 State of Enterprise Machine Learning (translation and commentary)
Introduction
Survey at a glance
Key finding 1: The rise of the data science arsenal for machine learning
Data scientists employed, a year-on-year comparison
Demand for data scientists
New roles, the same data science
Key finding 2: Cutting costs takes center stage as companies grow
Machine learning use case frequency
Smaller companies focus on customers
Breakdown of use cases by industry
Key finding 3: Overcrowding at early maturity levels and AI for AI’s sake
2020 machine learning maturity levels
55% of companies surveyed have not deployed a machine learning model
9% more companies have gotten models into production since 2018
Year and company size comparison
Machine learning maturity and company size
Gauging maturity in the year ahead
Anticipated maturity stage in the next 12 months
Key finding 4: An unreasonably long road to deployment
Model deployment timeline
Model deployment timeline and company size
Model deployment timeline and ML maturity
Data science workload and the last mile to deployment
Time data scientists spend deploying models by company size
Key finding 5: Innovation hubs and the trouble with scale
Model reproducibility impedes ML maturity
Year comparison of machine learning challenges
Organizational misalignment and ML progress
Key finding 6: Budget and machine learning maturity, priorities and industry
AI/ML budgets FY18 to FY19
Budgets and ML maturity
FY19 AI/ML budgets and ML maturity level
AI/ML budgets for banking and financial services
AI/ML budgets for manufacturing
AI/ML budgets for information technology
Key finding 7: Determining machine learning success across the org chart
The future of machine learning
Methodology
About Algorithmia
About the cover
Related articles
AI: Algorithmia, 2020 State of Enterprise Machine Learning (translation and commentary)
AI: Algorithmia, 2021 Enterprise Trends in Machine Learning (translation and commentary)
2020 State of Enterprise Machine Learning (translation and commentary)
Article link: 2020 State of ML - Algorithmia
Introduction
In the last 12 months, there have been myriad developments in machine learning (ML) tools and applications, and hardware for AI and ML is also progressing. Google’s TPUs are in their third generation, the AWS Inferentia chip is a year old, Intel’s Nervana Neural Network Processors are enabling deep learning, and Microsoft is reportedly developing its own custom AI hardware.
This year, Algorithmia has had conversations with thousands of companies in various stages of machine learning maturity. From those conversations we developed hypotheses about the state of machine learning in the enterprise, and in October we decided to test them.
Building on the State of Enterprise Machine Learning report we published in 2018, we conducted a new two-prong survey this year, polling nearly 750 business decision makers across all industries, from companies actively developing machine learning lifecycles to those just beginning their machine learning journey.
One set of respondents was administered a blind version of our survey by a third party (we refer to this group in the report as Group A); the other set was sent a survey by Algorithmia and was aware of the author (referred to herein as Group B). Group A contained 303 respondents and Group B contained 442.
We analyzed the responses from both groups for insight into their work, their companies’ machine learning roadmaps, and the changes they have seen in recent months with regard to ML development. Where a statistic draws on only one group, we say so. The Methodology section provides further detail on the survey prongs and how we processed the data.
The following are the findings of that effort, presented with our original hypotheses and our analysis of the results. Where possible, we have provided a year-on-year comparison with data from 2018 and included predictions about what is likely to manifest in the ML space in the near term. We will soon make our survey data available on an interactive webpage to foster greater understanding of the ML landscape, and we are committed to being good stewards of this technology.
Algorithmia seeks to empower every organization to achieve its full potential through the use of artificial intelligence and machine learning by delivering the last-mile solution for model deployment at scale.
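The sample sizes above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the report does not say how it combines the two groups when it quotes a single figure, so the size-weighted `pooled_percentage` helper is an assumption, not the report's stated method.

```python
# Respondent counts as stated in the report.
GROUP_A = 303  # blind survey administered by a third party
GROUP_B = 442  # survey sent by Algorithmia


def group_shares(n_a, n_b):
    """Return the total sample size and each group's share of it."""
    total = n_a + n_b
    return total, n_a / total, n_b / total


def pooled_percentage(pct_a, pct_b, n_a=GROUP_A, n_b=GROUP_B):
    """Size-weighted average of two group percentages.

    Assumption: weighting by respondent count. The report only says it
    averages both groups and does not specify the weighting.
    """
    return (pct_a * n_a + pct_b * n_b) / (n_a + n_b)


total, share_a, share_b = group_shares(GROUP_A, GROUP_B)
print(f"total respondents: {total}")  # 745, i.e. "nearly 750"
print(f"Group A share: {share_a:.1%}, Group B share: {share_b:.1%}")
```

Because Group B is larger, a size-weighted figure sits closer to Group B's answer than a simple mean of the two group percentages would.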
Survey at a glance
The main takeaway from the 2020 State of Enterprise Machine Learning survey is that a growing number of companies are entering the early stages of ML development, but challenges in deployment, scaling, versioning, and other sophistication efforts still hinder teams from extracting value from their ML investments. As a result, we will likely see a boom in the number of ML companies providing services to overcome these obstacles in the near term.
In this report, we focus on seven key survey findings and what they say about the machine learning landscape. Those key findings are as follows:
(1) The number of data scientist roles at companies is often less than 10, but is growing rapidly across all industries.
(2) Business use cases for machine learning are becoming more varied, but currently customer-centric applications are the most common.
(3) Machine learning operationalization (having a deployed ML lifecycle) is fledgling but maturing across all industries, with software and IT firms leading the charge.
(4) The main challenges people face when developing ML capabilities are scale, version control, model reproducibility, and aligning stakeholders.
(5) The time it takes to deploy a model is stuck somewhere between 31 and 90 days for most companies.
(6) Budgets for ML programs are growing most often by 25 percent, and the banking, manufacturing, and IT industries have seen the largest budget growth this year.
(7) Organizations are determining ML success by both business unit and statistical metrics, with a significant divide by job level.
The report will go into each finding in detail and provide analysis and our outlook.
Key finding 1: The rise of the data science arsenal for machine learning
One of the pieces of data we collected this year was the number of data scientists employed at the respondent’s place of work. In conversations we regularly have with companies, we repeatedly hear that management is prioritizing hiring for the data science role above many others, including engineering, IT, and software. Here is what the survey results showed.
Half of the people polled (across both survey groups) said their companies employ between one and 10 data scientists. This is actually down from 2018, when 58 percent of companies claimed to employ between one and 10 data scientists.
We would have expected the number to increase over time because investment in AI and ML is known to be growing (Gartner). When assessed in the context of the full data, however, a likely reason for the downward trend presents itself. In 2018, 18 percent of companies employed 11 or more data scientists. This year, that number soared to 39 percent, suggesting that across all industries, organizations are ramping up their hiring efforts to build larger data science arsenals, with some of them starting from close to 10 data scientists already.
Another observation is that in 2018, barely 2 percent of companies had more than 1,000 data scientists; today that number is just over 3 percent, indicating small but significant growth. These companies include the big FAANG tech giants: Facebook, Apple, Amazon, Netflix, and Google (Yahoo). Their large data science teams are working to maintain competitiveness as more third-party solutions crop up.
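The year-on-year comparison above mixes two kinds of change: the share of companies with 1 to 10 data scientists fell, while the share with 11 or more rose. A minimal sketch of the arithmetic, using only the percentages quoted in the text (the function names are illustrative):

```python
# (2018 share, 2020 share) in percent, as quoted in the report text.
SHARES = {
    "1-10 data scientists": (58, 50),
    "11 or more data scientists": (18, 39),
}


def point_change(before, after):
    """Change in percentage points between two survey years."""
    return after - before


def relative_change(before, after):
    """Change relative to the earlier value, in percent."""
    return (after - before) / before * 100


for label, (y2018, y2020) in SHARES.items():
    print(
        f"{label}: {point_change(y2018, y2020):+d} points, "
        f"{relative_change(y2018, y2020):+.1f}% relative"
    )
```

Read this way, the 8-point drop in the smallest bracket is consistent with companies moving up into the 11-or-more bracket, which gained 21 points, rather than with companies shedding data scientists.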
Data scientists employed, a year-on-year comparison
Reflects data from both survey groups. Note that respondents who did not know or were unsure are not depicted in this graph.
Demand for data scientists
In 2016, Deloitte predicted a shortage of 180,000 data scientists by 2018, and between 2012 and 2017, the number of data scientist jobs on LinkedIn increased by more than 650 percent (KDnuggets). The talent deficit and high demand mean that hiring and retaining data scientists will only become more difficult for small and mid-sized companies that cannot offer the same salary and benefits packages as the FAANG companies.
As demand for data scientists grows, we may see a trend of junior-level hires having less opportunity to structure data science and machine learning efforts within their teams, as much of the structuring and program scoping may have already been done by predecessors who overcame the initial hurdles. It could also mean, however, that leadership alignment has already been attained, so ML teams will have more ownership and leeway in project execution.
New roles, the same data science
Finally, we may also see the merging of traditional business intelligence and data science in order to fill immediate requirements in the latter talent pool, since both domains use data modeling (BI work uses statistical methods to analyze past performance, and data science makes predictions about future events or performance). Gartner predicts that the overall lack of data science resources will result in an increasing number of developers becoming involved in creating and managing machine learning models (Gartner CIO survey).
This blending of roles will likely lead to another phenomenon related to this finding: more role names and job titles for the same sorts of work. To that end, we are seeing an influx of new job titles in data science such as Machine Learning Engineer, ML Developer, ML Architect, Data Engineer, Machine Learning Operations (ML Ops), and AI Ops as the industry expands and companies attempt to distinguish themselves and their talent from the pack.
Key finding 2: Cutting costs takes center stage as companies grow
As a company, we are interested in machine learning applications in the enterprise and we strive to keep a pulse on how industries are using emerging ML tech to automate workflows. There are countless ways to apply ML to a particular business problem, such as using prediction modeling to make assessments about customer churn or applying natural language processing to millions of tweets to analyze the percentage of negative sentiments.
In this year’s survey, we polled respondents about the ways their companies are using machine learning, to check that our understanding of the landscape is accurate and that we are not missing a key use case entering the enterprise. We anticipated a trend toward using ML to automate time-consuming processes and cut down on the number of human resources needed to do a given task. The results are depicted below.
Machine learning use case frequency
Reflects data only from survey Group B. Note that respondents were allowed to choose more than one answer.
In this year’s survey, we provided a wide-ranging list of possible use cases and a write-in option. Respondents were encouraged to select all answers that applied to how their companies use AI and ML models today. The top three machine learning use cases across the board (for companies of all sizes) were as follows:
(1) Reducing company costs
(2) Generating customer insights and intelligence
(3) Improving customer experience
When we break down the data by company size, we start to see some differentiation in priorities. The top five ML use cases for companies with 10,000 employees or more:
(1) Reducing company costs
(2) Process automation for internal organization
(3) Improving customer experience
(4) Generating customer insights and intelligence
(5) Detecting fraud
The top five ML use cases for companies with 1,001-5,000 employees:
(1) Reducing company costs
(2) Retaining customers
(3) Process automation for internal organization
(4) Recommender systems
(5) Increasing customer satisfaction
The top five ML use cases for companies with fewer than 100 employees:
(1) Generating customer insights and intelligence
(2) Improving customer experience
(3) Reducing company costs
(4) Increasing customer satisfaction
(5) Retaining customers
Smaller companies focus on customers
The survey data showed that large companies are using ML primarily for internal applications (reducing company spend and automating internal processes), and smaller companies are primarily focused on customer-centric functions (increasing customer satisfaction, improving customer experience, and gathering insights). This suggests that as companies grow, they prioritize customer service less than cost-saving measures and applications that improve their product lines. Doing so comes at a price, however, as one-third of Americans consider switching companies after just one instance of poor customer service (Qualtrics). Conversely, an increase in customer retention rate of just 5 percent can produce more than a 25-percent increase in profits (Bain & Company).
Fortunately, machine learning is a solution for both types of business problems (cutting costs and customer satisfaction) and will likely shift business priorities in the near term as workflows are drastically augmented by new tech. For comparison, in our 2018 survey, 48 percent of respondents from companies with 10,000 or more employees said cost savings was a major ML priority, and 59 percent said increasing customer loyalty was the top ML use case, so this year’s results show a notable shift away from customer-focused use cases. It will be important to monitor this metric in future years to see if this is the beginning of a trend or an anomaly.
Before conducting this year’s survey, we anticipated a more even spread of use cases across companies of all sizes, independent of industry, because of the number of companies and applications in development in the AI/ML space (Forbes). The percentages for cost reduction, robotic process automation, and customer service applications may be an indicator of ML’s general newness and immaturity, which our next key finding discusses, or they may show that those types of repetitive applications lend themselves more readily to automation. As machine learning becomes more sophisticated with time, we are likely to see a wider pool of use cases designed for specific organizational initiatives.
Breakdown of use cases by industry
Understandably, industries with customer-facing products or services (retail, manufacturing, healthcare, etc.) prioritize ML applications that improve customer service, and industries involved with security, compliance laws, and proprietary data (financial institutions, government agencies, insurers, etc.) focus more on ML use cases that help solve those challenges. The following are a few noteworthy examples:
Respondents in both survey groups who work in consulting and professional services industries said that reducing customer churn was their top ML priority.
The education/edtech sector’s top ML use case was interacting with customers, which is reasonable considering that students and instructors are likely a primary customer set in those industries.
For the healthcare, pharmaceutical, and biotech industries, increasing customer satisfaction was the leading use case, suggesting that customer dissatisfaction or churn may be a continual challenge in those fields.
IT companies use ML primarily to acquire new customers, and software development organizations prioritize ML recommender systems to guide users toward viewing new products or features to buy.
Banks and financial services firms are focusing their ML efforts on retaining customers and detecting fraud, that is, keeping customers happy and mitigating vulnerabilities to the company.
Finally, the energy sector, including utility companies, is focusing on forecasting demand fluctuations using ML, likely to prevent power outages, reduce response times during disruptions of service, and plan for power consumption for coming years (NeuralDesigner).
Key finding 3: Overcrowding at early maturity levels and AI for AI’s sake
Understanding how companies view their own machine learning maturity provides insight into future developments in the ML space. For this survey, we asked respondents to gauge where they think their companies are currently located on the machine learning roadmap. That is to say, we sought to determine whether they are just starting to consider machine learning applications for business problems, operating a fully developed machine learning program, or somewhere in the middle of that spectrum, and whether their positioning has changed in the previous 12 months.
In 2018’s survey report, nearly 40 percent of respondents said they were just beginning to develop ML plans (i.e., evaluating use cases, starting to build models). Moreover, in 2018 fewer than 10 percent of respondents considered themselves at a sophisticated ML maturity level.
This year, we asked respondents to select one of the following options to gauge ML maturity levels:
Not actively considering ML as a business solution
Evaluating ML use cases
Just starting to develop/build models
Developed models; working toward production
Early stage adoption (models in production for 1-2 years)
Mid-stage adoption (models in production for 2-4 years)
Sophisticated (models in production for 5+ years)
2020 machine learning maturity levels
55% of companies surveyed have not deployed a machine learning model
Of the respondents who are actively engaging in ML (excluding the first category, those not considering ML as a business solution), about one-fifth said they are evaluating use cases, based on an average of both survey groups. Those just starting to develop and build models numbered 17 percent, and a separate 17 percent of companies have developed models but are still working toward production. This means that 55 percent of companies surveyed have not deployed a machine learning model.
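The 55 percent figure is the sum of the three pre-production categories. Since the text gives "about one-fifth" for the first category without an exact number, the short check below backs that value out of the other three; all figures are rounded in the report, so the result is approximate.

```python
# Percentages quoted in the report (rounded), among respondents
# actively engaging in ML.
NOT_DEPLOYED_TOTAL = 55           # have not deployed a model
STARTING_TO_BUILD = 17            # just starting to develop/build models
DEVELOPED_NOT_IN_PRODUCTION = 17  # developed models; working toward production

# Implied share of companies still evaluating use cases.
evaluating_use_cases = (
    NOT_DEPLOYED_TOTAL - STARTING_TO_BUILD - DEVELOPED_NOT_IN_PRODUCTION
)

# Everyone else has at least one model in production.
deployed = 100 - NOT_DEPLOYED_TOTAL

print(f"evaluating use cases: ~{evaluating_use_cases}%")  # "about one-fifth"
print(f"models in production: ~{deployed}%")
```

The implied 21 percent matches the "about one-fifth" wording, and the remaining 45 percent is split across the early, mid-stage, and sophisticated categories.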
ML in early stages of development
The number of companies with undeployed models is up 4 percent from last year, likely because more companies across the board are beginning ML journeys, inflating the category of newcomers. It is important to note as well that our survey sample increased by more than 200 people from last year.
9% more companies have gotten models into production since 2018
Just over 22 percent of companies have had models in production for 1-2 years; last year, 13 percent of respondents claimed this, demonstrating a fairly significant migration toward productionization even if it is still early days for most companies.
Moreover, one-fifth of companies said they plan on getting models into production within the next year, suggesting that we may see a noticeable portion of companies moving into the next maturity category (mid-stage) in the near term.
Year and company size comparison
In 2018, only 6 percent of respondents considered their companies to have sophisticated ML programs. This year, 8 percent do, and the majority of companies in the sophisticated category either have fewer than 500 employees or more than 10,000. In 2018, 39 percent of sophisticated companies had fewer than 100 employees and 29 percent had more than 10,000 employees.
There are several ways to read this maturity breakdown. First, large companies typically have more budget for innovation hubs and emerging technology, thus streamlining the development of sophisticated ML initiatives. Smaller companies, however, can be quite agile technologically, able to build, buy, and iterate quickly.
They can also be highly motivated to build reputation, profit, brand loyalty, and a competitive edge right out of the gate, and machine learning can be an effective and efficient tool to reach all those goals.
That the largest and smallest companies are leading in ML maturity is significant and may speak to and encourage a more equal tech landscape wherein the largest tech voices are not the only voices at play.
Machine learning maturity and company size
Mid-sized companies span all maturity levels, with the highest concentration in the early-to-mid-stage levels of maturity, suggesting that they may have a bit of both worlds: the agility of smaller companies to tackle new projects quickly, and growing budgets dedicated to emerging tech (Digitalist Magazine).
Gauging maturity in the year ahead
In the next 12 months, we expect the number of companies in the earliest machine learning stages (evaluating use cases and starting to develop models) to expand as ML becomes ever more ubiquitous in the enterprise. Eventually, that early-stage group will shrink as companies proceed through the machine learning lifecycle.
Anticipated maturity stage in the next 12 months
The bottom line is that we do see a shift toward greater ML maturity in all companies surveyed; however, those in mid-to-late stages of maturity are still quite low in number. We expect to see that group grow over the course of the next 12 months as companies overcome last-mile ML problems and align stakeholders toward building sophistication into their ML programs.
Key finding 4: An unreasonably long road to deployment
Model deployment timeline
A new metric we are beginning to track this year is the time it takes an organization to deploy a single ML model.
Of companies surveyed, just about half say they spend between 8 and 90 days deploying one model. And 18 percent of companies are taking longer than 90 days, with some spending more than a year productionizing!
We thoroughly understand that there are many challenges to overcome when building a robust ML lifecycle, deployment being a large one (Medium).
That being said, we would still have expected the percentage of companies that deploy models in less than a week to be significantly larger than 14 percent, based on the number of companies in the early stage maturity level (models in production for 1-2 years). Company size and maturity level provide some context to explain this relatively low number.
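A team that wants to compare its own deployment times against these survey ranges could bucket them as sketched below. The 0-7, 8-30, and 31-90 day ranges and the "more than a year" category come from the report text; the exact boundary of the intermediate "91-365 days" bucket is an assumption, and both function names are hypothetical.

```python
from collections import Counter


def deployment_bucket(days):
    """Map a deployment duration in days to a survey-style range."""
    if days < 0:
        raise ValueError("deployment time cannot be negative")
    if days <= 7:
        return "0-7 days"
    if days <= 30:
        return "8-30 days"
    if days <= 90:
        return "31-90 days"
    if days <= 365:
        return "91-365 days"  # assumed boundary; the report says "4 to 12 months"
    return "more than 1 year"


def bucket_shares(durations):
    """Percentage of deployments falling in each bucket."""
    if not durations:
        return {}
    counts = Counter(deployment_bucket(d) for d in durations)
    return {bucket: 100 * n / len(durations) for bucket, n in counts.items()}


# Hypothetical durations (in days) for four model deployments.
print(bucket_shares([3, 10, 45, 45]))
```

Tracking the share in the 0-7 day bucket over time gives a team a direct analogue of the 14 percent figure quoted above.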
Model deployment timeline and company size
Companies of all sizes typically spend between 8 and 90 days deploying one model, with a few notable exceptions.
A small (and we expect decreasing) number of companies is spending more than a year deploying models, and most of those are small-to-midsize organizations.
A fairly significant portion of companies with 100 employees or fewer is spending somewhere between 8 and 30 days deploying a single model.
Moreover, there is a slight decrease in the ideal 0-7 day range as company size increases, and on the other side, there is a somewhat uniform indication that the larger the company, the more likely it will spend between 4 and 12 months deploying a model.
We assess, however, that time to deployment depends less on company size alone and more on maturity level.
Model deployment timeline and ML maturity
Most noticeable, and understandable, is the increase in the 0-7 day range as companies’ machine learning programs mature.
It follows that the more sophisticated a company’s ML efforts are, the more likely it is to deploy a model quickly.
Also noteworthy is that the more sophisticated a company becomes, the more time it spends in the 8-30 day range for model deployment.
Our best guess as to why is discussed in Key finding 5, on the challenges associated with machine learning.
In short, struggles with scale and aligning all stakeholders can add to timelines.
Data science workload and the last mile to deployment
When we look at the actual time spent deploying models, we see that at companies of all sizes, at least 25 percent of data scientist time is spent on deployment efforts. Put simply, a quarter of data science capability is lost to infrastructure tasks. In 2018, closer to 70 percent of data science capability was lost to deploying models, which, at face value, appears to imply drastic improvement. However, the data cannot tell us definitively why this large decrease occurred. Ideally, it is due to data scientists having the tools they need to deploy with ease, but based on the low number of companies in the deployed category, we are not confident in that assessment. It is more likely that data science teams are handing off more of their models to a DevOps or IT team to deploy, if deployment is happening at all.
“I’ve heard many variants of this story: they all capture a misaligned pace of work between product and machine learning teams. Ultimately, this leads to machine learning research never making it out of the lab. And yet, the best measure of impact for machine learning, if you work in a non-research institution, is whether you can use it to help your customers, and that means getting it out of the door” (Medium).
Time data scientists spend deploying models by company size
數據科學家按公司規模部署模型的時間
| Data science teams need to be able to deploy their work as quickly as possible to prevent their insights from being overcome by events (OBE); models and data change quickly, as do market opportunities. As such, an insight that comes 10 days too late is OBE and no longer useful. To that end, much of the potential of ML may have yet to be seen. “This is why AI has yet to reshape most businesses: For many companies, deploying AI is slower and more expensive than it might seem” (MIT Technology Review). In November 2019, Gartner said that the “increased use of commercial AI and ML will help to accelerate the deployment of models in production, which will drive business value from these investments” (Gartner). It went on to assess that the majority of teams developing ML capabilities are doing so using open-source tooling because of the dearth of viable commercial options. We assess that this gap is soon to be filled with companies offering a full suite of ML tooling as companies seek to become more mature in their ML lifecycles and look for third-party solutions rather than spending valuable time building ML infrastructure. | 數據科學團隊需要能夠盡快部署他們的工作,以防止他們的見解被事件(OBE)所克服;模型和數據變化很快,市場機會也在變化。因此,遲到 10 天的洞察力是 OBE,不再有用。為此,ML的許多潛力可能尚未被發現。 “這就是為什么 AI 尚未重塑大多數企業的原因:對于許多公司來說,部署 AI 比看起來更慢且更昂貴”(MIT Technology Review)。 2019 年 11 月,Gartner 表示,“商業 AI 和 ML 的使用增加將有助于加速模型在生產中的部署,這將推動這些投資的商業價值”(Gartner)。它繼續評估,由于缺乏可行的商業選擇,大多數開發 ML 功能的團隊都在使用開源工具這樣做。我們估計,隨著公司尋求在其 ML 生命周期中變得更加成熟并尋找第三方解決方案,而不是花費寶貴的時間構建 ML 基礎設施,這一差距很快就會被提供全套 ML 工具的公司填補。 |
| It is worth sharing here a word of warning from Ryan Calo, an associate law professor at the University of Washington, who is also a co-founder of the Tech Policy Lab and a leading voice on law and emerging tech issues in the media. At a recent Seattle Times-sponsored panel discussion on AI and the future of work, Calo cautioned attendees about snake oil AI companies. He described a plausible future scenario in which countless third-party AI solutions flood the market, creating a cacophony of messaging about AI necessities. The resulting confusion might allow AI firms to take advantage of non-technical customers who hope to stay competitive in their spaces. They may pay for services that are inappropriate or unnecessary for their business in order to mature their ML programs quickly. It is important at a time of rapid technological innovation, such as now, to tread intelligently and not fall victim to the “AI for AI’s sake” adage. | 值得在這里分享華盛頓大學法學副教授 Ryan Calo 的警告,他也是科技政策實驗室的聯合創始人,也是媒體上法律和新興技術問題的主要發言者。在最近由《西雅圖時報》贊助的關于人工智能和工作未來的小組討論中,Calo 提醒與會者注意蛇油人工智能公司。他描述了一個似是而非的未來場景,無數第三方人工智能解決方案涌入市場,制造了關于人工智能必需品的不和諧消息。 由此產生的混亂可能會讓人工智能公司利用那些希望在他們的領域保持競爭力的非技術客戶。他們可能會為他們的業務不合適或不必要的服務付費,以便讓他們的 ML 程序快速成熟。在像現在這樣快速技術創新的時代,重要的是要明智地行事,不要成為“AI for AI’s sake”格言的犧牲品。 |
Key finding 5: Innovation hubs and the trouble with scale創新中心和規模問題
| A crucial component of realizing ML’s full potential is scale. Can it scale? Scaling models was the biggest overall challenge cited by respondents this year (43 percent). For comparison, that percentage is up 13 percent from last year. But multiple requirements factor into scaling (hardware, modularity, data sourcing, etc.), and optimizing for it can lead to cumbersome team cross-cutting (Arc.dev). Of respondents from companies of more than 10,000 employees, 58 percent said scaling up was their top ML challenge. This may be demonstrative of decentralized organizational structures, with data science teams siloed throughout company org charts, which can cause tooling, framework, and even programming language friction when scaled. Earlier this year, Gartner predicted that “through 2020, 80 percent of AI projects will remain alchemy, run by wizards whose talents will not scale in the organization” (Gartner). This outlook may prove true, but we are skeptical based on our observations of a continuous increase in centralized innovation hubs and emerging tech centers (see Ericsson, IBM, Pfizer, etc.). We assess these hubs will be more efficient at maturing ML for their companies than the decentralized alternative (i.e., data science components siloed throughout organizations working on one-off projects and models). An innovation hub can iterate quickly, work with agility across an organization, and standardize ML efforts. They can often vet new technologies quickly, ensuring their companies keep at the bleeding edge of technological development. We anticipate this kind of centralized focus on ML and AI technologies may just turn lead into gold, so to speak. | 實現 ML 全部潛力的一個關鍵組成部分是規模。它可以擴展嗎?擴展模型是今年受訪者提到的最大的總體挑戰(43%)。相比之下,這個百分比比去年增加了 13%。但是多個需求因素會影響擴展(硬件、模塊化、數據源等),并對其進行優化可能會導致繁瑣的團隊交叉 (Arc.dev)。 在員工人數超過 10,000 人的公司的受訪者中,58% 的人表示擴大規模是他們面臨的最大 ML 挑戰。這可能表明分散的組織結構數據科學團隊分散在整個公司組織結構圖中,這可能導致工具、框架甚至編程語言在擴展時產生摩擦。 今年早些時候,Gartner 預測“到 2020 年,80% 的 AI 項目仍將是煉金術,由奇才管理,他們的才能在組織中無法擴展”(Gartner)。這種前景可能被證明是正確的,但是,基于我們對集中式創新中心和新興技術中心(參見愛立信、IBM、輝瑞等)不斷增加的觀察,我們持懷疑態度。我們評估這些中心在為他們的公司成熟 ML 方面將比分散的替代方案(即數據科學組件孤立在從事一次性項目和模型的組織中)更有效。創新中心可以快速迭代,跨組織靈活工作,并使ML工作標準化。他們通常可以快速審查新技術,確保其公司保持在技術發展的前沿。我們預計,這種對ML和AI技術的集中關注可能會讓鉛變成黃金。 |
Model reproducibility impedes ML maturity
| The second most cited ML challenge was versioning and reproducibility of models (41 percent of respondents reported this). This number is much higher than the 24 percent of respondents who cited this challenge in 2018. Machine learning requires faster iteration than the traditional software development lifecycle, and ironclad version-control is paramount for pipelining, retraining, and evaluating models for accuracy, speed, and drift. Versioning is one of the hurdles that data science and ML teams must overcome to reach more sophisticated levels of ML maturity, so in future surveys, we will be monitoring this metric closely. We expect the number of times this is cited as a challenge to decrease in the coming year. | 被引用次數第二多的 ML 挑戰是模型的版本控制和可重復性(41% 的受訪者報告了這一點)。 這個數字遠高于 2018 年提到這一挑戰的 24% 的受訪者。機器學習需要比傳統軟件開發生命周期更快的迭代,而鐵的版本控制對于流水線、再培訓和評估模型的準確性、速度和漂移至關重要。 版本控制是數據科學和 ML 團隊必須克服的障礙之一,以達到更復雜的 ML 成熟度水平,因此在未來的調查中,我們將密切監控這一指標。 我們預計,在未來一年,這一挑戰被引用的次數將減少。 |
Year comparison of machine learning challenges
機器學習挑戰的年份比較
| Reflects data only from survey Group B. Note that respondents were allowed to select more than one challenge. | 僅反映來自調查組 B 的數據。請注意,受訪者可以選擇多個挑戰。 |
Organizational misalignment and ML progress
組織錯位和機器學習進展
| The third most cited ML challenge was getting organizational alignment and senior buy-in for ML initiatives (34 percent). Notably, of the respondents who cited this challenge, 47 percent are from companies with more than 10,000 employees. Especially for decentralized organizations (no central innovation hub), trying to obtain multiple team and stakeholder concurrence may take a lot of time. In 2018, 23 percent of respondents noted stakeholder alignment as a challenge. The 24-percent increase this year might contribute to other metrics, such as number of models deployed, time to deployment, and scaling. We expect this problem to decline in coming years as ML becomes more routine, reliable, and measurable. | 第三個被引用次數最多的 ML 挑戰是組織一致性和高層對ML計劃的認同(34%)。 值得注意的是,在提到這一挑戰的受訪者中,47% 來自擁有超過 10,000 名員工的公司。 特別是對于去中心化的組織(沒有中央創新中心),試圖獲得多個團隊和利益相關者的同意可能需要很多時間。 2018 年,23% 的受訪者認為利益相關者的一致性是一項挑戰。 今年 24% 的增長可能有助于其他指標,例如部署的模型數量、部署時間和擴展。 隨著機器學習變得更加常規、可靠和可測量,我們預計這個問題將在未來幾年內減少。 |
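The year-over-year comparison above can be read in two ways, and the sketch below works both out for the 23 percent (2018) and 34 percent (this year) figures quoted in the text. The function names are illustrative only. Neither result equals the 24-percent increase mentioned in the paragraph, which may rest on a different respondent group or calculation that the report does not spell out.

```python
def percentage_point_change(old_pct, new_pct):
    """Absolute difference between two percentages, in percentage points."""
    return new_pct - old_pct


def relative_change(old_pct, new_pct):
    """Change relative to the old value, expressed as a percent."""
    return (new_pct - old_pct) / old_pct * 100


share_2018 = 23  # percent citing stakeholder alignment in 2018 (from the text)
share_2020 = 34  # percent citing it in this year's survey (from the text)

points = percentage_point_change(share_2018, share_2020)
relative = round(relative_change(share_2018, share_2020), 1)

print(f"{points} percentage points")    # absolute change
print(f"{relative} percent relative")   # change relative to the 2018 share
```

Stating which of the two measures is meant avoids ambiguity whenever survey shares are compared across years.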
Key finding 6: Budget and machine learning maturity, priorities and industry預算和機器學習成熟度、優先級和行業
| This year’s survey shows that ML budgets vary across industry and stage of maturity, but on the whole, are growing at companies of all sizes. This is in line with estimates that the AI market was worth $23.94 billion in 2018 and is expected to reach $208.49 billion by 2025 (MarketWatch). | 今年的調查顯示,ML 預算因行業和成熟階段而異,但總體而言,各種規模的公司都在增長。 這與以下估計一致:2018年人工智能市場規模為239.4億美元,預計到2025年將達到2084.9億美元(MarketWatch)。 |
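A compound annual growth rate is a percentage, not a dollar amount, so the two dollar figures quoted above are better read as start and end values from which a rate can be derived. The sketch below applies the standard CAGR formula to those two figures. The function name and the choice of a seven-year span (2018 to 2025) are assumptions made for illustration.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


start_2018 = 23.94   # USD billions, as quoted in the text
end_2025 = 208.49    # USD billions, as quoted in the text

rate = cagr(start_2018, end_2025, years=2025 - 2018)
print(f"Implied CAGR: {rate:.1%}")  # roughly 36 percent per year
```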
AI/ML budgets FY18 to FY19
AI/ML 預算 2018 財年至 2019 財年
| Twenty-one percent of respondents said budgets for AI/ML programs were growing between 26-50 percent. Forty-three percent of companies have increased their AI/ML budgets between 1 and 25 percent in the last year. And just under one-third (27 percent) of respondents noted that their budgets have not changed. This may be a reason why the majority of companies are still at early-stage maturity levels. | 21%的受訪者表示,AI/ML項目的預算增長率在26%至50%之間。43%的公司在去年將AI/ML預算增加了1%到25%。只有不到三分之一(27%)的受訪者表示他們的預算沒有改變。這可能是大多數公司仍處于早期成熟水平的一個原因。 |
Budgets and ML maturity
預算和機器學習成熟度
| Fifty-seven percent of companies in the mid-level maturity range (ML models in production for 2-4 years) increased their budgets between 1 and 25 percent. Close to 50 percent of companies at early stage maturity levels also increased their budgets between 1 and 25 percent. And nearly 40 percent of companies already at sophisticated ML maturity levels increased their budgets by as much as 25 percent. Finally, 30 percent of sophisticated maturity respondents said they increased their AI/ML budgets by 26-50 percent. We expect to see mid-stage and sophisticated companies increase their AI/ML budgets by more than a quarter in the very near term once ML has proven itself. | 57% 處于中等成熟度范圍(生產 2-4 年的機器學習模型)的公司將預算增加了 1% 到 25%。 接近 50% 處于早期成熟度水平的公司也將預算增加了 1% 到 25%。 近 40% 的公司已經處于復雜的 ML 成熟度級別,其預算增加了多達 25%。 最后,30% 成熟的受訪者表示他們將 AI/ML 預算增加了 26-50%。 一旦 ML 證明了自己,我們預計將在短期內看到中期和成熟的公司將其 AI/ML 預算增加四分之一以上。 |
FY19 AI/ML budgets and ML maturity level
2019 財年 AI/ML 預算和 ML 成熟度級別
| We assess that this upward budgetary trend is due to the fact that companies already at a mid or sophisticated level of ML maturity (models built and deployed for 2-5 years), are doubling down on their tech investment efforts. This means that companies in very early deployment stages ( just starting to develop ML models) will have to triple their efforts to stay competitive in their industry. Now is the time to start planning a 2020 (and beyond) ML strategy. From an industry perspective, there was budgetary growth in specific industries, suggesting some jockeying for ML prowess in the nascent space. | 我們評估,這種預算上升的趨勢是由于公司已經處于中等或復雜的機器學習成熟度水平(構建和部署了 2-5 年的模型),正在加倍投入技術投資。 這意味著處于非常早期部署階段(剛剛開始開發 ML 模型)的公司將不得不加倍努力以保持其行業競爭力。 現在是開始規劃 2020 年(及以后)機器學習戰略的時候了。 從行業的角度來看,特定行業的預算有所增長,這表明新興領域的ML實力受到了一些爭奪。 |
AI/ML budgets for banking and financial services
銀行和金融服務的 AI/ML 預算
AI/ML budgets for manufacturing
制造業的 AI/ML 預算
AI/ML budgets for information technology
信息技術的 AI/ML 預算
Key finding 7: Determining machine learning success across the org chart在整個組織結構圖中確定機器學習的成功
| While it is still early days for ML in the enterprise, our seventh key finding (how companies are determining what success means for their ML efforts) is likely an indicator of the path that machine learning will take as it develops throughout the enterprise (Emerj). Our hypothesis was that if ML success is primarily measured by dollars saved, then ML models designed to reduce costs are likely to be developed in droves, more so than any of countless other ML applications. The top two metrics for discerning ML success as noted by respondents this year were tied for first place: business metrics, such as guaranteed ROI, and a more technical evaluation of ML model performance. Across all industries and company size, 58 percent of respondents said ML efforts are successful if they produce ROI, reduce customer churn, aid in product adoption, and/or promote brand fidelity. And another 58 percent of respondents said ML efforts are successful when model accuracy, precision, speed, and drift meet threshold. (Note that respondents were encouraged to select more than one answer option, accounting for the more than 100 percent total.) | 雖然機器學習在企業中的應用還處于起步階段,但我們的第七個關鍵發現(公司如何確定其機器學習工作的成功意義)很可能是機器學習在整個企業發展過程中所走道路的一個指標(Emerj)。 我們的假設是,如果ML的成功主要是通過節省的資金來衡量的,那么設計用于降低成本的ML模型可能會大量開發,這比無數其他ML應用程序都要多。 今年受訪者指出,辨別機器學習成功的前兩個指標并列第一:業務指標,如保證的投資回報率,以及對機器學習模型性能的技術性評估。 在所有行業和公司規模中,58% 的受訪者表示,如果 ML 能夠產生投資回報率、減少客戶流失、幫助產品采用和/或提升品牌忠誠度,那么它們是成功的。另有 58% 的受訪者表示,當模型準確性、精度、速度和漂移達到閾值時,ML 工作是成功的。 (請注意,鼓勵受訪者選擇多個答案選項,占總數的 100% 以上。) |
| When those percentages are broken down by role, an interesting separation occurs. The individual contributor level (data scientist, software developer) values technical measures of ML success more so than the business metrics, and C-level executives and VPs generally place more value on the opposite, measuring ML success by how it ultimately benefits the company at a strategic level. The director level is in the middle, valuing both the business unit impact (ROI, budgetary, strategic planning metrics) as well as the more technical metrics surrounding model performance. We assess that the director level will prove to be the crux of ML decisions made within organizations in the coming years as they seek to demonstrate their teams’ capabilities but also prove to senior management that ML is a worthwhile investment to make. To that end, we expect to see an increase in the number of proofs of concept demanded of ML tooling companies by organizations looking to build ML programs or mature their current efforts. This expectation works into our assessment that aligning stakeholders and obtaining senior buy-in will also become less of a challenge in the near term. | 當這些百分比按角色細分時,就會發生有趣的分離。個人貢獻者級別(數據科學家、軟件開發人員)比業務指標更重視 ML 成功的技術衡量標準,而C級高管和VP通常更看重相反的衡量ML成功的方法,即從戰略層面衡量ML成功的最終收益。 director 級別處于中間位置,既重視業務部門的影響(投資回報率、預算、戰略規劃指標),也重視圍繞模型性能的更多技術指標。我們評估,director 級別將被證明是未來幾年組織內做出 ML 決策的關鍵,因為他們試圖展示其團隊的能力,同時也向高級管理層證明 ML 是值得投資的。 為此,我們預計希望構建 ML 程序或使當前工作成熟的組織對 ML 工具公司要求的概念證明數量會有所增加。這種預期符合我們的評估,即協調利益相關者和獲得高級支持也將在短期內變得不那么具有挑戰性。 |
| “Machine learning will be the biggest technological shift of our generation, enabling businesses to achieve their full potential.” Diego Oppenheimer, CEO, Algorithmia | “機器學習將是我們這一代人最大的技術轉變,使企業能夠充分發揮潛力。” Diego Oppenheimer,Algorithmia 首席執行官 |
The future of machine learning
| In our survey report from last year, we concluded that ML was very much in pioneering days, with most companies only just beginning to develop use cases, build models, and align teams. Twelve months later, we see the ML landscape already changing as early efforts to build healthy ML lifecycles become more streamlined. Our hypotheses for the near-term future include the following: (1)、A growing number of data scientists employed at mid-sized companies to help gain industry edge using ML (2)、Lower levels of customer satisfaction at large corporations as they prioritize cutting costs (3)、The advent of more innovation hubs to drive ML adoption within organizations (4)、An increase in director-level roles stewarding ML progress across all industries | 在我們去年的調查報告中,我們得出的結論是,機器學習處于非常開創性時期,大多數公司才剛剛開始開發用例、構建模型和調整團隊。12個月后,我們看到,隨著構建健康的ML生命周期的早期努力變得更加精簡,ML環境已經發生了變化。 我們對近期未來的假設包括: (1)、越來越多的數據科學家受雇于中型公司,以幫助利用機器學習獲得行業優勢 (2)、大公司的客戶滿意度較低,因為他們優先考慮削減成本 (3)、更多創新中心的出現推動組織內部采用機器學習 (4)、管理所有行業機器學習進展的director級角色增加 |
| We are convinced that company size is not a determinant of ultimate ML maturity level, and we look forward to the future where companies of all sizes in all industries can implement machine learning to automate and augment their business goals. We are particularly curious about what the near-term future holds for machine learning use cases. The trend of using ML to automate fairly formulaic tasks will soon give rise to more complex and pipelined ML workflows. As that happens, the infrastructure needed to compute those more compound applications will also change, requiring practitioners to make choices about tech tooling that may affect infrastructural performance or flexibility down the road. This year’s survey report should confirm for readers that machine learning in the enterprise is progressing in haste. Though the majority of companies are still in the early stages of ML maturity, it is incorrect to think there is time to delay your ML efforts. If your company is not currently ML–minded, rest assured your competitors are, and the rate of AI’s development is bound to increase exponentially. Now is the time to future-proof your organization with AI/ML. Join the 2020 state of enterprise machine learning conversation @algorithmia #2020StateOfML | 我們堅信,公司規模并不是最終 ML 成熟度水平的決定因素,我們期待未來所有行業的各種規模的公司都可以實施機器學習來自動化和增強他們的業務目標。 我們對機器學習用例的近期前景特別好奇。使用 ML 自動化相當公式化的任務的趨勢將很快產生更復雜和流水線化的 ML 工作流。隨著這種情況的發生,計算這些更復雜應用程序所需的基礎設施也將發生變化,要求從業者選擇可能影響基礎設施性能或靈活性的技術工具。 今年的調查報告應該向讀者證實,企業中的機器學習進展迅速。盡管大多數公司仍處于 ML 成熟的早期階段,但認為有時間推遲您的 ML 工作是不正確的。如果你的公司目前沒有ML意識,請放心,你的競爭對手是,人工智能的發展速度肯定會成倍增長。現在是時候用AI/ML證明您的組織的未來了。現在是使用 AI/ML 讓您的組織面向未來的時候了。 加入 2020 年企業機器學習狀態對話@algorithmia #2020StateOfML |
Methodology方法
| The purpose of the 2020 State of Enterprise Machine Learning report is to examine the progression of ML across the business landscape and compare the current state with that from 12 months ago to begin to identify trends, anomalies, or patterns of behavior. This report is based on data Algorithmia collected in the fall of 2019 in a two-prong survey effort that returned 745 respondents. The first prong (referred to herein as Group A) comprised a set of 20 questions pertaining to machine learning efforts, capabilities, and company demographics, and was disseminated by an independent third-party company on Algorithmia’s behalf. This was done to ensure survey attribution anonymity and remove bias for or against Algorithmia on the part of the respondents. The third party sourced a random sample panel of business leaders and ML practitioners (individual contributor, manager, director, and executives) at companies using data science for machine learning. Group A respondents voluntarily participated in the survey and were offered a small compensation by the third party for doing so. Algorithmia received the raw data following the third party’s survey completion after all identifiable respondent demographic information was removed by the third party. | 2020 年企業機器學習狀態報告的目的是檢查 ML 在整個業務領域的進展,并將當前狀態與 12 個月前的狀態進行比較,以開始識別趨勢、異常或行為模式。本報告基于 Algorithmia 在 2019 年秋季在一項雙管齊下的調查工作中收集的數據,該調查返回了 745 名受訪者。 第一個問題(在此稱為 A 組)包含一組 20 個與機器學習工作、能力和公司人口統計相關的問題,由一家獨立的第三方公司代表 Algorithmia 傳播。這樣做是為了確保調查歸因匿名并消除受訪者對算法的偏見。第三方在使用數據科學進行機器學習的公司中隨機抽取了一個由商業領袖和 ML 從業者(個人貢獻者、經理、董事和高管)組成的樣本小組。 A 組受訪者自愿參與調查,并因此獲得第三方的小額補償。在第三方刪除所有可識別的受訪者人口統計信息后,Algorithmia 在第三方完成調查后收到原始數據。 |
| The third party screened respondents using the following questions: (1)、Does your company employ data scientists? (2)、Which role best describes your title/role within your organization? (3)、Which industry do you currently work in? (4)、Which stage of ML maturity is your company in? If respondents gave specific “I do not know or I am unsure” or null answers, they were removed from the respondent pool. In this way, Algorithmia amassed a group of 303 individuals with a level of insight into the machine learning efforts of their companies across a random sampling of industries, company sizes, and machine learning maturity levels. The second prong (referred to herein as Group B) consisted of 21 questions pertaining to machine learning efforts, capabilities, and company demographics and was disseminated internally by Algorithmia with the company name and logo on it. Most questions overlapped with those in the third party’s survey, but there were several exceptions, one of which was email address, which was collected in order to fulfill 10 random $50 gift card incentives. The survey explained that a respondent’s email would be used solely for the gift card purpose, and Algorithmia maintained survey integrity by ensuring the respondents’ answers were not connected with their email addresses in any way. | 第三方使用以下問題篩選受訪者: (1)、貴公司是否雇傭數據科學家? (2)、哪個角色最能描述您在組織中的頭銜/角色? (3)、您目前從事哪個行業? (4)、貴公司處于ML成熟度的哪個階段? 如果受訪者給出了具體的“我不知道或我不確定”或空答案,他們就會從受訪者庫中刪除。通過這種方式,Algorithmia 在行業、公司規模和機器學習成熟度級別的隨機抽樣中聚集了一組 303 人,他們對他們公司的機器學習工作有一定程度的洞察力。 第二部分(此處稱為 B 組)由 21 個與機器學習工作、能力和公司人口統計相關的問題組成,由 Algorithmia 在內部傳播,上面有公司名稱和徽標。大多數問題與第三方調查中的問題重疊,但也有幾個例外,其中之一是電子郵件地址,收集該地址是為了滿足 10 個隨機的 50 美元禮品卡獎勵。該調查解釋說,受訪者的電子郵件將僅用于禮品卡目的,Algorithmia 通過確保受訪者的答案與他們的電子郵件地址沒有任何關聯來保持調查的完整性。 |
| Group B was sent to individuals who have engaged with Algorithmia in the past in various capacities (i.e., attended a company webinar, read an internal whitepaper, met with our team at an industry trade show, etc.). The 442 respondents in this group voluntarily participated in the survey and represented a diverse sampling of industries, company sizes, ML maturity levels, and organizational structures and roles. The survey was conducted in two prongs to account for unintentional bias in either group. Where possible and appropriate, the researchers averaged Groups A and B for the most accurate readings and specified when one or both groups were represented in the text or in the graphs. All percentages were rounded to the nearest whole number. The 2020 State of Enterprise Machine Learning questionnaires (Groups A and B) were developed collaboratively by the Product and Marketing teams at Algorithmia. The teams identified the key issues to measure, determined critical survey questions, and provided feedback on the draft questionnaire. The survey was designed to be self-administered and completed online in an average of six minutes. We will continue to conduct this annual survey to increase the breadth of our understanding of machine learning technology in the enterprise and share with the broader industry how ML is evolving. In doing so, we can track trends in ML development across industries over time, ideally making more informed predictions with higher degrees of confidence. | B 組被派發給過去曾以各種身份參與過 Algorithmia 的個人(即參加公司網絡研討會、閱讀內部白皮書、在行業貿易展上與我們的團隊會面等)。該組中的 442 名受訪者自愿參與了調查,代表了來自行業、公司規模、機器學習成熟度水平以及組織結構和角色的不同樣本。 該調查分兩個方面進行,以說明任一組的無意偏見。在可能和適當的情況下,研究人員對 A 組和 B 組進行平均,以獲得最準確的讀數,并指定一個或兩個組何時出現在文本或圖表中。所有百分比均四舍五入至最接近的整數。 2020 年企業機器學習現狀調查問卷(A 組和 B 組)由 Algorithmia 的產品和營銷團隊合作開發。團隊確定了要衡量的關鍵問題,確定了關鍵的調查問題,并就問卷草案提供了反饋。該調查旨在自我管理并在平均六分鐘內在線完成。 我們將繼續開展這項年度調查,以擴大我們對企業機器學習技術的理解,并與更廣泛的行業分享 ML 的發展歷程。通過這樣做,我們可以隨著時間的推移跟蹤跨行業的 ML 發展趨勢,理想情況下做出更明智的預測,并具有更高的置信度。 |
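The methodology says the two groups were averaged and the results rounded to whole numbers, but it does not say whether the average was a simple mean or weighted by group size. The sketch below shows both options using the respondent counts given in the text (303 and 442, totalling 745). The per-group percentages are hypothetical placeholders, not survey results, and the function names are illustrative.

```python
GROUP_A_N = 303  # Group A respondents (from the methodology)
GROUP_B_N = 442  # Group B respondents (from the methodology)


def simple_average(pct_a, pct_b):
    """Unweighted mean of the two group percentages."""
    return (pct_a + pct_b) / 2


def weighted_average(pct_a, pct_b, n_a=GROUP_A_N, n_b=GROUP_B_N):
    """Mean weighted by the number of respondents in each group."""
    return (pct_a * n_a + pct_b * n_b) / (n_a + n_b)


# Hypothetical group-level answers to one question; NOT taken from the report.
pct_a, pct_b = 40.0, 46.0

print(round(simple_average(pct_a, pct_b)))    # 43
print(round(weighted_average(pct_a, pct_b)))  # 44
```

Because Group B is larger, the weighted figure leans toward Group B's answer, which is why the two methods can differ by a point after rounding.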
About Algorithmia
| Algorithmia is a leader in the machine learning space. We aim to empower every organization to achieve its full potential through the use of artificial intelligence and machine learning by delivering the last-mile solution for model deployment at scale. Our technology is trusted by more than 100,000 developers, Fortune 100 financial institutions, government intelligence agencies, and private companies. Algorithmia enables customers to: (1)、Deploy models from a variety of frameworks, languages, and platforms (2)、Connect popular data sources, orchestration engines, and step functions (3)、Scale model inference on multiple infrastructure providers (4)、Manage the ML lifecycle with tools to iterate, audit, secure, and govern To learn more about how Algorithmia can help your company accelerate its ML journey, visit our website at algorithmia.com. | Algorithmia 是機器學習領域的領導者。 我們的目標是通過提供用于大規模模型部署的最后一英里解決方案,使每個組織都能夠通過使用人工智能和機器學習來充分發揮其潛力。 我們的技術受到超過 100,000 名開發商、財富 100 強金融機構、政府情報機構和私營公司的信賴。 算法使客戶能夠: (1)、從多種框架、語言、平臺部署模型 (2)、連接流行的數據源、編排引擎、step函數 (3)、對多個基礎設施提供商的規模模型推斷 (4)、使用迭代、審計、保護和治理工具管理 ML 生命周期 要詳細了解 Algorithmia 如何幫助您的公司加速其 ML 之旅,請訪問我們的網站 algorithmia.com。 |
About the cover
| The cover image is a parallel set chart—similar to a Sankey diagram. Each line-set represents a specific data category. The width of each line-set’s path is determined by the proportional amount of the category total. The line-set on the left depicts the different industries that our survey participants come from. The line-set on the right of the diagram displays the machine learning maturity level of our respondents’ companies. Reflects data only from survey Group B. | 封面圖片是一個平行集圖——類似于桑基圖。 每個行集代表一個特定的數據類別。 每個線集路徑的寬度由類別總數的比例決定。 左側的線條描繪了我們的調查參與者來自的不同行業。 圖右側的線組顯示了受訪公司的機器學習成熟度水平。 僅反映來自調查組 B 的數據。 |