7 Things You Must Know If You're Trying to Enter Data Science
No way. No freaking way am I getting into data science any time soon… That is exactly what I thought a year ago.
A little bit about my data science story: I was a complete beginner in the data science field, and exactly 6 months ago I was desperately looking to switch from digital marketing to data science. You may want to ask: why desperately? Well, because I became overconfident in my job-hunting abilities and resigned from my previous job without a backup. I started panicking during the last few days of my notice period. All the courses and tutorials available online, and the sheer number of topics I had to cover to get started in data science, were overwhelming for me. They say time flies, and boy do I agree! I am already half a year into my first data science job. I cannot wait to share all the learnings and experiences with you. If you are currently in the same shoes as I was, keep reading for insights and motivation.
1. Practice more than you read:
I remember going through every single data science boot camp course available on Udemy and buying a couple of top-rated courses that covered Python, SQL, Tableau and Machine Learning topics. (Pro tip: Don’t go for generic “Data Science boot camps”. These courses don’t cover important topics in depth. Instead, try tool-specific boot camps like a Python boot camp, SQL boot camp, Deep Learning boot camp etc.) The courses were all detailed and honestly very helpful. But even after 50+ hours of lectures and many assignments, I was still someone with no data science experience. Even the basic analysis tasks in the first month of my job were difficult for me. I was struggling to meet deadlines.
Looking back, I feel that I focused more on learning and less on practicing. I listened to all the lectures, each of which covered new topics, did some teeny tiny assignments and thought I was doing it all the right way. However, I think of it all very differently now. Learning should happen through practicing and implementing new ideas. That is when you make mistakes, observe new things, research how to code the solution in a better way and, you know, really learn. This certainly happened after starting my latest job, as I had to work on new ideas every day and implement them. Trust me, that is when I picked up actual skills. If you are in the online course phase, spare some time to build projects and implement the topics you learned.
2. Coding skills:
Most people who try to enter this field have a slight misconception that data science involves less coding than software engineering. There is a little bit of truth to it. If you take Python, a widely used language in data science, there are ready-made libraries for almost all types of algorithms and operations. Though these libraries are very helpful, there is only so much they can do. I for one thought that data science was all about data analysis, plots, model fitting, prediction and accuracy metrics. These things are of course a part of it, but software engineering is another huge part. For example, when you want to build a production-level product recommendation engine pipeline, you will have to work on many things like SQL scripts, data sync, training, tuning, prediction, evaluation frameworks, unit testing, logging, dashboards, an admin panel, model deployment, version control and so much more. All of this combined involves a hell of a lot of critical thinking and coding. This is the kind of stuff you will work on in the long run, or maybe in your first few months! I am not saying that you need to know everything about coding, but some level of coding proficiency will be needed and will be useful for you.
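To make the "unit testing" and "logging" items from that list a little more concrete, here is a minimal sketch of what one small, testable pipeline step might look like. Everything in it is an assumption made for illustration: the function name `top_n_popular`, the logger name and the popularity baseline are hypothetical stand-ins, not the pipeline I actually worked on.

```python
import logging
from collections import Counter

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("recommender")  # hypothetical logger name


def top_n_popular(purchases, n=3):
    """Return the n most frequently purchased product IDs.

    `purchases` is a list of (user_id, product_id) pairs. A popularity
    baseline like this is only a stand-in for a real recommendation model.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    counts = Counter(product for _, product in purchases)
    # Logging the input size makes silent data-sync problems easier to spot.
    logger.info("Scored %d products from %d purchases", len(counts), len(purchases))
    return [product for product, _ in counts.most_common(n)]


def test_orders_by_frequency():
    purchases = [("u1", "a"), ("u2", "a"), ("u3", "b"),
                 ("u1", "c"), ("u2", "b"), ("u3", "a")]
    assert top_n_popular(purchases, n=2) == ["a", "b"]


def test_rejects_non_positive_n():
    try:
        top_n_popular([], n=0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for n=0")


if __name__ == "__main__":
    test_orders_by_frequency()
    test_rejects_non_positive_n()
    print("all tests passed")
```

The point is not the recommendation logic itself, but that even a tiny step gets input validation, a log line and tests around it. That surrounding code is where a lot of the day-to-day work goes.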
3. No pressure to learn every single data science tool:
There are way too many data science tools on the market and it can be quite confusing to figure out where to start. The best option is to learn one data-science-friendly coding language, one database tool and one visualization tool. This is a good way to begin, and it is the basic requirement for many entry-level roles. When you are just laying the foundation, don’t pressure yourself to learn too many tools. Instead, take things slowly. Understand the basics and explore topics in depth in whatever tool you learn. You will eventually learn many tools on the job due to project requirements, or just while working on your passion projects.
I started with Python, SQL and Tableau when I was searching for a job. Nothing more. Now I know how to work with a couple of other tools like Spark, HBase, Kibana, Dash, Elasticsearch and Logstash. I am sure I will have to learn new tools in the coming days. The point is, learn a tool with utmost clarity about how it will be useful for your requirement.
4. You are ready to take interviews:
Tell that to yourself whenever you feel like skipping an interview call or meeting because your brain is telling you that you are not going to make it. I cannot remember the number of times I learned something new while attending an interview. It was either about the data science industry, a new product or just a concept. I am not suggesting that you attend interviews randomly to learn stuff. That would be an obvious waste of time for the poor interviewer. Data science is a vague term and so are the job requirements for every data science role. You might never feel ready if you want to tick every single job requirement before attending an interview.
The preparation phase can be a long one too. It depends on your learning speed and prior knowledge. It is very easy to get stuck in that phase because there are too many topics to cover. Set goals during interview preparation and, as you achieve those goals, start looking for interview opportunities. Every time you fail an interview, you will find an area to improve on or learn about a new market requirement. And that, my friend, will help you in the next interviews.
5. Ideal companies to apply to for data science roles:
Usually, people are flexible about roles and companies when applying for interviews as beginners. But if you are wondering what type of company you should apply to for a data science role, the answer is completely subjective. Let us talk about product-based and service-based companies from a data science perspective. Service companies usually work on one-time data analyses or prototypes, whereas product companies involve rigorous software development, and data analysis is just a part of it. Python, R, PowerPoint and Excel will do the job for you on most days in service companies, whereas product companies will want you to work with whatever tool is required to do the job. Basically, product companies will involve a lot of software engineering in addition to data analysis.
Product companies work on projects that improve their products by incorporating data science into them, or they build new data-based products like product recommendation engines and AI-based chatbots, or they just use analytics to make better decisions in the organization. Service companies work on analytics projects purely based on client requirements. So, like I said, it is up to your interests. Choose wisely!
6. Data Science can be frustrating:
Data-based problems are very interesting to work on, but some can be equally frustrating. One of the difficult aspects of your work will be to patiently wait for good results. Often you might not know whether you are going in the right direction. There are too many unknowns, and a lot of things in your project will require plain trial and error to arrive at an optimal solution. Like they say, it is all fun and games until you reach the hyper-parameter tuning part of your model :)
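For a feel of what that trial and error looks like, here is a toy sketch of a grid search written in plain Python. It is an illustration under stated assumptions rather than how any particular library does it: the model (a single weight fitted by gradient descent), the helper names `train`, `mse` and `grid_search`, and the candidate values are all made up for this example.

```python
from itertools import product


def train(xs, ys, learning_rate, epochs):
    """Fit y = w * x by plain gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


def mse(w, xs, ys):
    """Mean squared error of the model y = w * x on the given points."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search(train_set, valid_set, learning_rates, epoch_options):
    """Try every combination and keep the one with the lowest validation error."""
    best = None
    for lr, epochs in product(learning_rates, epoch_options):
        w = train(*train_set, learning_rate=lr, epochs=epochs)
        score = mse(w, *valid_set)
        if best is None or score < best["score"]:
            best = {"learning_rate": lr, "epochs": epochs, "w": w, "score": score}
    return best


if __name__ == "__main__":
    # Toy data generated from y = 3x, split into training and validation parts.
    train_set = ([1, 2, 3, 4], [3, 6, 9, 12])
    valid_set = ([5, 6], [15, 18])
    # The largest learning rate here makes training diverge, which is exactly
    # the kind of thing you only find out by trying.
    best = grid_search(train_set, valid_set, [0.001, 0.01, 0.05, 0.2], [10, 100])
    print(best)
```

Even in this tiny example, some settings converge too slowly and one blows up. On a real model each try can take hours, which is where the patience comes in.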
Most of us do a Proof of Concept (POC) before implementing any solution. But sometimes even POCs fail to reveal certain hiccups you might face during the actual task. For example, once at work, we spent an entire month researching and implementing a solution for our pipeline. It eventually didn’t work out. We had to start all over again, and this caused a huge delay in a project that was supposedly performing well. The key takeaway from a couple of incidents like this is to always set clear goals and evaluate your POC thoroughly. When you are stuck at a point for too long, just remember to try fast, fail fast, evaluate fast and try again fast. Being fast is super important for good progress.
7. Your storytelling skills will matter a lot:
You will most likely be dealing with customers from non-technical backgrounds. Your organization's leaders may not be data scientists. Your own teammates might be from diverse backgrounds (pure mathematicians, some API users etc.). These are the people who will recognize your work and add value to it.
It is so important that you communicate your thoughts, ideas, analyses and results to your audience in an interactive and understandable way. I clearly remember struggling in my first team meeting with the CEO, where we had to explain the progress of our projects and discuss use cases and future AI goals. That is when it hit me that numbers and analytical skills alone are not enough. A good story explaining the analysis can interest your manager. A story explaining how a particular data science solution can solve a customer's pain point can interest your customer. Different stories have different impacts on different people. Frame your story carefully with data science elements like visualizations, dashboards and reports, and put everything you have into it while delivering it.
Final Thoughts:
Data Science is no rocket science. If I can do it, then you can do it too! There is no better time than now to enter this fast-growing field. That being said, it definitely gets a little bit tough to keep up with the competition and all the new things happening in this field. But what matters is that we learn, implement, make mistakes and grow consistently. Happy analyzing :)
Translated from: https://medium.com/swlh/7-things-you-must-know-if-youre-trying-to-enter-data-science-2a9a531750e0