The Problem
There’s no dearth of online courses nowadays. In the past few years, the popularity of MOOC platforms has skyrocketed, prompting many more such platforms to crop up.
Coursera, the most popular of them all, has long been the blue-eyed boy of many learners.
However, few users are satisfied with every course on offer.
Sometimes the difficulty level is too high, the hands-on labs are outdated, the instructor hides behind technical jargon, or the videos lack contextual clarity.
With so many courses coming up, it is essential for the learner to save time and money by choosing only the best course available.
The real mettle of a course is tested by its negative reviews, for only truly disappointed users bother to write a bad review and give a low rating. That is why a noticeably high share of negative reviews tells us a lot about the quality of a course. In an age where time is money, it is important that we don’t invest time in not-so-good courses.
But how do we do that? Each course has hundreds of reviews, if not thousands, and most of the reviews displayed on the front page are the top reviews, which are, well… always good.
Also, one can’t go by the ratings alone, for they are hardly reliable. I have seen ratings as high as 3 or 4 stars accompanied by a bad review.
Necessity is the mother of invention.
I needed a tool that could help me judge which course to opt for without my having to read through every review and comment on all my prospective courses.
And that is how CourseraAnalyzer was born.
In this post, I’m going to walk you through a quick guide on how I developed this home-grown web application, which has served me well. If you want to delve deeper into the code behind it, feel free to check out the repository for this project.
The Solution
Build an application that looks up the course specified by the user, performs sentiment analysis on its reviews, and then displays the results as visualizations.
Sounds pretty simple, right?
Not really.
This involved Web Scraping + NLP + Machine Learning + Web Development. Difficult? Yes. Impossible? No.
My first step was to extract the reviews of the course.
I used BeautifulSoup for scraping (you can use Selenium if you want). Taking the course URL as input, I modified it to point to the course’s reviews page and then scraped the comments from every page.
(Note: Don’t worry if you cannot make perfect sense of this, since the code snippet is only a small piece of the entire file. My intent is just to give you a tiny glimpse of how this works under the hood.)
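To make the URL step concrete, here is a minimal sketch of how a course URL could be turned into a list of review-page URLs. It is not the project's actual code: the function names are hypothetical, and the `/reviews?page=N` layout is an assumption about how Coursera paginates reviews.

```python
from typing import List
from urllib.parse import urlencode, urlsplit, urlunsplit


def reviews_page_url(course_url: str, page: int = 1) -> str:
    """Turn a course URL into the URL of one page of its reviews.

    Assumes reviews live at <course path>/reviews?page=N. That layout is
    an assumption, not something stated in the post.
    """
    parts = urlsplit(course_url)
    # Drop a trailing slash, then strip an existing /reviews suffix so the
    # function is safe to call on a URL that already points at the reviews.
    path = parts.path.rstrip("/")
    if path.endswith("/reviews"):
        path = path[: -len("/reviews")]
    query = urlencode({"page": page})
    return urlunsplit((parts.scheme, parts.netloc, path + "/reviews", query, ""))


def reviews_page_urls(course_url: str, last_page: int) -> List[str]:
    """URLs for pages 1..last_page (last_page would be read from the pager)."""
    return [reviews_page_url(course_url, p) for p in range(1, last_page + 1)]


# Example course URL, used only for illustration.
for url in reviews_page_urls("https://www.coursera.org/learn/machine-learning/", 3):
    print(url)
```

Each of these URLs would then be fetched and handed to BeautifulSoup to pull out the review text.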
I then used a text classifier, which I had trained and pickled beforehand, to classify the comments as positive or negative reviews.
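The train-once, load-later pattern can be sketched as follows. The "model" here is only a toy dictionary of word weights standing in for the real trained classifier, and `classify` is a hypothetical helper; the point is the pickle round trip, not the quality of the predictions.

```python
import pickle

# Stand-in "model": hand-picked word weights. The real project uses a
# trained text classifier; this toy dictionary only keeps the example runnable.
toy_weights = {"great": 1.0, "clear": 0.5, "outdated": -1.0,
               "boring": -1.0, "confusing": -0.8}

# Done once, ahead of time: serialize the model (normally written to a file).
blob = pickle.dumps(toy_weights)

# Done at request time: deserialize once, then reuse for every review.
weights = pickle.loads(blob)


def classify(review, weights):
    """Label a review by the summed weight of the known words it contains."""
    words = (w.strip(".,!?") for w in review.lower().split())
    score = sum(weights.get(w, 0.0) for w in words)
    # A score of exactly zero is treated as positive in this sketch.
    return "positive" if score >= 0 else "negative"


reviews = ["Great course, very clear!", "Labs are outdated and boring."]
labels = [classify(r, weights) for r in reviews]
print(labels)
```

Loading a pickled model avoids retraining on every request, but pickle files should only be loaded from sources you trust.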
After counting the reviews, I passed the numbers to Chart.js, which displays beautiful, interactive graphs on web pages.
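As a rough illustration of that hand-off, the counts can be packed into the configuration object that Chart.js expects (`type`, plus `data` with `labels` and `datasets`). The helper name `chart_config`, the pie chart type, and the colors are assumptions made for this sketch.

```python
import json
from collections import Counter


def chart_config(labels):
    """Build a Chart.js pie-chart configuration from a list of review labels."""
    counts = Counter(labels)
    names = ["positive", "negative"]
    return {
        "type": "pie",
        "data": {
            "labels": names,
            "datasets": [{
                # Counter returns 0 for a label that never occurred.
                "data": [counts[n] for n in names],
                "backgroundColor": ["#4caf50", "#f44336"],
            }],
        },
    }


# Serialize to JSON so the page's JavaScript can consume it.
config_json = json.dumps(chart_config(["positive", "negative", "positive"]))
print(config_json)
```

On the page, the parsed object would be passed to `new Chart(ctx, config)`, where `ctx` is the target canvas.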
But there was one pesky little problem.
It took too much time.
Scraping thousands of reviews and running a text classifier on them was time-consuming, and I wanted it to go faster.
So I created a .json file that stored the list of searched courses along with their results. If the course entered by the user already existed in the .json file, I simply loaded the results instead of repeating the same process on every request.
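A minimal sketch of such a cache is shown below, assuming the results are keyed by course URL. The function names, the file name `courses.json`, and the numbers returned by `fake_analyze` are all made up for illustration; `fake_analyze` stands in for the slow scrape-and-classify step.

```python
import json
import os
import tempfile


def load_cache(path):
    """Return the cached results, or an empty dict if no cache file exists yet."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def analyze_with_cache(course_url, path, analyze):
    """Return cached results for course_url, running analyze only on a miss."""
    cache = load_cache(path)
    if course_url in cache:
        return cache[course_url]
    result = analyze(course_url)
    cache[course_url] = result
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cache, f, indent=2)
    return result


calls = []


def fake_analyze(url):
    """Placeholder for the expensive step; records each call it receives."""
    calls.append(url)
    return {"positive": 120, "negative": 15}  # made-up numbers


cache_path = os.path.join(tempfile.mkdtemp(), "courses.json")
course = "https://www.coursera.org/learn/example"
first = analyze_with_cache(course, cache_path, fake_analyze)
second = analyze_with_cache(course, cache_path, fake_analyze)
print(first == second, len(calls))
```

The second lookup is served from the file, so the expensive step runs only once per course. A cache like this never expires, which means new reviews are not picked up until the entry is removed.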
In the end, I created a small Flask app and defined the API endpoints and functions needed to tie all these pieces together into one unit. I used Heroku to deploy it on the web.
This is a screenshot of the home page of the site, https://courseraanalyzer.herokuapp.com/.
I’m clearly not the best at designing but hey, as long as it works, right?
At the bottom left is a list of courses that have already been analyzed. If the course you are looking for isn’t there, enter the URL of the course in the search box and click on ‘Analyze’.
It will give a brief report of all the reviews and display a few visualizations to help you reach a conclusion.
And that’s it!
There are many areas where this application still falls short, but for a first version it works quite well. Feel free to reach out to me if you have any queries regarding this. Thank you!
GitHub Repository:
https://github.com/sthitaprajna-mishra/coursera-analyzer
Translated from: https://towardsdatascience.com/confused-about-how-to-pick-the-best-online-course-on-coursera-d56f55aa36ae