The Problem With GPT-3 Reporting
I’ve recently seen a massive number of articles about GPT-3, on Medium and elsewhere. I even wrote one. The language model is a significant development in AI, so it’s only natural that writers want to share their excitement with the world.
Here’s the problem: the ability of GPT-3 — namely the quality of its writing — is often exaggerated by published samples. In fact, there are not one, but two filters keeping the AI’s worst results from wide dissemination.
Selection bias wouldn’t be a problem if any interested reader could access the GPT-3 API and make their own observations of its ability. However, access is currently severely limited. (AI Dungeon is often used to test GPT-3 by those of us without the full version, but its creator has recently outlined how backdoor access to GPT-3 is being prevented.)
When reporting — and I use that term in its broadest possible interpretation to mean any writing about GPT-3 — is the only source of public information, selection biases ought to be considered in our understanding of the product. Here, I outline the obvious bias, and a less-obvious bias which exacerbates the issue.
1. Writing samples are selected for quality
Say I’m writing an informative piece on GPT-3. I want to demonstrate that it can put together coherent strings of sentences, so I give it a prompt and examine the output.
If I don’t like what I see, I’m likely to try again with a slightly different (perhaps longer) prompt. Even if I’m not actively selecting particular sentences that suit the purpose of my article, massaging the output creates a biased sample of writing that is not representative of GPT-3’s overall quality.
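A toy simulation makes the effect concrete. The quality scores below are entirely made up (there is no real scoring of GPT-3 outputs here); the point is only that taking the best of several re-prompted attempts systematically inflates the quality of what gets published, even without cherry-picking individual sentences.

```python
import random

random.seed(0)

def sample_quality():
    # Stand-in for the quality of one GPT-3 output, scored roughly 0-10.
    # Hypothetical distribution, chosen only for illustration.
    return random.gauss(5, 2)

# Honest reporting: publish the first output you get.
honest = [sample_quality() for _ in range(10_000)]

# "Massaged" reporting: re-prompt up to 5 times, publish the best output.
massaged = [max(sample_quality() for _ in range(5)) for _ in range(10_000)]

print(f"mean quality, first try: {sum(honest) / len(honest):.2f}")
print(f"mean quality, best of 5: {sum(massaged) / len(massaged):.2f}")
```

With these numbers, the best-of-five sample scores roughly two points higher on average than a single draw, despite coming from exactly the same underlying model.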
In the context of creating a narrative about the AI, it makes sense to showcase its best work rather than a fair representation of its limitations. This is the first problem.
2. The cooler the article, the more views
Consider the case where something does get written about a function GPT-3 cannot perform. It might be a list of writing fails, or code that doesn’t compile.
To me, that wouldn’t be an interesting piece, and I suspect it wouldn’t intrigue others either. I’m sure Tweets, Reddit posts, and longer articles detailing GPT-3’s unexpected failures are out there, but the fact of the matter is they’re not getting read.
On the surface, this doesn’t seem like a problem. It definitely isn’t necessary to read about everything that GPT-3 can’t do. The real problem is when positive results are favoured over negative ones for the same task. For example, if someone reported positive results for getting GPT-3 to write a legal document, this would undoubtedly receive more attention than an instance where the AI fails to generate a coherent document.
In essence, the way GPT-3 reporting currently works is analogous to running scientific trials without pre-registration. Publication bias, where statistically insignificant results don’t get published, can cause absurd findings to be accepted as solid research.
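The publication-bias analogy can also be sketched numerically. In the hypothetical scenario below, GPT-3 genuinely succeeds at some task half the time, but only impressive runs (at least 15 successes out of 20 attempts) get written up and read. The numbers are invented for illustration, not real measurements.

```python
import random

random.seed(0)

# Suppose GPT-3 genuinely succeeds at some task 50% of the time.
TRUE_RATE = 0.5
TRIALS, N = 10_000, 20

published = []
for _ in range(TRIALS):
    successes = sum(random.random() < TRUE_RATE for _ in range(N))
    # Only impressive runs (>= 75% success) get written up and read.
    if successes >= 15:
        published.append(successes / N)

print(f"true success rate:    {TRUE_RATE:.0%}")
print(f"published impression: {sum(published) / len(published):.0%}")
```

The handful of runs that clear the bar are exactly the lucky ones, so the published impression of the success rate sits well above the true 50%, which is the same mechanism that lets statistically unremarkable trials look like solid research.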
To be clear, I don’t think there is an imperative for writers to publish more negative results from GPT-3. There is, however, an obligation to contextualize samples with the way in which they were generated and how many negative results were obtained in the process.
After all, human selection — on the level of individual pieces of writing or how the larger body of work gets consumed — of an AI’s output is a combination of our intelligence with that of a computer program, and that’s a beautiful thing.
Translated from: https://towardsdatascience.com/the-problem-with-gpt-3-reporting-93c7b5b58400