Do Warm or Cold Filters in Your Pictures Drive More Clicks? A Machine Learning A/B Test
A/B testing on ads is the art of choosing the advertisement that best optimizes your goal (number of clicks, likes, etc.). For example, changing something as simple as the filter in your pictures may drive more traffic to your links.
In this post we will see how to evaluate the efficacy of your own A/B test using Python and Bayesian statistics.
Article TL;DR for the Busy Data Scientist
Skip to the section “The Code (starting from the end)” if you just want to copy and paste some Python code to try your own A/B test.
Skip to the section “Putting all together” if you want to see my final result.
The Problem
Given two ads (A and B), which one has the highest Click-Through-Rate (CTR)? From a statistical point of view, the problem is to learn the unobservable TRUE CTR parameter from the observed data of impressions (ad views) and clicks. Just to avoid confusion, remember that CTR is calculated as the number of clicks divided by the number of impressions:
CTR = clicks / impressions
Click-Through-Rate formula.
From this simple formula you might be thinking:
“If the equation is really just a simple division, the only thing I need to do is get the data from the ad performance, do the simple division, and the ad with the higher number is the best ad. DONE!?”
Well, not really.
Say you have only one impression and one click. That is a CTR of 100%, but you should not assume from a single impression that your TRUE CTR is actually 100%. It most likely is not, by the way. To put it simply, the observed CTR alone cannot tell us the performance of an ad. We need more data, and we also need some Bayesian statistics. But first, let me set up our ad campaign.
The Setup
Let us explore the problem a little further. To do so, I ran two real ads on Twitter with real money (I spent $20 USD, in case you are curious). The only difference between the ads was the filter on the image: one image had a cold filter and the other had a warm filter (the default filters on my smartphone). The ad was for an Amazon affiliate link to the book ‘Designing Data-Intensive Applications’ (https://amzn.to/3iycLi6).
The question is ‘Which filter maximizes the Click-Through-Rate?’, i.e., which image makes the Amazon link most likely to be clicked? Here are the ads side by side:
Almost identical ads I ran on Twitter. Both ads are the same except for the filter applied to the pictures. The left (ad A) has a cold filter, while the right (ad B) has a warm filter. Which one do you think performed better?
After running both ads for a few hours I got the following impressions and clicks:
Ad A (cold filter): 190 impressions, 13 clicks, CTR 0.068 (6.8%)
Ad B (warm filter): 143 impressions, 9 clicks, CTR 0.063 (6.3%)
From the data, we can see that A’s observed CTR is higher than B’s observed CTR (6.8% > 6.3%). The rest of this post asks whether that observed difference is enough to conclude that ad A really performs better.
One way to stretch this question to the extreme would be the following hypothetical situation: assume that ad A had 1 view and 0 clicks while ad B had 1 view and 1 click. Their observed CTRs are 0% and 100%, but no one would say that ad B performs better than ad A just from these data points. Let’s see how to estimate our CTRs while our ad campaign is running.
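To make the point about tiny samples concrete, here is a minimal simulation sketch. It is not part of the original campaign: the true CTR of 0.07 is a made-up value and `observed_ctr` is a hypothetical helper. It simply shows how widely the observed CTR can swing when there are few impressions.

```python
import random

def observed_ctr(true_ctr, impressions, rng):
    """Simulate `impressions` ad views and return clicks / impressions."""
    clicks = sum(rng.random() < true_ctr for _ in range(impressions))
    return clicks / impressions

rng = random.Random(0)  # fixed seed so the run is repeatable
TRUE_CTR = 0.07         # hypothetical value, chosen only for illustration

for n in (1, 10, 100, 1000):
    # Repeat the same imaginary campaign 200 times at this sample size.
    runs = [observed_ctr(TRUE_CTR, n, rng) for _ in range(200)]
    print(f"{n:>5} impressions: observed CTR between {min(runs):.3f} and {max(runs):.3f}")
```

With a single impression the observed CTR can only ever be 0% or 100%, whatever the true CTR is. The spread narrows as impressions accumulate, which is why the amount of data has to enter the estimate.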
The Code (starting from the end)
It will be easier to understand where we are going if we start at the end. So, let’s have a look at some code. Do not worry if you do not understand every line yet. The main point for now is the plot the code produces, and I will walk you through the code later.
Running this code in your notebook, you should get the following plot:
A/B test from our first ads. The blue line is our A ad (cold) while the orange line is our B ad (warm). The filled region under each curve is what we call the ‘Highest Density Interval (HDI)’: the region that contains 95% of the distribution of the CTR. Also, note that the averages are different from simply clicks divided by impressions. We discuss this later in the post.
This plot shows the most likely values for each CTR given the observed data. Since the regions with the most likely values (the Highest Density Intervals above) overlap, we cannot say that one ad is better than the other, at least not with only this data. For comparison, see the same code run with the following FAKE data:
Example data for A/B test. Note that the HDIs do not intersect. In this example one could be confident in accepting that ad A (blue line, cold) has a higher CTR than ad B (orange line, warm).
We will perform our A/B test in 3 simple steps: choose a prior distribution for each ad's CTR, collect the impressions and clicks, and update the prior with that data to obtain the posterior distribution.
If you paid attention to the Python code above (no judgement if you did not), you saw that we used a few keywords, most importantly prior, posterior and beta. In the next section we will jump to the math and explain why the previous Python code works. So buckle up: enter Bayesian statistics.
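The overlap check described in this section can also be sketched numerically, without a plot. The block below is an illustration rather than the code behind the figures. It assumes the flat beta(1,1) prior that the post adopts later, uses only the standard library, and approximates each 95% HDI from random samples. The helper names `posterior_samples` and `hdi` are hypothetical.

```python
import random

def posterior_samples(clicks, impressions, n_samples=20_000, seed=0):
    """Draw samples from beta(1 + clicks, 1 + impressions - clicks)."""
    rng = random.Random(seed)
    a = 1 + clicks
    b = 1 + impressions - clicks
    return [rng.betavariate(a, b) for _ in range(n_samples)]

def hdi(samples, mass=0.95):
    """Approximate the narrowest interval holding `mass` of the samples."""
    ordered = sorted(samples)
    window = int(mass * len(ordered))
    # Slide a fixed-size window over the sorted samples, keep the narrowest.
    start = min(range(len(ordered) - window),
                key=lambda i: ordered[i + window] - ordered[i])
    return ordered[start], ordered[start + window]

# Counts reported in the post after a few hours of running both ads.
hdi_a = hdi(posterior_samples(clicks=13, impressions=190, seed=1))
hdi_b = hdi(posterior_samples(clicks=9, impressions=143, seed=2))

# Two intervals overlap when each one starts before the other one ends.
overlap = hdi_a[0] <= hdi_b[1] and hdi_b[0] <= hdi_a[1]
print(f"ad A 95% HDI: {hdi_a[0]:.3f} to {hdi_a[1]:.3f}")
print(f"ad B 95% HDI: {hdi_b[0]:.3f} to {hdi_b[1]:.3f}")
print("HDIs overlap:", overlap)
```

Sampling keeps the sketch dependency-free. A library with an exact beta quantile function would give smoother interval bounds.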
The Math behind A/B Test
The Prior
The goal here is to estimate the real CTR of an ad given the observed data. Seeing the REAL CTR would require serving the ad to every single user of Twitter, which is monetarily (and often practically) impossible. So we will need to make our 20 bucks worth it. The estimate comes with an uncertainty that we will be able to quantify, which matters when deciding whether to stop an ad and move its budget to others.
Let’s call θ the parameter we want to estimate (in this case the CTR of an ad) and p(θ) the probability distribution for θ. In a previous article, I talked about p(θ) being discrete or sometimes uniform, but here we will see that using the beta distribution to describe p(θ) makes sense and is convenient for computation.
One way to think about p(θ) is as our belief about the possible values of θ. For example, say we are dealing with a problem about which we have no prior knowledge, so all possible values of θ are equally likely. In this case, beta(1,1) describes our knowledge (or lack of it) about the possible values of θ. By the same token, if we were to use θ to describe our prior belief about the probability of a coin flip coming up heads, the distribution beta(25,25) would be a good candidate: it is centered around 50%, but it still allows some room for small biases (most of the probability mass lies roughly within 0.4~0.6). See figure below:
Beta distribution examples. The left distribution is an example of a problem where we totally lack prior knowledge, so all possible values of θ are equally likely. The right one is an example of a coin-toss prior, where our prior knowledge assigns the highest probability to 50% but allows for some bias, mostly within 40%~60%.
The parameters a and b can seem a little artificial, but a way to describe a particular prior for a problem is via the prior mean m and the number of events n observed in the past. With m and n one can find the parameters a and b as follows:
a = m·n, b = (1 − m)·n, and conversely m = a / (a + b)
Formula to find the parameters a and b of a prior beta distribution given the prior mean m and the number of events n. Note that, given a and b, we can also calculate the mean of the beta distribution.
It is important to note that a beta distribution might not be the best fit for the problem. To check that, we could use model selection via Bayes factors, a topic way beyond this article. Historically, the beta distribution is used because it is very easy to compute with, especially when we combine it with Bayes' rule given some new data. Let’s have a look at that next.
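As a small sketch of the prior described above, the block below converts a prior mean m and a prior number of events n into a and b, assuming the usual parameterization a = m·n and b = (1 − m)·n. It then evaluates the beta density with the standard library only. The function names are hypothetical and the midpoint-rule integration is a rough numerical shortcut, not an exact method.

```python
import math

def beta_params(m, n):
    """Convert a prior mean m and a prior number of events n into (a, b)."""
    return m * n, (1 - m) * n

def beta_pdf(theta, a, b):
    """Density of beta(a, b) at theta, for 0 < theta < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm
                    + (a - 1) * math.log(theta)
                    + (b - 1) * math.log(1 - theta))

def beta_mass(lo, hi, a, b, steps=10_000):
    """Approximate probability mass between lo and hi (midpoint rule)."""
    width = (hi - lo) / steps
    return sum(beta_pdf(lo + (i + 0.5) * width, a, b)
               for i in range(steps)) * width

a, b = beta_params(m=0.5, n=50)   # the coin-flip prior from the text
print("a, b:", a, b, "mean:", a / (a + b))
print("flat prior density at 0.3:", beta_pdf(0.3, 1, 1))
print("beta(25,25) mass within 0.4~0.6:", round(beta_mass(0.4, 0.6, a, b), 3))
```

A larger n with the same m gives a narrower prior, which is one way to encode how strongly past observations should weigh against new data.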
Beta distributions for different values of a and b. (Source: Wikipedia)
The Posterior
The way we will do our A/B test is to assume that the prior distribution of our parameter θ (the real click-through rate of the ad) is beta(1,1), i.e., we will assume a total lack of prior knowledge and treat every possible value of θ as equally likely. Then we will update p(θ) given our observed data D, using Bayes' rule (for an introduction to Bayes' rule, see my previous article):
p(θ|D) = p(D|θ) p(θ) / p(D)
Our goal will be to calculate the posterior distribution p(θ|D) for our advertisement given the number of impressions and clicks the ad received.
First, note that we are assuming that, given an ad, the probability of it being clicked when it is seen is equal to θ and the probability of it not being clicked is equal to 1-θ. That is what we call a Bernoulli distribution.
Result: After N impressions and z clicks, if each impression follows a Bernoulli distribution with parameter θ, and if the prior distribution of θ is a beta(a,b) distribution, then the following holds:
p(θ | z, N) = beta(θ | z + a, N − z + b)
If the distribution of the data is given by a binomial distribution and the prior distribution is given by a beta distribution, the posterior distribution is also a beta distribution. When that happens we say that the binomial and the beta distributions are conjugate, and the update from the prior distribution to the posterior becomes a simple arithmetic calculation.
For a detailed proof of this result I suggest John Kruschke’s “Doing Bayesian Data Analysis” (https://amzn.to/345ouR2), or ask me on Twitter (@solvingthehuman).
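The conjugate update is short enough to sketch directly. The block below assumes the beta(1,1) prior chosen in this section and the counts reported earlier in the post. `update_beta` and `beta_mean` are hypothetical helper names, not the author's code.

```python
def update_beta(a, b, clicks, impressions):
    """Posterior beta parameters after observing the data, by conjugacy."""
    return a + clicks, b + impressions - clicks

def beta_mean(a, b):
    """Mean of a beta(a, b) distribution."""
    return a / (a + b)

prior = (1, 1)  # flat prior: every CTR is equally likely beforehand

post_a = update_beta(*prior, clicks=13, impressions=190)
post_b = update_beta(*prior, clicks=9, impressions=143)

print("ad A posterior:", post_a, "mean:", round(beta_mean(*post_a), 4))
print("ad B posterior:", post_b, "mean:", round(beta_mean(*post_b), 4))

# Updating in two batches gives the same posterior as one big update,
# which is what makes it practical to re-estimate while a campaign runs.
first_batch = update_beta(*prior, clicks=5, impressions=100)
both_batches = update_beta(*first_batch, clicks=8, impressions=90)
print("batched update matches:", both_batches == post_a)
```

The posterior means differ slightly from clicks divided by impressions because the prior contributes one imaginary click and one imaginary non-click to each ad.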
Putting all together
With the equation from the last section we can say that after N impressions and z clicks on an ad, the updated distribution of the ad’s CTR (the θ parameter) is given by:
beta(1+z, N-z+1)
Let’s see what this looks like with some real data, along with the complete Python code so you can do your own A/B tests:
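As a stand-in sketch for that kind of analysis, the block below estimates the probability that ad A's true CTR exceeds ad B's by sampling from the two beta(1+z, N-z+1) posteriors. It is an assumption-laden illustration: the end-of-campaign counts are not listed in this post, so it uses the early counts from the setup section, and `prob_a_beats_b` is a hypothetical function name.

```python
import random

def prob_a_beats_b(clicks_a, views_a, clicks_b, views_b,
                   n_samples=50_000, seed=0):
    """Monte Carlo estimate of P(CTR_A > CTR_B) under beta(1,1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_samples):
        # One plausible true CTR per ad, drawn from its posterior.
        theta_a = rng.betavariate(1 + clicks_a, 1 + views_a - clicks_a)
        theta_b = rng.betavariate(1 + clicks_b, 1 + views_b - clicks_b)
        wins += theta_a > theta_b
    return wins / n_samples

# Early counts from the post: ad A (cold) and ad B (warm).
p = prob_a_beats_b(13, 190, 9, 143)
print(f"P(CTR_A > CTR_B) is roughly {p:.2f}")
```

Re-running the function as new impressions and clicks arrive gives a running answer to which ad is ahead, and a value close to 0.5 means the data cannot yet separate the two ads.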
Real-time A/B test for the cold and warm filters. The plots are a day apart, with the third one taken two days after the start of the ad campaign. In the beginning we do not have enough data to make up our minds about any specific value for the real CTR of ad A or ad B. Moreover, we cannot say whether A or B is performing better. By the end of the campaign there is almost a 95% chance that the real CTR of the cold-filter ad is higher than that of the warm-filter ad.
Conclusion
In this article we saw how to use the beta distribution and Python to quickly decide which ad is performing better in an A/B test campaign.
Also, we saw that if you are planning to sell data science books on Twitter you might be better off using the cold filter for your pictures. I am very curious to know whether this result holds up for other people.
Let me know about your own A/B tests! Did you get results similar to mine? I am on Twitter @solvingthehuman.
More?
If you want to read:
An introduction to Bayes' rule: go here
The advantages of Bayesian over other methods: go here
Translated from: https://medium.com/solving-the-human-problem/do-warm-or-cold-filters-in-your-pictures-drive-more-clicks-a-machine-learning-python-a-b-testing-ccf5bdd89d4c