Clustering Music with Python and R to Create Playlists on Spotify

Published: 2023/12/15

Spotify is one of the most popular music platforms for discovering new music. The company uses many different algorithms to recommend new music to users based on their preferences, and most of these recommendations appear in playlists. These playlists are created for different users across a wide variety of genres, and Spotify can even recommend new music based on mood.

Music has been part of my daily routine all my life. It's a kind of drug that I need when I'm doing housework, working at the office, walking the dog, working out, and so on. I have a lot of music on Spotify that I have always wanted to separate according to the similarities between songs and save into different playlists. Fortunately, with a little knowledge of Machine Learning algorithms and Python, I could achieve that goal!

To do that, I will first list the tools required and some definitions of the Spotify Audio Features that I will use to build the clustering model.

Tools:

Pandas and NumPy for data analysis.
Sklearn to build the Machine Learning model.
Spotipy Python library.
Spotify credentials to access the API database and modify playlists.

Spotify Audio Features:

Spotify uses a series of different features to classify tracks. I copied the following definitions from the Spotify webpage.

Acousticness: A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic.
Danceability: Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.
Energy: Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.
Instrumentalness: Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly “vocal”. The closer the instrumentalness value is to 1.0, the greater likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0.
Liveness: Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides a strong likelihood that the track is live.
Loudness: The overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing the relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.
Speechiness: Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audiobook, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks.
Valence: A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).
Tempo: The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, the tempo is the speed or pace of a given piece and derives directly from the average beat duration.

To reduce the amount of information, I decided to use only Loudness, Valence, Energy, and Danceability, because they do the most to differentiate Energetic songs from Relaxed songs.

1. Obtaining and Analysing the Data:

My favorite band is Radiohead, so I decided to obtain their discography along with all the music its members have released in their solo careers.

Using the Spotipy library I created some functions to download all the songs of Radiohead, Thom Yorke, Atoms For Peace, Jonny Greenwood, Ed O’Brien, Colin Greenwood, and Phil Selway (yes, I’m obsessed with their music, hehe). You can find those functions in my GitHub repository.
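The download functions themselves live in the repository, so what follows is only a minimal sketch of how such a function could look, not the author's actual code. It assumes `sp` is an authenticated `spotipy.Spotify` client and uses the Spotipy calls `artist_albums`, `album_tracks`, and `audio_features`. The function name `fetch_artist_tracks` is hypothetical, and for brevity the sketch ignores pagination beyond the first 50 albums and the first 50 tracks per album. Whether the audio-features endpoint is available depends on your Spotify app's access, so check that first.

```python
FEATURES = ["danceability", "energy", "valence", "loudness"]


def fetch_artist_tracks(sp, artist_id):
    """Return one dict per track with its id, name, album and audio features.

    `sp` is assumed to be an authenticated spotipy.Spotify client.
    """
    rows = []
    # First page of the artist's albums (pagination omitted in this sketch)
    albums = sp.artist_albums(artist_id, album_type="album", limit=50)["items"]
    for album in albums:
        # First page of the album's tracks (an album rarely exceeds 50 tracks)
        tracks = sp.album_tracks(album["id"], limit=50)["items"]
        names = {track["id"]: track["name"] for track in tracks}
        # One request returns the audio features for all ids of this album
        for feats in sp.audio_features(list(names)):
            if feats is None:  # the API returns None for tracks without features
                continue
            row = {"id": feats["id"], "name": names[feats["id"]], "album": album["name"]}
            row.update({key: feats[key] for key in FEATURES})
            rows.append(row)
    return rows
```

The resulting list of dicts can be passed straight to `pandas.DataFrame(rows)` to get a data frame like the one described here.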

I obtained the following data:

Data Frame with a shape of 423 rows and 10 columns. (Image by author)

I always wondered why I like Radiohead's music so much, and I realized that most of their songs tend towards melancholy. Looking at the features described above, the data showed me that Valence and Energy are below 0.5 and that Danceability tends towards low values, so I like tracks with low energy and a negative sound (there is still something left of my 2000s emo side that watched MTV videos).

Main Stats of Data Frame (Image by author)

Histograms of Songs Features (Image by author)

2. Building the Model:

I decided to use K-means clustering, an unsupervised Machine Learning algorithm, because of the size of my data (423 tracks) and because I want to create 2 playlists that separate Relaxed tracks from Energetic tracks (K=2).

Important: I’m not splitting the data into train and test sets because in this case I just want to divide all the tracks into 2 groups and create playlists from the entire dataset.

So let’s do it! I first import the libraries:

from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler

Then I need to define the features and normalize the values for the model. I will use MinMaxScaler to preserve the shape of the original distribution and scale each feature to the range 0 to 1. Once the values are in the correct format, I simply create the K-Means model and then save the labels into the main Data Frame, called “df”.

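col_features = ['danceability', 'energy', 'valence', 'loudness']
# Scale every feature to the [0, 1] range
X = MinMaxScaler().fit_transform(df[col_features])
# Fit K-Means with 2 clusters (the sklearn init name is "k-means++")
kmeans = KMeans(init="k-means++",
                n_clusters=2,
                random_state=15).fit(X)
# Store the cluster label of each track in the data frame
df['kmeans'] = kmeans.labels_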
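To make the scaling step concrete, here is a small sketch of the min-max formula, (x - min) / (max - min), applied by hand to a single column. The loudness values are made up for illustration, and `min_max_scale` is a hypothetical helper, not part of sklearn.

```python
import numpy as np


def min_max_scale(column):
    """Rescale one feature column linearly so that min -> 0 and max -> 1."""
    column = np.asarray(column, dtype=float)
    lo, hi = column.min(), column.max()
    if hi == lo:
        # A constant column carries no information; map it to zeros
        return np.zeros_like(column)
    return (column - lo) / (hi - lo)


# Made-up loudness values in dB, only for illustration
loudness_db = [-60.0, -30.0, -15.0, 0.0]
scaled = min_max_scale(loudness_db)
print(scaled)
```

Loudness lives on a dB scale of roughly -60 to 0 while the other three features already lie between 0 and 1, so without rescaling loudness would dominate the Euclidean distances that K-Means relies on.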

That’s all, I have the music clustered into 2 groups!

But now I need to study the features behind these labels, so I plot the tracks in a 3D scatter plot and then group the data frame by the K-Means labels to analyze the mean of each feature.

3D Scatter plot of tracks using the features “Energy”, “Danceability” and “Loudness” (Image by author)

Mean Features of each K-mean Label (Image by author)

As the graph shows, the values are quite well grouped: the blue points belong to label 0 and the red points to label 1. Looking at the table of means, label 0 groups the tracks with lower danceability, energy, valence, and loudness, so it corresponds to the Relaxed songs; likewise, label 1 holds the Energetic songs.
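The table of means can be reproduced with a pandas `groupby`. The sketch below uses a tiny made-up data frame (`df_demo`) rather than the real 423 tracks, so the numbers are purely illustrative. K-Means label numbers are arbitrary, so it is safer to work out which label is the relaxed one from the means than to assume it is always 0.

```python
import pandas as pd

# Tiny made-up sample standing in for the real data frame
df_demo = pd.DataFrame({
    "danceability": [0.30, 0.35, 0.60, 0.70],
    "energy": [0.20, 0.30, 0.80, 0.70],
    "valence": [0.10, 0.20, 0.50, 0.60],
    "loudness": [-18.0, -16.0, -7.0, -6.0],
    "kmeans": [0, 0, 1, 1],
})

# Mean of every feature per cluster label
cluster_means = df_demo.groupby("kmeans").mean()
print(cluster_means)

# The cluster with the lowest mean energy is treated as the relaxed one
relaxed_label = cluster_means["energy"].idxmin()
energetic_label = cluster_means["energy"].idxmax()
```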

3. Accuracy of the Model with R:

I know that clustering accuracy is somewhat subjective when evaluating the result of a clustering algorithm, but I still wanted to check whether my model separates the tracks well. So, with a little help from RStudio, I used Silhouette Analysis to measure the accuracy of my model.

In RStudio I used the libraries “cluster” and “factoextra” to calculate and visualize the Silhouette Analysis using the Euclidean distance. The complete code is in my GitHub repository:

# Calculate the Euclidean distance of my data frame values.
dd <- dist(df, method="euclidean")
# Silhouette Analysis using the K-means model (km) and distance (dd)
sil.km <- silhouette(km$cluster, dd)
fviz_silhouette(sil.km)

The result is:

Silhouette Analysis (Image by author)

Silhouette Analysis measures how close each point in a cluster is to the points in its neighboring clusters. Silhouette values lie in the range [-1, 1]. A value of +1 indicates that the sample is far away from its neighboring cluster and very close to the cluster it is assigned to. Similarly, a value of -1 indicates that the point is closer to its neighboring cluster than to the cluster it is assigned to. In my case the values are between 0.25 and 0.60, which suggests that most of the tracks are quite well grouped.
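The article runs this analysis in R, but the same measure is available in Python. As a rough sketch of an alternative, and not what was actually run here, sklearn provides `silhouette_samples` (one value per point) and `silhouette_score` (their average). The data below is synthetic: two well-separated blobs in four dimensions stand in for the four scaled features.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples, silhouette_score

# Synthetic stand-in for the scaled feature matrix: two blobs in 4 dimensions
rng = np.random.default_rng(0)
X_demo = np.vstack([
    rng.normal(loc=0.25, scale=0.05, size=(50, 4)),
    rng.normal(loc=0.75, scale=0.05, size=(50, 4)),
])

labels = KMeans(n_clusters=2, n_init=10, random_state=15).fit_predict(X_demo)

# One silhouette value per point, each in [-1, 1]
per_track = silhouette_samples(X_demo, labels, metric="euclidean")
# The average silhouette width is the mean of the per-point values
average = silhouette_score(X_demo, labels, metric="euclidean")
print(f"average silhouette width: {average:.2f}")
```

On real music data the clusters overlap far more than these blobs do, so an average well below 1 is to be expected.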

4. Creating the Playlists on Spotify:

To create the playlists and add the clustered tracks I use the Spotipy library listed in the first part of this article. You just need a client id, a client secret, and your username to use Spotify’s APIs and manage your music library.

I had to separate the clustered tracks into 2 different variables; then, with the ids of the tracks, I just create 2 new playlists and pass them the ids. The code is the following:

# Separating the clusters into new variables
cluster_0 = df[df['kmeans']==0]
cluster_1 = df[df['kmeans']==1]
# Obtaining the ids of the songs and converting the id data frame column to a list.
ids_0 = cluster_0['id'].tolist()
ids_1 = cluster_1['id'].tolist()
# Creating 2 new playlists on my Spotify user
# (sp is assumed to be an authenticated spotipy.Spotify client)
pl_energy = sp.user_playlist_create(user=username,
                                    name="Radiohead :)")
pl_relaxed = sp.user_playlist_create(user=username,
                                     name="Radiohead :(")
# Adding the tracks into the playlists
# For energetic playlist
sp.user_playlist_add_tracks(user=username,
                            playlist_id=pl_energy['id'],
                            tracks=ids_1)
# For relaxed playlist
sp.user_playlist_add_tracks(user=username,
                            playlist_id=pl_relaxed['id'],
                            tracks=ids_0)
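One practical caveat: as far as I know, the Spotify Web API accepts at most 100 tracks per add-tracks request, and both clusters here hold more than 200. A minimal sketch of a workaround is to send the ids in batches. The helper names `chunked` and `add_tracks_in_batches` are hypothetical, and `sp` is again assumed to be an authenticated `spotipy.Spotify` client.

```python
def chunked(items, size=100):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def add_tracks_in_batches(sp, user, playlist_id, track_ids, batch_size=100):
    """Add all track ids to a playlist, one request per batch."""
    for batch in chunked(track_ids, batch_size):
        sp.user_playlist_add_tracks(user=user,
                                    playlist_id=playlist_id,
                                    tracks=batch)
```

With these helpers, a call such as `add_tracks_in_batches(sp, username, pl_energy['id'], ids_1)` would take the place of the single `user_playlist_add_tracks` call.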

Finally, I have my 2 playlists of Radiohead songs, using a K-means clustering algorithm to separate the Energetic songs from the Relaxed songs!

If you want to listen to both playlists you can access them below.

Most Energetic Radiohead Songs (209 tracks)

Playlist created by the author

Most Relaxed Radiohead Songs (214 tracks)

Playlist created by the author

Conclusion

Machine Learning algorithms are a lot of fun for implementing ideas or projects related to things you like. In my case, I like music a lot, so I could use this knowledge to build a cool way of automating a task that would otherwise take a long time. I also learned more about this amazing world of Data Science and about my own musical tastes.

Translated from: https://towardsdatascience.com/clustering-music-to-create-your-personal-playlists-on-spotify-using-python-and-k-means-a39c4158589a