Embeddings in Machine Learning
I first came across the concept of embeddings while developing the RNN typing practice app.
Even though I am just beginning to understand the range of uses for embeddings, I thought it would be useful to write down some of the basics.
First, let’s look at what I knew before embeddings: one-hot vectors.
Refresher on one-hot vectors
Remember one-hot vectors? No? Well, do you remember unit vectors from math class? Still no? Okay: assume that we have three labels, [🍎, 🍊, 🍌]. We want to represent these values in a way that machines can understand. Initially, we might be tempted to assign the values [1, 2, 3], but the issue here is that we don’t necessarily want a 🍌 to equal three 🍎.
We could instead assign vectors to each label, where the dimension of each vector is equal to the number of labels. In this case we have three labels, so three dimensions.
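In code, assigning one-hot vectors to the three labels might look like this minimal sketch (fruit names stand in for the emoji labels):

```python
# One label per dimension: three labels, so three-dimensional vectors.
labels = ["apple", "orange", "banana"]
one_hot = {label: [1 if i == j else 0 for j in range(len(labels))]
           for i, label in enumerate(labels)}

print(one_hot["apple"])   # [1, 0, 0]
print(one_hot["orange"])  # [0, 1, 0]
print(one_hot["banana"])  # [0, 0, 1]
```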
Embeddings
Think about a company with a million books in its catalogue that it would like to use as input data. It is not practical to create a system that needs a million-dimensional vector to represent each book. Unless they are specifically looking for something unique to each book, there must be some aspects that some books share. How can we manage that?
With one-hot vectors, each vector is unique, and the dot product of any two vectors is always 0. However, what if this is not desired and we want there to be some similarity between vectors (a non-zero dot product)? Going back to our example above, what if in our application we are looking at the shapes of fruit? It is logical that there would be some overlap of values.
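To make that concrete, here is a quick sketch showing that any two distinct one-hot vectors are orthogonal, so the encoding carries no notion of similarity:

```python
# Distinct one-hot vectors are orthogonal: their dot product is always 0.
apple = [1, 0, 0]
orange = [0, 1, 0]

similarity = sum(a * b for a, b in zip(apple, orange))
print(similarity)  # 0 -- one-hot encoding says nothing about shared shape
```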
An 🍎 and an 🍊 share a similar shape, so the values are similar. A 🍌 is quite a unique shape, but slightly more like an 🍎 than an 🍊. I also snuck a 🥭 in there. A 🥭 is a bit more oblong, but has a round top, so the numbers are a mix. One of the main benefits of using embeddings is that we can have the number of labels exceed the dimension of the embedding. In this case, we have 4 labels, but a dimension of 3.
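As a sketch of the idea, dense embeddings let similar fruits share values, which shows up as larger dot products. The vectors below are illustrative numbers I made up to match the description above, not trained values:

```python
# Hypothetical 3-dimensional "shape" embeddings for 4 labels.
# Values are hand-picked for illustration only, not trained.
embeddings = {
    "apple":  [0.9, 0.2, 0.0],
    "orange": [0.85, 0.1, 0.0],
    "banana": [0.1, 0.9, 0.1],
    "mango":  [0.4, 0.6, 0.2],
}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Similar shapes overlap more, so their dot products are larger.
print(dot(embeddings["apple"], embeddings["orange"]))   # largest: very similar shapes
print(dot(embeddings["apple"], embeddings["banana"]))   # small, but non-zero
print(dot(embeddings["orange"], embeddings["banana"]))  # slightly smaller still
```

Note that we have four labels but only three dimensions, which is exactly the property one-hot vectors cannot provide.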
However, these numbers are just me, a human, squinting my eyes and pressing numbers on my keypad. These numbers are trainable, and we can let our model adjust them during training. If our data were the ratio of the length to the width of each fruit, then after training the values might vaguely resemble what I put above.
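A minimal sketch of what “trainable” means here, using NumPy and a toy squared-error objective as a stand-in for the gradient a real model’s loss would send back to the embedding (the names and the objective are my own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, embedding_dim = 4, 3
table = rng.normal(size=(vocab_size, embedding_dim))  # the trainable lookup table

def embed(index):
    return table[index]  # an embedding lookup is just row selection

# Toy objective: pull label 0's vector toward a target vector.
target = np.array([1.0, 0.0, 0.0])
learning_rate = 0.1
for _ in range(100):
    vector = embed(0)
    gradient = 2 * (vector - target)      # gradient of squared error
    table[0] -= learning_rate * gradient  # only the looked-up row is updated

print(np.round(table[0], 2))  # close to the target after training
```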
Embedding dimensions
Here’s where the alchemy begins: the embedding dimension hyperparameter. Google’s Machine Learning Crash Course on Embeddings suggests, as a good starting point, setting the embedding dimension to roughly the fourth root of the number of possible values.
For the RNN typing practice app, there are 28 values (a-z, space, and a null character for masking). If I had followed the above formula, it would have resulted in a dimension of 2 or 3. After training, I didn’t get acceptable results. I thought some more about the problem and theorized that setting the dimension to 28 might work: that way the dimension would be high enough to compare each character against all the others. I ran tests with dimensions of [2, 28, 28² = 784], and 28 performed the best. Is there a better value? Possibly, but I was happy with the results.
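For reference, the starting-point calculation for my 28-value vocabulary, assuming the fourth-root heuristic:

```python
# Fourth-root heuristic: embedding_dim ~= vocab_size ** 0.25.
vocab_size = 28  # a-z, space, and the null masking character
suggested_dim = vocab_size ** 0.25
print(round(suggested_dim, 1))  # 2.3 -> round to a dimension of 2 or 3
```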
TL;DR
Embeddings are a basic method to encode label information into a vector. This information can begin as the representation of the label and can be trained so that similar labels have similar vectors. The possibility of representing many labels with far fewer dimensions is one of the main benefits. Additional information can also be encoded, such as positional information, which is used prominently in transformers.
Originally from: https://medium.com/swlh/embeddings-in-machine-learning-548eef7b2b5