Sign Language Classification Using Monk AI

Published: 2023/12/15

Computer Vision, Deep Learning

Monk AI is a low-code library for computer vision that supports MXNet-Gluon, PyTorch, and Keras backends. It makes it easy to solve common CV problems.

About the project:

This project classifies hand signs that correspond to letters of the alphabet, which lets us interpret what a person with a speech impairment is trying to communicate.

The project is useful and also easy to build with the MonkAI library. In this blog post, I'll share how to build it and how other projects can be built with MonkAI.

The Dataset:

Any classification task needs data to feed into the model. In this case, the original Sign Language MNIST dataset stores image pixel values in a .csv file. To make the task easier, I pre-processed the rows into .jpg images, along with a CSV file that contains the label for each image.
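
A minimal sketch of that pre-processing step, under stated assumptions: the CSV is assumed to have a header line followed by rows of the form label, then 784 pixel values (28x28, row by row). The function names and file paths are hypothetical, and saving the image uses Pillow, which is my choice here rather than something the article confirms.

```python
import csv
import io

IMAGE_SIDE = 28  # images are 28x28, as stated in the article


def parse_row(row):
    """Split one CSV row into (label, flat list of pixel values)."""
    label = int(row[0])
    pixels = [int(value) for value in row[1:]]
    expected = IMAGE_SIDE * IMAGE_SIDE
    if len(pixels) != expected:
        raise ValueError(f"expected {expected} pixels, got {len(pixels)}")
    return label, pixels


def read_rows(csv_file):
    """Yield (label, pixels) for every data row, skipping the header line."""
    reader = csv.reader(csv_file)
    next(reader)  # assumed header: label,pixel1,...,pixel784
    for row in reader:
        yield parse_row(row)


def save_jpg(pixels, path):
    """Write one grayscale image to disk; requires Pillow."""
    from PIL import Image  # imported lazily so the parsing code runs without Pillow

    image = Image.new("L", (IMAGE_SIDE, IMAGE_SIDE))  # "L" = 8-bit grayscale
    image.putdata(pixels)
    image.save(path)
```

To convert a whole file, you could loop over `read_rows(open("sign_mnist_train.csv", newline=""))` (a hypothetical file name), call `save_jpg` for each row, and write the image name and label to a new labels CSV.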

The dataset has 27,454 training images and 7,171 test images, each 28x28 pixels.

Dataset Link: https://drive.google.com/file/d/1A5cvK7bhsP3pexL4urMOX2Dof2mdFqPX/view?usp=sharing

Installing MonkAI:

I worked on this project in Google Colab, so I installed the MonkAI library there.

If you are working on a local machine or on Kaggle, you can install it there as well.
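
A minimal sketch of the install step, driven from Python. The package name `monk-colab` is my assumption about how the Colab build is published; check the Monk repository's installation notes for the package that matches your environment. The helper names are hypothetical.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the pip command for the interpreter that is currently running."""
    return [sys.executable, "-m", "pip", "install", "-U", package]


def install(package):
    """Run pip and raise CalledProcessError if the install fails."""
    subprocess.check_call(pip_install_command(package))


# Example (assumed package name, not confirmed by the article):
# install("monk-colab")
```

In a notebook, the equivalent is usually a single `!pip install -U <package>` cell; the function form is only useful when the install has to happen from a script.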

Setting up the library to perform classification:

Using Monk AI is quite simple: it takes just two or three steps to set up the library for the task.

1. Select the backend to perform the classification. Here I am using the PyTorch backend.

2. Initialize the project. This creates a separate workspace folder in which the project is stored.

3. Load the training dataset to prepare for training.

Here, dataset_path is the path to the folder that holds the training images, and path_to_csv is the path to the labels file.
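
The three steps above can be sketched roughly as follows. This is based on my understanding of the Monk v1 API, not on the article's own code: the import path `monk.pytorch_prototype`, the `model_name` and `num_epochs` defaults, and the helper names are assumptions. Only `dataset_path` and `path_to_csv` are named in the text.

```python
def load_pytorch_prototype():
    """Step 1: the backend is chosen by which prototype class you import."""
    from monk.pytorch_prototype import prototype  # assumed import path

    return prototype


def setup_experiment(prototype_cls, project, experiment, train_dir, labels_csv,
                     model_name="resnet18", num_epochs=5):
    """Steps 2 and 3: create the project workspace and point it at the data."""
    gtf = prototype_cls(verbose=1)
    # Step 2: initialize the project and experiment (creates the workspace folder).
    gtf.Prototype(project, experiment)
    # Step 3: register the training images and their labels file.
    gtf.Default(dataset_path=train_dir, path_to_csv=labels_csv,
                model_name=model_name, num_epochs=num_epochs)
    return gtf


# Example with hypothetical names and paths (requires Monk to be installed):
# gtf = setup_experiment(load_pytorch_prototype(), "sign-language", "exp-1",
#                        "train", "train.csv")
```

Passing the prototype class in as an argument keeps the backend choice in one place, so switching to another backend would only change the import.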

Compare and analyze model performance:

Often when we do transfer learning, we want to try different models to see which architecture suits the task best. Monk AI makes it easy to compare models and choose the one that performs better.

Pre-trained models available

To analyze and compare model performance, we first have to set up an experiment.

The next step is to decide which models we want to compare.

Here we start with 5 models: vgg16, vgg19, resnet18, resnet34, and resnext50_32x4d. Each is trained for 5 epochs on 5% of the training data, and the models are compared on those results.
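
A rough sketch of how that comparison might be set up. The `Analyse_Models` call and the `[name, freeze_base, use_pretrained]` entry format follow my recollection of the Monk v1 tutorials and should be treated as assumptions; the article does not say whether the base networks were frozen, so the `True, True` flags are placeholders, and the helper names are hypothetical.

```python
MODEL_NAMES = ["vgg16", "vgg19", "resnet18", "resnet34", "resnext50_32x4d"]


def build_model_specs(names, freeze_base=True, use_pretrained=True):
    """Each entry: [model name, freeze base network?, start from pretrained weights?]."""
    return [[name, freeze_base, use_pretrained] for name in names]


def compare_models(gtf, names, percent_data=5, num_epochs=5,
                   analysis_name="analyse_models"):
    """Train each model briefly on a slice of the data and return Monk's analysis."""
    models = build_model_specs(names)
    # state="keep_none" is assumed to discard the temporary sub-experiments.
    return gtf.Analyse_Models(analysis_name, models, percent_data,
                              num_epochs=num_epochs, state="keep_none")


# Example (gtf is an experiment object that has already been set up):
# analysis = compare_models(gtf, MODEL_NAMES)
```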

After calling the analyzer function, the process takes about 10 minutes on a standard GPU. Once it finishes, the results for each model are displayed.

This gives a clear idea of which model is likely to perform better on this task.

Perform classification:

Once the model is chosen, we pass the training dataset again.

Once this runs, you will see an output summarizing the dataset setup.

The output shows that a train-validation split of 0.7 is applied automatically, along with some transformations on the training and validation data. You are free to change these values as required.

Now we will perform training.

This starts the training process. Once it completes, we test the model's performance.
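
Putting the steps of this section together, a sketch under assumptions: `Default`, `update_trainval_split`, `Reload`, and `Train` are the Monk v1 calls as I understand them, and the helper name, model name, and paths are hypothetical. The optional split override illustrates changing the automatic 0.7 value.

```python
def train_chosen_model(gtf, train_dir, labels_csv, model_name, num_epochs,
                       trainval_split=None):
    """Load the data with the chosen model, optionally change the split, then train."""
    gtf.Default(dataset_path=train_dir, path_to_csv=labels_csv,
                model_name=model_name, num_epochs=num_epochs)
    if trainval_split is not None:
        # Override the default train-validation split; as far as I know,
        # Monk needs a reload for updated settings to take effect.
        gtf.update_trainval_split(trainval_split)
        gtf.Reload()
    gtf.Train()
    return gtf


# Example with hypothetical values:
# train_chosen_model(gtf, "train", "train.csv", "resnet18", 5, trainval_split=0.8)
```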

Testing the model:

To test the model, we use the test dataset that is already provided with the main dataset.

We load the experiment in inference mode to check the model's performance.

After the test dataset is loaded, the output shows the number of test images and the number of classes.
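
A sketch of loading the experiment for testing, assuming the Monk v1 calls `Prototype(..., eval_infer=True)`, `Dataset_Params`, and `Dataset` behave as I remember them. The helper name and the paths are hypothetical; the project and experiment names must match the ones used for training.

```python
def load_for_inference(prototype_cls, project, experiment, test_dir, test_csv):
    """Reopen a trained experiment in inference mode and attach the test data."""
    gtf = prototype_cls(verbose=1)
    # eval_infer=True is assumed to load the trained experiment instead of creating one.
    gtf.Prototype(project, experiment, eval_infer=True)
    gtf.Dataset_Params(dataset_path=test_dir, path_to_csv=test_csv)
    gtf.Dataset()
    return gtf


# Example with hypothetical names and paths (requires Monk to be installed):
# from monk.pytorch_prototype import prototype
# gtf = load_for_inference(prototype, "sign-language", "exp-1", "test", "test.csv")
```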

Evaluation takes some time to run, depending on the size of the test dataset. Once it completes, you will see the accuracy results, including class-based accuracy.

The results can differ in your case, depending on how you set the hyperparameters.
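
To make the class-based accuracy metric concrete, here is a small illustration in plain Python. It is not Monk's implementation (as far as I know, Monk reports these numbers itself from its `Evaluate()` call); the function name and the toy labels in the usage note are made up for the example.

```python
from collections import defaultdict


def class_based_accuracy(true_labels, predicted_labels):
    """Return {label: fraction of that label's samples predicted correctly}."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    totals = defaultdict(int)
    correct = defaultdict(int)
    for truth, prediction in zip(true_labels, predicted_labels):
        totals[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    return {label: correct[label] / totals[label] for label in totals}
```

For example, with true labels `["A", "A", "B", "B"]` and predictions `["A", "B", "B", "B"]`, class A scores 0.5 and class B scores 1.0, even though the overall accuracy is 0.75. This is why the per-class view is useful for spotting letters the model confuses.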

Apart from class-based accuracy, we can also check the model's performance on single images.

In my run, it predicted the class of the image with around 99% confidence.
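
A sketch of a single-image check, assuming the Monk v1 `Infer(img_name=...)` call on an experiment loaded in inference mode. The helper name and image path are hypothetical, and I have not assumed anything about the structure of the returned predictions beyond passing them back.

```python
def predict_image(gtf, image_path):
    """Run the trained model on one image and return Monk's prediction output."""
    return gtf.Infer(img_name=image_path)


# Example with a hypothetical path:
# predictions = predict_image(gtf, "test/0.jpg")
# print(predictions)
```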

In this way, using the Monk AI library, you can perform classification tasks, append custom layers to the network, and build better classifiers.

Translated from: https://medium.com/towards-artificial-intelligence/sign-language-classification-using-monkai-f481f6c26fd0