wandb: A Beginner's Tutorial for a Lightweight Deep Learning Visualization Tool

Published: 2024/10/8 pytorch 102 豆豆

This article introduces a newer machine learning visualization tool that makes the AI development process simpler and clearer.

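Introduction
wandb
- Validation data visualization
- Natural language processing
Key tools
Minimal tutorial
- 1 Install the library
- 2 Create an account
- 3 Initialize
- 4 Declare hyperparameters
- 5 Log metrics
- 6 Save files
Using wandb with PyTorch
References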

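Introduction

AI projects are closely tied to data visualization.

What does gradient descent look like during training? How does the loss evolve? How does the model's accuracy change?

Knowing the answers is essential for optimizing a model.

Because AI projects usually involve huge amounts of data, inspecting and analyzing values one by one by eye is unrealistic. This is where data visualization and log analysis reports come in.

TensorBoard, which ships with TensorFlow, keeps getting better at visualizing models and training runs. It has also become increasingly bloated, though, which raises the barrier for newcomers to AI.

AI projects are becoming increasingly standardized. Taking model training and dataset preparation as examples, many large companies have released their own AutoML platforms so that engineers can spend their effort on optimization strategy rather than on data preparation and visualization.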

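wandb

wandb is short for Weights & Biases, a tool for tracking machine learning projects. It records the hyperparameters and output metrics of a training run, visualizes and compares the results, and lets you share them quickly with colleagues.

wandb gives a machine learning project an interactive, visual debugging experience. It can log charts from a Python script and show results such as loss, accuracy, and recall on a web dashboard in real time, so producing visualizations for a project takes very little time.

In summary, wandb has four core features:

Dashboard: track the training process and present visualized results
Reports: save and share details and valuable findings from training
Sweeps: optimize the model you train through hyperparameter tuning
Artifacts: dataset and model versioning

In other words, wandb is not only a data visualization tool. It also offers model and data version management, and it can help tune the models you train.

Another highlight of wandb is its broad compatibility: it can be combined with Jupyter, TensorFlow, PyTorch, Keras, scikit-learn, fast.ai, LightGBM, and XGBoost.

As a result, it saves time and effort, and it can also improve the quality of your results.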

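Validation data visualization

wandb can take a sample of the validation data and display it on the dashboard, for example handwritten digit predictions or object detection bounding boxes.

Natural language processing

Custom charts can be used to visualize attention-based NLP models.

Only two examples are given here. wandb has many more useful features, and new ones are added continually.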

Key tools

wandb (Weights & Biases) is an online tool for visualizing model training, similar to TensorBoard and very smooth to use.

The wandb library helps us track experiments, record the hyperparameters and output metrics of each run, visualize the results, and share them.

wandb is framework agnostic, meaning it can be used no matter which framework you work with. It also fits into the user's machine learning infrastructure: AWS, GCP, Kubernetes, Azure, and local machines.

The main wandb tools are:

Dashboard: track experiments and visualize results;
Reports: save and share reproducible findings;
Sweeps: optimize models with hyperparameter tuning;
Artifacts: dataset and model versioning, pipeline tracking.
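To make the Sweeps tool more concrete, here is a minimal sketch of a grid sweep configuration. The candidate values, the metric name "loss", and the helper names grid_combinations and launch_sweep are assumptions made for illustration; the parameter names reuse the hyperparameters from the tutorial below. The helper that actually talks to wandb is defined but not called, since it needs wandb installed and a logged-in account.

```python
import itertools

# Hypothetical sweep configuration: the candidate values are made up.
sweep_config = {
    "method": "grid",
    "metric": {"name": "loss", "goal": "minimize"},
    "parameters": {
        "dropout": {"values": [0.1, 0.2, 0.5]},
        "hidden_layer_size": {"values": [64, 128]},
    },
}


def grid_combinations(config):
    """List every hyperparameter combination a grid search would visit."""
    names = sorted(config["parameters"])
    value_lists = [config["parameters"][name]["values"] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


def launch_sweep(train_fn, project="my-project"):
    """Register the sweep and run it locally (needs wandb and a login)."""
    import wandb  # imported lazily so the helper above runs without wandb

    sweep_id = wandb.sweep(sweep_config, project=project)
    wandb.agent(sweep_id, function=train_fn,
                count=len(grid_combinations(sweep_config)))


combos = grid_combinations(sweep_config)
print(len(combos), combos[0])
```

Calling launch_sweep(train_fn) with a training function that calls wandb.init() and reads its hyperparameters from wandb.config would then execute one run per combination.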

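Minimal tutorial

1 Install the library

pip install wandb

2 Create an account

Register an account on the wandb website, then log in from the command line (you will be asked for your API key):

wandb login

3 Initialize

# Inside my model training code
import wandb
wandb.init(project="my-project")

4 Declare hyperparameters

wandb.config.dropout = 0.2
wandb.config.hidden_layer_size = 128

5 Log metrics

def my_train_loop():
    for epoch in range(10):
        loss = 0  # change as appropriate :)
        wandb.log({'epoch': epoch, 'loss': loss})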
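Steps 3 to 5 can be combined into one small script. The sketch below is only one possible arrangement: the function names train_loop and run_with_wandb, the halving loss, and the use of offline mode are assumptions made for illustration. Passing the logging function in as an argument lets the loop be dry-run without wandb installed.

```python
def train_loop(log_fn, epochs=10):
    """Run a toy training loop and report metrics through log_fn."""
    loss = 1.0
    for epoch in range(epochs):
        loss *= 0.5  # placeholder for a real training step
        log_fn({"epoch": epoch, "loss": loss})
    return loss


def run_with_wandb():
    """Wire the loop to wandb (needs wandb installed; offline mode skips login)."""
    import wandb

    run = wandb.init(
        project="my-project",
        config={"dropout": 0.2, "hidden_layer_size": 128},
        mode="offline",
    )
    train_loop(wandb.log, epochs=10)
    run.finish()


# Dry run without wandb: collect the logged dictionaries in a list.
records = []
final_loss = train_loop(records.append, epochs=3)
print(records)
```

Calling run_with_wandb() performs the same loop but sends each dictionary to wandb.log, so every key becomes a chart on the run's dashboard.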

6 Save files

import os

# by default, this will save to a new subfolder for files associated
# with your run, created in wandb.run.dir (which is ./wandb by default)
wandb.save("mymodel.h5")

# you can pass the full path to the Keras model API
# (model is assumed to be an existing Keras model)
model.save(os.path.join(wandb.run.dir, "mymodel.h5"))

Once wandb is in use, the model output, logs, and any files you save are synced to the cloud.

Using wandb with PyTorch

We use a very simple neural network to show how wandb is used.

First, import the required libraries:

from __future__ import print_function
import argparse
import random  # to set the python random seed
import numpy  # to set the numpy random seed
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Ignore excessive warnings
import logging
logging.propagate = False
logging.getLogger().setLevel(logging.ERROR)

# WandB: import the wandb library
import wandb

Log in to your wandb account:

# WandB: log in to your wandb account so you can log all your metrics
# (the leading "!" is notebook syntax; in a terminal run "wandb login")
!wandb login

Define the convolutional neural network:

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # In our constructor, we define the neural network architecture
        # that we'll use in the forward pass.
        # Conv2d() adds a convolution layer that generates 2-dimensional
        # feature maps to learn different aspects of our image.
        self.conv1 = nn.Conv2d(3, 6, kernel_size=5)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)

        # Linear(x, y) creates dense, fully connected layers with
        # x inputs and y outputs.
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Feed the feature maps from each convolutional layer into max_pool2d,
        # which shrinks the image representation and so reduces the number of
        # computations the network needs to perform. Then apply the relu
        # activation, which gives us max(0, max_pool2d_output).
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2(x), 2))

        # Reshape x into size (-1, 16 * 5 * 5) so we can feed the
        # convolution layer outputs into our fully connected layers.
        x = x.view(-1, 16 * 5 * 5)

        # Apply the relu activation to the first two fully connected layers.
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)

        # Finally apply log_softmax, which returns the log-probabilities of
        # each class (0-9); their exponentials add up to 1.
        return F.log_softmax(x, dim=1)
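The 16 * 5 * 5 input size of the first linear layer follows from the image size and the layer settings. The short sketch below works through that arithmetic in plain Python, assuming 32x32 CIFAR-10 inputs, stride 1, and no padding; the helper names conv_out and pool_out are made up for illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Output side length of max pooling with stride equal to the kernel size."""
    return size // kernel


side = 32                               # CIFAR-10 images are 32x32
side = pool_out(conv_out(side, 5), 2)   # conv1 -> 28, pool -> 14
side = pool_out(conv_out(side, 5), 2)   # conv2 -> 10, pool -> 5
flat_features = 16 * side * side        # 16 output channels from conv2
print(side, flat_features)
```

If the input resolution or a kernel size changes, the same calculation gives the new value to use in the first nn.Linear layer.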

Define the training function:

def train(config, model, device, train_loader, optimizer, epoch):
    # Switch the model to training mode. This is necessary for layers like
    # dropout and batchNorm, which behave differently in training and
    # evaluation mode.
    model.train()

    # Loop over the data iterator, feed the inputs to the network,
    # and adjust the weights.
    for batch_id, (data, target) in enumerate(train_loader):
        # Only train on the first 21 batches of each epoch to keep the demo fast.
        if batch_id > 20:
            break

        # Load the input features and labels from the training dataset.
        data, target = data.to(device), target.to(device)

        # Reset the gradients to 0 for all learnable weight parameters.
        optimizer.zero_grad()

        # Forward pass: pass image data from the training dataset and predict
        # which class each image belongs to (0-9 in this case).
        output = model(data)

        # Define our loss function, and compute the loss.
        loss = F.nll_loss(output, target)

        # Backward pass: compute the gradients of the loss with respect to
        # the model's parameters.
        loss.backward()

        # Update the neural network weights.
        optimizer.step()
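The network returns log-probabilities from log_softmax, and the training function scores them with F.nll_loss. Together these amount to the usual cross-entropy loss. The following pure-Python sketch reproduces the calculation for a single example; the scores and the target class are made-up values, and the two helper functions are simplified stand-ins rather than the PyTorch implementations.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return [v - log_sum for v in logits]


def nll_loss(log_probs, target):
    """Negative log-likelihood of the target class for one example."""
    return -log_probs[target]


logits = [2.0, 1.0, 0.1]   # made-up scores for three classes
log_probs = log_softmax(logits)
loss = nll_loss(log_probs, target=0)
print(round(loss, 4))
```

The loss shrinks toward 0 as the score of the target class grows relative to the others, which is what the optimizer step pushes the weights toward.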

Define the test function:

# wandb.log records metrics (accuracy, loss, and epoch) so that the network's
# performance can be checked at any time.
def test(args, model, device, test_loader, classes):
    # Switch the model to evaluation mode. This is necessary for layers like
    # dropout and batchNorm, which behave differently in training and
    # evaluation mode.
    model.eval()
    test_loss = 0
    correct = 0
    example_images = []

    with torch.no_grad():
        for data, target in test_loader:
            # Load the input features and labels from the test dataset.
            data, target = data.to(device), target.to(device)

            # Predict which class each image belongs to (0-9 in this case).
            output = model(data)

            # Sum up the batch loss.
            test_loss += F.nll_loss(output, target, reduction='sum').item()

            # Get the index of the max log-probability.
            pred = output.max(1, keepdim=True)[1]
            correct += pred.eq(target.view_as(pred)).sum().item()

            # Log the first image of each batch, along with its predicted and
            # true labels, by passing the image tensor to wandb.Image.
            example_images.append(wandb.Image(
                data[0],
                caption="Pred:{} Truth:{}".format(
                    classes[pred[0].item()], classes[target[0].item()])))

    # Average the summed loss over the test set so the value does not depend
    # on the dataset size.
    test_loss /= len(test_loader.dataset)

    # wandb.log(a_dict) logs the keys and values of the dictionary passed in
    # and associates the values with a step. You can log anything by passing it
    # to wandb.log(), including histograms, custom matplotlib objects, images,
    # video, text, tables, html, point clouds and other 3D objects.
    # Here we log test accuracy, loss, and some test images (along with their
    # true and predicted labels).
    wandb.log({
        "Examples": example_images,
        "Test Accuracy": 100. * correct / len(test_loader.dataset),
        "Test Loss": test_loss})
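The test function accumulates a summed loss and a count of correct predictions across batches. One way to turn those totals into per-sample metrics is sketched below in plain Python; the function name summarize and the batch numbers are made up for illustration, and the dictionary keys simply mirror the ones passed to wandb.log above.

```python
def summarize(batches):
    """Aggregate (summed_loss, num_correct, batch_size) tuples into metrics."""
    total_loss = sum(b[0] for b in batches)
    correct = sum(b[1] for b in batches)
    seen = sum(b[2] for b in batches)
    return {
        "Test Accuracy": 100.0 * correct / seen,
        "Test Loss": total_loss / seen,
    }


# Made-up numbers for three test batches of 10 images each.
metrics = summarize([(12.0, 7, 10), (9.0, 8, 10), (9.0, 9, 10)])
print(metrics)
```

Dividing by the number of samples seen keeps the loss comparable between test sets of different sizes, which makes runs easier to compare on the dashboard.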

Initialize a wandb run and set the hyperparameters:

# Initialize a new run
wandb.init(project="pytorch-intro")
wandb.watch_called = False  # Re-run the model without restarting the runtime, unnecessary after our next release

# config is a variable that holds and saves hyperparameters and inputs
config = wandb.config          # Initialize config
config.batch_size = 4          # input batch size for training (default: 64)
config.test_batch_size = 10    # input batch size for testing (default: 1000)
config.epochs = 50             # number of epochs to train (default: 10)
config.lr = 0.1                # learning rate (default: 0.01)
config.momentum = 0.1          # SGD momentum (default: 0.5)
config.no_cuda = False         # disables CUDA training
config.seed = 42               # random seed (default: 42)
config.log_interval = 10       # how many batches to wait before logging training status

Main function:

def main():
    use_cuda = not config.no_cuda and torch.cuda.is_available()
    device = torch.device("cuda:0" if use_cuda else "cpu")
    kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}

    # Set random seeds and deterministic pytorch for reproducibility
    # random.seed(config.seed)        # python random seed
    torch.manual_seed(config.seed)    # pytorch random seed
    # numpy.random.seed(config.seed)  # numpy random seed
    torch.backends.cudnn.deterministic = True

    # Load the dataset: we're training our CNN on CIFAR10.
    # First we define the transformations to apply to our images.
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

    # Now we load our training and test datasets and apply the
    # transformations defined above.
    train_loader = DataLoader(
        datasets.CIFAR10(root='./data', train=True,
                         download=True, transform=transform),
        batch_size=config.batch_size, shuffle=True, **kwargs)
    # The test loader uses the separate test batch size declared in config.
    test_loader = DataLoader(
        datasets.CIFAR10(root='./data', train=False,
                         download=True, transform=transform),
        batch_size=config.test_batch_size, shuffle=False, **kwargs)

    classes = ('plane', 'car', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck')

    # Initialize our model, recursively go over all modules and convert their
    # parameters and buffers to CUDA tensors (if device is set to cuda).
    model = Net().to(device)
    optimizer = optim.SGD(model.parameters(), lr=config.lr,
                          momentum=config.momentum)

    # wandb.watch() hooks into the model and logs its gradients to your
    # dashboard. Using log="all" also logs histograms of parameter values
    # in addition to gradients.
    wandb.watch(model, log="all")

    for epoch in range(1, config.epochs + 1):
        train(config, model, device, train_loader, optimizer, epoch)
        test(config, model, device, test_loader, classes)

    # Save the model checkpoint, then upload the file to the cloud.
    torch.save(model.state_dict(), 'model.h5')
    wandb.save('model.h5')


if __name__ == '__main__':
    main()

References

https://www.jianshu.com/p/148c108b00f0

https://zhuanlan.zhihu.com/p/266337608
