cuDNN Functional Modules Explained
Abstract
This overview of the cuDNN 8.0.4 Developer Guide covers cuDNN features such as customizable data layouts that support flexible dimension ordering, striding, and sub-regions of the 4D tensors used as inputs and outputs to all of its routines. This flexibility allows easy integration into any neural network implementation; a short tensor-descriptor sketch illustrating it follows the reference links below.
To access the cuDNN API reference, see the cuDNN API Reference Guide:
https://docs.nvidia.com/deeplearning/cudnn/api/index.html
For previously released cuDNN developer documentation, see the cuDNN Archives:
https://docs.nvidia.com/deeplearning/cudnn/archives/index.html
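As a minimal sketch only (cuDNN 8.x and a CUDA-capable GPU assumed; the shape and strides are illustrative, not taken from the guide), the following shows how a 4D tensor descriptor with explicit strides expresses the flexible dimension ordering, striding, and sub-regions mentioned above:

```c
/* Sketch: a 4D NCHW tensor descriptor with explicit strides.
 * Fully packed strides are used here; larger strides could describe a
 * sub-region (a view) of a bigger allocation without copying data. */
#include <cudnn.h>

int main(void) {
    cudnnTensorDescriptor_t desc;
    cudnnCreateTensorDescriptor(&desc);

    int n = 1, c = 3, h = 224, w = 224;          /* illustrative shape */
    cudnnSetTensor4dDescriptorEx(desc, CUDNN_DATA_FLOAT,
                                 n, c, h, w,
                                 c * h * w,      /* nStride */
                                 h * w,          /* cStride */
                                 w,              /* hStride */
                                 1);             /* wStride */

    cudnnDestroyTensorDescriptor(desc);
    return 0;
}
```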
- Overview
NVIDIA® CUDA® Deep Neural Network library™ (cuDNN) is a GPU-accelerated library for deep neural networks. The cuDNN Datatypes Reference describes all the types and enums of the cuDNN library API. The cuDNN API Reference describes the API of all the routines in the cuDNN library.
The cuDNN library, together with this API documentation, has been split into the following libraries:
cudnn_ops_infer - This entity contains the routines related to cuDNN context creation and destruction, tensor descriptor management, tensor utility routines, and the inference portion of common ML algorithms such as batch normalization, softmax, and dropout.
cudnn_ops_train - This entity contains common training routines and algorithms, such as batch normalization, softmax, and dropout. The cudnn_ops_train library depends on cudnn_ops_infer.
cudnn_cnn_infer - This entity contains all routines related to convolutional neural networks needed at inference time. The cudnn_cnn_infer library depends on cudnn_ops_infer.
cudnn_cnn_train - This entity contains all routines related to convolutional neural networks needed during training time. The cudnn_cnn_train library depends on cudnn_ops_infer, cudnn_ops_train, and cudnn_cnn_infer.
cudnn_adv_infer - This entity contains all other features and algorithms. This includes RNNs, CTC loss, and multi-head attention. The cudnn_adv_infer library depends on cudnn_ops_infer.
cudnn_adv_train - This entity contains all the training counterparts of cudnn_adv_infer. The cudnn_adv_train library depends on cudnn_ops_infer, cudnn_ops_train, and cudnn_adv_infer.
cudnn - This is an optional shim layer between the application layer and the cuDNN code. This layer opportunistically opens the correct library for the API at runtime.
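As a rough sketch (not from the guide; cuDNN 8.x assumed), the calls below illustrate which sub-library provides a few representative routines; an application can simply link the cudnn shim layer, which opens the required sub-libraries at runtime:

```c
/* Sketch: representative cuDNN calls mapped to the sub-libraries above.
 * Link against the "cudnn" shim (-lcudnn); it loads the needed
 * sub-libraries (cudnn_ops_infer, cudnn_cnn_infer, ...) at runtime. */
#include <cudnn.h>

int main(void) {
    cudnnHandle_t handle;
    cudnnCreate(&handle);                          /* cudnn_ops_infer: context creation     */

    cudnnTensorDescriptor_t xDesc;
    cudnnCreateTensorDescriptor(&xDesc);           /* cudnn_ops_infer: tensor descriptors   */

    cudnnConvolutionDescriptor_t convDesc;
    cudnnCreateConvolutionDescriptor(&convDesc);   /* cudnn_cnn_infer: convolution routines */

    cudnnRNNDescriptor_t rnnDesc;
    cudnnCreateRNNDescriptor(&rnnDesc);            /* cudnn_adv_infer: RNN / attention / CTC */

    cudnnDestroyRNNDescriptor(rnnDesc);
    cudnnDestroyConvolutionDescriptor(convDesc);
    cudnnDestroyTensorDescriptor(xDesc);
    cudnnDestroy(handle);
    return 0;
}
```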
- Programming Model
The cuDNN library exposes a host API, but assumes that, for operations using the GPU, the necessary data is directly accessible from the device.
An application using cuDNN must initialize a handle to the library context by calling cudnnCreate(). This handle is explicitly passed to every subsequent library function that operates on GPU data. Once the application is done using cuDNN, it can release the resources associated with the library handle using cudnnDestroy(). This approach allows the user to explicitly control the library's behavior when using multiple host threads, GPUs, and CUDA streams.
For example, an application can use cudaSetDevice() to associate different devices with different host threads, and in each of those host threads use a unique cuDNN handle that directs library calls to the device associated with it. cuDNN library calls made with different handles will thus automatically run on different devices.
The device associated with a particular cuDNN context is assumed to remain unchanged between the corresponding cudnnCreate() and cudnnDestroy() calls. For the cuDNN library to use a different device within the same host thread, the application must set the new device to be used by calling cudaSetDevice(), and then create another cuDNN context by calling cudnnCreate(), which will be associated with the new device.
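A minimal sketch of this handle lifecycle, assuming two visible GPUs (device IDs 0 and 1 are placeholders):

```c
/* Sketch: one cuDNN handle per device. Each handle is bound to the device
 * that was current on this host thread when cudnnCreate() was called. */
#include <cuda_runtime.h>
#include <cudnn.h>

int main(void) {
    cudnnHandle_t handle0, handle1;

    cudaSetDevice(0);        /* make device 0 current for this host thread  */
    cudnnCreate(&handle0);   /* handle0 directs cuDNN calls to device 0     */

    cudaSetDevice(1);        /* switch the current device ...               */
    cudnnCreate(&handle1);   /* ... so handle1 is associated with device 1  */

    /* ... issue cuDNN work through handle0 / handle1 here ... */

    cudnnDestroy(handle1);   /* release resources tied to each handle       */
    cudaSetDevice(0);
    cudnnDestroy(handle0);
    return 0;
}
```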
- cuDNN API Compatibility
Beginning in cuDNN 7, the binary compatibility of a patch and minor releases is maintained as follows:
Any patch release x.y.z is forward or backward-compatible with applications built against another cuDNN patch release x.y.w (meaning, of the same major and minor version number, but having w!=z).
cuDNN minor releases beginning with cuDNN 7 are binary backward-compatible with applications built against the same or an earlier minor release (meaning, an application built against cuDNN 7.x is binary compatible with cuDNN library 7.y, where y >= x).
Applications compiled with a cuDNN version 7.y are not guaranteed to work with 7.x release when y > x.
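As an illustrative sketch of this rule (not part of the guide; cuDNN 8.x version encoding assumed), an application could verify at startup that the loaded library has the same major version and an equal or newer minor version than the headers it was compiled against:

```c
/* Sketch: compare the compile-time header version with the runtime library.
 * cuDNN 8.x encodes versions as major*1000 + minor*100 + patch (e.g. 8004). */
#include <stdio.h>
#include <cudnn.h>

int main(void) {
    size_t runtime = cudnnGetVersion();
    size_t major   = runtime / 1000;
    size_t minor   = (runtime % 1000) / 100;

    if (major != CUDNN_MAJOR || minor < CUDNN_MINOR) {
        fprintf(stderr, "cuDNN runtime %zu may be incompatible with headers %d.%d.%d\n",
                runtime, CUDNN_MAJOR, CUDNN_MINOR, CUDNN_PATCHLEVEL);
        return 1;
    }
    printf("cuDNN headers %d.%d.%d, runtime %zu: compatible\n",
           CUDNN_MAJOR, CUDNN_MINOR, CUDNN_PATCHLEVEL, runtime);
    return 0;
}
```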
- Convolution Formulas
This section describes the various convolution formulas implemented in convolution functions.
The convolution terms described in the table below apply to all the convolution formulas that follow.
Table 1. Convolution terms
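As an illustrative sketch only (the notation below is assumed here, since the table of terms does not appear in this excerpt; unit stride and zero padding are also assumed), the normal convolution forward computation can be written with x as the input tensor, w the filter, y the output, and n, c, k, p, q, r, s indexing the batch, input channel, output channel, output row/column, and filter row/column:

```latex
% Sketch of the normal convolution forward formula (unit stride, no padding assumed)
y_{n,k,p,q} = \sum_{c}\sum_{r}\sum_{s} x_{n,c,\,p+r,\,q+s} \cdot w_{k,c,r,s}
```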