Introduction to and Parsing of the WAV Audio Storage Format

Published: 2024/3/26


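Audio formats

An audio format specifies how many bits are used to encode the signal. Formats fall into three categories:

Uncompressed formats
Lossless compression
Lossy compression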

1. Introduction to the WAV audio format

RIFF (Resource Interchange File Format) is a file format for resource interchange introduced by Microsoft and IBM in 1991.

WAV is one application of RIFF.

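1.1 Structure of RIFF

Other applications of RIFF include the audio/video format AVI and the animated cursor format ANI.
A RIFF file consists of a header followed by one or more chunks.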
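The header-plus-chunks layout can be sketched as a small chunk walker. This is a minimal illustration, not a full RIFF parser: the function name iter_riff_chunks and the two made-up chunk IDs in the demo are hypothetical, and the whole file is assumed to fit in memory.

```python
import struct

def iter_riff_chunks(data: bytes):
    """Yield (chunk_id, payload) pairs from a RIFF file held in memory."""
    riff_id, riff_size, form_type = struct.unpack_from('<4sI4s', data, 0)
    if riff_id != b'RIFF':
        raise ValueError('not a RIFF file')
    offset = 12                                  # skip the 12-byte RIFF header
    end = min(8 + riff_size, len(data))
    while offset + 8 <= end:
        # Every chunk starts with a 4-byte ID and a 4-byte little-endian size.
        chunk_id, chunk_size = struct.unpack_from('<4sI', data, offset)
        yield chunk_id, data[offset + 8 : offset + 8 + chunk_size]
        # Chunks are word-aligned: an odd-sized payload is followed by one pad byte.
        offset += 8 + chunk_size + (chunk_size & 1)

# Hand-built container with two tiny, made-up chunks.
body = (b'WAVE'
        + b'abc ' + struct.pack('<I', 3) + b'xyz' + b'\x00'   # 3-byte payload + pad
        + b'defg' + struct.pack('<I', 2) + b'hi')
demo = b'RIFF' + struct.pack('<I', len(body)) + body

chunks = list(iter_riff_chunks(demo))
print(chunks)
```

Walking the chunks by their declared sizes, rather than assuming fixed offsets, is what lets a reader skip over chunks it does not understand.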

1.2 Structure of WAV

The layout is documented at:

http://soundfile.sapp.org/doc/WaveFormat/

Header: the size of the file is stored as a 32-bit unsigned integer, so a WAV file cannot exceed 4 GB.
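To get a feel for what the 32-bit size limit means in practice, the arithmetic can be sketched as follows. The audio parameters (44100 Hz, stereo, 16-bit) are an assumed example, not values taken from the article's file.

```python
max_bytes = 2**32 - 1      # largest value a 32-bit unsigned size field can hold

# Assumed example parameters (CD-quality stereo).
sample_rate = 44100
num_channels = 2
bits_per_sample = 16

byte_rate = sample_rate * num_channels * bits_per_sample // 8   # bytes per second
max_seconds = max_bytes / byte_rate

print(f"byte rate: {byte_rate} bytes/s")
print(f"limit: {max_bytes / 2**30:.2f} GiB, about {max_seconds / 3600:.1f} hours")
```

With these parameters the limit works out to a little under seven hours of audio; higher sample rates or more channels shorten it proportionally.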

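The first chunk is the format subchunk ("fmt "). It records the format of the audio:
the encoding (audio format), number of channels, sample rate, byte rate (bytes per second), block align (bytes per sample frame across all channels), and bits per sample.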
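The fields of the format chunk can also be decoded in a single call. The sketch below assumes the 16-byte PCM layout described in the reference above; the FmtChunk tuple and the parse_fmt helper are hypothetical names introduced for illustration.

```python
import struct
from collections import namedtuple

FmtChunk = namedtuple(
    'FmtChunk',
    'audio_format num_channels sample_rate byte_rate block_align bits_per_sample',
)

def parse_fmt(payload: bytes) -> FmtChunk:
    """Decode the first 16 bytes of a 'fmt ' chunk payload (PCM layout)."""
    # H = unsigned 2-byte integer, I = unsigned 4-byte integer, '<' = little-endian.
    return FmtChunk._make(struct.unpack_from('<HHIIHH', payload, 0))

# Hand-built payload: PCM (1), stereo, 44100 Hz, 16-bit.
payload = struct.pack('<HHIIHH', 1, 2, 44100, 176400, 4, 16)
fmt = parse_fmt(payload)
print(fmt)

# The derived fields should agree with the basic ones.
assert fmt.block_align == fmt.num_channels * fmt.bits_per_sample // 8
assert fmt.byte_rate == fmt.sample_rate * fmt.block_align
```

Checking byte rate and block align against the other fields is a cheap sanity test for a header that has been read at the wrong offset.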

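The second chunk is the data subchunk ("data"), where the audio samples themselves are stored.

In the data chunk of a stereo file, left-channel and right-channel samples are interleaved:
left 1, right 1, left 2, right 2, and so on.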
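The interleaved layout can be undone with slicing. This sketch assumes 16-bit signed PCM with two channels, and the sample values are made up for the demonstration.

```python
import struct

# Four stereo frames of 16-bit PCM, laid out as L1 R1 L2 R2 ...
raw = struct.pack('<8h', 10, -10, 20, -20, 30, -30, 40, -40)

# Decode every 2-byte sample, then split even and odd positions.
samples = struct.unpack(f'<{len(raw) // 2}h', raw)
left = samples[0::2]     # samples at even positions belong to the left channel
right = samples[1::2]    # samples at odd positions belong to the right channel

print("left: ", left)
print("right:", right)
```

For a file with more channels, the same idea applies with a step equal to the channel count.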

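2. Reading a WAV file in Python

Use the struct module:

https://docs.python.org/3/library/struct.html?highlight=struct#struct.unpack_from

The examples here use struct.unpack(). Two parts of its format string matter:

Byte order: whether the most significant or the least significant byte comes first.

< means little-endian (least significant byte first);
> means big-endian (most significant byte first).

Data types:

H    unsigned short, returned as a Python integer, 2 bytes
I    unsigned int, returned as a Python integer, 4 bytes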
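A short experiment makes the effect of the byte-order prefix and the type codes concrete. The four input bytes are arbitrary values chosen for the demonstration.

```python
import struct

raw = bytes([0x01, 0x02, 0x03, 0x04])

little_h = struct.unpack('<H', raw[:2])[0]   # 0x0201
big_h = struct.unpack('>H', raw[:2])[0]      # 0x0102
little_i = struct.unpack('<I', raw)[0]       # 0x04030201
big_i = struct.unpack('>I', raw)[0]          # 0x01020304

print(hex(little_h), hex(big_h), hex(little_i), hex(big_i))

# The sizes implied by the type codes.
print(struct.calcsize('<H'), struct.calcsize('<I'))

# int.from_bytes gives the same result for a single unsigned field.
assert little_i == int.from_bytes(raw, 'little')
assert big_i == int.from_bytes(raw, 'big')
```

struct.unpack() always returns a tuple, even for a single field, which is why the examples index the result with [0].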

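2.1 Using struct.unpack()

How a field is read depends on what it holds:

The chunk IDs ("RIFF", "WAVE", "fmt ", "data") are 4-byte ASCII tags, which the reference above lists as big-endian. Read them with f.read(4) and use the bytes directly.

The numeric fields (sizes, sample rate, and so on) are little-endian. Read their bytes with f.read() and decode them with struct.unpack() using a "<" format.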

import struct

# This walk-through assumes the canonical 44-byte header: a 16-byte "fmt "
# chunk immediately followed by the "data" chunk.

# Open the file in binary read-only mode.
with open("./male_audio.wav", mode="rb") as f:
    # The first 4 bytes of the file are the ASCII tag "RIFF".
    chunk_id = f.read(4)
    print("the chunk id:", chunk_id)

    # '<' means little-endian; 'I' is an unsigned 4-byte integer.
    chunk_size = struct.unpack('<I', f.read(4))[0]
    print("the chunk size:", chunk_size)

    wav_format = f.read(4)
    print("the wav format:", wav_format)

    sub_chunck_1_id = f.read(4)
    print("the first sub chunk id:", sub_chunck_1_id)

    sub_chunck_1_size = struct.unpack('<I', f.read(4))[0]
    print("the sub chunk 1 size:", sub_chunck_1_size)

    # '<' means little-endian; 'H' is an unsigned 2-byte integer.
    # PCM = 1 (i.e. linear quantization).
    # Values other than 1 indicate some form of compression.
    audio_format = struct.unpack('<H', f.read(2))[0]
    print("the audio format:", audio_format)

    # Mono = 1, Stereo = 2, etc.
    num_channel = struct.unpack('<H', f.read(2))[0]
    print("the num channel of audio:", num_channel)

    # Sample rate: 8000, 44100, etc.
    sample_rate = struct.unpack('<I', f.read(4))[0]
    print("the sample rate:", sample_rate)

    # ByteRate == SampleRate * NumChannels * BitsPerSample / 8
    byte_rate = struct.unpack('<I', f.read(4))[0]
    print("the byte rate:", byte_rate)

    # BlockAlign == NumChannels * BitsPerSample / 8
    # The number of bytes for one sample including all channels.
    block_align = struct.unpack('<H', f.read(2))[0]
    print("the block align:", block_align)

    # BitsPerSample: 8 bits = 8, 16 bits = 16, etc.
    bits_per_sample = struct.unpack('<H', f.read(2))[0]
    print("the bits per sample:", bits_per_sample)

    # ---- the data sub chunk follows ----
    sub_chunck_2_id = f.read(4)
    print("\nthe sub chunk 2:", sub_chunck_2_id)

    sub_chunck_2_size = struct.unpack('<I', f.read(4))[0]
    print("the sub chunk_2_size:", sub_chunck_2_size)

    # 16-bit PCM samples are signed, so decode them with 'h' rather than 'H'.
    data0 = struct.unpack('<h', f.read(2))[0]
    print("the first data:", data0)

    # Note: in a 16-bit stereo file, the first 4 bytes of the data form one
    # frame: the first two bytes are the left channel, the next two the right.
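To try the parsing steps above without a recording at hand, a small test file can be generated with the standard-library wave module and its header decoded by hand. This is a sketch under assumptions: the parameters (mono, 16-bit, 8000 Hz, one second of silence) are arbitrary, and the fixed offsets rely on the canonical 44-byte header.

```python
import io
import struct
import wave

# Write one second of silence into an in-memory buffer.
buf = io.BytesIO()
with wave.open(buf, 'wb') as w:
    w.setnchannels(1)
    w.setsampwidth(2)          # bytes per sample, i.e. 16 bits
    w.setframerate(8000)
    w.writeframes(b'\x00\x00' * 8000)
wav_bytes = buf.getvalue()

# Decode the header fields by hand, in the order they appear in the file.
riff_id, riff_size, wave_id = struct.unpack_from('<4sI4s', wav_bytes, 0)
fmt_id, fmt_size = struct.unpack_from('<4sI', wav_bytes, 12)
(audio_format, num_channels, sample_rate,
 byte_rate, block_align, bits_per_sample) = struct.unpack_from('<HHIIHH', wav_bytes, 20)
data_id, data_size = struct.unpack_from('<4sI', wav_bytes, 36)

print(riff_id, riff_size, wave_id)
print(fmt_id, fmt_size, audio_format, num_channels, sample_rate,
      byte_rate, block_align, bits_per_sample)
print(data_id, data_size)

# Cross-check against what the wave module itself reports.
with wave.open(io.BytesIO(wav_bytes), 'rb') as r:
    params = r.getparams()
assert params.nchannels == num_channels
assert params.framerate == sample_rate
assert params.sampwidth * 8 == bits_per_sample
```

For everyday work the wave module is usually the more convenient route; decoding the header manually is mainly useful for understanding the format.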

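2.2 Using f.read()

# Details of reading a file

# 1. f.read(size)
#    In binary mode, size counts bytes; in text mode ('r') it counts characters.
#    If size is omitted, the rest of the file is read.
#    The file position advances automatically after each read.
# f = open('test.txt', 'r')
# print(f.read())
# f.close()

# 2. f.readline([limit])
#    Reads a single line; limit caps the amount of data returned.
# f = open('test.txt', 'r')
# content = f.readline()  # reads only one line
# print(content)
# content = f.readline()  # reads the next line
# print(content)
# f.close()

# 3. f.readlines()
#    Splits the file at line breaks and returns the lines as a list.
f = open('test.txt', 'r')
cn = f.readlines()
for line in cn:
    print(line, end='')
f.close()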
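The way the file position moves between calls can be seen without creating test.txt by using io.StringIO, which behaves like a file opened in text mode. The sample text is made up for the demonstration.

```python
import io

text = "first line\nsecond line\nthird line\n"

f = io.StringIO(text)          # stands in for a file opened with mode 'r'
head = f.read(5)               # reads 5 characters from the start
rest_of_line = f.readline()    # continues from the current position to the line end
remaining = f.readlines()      # every line that is still unread, as a list
at_end = f.read()              # empty string: nothing is left
f.close()

print(repr(head))
print(repr(rest_of_line))
print(remaining)
print(repr(at_end))
```

Each call picks up where the previous one stopped, which is why the header fields of a WAV file can be read one after another with consecutive f.read() calls.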

3. Other storage formats

RIFF: introduced by Microsoft and IBM
AIFF: introduced by Apple

Lossless format: FLAC (Free Lossless Audio Codec)

Lossy formats:

MP3, mostly for music, based on:
    Modified discrete cosine transform (MDCT)
    Sub-band coding
Advanced Audio Coding (AAC)
OPUS
Speex
