
Learning MATLAB 2013a: Recognizing Male and Female Voices

Published: 2023/12/8


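People can easily tell a speaker's sex just by listening. Can a machine do the same? The answer is yes: as machine-learning algorithms have developed, recognition performance has steadily improved, and telling male from female voices has become relatively easy.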

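The human pitch (fundamental frequency) range is roughly 70 to 350 Hz. Because of physiological differences, male and female voices have different auditory characteristics: male pitch mostly lies between 100 and 200 Hz, and female pitch between 200 and 350 Hz. Figure 1 shows the statistics of pitch variation for a single speaker during conversation; the mean and standard deviation for female voices are both about twice those for male voices. Figure 2 shows the pitch distribution across speakers: on a logarithmic frequency axis, male and female voices each follow a normal distribution. For male voices the mean and standard deviation of the pitch are about 125 Hz and 20 Hz; for female voices both are about twice the male values. Because male and female pitch differ so clearly, pitch frequency can serve as the basis for male/female voice recognition.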
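The statistics above can be turned into a simple decision rule. The following is a minimal sketch, not the method used later in this article: it compares Gaussian log-likelihoods on a log-frequency axis. The female parameters are assumed to be exactly twice the male ones, and the log-domain spread is approximated as std/mean; both are assumptions made for illustration.

```matlab
% Male statistics from the text; female values assumed to be 2x male.
maleMean = 125;            maleStd = 20;
femaleMean = 2 * maleMean; femaleStd = 2 * maleStd;

% Gaussian log-likelihood in the log-frequency domain (constant terms dropped).
% The log-domain sigma is approximated by sd/mu, which is an assumption.
logLik = @(f, mu, sd) -log(sd / mu) - (log(f) - log(mu)).^2 ./ (2 * (sd / mu)^2);

% A pitch value is labelled female when the female model explains it better.
isFemale = @(f) logLik(f, femaleMean, femaleStd) > logLik(f, maleMean, maleStd);

f0 = [110 130 240 300];    % made-up pitch values in Hz
labels = isFemale(f0);
disp(labels);
```

Because both models end up with the same log-domain spread under these assumptions, the decision boundary falls at the geometric mean of the two means, about 177 Hz.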

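The code is split into several parts, each with its own job.
The first part records a segment of audio. File name: luru.m (the UI below invokes this script through the callback string test_record, so it has to be reachable on the MATLAB path under that name).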

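    fs = 16000;
    fprintf('testing...\n');
    y = audiorecorder(fs, 16, 1);          % 16000 Hz, 16-bit, mono
    recordblocking(y, 5);                  % record for 5 seconds
    rbd = get(con_rbd, 'value');           % state of the "retest" radio button
    if rbd
        delete(fullfile('test_record', '*.wav'));
        m = 1;                             % start numbering from the beginning
    elseif ~exist('m', 'var')
        m = 1;                             % first run: the counter is not set yet
    end
    % The test_record folder must already exist.
    name = fullfile('test_record', [num2str(m), '.wav']);
    y1 = getaudiodata(y, 'int16');
    audiowrite(name, y1, fs);              % write the audio file, e.g. 1.wav
    cut(name);                             % trim silence in place
    result = PitchDetect(name);
    disp(result);
    m = m + 1;
    set(con_text, 'string', result);       % show the result in the UI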
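The file-writing step can be tried without a microphone. The sketch below is an illustration only: it writes one second of silence as placeholder data into a folder under tempdir (both are assumptions, not part of the original script) and reads the header back with audioinfo to confirm the format.

```matlab
% Build a numbered file name in a portable way and make sure the folder exists.
outDir = fullfile(tempdir, 'test_record');   % hypothetical location for this demo
if ~exist(outDir, 'dir')
    mkdir(outDir);
end
m = 3;                                       % example counter value
name = fullfile(outDir, sprintf('%d.wav', m));

% One second of silence stands in for recorded int16 samples.
fs = 16000;
y = zeros(fs, 1, 'int16');
audiowrite(name, y, fs);

% Read the header back to check sample rate, length and bit depth.
info = audioinfo(name);
fprintf('%d Hz, %d samples, %d bits\n', ...
    info.SampleRate, info.TotalSamples, info.BitsPerSample);
```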

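The second part trims the silent segments. File name: jiandiao.m (MATLAB looks functions up by file name, so for the call cut(name) to work the function has to be saved as cut.m).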

    function y1 = cut(s_address)
    % Remove silent segments from a mono WAV file and overwrite the file.
    [y, fs] = audioread(s_address);        % use the file's own sample rate
    h = hamming(320);
    % Short-time average energy (SAE): the squared signal convolved with the window,
    %   E(n) = sum over m of x(m)^2 * h(n-m), with h a Hamming window.
    e = conv(y.*y, h);                     % y.*y squares each sample
    % Samples whose SAE is below 1/100 of the maximum are treated as silence.
    mx = max(e);
    n = length(e);
    y(n) = 0;                              % zero-pad the signal to the length of e
    keep = e >= mx * 0.01;                 % true where the energy is high enough
    y1 = y(keep);                          % keep only the non-silent samples
    audiowrite(s_address, y1, fs);
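The effect of the energy threshold is easiest to see on a synthetic signal. The sketch below assumes a made-up test signal (half a second of silence, one second of a 150 Hz tone, half a second of silence) and builds the Hamming window by hand so that no toolbox is required; it follows the same short-time-energy idea and is not a drop-in replacement for the function above.

```matlab
fs = 16000;
t = (0:fs-1)' / fs;
tone = 0.5 * sin(2*pi*150*t);                    % 1 s of "speech"
y = [zeros(fs/2, 1); tone; zeros(fs/2, 1)];      % silence on both sides

N = 320;                                         % 20 ms window at 16 kHz
h = 0.54 - 0.46 * cos(2*pi*(0:N-1)' / (N-1));    % Hamming window

e = conv(y.^2, h);                               % short-time energy
y(numel(e)) = 0;                                 % zero-pad y to the length of e
keep = e >= 0.01 * max(e);                       % 1/100 of the peak energy
y1 = y(keep);

fprintf('kept %d of %d samples\n', numel(y1), numel(y));
```

The kept segment is slightly longer than the tone itself, because the window still overlaps the tone for up to N-1 samples after it ends.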

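The third part recognizes male and female voices from the pitch frequency. File name: shibie.m (as above, the function is called as PitchDetect, so it has to be saved as PitchDetect.m).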

    function pd = PitchDetect(s_address)
    % Estimate the pitch of a WAV file from its cepstrum and label the speaker.
    waveFile = s_address;
    [y, fs] = audioread(waveFile);
    time = (1:length(y)) / fs;
    frameSize = floor(40*fs/1000);         % 40 ms frame, 640 samples at 16 kHz
    startIndex = 7000;                     % first sample of the frame (file must be long enough)
    endIndex = startIndex + frameSize - 1; % last sample of the frame
    frame = y(startIndex:endIndex);        % extract the frame
    frame2 = frame .* hamming(length(frame));   % apply a Hamming window
    rwy = rceps(frame2);                   % real cepstrum
    ylen = length(rwy);
    cepstrum = rwy(1:floor(ylen/2));
    % Pitch detection: search quefrencies that correspond to 500 Hz down to 70 Hz.
    LF = floor(fs/500);
    HF = floor(fs/70);
    cn = cepstrum(LF:HF);
    [mx_cep, ind] = max(cn);               % peak position in the search range
    % Accept the peak only above a threshold (the ind > LF test is kept from the original).
    if mx_cep > 0.08 && ind > LF
        a = fs / (LF + ind - 2);           % cn(ind) is quefrency LF+ind-2 (indices start at 1)
    else
        a = 0;
    end

    figure(2);
    plot(time, y);
    axis tight
    yl = get(gca, 'ylim');                 % renamed so it does not shadow ylim()
    line([time(startIndex), time(startIndex)], yl, 'color', 'r');
    line([time(endIndex), time(endIndex)], yl, 'color', 'r');
    title('Speech waveform');
    figure(3);
    subplot(2,1,1); plot(frame); title('Extracted frame');
    subplot(2,1,2); plot(cepstrum); title('Cepstrum');

    % Pitch track over the whole file.
    [x, sr] = audioread(s_address);
    x = x - mean(x);                       % remove the DC offset
    updRate = floor(20*sr/1000);           % hop size: 20 ms
    fRate = floor(40*sr/1000);             % frame length: 40 ms
    n_samples = length(x);
    nFrames = floor(n_samples/updRate) - 1;
    k = 1;
    pitch = zeros(1, nFrames);
    f0 = zeros(1, nFrames);
    LF = floor(sr/500);
    HF = floor(sr/70);
    m = 1;
    avgF0 = 0;
    for t = 1:nFrames
        yin = x(k:k+fRate-1);
        cn1 = rceps(yin);
        cn = cn1(LF:HF);
        [mx_cep, ind] = max(cn);
        if mx_cep > 0.08 && ind > LF
            a = sr / (LF + ind - 2);
        else
            a = 0;                         % treated as unvoiced
        end
        f0(t) = a;
        if t > 2 && nFrames > 3
            % 3-point median filter to smooth the pitch track
            md = median(f0(t-2:t));
            pitch(t-2) = md;
            if md > 0
                avgF0 = avgF0 + md;
                m = m + 1;
            end
        elseif nFrames <= 3
            pitch(t) = a;
            avgF0 = avgF0 + a;
            m = m + 1;
        end
        k = k + updRate;
    end

    figure(4)
    subplot(211);
    plot((1:length(x))/sr, x);
    ylabel('Amplitude'); xlabel('Time');
    subplot(212);
    xt = 20 * (1:nFrames);                 % frame positions in ms
    plot(xt, pitch)
    axis([xt(1) xt(nFrames) 0 max(pitch)+50]);
    ylabel('Pitch frequency / Hz'); xlabel('Time');

    % Decide from the highest smoothed pitch value.
    Mypitch = max(pitch)
    if Mypitch > 220
        pd = ['Woman ', num2str(Mypitch)];
    elseif Mypitch < 200
        pd = ['Man ', num2str(Mypitch)];
    else
        pd = ['Sorry ', num2str(Mypitch)];
    end
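The cepstral peak picking used above can be checked on a signal whose pitch is known. The sketch below is a toy example under stated assumptions: a decaying impulse train with a period of 80 samples (200 Hz at 16 kHz) stands in for voiced speech, no window is applied, and the real cepstrum is computed directly with fft and ifft, which is the same quantity rceps returns, so the Signal Processing Toolbox is not needed.

```matlab
fs = 16000;
f0True = 200;
T = fs / f0True;                      % pitch period: 80 samples
N = 640;                              % 40 ms frame

% Decaying impulse train as a crude stand-in for glottal pulses.
x = zeros(N, 1);
pulsePos = 1:T:N;                     % 1, 81, ..., 561
x(pulsePos) = 0.8 .^ (0:numel(pulsePos)-1);

% Real cepstrum; c(1) holds quefrency 0, so c(q+1) holds quefrency q.
c = real(ifft(log(abs(fft(x)))));

% Search the quefrencies that correspond to 500 Hz down to 70 Hz.
qLo = floor(fs / 500);                % 32 samples
qHi = floor(fs / 70);                 % 228 samples
[peak, k] = max(c(qLo+1 : qHi+1));
q = qLo + k - 1;                      % quefrency of the peak, in samples
f0Est = fs / q;

fprintf('peak %.3f at quefrency %d -> %.1f Hz\n', peak, q, f0Est);
```

Note the index bookkeeping: because MATLAB arrays start at 1, the quefrency is one less than the array index, which matters when converting the peak position back to a frequency.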

The last part is a very bare-bones interface; it has to be said that MATLAB is fairly capable here. File name: UI.m

    clear; clc; close all;
    m = 1;                                 % file counter read by test_record (the original set an unused global n)
    set(0, 'defaultfigurecolor', 'w');
    % Figure window; child controls use normalized units
    con_car = figure('position', [400 200 680 380], ...
        'numbertitle', 'off', ...
        'name', 'Man or Woman');
    set(con_car, 'defaultuicontrolunits', 'normalized');
    rbd = 0;
    con_rbd = uicontrol('Style', 'radiobutton', ...
        'Position', [0.15 0.62 0.15 0.05], ...
        'Value', rbd, ...                  % 1 when selected, 0 otherwise
        'String', 'Retest', 'backgroundcolor', get(gcf, 'color'));
    % Close button
    con_close = uicontrol('style', 'pushbutton', 'position', [0.5 0.6 0.2 0.1], ...
        'string', 'Close', 'callback', 'close');
    % Test button; position is [left bottom width height]
    con_test = uicontrol('style', 'pushbutton', 'position', [0.3 0.6 0.2 0.1], ...
        'string', 'Test');
    % Text field showing the prompt 'Keep talking' and later the test result
    con_text = uicontrol('style', 'text', 'position', [0.3 0.1 0.4 0.4], ...
        'FontSize', 30, 'string', 'Keep talking', 'backgroundcolor', get(gcf, 'color'));
    % Run the recording and test script when the test button is pressed
    set(con_test, 'callback', 'test_record');

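The final result when running the program:

During testing the program does sometimes misclassify the speaker, most likely because of the way the speaker talks. Reading the digits 0 to 9 aloud is recommended and gives a noticeably higher accuracy.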
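One possible contributor to such misclassifications, offered here as an assumption rather than a diagnosis, is that a decision based on the maximum of the pitch track is sensitive to a short burst of outlier frames (two consecutive outliers survive a 3-point median filter). The sketch below uses a made-up pitch track and the article's 200/220 Hz thresholds to compare the maximum with the median of the voiced frames.

```matlab
% Made-up smoothed pitch track in Hz; zeros mark unvoiced frames.
% Two frames near 240 Hz imitate an octave error on a male voice.
pitch = [0 0 118 121 240 238 119 122 120 0];

voiced = pitch(pitch > 0);
byMax = max(voiced);         % statistic used for the decision in the article
byMedian = median(voiced);   % alternative statistic, an assumption of this sketch

% Decision rule with the article's thresholds: <200 Man, >220 Woman, else Sorry.
names = {'Man', 'Sorry', 'Woman'};
label = @(f) names{1 + (f >= 200) + (f > 220)};

fprintf('max    = %g Hz -> %s\n', byMax, label(byMax));
fprintf('median = %g Hz -> %s\n', byMedian, label(byMedian));
```

On this track the maximum gives "Woman" while the median gives "Man"; whether the median is the better choice on real recordings would need to be checked on actual data.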
