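A Working MFCC Program
I found several MFCC programs on ****, debugged them bit by bit, and finally ended up with one that works, so I am posting it as the first blog post of my life. I do not yet fully understand the mathematical derivation behind MFCC extraction and will come back to it later. Below are the two MATLAB programs I put together, covering everything from recording to MFCC extraction. (The code calls enframe, rfft, and melbankm from the VOICEBOX toolbox, plus hamming and dct from the Signal Processing Toolbox.)
Main program: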
%% Record my own voice; the spoken content is the digits 95184726
%% Recording setup
% Sample rate Fs = 8000 Hz, bit depth BitPerSmpl = 16, two channels ChanNum = 2
Fs = 8000;
BitPerSmpl = 16;
ChanNum = 2;
MyVoice = audiorecorder(Fs, BitPerSmpl, ChanNum);
%% Start recording (run this section, speak, then run the next section)
record(MyVoice);
%% Stop recording
% pause(MyVoice);
% resume(MyVoice);
stop(MyVoice);
%% Play back the recording
% play(MyVoice);
MyVoice = getaudiodata(MyVoice);
audiowrite('D:\MATLAB2016\MATLAB_project\MyVoiceMFCC\MyVoice.wav', MyVoice, Fs);
%% Read the audio file
MyVoice = audioread('D:\MATLAB2016\MATLAB_project\MyVoiceMFCC\MyVoice.wav');
MyVoiceInfo = audioinfo('D:\MATLAB2016\MATLAB_project\MyVoiceMFCC\MyVoice.wav');
sound(MyVoice, Fs);   % pass Fs explicitly; sound() otherwise assumes 8192 Hz
%% Inspect the spectrum
MyVoiceNfft = abs(rfft(MyVoice));
figure
plot(MyVoiceNfft);
%% MFCC extraction
WinTime = 0.025;
[cepstra_left, aspectrum_left, pspectrum_left] = MFCC(MyVoice(:, 1), Fs, WinTime);
[cepstra_right, aspectrum_right, pspectrum_right] = MFCC(MyVoice(:, 2), Fs, WinTime);
figure
subplot(211)
plot(cepstra_left);
title('Left-channel mel cepstral coefficients');
subplot(212)
plot(cepstra_right);
title('Right-channel mel cepstral coefficients');
Subroutine:
function [cepstra, aspectrum, pspectrum] = MFCC(samples, sr, wintime, steptime, nfilts, numcep, preemph)
% [cepstra, aspectrum, pspectrum] = MFCC(samples, sr, wintime, steptime, nfilts, numcep, preemph)
% - take power spectra of the STFT
% - warp to a mel frequency scale
% - take the DCT of the log-Mel-spectrum
% - return the first <numcep> components
% samples: vector of signal
% sr: sample rate
% wintime (0.025): window length in seconds
% steptime (0.010): step between successive windows in seconds
% nfilts (40): number of triangular mel filters to use
% numcep (13): number of cepstra to return
% preemph (0.97): pre-emphasis filter coefficient
if nargin < 2; sr = 16000; end
if nargin < 3; wintime = 0.025; end
if nargin < 4; steptime = 0.010; end
if nargin < 5; nfilts = 40; end   % 5th argument is nfilts, matching the signature
if nargin < 6; numcep = 13; end   % 6th argument is numcep
if nargin < 7; preemph = 0.97; end
winpts = round(wintime*sr);
steppts = round(steptime*sr);
NFFT = 2^(ceil(log2(winpts)));
% figure
% subplot(211)
% plot(samples(:,1));
% title('Signal waveform');
samples = filter([1, -preemph], 1, samples);
% subplot(212)
% plot(samples(:,1));
% title('Signal waveform after pre-emphasis');
% subplot(212)
% plot(samples(:,2));
% compute FFT power spectrum
%pspectrum = powspec(samples, sr, wintime, steptime, NFFT);
%pspectrum = abs(spectrogram(samples*32768, winpts, winpts - round(steptime*sr), NFFT)).^2;
% frame: one Hamming-windowed frame per row; tc: frame-centre times in samples
[frame, tc] = enframe(samples, hamming(winpts), steppts);
pspectrum = abs(rfft(frame', NFFT)).^2;
%% filterBank written for MFCCtest
% obtain mel filter bank
% [x,mc,mn,mx]=melbankm(p,n,fs,fl,fh,w)
% filterBank=melbankm(213, 2047, sr);
% figure
% subplot(211)
% plot((0:floor(2047/2))*sr/2047,melbankm(213, 2047, sr)') ;
% % filterBank = melFilterBank(NFFT, sr, nfilts); %%%% function originally called by this program
% subplot(212)
% plot(pspectrum);
% auditory spectrum
%% Attempt at a general-purpose filterBank
filterBank = melbankm(nfilts, NFFT, sr);
%%
aspectrum = filterBank * pspectrum;
%%
% apply DCT to convert aspectrum to mel cepstrum
logAspec = log(aspectrum + eps);   % eps avoids log(0) = -Inf on silent frames
cepstra = dct(logAspec);
cepstra = cepstra(1:numcep,:);
if nargout < 1
[nf, nc] = size(cepstra);
imh = imagesc(tc/sr, 1:nf, cepstra);
axis('xy');
xlabel('Time (s)');
ylabel('Mel-cepstrum coefficient');
map = (0:63)' / 63;
colormap([map, map, map]);
colorbar;
end
end
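To make the array sizes inside the subroutine easier to follow, here is a minimal sketch of the framing arithmetic at the settings used in the main program (sr = 8000, wintime = 0.025, default steptime = 0.010). The 3-second recording length is a made-up value for illustration, and the frame count assumes that frames are taken without padding the end of the signal.

```matlab
% Sketch of the framing arithmetic used in MFCC(); no toolbox functions needed.
sr = 8000;                          % sample rate from the main program
wintime = 0.025;                    % window length in seconds
steptime = 0.010;                   % default step in seconds
winpts = round(wintime * sr);       % samples per window
steppts = round(steptime * sr);     % samples between window starts
NFFT = 2^ceil(log2(winpts));        % next power of two >= winpts
nbins = NFFT/2 + 1;                 % one-sided spectrum bins (rows of pspectrum)
nsamples = 3 * sr;                  % hypothetical 3-second recording (assumption)
% Number of full frames that fit when the signal is not padded (assumption)
nframes = floor((nsamples - winpts) / steppts) + 1;
fprintf('winpts=%d steppts=%d NFFT=%d nbins=%d nframes=%d\n', ...
    winpts, steppts, NFFT, nbins, nframes);
```

With these numbers pspectrum would be nbins-by-nframes, the filter bank nfilts-by-nbins, and cepstra numcep-by-nframes.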
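Figure: spectrum of my voice at an 8000 Hz sampling rate (the horizontal axis is the FFT bin index, not a true frequency in Hz)
Figure: MFCC coefficients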
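The spectrum plot in the main program is drawn against bin index, which is why its horizontal axis is not a true frequency. A minimal sketch of one way to label the axis in Hz follows. It uses the built-in fft rather than rfft, and a synthetic 440 Hz tone stands in for one channel of the recording; both choices are assumptions made so that the example is self-contained.

```matlab
% Sketch: one-sided magnitude spectrum plotted against frequency in Hz.
Fs = 8000;                          % sample rate used in the post
t = (0:Fs-1)' / Fs;                 % 1 second of samples
x = sin(2*pi*440*t);                % synthetic 440 Hz tone (stand-in for MyVoice(:,1))
N = length(x);
X = abs(fft(x));                    % two-sided magnitude spectrum
X = X(1:floor(N/2)+1);              % keep bins from 0 Hz up to Fs/2
f = (0:floor(N/2))' * Fs / N;       % convert bin index to frequency in Hz
[~, k] = max(X);
peakHz = f(k);                      % frequency of the strongest bin
figure
plot(f, X);
xlabel('Frequency (Hz)');
ylabel('Magnitude');
```

For a real recording, replacing x with MyVoice(:, 1) should be enough; the frequency resolution is then Fs/N, where N is the number of recorded samples.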
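If any experts happen to read this, I would appreciate some advice on how to apply this algorithm: is speech recognition achieved by cross-correlating the mel cepstral coefficients, or by some other operation?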