Reposted from: http://ganeshtiwaridotcomdotnp.blogspot.com/2011/08/silence-removal-and-end-point-detection.html
Visit http://ganeshtiwaridotcomdotnp.blogspot.com/2011/06/final-report-text-prompted-remote.html for more detail about our project.
To remove silence from the captured sound, we used the algorithm described in the paper
"A New Silence Removal and Endpoint Detection Algorithm for Speech and Speaker Recognition Applications".
Our actual system was written in Java, but we verified the performance of this algorithm in MATLAB.
[Figure: Inputs and Output]
Here is the MATLAB code.
It first records 5 seconds of sound, removes the silence, and then plays back both the original and the silence-removed signal:
% Silence removal and end point detection
THRESHOLD = 0.3;   % adjust this value yourself
TIME = 5;          % recording length in seconds

% Capture
% (wavrecord and wavplay are Windows-only and were removed from newer
%  MATLAB releases; audiorecorder and sound are the replacements there)
Fs = 11025;
y = wavrecord(TIME*Fs, Fs);
% plot(y)
% wavplay(y, Fs);

samplePerFrame = floor(Fs/100);   % about 10 ms per frame
bgSampleCount = floor(Fs/5);      % first 200 ms; according to the formula, 1600 samples are needed at 8 kHz

%----------
% Mean and standard deviation of the background samples
bgSample = y(1:bgSampleCount);
meanVal = mean(bgSample);
sDev = std(bgSample);

%----------
% Identify voiced or not for each sample
voiced = abs(y - meanVal) / sDev > THRESHOLD;

% Identify voiced or not for each frame
% Discard the insufficient samples of the last frame
usefulSamples = length(y) - mod(length(y), samplePerFrame);
frameCount = usefulSamples / samplePerFrame;
voicedFrameCount = 0;
voicedUnvoiced = zeros(1, frameCount);
for i = 1:frameCount
    idx = (i-1)*samplePerFrame+1 : i*samplePerFrame;
    cVoiced = sum(voiced(idx));
    cUnVoiced = samplePerFrame - cVoiced;
    % Mark the frame as voiced when voiced samples are the majority
    if cVoiced > cUnVoiced
        voicedFrameCount = voicedFrameCount + 1;
        voicedUnvoiced(i) = 1;
    end
end

%-----
% Keep only the samples that belong to voiced frames
silenceRemovedSignal = [];
for i = 1:frameCount
    if voicedUnvoiced(i) == 1
        idx = (i-1)*samplePerFrame+1 : i*samplePerFrame;
        silenceRemovedSignal = [silenceRemovedSignal; y(idx)];
    end
end

%--- Display plots and play both sounds
figure; plot(y);
figure; plot(silenceRemovedSignal);
% Play
wavplay(y, Fs);
wavplay(silenceRemovedSignal, Fs);
NOTE: Don't forget to adjust the microphone level and sound boost feature to achieve good results.
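For readers who want to try the steps without a microphone or MATLAB, here is a rough Python sketch of the same idea: estimate the mean and standard deviation from the first 200 ms, mark each sample as voiced when its normalized distance from the mean exceeds the threshold, and keep a frame only when voiced samples are the majority. It is not the code from our system. The function name `remove_silence`, the synthetic test signal, and the threshold of 3.0 used in the demo are assumptions made for this illustration; the script above uses 0.3 and the value has to be tuned to the recording.

```python
import math
import random
import statistics


def remove_silence(samples, fs, threshold=0.3):
    """Return the samples of all frames in which voiced samples are the majority.

    Hypothetical helper mirroring the MATLAB script: 10 ms frames, and the
    first 200 ms of the recording is assumed to contain only background noise.
    """
    frame_len = fs // 100            # samples per frame (about 10 ms)
    bg_count = fs // 5               # background samples (first 200 ms)
    background = samples[:bg_count]
    mean = statistics.mean(background)
    sdev = statistics.stdev(background)   # sample std (N-1), like MATLAB's std
    if sdev == 0:
        raise ValueError("background segment has zero variance")

    # Per-sample decision: distance from the background mean, in std units.
    voiced = [abs(x - mean) / sdev > threshold for x in samples]

    # Per-frame majority vote; the incomplete last frame is discarded.
    kept = []
    for f in range(len(samples) // frame_len):
        start, end = f * frame_len, (f + 1) * frame_len
        n_voiced = sum(voiced[start:end])
        if n_voiced > frame_len - n_voiced:
            kept.extend(samples[start:end])
    return kept


# Demo on a synthetic signal: 0.5 s of low-level noise, then 0.5 s of a 440 Hz tone.
fs = 8000
rng = random.Random(0)
noise = [rng.gauss(0.0, 0.01) for _ in range(fs // 2)]
tone = [0.5 * math.sin(2 * math.pi * 440 * n / fs) for n in range(fs // 2)]
signal = noise + tone

kept = remove_silence(signal, fs, threshold=3.0)
print(len(signal), len(kept))
```

On this synthetic input the noise frames are dropped and only the tone samples are kept. With a threshold as low as 0.3, noise that looks just like the background segment would mostly be marked as voiced, which is one reason the threshold needs adjusting for each setup.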
Q: Wow, excellent! Can you help me compare the efficiency of this algorithm with the STE and ZCR methods? Thank you.
A: As shown in the research paper, the efficiency comparison is as follows:
Phrases                   STE        ZCR-STE    Proposed Method
Combination lock number   77.9531%   70.3720%   83.5565%
Running Text              50.8391%   50.1231%   59.7181%