Table 1. Characteristics of sounds of CVR

| Sound | Characteristics in time domain | Characteristics in frequency domain | Short time energy | Zero crossing rate |
|---|---|---|---|---|
| surd | less obvious | obvious; energy mainly located in the high-frequency band | weak | high |
| sonant | obvious and periodic | has formants; energy mainly located in the low-frequency band | strong | low |
| silence | less obvious | less obvious | weak | high |
| impulse noise | transient | | strong | high |

Background sounds mainly consist of the various sounds other than cockpit voices and aviation noises. A different background sound implies that a special event has happened [3]. Cockpit voices include conversations between the pilot and other crew members, communications from the control tower, and speech for navigation and identification. A voice signal is a time-varying, non-stationary random process, but its characteristics remain nearly unchanged over a short interval of 10-30 ms because of the relative stability of the vocal cords and vocal tract. The Chinese language includes surd (unvoiced) and sonant (voiced) sounds. Table 1 shows some characteristics of the sounds of CVR.
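As a rough illustration of the short-time view and of the last two columns of Table 1, the sketch below splits a signal into 25 ms frames and computes a per-frame short time energy and zero crossing count. The sampling rate, frame length, hop, the rectangular window, and the two synthetic test tones are assumptions made for the example, not values from the paper.

```python
import math

def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames of frame_len samples, advancing by hop."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Sum of squared samples in one frame (rectangular window assumed)."""
    return sum(s * s for s in frame)

def zero_crossing_rate(frame):
    """Count sign changes between adjacent samples, taking sgn(0) as +1."""
    signs = [1 if s >= 0 else -1 for s in frame]
    return sum(abs(a - b) for a, b in zip(signs[1:], signs[:-1])) // 2

FS = 8000                     # assumed sampling rate in Hz
FRAME_LEN = FS * 25 // 1000   # 25 ms frame, inside the 10-30 ms range
HOP = FS * 10 // 1000         # 10 ms hop between frame starts

# Sonant-like test tone: strong, periodic, low frequency (200 Hz).
sonant = [0.8 * math.sin(2 * math.pi * 200 * n / FS) for n in range(FS // 10)]
# Surd-like test tone: weak, high frequency (3 kHz).
surd = [0.05 * math.sin(2 * math.pi * 3000 * n / FS) for n in range(FS // 10)]

sonant_frame = frame_signal(sonant, FRAME_LEN, HOP)[0]
surd_frame = frame_signal(surd, FRAME_LEN, HOP)[0]

print("sonant: STE =", short_time_energy(sonant_frame),
      "ZCR =", zero_crossing_rate(sonant_frame))
print("surd:   STE =", short_time_energy(surd_frame),
      "ZCR =", zero_crossing_rate(surd_frame))
```

With these synthetic inputs the sonant-like frame shows the larger energy and the smaller zero crossing count, which is the pattern Table 1 lists for sonant versus surd.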

3. Basic VAD algorithm based on double thresholds

3.1. Basic concepts

(1) Short Time Energy (STE): The sound intensity of a speech series x(n) is described by its short time energy, which is defined as follows:

$$E_n = \sum_{m=-\infty}^{\infty} \left[ x(m)\, w(n-m) \right]^2 \qquad (1)$$

In particular, the STE of surd is very weak and the STE of sonant is quite strong.

(2) Zero Crossing Rate (ZCR): The ZCR of a speech series x(n) is defined as follows:

$$Z_n = \frac{1}{2} \sum_{m=-\infty}^{\infty} \left| \operatorname{sgn}[x(m)] - \operatorname{sgn}[x(m-1)] \right| w(n-m) \qquad (2)$$

where w(n) is a window function and sgn[x] is the sign function, defined as follows:

$$\operatorname{sgn}[x] = \begin{cases} 1, & x \ge 0 \\ -1, & x < 0 \end{cases} \qquad (3)$$

3.2. VAD based on double thresholds

Generally, in practical applications we use STE to detect sonant and ZCR to detect surd [4], consistent with Table 1. The whole VAD process is divided into four sections: silence section (status=0), transition section (status=1), speech section (status=2) and end section. At the beginning of VAD, we set two thresholds for each of STE and ZCR: the high thresholds Tamp1 and Tzcr1, and the low thresholds Tamp2 and Tzcr2. In addition, we define the variable count as a speech counter, silence as a silence counter, and minlen as a minimum time threshold. Figure 1 shows the flow of VAD based on double thresholds.

Practice has shown that this method separates speech from background noise effectively and efficiently at high SNR, in line with Table 1. However, under aviation conditions, with low SNR and a poor recording environment, the method loses its performance because the speech is submerged in strong aviation background noise. Prior noise reduction and speech enhancement therefore become extremely important.

4. Scheme of basic spectral subtraction

The basic spectral subtraction (SS) method is described briefly in this section. Assume that a noisy speech signal is expressed as the sum s(i) + d(i), where s(i) and d(i) are a frame of clean speech and noise, respectively. Considering human’s ear be
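The double-threshold decision described in Section 3.2 can be sketched as a small state machine over per-frame STE and ZCR values. Figure 1 is not reproduced here, so the transitions below follow a common formulation of double-threshold endpoint detection and should be read as an assumption rather than the paper's exact flow. `maxsilence` is a hypothetical parameter added for the example, only the low ZCR threshold is used, and the end section is represented simply by returning the segment.

```python
def double_threshold_vad(ste, zcr, tamp1, tamp2, tzcr2, minlen=3, maxsilence=2):
    """Return (start, end) frame indices of the first speech segment, or None.

    ste, zcr   : per-frame short time energy and zero crossing rate
    tamp1/tamp2: high / low STE thresholds
    tzcr2      : low ZCR threshold
    minlen     : minimum accepted speech length, in frames
    maxsilence : hypothetical number of low frames that ends a segment
    """
    status = 0    # 0 = silence, 1 = transition, 2 = speech
    count = 0     # frames from the segment start to the last active frame
    silence = 0   # consecutive low frames seen while in the speech state
    start = 0

    for n, (e, z) in enumerate(zip(ste, zcr)):
        above_low = e > tamp2 or z > tzcr2
        if status in (0, 1):
            if e > tamp1:              # energy above the high threshold
                start = n - count      # keep the preceding transition frames
                status, silence = 2, 0
                count += 1
            elif above_low:            # possibly speech: stay in transition
                status = 1
                count += 1
            else:                      # back to silence
                status, count = 0, 0
        else:                          # status == 2, speech section
            if above_low:
                count += silence + 1   # bridge any short gap just passed
                silence = 0
            else:
                silence += 1
                if silence >= maxsilence:
                    if count >= minlen:
                        return start, start + count - 1
                    # too short to be speech: treat it as noise and restart
                    status, count, silence = 0, 0, 0

    if status == 2 and count >= minlen:    # input ended during speech
        return start, start + count - 1
    return None


# Example: one low-energy onset frame, four strong frames, then silence.
ste = [0.1, 0.1, 0.6, 2.0, 3.0, 2.5, 0.7, 0.1, 0.1, 0.1]
zcr = [5] * 10
segment = double_threshold_vad(ste, zcr, tamp1=1.0, tamp2=0.5, tzcr2=20)
print(segment)  # (2, 6)
```

Using a high threshold to confirm speech and a low one to extend the segment backwards and forwards keeps weak onsets such as surd frames attached to the segment, while `minlen` rejects short bursts such as impulse noise.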
