YMEC software - Analysis of Japanese voice 3 (Simple Sound Measurement with PC)


MEASUREMENT OF A FEMALE VOICE, CONTRAST WITH MALE VOICE
(Analysis of the Japanese voice 3)

This time, a female voice is measured and compared with the previous male voice.

Date:	17:00, 25 Sep. 2002
Place:	Nagoya, Japan
Microphone:	SONY ECM-MS957
Microphone amplifier:	SONY DAT WALKMAN TCD-D100
Personal computer:	DELL INSPIRON 7500
OS:	Windows 2000 Professional
Software:	DSSF3
WAVE sound file:	voice3.wav (44.1kHz / Stereo / 2sec / 345KB)

Note: From this measurement onward, a microphone amplifier is used to improve the S/N ratio. The running ACF measurement was performed while monitoring the result in RA.

The power spectrum of the female voice /a/. The graph is zoomed in to the range from 100 Hz to 5 kHz. The "level range" and the "frequency range" were adjusted manually.

The fundamental frequency is found at 260 Hz. A formant is seen at 800 Hz, and higher harmonics are found at 1060, 1500, and 1750 Hz.

For reference, the typical fundamental frequency is said to be about 225 Hz for a female voice, 120 Hz for a male voice, and 300 Hz for a small child's voice (Ray D. Kent and Charles Read, "The Acoustic Analysis of Speech", Singular Publishing Group, Inc., 1992).

Next, the running ACF was checked to find the peak at this formant frequency (800 Hz).

The maximum peak in the ACF is found at 1.25 ms. This corresponds to 800 Hz (1000/1.25).

In the same graph, a peak is also found at 4.55 ms. This peak corresponds to a fundamental frequency of about 220 Hz (1000/4.55).
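The conversion used here, and throughout the rest of this page, is simply frequency in Hz = 1000 / delay in ms. As a small illustration, it can be written as the sketch below. The function name lag_ms_to_hz is made up for this example and is not part of DSSF3 or RA.

```python
def lag_ms_to_hz(lag_ms: float) -> float:
    """Convert an ACF peak delay in milliseconds to a frequency in Hz."""
    if lag_ms <= 0:
        raise ValueError("lag must be positive")
    return 1000.0 / lag_ms


# Delays read from the ACF graph described in the text.
for lag in (1.25, 4.55):
    print(f"{lag} ms -> {lag_ms_to_hz(lag):.1f} Hz")
```

The same helper can be applied to any of the delays listed in the later sections.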

The peak positions are not fixed during the utterance. It is therefore necessary to gather data not only at one instant but also over time. Next, as in the previous analysis, the ACF is calculated every 5 ms after the start of the utterance.
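The idea of recalculating the ACF at 5 ms steps can be sketched roughly as follows. This is only a minimal illustration in plain Python, not the DSSF3 or RA implementation. The 30 ms window length, the 10 ms maximum delay, the function names, and the synthetic 250 Hz test tone are all assumptions made for the example.

```python
import math


def normalized_acf(frame, max_lag):
    """Normalized autocorrelation of one frame for lags 0..max_lag (samples)."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return [0.0] * (max_lag + 1)
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag)) / energy
        for lag in range(max_lag + 1)
    ]


def running_acf(signal, fs, window_ms=30.0, step_ms=5.0, max_lag_ms=10.0):
    """Yield (start time in ms, ACF) for windows advanced by step_ms."""
    win = int(fs * window_ms / 1000)      # assumed window length
    step = int(fs * step_ms / 1000)       # 5 ms step, as in the text
    max_lag = int(fs * max_lag_ms / 1000)
    for start in range(0, len(signal) - win + 1, step):
        frame = signal[start:start + win]
        yield 1000.0 * start / fs, normalized_acf(frame, max_lag)


def strongest_peak_ms(acf, fs, min_lag_ms=0.5):
    """Delay (ms) of the largest local maximum at or beyond min_lag_ms."""
    first = max(1, int(fs * min_lag_ms / 1000))
    peaks = [k for k in range(first, len(acf) - 1)
             if acf[k - 1] < acf[k] >= acf[k + 1]]
    if not peaks:
        return None
    best = max(peaks, key=lambda k: acf[k])
    return 1000.0 * best / fs


# Demo on a synthetic 250 Hz tone (an assumption; not the recorded voice).
fs = 8000
tone = [math.sin(2 * math.pi * 250 * i / fs) for i in range(800)]
for t_ms, acf in running_acf(tone, fs):
    print(f"{t_ms:5.1f} ms: strongest ACF peak at {strongest_peak_ms(acf, fs):.2f} ms")
```

For a real recording, the samples of one channel of the WAVE file would take the place of the synthetic tone, and the strongest peak would be expected to move from frame to frame as described above.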

This is the running ACF analysis window.

5 ms after the utterance

The t_e value is 12.58 ms. From the beginning of the utterance to the maximum power level, t_e generally decreases. In the ACF waveform, there are several peaks at 0.2, 0.28, 0.4, 0.68, 0.85, 1, 1.18, 1.2, 1.35, 1.7, and 1.95 ms.
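The page does not explain how t_e is obtained. In ACF-based sound analysis, the effective duration is usually defined as the delay at which the envelope of the normalized ACF decays to one tenth (-10 dB on a 10 log10 scale), found by fitting a straight line to the initial decay of the ACF peaks. The sketch below follows that usual definition. It is an assumption that DSSF3 and RA compute t_e in this way, and the function name and the -5 dB fitting range are choices made only for this example.

```python
import math


def effective_duration_ms(acf, fs, fit_range_db=-5.0):
    """Estimate t_e (ms): delay where the fitted ACF envelope reaches -10 dB.

    acf is assumed to be normalized so that acf[0] == 1.
    """
    mag = [abs(v) for v in acf]
    # Envelope points: zero delay plus local maxima of |ACF|, in dB.
    points = [(0.0, 0.0)]
    for k in range(1, len(mag) - 1):
        if mag[k - 1] < mag[k] >= mag[k + 1] and mag[k] > 0.0:
            level = 10.0 * math.log10(mag[k])
            if level < fit_range_db:
                break  # use only the initial part of the decay
            points.append((1000.0 * k / fs, level))
    if len(points) < 2:
        return None
    # Least-squares straight line through the envelope points.
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_l = sum(l for _, l in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in points)
    slope = sxy / sxx
    if slope >= 0.0:
        return None  # no decay found in the fitted range
    intercept = mean_l - slope * mean_t
    # Delay at which the fitted line crosses -10 dB.
    return (-10.0 - intercept) / slope
```

A slowly decaying ACF gives a long t_e and a quickly decaying one gives a short t_e, which is how the values quoted for each frame can be read.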

10 ms after the utterance

The t_e value is 12.51 ms.

In the figure above, there are several peaks at 0.2, 0.4, 0.6, 0.9, and 1.1 ms. A large peak can also be seen at 3.8 ms. This appears to correspond to the fundamental frequency (1000/3.8 ≈ 263 Hz). This time, the fundamental frequency could be identified by 10 ms after the utterance.

Formant frequencies

F3: 1000/0.7 ≈ 1430 Hz (suppressed peak)
F2: 1000/0.9 ≈ 1110 Hz (suppressed peak)
F1: 1000/1.1 ≈ 910 Hz (peak frequency)

15 ms after the utterance

The t_e value is 5.20 ms.

Peaks at 0.8 and 1.35 ms; fundamental frequency at 4.5 ms (222 Hz)
0.8 ms: formant frequency, 1000/0.8 = 1250 Hz, 3rd formant (F3)
1 ms: second volley, 1000/1 = 1000 Hz, 2nd formant (F2)
1.35 ms: formant frequency, 1000/1.35 ≈ 741 Hz, 1st formant (F1)

The lowest peak in the spectrum is called the first formant (F1), followed by the higher formants F2, F3, and so on. In the autocorrelation at 15 ms after the utterance, each of the peaks listed above corresponds to one of the formant frequencies.

20 ms after the utterance

The t_e value is 8.50 ms.
Peaks at 0.68 and 1.21 ms; fundamental frequency at 4.5 ms (222 Hz)
0.68 ms: formant frequency, 1000/0.68 ≈ 1470 Hz, 3rd formant (F3)
0.96 ms: second volley, 1000/0.96 ≈ 1040 Hz, 2nd formant (F2)
1.28 ms: formant frequency, 1000/1.28 ≈ 781 Hz, 1st formant (F1)

25 ms after the utterance

The t_e value is 23.14 ms.
Peaks at 0.62 and 1.25 ms; fundamental frequency at 4.55 ms (about 220 Hz)
0.62 ms: first peak, 1000/0.62 ≈ 1613 Hz, 3rd formant (F3)
0.9 ms: second volley, 1000/0.9 ≈ 1111 Hz, 2nd formant (F2)
1.25 ms: formant frequency, 1000/1.25 = 800 Hz, 1st formant (F1)

Here, these values are compared with those of the male voice. The corresponding values for the male voice are as follows.

0.68 ms: first peak, 1000/0.68 ≈ 1470 Hz, 3rd formant (F3)
1.05 ms: second volley, 1000/1.05 ≈ 952 Hz, 2nd formant (F2)
1.3 ms: formant frequency, 1000/1.3 ≈ 769 Hz, new 1st formant (F1)

The male voice has slightly lower formant frequencies. The male voice also has a wide bandwidth at low frequencies, as shown in the spectrum. The highest frequency in the male voice is 3 kHz, whereas the female voice contains higher-frequency components up to 12 kHz.

Thus, the ACF of the female voice has clear peaks at short delays, and these peaks represent the individual formant frequencies separately. Might this mean that the female voice tends to be more intelligible than the male voice?

April 2003 by Masatsugu Sakurai
