MEASUREMENT OF THE JAPANESE VOWELS A, I, U, E, O
(Analysis of the Japanese voice 5)

 
Date: 23:00, 30 Sep. 2002
Place: Nagoya, Japan
Microphone: SONY ECM-MS957
Microphone amplifier: SONY DAT WALKMAN TCD-D100
Personal computer: DELL INSPIRON 7500
OS: Windows 2000 Professional
Software: DSSF3
WAVE sound file: voice5.wav (44.1kHz / Stereo / 7.58sec / 1.27MB)

The five vowels a, i, u, e, and o were measured with the running autocorrelation function (ACF).

This is the power spectrum of "i".

Peaks are found at 300, 600, 900, 1200, and 3000 Hz. At high frequencies the spectrum falls off at roughly 6 dB/octave, which is a general characteristic of speech signals. The fundamental frequency is 300 Hz and the first formant (F1) is at 600 Hz.
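To give a feel for what a 6 dB/octave slope means, the short Python sketch below computes the level drop between two frequencies. It assumes the slope is constant over the whole interval, which the measured spectrum only follows roughly; the function name `rolloff_db` is made up for this illustration.

```python
import math

def rolloff_db(f_low_hz, f_high_hz, slope_db_per_octave=6.0):
    """Level drop in dB between two frequencies for a constant dB/octave slope."""
    octaves = math.log2(f_high_hz / f_low_hz)  # number of octaves between them
    return slope_db_per_octave * octaves

# The 1200 Hz and 3000 Hz peaks are taken from the text above.
drop = rolloff_db(1200.0, 3000.0)
print(f"{drop:.1f} dB")
```

Under that assumption the 3000 Hz peak would sit about 8 dB below the 1200 Hz peak.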

This is the power spectrum of "u".

As with "i", the fundamental frequency and the first formant (F1) are at 300 and 600 Hz, respectively.

This is the power spectrum of "e".

The fundamental frequency and F1 are at 240 and 720 Hz, respectively.

This is the power spectrum of "o".

The fundamental frequency and F1 are at 280 and 840 Hz, respectively.
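The F1 values quoted for "i", "u", "e", and "o" are all whole multiples of the fundamental, as expected when a formant is read off the harmonic peaks of a voiced spectrum. The Python sketch below simply divides F1 by the fundamental for each vowel. The frequencies are the ones given in the text; the helper name `harmonic_number` is made up for this illustration.

```python
# (fundamental in Hz, F1 in Hz) as read from the spectra in the text.
vowels = {"i": (300, 600), "u": (300, 600), "e": (240, 720), "o": (280, 840)}

def harmonic_number(f0_hz, f_hz):
    """Return f / f0, i.e. which harmonic of f0 the frequency f falls on."""
    return f_hz / f0_hz

ratios = {v: harmonic_number(f0, f1) for v, (f0, f1) in vowels.items()}
for vowel, ratio in ratios.items():
    print(vowel, ratio)
```

For "i" and "u" the quoted F1 is the second harmonic, and for "e" and "o" it is the third.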

Next, the results of the running ACF analysis are shown.

This is the sound level as a function of time. The five utterances can be seen.
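A level-versus-time curve of this kind can be approximated by computing the RMS level of short frames. The Python sketch below is only an illustration and not the DSSF3 procedure: the 5 ms window and step, the dB reference (full scale = 1.0), and the function name `running_level_db` are all assumptions, and a synthetic tone stands in for the recording.

```python
import math

def running_level_db(samples, sample_rate, window_s=0.005, step_s=0.005):
    """Return (time_s, level_db) pairs; level is the frame RMS in dB re. 1.0."""
    win = max(1, int(round(window_s * sample_rate)))
    step = max(1, int(round(step_s * sample_rate)))
    out = []
    for start in range(0, len(samples) - win + 1, step):
        frame = samples[start:start + win]
        mean_square = sum(x * x for x in frame) / win
        level = 10.0 * math.log10(mean_square) if mean_square > 0 else float("-inf")
        out.append((start / sample_rate, level))
    return out

# Synthetic signal: a quiet 300 Hz tone followed by the same tone 500 times louder.
fs = 8000
quiet = [0.001 * math.sin(2 * math.pi * 300 * n / fs) for n in range(400)]
loud = [0.5 * math.sin(2 * math.pi * 300 * n / fs) for n in range(400)]
levels = running_level_db(quiet + loud, fs)

for t, level in levels:
    print(f"{t:.3f} s  {level:6.1f} dB")
```

The jump in level at 0.05 s marks the onset of the louder segment, in the same way that each utterance stands out in the measured curve.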

This is the ACF of "e" measured at the point where te is at its minimum.

This point is 15 ms after the start of the utterance. It is the point where the change in the voice is greatest.

The t1 value corresponds to a high-frequency component above 3 kHz, which is characteristic of "e".

This is the ACF of "o" measured at the point where te is at its minimum.

This point is 15 ms after the start of the utterance. It is the point where the change in the voice is greatest.

The t1 value corresponds to a frequency component at about 800 Hz, which is characteristic of "o".
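The correspondence between t1 and frequency is just the reciprocal of the delay. The Python sketch below applies it to the t1 values listed for "e" and "o" at 15 ms in the summary table of this document (0.29 ms and 1.18 ms). The function name `delay_ms_to_hz` is made up for this illustration.

```python
def delay_ms_to_hz(t1_ms):
    """Frequency in Hz whose period equals the given ACF peak delay in ms."""
    return 1000.0 / t1_ms

# t1 at 15 ms after the start of the utterance, from the summary table.
t1_ms = {"e": 0.29, "o": 1.18}

for vowel, t1 in t1_ms.items():
    print(vowel, f"{delay_ms_to_hz(t1):.0f} Hz")
```

This gives roughly 3.4 kHz for "e" and roughly 0.85 kHz for "o", in line with the components named in the text.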

All of the analysis results are summarized below.

The figure below shows all of the ACF and IACF parameters. It can be produced with a single click using the screen copy function of DSSF3.

In the t1 graph, the differences among the five vowels can be seen clearly.

 
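| vowel | | SPL (dB) | time (s) | utterance (ms) | t1 (ms) | f1 | te (ms) |
|---|---|---|---|---|---|---|---|
| a | start | -57.41 | 0.605 | 0 | 0.25 | 0.21 | 6.48 |
| | | -22.43 | 0.61 | 5 | 1.18 | 0.44 | 3.98 |
| | | -21.02 | 0.615 | 10 | 1.25 | 0.52 | 12.25 |
| | | -14.6 | 0.62 | 15 | 1.16 | 0.52 | 41.98 |
| | max | -11.26 | 0.625 | 20 | 1.13 | 0.54 | 44.08 |
| | | -11.52 | 0.63 | 25 | 1.13 | 0.48 | 49.13 |
| i | start | -59.03 | 2.075 | 0 | 0.14 | 0.06 | 9.88 |
| | | -45.37 | 2.08 | 5 | 0.32 | 0.65 | 9.88 |
| | | -30.26 | 2.085 | 10 | 0.27 | 0.97 | 6.78 |
| | | -25.26 | 2.09 | 15 | 2.13 | 0.35 | 17.19 |
| | max | -23.75 | 2.095 | 20 | 1.79 | 0.11 | 26.88 |
| | | -23.99 | 2.1 | 25 | 0.27 | 0.76 | 29.22 |
| u | start | -58.14 | 3.495 | 0 | 0.7 | 0.18 | 5.03 |
| | | -24.77 | 3.5 | 5 | 0.05 | 1 | 5.19 |
| | | -17.23 | 3.505 | 10 | 2.18 | 1 | 5.05 |
| | | -13.56 | 3.51 | 15 | 2.15 | 0.98 | 56.93 |
| | | -11.84 | 3.515 | 20 | 2.06 | 0.79 | 49.3 |
| | max | -11.42 | 3.52 | 25 | 1.97 | 0.58 | 58.22 |
| | | -11.9 | 3.525 | 30 | 1.86 | 0.38 | 35.99 |
| e | start | -59.34 | 5.04 | 0 | 0.18 | 0.08 | 3.88 |
| | | -30.84 | 5.045 | 5 | 1.84 | 0.7 | 24.14 |
| | | -27.51 | 5.05 | 10 | 2.04 | 0.43 | 29.05 |
| | | -21.17 | 5.055 | 15 | 0.29 | 0.5 | 15.48 |
| | | -17.05 | 5.06 | 20 | 1.59 | 0.36 | 17.98 |
| | | -15.88 | 5.065 | 25 | 1.63 | 0.36 | 34.68 |
| | | -15.45 | 5.07 | 30 | 1.61 | 0.34 | 22.88 |
| | max | -14.66 | 5.075 | 35 | 1.59 | 0.33 | 24.45 |
| | | -14.75 | 5.08 | 40 | 1.56 | 0.34 | 53.48 |
| o | start | -59.8 | 6.64 | 0 | 0.29 | 0.07 | 20.92 |
| | | -47.02 | 6.645 | 5 | 1.7 | 0.28 | 25.34 |
| | | -37.01 | 6.65 | 10 | 1 | 0.93 | 25.34 |
| | | -16.71 | 6.655 | 15 | 1.18 | 0.35 | 21.8 |
| | | -13.03 | 6.66 | 20 | 1.29 | 0.32 | 126.27 |
| | | -13.28 | 6.665 | 25 | 1.25 | 0.41 | 65.41 |
| | | -11.13 | 6.67 | 30 | 1.2 | 0.58 | 136.91 |
| | max | -9.85 | 6.675 | 35 | 1.22 | 0.5 | 59.52 |
| | | -10.03 | 6.68 | 40 | 1.2 | 0.49 | 44.89 |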


In the present ACF analysis, the measured parameters are the delay time (t1) and the amplitude (f1) of the first peak. The table above summarizes the ACF parameters t1, f1, and te, measured every 5 ms, for each vowel. To identify which sound is uttered or who is talking, these parameters must be compared under various conditions. As a first step, the ACF was measured for different vowels uttered by the same speaker.
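As a rough illustration of how the delay and amplitude of the first ACF peak can be extracted every 5 ms, here is a minimal Python sketch. It is not the DSSF3 implementation: the 30 ms window, the 5 ms lag range, the normalization by the zero-lag value, the choice of the first local maximum as "the first peak", and all function names are assumptions, and te is not computed. A pure 800 Hz tone stands in for a recorded vowel.

```python
import math

def normalized_acf(frame, max_lag):
    """ACF of one frame for lags 0..max_lag, normalized so that lag 0 equals 1."""
    n = len(frame)
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return [0.0] * (max_lag + 1)
    return [sum(frame[i] * frame[i + k] for i in range(n - k)) / energy
            for k in range(max_lag + 1)]

def first_peak(acf, sample_rate):
    """Return (delay_ms, amplitude) of the first local maximum after lag 0, or None."""
    for k in range(1, len(acf) - 1):
        if acf[k] > acf[k - 1] and acf[k] >= acf[k + 1]:
            return 1000.0 * k / sample_rate, acf[k]
    return None

def running_first_peak(samples, sample_rate,
                       window_s=0.030, step_s=0.005, max_lag_s=0.005):
    """Slide a window over the signal and report the first ACF peak at each step."""
    win = int(round(window_s * sample_rate))
    step = int(round(step_s * sample_rate))
    max_lag = int(round(max_lag_s * sample_rate))
    results = []
    for start in range(0, len(samples) - win + 1, step):
        acf = normalized_acf(samples[start:start + win], max_lag)
        results.append((start / sample_rate, first_peak(acf, sample_rate)))
    return results

# Synthetic stand-in for a voiced sound: a pure 800 Hz tone sampled at 40 kHz.
fs = 40000
tone = [math.sin(2 * math.pi * 800 * n / fs) for n in range(2000)]
results = running_first_peak(tone, fs)

for t, peak in results:
    print(f"{t * 1000:.0f} ms", peak)
```

For this tone every frame gives a first peak at a delay of 1.25 ms, the period of 800 Hz. The amplitude is slightly below 1 because fewer sample pairs overlap at longer lags within a finite window.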

For comparison, the ACF parameters measured in the previous experiment are summarized below, followed by one row for each vowel taken from the table above.

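| vowel | condition | time (s) | utterance (ms) | t1 (ms) | f1 | te (ms) |
|---|---|---|---|---|---|---|
| a | tune do | 0.625 | 35 | 1.13 | 0.27 | 6.8 |
| a | tune re | 2.04 | 10 | 1.22 | 0.32 | 4.3 |
| a | tune mi | 3.575 | 20 | 1.2 | 0.32 | 6.27 |
| a | tune fa | 5.075 | 20 | 1.22 | 0.38 | 4.94 |
| a | tune so | 6.605 | 25 | 1.2 | 0.66 | 7.76 |
| a | tune ra | 8.14 | 5 | 1.18 | 0.49 | 6.45 |
| a | tune si | 9.65 | 10 | 1.3 | 0.3 | 4.91 |
| a | tune do high | 11.255 | 20 | 1.25 | 0.5 | 4.45 |
| a | vowel compare | | 5 | 1.18 | 0.44 | 3.98 |
| i | | | 10 | 0.27 | 0.97 | 6.78 |
| u | | | 10 | 2.18 | 1 | 5.05 |
| e | | | 15 | 0.29 | 0.5 | 15.48 |
| o | | | 15 | 1.18 | 0.35 | 21.8 |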
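One simple way to compare the two sets of rows is to look at how far t1 varies within each. The Python sketch below does only that, using the t1 values from the "tune" rows and the "vowel compare" rows of the table above. It is an illustration rather than part of the original analysis, and the helper name `spread` is made up.

```python
# t1 (ms) for "a" at eight pitches, from the "tune" rows of the table above.
t1_a_by_pitch = [1.13, 1.22, 1.20, 1.22, 1.20, 1.18, 1.30, 1.25]

# t1 (ms) for the five vowels, from the "vowel compare" rows.
t1_by_vowel = {"a": 1.18, "i": 0.27, "u": 2.18, "e": 0.29, "o": 1.18}

def spread(values):
    """Difference between the largest and smallest value."""
    values = list(values)
    return max(values) - min(values)

pitch_spread = spread(t1_a_by_pitch)
vowel_spread = spread(t1_by_vowel.values())

print(f"t1 spread across pitches of 'a': {pitch_spread:.2f} ms")
print(f"t1 spread across vowels:         {vowel_spread:.2f} ms")
```

With these values t1 varies by 0.17 ms across the pitches of "a" and by 1.91 ms across the five vowels. Note that "a" and "o" share the same t1 (1.18 ms) in these rows, so t1 alone does not separate them.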

April 2003 by Masatsugu Sakurai

