JP2011151559

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2011151559
The present invention provides a method of forming a high sound pressure area that, when a
speaker array is used, suppresses interference between the individual sounds and forms a high
sound pressure area with an excellent S/N ratio. In a method of forming a high sound pressure
area by controlling the phase and volume of the audio signal from each speaker of a speaker
array in which a plurality of speakers are arranged in a line or a plane, sound interference in
the area where sounds overlap is suppressed by using a frequency band selection method to cut
spectral components whose intensity is at or below a predetermined threshold.
[Selected figure] Figure 1
High sound pressure area formation method
[0001]
The present invention relates to a method of forming a high sound pressure area by controlling
the phase and volume of sound from each speaker of a speaker array in which a plurality of
speakers are linearly or planarly arranged.
[0002]
The speaker array is a system in which a plurality of speakers are arranged in a straight line or a
plane, and the phase and volume of the sound from each speaker are controlled so that the
output sound is given directivity.
09-05-2019
1
FIG. 1 shows the principle of the speaker array.
[0003]
The conventional speaker array determines the phase difference of the sound waves from the
difference in distance between the target position and each speaker, and controls the phase and
volume of the sound output from each speaker while taking the attenuation of the sound into
account. As a result, the phase and amplitude of the sound waves are aligned only at the target
position, and a high sound pressure area (hereinafter also referred to as a focal point) is formed
there. With this focus-forming beamforming, a high sound pressure area spanning several tens of
degrees, called the main lobe, is formed around the focus direction, so that the array behaves in a
pseudo manner as a speaker with directivity.
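As a minimal sketch of the focusing described above, the following Python computes a delay and a gain for each speaker so that all wavefronts reach the target at the same time and with the same amplitude. It assumes a free field, a speed of sound of 343 m/s, and amplitude decay proportional to 1/distance. The function name, the array geometry, and the target position are illustrative and do not come from the source.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def focus_delays_and_gains(speaker_positions, target):
    """Return per-speaker delays (seconds) and gains so that all wavefronts
    arrive at `target` simultaneously and with equal amplitude."""
    distances = [math.dist(p, target) for p in speaker_positions]
    farthest = max(distances)
    # The farthest speaker fires first (zero delay); nearer speakers wait.
    delays = [(farthest - d) / SPEED_OF_SOUND for d in distances]
    # Assumed 1/d spreading loss: attenuate nearer speakers so that every
    # contribution has the same amplitude at the target.
    gains = [d / farthest for d in distances]
    return delays, gains


# Hypothetical 4-speaker linear array along the x axis with 0.1 m pitch.
speakers = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.3, 0.0)]
delays, gains = focus_delays_and_gains(speakers, target=(0.0, 1.0))
```

Here the speaker nearest the target gets the longest delay and the smallest gain, which is what aligns phase and amplitude at the focal point only.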
[0004]
FIG. 2 schematically shows the structure of the flat speaker array system. In this system,
software running on a real-time OS on a general-purpose PC creates an audio signal with a delay
added for each speaker, a digital amplifier amplifies the output voltage, and the speakers output
the audio so that a high sound pressure beam is formed toward the focus position.
FIG. 3 shows photographs of the front and back of the 128ch flat speaker array system.
[0005]
However, in the conventional speaker array, when a plurality of high sound pressure areas are
formed, there is a problem that the respective sounds interfere with each other and a high sound
pressure area is formed outside the target position.
[0006]
The present invention solves these problems of the prior art. Its object is to provide a high sound
pressure area forming method that, when a speaker array is used, can suppress interference
between the individual sounds and form a high sound pressure area with an excellent S/N ratio.
[0007]
In order to solve the above problems, the present invention provides a method of forming a high
sound pressure area by controlling the phase and volume of the audio signal from each speaker
of a speaker array in which a plurality of speakers are arranged in a line or a plane,
characterized in that sound interference is suppressed by using a frequency band selection
method to cut, in the area where sounds overlap, spectral components whose intensity is at or
below a predetermined threshold.
[0008]
According to the present invention, adopting the above method makes it possible to suppress
interference between the individual sounds and to form a high sound pressure area with an
excellent S/N ratio when a speaker array is used.
The present invention can be applied to a speaker system for art museum guidance, a speaker
system for street advertisement, a voice interface for robots, and the like.
[0009]
FIG. 1 is a diagram showing the principle of a speaker array.
FIG. 2 is a diagram schematically showing the structure of a flat speaker array system.
FIG. 3 is a photograph of the front and back of a 128ch flat speaker array system.
FIG. 4 is a diagram showing the relationship between background noise and the threshold.
FIG. 5 is a diagram showing the overlap rate of a male voice and a female voice when the FFT
resolution is changed.
FIG. 6 is a diagram showing an example of high sound pressure area formation by the method of
the present invention.
FIG. 7 is a diagram showing an example of high sound pressure area formation by the
conventional method.
[0010]
Hereinafter, the present invention will be described in detail.
[0011]
The present inventors have conducted studies using the delay-and-sum beamforming method
(hereinafter also referred to as DSBF). In this case as well, we found that when the number of
microphones, that is, the number of sound sources, increases, the accuracy of sound source
localization and sound source separation decreases because of the sound sources other than the
one being extracted.
One conceivable remedy is to construct a large-scale microphone array system to raise the S/N
ratio, but this is limited by the demand for device miniaturization and by the intended
applications. Therefore, the present invention addresses this problem by focusing on the sparsity
of speech and using a frequency band selection (hereinafter also referred to as FBS) method.
[0012]
That is, the present invention is a method of forming a high sound pressure area by controlling
the phase and volume of the audio signal from each speaker of a speaker array in which a
plurality of speakers are arranged in a line or a plane, characterized in that sound interference is
suppressed by using the frequency band selection method to cut, in the area where sounds
overlap, spectral components whose intensity is at or below a predetermined threshold.
[0013]
When a plurality of sound sources are output from the speaker array, the sounds interfere with
each other and form a high sound pressure area at a place other than the focal point.
The cause is considered to be that the power spectra of the sound sources overlap in the
"frequency-power" domain at the same or nearby frequencies. If the frequencies of the sounds
differ from each other, the sounds cancel out because of the phase difference. At the same or
nearby frequencies, however, there are points where the phases coincide, and an unintended
high sound pressure area is considered to form at those points.
[0014]
In the present invention, the audio signal of each sound source is subjected to a discrete Fourier
transform (DFT). If sounds overlap at the same frequency, only the sound with the stronger
power is kept and the weaker sound is cut. This suppresses sound interference at the same
frequency and improves the S/N ratio.
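The selection rule in this paragraph can be sketched as follows. This is only one reading of the text, assuming that "stronger" is decided per frequency bin by spectral magnitude. A naive DFT is used to keep the example self-contained, where a real system would use an FFT. The function names and the toy signals are illustrative and are not taken from the source.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def frequency_band_selection(spec_a, spec_b):
    """In each frequency bin keep only the source with the larger magnitude
    and zero the other, so the two outputs never share a bin."""
    out_a, out_b = [], []
    for a, b in zip(spec_a, spec_b):
        if abs(a) >= abs(b):
            out_a.append(a)
            out_b.append(0j)
        else:
            out_a.append(0j)
            out_b.append(b)
    return out_a, out_b


# Toy signals: both contain bin 2, but source b is much weaker there.
N = 16
sa = [1.0 * math.sin(2 * math.pi * 2 * t / N) for t in range(N)]
sb = [0.2 * math.sin(2 * math.pi * 2 * t / N)
      + 1.0 * math.sin(2 * math.pi * 5 * t / N) for t in range(N)]
Sa, Sb = frequency_band_selection(dft(sa), dft(sb))
```

After selection, bin 2 survives only in `Sa` and bin 5 only in `Sb`, so the two sources no longer compete at any single frequency.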
[0015]
For example, consider the overlap of sounds in the same frequency band for two voices.
When two sounds are output from the speaker array, the overlap of sounds can be checked by
the following procedure.
[0016]
1. The speech signals sa(n) and sb(n) are discrete Fourier transformed to obtain Sa(f) and Sb(f).
[0017]
2. Set a threshold and extract only the strong spectral components.
[0018]
3. The following equation expresses how strongly the sounds overlap.
[0019]
[0020]
4. From the beginning to the end of the sound source, create a histogram to see how the values
of the above equation are distributed. The histogram classes are, for example, as follows.
[0021]
(a) −10 < OverLap ≤ 0
(b) −20 < OverLap ≤ −10
(c) −30 < OverLap ≤ −20
(d) OverLap ≤ −30
When the OverLap value falls in (a), the two sounds interfere strongly at the same frequency. In
(c) and (d), on the other hand, a strong power spectrum and a weak power spectrum interfere at
the same frequency. The problem in creating a high sound pressure area with a speaker array is
the formation of a high sound pressure area away from the focal point through interference
between two strong power spectra. Cutting such components by the FBS method suppresses
sound interference at the same or nearby frequencies and improves the S/N ratio.
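The histogram step can be sketched as below. The OverLap equation itself is not reproduced in this text, so the sketch substitutes an assumed stand-in: the negative absolute level difference between the two power spectra in dB. That stand-in is 0 when both sources are equally strong in a bin and strongly negative when one dominates, which is consistent with classes (a) to (d) but is not confirmed by the source. All names and data are hypothetical.

```python
import math


def overlap_db(power_a, power_b):
    """ASSUMED stand-in for the source's equation: negative absolute level
    difference in dB. 0 means equal power in the bin."""
    return -abs(10.0 * math.log10(power_a / power_b))


def overlap_histogram(powers_a, powers_b):
    """Count frequency bins per OverLap class (a)-(d) as listed in the text."""
    counts = {"a": 0, "b": 0, "c": 0, "d": 0}
    for pa, pb in zip(powers_a, powers_b):
        if pa <= 0.0 or pb <= 0.0:
            continue  # a bin that was cut to zero cannot overlap
        v = overlap_db(pa, pb)
        if v > -10.0:
            counts["a"] += 1
        elif v > -20.0:
            counts["b"] += 1
        elif v > -30.0:
            counts["c"] += 1
        else:
            counts["d"] += 1
    return counts


# Toy power spectra (linear power per frequency bin).
pa = [1.0, 1.0, 1.0, 1.0, 0.0]
pb = [0.5, 0.05, 0.005, 0.0001, 1.0]
hist = overlap_histogram(pa, pb)
```

With this toy data each class receives exactly one bin, and the last bin is skipped because source a has already been cut there.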
[0022]
In addition, even if weak power spectra interfere with each other, only a weak sound pressure
area is formed, which is not a problem. Furthermore, the above equation cannot compare a
weak power spectrum with a strong power spectrum. This problem is therefore solved by cutting
the weak power spectrum.
[0023]
The threshold is determined by the following steps.
[0024]
1. The background noise is extracted from the sound source and its power is determined.
[0025]
2. Check if there is a sound within background noise + α dB.
[0026]
3. Extract the high-power spectral components that interfere with each other.
[0027]
FIG. 4 shows the relationship between background noise and thresholds. The horizontal axis
represents the sound source number: 1 denotes sound source 1, 5 denotes sound source 5, and
10 denotes sound source 10. The vertical axis represents the sound presence rate. As an example,
for sound source 10, 80% or more of the sound is at or above the background noise level, and
only 5% is at or above background noise +40 dB. The sound sources in the figure are recordings
of five men and five women each reading a manuscript for 30 seconds. Male voices are assigned
odd numbers and female voices even numbers.
[0028]
Looking in the figure at the relationship between the background noise and the power spectra of
sound source 1, sound source 2 ... sound source 10, it can be seen that, for all voices, almost no
sound (5% or less) exists at background noise +40 dB or more. Based on a rule of thumb, the
sound extracted here is sound at background noise +30 dB or more. From the results of past
sound field measurements of the speaker array, it is known that the difference between the
sound pressure at the focal point and the average sound pressure at other locations is about 15
dB. Therefore, sound between the focal sound pressure and the focal sound pressure −15 dB
forms a high sound pressure area at a place other than the focal point. Sound at background
noise +30 dB or more is the marginal range in which sounds of the same frequency and the same
sound pressure overlap and affect that area. Therefore, the sound extracted by the threshold
here is sound at background noise +30 dB or more.
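The threshold steps and the presence rate plotted in FIG. 4 can be sketched as follows. The sketch assumes that the background noise power has already been estimated (the source does not say how) and that the spectrum is given as linear power per frequency bin. The function names and the toy numbers are hypothetical.

```python
import math


def db(power):
    """Convert linear power to decibels."""
    return 10.0 * math.log10(power)


def presence_rate(powers, noise_power, alpha_db):
    """Fraction of spectral bins at or above background noise + alpha dB."""
    limit = db(noise_power) + alpha_db
    hits = sum(1 for p in powers if p > 0.0 and db(p) >= limit)
    return hits / len(powers)


def extract_strong(powers, noise_power, alpha_db=30.0):
    """Cut (set to zero) every bin below background noise + alpha dB."""
    limit = db(noise_power) + alpha_db
    return [p if p > 0.0 and db(p) >= limit else 0.0 for p in powers]


# Toy data: noise power 1.0 (0 dB); bins at about 4.8, 20, 37 and 47 dB.
noise = 1.0
powers = [3.0, 100.0, 5000.0, 50000.0]
rate_0 = presence_rate(powers, noise, 0.0)
rate_40 = presence_rate(powers, noise, 40.0)
strong = extract_strong(powers, noise)  # default alpha: +30 dB, as in the text
```

Sweeping `alpha_db` over several values for each sound source would give curves of the kind described for FIG. 4, and `extract_strong` applies the +30 dB rule of thumb.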
[0029]
When performing FBS, if the FFT resolution is coarse, a problem occurs when the inverse FFT is
performed to return from the "frequency-spectrum" domain to the original "time-amplitude"
domain. For example, when the weak spectrum of a sound source is cut by FBS at an FFT
resolution of 172 Hz, sound spanning a 172 Hz band disappears. In this case, the result may
sound far from the original. On the other hand, if the FFT resolution is too fine, the sparsity of
the sound holds and the sounds become independent of each other in the "frequency-power
spectrum" domain. From the above, it can be understood that it is important to use an optimal
resolution when performing FBS. The optimal FFT resolution for FBS is determined from the
overlap rate of the sounds in the frequency domain.
[0030]
FIG. 5 shows the overlap rate of a male voice and a female voice when the FFT resolution is
changed. In the 0 to −10 dB range, where high power spectra overlap, the overlap rate hardly
changes as the FFT resolution is varied from 345 Hz to 10.8 Hz. At an FFT resolution of 5.4 Hz,
however, the overlap rate drops sharply. The same holds for −10 to −20 dB and −20 to −30 dB.
For overlaps of −30 dB or less, on the other hand, the overlap rate is highest at an FFT
resolution of 5.4 Hz and roughly the same at the other resolutions. This is because the sparsity
holds when the FFT resolution is 5.4 Hz. From the data in the figure, it can therefore be said that
an FFT resolution of 10.8 Hz is optimal.
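The resolutions quoted here follow the usual relation between FFT length and bin width, resolution = sample rate / FFT length. The source does not state the sample rate. As an assumption, 44.1 kHz with power-of-two FFT lengths reproduces the quoted values to within rounding, which the short sketch below illustrates.

```python
SAMPLE_RATE_HZ = 44_100  # ASSUMPTION: the sample rate is not given in the source


def fft_resolution_hz(fft_size, sample_rate_hz=SAMPLE_RATE_HZ):
    """Width of one FFT bin in Hz: sample rate divided by FFT length."""
    return sample_rate_hz / fft_size


for size in (128, 256, 4096, 8192):
    print(size, round(fft_resolution_hz(size), 1))
# 128 344.5
# 256 172.3
# 4096 10.8
# 8192 5.4
```

Under this assumption, moving from 10.8 Hz to 5.4 Hz corresponds to doubling the FFT length from 4096 to 8192 samples, that is, from roughly 93 ms to 186 ms of audio per frame.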
[0031]
FIG. 6 shows an example of high sound pressure area formation by the method of the present
invention, and FIG. 7 shows an example of high sound pressure area formation by the
conventional method (without FBS). In this example, the method of the present invention
improved the S/N ratio by a factor of 1.6 compared with the conventional method.