JP2003111185

DESCRIPTION JP2003111185
[0001]
TECHNICAL FIELD The present invention relates to an apparatus that, when the positions of a target
sound source and noise sources in a space are given, divides the space into a plurality of zones
using two or more microphones and collects the sound from the sound source in a desired zone
(the target sound source) independently of the other zones (the noise sources), and more
particularly to the arrangement of the microphones.
[0002]
Description of the Related Art One conventional zone-separation sound collection technique
exploits the following property of sound. Sound can be expressed as the sum of several frequency
components. Therefore, when a plurality of sounds are present at the same time, the signal input
to the microphone of each channel is divided into bands in which the frequency components from
the individual sound sources do not overlap on the frequency axis. Based on the arrival time
difference and the arrival level difference between channels, the zone from which each frequency
component originates is determined, and the components from the same zone are collected and
combined, so that the sound of each zone is picked up separately. (Reference: Japanese Patent
Application Laid-Open No. 10-313497 (Japanese Patent Application No. 9-252312) "Sound source
separation method, apparatus and recording medium")
[0003]
03-05-2019
1
However, in the conventional sound source separation technology, the arrangement of the
plurality of microphones constituting the sound collection means relative to the respective sound
sources is not specified. In addition, even where a plurality of sound collection means are
provided, the arrangement of the microphones is limited to two microphones arranged on the same
straight line as the sound source, or to three microphones arranged at the vertices of an
equilateral triangle. The technology is therefore disadvantageous in that the zones that can be
formed are limited to shapes such as arcs with a central angle of 360/n degrees (n being the
number of microphones).
[0004]
SUMMARY OF THE INVENTION In order to solve the above problems, the present invention changes the
arrangement of the plurality of sound-collecting microphones according to the objects to be
picked up (a target sound source and noise sources whose positions are given in advance), that
is, it defines the distances between each sound source and the plurality of microphones, and
thereby increases the degree of freedom of the zone shapes that can be formed.
The specific method of each means is described in the following embodiments.
[0005]
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS (Embodiment of the Invention of
Claim 1) FIG. 1 shows the configuration of a sound pickup apparatus which is an embodiment of
the invention of claim 1. The target sound source A is arranged closer to microphone 1 (1-1)
than to microphone 2 (1-2) (a2 > a1). The noise sources B, C, ..., on the other hand, are
arranged closer to microphone 2 than to microphone 1 (for example, b1 > b2, c1 > c2, ...). Here,
the positions of the target sound source A and the noise sources B, C, ... are preset, and a, b,
c, ... denote the distances between the respective sound sources and the respective microphones.
The sound signals SA(n), SB(n), SC(n), ... (n: time) from the target sound source A and the noise
sources B, C, ... are converted by microphones 1 and 2 into electrical signals (channel signals)
x1(n) and x2(n).
[0006]
The signals x1(n) and x2(n) from microphones 1 and 2, frequency-analyzed by the two band
division means, are expressed by equations (1) and (2), respectively.
$X_1(f) = |X_1(f)| \exp(j \arg X_1(f))$ (1)
$X_2(f) = |X_2(f)| \exp(j \arg X_2(f))$ (2)
The inter-channel time difference and the inter-channel level difference detected by the
band-wise inter-channel parameter value difference detection means 4 are defined as in equations
(3) and (4).
[0007]
$\Delta\phi(f) = \arg X_1(f) - \arg X_2(f)$ (3)
$\Delta A(f) = 20 \log_{10}(|X_1(f)| / |X_2(f)|)$ (4)
With the arrangement shown in the figure, Δφ(f) > 0 and ΔA(f) > 0 for the target sound source.
Conversely, Δφ(f) < 0 and ΔA(f) < 0 for a noise source. Therefore, using certain positive values
γ1 and γ2 and certain negative values γ3 and γ4, the sound source signal determination means 5
determines a frequency component satisfying
$\Delta\phi(f) > \gamma_1$ (5)
$\Delta A(f) > \gamma_2$ (6)
to be a frequency component of the target sound source, and a frequency component satisfying
$\Delta\phi(f) < \gamma_3$ (7)
$\Delta A(f) < \gamma_4$ (8)
to be a frequency component of a noise source. That is, based on the inter-channel parameter
differences, it is determined to which sound source each frequency component belongs. Here, each
γ is a threshold at which a desired (set) S/N (for example, 20 dB) can be obtained.
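The decision rule of equations (3) to (8) can be illustrated with a minimal Python sketch. The function name `classify_component`, the "undetermined" label for components that satisfy neither pair of conditions, and the wrapping of the phase difference into the principal range are assumptions made for illustration; the description does not specify them.

```python
import cmath
import math

def classify_component(x1, x2, g1, g2, g3, g4):
    """Classify one frequency component from its two channel spectra.

    x1, x2: complex spectral values X1(f) and X2(f), both assumed non-zero.
    g1, g2: positive thresholds for equations (5) and (6).
    g3, g4: negative thresholds for equations (7) and (8).
    """
    # Equation (3): arg X1(f) - arg X2(f), wrapped to the principal range
    # (the wrapping is an assumption of this sketch).
    dphi = cmath.phase(x1 * x2.conjugate())
    # Equation (4): inter-channel level difference in dB.
    da = 20.0 * math.log10(abs(x1) / abs(x2))
    if dphi > g1 and da > g2:
        return "target"        # equations (5) and (6) both hold
    if dphi < g3 and da < g4:
        return "noise"         # equations (7) and (8) both hold
    return "undetermined"      # hypothetical label, not in the description
```

The threshold values passed in would have to be tuned to reach the desired S/N; any concrete numbers used with this function are illustrative only.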
[0008]
The sound source signal selecting means 6 and the sound source signal synthesizing means 7 are
the same as in the prior art (see the above-mentioned publication). Based on the sound source
determination signal, the sound source signal selection means 6 selects, from the band-divided
channel signals, at least one signal input from the same sound source (target sound source A) and
outputs it as a selected sound source signal. The sound source signal synthesizing means 7
synthesizes the plurality of selected band signals and outputs the sound source signal ŜA(n) of
the target sound source A.
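As a rough illustration of selection followed by synthesis, the sketch below keeps the channel 1 bins judged to belong to the target and resynthesizes them. Several things here are assumptions rather than part of the description: a naive DFT stands in for the band division means, only the level difference of equation (4) is used for the decision, channel 1 is the channel that is kept, and the small constant `eps` guards against division by zero.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT, returning the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def extract_target(x1, x2, level_threshold_db, eps=1e-12):
    """Keep channel 1 bins whose level difference exceeds the threshold."""
    s1, s2 = dft(x1), dft(x2)
    selected = []
    for a, b in zip(s1, s2):
        level_db = 20.0 * math.log10((abs(a) + eps) / (abs(b) + eps))
        # Selection: pass the bin if judged "target", otherwise zero it.
        selected.append(a if level_db > level_threshold_db else 0j)
    # Synthesis: combine the selected bins back into a time signal.
    return idft(selected)

# Hypothetical test signals: a target tone (bin 2) and a noise tone (bin 5),
# each four times stronger at its nearer microphone.
n = 16
target = [math.cos(2 * math.pi * 2 * t / n) for t in range(n)]
noise = [math.cos(2 * math.pi * 5 * t / n) for t in range(n)]
x1 = [s + 0.25 * v for s, v in zip(target, noise)]
x2 = [0.25 * s + v for s, v in zip(target, noise)]
estimate = extract_target(x1, x2, 6.0)
```

In this toy setup the two tones occupy different bins, so `estimate` recovers the target tone as seen by microphone 1. Real speech overlaps in frequency far more than this, and a practical system would process short frames rather than a single block.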
[0009]
According to the above configuration, only sound in the vicinity of the target sound source can
be collected with high S/N. The noise sources may be arranged in any way, provided that the
conditions c1 > c2 and b1 > b2 described above are satisfied for each noise source. For example,
in desktop communication in an environment with noise sources, to pick up only the voice of the
speaker sitting in front of the desktop, the arrangement of FIG. 1 can be realized by placing
microphone 1 near the front of the display and microphone 2 behind the main speaker (for example,
on the back of the seat in which the speaker sits).
[0010]
(Embodiment 1 of the Invention of Claim 2) FIG. 2 shows the configuration of a sound collection
apparatus (arrangement example (part 1) of the microphones) which is Embodiment 1 of the
invention of claim 2. The differences from the invention of claim 1 are that two or more
microphones are arranged closer to the noise sources than to the target sound source, and that
signal addition means 11, which adds the output signals of the microphones arranged near the
noise sources, is added. As shown in FIG. 2, the number of microphones 3 to 6 (1-3 to 1-6)
placed near the noise sources may be equal to the number of noise sources or less than the number
of noise sources. In either case, they are arranged so that the relationships b1 > b3, c1 > c4,
d1 > d5, ... are maintained.
[0011]
With the microphones arranged in this way, the signal addition means 7 adds the outputs x3(n) to
x6(n) of microphones 3 to 6 (1-3 to 1-6) arranged near the noise sources, and supplies the
resulting signal x(n) to the band dividing means 3. The means following the band dividing means
are the same as in the embodiment of the invention of claim 1.
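The signal addition step amounts to a sample-by-sample sum of the noise-side microphone outputs. A minimal sketch follows, assuming the signals are equal-length lists of samples; the function name `add_signals` and the length check are hypothetical.

```python
def add_signals(channels):
    """Sum equal-length microphone signals sample by sample.

    channels: list of sample lists, e.g. [x3, x4, x5, x6].
    Returns the combined signal x(n) passed on to band division.
    """
    if not channels:
        raise ValueError("need at least one channel")
    length = len(channels[0])
    if any(len(c) != length for c in channels):
        raise ValueError("all channels must have the same length")
    # zip(*channels) groups the samples that share the same time index n.
    return [sum(samples) for samples in zip(*channels)]
```

The combined signal then plays the role that x2(n) plays in the two-microphone configuration. Whether the sum should be scaled, for example by the number of microphones, is not stated in the description and is left out here.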
[0012]
(Embodiment 2 of the Invention of Claim 2) Next, arrangement example (part 2) of the microphones
in the sound pickup apparatus according to Embodiment 2 of the invention of claim 2 will be
described. FIG. 3 shows an example of the arrangement of the microphones. In a car, for example,
the position from which each occupant speaks is fixed to a certain extent, so by arranging a
microphone near each speaker the voice from each seat can be picked up at high S/N. In FIG. 3,
microphone 1 (1-1), microphone 11 (1-11), and microphone 12 (1-12) are microphones for picking up
the voice from the driver's seat in preference to the other seats; at least one of these is used.
Microphone 1 is arranged, for example, on the ceiling at a position close to the driver's-seat
speaker. Microphone 11 is arranged, for example, near the steering wheel. Microphone 12 is
arranged, for example, near the window on the right side of the driver's seat.
[0013]
Similarly, microphone 2 (1-2), microphone 3 (1-3), and microphone 4 (1-4) can each be arranged as
shown in the figure as microphones mainly for picking up the voice from the passenger seat.
Microphone 2 is located on the ceiling at a position close to the front passenger-seat speaker.
Microphone 3 is arranged, for example, near the sun visor. The same applies to the rear seats.
For example, microphone 5 (1-5) is arranged near the backrest of the passenger seat, microphone 7
(1-7) is arranged on the ceiling near the mouth of the rear-left-seat speaker, and microphone 6
(1-6) is arranged near the window on the left side of the rear seat. The same applies to the
rear right seat. Among the microphones arranged as described above, using at least one microphone
arranged near the target sound source and at least one microphone arranged near the noise source
allows the target sound to be picked up with high S/N. The above configuration can also be
applied to noise other than human voices, such as car audio.
[0014]
(Embodiment of the Invention of Claim 3) FIGS. 4 and 5 show arrangement examples of the
microphones of the invention of claim 3. As shown in FIG. 4, at least two microphones are
arranged near the listener's ears (an arrangement satisfying a1 ≈ a2). The target zone is the
range in which the listener perceives a sound as coming from near the front, that is, a range in
which the difference in arrival time and the difference in arrival level of the signals cannot
be perceived. The sound source signal determination means 5 determines a frequency band in which
the absolute values of the inter-channel level difference and the inter-channel time difference
are less than or equal to certain values α1 and α2, respectively, to be a frequency component of
the target sound source. In this way, for example, only sound from near the front of the listener
can be picked up with high S/N. Here, each α is set to a threshold at which a predetermined S/N
(for example, 20 dB) can be obtained. As another arrangement example, as shown in FIG. 5, when
the target sound source is a speaker in front of a display, arranging the microphones at two
positions on the front of the display (an arrangement satisfying a1 ≈ a2) allows only the sound
source in front of the display to be picked up with high S/N.
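The front-zone decision can be sketched as a pair of absolute-value tests. In this minimal Python sketch the inter-channel time difference is represented by the phase difference of equation (3), and α1 is taken to bound the level difference and α2 the phase difference; both choices, along with the function name `is_front_component`, are assumptions for illustration.

```python
import cmath
import math

def is_front_component(x1, x2, alpha1_db, alpha2_rad):
    """Return True if a frequency component is judged to come from the front.

    x1, x2: complex spectral values of the two channels (assumed non-zero).
    alpha1_db: upper bound on the absolute level difference, in dB.
    alpha2_rad: upper bound on the absolute phase difference, in radians.
    """
    # Level difference as in equation (4).
    da = 20.0 * math.log10(abs(x1) / abs(x2))
    # Phase difference as in equation (3), wrapped to the principal range.
    dphi = cmath.phase(x1 * x2.conjugate())
    # A source with a1 close to a2 reaches both microphones almost equally.
    return abs(da) <= alpha1_db and abs(dphi) <= alpha2_rad
```

A component that is markedly louder or earlier in one channel fails at least one of the two tests and is treated as coming from outside the front zone.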
[0015]
(Example of the Invention of Claim 4) FIG. 6 shows arrangement example (part 1) of the
microphones when the invention of claim 4 is applied to a telephone, and FIG. 7 shows an example
of its use. With the arrangement shown in the figures, the inter-channel level difference of the
signals entering microphones 1 and 2 from the target sound source A becomes larger than the
inter-channel level difference of the signals entering microphones 1 and 2 from the noise source.
This is because the signal level attenuates in inverse proportion to the square of the distance,
so that in general a nearby sound source produces a larger level difference than a distant one.
This is shown below. As shown in FIG. 7, let the distance from the target sound source A to
microphone 1 be a1, the distance from the target sound source to microphone 2 be a2, the distance
from the noise source B to microphone 1 be b1, and the distance from the noise source to
microphone 2 be b2. Here, the following equations hold.
$a_1 - a_2 = y_1$ (9)
$b_1 - b_2 = y_2$ (10)
Thus, the inter-channel level difference of the target sound source depends on the distance
difference in equation (9), and the inter-channel level difference of the noise source depends on
the distance difference in equation (10). The squared distance ratios are
$a_1^2 / a_2^2 = (a_2 + y_1)^2 / a_2^2 = (1 + y_1 / a_2)^2$ (11)
$b_1^2 / b_2^2 = (b_2 + y_2)^2 / b_2^2 = (1 + y_2 / b_2)^2$ (12)
Here, if b2 >> a2, then y2/b2 << y1/a2, so the inter-channel level difference of the target sound
source becomes larger than that of the noise source. Therefore, the sound source signal
determination means 5 determines a frequency band in which the absolute value of the
inter-channel level difference is greater than or equal to a certain value β to be a frequency
component of the target sound source, so that the target sound source can be picked up with high
S/N. Here, β is a threshold at which a desired S/N (for example, 20 dB) can be obtained. This
concept can be applied not only to a handset but also, for example, to a headset. Examples are
shown in FIGS. 8 to 10. In FIG. 8, microphone 1 (1-1) is arranged near the mouth and microphone 2
(1-2) is arranged, for example, near the back of the head, so that the inter-channel level
difference of the target sound source is increased. The condition is that microphone 2 is placed
farther from the mouth than microphone 1.
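A short numerical sketch makes the argument of equations (9) to (12) concrete. It assumes an ideal point source whose power falls off with the square of the distance. The distances, the 6 dB value of β, and the function names are all hypothetical values chosen for illustration.

```python
import math

def level_difference_db(d1, d2):
    """Absolute inter-channel level difference in dB.

    d1, d2: distances from the source to microphones 1 and 2.
    Uses the squared distance ratio of equations (11) and (12).
    """
    return abs(10.0 * math.log10((d1 / d2) ** 2))

def is_target_band(d1, d2, beta_db):
    """Apply the threshold test: level difference of at least beta."""
    return level_difference_db(d1, d2) >= beta_db

# Hypothetical geometry: both sources see the same microphone spacing
# (y1 = y2 = 0.10 m), but the noise source is much farther away (b2 >> a2).
a1, a2 = 0.15, 0.05   # target source distances in metres
b1, b2 = 2.10, 2.00   # noise source distances in metres
beta = 6.0            # hypothetical threshold in dB

target_db = level_difference_db(a1, a2)   # about 9.5 dB
noise_db = level_difference_db(b1, b2)    # about 0.4 dB
```

With these numbers the target clears the threshold and the noise source does not, even though y1 and y2 are equal, which is the point of the condition b2 >> a2.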
[0016]
In the case of FIG. 9, the concept is the same: microphone 1 (1-1) may be mounted near the mouth
of the target sound source A, and microphone 2 (1-2) may be attached to the body, for example as
a tie-pin microphone.
[0017]
In the case of FIG. 10, for example, the voice of the speaker on the stage is the target sound,
and the voices of the audience listening to the lecture and other noises are the noise sources.
Microphone 1 (1-1) is placed, for example, on the desktop near the platform, and microphone 2
(1-2) is placed, for example, on the ceiling relatively near the audience.
[0018]
According to the present invention, when the positions of the target sound source and the noise
sources are given in advance, changing the arrangement of the microphones according to the
objects to be picked up increases the degree of freedom of the zone shapes that can be formed,
and the sound signal of the target sound source can be separated with high S/N.
[0019]
Brief description of the drawings
[0020]
FIG. 1 is a block diagram of a sound collection device according to an embodiment of the
invention of claim 1.
[0021]
FIG. 2 is a configuration diagram of a sound collection device (arrangement example (part 1) of
the microphones) according to Embodiment 1 of the invention of claim 2.
[0022]
FIG. 3 is a diagram showing arrangement example (part 2) of the microphones of the invention of
claim 2.
[0023]
FIG. 4 is a diagram showing arrangement example (part 1) of the microphones of the invention of
claim 3.
[0024]
FIG. 5 is a diagram showing arrangement example (part 2) of the microphones of the invention of
claim 3.
[0025]
FIG. 6 is a diagram showing arrangement example (part 1) of the microphones when the invention of
claim 4 is applied to a telephone.
[0026]
FIG. 7 is a view for explaining an example of use of the telephone set shown in FIG. 6.
[0027]
FIG. 8 is a diagram showing arrangement example (part 2) of the microphones of the invention of
claim 4.
[0028]
FIG. 9 is a diagram showing arrangement example (part 3) of the microphones of the invention of
claim 4.
[0029]
FIG. 10 is a diagram showing arrangement example (part 4) of the microphones of the invention of
claim 4.
[0030]
Explanation of reference signs
[0031]
Reference Signs List
1 microphone
2 band dividing means
3 band-wise inter-channel parameter value difference detecting means
4 sound source signal judging means
5 sound source signal selecting means
6 sound source signal synthesizing means
7 signal adding means