Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2009239500
A microphone device capable of reducing the buffers between the microphones and the delay units is provided. Second microphones (14a, 14b) for determining the sound source direction are provided separately from the first microphones (10a, 10b) used to generate the output signal, and are placed at positions closer to the sound source A than the first microphones (10a, 10b). The direction of the sound source is then determined before the sound from the sound source A reaches the first microphones 10a and 10b, and the delay times of the delay units 12a and 12b are set based on the determination result. [Selected figure] Figure 1
Microphone device
[0001]
The present invention relates to a microphone device. More specifically, the present invention
relates to a delay-and-sum array type microphone device with enhanced sensitivity to a sound
source.
[0002]
2. Description of the Related Art A voice recognition device that recognizes speech content by collecting human speech with a microphone, and a teleconferencing device that enables hands-free conversation between remote sites using a microphone and a loudspeaker, are widely known.
04-05-2019
1
[0003]
In such an apparatus for processing voice, it is desirable to receive the voice of the speaker with
high quality even when the microphone as the sound collector is at a distance from the speaker.
[0004]
Therefore, a delay-and-sum array is known as a technique for receiving the voice of the speaker with high sound quality even with a microphone at a position distant from the speaker (see, for example, Patent Document 1).
In this technology, the sound emitted from a sound source is collected by each of a plurality of microphones, and the acoustic signals obtained by these microphones are each delayed by a delay amount based on the direction of the sound source so as to be in phase, and then added. The technique thereby emphasizes the sound arriving from the direction of the sound source and enhances the sensitivity to the sound source.
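As a rough sketch of the delay-and-sum principle just described (NumPy-based; the signal shapes and integer-sample delays are simplifying assumptions, since practical implementations must also handle fractional delays):

```python
import numpy as np

def delay_and_sum(signals, delays):
    """Delay each microphone signal by its per-channel delay in samples,
    then add them, so sound from the steered direction adds in phase."""
    n = min(len(s) for s in signals)
    out = np.zeros(n)
    for sig, d in zip(signals, delays):
        # prepend d zeros to delay this channel by d samples
        out += np.concatenate([np.zeros(d), sig])[:n]
    return out
```

Delaying the early-arriving channel by the inter-microphone lag roughly doubles the in-phase amplitude, while summing without delay compensation partially cancels the signal.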
[0005]
FIG. 6 shows the schematic configuration of a conventional delay-and-sum type microphone device.
As shown in the figure, the conventional microphone device 100 includes microphones 101a and 101b, analog-to-digital converters (A/D) 102a and 102b, FIFOs (First In First Out) 103a and 103b, buffers (BUF) 104a and 104b, delay units 105a and 105b, an adder 106, a sound source direction determination unit 110, and a delay time setting unit 111. In addition, the sound source direction determination unit 110 is provided with memories (MEM) 120a and 120b and a determination unit 121. In the following, either of the microphones 101a and 101b may be referred to simply as the microphone 101, and either of the delay units 105a and 105b as the delay unit 105.
[0006]
The sound emitted from the sound source A is collected by the microphones 101a and 101b, and these microphones 101a and 101b convert it into electrical analog signals (hereinafter referred to as "analog acoustic signals") according to the sound collection level. Each analog acoustic signal is converted into a digital signal (hereinafter referred to as a "digital acoustic signal") by the analog-to-digital converters 102a and 102b, and is input to the buffers 104a and 104b and to the memories 120a and 120b of the sound source direction determination unit 110 via the FIFOs 103a and 103b.
[0007]
The determination unit 121 of the sound source direction determination unit 110 detects the phase difference (time shift) between the signals output from the microphones 101a and 101b based on the information of the digital acoustic signals stored in the memories 120a and 120b, and determines the direction of the sound source A (the direction of the sound source A with respect to the microphones 101a and 101b) from this phase difference. The memories 120a and 120b have storage capacities sufficient to store as much digital acoustic signal information as the determination unit 121 needs to detect the phase difference.
[0008]
The delay time setting unit 111 determines the delay time of each of the delay units 105a and 105b based on the direction of the sound source A determined by the sound source direction determination unit 110, and sets the determined delay times in the delay units 105a and 105b. As a result, delay processing is performed to compensate for the time difference with which the sound emitted from the sound source A reaches the microphone 101a and the microphone 101b. That is, the digital acoustic signal corresponding to the microphone 101 at which the sound arrives earlier is delayed by the delay unit 105 by that time difference. As a result, the digital acoustic signals corresponding to the microphones 101a and 101b are brought into phase.
[0009]
Then, the adding unit 106 adds the digital acoustic signals brought into phase via the delay units 105a and 105b, so that an emphasis signal emphasizing the sound arriving from the direction of the sound source A is output.
[Patent Document 1] JP 2001-313992 A
[0010]
However, in the above-mentioned conventional delay-and-sum type microphone device 100, the sound source direction judgment processing by the sound source direction judgment unit 110 takes time, and the information of the digital acoustic signals must be accumulated in the buffers 104a and 104b during that time.
[0011]
This is because, in order not to reduce the sensitivity even when the direction of the sound source A changes, the sound signals obtained by the microphones 101 must be used at the same timing as the sound signals used for the determination by the sound source direction determination unit 110.
[0012]
However, this requires a large storage capacity for the buffers 104a and 104b, which poses problems for miniaturization and cost.
[0013]
An object of the present invention is to provide a delay-and-sum array type microphone device
capable of reducing a buffer between a microphone and a delay unit, and a speech recognition
device provided with the same.
[0014]
In order to achieve the above object, the invention according to claim 1 provides a microphone device comprising: a plurality of first microphones for collecting the sound emitted from a sound source; a plurality of delay units capable of delaying the sound signals output from the plurality of first microphones by mutually independent delay times; an addition unit that adds and outputs the delay signals output from the plurality of delay units; a plurality of second microphones, disposed at positions closer to the sound source than the plurality of first microphones, for collecting the sound emitted from the sound source; a sound source direction determination unit that determines the direction of the sound source based on the acoustic signals obtained by the plurality of second microphones; and a delay time setting unit configured to determine the delay time of each of the plurality of delay units based on the direction of the sound source determined by the sound source direction determination unit, and to set the determined delay times in the plurality of delay units.
[0015]
According to a second aspect of the present invention, in the first aspect, the plurality of first microphones are disposed on a first straight line, the plurality of second microphones are disposed on a second straight line, and the first straight line and the second straight line may be parallel.
[0016]
The invention according to claim 3 is the invention according to claim 1 or 2, characterized in that the plurality of first microphones are configured by two microphones and the plurality of second microphones are configured by two microphones.
[0017]
The invention according to claim 4 is characterized in that, in the invention according to claim 3,
an interval between the second microphones is made larger than an interval between the first
microphones.
[0018]
The invention according to claim 5 is a voice recognition apparatus comprising the microphone device according to any one of claims 1 to 4 and a voice recognition unit for performing voice recognition based on the output signal from the microphone device.
[0019]
According to the first aspect of the present invention, it is possible to provide a delay-and-sum
array type microphone device that can reduce the number of buffers between the microphone
and the delay unit.
[0020]
Further, according to the second aspect of the invention, since the straight line connecting the second microphones and the straight line connecting the first microphones are parallel to each other, the sound source direction with respect to the plurality of first microphones and the sound source direction with respect to the plurality of second microphones are the same, and conversion of the delay time becomes easy.
[0021]
Further, according to the third aspect of the present invention, since the number of the first microphones and the number of the second microphones are two each, the sensitivity to the sound source can be enhanced with a simple configuration.
[0022]
Further, according to the fourth aspect of the present invention, since the distance between the
second microphones is larger than the distance between the first microphones, the directivity can
be broadened.
[0023]
According to the fifth aspect of the present invention, it is possible to provide a speech
recognition apparatus provided with a delay-and-sum array type microphone device that can
reduce the number of buffers between the microphone and the delay unit.
[0024]
Hereinafter, an embodiment of a microphone device according to the present invention and a
voice recognition device provided with the same will be described.
[0025]
[1. Overview of Microphone Device] The microphone device in the present embodiment includes a plurality of first microphones for collecting the sound emitted from a sound source, a plurality of delay units capable of delaying the acoustic signals output from these first microphones by mutually independent delay times, and an addition unit that adds and outputs the delay signals output from the delay units.
[0026]
The delay time setting unit is configured to determine the delay time of each of the plurality of
delay units based on the direction of the sound source, and to set the determined delay time to
the plurality of delay units.
The delay time set by the delay time setting unit is set based on the shift in the time at which the sound emitted from the sound source A reaches each microphone (hereinafter, "time shift").
For example, when there are two microphones as the first microphones and there is a time lag Δta between these microphones in the arrival of the sound from the sound source, the delay time Δta is set in the delay unit for the microphone at which the sound from the sound source arrives earlier.
As a result, the digital acoustic signals corresponding to the analog acoustic signals output from these microphones are brought into phase and output from the delay units.
Then, the digital acoustic signals thus brought into phase are added by the adding section, and a signal in which the sound emitted from the sound source is emphasized (hereinafter referred to as the "emphasis signal") is output.
[0027]
Moreover, the microphone device according to the present embodiment includes a plurality of second microphones, disposed at positions closer to the sound source than the plurality of first microphones, for collecting the sound emitted from the sound source, and a sound source direction determination unit that determines the direction of the sound source based on the acoustic signals obtained by the plurality of second microphones.
[0028]
Therefore, a specific sound emitted from the sound source (hereinafter referred to as "sound B") is collected by the plurality of second microphones before it reaches the plurality of first microphones.
Based on the sound B collected in this manner, the sound source direction determination unit determines the direction of the sound source at the timing when the sound B was emitted.
[0029]
As a result, the direction of the sound source at the timing when the sound B was emitted can be determined before the sound B from the sound source reaches the plurality of first microphones; the voice of the speaker is emphasized and output, and the buffers between the first microphones and the delay units can be reduced.
[0030]
When the first microphones and the second microphones cannot be arranged sufficiently far apart, or depending on the direction of the sound source, the sound emitted from the sound source after the sound B (hereinafter referred to as "sound C") may already have been collected by the first microphones by the time the direction of the sound source is specified from the sound B collected by the second microphones.
In this case, the emphasis signal for the sound C is output based on the direction of the sound source at the timing when the sound B was emitted.
Even so, the emphasis signal can be output for a direction closer to that of the sound source emitting the sound collected by the first microphones than when the first microphones and the second microphones are arranged in a single straight line and the same processing is performed.
[0031]
The microphone device can be used in various voice processing devices such as a voice
recognition device and a teleconference device.
[0032]
[2. Specific Example of Microphone Device] Next, a specific example of the microphone device in the present embodiment will be described with reference to the drawings.
FIG. 1 is a block diagram of the microphone device in the present embodiment, FIG. 2 is a diagram showing the positional relationship of the first microphones and the second microphones with respect to the sound source, FIG. 3 is a diagram for explaining the maximum detection range for the sound source, and FIG. 4 is a diagram showing the sound source direction and the positional relationship of the first microphones and the second microphones.
[0033]
As shown in FIG. 1, the microphone device 1 of this embodiment includes first microphones 10a and 10b, analog/digital converters (A/D) 11a and 11b, delay units 12a and 12b, and an adder 13.
[0034]
The first microphones 10a and 10b are arranged at a predetermined interval, collect the sound emitted from the sound source A, and convert it into electrical analog signals (hereinafter referred to as "analog acoustic signals") S1a and S1b.
[0035]
The analog acoustic signals S1a and S1b output from the first microphones 10a and 10b are converted into digital signals (hereinafter referred to as "digital acoustic signals") S2a and S2b by the analog/digital converters 11a and 11b and output.
[0036]
Here, the microphone device 1 targets the voice of a speaker as the sound emitted by the sound source A, and the analog/digital converters 11a and 11b sample the analog acoustic signals S1a and S1b at, for example, 44.1 kHz to generate the digital acoustic signals S2a and S2b.
[0037]
The digital acoustic signals S2a and S2b generated in this manner are input to the delay units 12a and 12b, respectively.
The delay units 12a and 12b are delay units capable of delaying with mutually independent delay times, and are configured of ring buffers or the like.
The delay units 12a and 12b delay the digital acoustic signals S2a and S2b by the set delay times and output them as delayed signals (hereinafter referred to as "delay signals") S3a and S3b.
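The text notes that the delay units are configured of ring buffers or the like. A minimal sketch of such a delay unit for an integer number of samples (the class and parameter names are illustrative, not from the patent):

```python
class RingBufferDelay:
    """Delay a sample stream by a fixed number of samples using a ring
    buffer, roughly as a delay unit such as 12a or 12b might be realized."""

    def __init__(self, delay_samples, capacity=1024):
        assert 0 <= delay_samples < capacity
        self.buf = [0.0] * capacity
        self.capacity = capacity
        self.delay = delay_samples
        self.write = 0

    def process(self, sample):
        # store the newest sample, then read the one written `delay` steps ago
        self.buf[self.write] = sample
        out = self.buf[(self.write - self.delay) % self.capacity]
        self.write = (self.write + 1) % self.capacity
        return out
```

A delay of 3 maps the input 1, 2, 3, 4, 5 to 0, 0, 0, 1, 2: each sample re-emerges three sampling periods later.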
[0038]
For example, suppose that the sound emitted from the sound source A reaches one first microphone 10a and then reaches the other first microphone 10b a time Δta later. In this case, the delay time Δta is set in the delay unit 12a, and the delay time 0 is set in the delay unit 12b.
When these delay times are set in the delay units 12a and 12b, the digital acoustic signal S2a is delayed by the delay time Δta by the delay unit 12a and output as the delay signal S3a, and the digital acoustic signal S2b is output by the delay unit 12b as the delay signal S3b without being delayed.
[0039]
Therefore, when the times at which the sound emitted from the sound source A reaches the first microphones 10a and 10b are shifted, the time shifts are adjusted by the delay units 12a and 12b, and the digital acoustic signals S2a and S2b, which have a phase difference, are brought into phase and output as the delay signals S3a and S3b.
[0040]
The delay signals S3a and S3b are input to the adder 13, added, and output.
As described above, since the delay signals S3a and S3b are in phase with respect to the sound emitted from the sound source A and there is no phase shift, adding these delay signals S3a and S3b generates a signal (hereinafter referred to as the "emphasis signal") S4 in which the sound emitted from the sound source A is emphasized.
[0041]
Here, in order to set the delay times in the delay units 12a and 12b, the microphone device 1 of the present embodiment includes second microphones 14a and 14b, analog/digital converters (A/D) 15a and 15b, FIFOs (First In First Out) 16a and 16b, a sound source direction determination unit 17, and a delay time setting unit 18.
[0042]
In the above-described conventional microphone device 100, a microphone for collecting a
sound from a sound source to generate an enhancement signal and a microphone for collecting a
sound from a sound source to determine a sound source direction are the same microphones
101a and 101b. However, in the microphone device 1 of this embodiment, separate microphones
are used.
[0043]
That is, separately from the first microphones 10a and 10b, which collect the sound from the sound source A to generate the emphasis signal S4, second microphones 14a and 14b are provided that collect the sound from the sound source A to determine the sound source direction θ.
[0044]
FIG. 2 is a view showing the positional relationship of the first microphones 10a and 10b and the second microphones 14a and 14b with respect to the sound source A. As shown in the figure, the second microphones 14a and 14b are disposed at positions closer to the sound source A than the first microphones 10a and 10b, so that the sound emitted by the sound source A arrives at them earlier than at the first microphones 10a and 10b.
[0045]
Therefore, before the sound from the sound source A reaches the first microphones 10a and 10b (in the example shown in FIG. 2, before it reaches the wavefront position a3), the sound from the sound source A is collected by the second microphones 14a and 14b at the wavefront positions a1 and a2, respectively.
The time shift of the sound collected in this manner between the second microphones 14a and 14b (the time for the sound from the sound source A to travel from the wavefront position a1 to the wavefront position a2) is detected by the sound source direction determination unit 17, whereby the sound source direction θ is determined.
[0046]
As a result, the direction of the sound source A can be determined before the sound from the sound source A reaches the first microphones 10a and 10b, and the buffers between the first microphones 10a and 10b and the delay units 12a and 12b can be reduced.
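The timing margin gained by placing the second microphones closer to the source can be estimated from the extra distance the sound still has to travel; a sketch (the 340 m/s speed of sound appears later in the text, while the function name and example distance are illustrative assumptions):

```python
SPEED_OF_SOUND = 340.0  # m/s, as assumed elsewhere in the text

def direction_judgment_budget_ms(distance_advantage_m):
    """Time available for the sound source direction judgment: the interval
    between the wavefront passing the second microphones and reaching the
    first microphones, given the extra travel distance in metres."""
    return distance_advantage_m / SPEED_OF_SOUND * 1000.0
```

For example, second microphones placed 34 cm closer to the source leave about 1 ms for the direction judgment before the sound reaches the first microphones, which is the buffering the conventional device had to absorb.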
[0047]
Here, suppose that the straight line connecting the second microphones 14a and 14b and the straight line connecting the first microphones 10a and 10b are parallel to each other, and that the interval between the second microphones 14a and 14b is larger than the interval between the first microphones 10a and 10b.
In this case, as shown in FIG. 3, if the angle formed by the line connecting the second microphone 14a and the first microphone 10b and the line connecting the second microphones 14a and 14b is α, the maximum range of sound source directions from which the sound reaches the second microphones 14a and 14b earlier than the first microphones 10a and 10b is π − 2α.
[0048]
By arranging the first microphones 10a and 10b and the second microphones 14a and 14b in this manner, the second microphones 14a and 14b can collect the sound from the sound source A before it reaches the first microphones 10a and 10b.
[0049]
Here, the second microphones 14a and 14b are arranged at a predetermined interval, collect the sound emitted from the sound source A, convert it into analog acoustic signals S5a and S5b, and output them.
The analog acoustic signals S5a and S5b output from the second microphones 14a and 14b are converted into digital acoustic signals S6a and S6b by the analog/digital converters 15a and 15b, respectively, and output.
[0050]
The digital acoustic signals S6a and S6b generated in this manner are sequentially input to the
sound source direction determination unit 17 as digital acoustic signals S7a and S7b through the
FIFOs 16a and 16b.
The FIFOs 16a and 16b are provided to adjust the difference between the operation timings of
the analog / digital converters 15a and 15b and the sound source direction determination unit
17.
[0051]
The sound source direction determination unit 17 includes the memories (MEMs) 20a and 20b
and the determination unit 21, and determines the sound source direction θ.
[0052]
The memories 20a and 20b store at least a predetermined number N (for example, 256) of items of signal-level information of the most recent digital acoustic signals output from the analog/digital converters 15a and 15b; the digital acoustic signals output from the analog/digital converters 15a and 15b are sequentially stored via the FIFOs 16a and 16b.
[0053]
The interval between the second microphones 14a and 14b and the speed of sound are known, and the determination unit 21 of the sound source direction determination unit 17 determines the sound source direction θ based on this information and the digital acoustic signals S7a and S7b.
[0054]
Hereinafter, the determination process of the sound source direction θ by the determination unit
21 of the sound source direction determination unit 17 will be specifically described.
[0055]
Let the signal level of the digital acoustic signal S7a on the second microphone 14a side be X1(i), and the signal level of the digital acoustic signal S7b on the second microphone 14b side be X2(i). Then the time lag τ with which the two second microphones 14a and 14b collect the sound from the sound source A can be derived from the following equations (1) and (2).
Note that 0 ≤ j ≤ N−1 (j is an integer) and 0 ≤ i ≤ N−1 (i is an integer); the most recent digital acoustic signal corresponds to i = 0, j = 0, and the oldest of the N digital acoustic signals stored in the memories 20a and 20b corresponds to i = N−1, j = N−1.
[0056]
R_X1X2(j) = Σ_{i=0}^{N−1} X1(i) · X2(i + j)   …(1)
[0057]
R_X1X2(γ) = max{R_X1X2(0), …, R_X1X2(N−1)}   …(2)
[0058]
First, the determination unit 21 performs the calculation according to the above equation (1).
That is, the determination unit 21 takes a given X1(i) from the memory 20a and takes X2(0) to X2(N−1) from the memory 20b.
Then, the determination unit 21 calculates the sum of the products of X2(0) to X2(N−1) with the given X1(i).
The determination unit 21 performs this process for all of X1(0) to X1(N−1) stored in the memory 20a.
[0059]
Next, as shown in equation (2), the determination unit 21 determines the largest value among RX1X2(0) to RX1X2(N−1) (hereinafter referred to as the maximum value RX1X2(γ)).
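The search for γ described above can be sketched as a brute-force cross-correlation over the N stored samples. The circular indexing of X2 is an assumption made here so that the sketch stays self-contained; the patent's equations (1) and (2) may window the sums differently:

```python
import numpy as np

def find_gamma(x1, x2):
    """Compute R(j) = sum_i x1[i] * x2[(i + j) % N] for j = 0..N-1 and
    return the lag gamma maximizing it (cf. equations (1) and (2))."""
    n = len(x1)
    r = [sum(x1[i] * x2[(i + j) % n] for i in range(n)) for j in range(n)]
    return int(np.argmax(r))
```

When x2 is a copy of x1 shifted by five samples, the correlation peaks at γ = 5, the number of sampling periods by which one microphone's signal lags the other.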
[0060]
Here, assuming that the sampling frequency of the analog/digital converters 15a and 15b is 44.1 kHz, one sampling period is 22.676 μs.
Meanwhile, the determined γ is a time shift expressed as a number of samplings.
Therefore, the determination unit 21 of the sound source direction determination unit 17 detects the time lag τ by performing the calculation of the following equation (3).
[0061]
τ = γ / 44100 [s] = γ × 22.676 μs   …(3)
[0062]
Next, the determination unit 21 calculates the positional deviation D (see FIG. 4) between the second microphones 14a and 14b with respect to the sound source A.
When the speed of sound is c, the positional deviation D is obtained by multiplying the time lag τ by the speed of sound c, as shown in the following equation (4), and the determination unit 21 performs the calculation based on this equation (4).
[0063]
D = c × τ   …(4)
[0064]
Next, the determination unit 21 determines the sound source direction θ.
The relationship between the sound source direction θ, the positional deviation D, and the interval L0 between the second microphones 14a and 14b is shown in the following equation (5), and the determination unit 21 performs the calculation based on equation (5).
[0065]
sin θ = D / L0   …(5)
[0066]
As described above, the determination unit 21 determines the sound source direction θ based on the outputs of the second microphones 14a and 14b, and the information on the sound source direction θ is notified to the delay time setting unit 18.
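Equations (3) to (5) chain together as follows. This sketch assumes the 44.1 kHz sampling rate and 340 m/s speed of sound used elsewhere in the text; the function and variable names are illustrative:

```python
import math

FS = 44100.0  # sampling frequency of A/D converters 15a, 15b (Hz)
C = 340.0     # speed of sound (m/s)

def sound_source_direction_deg(gamma, l0):
    """From a time shift of gamma samplings, compute the time lag tau (3),
    the positional deviation D = c * tau (4), and the sound source direction
    theta = arcsin(D / L0) (5), for a microphone interval l0 in metres."""
    tau = gamma / FS   # (3): 22.676 us per sampling
    d = C * tau        # (4)
    return math.degrees(math.asin(d / l0))  # (5)
```

With γ = 0 the source is directly in front (θ = 0°); for an interval l0 = 0.2 m, a lag of 13 samplings gives θ of roughly 30°.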
[0067]
The delay time setting unit 18 holds a delay amount table, and determines the delay times for the delay units 12a and 12b based on the information on the sound source direction θ notified from the sound source direction determination unit 17.
The delay amount table associates delay times for the delay units 12a and 12b with respective values of the sound source direction θ; in it, delay times based on the positional deviation Diff, calculated according to the following equations (6) to (8), are set.
The positional deviation Diff is the positional deviation between the first microphones 10a and 10b with respect to the sound source A, as shown in FIG. 4.
[0068]
Diff = L1 × sin θ   …(6)
[0069]
Δd = c / fs   …(7)
[0070]
n = Diff / Δd   …(8)
[0071]
Here, the line connecting the first microphones 10a and 10b and the line connecting the second microphones 14a and 14b are parallel; when the sound source direction θ is +30° and the interval L1 is 10 cm, the positional deviation Diff is 5 cm.
In addition, assuming that the sampling frequency of the analog/digital converters 11a and 11b is 44.1 kHz and the speed of sound is 340 m/s, one sampling period is 22.676 μs, and the distance the sound from the sound source A travels in one sampling period is 7.710 mm.
Therefore, the delay time is 5 / 0.771 = 6.4 sampling periods.
[0072]
At this time, the delay time setting unit 18 sets 6.4 sampling periods as the delay time in the delay unit 12a, which delays the digital acoustic signal S2a on the side of the first microphone 10a, where the sound from the sound source A arrives earlier.
On the other hand, 0 sampling periods is set as the delay time of the delay unit 12b, which delays the digital acoustic signal S2b on the first microphone 10b side.
[0073]
Then, the delay units 12a and 12b delay the digital acoustic signals S2a and S2b according to the delay times set in this way, and output them to the adder 13 as the delay signals S3a and S3b.
The delay signals S3a and S3b are in phase with each other with respect to the sound emitted from the sound source A, and by adding the delay signals S3a and S3b in the adder 13, the emphasis signal S4, in which the sound from the sound source A is emphasized, is generated.
[0074]
It is desirable that the plurality of first microphones 10a and 10b be disposed on a first straight line, that the plurality of second microphones 14a and 14b be disposed on a second straight line, and that the first straight line and the second straight line be parallel to each other. For example, as shown in FIG. 4, the straight line connecting the second microphones 14a and 14b and the straight line connecting the first microphones 10a and 10b are parallel to each other. By doing this, the sound source direction θ for the first microphones 10a and 10b and the sound source direction θ for the second microphones 14a and 14b become the same, so conversion of the delay time becomes easy.
[0075]
Moreover, it is desirable to make the interval between the second microphones 14a and 14b larger than the interval between the first microphones 10a and 10b. When the interval between the second microphones 14a and 14b is smaller than the interval between the first microphones 10a and 10b, the directivity becomes narrow.
[0076]
When wide directivity is required, the interval between the second microphones 14a and 14b is made larger than the interval between the first microphones 10a and 10b, and the distance between the line formed by the first microphones 10a and 10b and the line formed by the second microphones 14a and 14b is increased.
[0077]
Further, in the above description, since the number of first microphones and the number of second microphones are two each, a microphone device in which the sensitivity to the sound source A is enhanced can be manufactured with a simple configuration.
[0078]
Here, suppose that the straight line connecting the second microphones 14a and 14b and the straight line connecting the first microphones 10a and 10b are parallel to each other, and that the interval between the second microphones 14a and 14b is larger than the interval between the first microphones 10a and 10b.
In this case, as shown in FIG. 3, let α be the angle formed by the line connecting the second microphone 14a and the first microphone 10b and the line connecting the second microphones 14a and 14b, and let β be the angle formed by the line connecting the second microphone 14a and the first microphone 10a and the line connecting the second microphones 14a and 14b. Then the sound source direction θ, the angle α, and the angle β are related by the following equations (9) and (10).
Therefore, in order that the sound from the sound source direction θ reaches the second microphones 14a and 14b earlier than the first microphones 10a and 10b, the angle α and the angle β are changed so as to satisfy equations (9) and (10).
[0079]
[0080]
[0081]
Although the number of first microphones 10a and 10b here is two, a larger number of microphones may be used.
By providing a large number of first microphones, a large number of acoustic signals can be added, and an emphasis signal in which the sound from the sound source direction θ is further emphasized can be output.
Similarly, the second microphones 14a and 14b are not limited to two; a larger number of microphones may be used. Thereby, not only a planar wavefront but also a three-dimensional wavefront can be accommodated.
[0082]
As described above, in the microphone device 1 according to the present embodiment, the plurality of second microphones are provided at positions closer to the sound source than the plurality of first microphones used to output the emphasis signal, and the direction of the sound source is determined based on the acoustic signals obtained by the second microphones. Therefore, the direction of the sound source can be determined before the sound from the sound source reaches the plurality of first microphones, and the buffers between the first microphones and the delay units can be reduced.
[0083]
[3. Example of Device to which Microphone Device is Applied] As an example of a device to which the above-described microphone device 1 is applied, a voice interaction device provided with a voice recognition device will be described with reference to the drawings. FIG. 5 is a block diagram of a voice interaction device provided with a voice recognition device. The voice interaction device is a device that provides the information or service requested by the user by interacting with the user by voice.
[0084]
As shown in FIG. 5, the voice interaction device 30 includes a control unit 41, a storage unit 42, a decoder unit 43, an image processing unit 44, a display device 45, an audio processing unit 46, a speaker 47, an input I/F (interface) unit 48, an input operation unit 49, and a voice recognition device 50. The control unit 41, the storage unit 42, the decoder unit 43, the input I/F unit 48, and the voice recognition device 50 are mutually connected via the system bus 51.
[0085]
The control unit 41 includes a central processing unit (CPU), a read only memory (ROM), and a random access memory (RAM), and controls the entire voice interaction device 30.
[0086]
The storage unit 42 is configured by a hard disk drive or the like, and stores a dialogue scenario
or the like for interacting with the user.
[0087]
The decoder unit 43 decodes image data and audio data based on the dialogue scenario stored in
the storage unit 42.
The image data decoded by the decoder unit 43 is converted into information that can be
displayed on the display device 45 by the image processing unit 44, and displayed on the display
device 45.
Further, the audio data decoded by the decoder unit 43 is converted by the audio processing unit
46 into information that can be output as a sound wave by the speaker 47, and is output from
the speaker 47.
[0088]
The input I / F unit 48 detects the user's operation on the input operation unit 49 and notifies
the control unit 41 of the operation. The control unit 41 performs processing according to this
input operation.
[0089]
The voice recognition device 50 is a device for recognizing the voice uttered by the user, and includes a microphone device 60 and a voice recognition unit 61 that performs voice recognition based on the output signal from the microphone device 60. The operation of the voice recognition device 50 is controlled by the control unit 41. The speech content of the user recognized by the voice recognition device 50 in the operating state is notified to the control unit 41 as character information.
[0090]
By applying the above-described microphone device 1 as the microphone device 60 used in the voice recognition device 50, a signal that corresponds to the voice uttered by the user, and in which the drop in sensitivity is suppressed even when the direction of the sound source A fluctuates, can be input to the voice recognition unit 61. The rate at which the user's voice is correctly recognized can therefore be increased, and a voice recognition device with a high recognition rate can be realized.
[0091]
In the voice interaction device 30, the control unit 41 presents information based on the dialogue scenario stored in the storage unit 42 to the user via the display device 45 or the speaker 47, the voice uttered by the user in response to the presented information is recognized by the voice recognition device 50, and the control unit 41 determines the information to be presented next based on the recognized information and the dialogue scenario, finally providing the information or service requested by the user.
[0092]
Although some embodiments of the present invention have been described above in detail with reference to the drawings, these are merely examples, and the present invention can be carried out in other forms that are variously modified and improved based on the knowledge of those skilled in the art.
[0093]
FIG. 1 is a block diagram of the microphone device in one embodiment of the present invention.
FIG. 2 is a diagram showing the positional relationship of the first microphones and the second microphones with respect to the sound source.
FIG. 3 is a diagram for explaining the maximum detection range with respect to the sound source. FIG. 4 is a diagram showing the sound source direction and the positional relationship of the first microphones and the second microphones. FIG. 5 is a block diagram of the voice interaction device provided with the voice recognition device that has the microphone device of FIG. 1. FIG. 6 is a schematic configuration of a conventional delay-and-sum type microphone device.
Explanation of Reference Signs
[0094]
1, 60 Microphone device; 10a, 10b First microphone; 11a, 11b, 15a, 15b Analog/digital converter (A/D); 12a, 12b Delay unit; 13 Addition unit; 14a, 14b Second microphone; 16a, 16b FIFO; 17 Sound source direction determination unit; 18 Delay time setting unit; 20a, 21b Memory; 21 Judgment means; 30 Voice interaction device; 50 Voice recognition device; 61 Voice recognition unit