Patent Translate — Powered by EPO and Google

Notice: This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate, complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or financial decisions, should not be based on machine-translation output.

DESCRIPTION JP2009239500

A microphone device capable of reducing the buffers between the microphones and the delay units is provided. Second microphones (14a, 14b) for determining the sound source direction are provided separately from the first microphones (10a, 10b) used to generate the output signal, and are disposed at positions closer to the sound source A than the first microphones (10a, 10b). The direction of the sound source is then determined before the sound from the sound source A reaches the first microphones 10a and 10b, and the delay times of the delay units 12a and 12b are set based on the determination result. [Selected figure] Figure 1

Microphone device

[0001] The present invention relates to a microphone device, and more specifically to a delay-and-sum array microphone device with enhanced sensitivity to a sound source.

[0002] Description of the Related Art: Voice recognition devices that recognize speech content by collecting human speech with a microphone, and teleconferencing devices that enable hands-free conversation between remote locations using a microphone and a speaker, are widely known.

04-05-2019 1

[0003] In such voice-processing apparatus, it is desirable to receive the speaker's voice with high quality even when the microphone serving as the sound collector is at a distance from the speaker.
[0004] A delay-and-sum array is known as a technique for receiving the speaker's voice with high sound quality even with microphones positioned at a distance from the speaker (see, for example, Patent Document 1). In this technique, the sound emitted from a sound source is collected by each of a plurality of microphones, and the acoustic signals obtained by these microphones are delayed by delay amounts based on the direction of the sound source so as to be in phase, and are then added. This emphasizes the sound arriving from the direction of the sound source and enhances the sensitivity to the sound source.

[0005] FIG. 6 shows the schematic configuration of a conventional delay-and-sum microphone device. As shown in the figure, the conventional microphone device 100 includes microphones 101a and 101b, analog-to-digital converters (A/D) 102a and 102b, FIFOs (First In First Out) 103a and 103b, buffers (BUF) 104a and 104b, delay units 105a and 105b, an addition unit 106, a sound source direction determination unit 110, and a delay time setting unit 111. The sound source direction determination unit 110 includes memories (MEM) 120a and 120b and a determination unit 121. In the following, either of the microphones 101a and 101b is referred to as the microphone 101, and either of the delay units 105a and 105b as the delay unit 105.

[0006] The sound emitted from the sound source A is collected by the plurality of microphones 101a and 101b and converted by them into electrical analog signals (hereinafter, "analog acoustic signals") according to the collected sound level. Each analog acoustic signal is converted to a digital signal (hereinafter, "digital acoustic signal") by the analog-to-digital converters 102a and 102b.
These digital acoustic signals are input, via the FIFOs 103a and 103b, to the buffers 104a and 104b and to the memories 120a and 120b of the sound source direction determination unit 110.

[0007] The determination unit 121 of the sound source direction determination unit 110 detects the phase difference (time shift) between the acoustic signals output from the plurality of microphones 101a and 101b based on the digital acoustic signal information stored in the memories 120a and 120b, and determines from this phase difference the direction of the sound source A (the direction of the sound source A with respect to the microphones 101a and 101b). The memories 120a and 120b have storage capacity for as much digital acoustic signal information as the determination unit 121 needs to detect the phase difference.

[0008] The delay time setting unit 111 determines the delay time of each of the plurality of delay units 105a and 105b based on the direction of the sound source A determined by the sound source direction determination unit 110, and sets the determined delay times in the delay units 105a and 105b. As a result, delay processing compensates for the difference between the times at which the sound emitted from the sound source A reaches the microphone 101a and the microphone 101b. That is, the digital acoustic signal corresponding to the microphone 101 at which the sound arrives earlier is delayed by its delay unit 105 by the time difference, so that the digital acoustic signals corresponding to the microphones 101a and 101b are brought into phase.

[0009] The addition unit 106 then adds the in-phase digital acoustic signals output via the delay units 105a and 105b, so that an emphasis signal in which the sound arriving from the direction of the sound source A is emphasized is output.
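The delay-and-sum principle described above can be sketched as follows. This is a minimal illustration under assumptions, not the patented implementation: the tone, the 3-sample arrival difference, and the function name are all chosen for the example.

```python
import numpy as np

def delay_and_sum(signals, delays):
    """Delay each channel by its delay in samples, then sum.

    The channel the wavefront reaches earlier gets the larger delay,
    which brings all channels into phase before the addition.
    """
    n = min(len(s) for s in signals)
    out = np.zeros(n)
    for sig, d in zip(signals, delays):
        shifted = np.roll(sig[:n], d)
        shifted[:d] = 0.0              # clear the wrapped-around samples
        out += shifted
    return out

# Two channels carrying the same 1 kHz tone, arriving 3 samples apart.
fs = 44100
t = np.arange(256) / fs
tone = np.sin(2 * np.pi * 1000 * t)
ch_a = tone.copy()                      # near microphone: arrives first
ch_b = np.roll(tone, 3)                 # far microphone: 3 samples late
ch_b[:3] = 0.0
aligned = delay_and_sum([ch_a, ch_b], [3, 0])   # delay the early channel
```

After alignment the two channels add coherently, so from sample 3 onward the output is simply twice the tone — the emphasis effect the text describes.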
Patent Document 1: JP 2001-313992 A

[0010] However, in the conventional delay-and-sum microphone device 100 described above, the sound source direction determination processing by the sound source direction determination unit 110 takes time, and the digital acoustic signal information must be accumulated in the buffers 104a and 104b for that time.

[0011] This is because, in order not to reduce the sensitivity even when the direction of the sound source A changes, the acoustic signal obtained by the microphone 101 must be processed at the same timing as the acoustic signal used for the determination by the sound source direction determination unit 110.

[0012] The buffers 104a and 104b therefore require a large storage capacity, which poses problems for miniaturization and cost.

[0013] An object of the present invention is to provide a delay-and-sum array microphone device capable of reducing the buffers between the microphones and the delay units, and a speech recognition device provided with the same.

[0014] In order to achieve the above object, the invention according to claim 1 provides a microphone device comprising: a plurality of first microphones for collecting the sound emitted from a sound source;
a plurality of delay units capable of delaying the acoustic signals output from the plurality of first microphones by mutually independent delay times; an addition unit that adds and outputs the delay signals output from the plurality of delay units; a plurality of second microphones, disposed at positions closer to the sound source than the plurality of first microphones, for collecting the sound emitted from the sound source; a sound source direction determination unit that determines the direction of the sound source based on the acoustic signals obtained by the plurality of second microphones; and a delay time setting unit that determines the delay time of each of the plurality of delay units based on the direction of the sound source determined by the sound source direction determination unit and sets the determined delay times in the plurality of delay units.

[0015] According to the invention of claim 2, in the invention of claim 1, the plurality of first microphones are disposed on a first straight line, the plurality of second microphones are disposed on a second straight line, and the first straight line and the second straight line are parallel.

[0016] The invention according to claim 3 is characterized in that, in the invention according to claim 1 or 2, the plurality of first microphones consist of two microphones and the plurality of second microphones consist of two microphones.

[0017] The invention according to claim 4 is characterized in that, in the invention according to claim 3, the interval between the second microphones is made larger than the interval between the first microphones.

[0018] The invention according to claim 5 is a speech recognition device comprising the microphone device according to any one of claims 1 to 4 and a speech recognition unit that performs speech recognition based on the output signal from the microphone device.
[0019] According to the invention of claim 1, it is possible to provide a delay-and-sum array microphone device in which the buffers between the microphones and the delay units can be reduced.

[0020] According to the invention of claim 2, since the straight line connecting the second microphones and the straight line connecting the first microphones are parallel to each other, the sound source direction with respect to the plurality of first microphones and the sound source direction with respect to the plurality of second microphones are the same, which makes conversion of the delay time easy.

[0021] According to the invention of claim 3, since there are two first microphones and two second microphones, the sensitivity to the sound source can be enhanced with a simple configuration.

[0022] According to the invention of claim 4, since the interval between the second microphones is larger than the interval between the first microphones, the directivity can be broadened.

[0023] According to the invention of claim 5, it is possible to provide a speech recognition device equipped with a delay-and-sum array microphone device in which the buffers between the microphones and the delay units can be reduced.

[0024] Hereinafter, an embodiment of the microphone device according to the present invention and of a voice recognition device provided with the same will be described.

[0025] [1. Overview of Microphone Device] The microphone device of the present embodiment includes a plurality of first microphones for collecting the sound emitted from a sound source, a plurality of delay units capable of delaying the acoustic signals output from these first microphones by mutually independent delay times, and an addition unit that adds and outputs the delay signals output from the delay units.
[0026] The delay time setting unit determines the delay time of each of the plurality of delay units based on the direction of the sound source and sets the determined delay times in the plurality of delay units. The delay time set by the delay time setting unit is based on the shift in the time at which the sound emitted from the sound source A reaches each microphone (hereinafter, "time shift"). For example, when there are two first microphones and there is a time shift Δta between them in the sound arriving from the sound source, the delay time Δta is set in the delay unit for the microphone at which the sound from the sound source arrives earlier. As a result, the digital acoustic signals corresponding to the analog acoustic signals output from these microphones are brought into phase and output from the delay units. The in-phase digital acoustic signals are then added by the addition unit, and a signal in which the sound emitted from the sound source is emphasized (hereinafter, "emphasis signal") is output.

[0027] Furthermore, the microphone device according to the present embodiment includes a plurality of second microphones, disposed at positions closer to the sound source than the plurality of first microphones, for collecting the sound emitted from the sound source, and a sound source direction determination unit that determines the direction of the sound source based on the acoustic signals obtained by the plurality of second microphones.

[0028] Therefore, a specific sound emitted from the sound source (hereinafter, "sound B") is collected by the plurality of second microphones before it reaches the plurality of first microphones. Based on the sound B collected in this manner, the sound source direction determination unit determines the direction of the sound source at the timing when the sound B was emitted.
[0029] As a result, the direction of the sound source at the timing when the sound B was emitted can be determined before the sound B from the sound source reaches the plurality of first microphones; the voice of the speaker is therefore emphasized and output, and the buffers between the first microphones and the delay units can be reduced.

[0030] When the first and second microphones cannot be arranged sufficiently far apart, or depending on the direction of the sound source, the sound emitted from the sound source after the sound B (hereinafter, "sound C") may already have been collected by the time the direction of the sound source is determined from the sound B collected by the second microphones. In this case, the emphasis signal for the sound C is output based on the direction of the sound source at the timing when the sound B was emitted. Even so, the emphasis signal is output for a direction closer to that of the sound actually collected by the first microphones than when the first and second microphones are arranged on a single straight line and the same processing is performed.

[0031] The microphone device can be used in various voice processing devices such as a voice recognition device and a teleconference device.

[0032] [2. Specific Example of Microphone Device] Next, a specific example of the microphone device of the present embodiment will be described with reference to the drawings. FIG. 1 is a block diagram of the microphone device of the present embodiment, FIG. 2 shows the positional relationship of the first and second microphones with respect to the sound source, FIG. 3 illustrates the maximum detection range for the sound source, and FIG. 4 shows the sound source direction and the positional relationship of the first and second microphones.

[0033] As shown in FIG.
1, the microphone device 1 of the present embodiment includes first microphones 10a and 10b, analog-to-digital converters (A/D) 11a and 11b, delay units 12a and 12b, and an addition unit 13.

[0034] The first microphones 10a and 10b, arranged at a predetermined interval, collect the sound emitted from the sound source A and convert it into electrical analog signals (hereinafter, "analog acoustic signals") S1a and S1b.

[0035] The analog acoustic signals S1a and S1b output from the first microphones 10a and 10b are converted by the analog-to-digital converters 11a and 11b into digital signals (hereinafter, "digital acoustic signals") S2a and S2b and output.

[0036] Here, the microphone device 1 targets the voice of a speaker as the sound emitted by the sound source A, and the analog-to-digital converters 11a and 11b sample the analog acoustic signals S1a and S1b at, for example, 44.1 kHz to generate the digital acoustic signals S2a and S2b.

[0037] The digital acoustic signals S2a and S2b generated in this manner are input to the delay units 12a and 12b, respectively. The delay units 12a and 12b can delay with mutually independent delay times and are configured with ring buffers or the like. The delay units 12a and 12b delay the digital acoustic signals S2a and S2b by the set delay times and output them as delayed signals (hereinafter, "delay signals") S3a and S3b.

[0038] For example, when the sound emitted from the sound source A reaches one first microphone 10a and then reaches the other first microphone 10b after a time Δta, the delay time Δta is set in the delay unit 12a and the delay time 0 is set in the delay unit 12b.
When such delay times are set in the delay units 12a and 12b, the digital acoustic signal S2a is delayed by Δta by the delay unit 12a and output as the delay signal S3a, while the digital acoustic signal S2b is output by the delay unit 12b as the delay signal S3b without being delayed.

[0039] Therefore, even when the times at which the sound emitted from the sound source A reaches the first microphones 10a and 10b differ, the time shift is compensated by the delay units 12a and 12b, and the digital acoustic signals S2a and S2b, which have a phase difference, are brought into phase and output as the delay signals S3a and S3b.

[0040] The delay signals S3a and S3b are input to the addition unit 13, added, and output. As described above, the delay signals S3a and S3b are in phase with respect to the sound emitted from the sound source A and have no phase shift, so adding them generates a signal in which the sound emitted from the sound source A is emphasized (hereinafter, "emphasis signal") S4.

[0041] To set the delay times of the delay units 12a and 12b, the microphone device 1 of the present embodiment further includes second microphones 14a and 14b, analog-to-digital converters (A/D) 15a and 15b, FIFOs (First In First Out) 16a and 16b, a sound source direction determination unit 17, and a delay time setting unit 18.

[0042] In the conventional microphone device 100 described above, the microphones that collect sound from the sound source to generate the emphasis signal and the microphones that collect sound from the sound source to determine the sound source direction are the same microphones 101a and 101b. In the microphone device 1 of this embodiment, however, separate microphones are used.
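The text above notes that the delay units 12a and 12b can be configured with ring buffers. A minimal sketch of such a settable ring-buffer delay line follows; the class and method names are illustrative, not taken from the patent.

```python
class RingBufferDelay:
    """Sketch of a settable delay unit built on a circular buffer."""

    def __init__(self, max_delay):
        self.buf = [0.0] * (max_delay + 1)
        self.pos = 0
        self.delay = 0          # set later, as by a delay time setting unit

    def set_delay(self, samples):
        assert 0 <= samples <= len(self.buf) - 1
        self.delay = samples

    def process(self, x):
        """Write one input sample, read the sample `delay` steps back."""
        self.buf[self.pos] = x
        out = self.buf[(self.pos - self.delay) % len(self.buf)]
        self.pos = (self.pos + 1) % len(self.buf)
        return out

unit = RingBufferDelay(max_delay=8)
unit.set_delay(3)
ys = [unit.process(x) for x in [1.0, 2.0, 3.0, 4.0, 5.0]]
# the first three outputs are the buffer's initial zeros
```

With a delay of 3 samples, the input sequence reappears at the output three steps later, which is exactly the per-channel alignment the delay units perform before the addition unit sums the channels.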
[0043] That is, separately from the first microphones 10a and 10b, which collect the sound from the sound source A to generate the emphasis signal S4, second microphones 14a and 14b are provided that collect the sound from the sound source A to determine the sound source direction θ.

[0044] FIG. 2 shows the positional relationship of the first microphones 10a and 10b and the second microphones 14a and 14b with respect to the sound source A. As shown in the figure, the second microphones 14a and 14b are disposed at positions closer to the sound source A than the first microphones 10a and 10b, so that the sound emitted by the sound source A arrives at them earlier than at the first microphones 10a and 10b.

[0045] Therefore, before the sound from the sound source A reaches the first microphones 10a and 10b (in the example shown in FIG. 2, before it reaches the wavefront position a3), the plurality of second microphones 14a and 14b collect it at the wavefront positions a1 and a2, respectively. The sound source direction determination unit 17 detects the time shift between the second microphones 14a and 14b of the sound collected in this manner (the time the sound from the sound source A takes to travel from the wavefront position a1 to the wavefront position a2), and thereby determines the sound source direction θ.

[0046] As a result, the direction of the sound source A can be determined before the sound from the sound source A reaches the plurality of first microphones 10a and 10b, and the buffers between the first microphones 10a and 10b and the delay units 12a and 12b can be reduced.

[0047] Here, suppose that the straight line connecting the second microphones 14a and 14b and the straight line connecting the first microphones 10a and 10b are parallel to each other, and that the interval between the second microphones 14a and 14b is larger than the interval between the first microphones 10a and 10b. In this case, as shown in FIG. 3, if α is the angle formed by the line connecting the second microphone 14a and the first microphone 10b and the line connecting the second microphones 14a and 14b, the maximum range of sound source directions for which the sound reaches the second microphones 14a and 14b earlier than the first microphones 10a and 10b is π − 2α.

[0048] By arranging the first microphones 10a and 10b and the second microphones 14a and 14b in this manner, the second microphones 14a and 14b can collect the sound from the sound source A before it reaches the first microphones 10a and 10b.

[0049] The second microphones 14a and 14b, arranged at a predetermined interval, collect the sound emitted from the sound source A, convert it into analog acoustic signals S5a and S5b, and output them. The analog acoustic signals S5a and S5b output from the second microphones 14a and 14b are converted into digital acoustic signals S6a and S6b by the analog-to-digital converters 15a and 15b, respectively, and output.

[0050] The digital acoustic signals S6a and S6b generated in this manner are sequentially input to the sound source direction determination unit 17 as digital acoustic signals S7a and S7b through the FIFOs 16a and 16b. The FIFOs 16a and 16b are provided to absorb the difference between the operation timings of the analog-to-digital converters 15a and 15b and the sound source direction determination unit 17.

[0051] The sound source direction determination unit 17 includes memories (MEM) 20a and 20b and a determination unit 21, and determines the sound source direction θ.
[0052] The memories 20a and 20b each store at least a predetermined number N (for example, 256) of the most recent signal-level values of the digital acoustic signals output from the analog-to-digital converters 15a and 15b; these digital acoustic signals are stored sequentially via the FIFOs 16a and 16b.

[0053] The distance between the second microphones 14a and 14b and the speed of sound are known, and the determination unit 21 of the sound source direction determination unit 17 determines the sound source direction θ based on this information and the digital acoustic signals S7a and S7b.

[0054] Hereinafter, the determination processing of the sound source direction θ by the determination unit 21 of the sound source direction determination unit 17 will be described specifically.

[0055] Let X1(i) be the signal level of the digital acoustic signal S7a on the second microphone 14a side and X2(i) be the signal level of the digital acoustic signal S7b on the second microphone 14b side. Then the time lag τ of sound collection by the two second microphones 14a and 14b for the sound from the sound source A can be derived from the following equations (1) and (2). Note that 0 ≤ j ≤ N−1 (j is an integer) and 0 ≤ i ≤ N−1 (i is an integer), the latest digital acoustic signal corresponds to i = 0, j = 0, and the oldest of the N digital acoustic signals stored in the memories 20a and 20b corresponds to i = N−1, j = N−1.

[0056] RX1X2(j) = Σ (i = 0 to N−1) X1(i) · X2(i + j) … (1)

[0057] RX1X2(γ) = max{ RX1X2(0), …, RX1X2(N−1) } … (2)

[0058] First, the determination unit 21 performs the calculation of equation (1): it takes an X1(i) from the memory 20a, takes X2(0) to X2(N−1) from the memory 20b, and accumulates the products of that X1(i) with the corresponding X2 values.
The determination unit 21 performs this processing for all of X1(0) to X1(N−1) stored in the memory 20a.

[0059] Next, as shown in equation (2), the determination unit 21 determines the largest value among RX1X2(0) to RX1X2(N−1) (hereinafter, the maximum value RX1X2(γ)).

[0060] Here, if the sampling frequency of the analog-to-digital converters 15a and 15b is 44.1 kHz, one sampling period is 22.676 μs. The value γ determined above, on the other hand, is a time shift expressed as a number of sampling periods. The determination unit 21 of the sound source direction determination unit 17 therefore obtains the time lag τ by the calculation of the following equation (3).

[0061] τ = γ × 22.676 μs … (3)

[0062] Next, the determination unit 21 calculates the positional deviation D (see FIG. 4) between the second microphones 14a and 14b with respect to the sound source A. With the speed of sound denoted c, the positional deviation D is obtained by multiplying the time lag τ by the speed of sound c, as shown in the following equation (4), and the determination unit 21 performs the calculation based on this equation.

[0063] D = c · τ … (4)

[0064] Next, the determination unit 21 determines the sound source direction θ. The relationship among the sound source direction θ, the positional deviation D, and the distance L0 between the second microphones 14a and 14b is given by the following equation (5), and the determination unit 21 performs the calculation based on it.

[0065] sin θ = D / L0 … (5)

[0066] As described above, the determination unit 21 determines the sound source direction θ based on the outputs of the second microphones 14a and 14b, and the information on the sound source direction θ is notified to the delay time setting unit 18.
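The determination procedure described above can be approximated with a standard cross-correlation. The patent's exact equation images are not reproduced in this translation, so the correlation form, the 0.20 m microphone spacing, and the function name below are assumptions made for illustration.

```python
import numpy as np

FS = 44100          # sampling frequency (Hz), as in the text
C = 340.0           # speed of sound (m/s), as in the text
L0 = 0.20           # spacing of the second microphones (assumed value, m)

def estimate_direction(x1, x2, fs=FS, c=C, l0=L0):
    """Estimate the sound source direction theta in degrees.

    x1, x2: samples from the second microphones 14a and 14b.
    Cross-correlate the channels (a stand-in for equations (1)-(2)),
    take the lag gamma of the maximum value, convert it to a time lag
    tau = gamma / fs (cf. equation (3)), a positional deviation
    D = c * tau (equation (4)), and theta = arcsin(D / L0) (equation (5)).
    """
    n = len(x1)
    corr = np.correlate(x2, x1, mode="full")     # lags -(n-1) .. n-1
    gamma = int(np.argmax(corr)) - (n - 1)       # time shift in samples
    tau = gamma / fs                             # 22.676 us per sample at 44.1 kHz
    d = c * tau                                  # positional deviation D
    return np.degrees(np.arcsin(np.clip(d / l0, -1.0, 1.0)))

# Simulated input: white noise reaching microphone 14b ten samples
# after microphone 14a.
rng = np.random.default_rng(0)
x1 = rng.standard_normal(512)
x2 = np.roll(x1, 10)
x2[:10] = 0.0
theta = estimate_direction(x1, x2)
```

A ten-sample lag corresponds to τ ≈ 226.76 μs, D ≈ 7.71 cm, and, for the assumed 20 cm spacing, θ ≈ 22.7°.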
[0067] The delay time setting unit 18 holds a delay amount table and determines the delay times for the delay units 12a and 12b based on the information on the sound source direction θ notified from the sound source direction determination unit 17. The delay amount table associates a delay time for the delay units 12a and 12b with each value of the sound source direction θ; the delay times are set based on the positional deviation Diff calculated by the following equations (6) to (8). The positional deviation Diff is the positional deviation between the first microphones 10a and 10b with respect to the sound source A, as shown in FIG. 4.

[0068] Diff = L1 · sin θ … (6)

[0069] (distance traveled by sound in one sampling period) = c / (sampling frequency) … (7)

[0070] (delay time, in sampling periods) = Diff / (c / (sampling frequency)) … (8)

[0071] Here, the line connecting the first microphones 10a and 10b and the line connecting the second microphones 14a and 14b are parallel; when the sound source direction θ is +30° and the distance L1 is 10 cm, the positional deviation Diff is 5 cm. Furthermore, with a sampling frequency of 44.1 kHz for the analog-to-digital converters 11a and 11b and a speed of sound of 340 m/s, one sampling period is 22.676 μs and the distance the sound from the sound source A travels in one sampling period is 7.710 mm. The delay time is therefore 5 / 0.771 ≈ 6.5 sampling periods.

[0072] In this case, the delay time setting unit 18 sets a delay time of about 6.5 sampling periods in the delay unit 12a, which delays the digital acoustic signal S2a on the side of the first microphone 10a at which the sound from the sound source A arrives earlier, and sets a delay time of 0 sampling periods in the delay unit 12b, which delays the digital acoustic signal S2b on the first microphone 10b side.

[0073] The delay units 12a and 12b then delay the digital acoustic signals S2a and S2b according to the delay times set in this manner and output them to the addition unit 13 as the delay signals S3a and S3b.
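The worked example above (θ = +30°, L1 = 10 cm, 44.1 kHz sampling, 340 m/s sound speed) can be checked numerically; the function name is illustrative, and all constants come from the text.

```python
import math

FS = 44100    # sampling frequency (Hz)
C = 340.0     # speed of sound (m/s)
L1 = 0.10     # interval between the first microphones (m)

def delay_in_samples(theta_deg, l1=L1, fs=FS, c=C):
    """Diff = L1 * sin(theta), divided by the distance sound travels
    in one sampling period (c / fs, about 7.710 mm at 44.1 kHz)."""
    diff = l1 * math.sin(math.radians(theta_deg))
    return diff / (c / fs)

n = delay_in_samples(30.0)   # Diff = 5 cm  ->  about 6.5 sampling periods
```

The exact value is 0.05 × 44100 / 340 ≈ 6.485 sampling periods, which a real delay unit would round to an integer sample count.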
The delay signals S3a and S3b are in phase with each other with respect to the sound emitted from the sound source A; they are added by the addition unit 13, and the emphasis signal S4, in which the sound from the sound source A is emphasized, is generated.

[0074] The plurality of first microphones 10a and 10b are desirably disposed on a first straight line and the plurality of second microphones 14a and 14b on a second straight line, with the first straight line and the second straight line parallel to each other. For example, as shown in FIG. 4, the straight line connecting the second microphones 14a and 14b and the straight line connecting the first microphones 10a and 10b are parallel. In this way, the sound source direction θ with respect to the plurality of first microphones 10a and 10b and the sound source direction θ with respect to the plurality of second microphones 14a and 14b are the same, which makes conversion of the delay time easy.

[0075] Moreover, it is desirable to make the interval between the second microphones 14a and 14b larger than the interval between the first microphones 10a and 10b. When the interval between the second microphones 14a and 14b is smaller than the interval between the first microphones 10a and 10b, the directivity becomes narrow.

[0076] When wide directivity is required, the interval between the second microphones 14a and 14b is made larger than the interval between the first microphones 10a and 10b, and the distance between the line through the first microphones 10a and 10b and the line through the second microphones 14a and 14b is increased.

[0077] Further, in the above description, since there are two first microphones and two second microphones, a microphone device with enhanced sensitivity to the sound source A can be manufactured with a simple configuration.

[0078] Here, suppose that the straight line connecting the second microphones 14a and 14b and the straight line connecting the first microphones 10a and 10b are parallel to each other, and that the interval between the second microphones 14a and 14b is larger than the interval between the first microphones 10a and 10b. In this case, as shown in FIG. 3, let α be the angle formed by the line connecting the second microphone 14a and the first microphone 10b and the line connecting the second microphones 14a and 14b, and let β be the angle formed by the line connecting the second microphone 14a and the first microphone 10a and the line connecting the second microphones 14a and 14b; the sound source direction θ, the angle α, and the angle β are then related by the following equations (9) and (10). Therefore, to adjust the range of sound source directions θ for which the sound reaches the second microphones 14a and 14b earlier than the first microphones 10a and 10b, the angles α and β are changed so as to satisfy equations (9) and (10).

[0079]

[0080]

[0081] Although the number of first microphones 10a and 10b is two here, a larger number of microphones may be used. By providing a large number of first microphones, a large number of acoustic signals can be added, and an emphasis signal in which the sound from the sound source direction θ is still more emphasized can be output. Similarly, the second microphones 14a and 14b may be not two but a larger number of microphones. This makes it possible to handle not only a planar wavefront but also a three-dimensional wavefront.
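Under an assumed symmetric layout (the coordinates below are illustrative, not taken from the patent figures), the angle α of FIG. 3 and the resulting maximum detection range π − 2α can be computed as follows.

```python
import math

# Assumed layout (metres): first microphones on y = 0, second
# microphones on a parallel line 5 cm closer to the source, both
# pairs centred on the y-axis. All values are illustrative.
L1, L0, GAP = 0.10, 0.20, 0.05
mic_10b = (L1 / 2, 0.0)
mic_14a = (-L0 / 2, GAP)

# alpha: angle between the line 14a -> 10b and the line 14a -> 14b
# (the second-microphone line points along +x from 14a).
dx = mic_10b[0] - mic_14a[0]
dy = mic_10b[1] - mic_14a[1]
alpha = math.atan2(-dy, dx)

# Maximum range of source directions that reach the second
# microphones first: pi - 2 * alpha (symmetry gives beta = alpha).
max_range_deg = math.degrees(math.pi - 2 * alpha)
```

For this layout α ≈ 18.4°, so sources within a range of about 143° reach the second microphones before the first — widening the spacing L0 or the gap between the lines shrinks α and broadens that range, as paragraph [0076] suggests.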
[0082] As described above, in the microphone device 1 according to the present embodiment, the plurality of second microphones are provided at positions closer to the sound source than the plurality of first microphones used to output the emphasis signal, and the direction of the sound source is determined based on the acoustic signals obtained by the second microphones. The direction of the sound source can therefore be determined before the sound from the sound source reaches the plurality of first microphones, and the buffers between the first microphones and the delay units can be reduced.

[0083] [3. Example of Device to which the Microphone Device is Applied] As an example of a device to which the microphone device 1 described above is applied, a voice interaction device provided with a voice recognition device will be described with reference to the drawings. FIG. 5 is a block diagram of a voice interaction device provided with a voice recognition device. The voice interaction device provides the information or service requested by the user by interacting with the user by voice.

[0084] As shown in FIG. 5, the voice interaction device 30 includes a control unit 41, a storage unit 42, a decoder unit 43, an image processing unit 44, a display device 45, an audio processing unit 46, a speaker 47, an input I/F (interface) unit 48, an input operation unit 49, and a voice recognition device 50. The control unit 41, the storage unit 42, the decoder unit 43, the input I/F unit 48, and the voice recognition device 50 are connected to one another via the system bus 51.

[0085] The control unit 41 includes a central processing unit (CPU), a read-only memory (ROM), and a random access memory (RAM), and controls the entire voice interaction device 30.

[0086] The storage unit 42 is configured with a hard disk drive or the like and stores dialogue scenarios and the like for interacting with the user.
[0087] The decoder unit 43 decodes image data and audio data based on the dialogue scenario stored in the storage unit 42. The image data decoded by the decoder unit 43 is converted by the image processing unit 44 into information that can be displayed on the display device 45, and is displayed on the display device 45. The audio data decoded by the decoder unit 43 is converted by the audio processing unit 46 into information that can be output as sound waves by the speaker 47, and is output from the speaker 47. [0088] The input I/F unit 48 detects the user's operation on the input operation unit 49 and notifies the control unit 41 of the operation. The control unit 41 performs processing according to this input operation. [0089] The voice recognition device 50 recognizes the voice uttered by the user, and includes the microphone device 60 and a voice recognition unit 61 that performs voice recognition based on the output signal from the microphone device 60. The operation of the voice recognition device 50 is controlled by the control unit 41. The speech content of the user recognized by the voice recognition device 50 in the operating state is notified to the control unit 41 as character information. [0090] By applying the above-described microphone device 1 as the microphone device 60 of the voice recognition device 50, a signal that corresponds to the voice uttered by the user, and in which the loss of sensitivity is suppressed even when the direction of the sound source A fluctuates, can be input to the voice recognition unit 61. The rate at which the user's voice is correctly recognized can therefore be increased, realizing a voice recognition device with a high recognition rate.
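The flow through the voice recognition device 50 described in [0089], where the microphone device's enhanced output signal is passed to the voice recognition unit and the recognized speech content is reported to the control unit as character information, might be organized as in the sketch below. All class interfaces and method names here are hypothetical; the patent does not specify them.

```python
def recognition_step(microphone_device, recognizer, control_unit):
    """Hypothetical sketch of one recognition cycle: read the enhanced
    output signal from the microphone device, recognize it, and notify
    the control unit of the result as text."""
    signal = microphone_device.read_output_signal()  # enhanced signal
    text = recognizer.recognize(signal)              # speech -> text
    control_unit.notify(text)                        # report as character info
    return text
```

In the device of FIG. 5 this cycle would run only while the control unit 41 keeps the voice recognition device 50 in the operating state.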
[0091] In the voice interaction device 30, the control unit 41 presents information based on the dialogue scenario stored in the storage unit 42 to the user via the display device 45 or the speaker 47; the voice uttered by the user in response to the presented information is recognized by the voice recognition device 50; the control unit 41 determines the information to be presented next based on the recognized information and the dialogue scenario; and finally the information or service requested by the user is provided. [0092] Although some embodiments of the present invention have been described above in detail with reference to the drawings, these are merely examples, and the present invention can be carried out in other forms to which various modifications and improvements have been applied based on the knowledge of those skilled in the art. [0093] FIG. 1 is a block diagram of a microphone device in one embodiment of the present invention. FIG. 2 is a diagram showing the positional relationship of the first microphones and the second microphones with respect to a sound source. FIG. 3 is a diagram for explaining the maximum detection range with respect to a sound source. FIG. 4 is a diagram showing the sound source direction and the positional relationship of the first microphones and the second microphones. FIG. 5 is a block diagram of the voice interaction device provided with the voice recognition device that has the microphone device of FIG. 1. FIG. 6 is a schematic configuration of a conventional delay-and-sum type microphone device. Explanation of Reference Signs [0094] 1, 60: microphone device; 10a, 10b: first microphone; 11a, 11b, 15a, 15b: analog-to-digital converter (A/D); 12a, 12b: delay unit; 13: addition unit; 14a, 14b: second microphone; 16a, 16b: FIFO; 17: sound source direction determination unit; 18: delay time setting unit; 20a, 20b: memory; 21: determination means; 30: voice interaction device; 50: voice recognition device; 61: voice recognition unit