Patent Translate, powered by EPO and Google. Notice: This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate, complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or financial decisions, should not be based on machine-translation output.

DESCRIPTION JP2007235358

The present invention provides a sound pickup device capable of effectively removing background noise using a small microphone array.

SOLUTION: First and second sound collecting units take as their sound collection range an angular region that includes the desired sound source position as seen from two different positions, and third and fourth sound collecting units take as their sound collection range an angular region that does not include the desired sound source position as seen from the same positions. A fifth sound collecting unit takes as its sound collection range an angular region that includes the desired sound source position as seen from the midpoint between the sound collecting positions of the first and second sound collecting units, and a sixth sound collecting unit takes as its sound collection range an angular region that does not include the desired sound source as seen from the same position as the fifth sound collecting unit. The collected signal of each sound collecting unit is converted to the frequency domain, and a gain coefficient determined by the ratio of the desired signal to the background noise is obtained. Each frequency domain component of the signal whose main component is the desired signal is multiplied by this gain coefficient. Because the gain coefficient is approximately 0 in frequency regions where the background noise component is dominant, the background noise is greatly attenuated.
[Selected figure] Figure 2

Sound collecting device, program and recording medium recording the same

[0001] The present invention relates to a sound collection device for hands-free sound pickup in applications such as voice communication and voice operation of equipment, and in particular to sound collection in environments containing many noise sources other than the desired sound source that emits the desired sound.

[0002] FIG. 18 shows the configuration of a sound collection device with a noise removal function described in Non-Patent Document 1. In this prior art, M microphones M1 to MM are used; the sound emitted from the desired sound source 1 at coordinates (p, q) is treated as the signal and sound from all other points as noise, and only the signal is emphasized so that sound is picked up with a high SN ratio. First, a delay D_m and a gain g_m are applied to the signal x_m(n) (m = 1, ..., M) received by the microphones M1 to MM placed at coordinates (p_m, q_m), as in equation (1), to obtain the signal y_m(n).

y_m(n) = g_m x_m(n − D_m)    (1)

The delay D_m and the gain g_m are derived by equations (2) and (3), respectively, from the position (p, q) of the desired sound source 1, which is given in advance. Here r_m and r_c are the microphone-to-source distance and the critical distance defined by equations (4) and (5), respectively, c is the speed of sound, and V and T are the room volume and the reverberation time of the room, respectively. Next, a signal z(n) that emphasizes the sound emitted from the position of the desired sound source 1 is obtained by summing the signals y_m(n) as in equation (6). This is the conventional noise removal method. To pick up a signal with a higher SN ratio using this prior art, the number of microphones must be increased and the microphones must be widely spaced, so the microphone array becomes large.
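The prior-art processing of paragraph [0002] can be illustrated with a minimal sketch. It implements only equation (1) and the summation described for equation (6); the delays and gains are taken as given inputs, because equations (2) to (5) are not reproduced in this text. The function name `delay_and_sum` and the restriction to non-negative integer sample delays are assumptions made for illustration.

```python
import numpy as np

def delay_and_sum(x, delays, gains):
    """Apply eq. (1) to each microphone signal, then sum as described for eq. (6).

    x      : array of shape (M, N), one row per microphone signal x_m(n)
    delays : M non-negative integer delays D_m in samples (assumed given)
    gains  : M gains g_m (assumed given)
    """
    x = np.asarray(x, dtype=float)
    M, N = x.shape
    z = np.zeros(N)
    for m in range(M):
        d = int(delays[m])
        if d >= N:
            continue  # the delayed signal falls entirely outside the buffer
        y = np.zeros(N)
        # y_m(n) = g_m * x_m(n - D_m); samples before n = D_m are taken as zero
        y[d:] = gains[m] * x[m, :N - d]
        z += y
    return z
```

With two microphones, `delay_and_sum([[1, 2, 3, 4], [0, 1, 2, 3]], [1, 0], [1.0, 2.0])` aligns the first signal with the second before summing.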
Hiroaki Nomura, Yutaka Kanada, Junji Kojima, "Near-field microphone array," Journal of the Acoustical Society of Japan, Vol. 53, No. 2, pp. 110-116, 1997

[0003] To select, emphasize and pick up any one of several sounds emitted from sound sources located at different points in the same direction as seen from the sound collection device, the prior-art noise removal method requires a large-scale microphone array, because the positions of the microphones relative to the desired sound source must differ greatly. Installation and transportation therefore require extensive work, which limits the range of use. Further, in the prior art the improvement in SN ratio obtained by the processing is insufficient for practical use. On the other hand, prior art that uses a small-scale microphone array can in principle discriminate only by direction, so it cannot select and pick up only one of several sounds emitted from sound sources that lie in the same direction at different distances.

[0004] SUMMARY OF THE INVENTION The object of the present invention is to solve the above problems and to realize a device that is easy to install and transport, that can discriminate by distance as well as by direction, and that picks up sound from the desired sound source with a higher SN ratio than the prior art.

[0005] A sound collection device according to the present invention comprises: first and second sound collecting units that use the output signals of a microphone array made up of a plurality of microphones to pick up sound in an angular region including a desired sound source position as seen from different positions; third and fourth sound collecting units that use the output signals of the microphone array to pick up sound in an angular region not including the desired sound source position as seen from the different positions; a fifth sound collecting unit that picks up sound in an angular region including the desired sound source position as seen from the midpoint between the different positions; a sixth sound collecting unit that picks up sound in an angular region not including the desired sound source position as seen from the midpoint; a sound source signal component estimation unit that estimates, from the signals collected by the respective sound collecting units, the signal amount of the desired sound source and the signal amounts of the other sound sources; a gain coefficient calculation unit that obtains a gain coefficient from the ratio of the signal amount of the desired sound source to the signal amount of all sound sources including the desired sound source; and a multiplication unit that multiplies the signal whose main component is the signal of the desired sound source, obtained by the first and second sound collecting units, by the gain coefficient calculated by the gain coefficient calculation unit.

[0006] Furthermore, in the above sound collection device, the microphone array consists of two microphone arrays placed at positions approximately equidistant from the desired sound source position, and the first to sixth sound collecting units perform sound collection processing using the output signals of these two microphone arrays. Furthermore, in the above sound collection device, the signals collected by the first to sixth sound collecting units are converted to the frequency domain by frequency domain conversion means, the gain coefficient is calculated for each frequency domain component produced by the frequency domain conversion means, and the gain coefficient calculated for each frequency domain component is multiplied by the corresponding frequency domain component of the signal, obtained by the first and second sound collecting units, whose main component is the signal of the desired sound source.

[0007] Furthermore, in the above sound collection device, the sound source signal component estimation unit calculates the power value of each sound source signal, and the gain coefficient is estimated from the ratio of these power values. Alternatively, the sound source signal component estimation unit calculates the absolute value of each sound source signal to estimate the signal amount of each sound source signal, and the gain coefficient is estimated from the ratio of these values. Furthermore, in the above sound collection device, the gain coefficient calculated by the gain coefficient calculation unit takes a predetermined maximum value when the amounts of the other sound source signals are small enough to be neglected relative to the amount of the desired sound source signal, and takes a value close to 0 when the amount of the desired sound source signal is small enough to be neglected relative to the amounts of the other sound source signals.
[0008] Furthermore, in the above sound collection device, the gain coefficient calculation characteristic of the gain coefficient calculation unit is such that the gain coefficient is held at or near its maximum value in the region where the amount of the other sound source signals is smaller than the amount of the desired sound source signal, and is held at or near 0 in the region where the amount of the other sound source signals is larger than the amount of the desired sound source signal.

[0009] According to the sound collection device of the present invention, the first to sixth sound collecting units use the signals obtained from the microphone array to realize sound collection characteristics, that is, directivities, that pick up sound in the angular region including the desired sound source position and in the angular region not including it. Because the desired angular region is distinguished by directivity, it can be distinguished even when the spacing between microphones is short, and as a result the microphone array can be made small. Furthermore, according to the sound collection device of the present invention, the collected signals are divided into frequency domain components, the component amount of each sound source signal is estimated for each frequency domain component, and a gain coefficient corresponding to the SN ratio is calculated for each frequency domain component from the estimated amounts. Multiplying each frequency domain component of the signal whose main component is the sound of the desired sound source by this gain coefficient attenuates the other sound source signals contained in that signal. As a result, only the desired sound source signal can be emphasized and extracted.

[0010] Furthermore, according to the sound collection device of the present invention, each microphone array is small and therefore easy to install and transport, and yet when sound sources lie in the same direction at different distances, any one of them can be emphasized and picked up, which was impossible in the prior art. To show the improvement in SN ratio achieved by the present invention, results of simulation experiments for the first and second embodiments, which are described later, are presented here. The simulation setup is shown in FIG. 14. In each microphone array, five microphones are arranged in a straight line at 4 cm intervals, and the two arrays are placed at the coordinates (0.4, 0) and (−0.4, 0) (unit: meters). In Case 1, shown in FIG. 14A, the desired sound source 1 is placed at (0, 0.5) and one background noise source 2 is placed at (0, 2.5). In Case 2, shown in FIG. 14B, background noise sources 2 are additionally placed at the two points (−1.6, 2.5) and (1.6, 2.5).

[0011] FIG. 15A shows the signal of the desired sound source 1 in Case 1, FIG. 15B shows the signal received by the microphone, and FIG. 15C shows the signal after the processing of the second embodiment. FIGS. 16A, 16B and 16C show the corresponding signals in Case 2.
In both cases, comparing the panels of FIGS. 15 and 16 shows that the signal processed by the present invention is closer to the sound of the desired sound source than the signal before processing; that is, the sound from the desired sound source 1 is emphasized and collected. Next, FIG. 17 shows the SN ratio improvement of the processed signal relative to the signal before processing. The SN ratio improvement obtained with the present invention is about 13 dB, which is larger than that of the prior art by 10 dB or more. Further, in the second embodiment the added non-linear processing increases the SN ratio improvement, which confirms the effect of this processing. As described above, the present invention keeps installation and transportation of the device easy while allowing any one of the sounds emitted by a plurality of sound sources to be selectively emphasized and collected. Further, the present invention raises the SN ratio improvement during sound collection to a practically sufficient level.

[0012] The sound collection device according to the present invention can be built entirely in hardware, but the simplest implementation, and the best mode, is to install the program according to the present invention on a computer and have the computer function as the sound collection device.
To realize the sound collection device according to the present invention on a computer, the sound collection program installed on the computer constructs at least the first to sixth sound collecting units, a frequency domain conversion unit, a sound source signal component estimation unit, a gain coefficient calculation unit and a multiplication unit within the computer, which then functions as the sound collection device.

[0013] FIG. 1 shows an example of how the present invention is used. Two small-scale microphone arrays 3L and 3R are placed some distance apart (for example, separated by about the same distance as that between the microphone arrays 3L and 3R and the desired sound source 1), and the processing described below is applied to the signals received by the microphones. This processing emphasizes and collects the sound of the desired sound source 1 and suppresses the sound of the background noise source 2. FIG. 2 shows the overall configuration of the sound collection device according to the present invention, and an outline of the device is given with reference to it. In this example, the received signals generated by the microphones of the microphone array 3L are input to the first sound collecting unit 4-1 and the third sound collecting unit 4-3, and the received signals generated by the microphones of the microphone array 3R are input to the second sound collecting unit 4-2 and the fourth sound collecting unit 4-4. The signals of the microphones located at the centers of the microphone arrays 3L and 3R are input to the fifth sound collecting unit 4-5 and the sixth sound collecting unit 4-6. The microphone arrays 3L and 3R need not have the same number of microphones.

[0014] As shown in FIG.
4, each of the first sound collecting unit 4-1 to the fourth sound collecting unit 4-4 consists of M filter processing units 41, to which the received signals x_1 to x_M of the respective microphones are input, and an addition unit 42 that adds the output signals of the filter processing units 41. Each filter processing unit 41 is, for example, an FIR filter, and processes each frequency component of the collected sound signal digitally so as to set the directivity of the microphone arrays 3L and 3R. This technique is described, for example, in "Sound system and digital processing," co-authored by Oga Juro, Yoshio Yamazaki and Toyoda Kanada, published March 25, 1995 by The Institute of Electronics, Information and Communication Engineers, and can be realized with well-known technology. The directivities of the first sound collecting unit 4-1 and the second sound collecting unit 4-2 are set so that the angular regions Θ_L and Θ_R shown in FIG. 3, which include the position of the desired sound source 1 as seen from the approximate centers of the microphone arrays 3L and 3R, are the sound collection ranges. The directivities of the third sound collecting unit 4-3 and the fourth sound collecting unit 4-4 are set so that the angular regions Θ̄_L and Θ̄_R, which do not include the position of the desired sound source 1, are the sound collection ranges. Further, the directivity of the fifth sound collecting unit 4-5 is set so that the angular region Θ_C, which includes the position of the desired sound source 1 as seen from the approximate midpoint between the microphone arrays 3L and 3R, is the sound collection range.
The directivity of the sixth sound collecting unit 4-6 is set so that the angular region Θ̄_C, which excludes the position of the desired sound source 1 as seen from the approximate midpoint between the microphone arrays 3L and 3R, is the sound collection range.

[0015] The signals collected with the directivities of the first to sixth sound collecting units 4-1 to 4-6 are converted to frequency domain signals by the frequency domain conversion unit 5. In this conversion the input signal is split into frames of short duration (for example, about 256 samples at a sampling frequency of 16000 Hz), and a discrete Fourier transform is applied to each frame. The discrete Fourier transform can be computed, for example, with a fast Fourier transform (FFT). The signal converted to the frequency domain is thereby divided into a plurality of frequency domain components. The collected signals converted to the frequency domain are input to the addition unit 6 and the sound source signal component estimation unit 7. The addition unit 6 receives the output signals of the first sound collecting unit 4-1 and the second sound collecting unit 4-2 and adds them for each frequency domain component.

[0016] The sound source signal component estimation unit 7 receives the output signals of all of the first sound collecting unit 4-1 to the sixth sound collecting unit 4-6 and estimates the signal amount of each sound source for each frequency domain component. Once the signal amount of each sound source has been estimated, the ratio of the signal amount of the desired sound source 1 to the signal amount of the other sound sources, that is, the SN ratio, can be obtained.
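The frame-wise conversion to the frequency domain described in paragraph [0015] can be sketched as follows. Only the frame length (about 256 samples at 16000 Hz) comes from the text; the 50 % frame overlap, the Hann window and the function name `frames_to_spectra` are assumptions made for illustration.

```python
import numpy as np

def frames_to_spectra(y, frame_len=256, hop=128):
    """Split y into overlapping frames and take the DFT of each frame.

    hop and the Hann window are assumptions; the text only gives the
    frame length. Returns an array of shape (num_frames, frame_len // 2 + 1),
    indexed by frame number l and frequency bin.
    """
    y = np.asarray(y, dtype=float)
    if len(y) < frame_len:
        raise ValueError("signal shorter than one frame")
    num_frames = 1 + (len(y) - frame_len) // hop
    window = np.hanning(frame_len)
    spectra = np.empty((num_frames, frame_len // 2 + 1), dtype=complex)
    for l in range(num_frames):
        frame = y[l * hop : l * hop + frame_len]
        spectra[l] = np.fft.rfft(window * frame)  # FFT of one windowed frame
    return spectra
```

Each row of the result corresponds to one frame and each column to one frequency domain component, which is the layout the later per-component gain calculation operates on.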
This SN ratio is determined for each frequency domain component and used as a gain coefficient: the multiplication unit 9 multiplies the signal supplied by the addition unit 6, whose main component is the signal of the desired sound source 1, by the gain coefficient for each frequency domain component, which suppresses the background noise component contained in that signal. The output of the multiplication unit 9 is converted to a time domain signal by the inverse frequency domain conversion unit 10 and output as the noise-removed signal. This completes the outline of the present invention.

[0017] The configuration and operation of each part are described in detail below. FIG. 4 shows the configuration of the first to fourth sound collecting units 4-1 to 4-4. The first sound collecting unit 4-1 is described here as an example; the second sound collecting unit 4-2, the third sound collecting unit 4-3 and the fourth sound collecting unit 4-4 perform the same processing. The first to fourth sound collecting units 4-1 to 4-4 function as side beamformers because, as seen from the positions on both sides of the desired sound source 1, each is set either to a sound collection characteristic whose range is the angular region including the desired sound source position or to one whose range is the angular region not including it. The signals x_{L,mL}(n) (mL = 1, 2, ..., ML) input to the first sound collecting unit 4-1 are fed to the filter processing units 41. Each filter processing unit 41 convolves the input signal x_{L,mL}(n) with a filter coefficient w_{L,mL}(n) given in advance (its design is described later), as in equation (7), and outputs the signal x'_{L,mL}(n). The output signal of each filter processing unit 41 is input to the addition unit 42.
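The structure just described, one FIR filter per microphone (filter processing units 41) followed by a sum (addition unit 42), is a filter-and-sum beamformer. The sketch below shows only that structure; the filter coefficients are taken as given, since their design is a separate step, and the function name `filter_and_sum` is hypothetical.

```python
import numpy as np

def filter_and_sum(x, w):
    """Filter-and-sum beamformer.

    x : array of shape (M, N), one row per microphone signal
    w : array of shape (M, K), one row of FIR coefficients per microphone
    Returns the beamformer output of length N.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    M, N = x.shape
    y = np.zeros(N)
    for m in range(M):
        # convolve signal m with its own filter and keep the first N samples
        y += np.convolve(x[m], w[m])[:N]
    return y
```

Different coefficient sets `w` applied to the same microphone signals give the different directivities of the side beamformers.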
The addition unit 42 adds its input signals as in equation (8) to obtain the output signal y_SL(n) of the first sound collecting unit 4-1. The filter coefficients w_{L,mL}(n) are designed, for example by the least squares method, so that the directivity DLSPB(ω, θ) of the first sound collecting unit has the characteristic given by equation (9). Similarly, the second, third and fourth sound collecting units are designed to satisfy the conditions of equations (10) to (12). Here Θ denotes the directions around the desired signal (for example, within about ±10° of the desired signal direction) and Θ̄ denotes all other directions, and D(ω, θ) in equations (9) to (12) represents the directivity of each sound collecting unit. The first sound collecting unit 4-1 emphasizes and collects only the sound arriving from the direction of the desired sound source 1 as seen from the microphone array 3L. The third sound collecting unit 4-3 emphasizes and collects only sounds arriving from directions other than that of the desired sound source 1 as seen from the microphone array 3L. The second sound collecting unit 4-2 emphasizes and collects only the sound arriving from the direction of the desired sound source 1 as seen from the microphone array 3R. The fourth sound collecting unit 4-4 emphasizes and collects only sounds arriving from directions other than that of the desired sound source 1 as seen from the microphone array 3R.

[0018] FIG. 5 shows the flow of processing in the fifth sound collecting unit 4-5 and the sixth sound collecting unit 4-6, which function as front beamformers.
In the front beamformer, the signal x_{L(ML/2)}(n) received by the microphone at the center of the microphone array 3L and the signal x_{R(MR/2)}(n) received by the microphone at the center of the microphone array 3R are input to the filter processing units 51 and 52, respectively. The filter processing units 51 and 52 convolve these input signals with the filter coefficients w_{C(ML/2)}(n) and w_{C(MR/2)}(n) given in advance, as in equations (13) and (14), and output x'_{L(ML/2)}(n) and x'_{R(MR/2)}(n). The filter coefficients w_{C(ML/2)}(n) and w_{C(MR/2)}(n) should have the same phase characteristics; for example, a single impulse is used for each. In the fifth sound collecting unit 4-5, the output signals x'_{L(ML/2)}(n) and x'_{R(MR/2)}(n) of the filter processing units 51 and 52 are input to the addition unit 53, which adds them as in equation (16) and outputs the signal y_SC(n). As a result, the fifth sound collecting unit 4-5 emphasizes and collects only the sound arriving from the direction of the desired sound source 1 as seen from the midpoint between the microphone array 3L and the microphone array 3R.

[0019] y_SC(n) = x'_{L(ML/2)}(n) + x'_{R(MR/2)}(n)    (16)

In the sixth sound collecting unit 4-6, the output signals x'_{L(ML/2)}(n) and x'_{R(MR/2)}(n) of the filter processing units 51 and 52 are input to the subtraction unit 54, which subtracts one from the other as in equation (17) and outputs the signal y_NC(n). The sixth sound collecting unit 4-6 therefore emphasizes and collects only sounds arriving from directions other than that of the desired sound source 1 as seen from the midpoint between the microphone array 3L and the microphone array 3R.

y_NC(n) = x'_{L(ML/2)}(n) − x'_{R(MR/2)}(n)    (17)

FIG.
6 shows the flow of processing in the sound source signal component estimation unit 7. The frequency components Y_SL(ω, l), Y_NL(ω, l), Y_SC(ω, l), Y_NC(ω, l), Y_SR(ω, l) and Y_NR(ω, l) input to the sound source signal component estimation unit 7 are fed to the power calculation unit 61, which outputs the power values |Y_SL(ω, l)|^2, |Y_NL(ω, l)|^2, |Y_SC(ω, l)|^2, |Y_NC(ω, l)|^2, |Y_SR(ω, l)|^2 and |Y_NR(ω, l)|^2 to the vectorization unit 62. The vectorization unit 62 collects the power values of the output signals of the first to sixth sound collecting units 4-1 to 4-6 into the vector Y*(ω, l) of equation (18) and outputs it. Note that letters with the suffix * and capital letters in the expressions represent vectors.

[0020] The power vector Y*(ω, l) is input to the multiplication unit 63. The other input of the multiplication unit 63, the power estimation matrix T*^+, is the output of the pseudo-inverse matrix calculation unit 64, which receives the gain matrix T* defined by equation (19) and outputs its pseudo-inverse T*^+. Each element of the gain matrix T* is the gain of the directivity set in the fifth sound collecting unit 4-5, the sixth sound collecting unit 4-6 and the first to fourth sound collecting units 4-1 to 4-4 with respect to the Θ_x direction or the Θ̄_x direction, for example the average of the directivity over frequency and direction as in equations (20) to (23). α_x is the average of the directivities set in the first, second and fifth sound collecting units 4-1, 4-2 and 4-5 over the directions around the desired signal.
β_x is the average of the directivities set in the first, second and fifth sound collecting units 4-1, 4-2 and 4-5 over directions other than those around the desired signal. γ_x is the average of the directivities set in the third, fourth and sixth sound collecting units 4-3, 4-4 and 4-6 over the directions around the desired signal. δ_x is the average of the directivities set in the third, fourth and sixth sound collecting units 4-3, 4-4 and 4-6 over directions other than those around the desired signal. In equations (20) to (23), the subscript x stands for one of R, C and L.

[0021] The multiplication unit 63 multiplies the input beamformer output power vector by the power estimation matrix for each frequency component, as in equation (24), and outputs the estimated signal power vector X*_opt(ω, l). FIG. 7 shows the flow of processing in the gain coefficient calculation unit 8. The estimated signal power vector X*_opt(ω, l) supplied by the sound source signal component estimation unit 7 of FIG. 6 is input to the vector element extraction unit 81. As in equation (25), the vector element extraction unit 81 outputs the first component of the vector as the estimated signal power |S(ω, l)|^2, the second component as the estimated left-direction noise power |N_L(ω, l)|^2, the third component as the estimated front-direction noise power |N_C(ω, l)|^2 and the fourth component as the estimated right-direction noise power |N_R(ω, l)|^2, and these are input to the SN ratio estimation unit 82. The SN ratio estimation unit 82 calculates the estimated SN ratio ESNR(ω, l) using equation (26), and this value is output as the gain coefficient R(ω, l). As shown in FIG.
8, the gain coefficient R(ω, l) determined by equation (26) behaves as follows. Let the noise component be Nx = |N_L(ω, l)|^2 + |N_C(ω, l)|^2 + |N_R(ω, l)|^2 and the desired signal be Sx = |S(ω, l)|^2. When Nx >> Sx, R(ω, l) ≈ 0; when Nx << Sx, R(ω, l) ≈ 1, the predetermined maximum value. The gain coefficient R(ω, l) is calculated for each frequency domain component. In frequency regions where little noise is mixed in, R(ω, l) is close to 1 and the desired signal component is output essentially unchanged. In frequency regions where much noise is mixed in, R(ω, l) is close to 0 and the signal component is strongly attenuated, which suppresses the noise. Thus, multiplying the signal Y_S(ω, l) supplied by the addition unit 6, whose main component is the desired signal, by the gain coefficient R(ω, l) for each frequency domain component suppresses the noise component in each frequency region and improves the SN ratio of the signal converted to the time domain by the inverse frequency domain conversion unit 10.

[0022] The principle by which the present invention selects and emphasizes the desired sound is as follows. The output power of each collected signal, that is, each element of the power vector Y*(ω, l) of the signals output by the sound collecting units 4-1 to 4-6, can be approximated, as in equations (27) to (32), by the power of the signal X_θ(ω, l) received by the microphone array multiplied by the directivity for the source direction and frequency of that signal.
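The gain behavior described in paragraph [0021] can be sketched per frequency domain component. Equation (26) is not reproduced in this text, so the form used here, desired power divided by the total power of all sources, is an assumption chosen to match the stated limits (close to 0 when noise dominates, close to 1 when the desired signal dominates) and the ratio named in paragraph [0005]. Clipping negative power estimates to zero and the small constant `eps` are further assumptions, and the function names are hypothetical.

```python
import numpy as np

def gain_coefficient(S, NL, NC, NR, eps=1e-12):
    """Assumed form of the gain: desired power over total power, per bin.

    S, NL, NC, NR : estimated powers of the desired source and of the left,
                    front and right noise, as scalars or arrays over bins.
    """
    # negative estimates are clipped to zero (an assumption, not from the text)
    S, NL, NC, NR = (np.maximum(np.asarray(a, dtype=float), 0.0)
                     for a in (S, NL, NC, NR))
    return S / (S + NL + NC + NR + eps)  # eps avoids division by zero

def apply_gain(YS, R):
    """Multiply each frequency component of the main signal by its gain."""
    return np.asarray(R) * np.asarray(YS)
```

A bin with equal desired and total-noise contributions of 1, 1, 1 and 1 receives a gain of 0.25, while a noise-free bin passes through essentially unchanged.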
Here it is assumed that the sounds emitted by the respective sound sources are uncorrelated with one another and that each sound is received at the same level by all microphones.

[0023] Now divide the sound sources, positioned as in FIG. 3, into the desired sound source 1 and three background noise sources 2R, 2C and 2L, and assume that each signal X_θ(ω, l) belongs to one of Ŝ(ω, l), N̂_L(ω, l), N̂_C(ω, l) and N̂_R(ω, l). If the directivity of each sound collecting unit designed under the conditions of equations (9) to (12) is assumed uniform within each angular region Θ or Θ̄, Y*(ω, l) is expressed by equation (33). In this embodiment, the average directivity determined by equations (20) to (23) is used as the representative directivity for each angular region.

[0024] From this relationship, multiplying the beamformer output power vector Y*(ω, l) from the left by the pseudo-inverse T*^+ of the matrix T* given in advance yields the estimated signal power vector X*_opt(ω, l), which is an estimate of X*(ω, l).

[0025] The second embodiment modifies the procedure in the gain coefficient calculation unit 8 of the first embodiment. FIG. 9 shows the processing procedure of the gain coefficient calculation unit 8 used in the second embodiment. It differs from the gain coefficient calculation unit 8 of the first embodiment in that a non-linear processing unit 83 is added. To sharpen the distinction between the desired voice and the background noise, the non-linear processing unit 83 multiplies the input estimated SN ratio by a non-linear function Z(ω, l) that varies between 0 and 1, and outputs the result as R(ω, l).
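The estimation step of paragraph [0024] amounts to solving a small overdetermined linear system in the least-squares sense. The sketch below assumes a gain matrix of shape 6 by 4 (six beamformer outputs, four source powers); its actual entries, defined by equation (19), are not reproduced in this text and must be supplied by the caller. The function name `estimate_source_powers` is hypothetical.

```python
import numpy as np

def estimate_source_powers(Y, T):
    """Estimate source powers from beamformer output powers.

    Y : array of shape (6,) or (6, F), output powers of the six sound
        collecting units for one frequency bin or for F bins
    T : array of shape (6, 4), gain matrix relating the four source powers
        (desired, left noise, front noise, right noise) to the six outputs
    Returns the estimates, shape (4,) or (4, F).
    """
    T_pinv = np.linalg.pinv(np.asarray(T, dtype=float))  # shape (4, 6)
    return T_pinv @ np.asarray(Y, dtype=float)
```

Because the matrix is fixed once the directivities are designed, its pseudo-inverse can be computed once and reused for every frame and frequency bin.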
Here, the non-linear function Z(ω, l) is given in advance. It is a function that stays at or near 1 where the SN ratio ESNR(ω, l) is large and at or near 0 where ESNR(ω, l) is small, for example one built from the hyperbolic tangent shown in equation (35) or the logarithmic function shown in equation (36). FIG. 10 shows an example of the non-linear function Z(ω, l).

[0026] Here, ρ and the other parameter in these equations are set arbitrarily and change the characteristics of the non-linear function. The other parts are the same as in the first embodiment, so their description is omitted. With the non-linear characteristic shown in FIG. 10, frequency components in bands where the desired voice is dominant are emphasized and frequency components in bands where the background noise is dominant are suppressed, which has the effect of increasing the amount of SN ratio improvement.

[0027] The third embodiment modifies the procedures in the sound source signal component estimation unit 7 and the gain coefficient calculation unit 8 of the first embodiment. FIG. 11 shows the configuration of the sound source signal component estimation unit 7 used in the third embodiment, and FIG. 12 shows the configuration of the gain coefficient calculation unit 8. The frequency components YSL(ω, l), YNL(ω, l), YSC(ω, l), YNC(ω, l), YSR(ω, l), YNR(ω, l) input to the sound source signal component estimation unit 7 are fed to the absolute value calculation unit 61′, and the absolute values |YSL(ω, l)|, |YNL(ω, l)|, |YSC(ω, l)|, |YNC(ω, l)|, |YSR(ω, l)|, |YNR(ω, l)| of the signals are output to the vectorization unit 62. The vectorization unit 62 outputs the absolute value vector Y*(ω, l) shown in equation (37) for the input signals.

[0028] The absolute value vector is input to the multiplication unit 63.
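The behaviour required of the non-linear function Z(ω, l) in paragraphs [0025] and [0026] can be illustrated with a tanh-based weight. Equations (35) and (36) are not reproduced in this text, so the form below is an assumed stand-in rather than the function of the invention: it applies a hyperbolic tangent to the SN ratio in decibels, `rho` is used as a slope parameter, and `threshold_db` is a hypothetical second parameter that sets where the weight crosses 0.5.

```python
import numpy as np


def nonlinear_weight(esnr, rho=0.5, threshold_db=0.0):
    """Assumed tanh-based weight in (0, 1).

    Stays near 1 where the estimated SN ratio is large and near 0 where it
    is small. rho controls how sharp the transition is; threshold_db is the
    SN ratio (in dB) at which the weight equals 0.5. Both are illustrative.
    """
    # Floor the SN ratio to avoid log10(0).
    esnr_db = 10.0 * np.log10(np.maximum(esnr, 1e-12))
    return 0.5 * (1.0 + np.tanh(rho * (esnr_db - threshold_db)))


if __name__ == "__main__":
    for snr in (0.001, 0.1, 1.0, 10.0, 1000.0):
        print(snr, float(nonlinear_weight(snr)))
```

A larger `rho` makes the transition between suppression and pass-through steeper, which corresponds to the sharper distinction between desired voice and background noise that the second embodiment aims for.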
The absolute value estimation matrix T*^+, the other input of the multiplication unit 63, is the output of the pseudo-inverse matrix calculation unit 64. The pseudo-inverse matrix calculation unit 64 outputs the pseudo-inverse matrix T*^+ of the input gain matrix T*. The gain matrix T* is defined by equation (38) from the gains of the directivity characteristics calculated from the filter coefficients used in the filter processing units 41 (see FIG. 4) provided in the fifth and sixth sound collection units 4-5 and 4-6, which function as front beamformers, and in the first to fourth sound collection units 4-1 to 4-4, which function as side beamformers, and is given in advance.

[0029] The multiplication unit 63 multiplies the input beamformer output absolute value vector by the absolute value estimation matrix for each frequency component and outputs the estimated signal absolute value vector X*opt(ω, l). Next, as shown in equation (39), the vector element extraction unit 81 outputs the first component of the input estimated signal absolute value vector as the estimated signal absolute value |S(ω, l)|, the second component as the estimated left-direction noise absolute value |NL(ω, l)|, the third component as the estimated front-direction noise absolute value |NC(ω, l)|, and the fourth component as the estimated right-direction noise absolute value |NR(ω, l)| to the SN ratio estimation unit 82. The SN ratio estimation unit 82 calculates the estimated SN ratio ESNR(ω, l) using equation (40). The other parts are the same as in the first embodiment, so further description is omitted. Compared with the first embodiment, the third embodiment requires no squaring operations, so the amount of computation is reduced. The third embodiment can also be applied to the sound source signal component estimation unit 7 and the gain coefficient calculation unit 8 of the second embodiment. FIG. 13 shows the configuration of the gain coefficient calculation unit 8 when the modification of the third embodiment is applied to the second embodiment.

[0030] The sound collection device according to the present invention described above can be configured entirely in hardware, but the simplest implementation is to write each of the procedures described above as a sound collection program in a computer-readable programming language, install this program on a computer, and have the computer execute it so that the computer functions as a sound collection device. The sound collection program according to the present invention is recorded on a computer-readable recording medium such as a magnetic medium, a CD-ROM, or a semiconductor memory, and is installed on the computer from the recording medium or through a communication line. The installed sound collection program is interpreted by the CPU of the computer, and the computer functions as a sound collection device.

[0031] The sound collection device according to the invention is used, for example, in hands-free calling devices such as teleconferencing systems.

[0032] BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a layout diagram for explaining the outline of the present invention.
FIG. 2 is a block diagram for explaining the whole of the sound collection device according to the present invention.
FIG. 3 is a plan view for explaining the directivity of the first to sixth sound collection units used in the present invention.
FIG. 4 is a block diagram for explaining the configuration of the first to fourth sound collection units, which function as side beamformers, used in the present invention.
FIG. 5 is a block diagram for explaining the configuration of the fifth and sixth sound collection units, which function as front beamformers, used in the present invention.
FIG. 6 is a block diagram for explaining the configuration of the sound source signal component estimation unit used in the present invention.
FIG. 7 is a block diagram for explaining the configuration of the gain coefficient calculation unit used in the present invention.
FIG. 8 is a graph for explaining an example of the gain coefficient calculated by the gain coefficient calculation unit shown in FIG. 7.
FIG. 9 is a block diagram for explaining a modification of the gain coefficient calculation unit shown in FIG. 7.
FIG. 10 is a graph for explaining an example of the characteristics of the gain coefficient obtained by the gain coefficient calculation unit shown in FIG. 9.
FIG. 11 is a block diagram for explaining a modification of the sound source signal component estimation unit shown in FIG. 6.
FIG. 12 is a block diagram for explaining the configuration of a gain coefficient calculation unit that calculates the gain coefficient using the estimated values obtained by the sound source signal component estimation unit shown in FIG. 11.
FIG. 13 is a block diagram for explaining an embodiment in which the gain coefficient calculation unit shown in FIG. 9 is applied to the gain coefficient calculation unit shown in FIG. 12.
FIG. 14 shows layouts used in the simulation for confirming the effect of the present invention: A is a layout for the case of one background noise source, and B is a layout for the case of three background noise sources.
FIG. 15 illustrates the results of the simulation shown in FIG. 14A: A is a signal waveform diagram of the desired sound source, B is a waveform diagram with background noise superimposed on the desired sound source signal, and C is a waveform diagram of the result of sound collection processing by the sound collection device of the present invention.
FIG. 16 illustrates the results of the simulation shown in FIG. 14B: A is a signal waveform diagram of the desired sound source, B is a waveform diagram with background noise superimposed on the desired sound source signal, and C is a waveform diagram of the result of sound collection processing by the sound collection device of the present invention.
FIG. 17 is a graph for explaining the effect of the present invention.
FIG. 18 is a block diagram for explaining the prior art.

[0033] DESCRIPTION OF SYMBOLS
1 desired sound source
2 background noise source
3L, 3R microphone array
4-1 first sound collection unit
4-2 second sound collection unit
4-3 third sound collection unit
4-4 fourth sound collection unit
4-5 fifth sound collection unit
4-6 sixth sound collection unit
5 frequency domain conversion unit
6 addition unit
7 sound source signal component estimation unit
8 gain coefficient calculation unit
9 multiplication unit
10 inverse frequency domain conversion unit