Patent Translate, powered by EPO and Google. Notice: This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate, complete, reliable, or fit for specific purposes. Critical decisions, such as commercially relevant or financial decisions, should not be based on machine-translation output.

DESCRIPTION JP2017130899

Sound field estimation device, method and program thereof

Abstract: The present invention provides a sound field estimation device, method, and program for which the spatial region where sound field estimation is effective is larger than in the prior art. A sound field estimation device 200 uses the frequency-domain sound collection signals u(ω, m, j) of M spherical microphone arrays 1 and 2, each provided with microphones at predetermined positions on a sphere. It comprises a plane wave decomposition unit 213 that estimates a vector consisting of the intensities of the plane waves forming the sound field, and an interpolation estimation unit 216 that estimates the frequency-domain sound collection signal u^(ω, r) at a virtual microphone position r using the estimated value a(ω) of that vector and the position r. [Selected figure] Figure 2

[0001] The present invention relates to a technique for estimating the sound collection signal that would be obtained if a microphone were placed at one position, using the sound collection signals of microphones placed at other positions.
[0002] In recent years, audio reproduction technology has expanded from 2-channel stereo to 5.1-channel reproduction, and research and development on 22.2-channel reproduction and wave field synthesis is advancing. These efforts aim both to greatly improve the realism of reproduction itself and to expand the area over which highly realistic reproduction is achieved.

10-05-2019 1

[0003] In order to evaluate and verify such multi-channel audio reproduction methods, it is important to measure the reproduced sound field. For example, in wave field synthesis it is necessary to compare the actually recorded sound field with the reproduced sound field and grasp the difference. This is because various factors, such as the signal processing that converts the recorded sound field into reproduction signals, the encoding and decoding of the recorded signals, and the acoustic characteristics of the room in which the reproduction apparatus is installed, affect the reproduction accuracy of the sound field, and establishing a method of measuring that accuracy is therefore important.

[0004] (Conventional method 1) As a method of measuring a sound field, one can locally place microphones on part of the target measurement area and estimate the sound field of the surrounding area from the measurement result. As an example, spherical microphone arrays are under study. A spherical microphone array is a microphone array in which several tens or more of microphone elements are arranged on a spherical surface of radius r_a, where r_a ranges from a few centimeters to a few tens of centimeters.

[0005] FIG. 1 shows the signal flow of sound field estimation processing using the spherical microphone array 1 in the prior art. The time-domain signals y(t, r_a, Ω_j) collected by the J microphones disposed on the spherical surface are converted by the short-time Fourier transform unit 111 into the frequency-domain signals u(i, ω, r_a, Ω_j).
Here, t is time, i is a frame index, J is an integer of 2 or more, ω is a time frequency, and j = 1, 2, ..., J. The subsequent processing is performed frame by frame, but i is omitted to simplify the notation. Ω_j is the position of the j-th microphone element on the spherical surface, specified by the pair of elevation angle θ_j and azimuth angle φ_j: Ω_j = (θ_j, φ_j).

[0006] The spherical wave spectrum conversion unit 112 obtains the spherical wave spectrum u_n,m(ω, r_a) for each frequency ω by the following equation.

[0007]

[0008] Here, α_j is a weight set appropriately so that the product-sum of equation (1) satisfies the orthogonality of the spherical harmonics expressed by the following equation.

[0009]

[0010] Y_n^m(θ_j, φ_j) is the spherical harmonic of degree n and order m, and * denotes the complex conjugate. n = 0, 1, ..., N and m = −n, −n+1, ..., n. δ_nn′ is 1 when n = n′ and 0 when n ≠ n′; δ_mm′ is 1 when m = m′ and 0 when m ≠ m′. To obtain the spherical wave spectrum up to order N, (N+1)^2 or more microphone elements are required.

[0011] From this point on, we deal with measuring the sound field generated by a sound source located outside the measurement target range, that is, with the interior problem. In other words, the sound field generated by a sound source outside the sphere of the spherical microphone array is measured.

[0012] The sound field is considered in the polar coordinate system (r, Ω) = (r, θ, φ) with the center of the spherical microphone array as the origin.

[0013] The extrapolation estimation unit 116 extrapolates the sound field at frequency ω from the radius r_a to the radius r according to the following equation, obtaining the sound collection signal u(ω, r, Ω) at the position (r, Ω) = (r, θ, φ).
In other words, the sound collection signals of the microphones disposed on the spherical microphone array are used to estimate the sound field outside the sphere of the spherical microphone array.

[0014]

[0015] Here, k is the wavenumber, k = ω/c (c is the speed of sound), and b_n() is the mode intensity function.

[0016] Non-Patent Document 1 treats the case of an open-sphere spherical microphone array, in which the sphere is hollow and the microphone elements are arranged on the spherical surface. In this case, the mode intensity function is expressed by the following equation.

[0017]

[0018] Here, i is the imaginary unit, and j_n() is the spherical Bessel function of order n. When the spherical microphone array is configured by arranging the microphone elements on the surface of a rigid sphere, the mode intensity function is expressed by the following equation, based on Non-Patent Document 2.

[0019]

[0020] Here, h_n() is the spherical Hankel function of the first kind of order n, and A′ denotes the derivative of A.

[0021] The short-time inverse Fourier transform unit 118 converts the spatially extrapolated sound collection signal from the frequency-domain signal u(ω, r, Ω) into the time-domain signal y(t, r, Ω) and outputs it.

[0022] In equation (3), b_n(kr)/b_n(k r_a) is applied to the spherical wave spectrum, and the product-sum with Y_n^m(θ, φ) is taken. The product-sum with Y_n^m(θ, φ) corresponds to the inverse spherical wave spectrum conversion. The spatially extrapolated sound collection signal u(ω, r, Ω) is therefore a signal in the frequency domain.

[0023] In measurement with an open-sphere microphone array, the influence of singular points cannot be avoided: measurement becomes impossible at combinations of k and r for which j_n(kr) = 0. Specifically, when j_n(kr) = 0 the output becomes zero even if a sound field is present. A rigid-sphere microphone array, by contrast, has no such singularities, so this problem does not arise.
For this reason, the mainstream choice of spherical microphone array is the rigid-sphere type.

[0024] (Non-Patent Document 1) T. Abhayapala and D. Ward, "Theory and design of high order sound field microphones using spherical microphone array," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2002, pp. II-1949. (Non-Patent Document 2) J. Meyer and G. Elko, "A highly scalable spherical microphone array based on an orthonormal decomposition of the soundfield," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2002, pp. II-1781-II-1784.

[0025] In the prior art, the Bessel function j_n(kr), or the Bessel function j_n(kr) together with the Hankel function h_n(kr), is used when extrapolating the sound field from the radius r_a to the radius r. According to Reference 1, as a global tendency both functions decay at a rate of 1/kr as kr increases. (Reference 1) E. G. Williams, "Fourier Acoustics," 2005, pp. 234-236. For example, when r becomes ten times as large as r_a, the extrapolated estimate drops sharply, to about 1/10. The spatial region where extrapolation is effective is therefore limited to the vicinity of the spherical microphone array surface. For the same reason, raising the frequency ω increases k = ω/c and sharply reduces the extrapolated estimate; in other words, as the frequency rises, the spatial region in which extrapolation is effective narrows sharply.

[0026] An object of the present invention is to provide a sound field estimation device, method, and program for which the spatial region where sound field estimation is effective is larger than in the prior art.
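The 1/kr decay discussed above can be checked numerically. The following sketch (illustrative values only: the sound speed, frequency, and radius are example choices, not taken from the text) evaluates the spherical Bessel function j_0, which enters the open-sphere mode intensity function, at the array surface and at ten times the radius:

```python
import numpy as np
from scipy.special import spherical_jn

# Illustrative check of the 1/kr decay of the extrapolation factor
# (all constants below are example values, not the patent's).
c = 343.0                      # speed of sound [m/s] (assumed)
k = 2 * np.pi * 1000.0 / c     # wavenumber at 1 kHz
r_a = 0.04                     # example array radius of 4 cm

near = abs(spherical_jn(0, k * r_a))       # at the array surface
far = abs(spherical_jn(0, k * 10 * r_a))   # at ten times the radius

print(near, far)
```

The far value comes out roughly an order of magnitude smaller than the near value, matching the 1/kr envelope cited from Reference 1 and illustrating why extrapolation is only effective near the array surface.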
[0027] In order to solve the above problems, according to one aspect of the present invention, a sound field estimation device is configured as follows. Let m = 1, 2, ..., M be the index of a spherical microphone array, j_m = 1, 2, ..., J_m be the index of a microphone in array m, r_m be the radius in polar coordinates, θ_j_m and φ_j_m be the polar angles, and ω be the time-frequency index. There are M spherical microphone arrays m, each having microphones at the J_m positions r<->(m, j_m) = d<->_m + [r_m sin θ_j_m cos φ_j_m, r_m sin θ_j_m sin φ_j_m, r_m cos θ_j_m]^T of polar angles θ_j_m and φ_j_m on a sphere of radius r_m centered at d<->_m. The device comprises a plane wave decomposition unit that estimates a vector consisting of the intensities of the plane waves that make up the sound field, using the frequency-domain sound collection signals u(ω, m, j_m) of the M arrays, and an interpolation estimation unit that estimates the frequency-domain sound collection signal u^(ω, r<->_p) at a virtual microphone position r<->_p, using the estimated value a(ω) of the vector consisting of the plane wave intensities and the position r<->_p.

[0028] In order to solve the above problems, according to another aspect of the present invention, a sound field estimation method is configured as follows. With m, j_m, r_m, θ_j_m, φ_j_m, and ω defined as above, the method comprises a plane wave decomposition step in which a plane wave decomposition unit estimates a vector consisting of the intensities of the plane waves constituting the sound field, using the frequency-domain sound collection signals u(ω, m, j_m) of the M spherical microphone arrays m having microphones at the J_m positions r<->(m, j_m) = d<->_m + [r_m sin θ_j_m cos φ_j_m, r_m sin θ_j_m sin φ_j_m, r_m cos θ_j_m]^T, and an interpolation estimation step in which an interpolation estimation unit estimates the frequency-domain sound collection signal u^(ω, r<->_p) at a virtual microphone position r<->_p, using the estimated value a(ω) of the vector consisting of the plane wave intensities and the position r<->_p.

[0029] According to the present invention, the spatial region where sound field estimation is effective is larger than in the prior art.

[0030] FIG. 1 is a functional block diagram of a sound field estimation device according to the prior art. FIG. 2 is a functional block diagram of the sound field estimation device according to the first embodiment. FIG. 3 is a diagram showing an example of the processing flow of the sound field estimation device according to the first embodiment. FIG. 4 is a diagram showing an outline of the positions of virtual microphones in the first embodiment and its Modification 1. FIG. 5 is a functional block diagram of the sound field estimation device according to the second embodiment. FIG. 6 is a diagram showing an example of the processing flow of the sound field estimation device according to the second embodiment. FIG. 7 is a functional block diagram of the sound field estimation device according to the third embodiment.
FIG. 8 is a diagram showing an example of the processing flow of the sound field estimation device according to the third embodiment.

[0031] Hereinafter, embodiments of the present invention will be described. In the drawings used in the following description, constituent parts having the same functions and steps performing the same processing are given the same reference numerals, and redundant description is omitted. In the following text, symbols such as ^ and <-> should properly be written directly above the preceding character, but owing to the limitations of text notation they are written immediately after it; in the formulas, these symbols appear in their proper positions. Processing defined on each element of a vector or matrix is applied to all elements of that vector or matrix unless otherwise noted.

[0032] <Points of the First Embodiment> In the present embodiment, (1) a plurality of microphone arrays is used instead of a single spherical microphone array, and (2) the collection of plane waves that make up the sound field is found directly from the frequency-domain sound collection signals instead of via the spherical wave spectrum, and the sound field is estimated using these plane waves. Through (1) and (2), the spatial range in which sound field estimation is effective can be greatly expanded. The method is described below.

[0033] <Sound Field Estimation Device 200 According to the First Embodiment> FIG. 2 shows a functional block diagram of the sound field estimation device 200 according to the first embodiment, and FIG. 3 shows its processing flow.

[0034] The sound field estimation device 200 includes a short-time Fourier transform unit 211, a plane wave decomposition unit 213, an interpolation estimation unit 216, and a short-time inverse Fourier transform unit 218.
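The four units listed above form a simple processing chain. A minimal numerical sketch of that data flow follows; all function names and the frame length are invented for illustration, and unit 213 is stood in for by an L2-regularized fit rather than the L1-regularized fit of equation (7), purely to keep the sketch short:

```python
import numpy as np

# Hypothetical sketch of the Fig. 2 chain; the real units 211/213/216/218
# are only described, not specified, in the text.

def stft_unit(y, frame=256):                   # unit 211: time -> frequency domain
    frames = y[: len(y) // frame * frame].reshape(-1, frame)
    return np.fft.rfft(frames, axis=1)         # u(i, omega, ...)

def plane_wave_decomposition(u, D, lam=1e-2):  # unit 213: regularized fit
    # L2 (ridge) stand-in; the patent uses an L1 term to obtain sparsity.
    G = D.conj().T @ D + lam * np.eye(D.shape[1])
    return np.linalg.solve(G, D.conj().T @ u)

def interpolate(a, k_vecs, r):                 # unit 216: plane-wave synthesis
    return (a * np.exp(1j * (k_vecs @ r))).sum()

def istft_unit(U, frame=256):                  # unit 218: frequency -> time domain
    return np.fft.irfft(U, n=frame, axis=1).ravel()
```

The stft_unit/istft_unit pair round-trips a signal exactly, and plane_wave_decomposition recovers the coefficient vector when the dictionary is well conditioned; the following embodiments refine each stage.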
[0035] The sound field estimation device 200 receives the time-domain sound collection signals y(t, 1, j) (j = 1, 2, ..., J) from the spherical microphone array 1 and the time-domain sound collection signals y(t, 2, j) (j = 1, 2, ..., J) from the spherical microphone array 2, receives the position information r<-> of a virtual microphone, and outputs the time-domain sound collection signal y(t, r<->) at the virtual microphone position r<->.

[0036] In this embodiment, the number of microphones on the spherical surface, the arrangement of the microphones, and the radius are assumed to be the same for the spherical microphone arrays 1 and 2. The spherical microphone arrays 1 and 2 are open-sphere type spherical microphone arrays; J microphones are arranged on a spherical surface of radius r_a, and the microphone arrangement on the spherical surface is specified by pairs of elevation and azimuth angles (θ_j, φ_j). The center position of the spherical microphone array 1 is d<->_1 = [x1, y1, z1], and the center position of the spherical microphone array 2 is d<->_2 = [x2, y2, z2]. The three-dimensional position of the j-th microphone on the spherical microphone array m (m = 1, 2) is

[0037]

[0038] given by the above. The signal collected by this microphone is denoted as the time-domain sound collection signal y(t, m, j) (m = 1, 2; j = 1, 2, ..., J).

[0039] <Short-Time Fourier Transform Unit 211> The short-time Fourier transform unit 211 receives the time-domain sound collection signals y(t, m, j) (m = 1, 2; j = 1, 2, ..., J) and, by the short-time Fourier transform, converts them into the frequency-domain sound collection signals u(i, ω, m, j) (i: frame number, ω = 1, 2, ..., F, j = 1, 2, ..., J) (S211) and outputs them.
The subsequent processing is performed for each frame i, but the frame number i is omitted to simplify the description. Any method of converting a time-domain signal into a frequency-domain signal may be used in place of the short-time Fourier transform.

[0040] <Plane Wave Decomposition Unit 213> The plane wave decomposition unit 213 receives the frequency-domain sound collection signals u(ω, m, j) (ω = 1, 2, ..., F; m = 1, 2; j = 1, 2, ..., J), uses these values to estimate a vector consisting of the intensities of the plane waves that compose the sound field (S213), and outputs the estimated values a<->(ω) (ω = 1, 2, ..., F). For example, the plane wave decomposition unit 213 obtains the solution vector (estimated value) a<->(ω) that minimizes the following cost function J, in order to obtain the collection of plane waves that constitute the sound field.

[0041]

[0042] Here, λ is a regularization constant; the larger λ is, the more robust the estimation becomes against noise on u(ω, m, j). D(ω) in equation (7) is called the dictionary matrix:

[0043]

[0044] The l-th column vector D(ω, l) is the vector consisting of the observed values at the spherical microphone arrays 1 and 2 when a plane wave of amplitude 1 is incident from the direction specified by the elevation-azimuth pair (θ_l, φ_l), with zero phase at the origin. The L′ incidence angles (θ_l, φ_l) are set, for example, so that the L′ plane waves cover all directions uniformly, such as plane waves incident from the directions of the vertices of a regular polyhedron.

[0045] The l-th column vector D(ω, l) of the dictionary matrix D(ω) is built from the observed value v(ω, l, m, j) of the j-th microphone of the spherical microphone array m when a plane wave with incidence angle (θ_l, φ_l) arrives.
[0046] Using this value,

[0047] D(ω, l) is represented as shown above.

[0048] The l-th element of the solution vector a<->(ω) corresponds to the amplitude of the l-th plane wave. Let a(ω) = [a_1(ω), a_2(ω), ..., a_l(ω), ..., a_L′(ω)]^T.

[0049] Convex optimization with the above L1-norm regularization term yields as the solution vector a<->(ω) a sparse vector containing many zeros. Therefore, as shown in Reference 2, plane waves can be extracted well even in the redundant case where the number L′ of plane waves assumed in advance greatly exceeds the number of microphones. (Reference 2) A. Wabnitz, N. Epain, A. van Schaik, C. Jin, "Reconstruction of spatial sound field using compressed sensing," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2011.

[0050] <Interpolation Estimation Unit 216> The interpolation estimation unit 216 receives the estimated value a(ω) and the position information r<-> = (rx, ry, rz) of the virtual microphone, estimates the frequency-domain sound collection signal u^(ω, r<->) (ω = 1, 2, ..., F) at the virtual microphone position r<-> by the following equation (S216) (in other words, it estimates the output value u^(ω, r<->) of the virtual microphone from the plane wave model whose parameters are the solution vector a<->(ω)), and outputs it.

[0051]

[0052] Here, ● denotes the inner product, and k<->_l is the wave number vector corresponding to the incidence direction of the l-th plane wave, expressed by the following equation.

[0053]

[0054] ^T denotes transposition. The position information r<-> of the virtual microphone is input, for example, by the user of the sound field estimation device 200.
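To make the decomposition and interpolation concrete, here is a small self-contained sketch: a dictionary of plane waves observed by two illustrative open arrays, a plain ISTA loop standing in for the convex L1 solver (the text does not prescribe a particular solver), and the plane-wave synthesis at a virtual position. All geometry and constants are invented for the example:

```python
import numpy as np

# Illustrative sketch of sparse plane wave decomposition and virtual-microphone
# synthesis; geometry, direction grid, and solver are example choices.
rng = np.random.default_rng(0)
c, f = 343.0, 1000.0
k = 2.0 * np.pi * f / c

# L' = 40 candidate incidence directions (unit vectors), wave number vectors
dirs = rng.normal(size=(40, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
K = -k * dirs                                  # k_l points along propagation

# two open-sphere arrays of 12 microphones each, radius 4 cm, 0.5 m apart
pts = rng.normal(size=(12, 3))
pts = 0.04 * pts / np.linalg.norm(pts, axis=1, keepdims=True)
mics = np.vstack([pts + [0.0, 0.0, 0.0], pts + [0.5, 0.0, 0.0]])

D = np.exp(1j * mics @ K.T)                    # dictionary matrix of plane waves

a_true = np.zeros(40, complex)
a_true[[3, 17]] = [1.0, 0.5j]                  # two incident plane waves
u = D @ a_true                                 # noiseless observation vector

# ISTA iterations for min_a 0.5*||u - D a||^2 + lam*||a||_1
a = np.zeros(40, complex)
step = 1.0 / np.linalg.norm(D, 2) ** 2
lam = 1e-3
for _ in range(500):
    g = a + step * (D.conj().T @ (u - D @ a))
    a = np.maximum(np.abs(g) - step * lam, 0.0) * np.exp(1j * np.angle(g))

# synthesis at a virtual microphone position r_p outside both arrays
r_p = np.array([0.25, 0.1, 0.0])
u_hat = np.sum(a * np.exp(1j * (K @ r_p)))
print(abs(u_hat))
```

With a sparse field of only two plane waves, the fitted coefficients reproduce the array observations closely, and u_hat then gives the field at a point well away from either array, which is the key advantage over radial extrapolation.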
[0055] <Short-Time Inverse Fourier Transform Unit 218> The short-time inverse Fourier transform unit 218 receives the frequency-domain sound collection signal u^(ω, r<->) (ω = 1, 2, ..., F), converts it into the time-domain sound collection signal y(t, r<->) by the inverse short-time Fourier transform (S218), and outputs it. Any method corresponding to the conversion method used in the short-time Fourier transform unit 211 may be used to convert the frequency-domain signal back into the time domain.

[0056] <Effects> With the above configuration, a sound field estimation device can be realized in which the spatial region where sound field estimation is effective is larger than in the prior art.

[0057] <Modification 1> In the first embodiment, one virtual microphone is assumed and the signal picked up at that position is estimated; of course, virtual microphones may be assumed at a plurality of positions. Moreover, by arranging virtual microphones on the same spherical surface, an open-sphere type virtual microphone array of radius r can be configured. For example, when the center of the virtual microphone array is at the position D = [dx dy dz] from the origin and the virtual microphone array comprises P virtual microphones, the sound field estimation device 200 receives the P pieces of virtual microphone position information r<->_p (p = 1, 2, ..., P) and outputs the sound collection signals y(t, r<->_p) (p = 1, 2, ..., P).

[0058] With r<->_p (p = 1, 2, ..., P) denoting the position of the p-th microphone on the spherical surface of the virtual microphone array, the interpolation estimation unit 216 estimates the frequency-domain sound collection signal u^(ω, r<->_p) by the following equation.

[0059]

[0060] FIG.
4 shows an outline of the positions of the virtual microphones in the first embodiment and its Modification 1. The first embodiment corresponds to the case where the center [dx dy dz] of the virtual microphone array is [rx ry rz], the radius is r = 0, and the number P of virtual microphones in the array is 1; the first embodiment can therefore be regarded as an example of Modification 1 (strictly speaking, in this modification the microphones are placed on a spherical surface of radius r, whereas in the first embodiment r = 0, that is, the microphone is placed at a point rather than on a spherical surface).

[0061] <Modification 2> In the first embodiment, the case of installing two open-sphere type microphone arrays in a sound field was described. In this modification, the case of installing two rigid-sphere type microphone arrays in a sound field is described.

[0062] The rigid-sphere microphone array has radius r_a and J microphones, and the microphone arrangement on the spherical surface is specified by pairs of elevation and azimuth angles (θ_j, φ_j). When a plane wave of amplitude 1 is incident from the direction specified by the elevation-azimuth pair (θ_l, φ_l), the sound field consists of the incident wave and the scattered wave.

[0063] When the center of the rigid-sphere microphone array coincides with the origin of the coordinate system, the signal observed by the j-th microphone is

[0064]

[0065] as given above. When the center of the m-th (m = 1, 2) rigid-sphere microphone array is displaced from the origin by d<->_m, the signal observed by the j-th microphone must take the phase difference into account.
[0066]

It then becomes as given above.

[0067] Therefore, if the l-th column vector D(ω, l) of the dictionary matrix D(ω) in equation (10) is generated using equation (16) instead of equation (9), the rest proceeds as before: by solving the same optimization problem, plane waves can be extracted from the output signals of the rigid-sphere microphone array.

[0068] Since equation (15) contains an infinite number of terms, in practice the series is truncated at a finite order n, and v^rigid(ω, l, m, j) is obtained by numerical calculation. When r_a = 4 cm, n of about 10 suffices.

[0069] <Other Modifications> In this embodiment, the microphone arrangement and the radius are the same for the microphone arrays 1 and 2, but they may differ. Further, the number of microphone arrays need not be two; any plural number may be used. For example, let M be any integer of 2 or more; the three-dimensional position of the j_m-th microphone on the M spherical microphone arrays m (m = 1, 2, ..., M), with center d<->_m and radius r_m, is

[0070]

[0071] given by the above. The signal collected by this microphone is denoted as the time-domain sound collection signal y(t, m, j_m) (j_m = 1, 2, ..., J_m, where J_m is the number of microphones in the spherical microphone array m). Equations (7) and (8) are replaced by the following equations.

[0072]

[0073] The l-th column vector D(ω, l) of the dictionary matrix D(ω) is built from the observed value of the j_m-th microphone of the spherical microphone array m when a plane wave with incidence angle (θ_l, φ_l) arrives,

[0074] using which,

[0075] it is represented as shown above.

[0076] Moreover, equations (15) and (16) of Modification 2 of the first embodiment are replaced by the following equations.

[0077]

[0078] Second Embodiment. The following description focuses on the parts that differ from Modification 1 of the first embodiment.
[0079] In Modification 1 of the first embodiment, an open-sphere type microphone array is assumed virtually and its sound collection signals are estimated. In the second embodiment, based on the configuration of Modification 1 of the first embodiment, a rigid-sphere type microphone array is assumed virtually instead of the open-sphere type, and its sound collection signals are estimated.

[0080] FIG. 5 shows a functional block diagram of the sound field estimation device 300 according to the second embodiment, and FIG. 6 shows its processing flow.

[0081] The sound field estimation device 300 includes a short-time Fourier transform unit 211, a plane wave decomposition unit 213, an interpolation estimation unit 216, and a short-time inverse Fourier transform unit 218, and further includes an array type conversion unit 317.

[0082] First, as the virtual spherical microphone array, sound is assumed to be collected by the dual open-sphere microphone array of Reference 3. In this microphone array, the microphone elements are disposed on a spherical surface of radius r or on a spherical surface of radius αr, and α = 1.2 is recommended. (Reference 3) I. Balmages and B. Rafaely, "Open-Sphere Designs for Spherical Microphone Arrays," IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 2, pp. 727-732, 2007. For example, let Q = P × 2, and let the positions of P of the Q virtual microphone elements be the same as in Modification 1.
That is, the center of the virtual microphone array is at the position D = [dx dy dz] from the origin, and the position r<->_p of the p-th virtual microphone on the spherical surface of the virtual microphone array is

[0083]

[0084] as given above. The remaining P of the Q virtual microphone elements are arranged on a sphere with center [dx dy dz] and radius αr, and the position of the q-th virtual microphone is

[0085]

[0086] as given above. Further, it is assumed that Ω_q = (θ_q, φ_q) = Ω_p = (θ_p, φ_p). That is, the q-th (q = P + p) microphone and the p-th microphone lie in the same direction from the center of the virtual microphone array, the p-th microphone being on the spherical surface of radius r and the q-th microphone on the spherical surface of radius αr.

[0087] The interpolation estimation unit 216 receives the estimated value a(ω), the center D of the virtual microphone array, the P pieces of virtual microphone position information r<->_p (p = 1, 2, ..., P), and the P pieces of position information r<->_q (q = P+1, P+2, ..., Q), estimates the frequency-domain sound collection signals u^(ω, r<->_p) (p = 1, 2, ..., P) and u^(ω, r<->_q) (q = P+1, P+2, ..., Q) at the virtual microphone positions r<->_p and r<->_q (S216), and outputs them. Instead of the P pieces of position information r<->_q (q = P+1, P+2, ..., Q), only α may be received, and the P pieces of position information r<->_q may be calculated from the P pieces of position information r<->_p and α.
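The point of duplicating each direction at the two radii r and αr can be illustrated numerically: at a frequency where j_0(kr) = 0, an open sphere of radius r is blind in that mode, while the sphere of radius αr is not. The order, radius, and frequency below are example values, not taken from the text:

```python
import numpy as np
from scipy.special import spherical_jn

# Why a second sphere of radius alpha*r helps (example values): at a
# frequency where j_0(k*r) = 0, the r sphere observes nothing in that mode,
# while the alpha*r sphere still does.
alpha = 1.2                    # ratio recommended in Reference 3
r = 0.05                       # example inner radius [m]
k = np.pi / r                  # first zero of j_0 lies at k*r = pi

u_inner = spherical_jn(0, k * r)          # mode amplitude on the r sphere
u_outer = spherical_jn(0, k * alpha * r)  # mode amplitude on the alpha*r sphere

# keep whichever observation has the larger magnitude
v = u_inner if abs(u_inner) > abs(u_outer) else u_outer
print(abs(u_inner), abs(u_outer))
```

Here the inner-sphere amplitude vanishes while the outer one does not, so taking the larger-magnitude observation keeps the mode measurable across all frequencies.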
[0088] <Array Type Conversion Unit 317> The array type conversion unit 317 receives the frequency-domain sound collection signals u^(ω, r<->_p) (p = 1, 2, ..., P) and u^(ω, r<->_q) (q = P+1, P+2, ..., Q) and converts them into the spherical wave spectra u_n,m(ω, r) and u_n,m(ω, αr).

[0089]

[0090] In an open-sphere spherical microphone array, measurement becomes impossible at k and r for which j_n(kr) = 0, owing to the influence of singular points. By selecting whichever of u_n,m(ω, r) and u_n,m(ω, αr) has the larger absolute value, however, the dual open-sphere spherical microphone array can avoid the influence of the singularities.

[0091] Therefore, the array type conversion unit 317 sets

[0092]

[0093] when |u_n,m(ω, r)| > |u_n,m(ω, αr)|, and

[0094]

[0095] when |u_n,m(ω, r)| ≤ |u_n,m(ω, αr)|.

[0096]

[0097] The spherical wave spectrum v_n,m(ω, r) is thus determined.

[0098] The array type conversion unit 317 finally applies the inverse spherical wave spectrum conversion

[0099]

[0100] given above. As a result, the frequency-domain sound collection signal is obtained for the case where a rigid-sphere type microphone array of radius r is installed at the position of the dual open-sphere type spherical microphone array that was virtually installed first. The array type conversion unit 317 outputs the frequency-domain signals v(ω, r<->_p) (p = 1, 2, ..., P) to the short-time inverse Fourier transform unit 218.

[0101] <Effects> With this configuration, the same effects as in Modification 1 of the first embodiment can be obtained. Furthermore, the sound collection signals can be obtained virtually for the case where a rigid-sphere type microphone array is installed. The present embodiment may be combined with Modification 2 of the first embodiment.

[0102] Third Embodiment. The application of a rigid-sphere microphone array to virtual reality is shown in Reference 4. (Reference 4) R.
Duraiswami, D. N. Zotkin, Z. Li, E. Grassi, N. A. Gumerov, L. S. Davis, "High Order Spatial Audio Capture and Binaural Head-Tracked Playback over Headphones with HRTF Cues," Proceedings of the 119th Convention of the AES, 2005. Reference 4 presents a method that takes as input the sound collection signals of a fixed rigid-sphere microphone array and the direction of a virtual head, and outputs the signals (binaural signals) that would be heard by the right and left ears when the head is turned in the designated direction. Since the spherical microphone array picks up sound in all directions, binaural signals corresponding to any designated direction can be generated without moving the microphone elements or the microphone array. That is, if the listener's head rotation is measured and input in real time, binaural signals can be generated and presented to the listener following the rotational movement.

[0103] The second embodiment showed a method of obtaining the sound collection signals of a virtually installed rigid-sphere microphone array. The configuration of the present embodiment, shown in FIG. 7, combines this sound collection signal with the binaural signal generation method.

[0104] The description focuses on the parts that differ from the second embodiment.

[0105] FIG. 7 shows a functional block diagram of the sound field estimation device 400 according to the third embodiment, and FIG. 8 shows its processing flow.

[0106] The sound field estimation device 400 includes a short-time Fourier transform unit 211, a plane wave decomposition unit 213, an interpolation estimation unit 216, an array type conversion unit 317, and a short-time inverse Fourier transform unit 218, and further includes a binaural signal generation unit 419.
[0107] <Binaural signal generation unit 419> The binaural signal generation unit 419 receives the virtual head direction (posture) and the time-domain sound collection signals y(t, r_p) (where p = 1, 2, ..., P; these correspond to the sound pickup signals of a rigid-sphere spherical microphone array), generates binaural signals y(t, R) and y(t, L), for example according to the method described in Reference 4 (S419), and outputs them as the output values of the sound field estimation apparatus 400. Note that the position of the virtual head corresponds to the center D = [d_x d_y d_z] of the virtual microphone array, and the time-domain sound collection signal y(t, r_p) corresponds to the sound pickup signal of the rigid-sphere spherical microphone array at the position of the virtual head. Therefore, from the virtual head direction (posture) and the time-domain sound collection signals y(t, r_p), the binaural signal generation unit 419 can generate the binaural signals y(t, R) and y(t, L) at the position and direction of the virtual head.

[0108] The method of Reference 4 can follow only the rotational movement of the head and cannot cope with its translational movement. In the configuration of the present embodiment, however, the rigid-sphere spherical microphone array can be virtually translated. The present embodiment therefore makes it possible to generate binaural signals that follow both the rotational and the translational movements of the head.

[0109] <Other Modifications> The present invention is not limited to the above embodiments and modifications. For example, the various processes described above may be executed not only in chronological order according to the description but also in parallel or individually, depending on the processing capability of the apparatus that executes the processes or as necessary.
In addition, changes can be made as appropriate without departing from the spirit of the present invention.

[0110] <Program and Recording Medium> The various processing functions of each device described in the above embodiments and modifications may be realized by a computer. In that case, the processing content of the functions that each device should have is described by a program, and by executing this program on a computer, the various processing functions of each device are realized on the computer.

[0111] The program describing this processing content can be recorded on a computer-readable recording medium. Any computer-readable recording medium may be used, for example a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.

[0112] The program is distributed, for example, by selling, transferring, or lending a portable recording medium, such as a DVD or a CD-ROM, on which the program is recorded. Alternatively, the program may be stored in a storage device of a server computer and distributed by transferring it from the server computer to another computer via a network.

[0113] A computer that executes such a program, for example, first temporarily stores the program recorded on the portable recording medium, or the program transferred from the server computer, in its own storage unit. At the time of executing the processes, the computer reads the program stored in its own storage unit and executes the processes according to the read program. As another form of executing the program, the computer may read the program directly from the portable recording medium and execute the processes according to the program. Furthermore, each time the program is transferred from the server computer to the computer, the computer may sequentially execute the processes according to the received program.
Alternatively, the above-described processes may be executed by a so-called ASP (Application Service Provider) type service, which realizes the processing functions only through execution instructions and result acquisition, without transferring the program from the server computer to the computer. Note that the program here includes information that is provided for processing by a computer and is equivalent to a program (such as data that is not a direct command to the computer but has a property that defines the processing of the computer).

[0114] In addition, although each device is described as being configured by executing a predetermined program on a computer, at least a part of the processing content may be realized as hardware.
