Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2017130899
Abstract: The present invention provides a sound field estimation apparatus, and a method and program therefor, in which the spatial region where sound field estimation is effective is larger than in the prior art. A sound field estimation apparatus 200 includes a plane wave decomposition unit 213 that estimates a vector consisting of the intensities of the plane waves forming the sound field, using the frequency-domain sound collection signals u(ω, m, j) of M spherical microphone arrays 1 and 2, each provided with microphones at predetermined positions on a sphere, and an interpolation estimation unit 216 that estimates the frequency-domain sound collection signal û(ω, r) at the position r of a virtual microphone, using the estimated value a(ω) of the vector of plane wave intensities and the position r. [Selected figure] Figure 2
Sound field estimation device, method and program thereof
[0001]
The present invention relates to a technique for estimating, from the sound collection signals of microphones arranged at certain positions, the sound collection signal that would be obtained if a microphone were arranged at another position.
[0002]
In recent years, audio reproduction technology has expanded from 2-channel stereo to 5.1-channel reproduction, and research and development on 22.2-channel reproduction and wave field synthesis methods has advanced further. The aim is both to greatly improve the realism of the reproduction itself and to enlarge the area over which highly realistic reproduction is obtained.
[0003]
In order to evaluate and verify such multi-channel audio reproduction methods, it is important to measure the reproduced sound field. For example, in the wave field synthesis method, it is necessary to compare the actually recorded sound field with the reproduced sound field and grasp the difference. This is because various factors, such as the signal processing that converts the recorded sound field into reproduction signals, the encoding and decoding of the recorded signals, and the acoustic characteristics of the room in which the reproduction apparatus is installed, affect the reproduction accuracy of the sound field, and it is therefore important to establish a method with high reproduction accuracy.
[0004]
(Conventional method 1) As a method of measuring a sound field, it is conceivable to locally arrange microphones over part of the target measurement area and to estimate the sound field of the surrounding area from the measurement result. As one example, spherical microphone arrays are under study. A spherical microphone array is a microphone array in which several tens or more of microphone elements are arranged on a spherical surface of radius r_a, where r_a ranges from several centimeters to several tens of centimeters.
[0005]
FIG. 1 shows the signal flow of sound field estimation processing using a spherical microphone array 1 in the prior art. The time-domain signals y(t, r_a, Ω_j) collected by the J microphones disposed on the spherical surface are converted by the short-time Fourier transform unit 111 into the frequency-domain signals u(i, ω, r_a, Ω_j). Here t is time, i is a frame index, J is an integer of 2 or more, ω is a temporal frequency, and j = 1, 2, ..., J. The subsequent processing is performed frame by frame, but i is omitted to simplify the notation. Ω_j is the position of the j-th microphone element on the spherical surface, specified by the pair of elevation angle θ_j and azimuth angle φ_j: Ω_j = (θ_j, φ_j).
[0006]
The spherical wave spectrum conversion unit 112 obtains the spherical wave spectrum u_{n,m}(ω, r_a) for each frequency ω by the following equation.
[0007]
[0008]
Here α_j is a weight set appropriately so that the product-sum of equation (1) satisfies the orthogonality of the spherical harmonics expressed by the following equation.
[0009]
[0010]
Y_n^m(θ_j, φ_j) is the spherical harmonic function of order n and degree m, and * denotes the complex conjugate. Here n = 0, 1, ..., N and m = −n, −n+1, ..., n. δ_{nn'} is 1 when n = n' and 0 when n ≠ n'; δ_{mm'} is 1 when m = m' and 0 when m ≠ m'. To obtain the spherical wave spectrum up to order N, (N+1)² or more microphone elements are required.
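The two equations referenced above do not survive legibly in this translation; a plausible reconstruction from the definitions just given (the weights α_j, the harmonics Y_n^m, and the deltas δ_{nn'}, δ_{mm'}) is the standard discrete spherical harmonic transform (1) and its orthogonality condition (2):

$$u_{n,m}(\omega, r_a) = \sum_{j=1}^{J} \alpha_j\, u(\omega, r_a, \Omega_j) \left[ Y_n^m(\theta_j, \varphi_j) \right]^{*} \qquad (1)$$

$$\sum_{j=1}^{J} \alpha_j\, Y_n^m(\theta_j, \varphi_j) \left[ Y_{n'}^{m'}(\theta_j, \varphi_j) \right]^{*} = \delta_{nn'}\, \delta_{mm'} \qquad (2)$$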
[0011]
Note that from this point onward we deal with measuring the sound field generated by sound sources located outside the measurement target range, that is, with the interior problem. In other words, the sound field generated by sound sources outside the sphere of the spherical microphone array is measured.
[0012]
The sound field is considered in the polar coordinate system (r, Ω) = (r, θ, φ) with the center of
the spherical microphone array as the origin.
[0013]
The extrapolation estimation unit 116 extrapolates the sound field from radius r_a to radius r at each frequency ω according to the following equation, and obtains the sound collection signal u(ω, r, Ω) at (r, Ω) = (r, θ, φ). In other words, the sound collection signals of the microphones disposed on the spherical microphone array are used to estimate the sound field outside the sphere of the spherical microphone array.
[0014]
[0015]
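Equation (3) itself is not legible here; given the ratio b_n(kr)/b_n(kr_a) and the product-sum with Y_n^m(θ, φ) described in [0022] below, it presumably has the standard extrapolation form:

$$u(\omega, r, \Omega) = \sum_{n=0}^{N} \sum_{m=-n}^{n} \frac{b_n(kr)}{b_n(k r_a)}\, u_{n,m}(\omega, r_a)\, Y_n^m(\theta, \varphi) \qquad (3)$$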
Here k is the wavenumber, k = ω/c (c is the speed of sound), and b_n(·) is the mode intensity function.
[0016]
Non-Patent Document 1 treats the case of an open-sphere spherical microphone array, in which the microphone elements are arranged on a hollow spherical surface. In this case, the mode intensity function is expressed by the following equation.
[0017]
[0018]
Here, i is the imaginary unit, and j_n(·) is the spherical Bessel function of order n. When the spherical microphone array is configured by arranging the microphone elements on the surface of a rigid sphere, the mode intensity function is expressed, following Non-Patent Document 2, by the following equation.
[0019]
[0020]
Here h_n(·) is the n-th order spherical Hankel function of the first kind, and A′ denotes the derivative of A.
[0021]
The short-time inverse Fourier transform unit 118 converts the spatially extrapolated sound collection signal u(ω, r, Ω) in the frequency domain into the signal y(t, r, Ω) in the time domain, and outputs it.
[0022]
In equation (3), b_n(kr)/b_n(kr_a) is applied to the spherical wave spectrum, and the product-sum with Y_n^m(θ, φ) is taken. The product-sum with Y_n^m(θ, φ) corresponds to the inverse spherical wave spectrum conversion. Accordingly, the spatially extrapolated sound collection signal u(ω, r, Ω) is a signal in the frequency domain.
[0023]
In measurement with an open-sphere microphone array, the influence of singular points cannot be avoided: measurement becomes impossible at combinations of k and r for which j_n(kr) = 0. Specifically, when j_n(kr) = 0 holds, the output becomes zero even if a sound field is present. A rigid-sphere microphone array, by contrast, has no such singular points, so there is no condition under which measurement becomes impossible. For this reason, using a rigid-sphere microphone array as the spherical microphone array is the mainstream.
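To make the singularity and decay arguments of [0023] and [0025] concrete, here is a minimal numerical sketch of the two mode intensity functions, assuming the 4πiⁿ convention used in the reconstructions above (the names b_open and b_rigid are ours, not the patent's):

```python
# Sketch of the prior-art mode intensity functions, assuming the 4*pi*i^n
# convention; the constant cancels in the ratio b_n(kr)/b_n(k*r_a) of eq. (3).
import numpy as np
from scipy.special import spherical_jn, spherical_yn

def spherical_hn(n, z, derivative=False):
    """Spherical Hankel function of the first kind, h_n = j_n + i*y_n."""
    return spherical_jn(n, z, derivative) + 1j * spherical_yn(n, z, derivative)

def b_open(n, kr):
    """Open sphere, eq. (4): the zeros of j_n(kr) are the singular points."""
    return 4 * np.pi * (1j ** n) * spherical_jn(n, kr)

def b_rigid(n, kr, kra):
    """Rigid sphere of radius r_a, eq. (5): no real zeros, hence no singularities."""
    scatter = spherical_jn(n, kra, derivative=True) / spherical_hn(n, kra, derivative=True)
    return 4 * np.pi * (1j ** n) * (spherical_jn(n, kr) - scatter * spherical_hn(n, kr))

# The extrapolation gain |b_n(kr)/b_n(k*r_a)| decays roughly like 1/(kr) once kr
# is large, illustrating why the effective region shrinks with r and with omega.
k, ra = 2 * np.pi * 1000 / 343.0, 0.04  # 1 kHz, speed of sound 343 m/s, r_a = 4 cm
for r in (ra, 2 * ra, 10 * ra):
    gain = abs(b_rigid(0, k * r, k * ra) / b_rigid(0, k * ra, k * ra))
    print(f"r = {r:.2f} m, gain = {gain:.3f}")
```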
[0024]
T. Abhayapala and D. Ward, "Theory and design of high order sound field microphones using spherical microphone array," in Acoustics, Speech, and Signal Processing (ICASSP), IEEE International Conference on, 2002, pp. II-1949. J. Meyer and G. Elko, "A highly scalable spherical microphone array based on an orthonormal decomposition of the soundfield," in Acoustics, Speech, and Signal Processing (ICASSP), IEEE International Conference on, 2002, pp. II-1781 to II-1784.
[0025]
In the prior art, the Bessel function j_n(kr), or the Bessel function j_n(kr) together with the Hankel function h_n(kr), is used when extrapolating the sound field from radius r_a to radius r. According to Reference 1, the global tendency of both functions as kr increases is to decay at a pace of 1/kr. (Reference 1) E. G. Williams, "Fourier Acoustics," Springer-Verlag, 2005, pp. 234-236. For example, when r becomes ten times r_a, the extrapolated estimate drops sharply, to about 1/10. The spatial region where extrapolation is effective is therefore limited to the vicinity of the surface of the spherical microphone array. For the same reason, raising the frequency ω raises k = ω/c, and the extrapolated estimate again drops sharply; in other words, as the frequency increases, the spatial region where extrapolation is effective narrows sharply.
[0026]
An object of the present invention is to provide a sound field estimation device, and a method and program therefor, in which the spatial region where sound field estimation is effective is larger than in the prior art.
[0027]
In order to solve the above problems, according to one aspect of the present invention, a sound field estimation apparatus comprises: a plane wave decomposition unit that, with m = 1, 2, ..., M as the index of a spherical microphone array, j_m = 1, 2, ..., J_m as the index of a microphone of the spherical microphone array m, r_m as the radius and θ_{j_m}, φ_{j_m} as the declination angles of polar coordinates, and ω as the index of the temporal frequency, estimates a vector consisting of the intensities of the plane waves composing a sound field, using the frequency-domain sound collection signals u(ω, m, j_m) of M spherical microphone arrays m each having a microphone at each position r̄(m, j_m) = d̄_m + [r_m sin θ_{j_m} cos φ_{j_m}, r_m sin θ_{j_m} sin φ_{j_m}, r_m cos θ_{j_m}]^T of declination angles θ_{j_m} and φ_{j_m} on a sphere of radius r_m centered at d̄_m; and an interpolation estimation unit that estimates the frequency-domain sound collection signal û(ω, r̄_p) at a virtual microphone position r̄_p, using the estimated value ā(ω) of the vector consisting of the plane wave intensities and the virtual microphone position r̄_p.
[0028]
In order to solve the above problems, according to another aspect of the present invention, a sound field estimation method comprises: a plane wave decomposition step in which, with m = 1, 2, ..., M as the index of a spherical microphone array, j_m = 1, 2, ..., J_m as the index of a microphone of the spherical microphone array m, r_m as the radius and θ_{j_m}, φ_{j_m} as the declination angles of polar coordinates, and ω as the index of the temporal frequency, a plane wave decomposition unit estimates a vector consisting of the intensities of the plane waves composing a sound field, using the frequency-domain sound collection signals u(ω, m, j_m) of M spherical microphone arrays m each having a microphone at each position r̄(m, j_m) = d̄_m + [r_m sin θ_{j_m} cos φ_{j_m}, r_m sin θ_{j_m} sin φ_{j_m}, r_m cos θ_{j_m}]^T on a sphere of radius r_m centered at d̄_m; and an interpolation estimation step in which an interpolation estimation unit estimates the frequency-domain sound collection signal û(ω, r̄_p) at a virtual microphone position r̄_p, using the estimated value ā(ω) of the vector consisting of the plane wave intensities and the virtual microphone position r̄_p.
[0029]
According to the present invention, the spatial region where sound field estimation is effective is advantageously larger than in the prior art.
[0030]
A functional block diagram of a sound field estimation apparatus according to the prior art.
A functional block diagram of the sound field estimation apparatus according to the first embodiment.
A diagram showing an example of the processing flow of the sound field estimation apparatus according to the first embodiment.
A diagram showing an outline of the positions of the virtual microphones in the first embodiment and its Modification 1.
A functional block diagram of the sound field estimation apparatus according to the second embodiment.
A diagram showing an example of the processing flow of the sound field estimation apparatus according to the second embodiment.
A functional block diagram of the sound field estimation apparatus according to the third embodiment.
A diagram showing an example of the processing flow of the sound field estimation apparatus according to the third embodiment.
[0031]
Hereinafter, embodiments of the present invention will be described. In the drawings used in the following description, constituent parts having the same function and steps performing the same processing are given the same reference numerals, and redundant description is omitted. In the following text, the symbols "^" and "¯" appearing in û, r̄, and the like would properly be written directly above the base character, and are attached to it here owing to the limitations of text notation; in formulas they are written in their proper positions. Moreover, unless otherwise noted, any processing defined per element of a vector or matrix is applied to all elements of that vector or matrix.
[0032]
<Points of First Embodiment> In the present embodiment, (1) a plurality of spherical microphone arrays are used instead of a single spherical microphone array, and (2) the collection of plane waves composing the sound field is found directly from the frequency-domain sound collection signals instead of from the spherical wave spectrum, and the sound field is estimated using these plane waves. By (1) and (2), the spatial region in which sound field estimation is effective can be greatly expanded. The method is described below.
[0033]
<Sound Field Estimation Device 200 According to First Embodiment> FIG. 2 shows a functional
block diagram of the sound field estimation device 200 according to the first embodiment, and
FIG. 3 shows its processing flow.
[0034]
The sound field estimation apparatus 200 includes a short time Fourier transform unit 211, a
plane wave decomposition unit 213, an interpolation estimation unit 216, and a short time
inverse Fourier transform unit 218.
[0035]
The sound field estimation apparatus 200 receives the time-domain sound collection signals y(t, 1, j) (where j = 1, 2, ..., J) from the spherical microphone array 1 and the time-domain sound collection signals y(t, 2, j) (where j = 1, 2, ..., J) from the spherical microphone array 2, receives the position information r̄ of the virtual microphone, and outputs the time-domain sound collection signal y(t, r̄) at the position r̄ of the virtual microphone.
[0036]
In this embodiment, it is assumed that the number of microphones on the spherical surface, the arrangement of the microphones, and the radius are the same for the spherical microphone arrays 1 and 2. The spherical microphone arrays 1 and 2 are open-sphere spherical microphone arrays: J microphones are arranged on a spherical surface of radius r_a, and the microphone arrangement on the spherical surface is specified by the pairs of elevation angle and azimuth angle (θ_j, φ_j). Let the center position of the spherical microphone array 1 be d̄_1 = [x_1, y_1, z_1] and the center position of the spherical microphone array 2 be d̄_2 = [x_2, y_2, z_2].
The three-dimensional position of the j-th microphone on the spherical microphone array m (m = 1, 2) is given by the following equation.
[0037]
[0038]
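In display form (this reconstruction simply restates the position formula spelled out in [0027], with radius r_a for the two arrays of this embodiment):

$$\bar r(m, j) = \bar d_m + \begin{bmatrix} r_a \sin\theta_j \cos\varphi_j \\ r_a \sin\theta_j \sin\varphi_j \\ r_a \cos\theta_j \end{bmatrix} \qquad (6)$$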
Let the sound collection signal of this microphone be the time-domain sound collection signal y(t, m, j) (where m = 1, 2 and j = 1, 2, ..., J).
[0039]
<Short-Time Fourier Transform Unit 211> The short-time Fourier transform unit 211 receives the time-domain sound collection signals y(t, m, j) (where m = 1, 2 and j = 1, 2, ..., J) and converts them by the short-time Fourier transform into the frequency-domain sound collection signals u(i, ω, m, j) (where i is the frame number, ω = 1, 2, ..., F, and j = 1, 2, ..., J) (S211), which it outputs. The subsequent processing is performed for each frame i, but the frame number i is omitted to simplify the description. Any method of converting a time-domain signal into a frequency-domain signal may be used in place of the short-time Fourier transform.
[0040]
<Plane Wave Decomposition Unit 213> The plane wave decomposition unit 213 receives the frequency-domain sound collection signals u(ω, m, j) (where ω = 1, 2, ..., F, m = 1, 2, and j = 1, 2, ..., J), uses these values to estimate the vector consisting of the intensities of the plane waves composing the sound field (S213), and outputs the estimated value ā(ω) (where ω = 1, 2, ..., F). For example, in order to obtain the collection of plane waves composing the sound field, the plane wave decomposition unit 213 finds the solution vector (estimated value) ā(ω) that minimizes the following cost function J.
[0041]
[0042]
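Equation (7) is not legible in this translation; given the regularization constant λ described next and the L1-norm term referenced in [0049], it is presumably the complex LASSO cost, with ū(ω) stacking the signals u(ω, m, j) of all microphones:

$$J = \bigl\lVert \bar u(\omega) - D(\omega)\, \bar a(\omega) \bigr\rVert_2^2 + \lambda\, \bigl\lVert \bar a(\omega) \bigr\rVert_1 \qquad (7)$$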
Here λ is a regularization constant; the larger λ is, the more robust the estimation is against noise on u(ω, m, j). D(ω) in equation (7) is called the dictionary matrix,
[0043]
[0044]
and its l-th column vector D(ω, l) is the vector consisting of the observation values at the spherical microphone arrays 1 and 2 when a plane wave of amplitude 1 is incident from the direction specified by the pair of elevation angle and azimuth angle (θ_l, φ_l), with the phase at the origin equal to 0. The L′ incidence angles (θ_l, φ_l) are set, for example, so that the L′ plane waves arrive uniformly from all directions; for instance, they are set so that the plane waves are incident from the directions of the vertices of a regular polyhedron.
[0045]
The l-th column vector D(ω, l) of the dictionary matrix D(ω) is represented by the following equations, using the observation value of the j-th microphone of the spherical microphone array m when a plane wave of incidence angle (θ_l, φ_l) arrives.
[0046]
[0047]
[0048]
The l-th entry of the solution vector ā(ω) corresponds to the amplitude of the l-th plane wave: ā(ω) = [a_1(ω), a_2(ω), ..., a_l(ω), ..., a_{L′}(ω)]^T.
[0049]
Convex optimization with the above L1-norm regularization term yields, as the solution vector ā(ω), a sparse vector containing many zeros. Therefore, as shown in Reference 2, plane waves can be extracted well even in the redundant case where the number L′ of plane waves assumed in advance greatly exceeds the number of microphones. (Reference 2) A. Wabnitz, N. Epain, A. van Schaik, C. Jin, "Reconstruction of spatial sound fields using compressed sensing," in Acoustics, Speech, and Signal Processing (ICASSP), IEEE International Conference on, 2011.
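As a concrete illustration of this plane wave decomposition step, here is a minimal sketch that builds an open-sphere dictionary matrix and minimizes the cost of equation (7) by iterative soft-thresholding (ISTA); the solver choice, the e^{i k̄_l · r̄} sign convention, and all function names are our assumptions, since the patent does not prescribe them:

```python
# Sketch of the plane wave decomposition: minimize ||u - D a||_2^2 + lam*||a||_1
# over complex amplitudes a, for an open-sphere observation model (eqs. (9)-(10)).
import numpy as np

def dictionary(omega, c, mic_pos, angles):
    """mic_pos: (n_mics, 3) positions stacked over all arrays; angles: (L, 2)
    rows (theta_l, phi_l). Column l is the assumed plane wave model e^{i k_l . r}."""
    k = omega / c
    kvec = k * np.stack([np.sin(angles[:, 0]) * np.cos(angles[:, 1]),
                         np.sin(angles[:, 0]) * np.sin(angles[:, 1]),
                         np.cos(angles[:, 0])], axis=1)          # (L, 3), cf. eq. (12)
    return np.exp(1j * mic_pos @ kvec.T)                         # (n_mics, L)

def ista(D, u, lam, n_iter=500):
    """Iterative soft-thresholding for the complex LASSO of eq. (7)."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2                       # 1 / Lipschitz const.
    a = np.zeros(D.shape[1], dtype=complex)
    for _ in range(n_iter):
        g = a - step * (D.conj().T @ (D @ a - u))                # gradient on L2 term
        mag = np.abs(g)
        shrink = np.maximum(mag - step * lam, 0.0)               # soft threshold
        a = np.where(mag > 0, g * shrink / np.maximum(mag, 1e-12), 0.0)
    return a                                                     # sparse a(omega)
```

With many assumed directions L′, the L1 term drives most entries of the returned vector to zero, which is exactly the sparsity property the paragraph above attributes to the convex optimization.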
[0050]
<Interpolation Estimation Unit 216> The interpolation estimation unit 216 receives the estimated value ā(ω) and the position information r̄ = (r_x, r_y, r_z) of the virtual microphone, and estimates the frequency-domain sound collection signal û(ω, r̄) (where ω = 1, 2, ..., F) at the virtual microphone position r̄ by the following equation (S216); in other words, it estimates the output value û(ω, r̄) of the virtual microphone from the plane wave model whose parameter is the solution vector ā(ω). It then outputs û(ω, r̄).
[0051]
[0052]
Here "·" denotes the inner product, and k̄_l is the wave number vector corresponding to the incidence direction of the l-th plane wave, expressed by the following equation.
[0053]
[0054]
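Equations (11) and (12), reconstructed from the definitions above (the e^{i k̄_l · r̄} sign convention is assumed):

$$\hat u(\omega, \bar r) = \sum_{l=1}^{L'} a_l(\omega)\, e^{\,i\, \bar k_l \cdot \bar r} \qquad (11)$$

$$\bar k_l = \frac{\omega}{c} \begin{bmatrix} \sin\theta_l \cos\varphi_l \\ \sin\theta_l \sin\varphi_l \\ \cos\theta_l \end{bmatrix} \qquad (12)$$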
^T denotes transposition. The position information r̄ of the virtual microphone is input, for example, by the user of the sound field estimation apparatus 200.
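A corresponding sketch of the interpolation step, under the same assumed convention as the decomposition sketch above (interpolate is our name, not the patent's):

```python
# Sketch of eq. (11): the virtual microphone signal is the sum of the recovered
# plane waves, each evaluated at the virtual position r with its wave vector k_l.
import numpy as np

def interpolate(a, omega, c, angles, r):
    """a: (L,) complex plane wave amplitudes; angles: (L, 2) rows (theta_l, phi_l);
    r: (3,) virtual microphone position. Returns u^(omega, r)."""
    k = omega / c
    kvec = k * np.stack([np.sin(angles[:, 0]) * np.cos(angles[:, 1]),
                         np.sin(angles[:, 0]) * np.sin(angles[:, 1]),
                         np.cos(angles[:, 0])], axis=1)          # eq. (12)
    return np.sum(a * np.exp(1j * (kvec @ r)))                   # eq. (11)
```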
[0055]
<Short-Time Inverse Fourier Transform Unit 218> The short-time inverse Fourier transform unit 218 receives the frequency-domain sound collection signal û(ω, r̄) (where ω = 1, 2, ..., F), converts it into the time-domain sound collection signal y(t, r̄) by the inverse short-time Fourier transform (S218), and outputs it. As the conversion from the frequency domain to the time domain, a method corresponding to the conversion method used in the short-time Fourier transform unit 211 may be used.
[0056]
<Effects> With the above configuration, a sound field estimation apparatus can be realized in which the spatial region where sound field estimation is effective is larger than in the prior art.
[0057]
<Modification 1> In the first embodiment, one virtual microphone is assumed and the signal picked up at its position is estimated. Naturally, however, virtual microphones may be assumed at a plurality of positions. Furthermore, by arranging the virtual microphones on a common spherical surface, an open-sphere virtual microphone array of radius r can be configured. For example, when the center of the virtual microphone array is at position D = [d_x, d_y, d_z] from the origin and the virtual microphone array comprises P virtual microphones, the sound field estimation apparatus 200 receives the P pieces of position information r̄_p (where p = 1, 2, ..., P) of the virtual microphones and outputs the sound collection signals y(t, r̄_p) (where p = 1, 2, ..., P).
[0058]
Letting the position of the p-th microphone on the spherical surface of the virtual microphone array be r̄_p (where p = 1, 2, ..., P), the interpolation estimation unit 216 estimates the frequency-domain sound collection signal û(ω, r̄_p) by the following equation.
[0059]
[0060]
FIG. 4 shows an outline of the positions of the virtual microphones in the first embodiment and in this Modification 1. The first embodiment corresponds to the case where the center [d_x, d_y, d_z] of the virtual microphone array is [r_x, r_y, r_z], the radius is r = 0, and the number P of virtual microphones in the virtual microphone array is 1, so the first embodiment can be said to be an example of Modification 1 (strictly speaking, in the present modification the microphones are placed on a spherical surface of radius r, whereas in the first embodiment the radius is r = 0, that is, the microphone is placed at a point rather than on a spherical surface).
[0061]
<Modification 2> In the first embodiment, the case of installing two open-sphere microphone arrays in a sound field was described. In this modification, the case of installing two rigid-sphere microphone arrays in a sound field is described.
[0062]
The rigid-sphere microphone array has radius r_a and J microphones, and the microphone arrangement on the spherical surface is specified by the pairs of elevation angle and azimuth angle (θ_j, φ_j). When a plane wave of amplitude 1 is incident from the direction specified by the pair of elevation angle and azimuth angle (θ_l, φ_l), the sound field consists of the incident wave and the scattered wave.
[0063]
When the center of the rigid-sphere microphone array coincides with the origin of the coordinate system, the signal observed by the j-th microphone is given by the following equation.
[0064]
[0065]
When the center of the m-th (m = 1, 2) rigid-sphere microphone array is displaced from the origin by d̄_m, the signal observed by the j-th microphone, taking the phase difference into account, becomes the following.
[0066]
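Equations (15) and (16) do not survive in this translation. The textbook rigid-sphere scattering series, which matches the definitions of j_n, h_n, and the derivative given in [0018]-[0020] and is assumed here to match the patent's equation, gives for a unit plane wave from direction (θ_l, φ_l) observed at microphone direction Ω_j (with Θ_{lj} the angle between the two):

$$v^{\mathrm{rigid}}(\omega, l, m, j) = \sum_{n=0}^{\infty} (2n+1)\, i^{\,n} \left( j_n(k r_a) - \frac{j_n'(k r_a)}{h_n'(k r_a)}\, h_n(k r_a) \right) P_n(\cos \Theta_{lj}) \qquad (15)$$

and, for an array centered at d̄_m, the observed signal presumably acquires the additional phase factor of the plane wave at the array center:

$$v^{\mathrm{rigid}}(\omega, l, m, j)\; e^{\,i\, \bar k_l \cdot \bar d_m} \qquad (16)$$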
[0067]
Therefore, if the l-th column vector D(ω, l) of the dictionary matrix D(ω) of equation (10) is generated using equation (16) instead of equation (9), then by solving the same optimization problem as before, plane waves can be extracted from the output signals of the rigid-sphere microphone arrays.
[0068]
Since equation (15) includes an infinite number of terms, in practice a finite order n is used and v^rigid(ω, l, m, j) is obtained by numerical calculation. When r_a = 4 cm, n of about 10 is sufficient.
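A minimal numerical sketch of this truncation, using the series reconstructed above (the series form and all names are our assumptions; n_max = 10 follows the r_a = 4 cm guidance):

```python
# Sketch of the truncated rigid-sphere observation model of eq. (15).
import numpy as np
from scipy.special import spherical_jn, spherical_yn, eval_legendre

def sph_hn(n, z, derivative=False):
    """Spherical Hankel function of the first kind."""
    return spherical_jn(n, z, derivative) + 1j * spherical_yn(n, z, derivative)

def v_rigid(omega, c, ra, mic_dir, inc_dir, n_max=10):
    """mic_dir, inc_dir: unit vectors of the microphone and incidence directions.
    n_max ~ 10 is adequate for r_a = 4 cm, per the text above."""
    kra = (omega / c) * ra
    cos_gamma = float(np.dot(mic_dir, inc_dir))
    total = 0j
    for n in range(n_max + 1):
        radial = spherical_jn(n, kra) - (
            spherical_jn(n, kra, derivative=True) / sph_hn(n, kra, derivative=True)
        ) * sph_hn(n, kra)
        total += (2 * n + 1) * (1j ** n) * radial * eval_legendre(n, cos_gamma)
    return total

# An array centered at d_m adds the phase factor of eq. (16), e.g.
# v_rigid(...) * np.exp(1j * np.dot(k_l, d_m)) with k_l from eq. (12).
```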
[0069]
<Other Modifications> In this embodiment, the arrangement of the microphones in the microphone arrays 1 and 2 and the radius of the spherical microphone arrays 1 and 2 are the same, but they may differ. Furthermore, the number of microphone arrays is not limited to two and may be any plural number.
For example, let M be any integer of 2 or more, let d̄_m be the center and r_m the radius of the spherical microphone array m (m = 1, 2, ..., M), and let the three-dimensional position of the j_m-th microphone on the spherical microphone array m be given by the following equation.
[0070]
[0071]
Let the sound collection signal of this microphone be the time-domain sound collection signal y(t, m, j_m) (j_m = 1, 2, ..., J_m, where J_m is the number of microphones included in the spherical microphone array m). Equations (7) and (8) are replaced by the following equations.
[0072]
[0073]
The l-th column vector D(ω, l) of the dictionary matrix D(ω) is represented by the following equation, using the observation value of the j_m-th microphone of the spherical microphone array m when a plane wave of incidence angle (θ_l, φ_l) arrives.
[0074]
[0075]
[0076]
Moreover, equations (15) and (16) of Modification 2 of the first embodiment are replaced by the following equations.
[0077]
[0078]
Second Embodiment: The description focuses on the parts that differ from Modification 1 of the first embodiment.
[0079]
In Modification 1 of the first embodiment, an open-sphere microphone array is assumed virtually and its sound collection signals are estimated. In the second embodiment, on the basis of the configuration of Modification 1 of the first embodiment, a rigid-sphere microphone array is assumed virtually instead of the open-sphere microphone array, and its sound collection signals are estimated.
[0080]
FIG. 5 shows a functional block diagram of the sound field estimation apparatus 300 according
to the second embodiment, and FIG. 6 shows its processing flow.
[0081]
The sound field estimation apparatus 300 includes a short-time Fourier transform unit 211, a plane wave decomposition unit 213, an interpolation estimation unit 216, and a short-time inverse Fourier transform unit 218, and further includes an array type conversion unit 317.
[0082]
First, as the virtual spherical microphone array, it is assumed that sound is collected by the dual open-sphere microphone array of Reference 3. In this microphone array, the microphone elements are disposed on a spherical surface of radius r or on a spherical surface of radius αr, and α = 1.2 is recommended.
(Reference 3) I. Balmages, B. Rafaely, "Open-Sphere Designs for Spherical Microphone Arrays," IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 2, pp. 727-732, 2007. For example, assume Q = P × 2 virtual microphone elements, of which P are at the same positions as in Modification 1. That is, the center of the virtual microphone array is at position D = [d_x, d_y, d_z] from the origin, and the position r̄_p of the p-th virtual microphone on the spherical surface of the virtual microphone array is given by the following equation.
[0083]
[0084]
The remaining P virtual microphones among the Q virtual microphone elements are arranged on a sphere with center [d_x, d_y, d_z] and radius αr, and the position of the q-th virtual microphone is given by the following equation.
[0085]
[0086]
Further, it is assumed that Ω_q = (θ_q, φ_q) = Ω_p = (θ_p, φ_p).
That is, the q-th (q = P + p) microphone and the p-th microphone lie in the same direction from the center of the virtual microphone array, with the p-th microphone on the spherical surface of radius r and the q-th microphone on the spherical surface of radius αr.
[0087]
The interpolation estimation unit 216 receives the estimated value ā(ω), the center D of the virtual microphone array, the P pieces of position information r̄_p (where p = 1, 2, ..., P) of the virtual microphones, and the P pieces of position information r̄_q (where q = P+1, P+2, ..., Q), estimates the frequency-domain sound collection signals û(ω, r̄_p) (where p = 1, 2, ..., P) and û(ω, r̄_q) (where q = P+1, P+2, ..., Q) at the virtual microphone positions r̄_p and r̄_q (S216), and outputs them. Instead of the P pieces of position information r̄_q (where q = P+1, P+2, ..., Q), only α may be received, and the P pieces of position information r̄_q may then be calculated from the P pieces of position information r̄_p and α.
[0088]
<Array Type Conversion Unit 317> The array type conversion unit 317 receives the frequency-domain sound collection signals û(ω, r̄_p) (where p = 1, 2, ..., P) and û(ω, r̄_q) (where q = P+1, P+2, ..., Q), and converts them into the spherical wave spectra u_{n,m}(ω, r) and u_{n,m}(ω, αr).
[0089]
[0090]
In an open-sphere spherical microphone array, measurement becomes impossible at k and r for which j_n(kr) = 0 because of the singular points. The dual open-sphere spherical microphone array, however, can avoid the influence of the singular points by selecting whichever of u_{n,m}(ω, r) and u_{n,m}(ω, αr) has the larger absolute value.
[0091]
Therefore, the array type conversion unit 317 sets
[0092]
[0093]
when |u_{n,m}(ω, r)| > |u_{n,m}(ω, αr)|, and
[0094]
[0095]
when |u_{n,m}(ω, r)| ≤ |u_{n,m}(ω, αr)|,
[0096]
[0097]
thereby determining the spherical wave spectrum v_{n,m}(ω, r).
[0098]
The array type conversion unit 317 finally applies the inverse spherical wave spectrum conversion of the following equation.
[0099]
[0100]
As a result, the sound collection signal that would be obtained if a rigid-sphere microphone array of radius r were installed at the position of the virtually installed dual open-sphere spherical microphone array can be obtained in the frequency domain. The array type conversion unit 317 outputs the frequency-domain signals v(ω, r̄_p) (where p = 1, 2, ..., P) to the short-time inverse Fourier transform unit 218.
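As an illustration of this conversion, the sketch below selects, for each order n, the spectrum measured on whichever virtual radius avoids the zeros of j_n, then rescales it by the ratio of rigid-sphere to open-sphere mode intensity; the exact form of the patent's equations is not legible here, so this rescaling is our assumption:

```python
# Sketch of the array type conversion: dodge the zeros of j_n(kr) by choosing the
# larger-magnitude spectrum of the two virtual radii, then map it to the spectrum
# a rigid sphere of radius r would observe (rescaling convention assumed).
import numpy as np
from scipy.special import spherical_jn, spherical_yn

def sph_hn(n, z, derivative=False):
    return spherical_jn(n, z, derivative) + 1j * spherical_yn(n, z, derivative)

def b_open(n, kr):
    return 4 * np.pi * (1j ** n) * spherical_jn(n, kr)

def b_rigid(n, kr):
    scatter = spherical_jn(n, kr, derivative=True) / sph_hn(n, kr, derivative=True)
    return 4 * np.pi * (1j ** n) * (spherical_jn(n, kr) - scatter * sph_hn(n, kr))

def convert(u_r, u_ar, n, k, r, alpha=1.2):
    """u_r, u_ar: spherical wave spectra u_{n,m} measured on radii r and alpha*r.
    Returns the rigid-sphere spectrum v_{n,m}(omega, r)."""
    if abs(u_r) > abs(u_ar):
        return b_rigid(n, k * r) / b_open(n, k * r) * u_r
    return b_rigid(n, k * r) / b_open(n, k * alpha * r) * u_ar
```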
[0101]
<Effects> With such a configuration, the same effects as those of Modification 1 of the first embodiment can be obtained. Furthermore, the sound collection signals obtained when a rigid-sphere microphone array is installed can be obtained virtually. The present embodiment may be combined with Modification 2 of the first embodiment.
[0102]
Third Embodiment: The application of a rigid-sphere microphone array to virtual reality is shown in Reference 4. (Reference 4) R. Duraiswami, D. N. Zotkin, Z. Li, E. Grassi, N. A. Gumerov, L. S. Davis, "High Order Spatial Audio Capture and its Binaural Head-Tracked Playback over Headphones with HRTF Cues," Proceedings of the 119th Convention of the AES, 2005. Reference 4 shows a method that takes as input the sound collection signals of a fixed rigid-sphere microphone array and the direction of a virtual head, and outputs the signals that would be heard at the right ear and the left ear (binaural signals) when the head faces the designated direction. Since the spherical microphone array picks up sound from all directions, binaural signals corresponding to any designated direction can be generated without moving the microphone elements or the microphone array. That is, if the rotation of the listener's head is measured and input in real time, binaural signals that follow the rotational movement can be generated and presented to the listener.
[0103]
In the second embodiment, a method of obtaining the sound collection signals of a virtually installed rigid-sphere microphone array was shown. The configuration of the present embodiment, shown in FIG. 7, combines these sound collection signals with the binaural signal generation method.
[0104]
The description focuses on the parts that differ from the second embodiment.
[0105]
FIG. 7 shows a functional block diagram of a sound field estimation apparatus 400 according to
the third embodiment, and FIG. 8 shows its process flow.
[0106]
The sound field estimation apparatus 400 includes a short-time Fourier transform unit 211, a plane wave decomposition unit 213, an interpolation estimation unit 216, an array type conversion unit 317, and a short-time inverse Fourier transform unit 218, and further includes a binaural signal generation unit 419.
[0107]
<Binaural Signal Generation Unit 419> The binaural signal generation unit 419 receives the virtual head direction (posture) and the time-domain sound collection signals y(t, r̄_p) (where p = 1, 2, ..., P; these correspond to the sound collection signals of a rigid-sphere spherical microphone array), generates binaural signals y(t, R) and y(t, L), for example by the method described in Reference 4 (S419), and outputs them as the output values of the sound field estimation apparatus 400. Note that the position of the virtual head corresponds to the center D = [d_x, d_y, d_z] of the virtual microphone array, and the time-domain sound collection signals y(t, r̄_p) correspond to the sound collection signals of a rigid-sphere spherical microphone array at the position of the virtual head. Therefore, the binaural signal generation unit 419 can generate, from the virtual head direction (posture) and the time-domain sound collection signals y(t, r̄_p), the binaural signals y(t, R) and y(t, L) at the virtual head position and direction.
[0108]
The method of Reference 4 can follow only the rotational movement of the head and cannot cope with translational movement. In the configuration of the present embodiment, however, the rigid-sphere spherical microphone array can be translated virtually. Consequently, the present embodiment makes it possible to generate binaural signals that follow both the rotational and the translational movements of the head.
[0109]
<Other Modifications> The present invention is not limited to the above embodiments and modifications. For example, the various kinds of processing described above may be executed not only in chronological order according to the description, but also in parallel or individually, depending on the processing capability of the apparatus that executes the processing or as needed. Furthermore, changes may be made as appropriate without departing from the spirit of the present invention.
[0110]
<Program and Recording Medium> The various processing functions of each device described in the above embodiments and modifications may be realized by a computer. In that case, the processing contents of the functions that each device should have are described by a program, and by executing this program on a computer, the various processing functions of each device are realized on the computer.
[0111]
The program describing the processing contents can be recorded on a computer-readable recording medium. The computer-readable recording medium may be any medium such as a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.
[0112]
This program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded. Alternatively, the program may be stored in the storage device of a server computer and distributed by being transferred from the server computer to another computer via a network.
[0113]
A computer that executes such a program first stores, for example, the program recorded on a portable recording medium or the program transferred from the server computer temporarily in its own storage unit. Then, at the time of executing the processing, the computer reads the program stored in its own storage unit and executes processing according to the read program. As another form of executing the program, the computer may read the program directly from the portable recording medium and execute processing according to it; furthermore, each time the program is transferred to this computer from the server computer, processing according to the received program may be executed sequentially. The above-described processing may also be executed by a so-called ASP (Application Service Provider) type service, which realizes the processing functions only through execution instructions and acquisition of results, without transferring the program from the server computer to this computer. Note that the program here includes information that is provided for processing by an electronic computer and is equivalent to a program (such as data that is not a direct command to the computer but has the property of defining the processing of the computer).
[0114]
In addition, although each device has been described as being configured by executing a predetermined program on a computer, at least part of the processing contents may be realized by hardware.