JP2012151530

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2012151530
[PROBLEMS] To simultaneously reproduce and provide binaural sound to two listeners. The system
includes a correction unit (10) that performs adaptive correction on a left recorded signal and a
right recorded signal recorded binaurally; a reproduction sound processing unit (20) that combines
the corrected signals to generate left and right output sound signals for reproduction; a left
speaker (3) and a right speaker (4) that output the duplicated left output sound signal; and a
middle speaker (9), installed at the midpoint between the left and right speakers, that outputs
the right output sound signal. The left and right output sound signals are reproduced for one
listener (7) and another listener (8) located at points symmetrical with respect to the middle speaker.
[Selected figure] Figure 1
Binaural speech reproduction system, binaural speech reproduction method
[0001]
The present invention relates to a transaural reproduction system for reproducing binaurally
recorded audio.
[0002]
A binaural recording and reproducing system is known in which microphones are attached to the
pinnae of a recording person (or of a dummy head simulating a human head) and the acoustic signals
surrounding that person (or dummy head) are recorded in advance. When the recorded sound signals
are played back to a listener through a pair of stereo speakers, filter correction cancels
crosstalk so that the signal from the left speaker reaches only the listener's left ear and the
signal from the right speaker reaches only the listener's right ear. The listener is thereby
provided with three-dimensional sound including
10-05-2019
1
a sound field at the time of recording.
[0003]
In the filter correction applied to the recorded audio signals in the above-described binaural
recording and reproducing system, the sound signals collected by the microphones attached to both
of the listener's ears are compared with the reproduced audio signals sent to the left and right
speakers. Based on this comparison, a filter function is obtained as the inverse transfer function
of the space transfer function determined by the positional relationship between the left and
right speakers and the left and right microphones, and the left and right recorded audio signals
are corrected by convolving them with this filter function (crosstalk cancellation filter).
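The relationship in this paragraph can be written compactly. The notation below is assumed for illustration and does not appear in the source: X is the vector of left and right recorded signals, D the signals driving the two speakers, Y the signals at the two ear microphones, C the 2x2 matrix of space transfer functions, and H the 2x2 matrix of filter functions, all taken at a single frequency.

```latex
\begin{aligned}
Y &= C\,D, \qquad D = H\,X \\
H &= C^{-1} \;\Longrightarrow\; Y = C\,C^{-1}\,X = X
\end{aligned}
```

Under these assumptions each ear receives only its own recorded channel, provided C is invertible at the frequency in question.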
[0004]
As a result, the listener can listen to three-dimensional sound without using headphones.
Furthermore, even if the positional relationship between the listener and the speakers changes,
the filter function is updated to follow the variation of the space transfer function, so the
left and right recorded speech signals are corrected in accordance with that variation.
[0005]
However, in the above-described binaural recording and reproducing system, three-dimensional sound
can be heard only at the listener's ear positions (that is, the positions of the left and right
microphones) for which the transfer function from the speakers has been identified. It is
therefore impossible, for example, to provide stereophonic sound to a plurality of listeners
simultaneously.
[0006]
On the other hand, one conceivable related-art configuration for simultaneously providing
three-dimensional sound to a plurality of listeners is shown in FIG. 8. Two speakers for the voice
reproduction of the above-described binaural recording and reproducing system (L1, R1 and L2, R2)
are placed in front of each of two listeners (A, B) seated in a row, four speakers in total, and a
partition is installed between listener A (speaker R1) and listener B (speaker L2) to block
acoustic signals between them (referred to as "related technology a").
The partition suppresses the sound output from the speakers L2 and R2 from being collected by the
microphones worn by the listener A.
[0007]
When the above-mentioned related technology is used in, for example, an arcade game for two-person
simultaneous play, the partition is desirably made of a transparent material such as acrylic board
so that it does not impair the sense of sharing the three-dimensional sound between the listeners
A and B, who are the players.
[0008]
In the above related art, when the space transfer function for the positional relationship between
the speakers and both ears is determined by system identification, mutually uncorrelated signals
are supplied to the speaker L1 and the speaker R1, and the four filter functions corresponding to
the crosstalk cancellation filter on the listener A side are calculated directly and
simultaneously.
Furthermore, at the time of binaural signal reproduction in the above method, as shown in FIG. 9,
the output of the crosstalk cancellation filter on the listener A side is also supplied in
parallel to the listener B side, so that stereophonic sound can be provided to both ears of both
listeners A and B simultaneously.
[0009]
However, the above related art requires a pair of speakers (L, R) for each listener and a
soundproof partition installed between the listeners, so the system configuration becomes
complicated, and it also has the disadvantage that the cost associated with system operation
increases.
[0010]
In addition, as another configuration for simultaneously providing three-dimensional sound to a
plurality of listeners, as shown in FIG. 10 and FIG. 11, a system configuration may be considered
in which a pair of left and right speakers is provided behind each of two listeners (A, B) seated
side by side, at a position close to that listener's ears (referred to as "related technology b").
In this case, since the sound output from each pair of speakers is assumed to reach only the
corresponding listener, it is not necessary to install partitions or the like between the listeners.
[0011]
However, related art b, like related art a, requires a pair of speakers (L, R) for each listener.
Furthermore, if the sound from the speakers (L2, R2) installed for the listener B exceeds a
certain level, it is picked up by the microphones worn by the listener A, and appropriate system
identification may not be possible.
[0012]
Further, as related art, a voice reproduction technology is disclosed in which recorded voice is
reproduced from left and right stereo loudspeakers, and the crosstalk component arising from the
positional relationship between the stereo loudspeakers and the left and right ears of the
listener facing them is canceled from the voice to be reproduced (Patent Document 1).
[0013]
In addition, as a related technology, a system is disclosed in which a position measurement device
is installed between a listener wearing microphones and a speaker serving as a sound source, the
relative position between the sound source and the microphones is calculated by this position
measurement device, and transaural correction of the reproduced sound is performed on that basis
(Non-Patent Document 1).
[0014]
JP 2000-278800 A
[0015]
Localization of sound image using 3D sensor and transaural processing (TVRSJ Vol.5 No.3, 2000)
[0016]
In the related art described in Patent Document 1 and Non-Patent Document 1, it is possible to
provide a single listener with three-dimensional sound that faithfully reproduces the recorded
sound field by audio output from speakers.
However, as described above, there is the disadvantage that stereophonic sound cannot be
provided to a plurality of listeners simultaneously.
[0017]
[Object of the Invention] The present invention remedies the disadvantages of the above-mentioned
related art, and its object is to provide a binaural sound reproduction system and a binaural
sound reproduction method capable of simultaneously providing two listeners with three-dimensional
sound that faithfully reproduces the recorded sound field.
[0018]
In order to achieve the above object, a binaural voice reproduction system according to the
present invention comprises: a correction unit that performs adaptive correction on a left
recorded signal and a right recorded signal recorded binaurally; a reproduction audio processing
unit that combines the corrected signals to generate left and right output audio signals for
reproduction; left and right speakers that each output one of the left and right output audio
signals; and a middle speaker that is disposed at the midpoint between the left and right speakers
and outputs the other of the left and right output audio signals. The system reproduces the left
and right output audio signals for one listener located at a point facing the left and middle
speakers and for another listener located at a point facing the right and middle speakers, at a
position symmetrical to the one listener with respect to the middle speaker. The correction unit
includes first and second correction filters that perform filter correction on the duplicated left
recorded signal based on preset first and second filter functions, respectively, and third and
fourth correction filters that perform filter correction on the duplicated right recorded signal
based on preset third and fourth filter functions, respectively. The system further comprises left
and right microphones that are fitted in the left and right ear canal regions of the one listener
and pick up the output left and right audio signals, and a filter function deriving unit that
simultaneously derives the first to fourth filter functions based on the space transfer
characteristics from the left, right, and middle speakers to the left and right microphones. The
system is characterized in that the positional relationship between the one listener and the left
speaker and the positional relationship between the other listener and the right speaker are
symmetrical with respect to the position of the middle speaker.
[0019]
The binaural voice reproduction method according to the present invention is used in a binaural
voice reproduction system having: a correction unit that performs filter correction on a left
recorded signal and a right recorded signal recorded binaurally; a reproduction audio processing
unit that combines the filter-corrected signals to generate left and right output audio signals
for reproduction; left and right speakers that each output one of the left and right output audio
signals; and a middle speaker that is disposed at the midpoint between the left and right speakers
and outputs the other of the left and right output audio signals. The system reproduces the left
and right output audio signals for one listener located facing the left speaker and the middle
speaker and for another listener located facing the right speaker and the middle speaker, the
positions of the two listeners being symmetrical with respect to the middle speaker. In the
method, left and right microphones fitted at the left and right ears of the one listener pick up
the output sound signals output from the left, right, and middle speakers. A filter function
deriving unit, having acquired the left and right collected sound signals and the left and right
output audio signals, simultaneously derives, based on a comparison between the collected sound
signals and the output audio signals, the first to fourth filter functions of the correction unit,
which relate to the space transfer characteristics from the left, right, and middle speakers to
the left and right microphones. The reproduction audio processing unit combines the audio signals
corrected by the first and third correction filters for crosstalk cancellation to generate the
left output audio signal, and combines the audio signals corrected by the second and fourth
correction filters for crosstalk cancellation to generate the right output audio signal.
[0020]
The present invention can provide a binaural sound reproduction system and a binaural sound
reproduction method that simultaneously provide two listeners with stereophonic sound that
faithfully reproduces the recorded sound field, as described above.
[0021]
It is a schematic block diagram which shows the state at the time of reproduction processing of
one embodiment of the binaural sound reproduction system according to Embodiment 1 of this
invention.
It is a schematic block diagram which shows the state at the time of characteristic calculation of
one embodiment of the binaural sound reproduction system according to Embodiment 1 of this
invention.
It is a schematic block diagram which shows an example of the internal structure of the filter
characteristic deriving unit in the binaural sound reproduction system shown in FIG.
FIG. 7 is a schematic block diagram of a modification of the binaural sound reproduction system
disclosed in FIG. 1;
It is a schematic block diagram which shows an example of the reproduction apparatus in the
binaural sound reproduction system shown in FIG.
It is a flowchart which shows the overall operation processing steps in the binaural sound
reproduction system shown in FIG.
It is a schematic block diagram which shows one embodiment of the binaural sound reproduction
system according to Embodiment 2 of this invention.
It is a schematic block diagram which shows the structure which performs the dynamic calculation
of the correction characteristic in related technology a.
It is a schematic block diagram which shows the structure which reproduces a binaural signal in
related technology a.
It is a schematic block diagram which shows the structure which performs the dynamic calculation
of the correction characteristic in related technology b.
It is a schematic block diagram which shows the structure which reproduces a binaural signal in
related technology b.
[0022]
Embodiment 1 Next, an embodiment for carrying out the present invention will be described with
reference to the drawings.
[0023]
As shown in FIG. 1, a binaural voice reproduction system 100 according to an embodiment of the
present invention includes a recorded voice holding unit 1 that holds a stereo recorded signal
consisting of a left recorded signal and a right recorded signal previously recorded binaurally,
and a playback device 2. The playback device 2 generates reproduction audio signals by applying
correction processing based on preset filter coefficients (filter functions) to the stereo
recorded signal provided from the holding unit 1, and outputs them from the left speaker (L) 3,
the middle speaker (C) 9, and the right speaker (R) 4 installed in advance, as stereo sound for
reproduction, to the listener (A) 7 and the listener (B) 8 positioned side by side facing these
speakers.
[0024]
The listener (A) 7 and the listener (B) 8 are, for example, seated on chairs installed side by
side at a predetermined distance from, and parallel to, the row formed by the left speaker (L) 3,
the middle speaker (C) 9, and the right speaker (R) 4.
[0025]
Here, the listener (A) 7 and the listener (B) 8 are positioned at the same distance from the
middle speaker (C) 9, which is the common speaker installed on the right ear side of the listener
(A) 7 and the left ear side of the listener (B) 8. In addition, the distance between the listener
(A) 7 and the left speaker (L) 3 installed on the left ear side of the listener (A) 7 is equal to
the distance between the listener (B) 8 and the right speaker (R) 4 installed on the right ear
side of the listener (B) 8.
[0026]
In addition, as shown in FIG. 2, the binaural sound reproduction system 100 includes left and
right sound collecting microphones (left microphone and right microphone) 5 and 6, which are
fitted at the left and right ear canal entrances (ear canal area) of the listener (A) 7 positioned
at a point facing the left speaker 3 and the middle speaker 9 and which collect the sound output
from the speakers, and a filter characteristic deriving unit (JFHF) 30. The filter characteristic
deriving unit 30 acquires the audio signals collected by the sound collecting microphones 5 and 6
(hereinafter, sound collection signals) and the left and right reproduction audio signals (left D1
and right D2) from the reproduction audio processing unit 20, and identifies the filter
coefficient (filter function) of each of the first to fourth adaptive filters described below
based on a comparison of these signals.
[0027]
Further, as shown in FIG. 1, the reproduction device 2 includes a filter unit 10 that performs
correction processing on the recorded signal (binaural recorded speech) provided from the recorded
speech holding unit 1, and a reproduction audio processing unit 20. The reproduction audio
processing unit 20 combines the corrected signals (correction signals) output from the filter unit
10 so as to cancel the crosstalk components relating to the positional relationship between the
left and middle speakers 3 and 9 and the sound collection microphones 5 and 6, as well as the
acoustic components ("external components") that reach the sound collection microphones 5 and 6
from the right speaker 4, thereby generating left and right reproduction audio signals. It outputs
the left reproduction audio signal as a common reproduction audio signal from each of the speakers
3 and 4, and outputs the right reproduction audio signal from the speaker 9.
[0028]
Here, among the audio signal components output from each speaker, the components to be canceled
are, as shown in the figures, an audio signal component (C12: crosstalk component) that reaches
the left microphone 5 from the middle speaker 9, an audio signal component (C21: crosstalk
component) that reaches the right microphone 6 from the left speaker 3, an audio signal component
(C23) that reaches the right microphone 6 from the right speaker 4, and an audio signal component
(C13) that reaches the left microphone 5 from the right speaker 4.
[0029]
Generally, when a binaurally recorded stereo audio signal is reproduced in stereo through
loudspeakers, crosstalk components reduce the sense of spatial spread and localization of the
recorded sound for a listener located at a point facing the left and right speakers.
[0030]
Therefore, the reproduction audio processing unit 20 performs processing for canceling the audio
signal components (C12, C21), which are the crosstalk components, and the external components
(C13, C23). As a result, only the left reproduction sound component (C11) reaches the left
microphone 5 worn by the listener (A) 7 and only the right reproduction sound component (C22)
reaches the right microphone 6, so that the listener (A) 7 can listen to three-dimensional sound
in which the sound field at the time of recording is reproduced.
[0031]
At this time, since the same audio signal (left audio signal) as that of the speaker (L) 3 is
output from the speaker (R) 4, as shown in FIG. 1, an audio component C33 corresponding to the
left reproduced audio component (C11) arrives at the right ear of the listener (B) 8 from the
speaker (R) 4.
Here, the audio component C33 and the left reproduced audio component (C11) are the same audio
component.
Further, since the position of the right microphone 6 (the right ear of the listener (A) 7) and
the position of the left ear of the listener (B) 8 are symmetrical with respect to the front
direction of the speaker (C) 9 (the two-dot chain line in the figures) and at equal distance from
the speaker (C) 9, a sound component (C32) identical to the right reproduction sound component
(C22) reaches the left ear of the listener (B) 8 from the speaker (C) 9.
[0032]
As a result, the listener (B) 8 can simultaneously listen to the stereophonic sound being listened
to by the listener (A) 7 in a state in which the left and right are reversed.
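The symmetry argument above can be checked numerically. The following is a minimal sketch for a single frequency bin, assuming that listener B's acoustic paths are an exact mirror image of listener A's about the middle speaker; the function names, index conventions, and path values are hypothetical and chosen only for illustration.

```python
from typing import List

# Index conventions assumed for this sketch:
#   speakers: 0 = left (3), 1 = middle (9), 2 = right (4)
#   ears:     0 = left ear, 1 = right ear
Paths = List[List[complex]]


def mirror_paths(paths_a: Paths) -> Paths:
    """Paths for listener B, assuming B mirrors A about the middle speaker."""
    paths_b: Paths = [[0j] * 3 for _ in range(2)]
    for ear in range(2):
        for spk in range(3):
            # B's opposite ear sees the opposite speaker the way A's ear sees this one.
            paths_b[1 - ear][2 - spk] = paths_a[ear][spk]
    return paths_b


def ear_signals(paths: Paths, left_out: complex, right_out: complex) -> List[complex]:
    """Signal arriving at each ear for one frequency bin."""
    # Left and right speakers share the left output; the middle speaker carries the right output.
    drive = [left_out, right_out, left_out]
    return [sum(paths[ear][spk] * drive[spk] for spk in range(3)) for ear in range(2)]


# Made-up path gains for listener A (rows: ears, columns: speakers).
paths_a = [[1.0 + 0j, 0.5j, 0.25 + 0j], [0.5 + 0j, 1.0 + 0j, 0.125j]]
ears_a = ear_signals(paths_a, left_out=2.0 + 0j, right_out=-1.0 + 0j)
ears_b = ear_signals(mirror_paths(paths_a), left_out=2.0 + 0j, right_out=-1.0 + 0j)
```

Under the mirror assumption, `ears_b` holds the same two values as `ears_a` with the ears swapped, which matches the statement that listener B hears the same sound with left and right reversed.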
[0033]
The reproduction sound processing unit 20 includes a left audio adder 21 and a right audio adder
22 (FIG. 1), which acquire the left recorded signal and the right recorded signal corrected by the
filter unit 10 (left correction signal and right correction signal) and add these signals together.
[0034]
The left audio adder 21 adds the correction signal corrected by the first correction filter 11 and
the correction signal corrected by the third correction filter 13, and the sum is reproduced from
the left speaker 3 and the right speaker 4.
Further, the right audio adder 22 adds the correction signal corrected by the second correction
filter 12 and the correction signal corrected by the fourth correction filter 14, and the sum is
reproduced from the middle speaker 9.
[0035]
The reproduction device 20 also includes a delay correction processing unit 25 that, at the time
of characteristic calculation (FIG. 2), corrects the delay arising between the input and output of
sound and supplies the result to the filter characteristic deriving unit 30 as a delay correction
signal.
The delay correction unit 25 corrects the device delay, which is the sum of the delay until the
sound subjected to the addition processing is output from the left speaker 3 (right speaker 4) and
the middle speaker 9, the delay caused by the output sound propagating through space, and the
delay until the sound signal collected by the microphones 5 and 6 is sent to the filter
characteristic deriving unit 30 (FIG. 2).
[0036]
The delay correction unit 25 is assumed to be, for example, a temporary storage device such as a
semiconductor memory that holds the added voice for a predetermined time based on the delay
time.
In addition, the delay correction unit 25 may be installed outside the playback device 20 and
may provide the filter characteristic deriving unit 30 with a delay correction signal necessary for
characteristic calculation.
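A memory that holds the added voice for a fixed time behaves like a delay line. The sketch below is one way to illustrate that idea, assuming the delay is a whole number of samples `d`; the class name `DelayLine` and the sample values are hypothetical and are not taken from the source.

```python
from collections import deque
from typing import Deque, List


class DelayLine:
    """Holds samples for `delay` steps, returning D(n - d) when D(n) is pushed."""

    def __init__(self, delay: int) -> None:
        if delay < 0:
            raise ValueError("delay must be non-negative")
        # Pre-fill with silence so the first `delay` outputs are zero.
        self._buf: Deque[float] = deque([0.0] * delay)

    def push(self, sample: float) -> float:
        # Store the newest sample and release the oldest one.
        self._buf.append(sample)
        return self._buf.popleft()


line = DelayLine(delay=2)
delayed: List[float] = [line.push(x) for x in [1.0, 2.0, 3.0, 4.0]]
```

With `delay=2`, the first two outputs are silence and the input then reappears two samples late, which is the alignment the filter characteristic deriving unit needs between the output signal and the microphone signal.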
[0037]
The filter unit 10 includes a first correction filter 11 and a second correction filter 12 that
correct the left recorded signal provided from the recorded voice holding unit 1 using the filter
coefficients (filter characteristics) calculated based on the space transfer characteristics
identified in advance by the filter characteristic deriving unit (JFHF) 30, and a third correction
filter 13 and a fourth correction filter 14 that similarly correct the right recorded signal
provided from the recorded voice holding unit 1 based on the filter coefficients identified by the
filter characteristic deriving unit (JFHF) 30.
[0038]
The left recorded signal sent from the recorded voice holding unit 1 to the filter unit 10 is
duplicated by the recorded voice holding unit 1 and input to each of the first correction filter
11 and the second correction filter 12.
Similarly, the right recorded signal sent from the recorded voice holding unit 1 to the filter
unit 10 is duplicated by the recorded voice holding unit 1 and input to each of the third
correction filter 13 and the fourth correction filter 14.
[0039]
The first correction filter 11 performs correction processing on the left recorded signal with a
filter coefficient (H11) set in advance by the filter characteristic deriving unit (JFHF) 30, and
inputs the corrected signal (left correction signal) to the left audio adder 21.
Specifically, the first correction filter 11 is assumed to be an FIR filter that performs
convolution processing on the input left recorded signal using the filter coefficient (H11) set by
the filter characteristic deriving unit (JFHF) 30.
[0040]
The second correction filter 12 performs correction processing on the left recorded signal with
the filter coefficient (H21) set in advance by the filter characteristic deriving unit (JFHF) 30,
and inputs the corrected signal (left correction signal) to the right audio adder 22.
[0041]
Specifically, the second correction filter 12 is assumed to be an FIR filter that performs
convolution processing on the input left recorded signal using the filter coefficient (H21) set by
the filter characteristic deriving unit (JFHF) 30.
[0042]
The third correction filter 13 performs correction processing on the right recorded signal with
the filter coefficient (H12) set in advance by the filter characteristic deriving unit (JFHF) 30,
and inputs the corrected signal (right correction signal) to the left audio adder 21.
Specifically, the third correction filter 13 is assumed to be an FIR filter that performs
convolution processing on the input right recorded signal using the filter coefficient (H12) set
by the filter characteristic deriving unit (JFHF) 30.
[0043]
The fourth correction filter 14 performs correction processing on the right recorded signal with
the filter coefficient (H22) set in advance by the filter characteristic deriving unit (JFHF) 30,
and inputs the corrected signal (right correction signal) to the right audio adder 22.
Specifically, the fourth correction filter 14 is assumed to be an FIR filter that performs
convolution processing on the input right recorded signal using the filter coefficient (H22) set
by the filter characteristic deriving unit (JFHF) 30.
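The signal flow through the four correction filters and the two adders can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the coefficient values are invented for the example rather than derived from any measurement.

```python
from typing import List, Sequence, Tuple


def fir_filter(signal: Sequence[float], coeffs: Sequence[float]) -> List[float]:
    """Causal FIR convolution: y[n] = sum over k of coeffs[k] * signal[n - k]."""
    out: List[float] = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def render_outputs(
    left_rec: Sequence[float],
    right_rec: Sequence[float],
    h11: Sequence[float],
    h21: Sequence[float],
    h12: Sequence[float],
    h22: Sequence[float],
) -> Tuple[List[float], List[float]]:
    # Filters 11 and 12 both receive the duplicated left recorded signal.
    left_to_left = fir_filter(left_rec, h11)
    left_to_right = fir_filter(left_rec, h21)
    # Filters 13 and 14 both receive the duplicated right recorded signal.
    right_to_left = fir_filter(right_rec, h12)
    right_to_right = fir_filter(right_rec, h22)
    # Adder 21 feeds speakers 3 and 4; adder 22 feeds speaker 9.
    left_out = [a + b for a, b in zip(left_to_left, right_to_left)]
    right_out = [a + b for a, b in zip(left_to_right, right_to_right)]
    return left_out, right_out


# Invented coefficients: a direct path plus a delayed, inverted cross term.
left_out, right_out = render_outputs(
    [1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
    h11=[1.0], h21=[0.0, -0.5], h12=[0.0, -0.5], h22=[1.0],
)
```

In this arrangement `left_out` would drive both the left speaker 3 and the right speaker 4, while `right_out` would drive the middle speaker 9.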
[0044]
As shown in FIG. 5, the filter characteristic deriving unit 30 calculates the filter coefficients
(H11, H21, H12, H22) of the correction filters 11 to 14 from a comparison between the sound signal
(D) sent from the reproduction sound processing unit 20 (delay correction unit 25) and the
collected sound signals Y (Y1, Y2) from the sound collection microphones (left and right
microphones) 5 and 6.
Specifically, it is assumed that the filter characteristic deriving unit 30 calculates the filter
coefficients based on the matrix equation [Equation 1] shown below.
[Equation 1]
C = Y / D
H = 1 / C
(D represents the signal delayed by the voice input / output device and by space transfer.
C represents the space transfer function of the sound.)
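Equation 1 can be illustrated at a single frequency with a 2x2 complex matrix. The sketch below rests on assumptions that go beyond the source: because the left and right speakers carry the same signal, their paths to each microphone are summed into one effective column, and the inverse is taken in closed form. The function names and the path values are hypothetical.

```python
from typing import List

Matrix = List[List[complex]]


def effective_transfer(
    c11: complex, c12: complex, c13: complex,
    c21: complex, c22: complex, c23: complex,
) -> Matrix:
    """Effective 2x2 matrix C, with Cij read as the path from speaker j to microphone i."""
    # Speakers 3 and 4 share the left output, so their paths add (assumption);
    # speaker 9 carries the right output.
    return [[c11 + c13, c12], [c21 + c23, c22]]


def invert_2x2(c: Matrix) -> Matrix:
    """H = 1 / C for one frequency bin."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    if abs(det) < 1e-12:
        raise ValueError("transfer matrix is singular at this frequency")
    return [[c[1][1] / det, -c[0][1] / det], [-c[1][0] / det, c[0][0] / det]]


def matmul_2x2(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


# Invented path values for one frequency bin.
c_eff = effective_transfer(
    c11=1.0 + 0j, c12=0.3j, c13=0.1 + 0j,
    c21=0.4 + 0j, c22=1.0 + 0j, c23=0.05j,
)
h = invert_2x2(c_eff)
identity = matmul_2x2(c_eff, h)
```

If the product of `c_eff` and `h` is the identity matrix, the microphone signals equal the recorded signals at that frequency. A practical system would have to repeat this for every frequency and deal with bins where the matrix is nearly singular, which this sketch does not attempt.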
[0045]
Thereby, the filter characteristic deriving unit 30 can quickly calculate the first to fourth filter
functions based on the acquired reproduction audio signal and the collected sound signal.
[0046]
The filter characteristic deriving unit 30 receives the collected signals Y1(n) and Y2(n), which
are the signals collected by the left and right microphones 5 and 6, respectively, and the delay
processing signals D1(n-d) and D2(n-d) sent from the delay correction unit 25, and calculates the
filter coefficients of the correction filters (11 to 14) in the filter unit 10 based on these
signals (adaptive signal processing).
Note that the delay processing signal D1(n-d) is obtained by applying the delay correction to the
left output speech, and the delay processing signal D2(n-d) is obtained by applying the delay
correction to the right output speech.
[0047]
As shown in FIG. 3, the filter characteristic deriving unit 30 is provided with: filter
coefficient calculation means 311 for calculating the filter coefficient (H11) of the correction
filter 11 based on the collected signal Y1(n), which is the audio signal collected by the left
microphone 5, and the delay processing signal D1(n-d) sent from the delay correction unit 25;
filter coefficient calculation means 321 for calculating the filter coefficient (H21) of the
correction filter 12 based on the collected signal Y2(n), which is the audio signal collected by
the right microphone 6, and the delay processing signal D1(n-d) sent from the delay correction
unit 25; filter coefficient calculation means 312 for calculating the filter coefficient (H12) of
the correction filter 13 based on the collected signal Y1(n), which is the audio signal collected
by the left microphone 5, and the delay processing signal D2(n-d) sent from the delay correction
unit 25; and filter coefficient calculation means 322 for calculating the filter coefficient (H22)
of the correction filter 14 based on the collected signal Y2(n), which is the audio signal
collected by the right microphone 6, and the delay processing signal D2(n-d) sent from the delay
correction unit 25.
[0048]
Here, the filter coefficient calculation means 311, 321, 312, 322 for calculating the filter
coefficients (Hxx) are assumed to be digital filters (typically FIR) that determine the filter
coefficients of the correction filters 11, 12, 13, 14 so that the ratio between the collected
signal Y(n) and the delay processing signal D(n-d) approaches 1 (Y / D = 1).
[0049]
As shown in FIG. 3, the filter coefficient calculation means 311 calculates the filter coefficient
(H11) based on the sound collection signal Y1(n) as the plus component and the delay processing
signal D1(n-d) as the minus component. The filter coefficient calculation means 321 calculates the
filter coefficient (H21) based on the sound collection signal Y2(n) as the plus component and the
delay processing signal D1(n-d) as the minus component. The filter coefficient calculation means
312 calculates the filter coefficient (H12) based on the sound collection signal Y1(n) as the plus
component and the delay processing signal D2(n-d) as the minus component. The filter coefficient
calculation means 322 calculates the filter coefficient (H22) based on the sound collection signal
Y2(n) as the plus component and the delay processing signal D2(n-d) as the minus component.
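The general idea of adapting FIR coefficients from the difference between a plus signal and a minus signal can be illustrated with a generic adaptive filter. The sketch below uses normalized LMS, which is a substitute chosen here for brevity and is not the algorithm used in this document. How the inputs `x` and `d` would map onto Y(n) and D(n-d) in FIG. 3 is also an assumption, and all names and values are hypothetical.

```python
import random
from typing import List, Sequence


def fir(x: Sequence[float], h: Sequence[float]) -> List[float]:
    """Causal FIR convolution used to simulate an unknown path."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0) for n in range(len(x))]


def nlms_identify(
    x: Sequence[float], d: Sequence[float], taps: int, mu: float = 0.5, eps: float = 1e-8
) -> List[float]:
    """Adapt FIR weights so that the filtered input x tracks the target d."""
    w = [0.0] * taps
    for n in range(len(x)):
        # Most recent `taps` input samples, newest first, zero-padded at the start.
        window = [x[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        y = sum(wk * xk for wk, xk in zip(w, window))
        e = d[n] - y  # plus component minus the filter's current estimate
        norm = eps + sum(xk * xk for xk in window)
        w = [wk + mu * e * xk / norm for wk, xk in zip(w, window)]
    return w


rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
true_h = [0.8, -0.3, 0.1]  # invented path to be identified
d = fir(x, true_h)
estimate = nlms_identify(x, d, taps=3)
```

In this noise-free toy case the estimated weights settle close to `true_h`. A real system would have to cope with noise, much longer filters, and time-varying paths.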
[0050]
Here, using the determined filter coefficients, the correction filters 11 and 12 independently
correct the left recorded speech, and the correction filters 13 and 14 independently correct the
right recorded speech. As a result, correction processing adapted to the variation of the
corresponding space transfer characteristic is performed.
[0051]
In this embodiment, the filter coefficients of the filter unit 10 are calculated (characteristic
calculation) with the configuration shown in FIG. 2, and the binaural signal is reproduced
(reproduction processing) with the configuration shown in FIG. 1, which realizes simultaneous
transaural reproduction of binaurally recorded speech for the listeners (here, 7 and 8).
Alternatively, for example, as shown in FIG. 4, the filter coefficients calculated by the filter
characteristic deriving unit 30 may be set immediately as the filter coefficients of the
correction filters in the filter unit 10.
[0052]
In this case, as shown in FIG. 5, the filter coefficients of the correction filters 11 to 14 of
the filter unit 10 can be identified adaptively and immediately in response to variations in the
space transfer characteristics from the speakers (3, 4 and 9) to the microphones (5 and 6).
For this reason, the characteristic calculation can follow fluctuations of the space transfer
characteristics caused by the listener 7 wearing the microphones (for example, when the listener 7
turns the neck or changes the line of sight).
[0053]
Further, in the present embodiment (the embodiment shown in FIG. 1 and FIG. 2 or FIG. 4),
adaptive signal processing by the filter coefficient calculation means (311, 321, 312, 322) of the
filter characteristic deriving unit 30 is assumed to be realized using the fast H∞ filter
(referred to as FHF) disclosed in Japanese Patent No. 4067269.
[0054]
As a result, the filter characteristic deriving unit 30 quickly determines the filter coefficients
(H11, H21, H12, H22) of the correction filters 11, 12, 13, 14 based on the space transfer
characteristics C11, C12 and C13, C21 and C23, C22. In addition, it is not necessary to first
calculate each space transfer characteristic and then calculate each filter function as its
inverse characteristic; the filter functions (H11, H21, H12 and H22) can be calculated directly
and almost simultaneously.
[0055]
As described above, the filter characteristic deriving unit 30 performs adaptive signal processing
at high speed, and thereby follows in real time the variation of the space transfer characteristics
caused by changes in the positional relationship between the speakers 3, 4, 9 and the ear
positions of the listener 7 (that is, the microphones 5, 6). The crosstalk components (C12, C21)
between the speakers 3, 4, 9 and the microphones 5, 6 are therefore canceled, the left recorded
audio signal and the right recorded audio signal are heard as they are by the left and right ears
of the listener, and transaural reproduction can be maintained.
[0056]
In addition, a correction function corresponding to the space transfer function between each
speaker (left, middle, and right speakers 3, 9, 4) and each microphone (left and right
microphones 5, 6) is calculated, and the reproduction sound signal corrected on the basis of this
correction function is reproduced and output while the crosstalk components (C12, C21) and the
external components (C13, C23) associated with this space transfer function (positional
relationship) are canceled. Although the localization of the sound heard by the listener (B) 8 is
reversed left to right relative to that heard by the listener (A) 7, three-dimensional sound can be
provided simultaneously to two listeners positioned side by side with respect to the speakers.
Transaural reproduction is thus established simultaneously at both ears of each of the
listener (A) 7 and the listener (B) 8.
[0057]
In the present embodiment, the case where the filter coefficients (H11, H12, H21, and H22) are
obtained each time on the basis of the collected sound signal sent from the microphones 5 and 6
and the delay processing signal sent from the delay correction unit 25 has been described.
However, the filter coefficients may instead be updated, starting from the previously determined
filter coefficients, from the comparison between the newly input sound pickup signal and the
delay processing signal.
[0058]
[Description of Operation of First Embodiment] Next, the overall operation of the above-described embodiment will be described.
[0059]
In the binaural voice reproduction system 100, first, the reproduction voice processing unit 20
outputs the left reproduction voice, obtained by correcting the recorded voice from the recorded
voice holding unit 1 in the filter unit 10, from each of the speakers 3 and 4, and simultaneously
outputs the right reproduction audio, similarly corrected by the filter unit 10, from the speaker 9
(reproduction processing).
Here, it is assumed that the filter unit 10 performs the above-mentioned correction processing
based on the preset filter coefficients.
[0060]
Next, the sound collecting microphones (hereinafter referred to as microphones) 5 and 6
attached to the ears of the listener 7 collect the sound reproduced and output.
Next, the filter characteristic deriving unit 30 acquires the left and right collected sound signals
collected by the microphones 5 and 6 and the delay-processed left and right output signals from
the reproduction audio processing unit 20. From the comparison between the left and right
collected sound signals and the left and right output signals, it derives the filter functions of the
correction filters 11 to 14 based on the space transfer characteristics from the speakers 3, 4, 9
to the left and right microphones 5, 6 (characteristic calculation processing).
[0061]
Here, the filter functions in the correction filters 11, 12, 13, and 14 are set to the filter functions
respectively derived by the identification process.
At this time, the setting of the filter function in the correction filters 11, 12, 13, and 14 is
performed in a state in which the reproduction process is interrupted.
[0062]
Next, the binaural voice reproduction system 100 performs reproduction processing based on
the set filter function.
At this time, the filter unit 10 performs correction processing on the left and right recorded
voices sent from the recorded voice holding unit 1 based on the set filter functions. The adders
21 and 22 then add the corrected signals so as to cancel the crosstalk components (and the
external components (C13, C23)), generating a left reproduction audio signal and a right
reproduction audio signal, which are reproduced and output as described above (reproduction
output).
[0063]
In the above embodiment, the characteristic calculation process (identification process) and the
reproduction output may be performed simultaneously with the configuration shown in FIG. 4.
In this case, the filter characteristic deriving unit 30 acquires the collected sound signals of the
sound output from the left, right, and middle speakers 3, 4, and 9, together with the left output
sound signal and the right output audio signal output from the adders 21 and 22, and calculates
the filter function of each of the correction filters 11 to 14 on that basis (characteristic
calculation processing). At this time, the filter functions of the correction filters 11, 12, 13, and
14 are set to the filter functions calculated as described above.
[0064]
Next, the operation in the case where the reproduction process and the characteristic calculation
process (identification process) are simultaneously performed with the configuration shown in
FIG. 4 in the above embodiment will be specifically described based on the flowchart of FIG.
[0065]
First, a left recorded signal and a right recorded signal binaurally recorded in advance are sent
from the recorded voice holding unit 1 to the filter unit 10 (step S101).
The filter unit 10 performs correction processing based on preset filter coefficients (filter
functions) on the sent left and right recording signals, and inputs the corrected signals to the
reproduction voice processing unit 20 (step S102).
Here, in the above correction processing, the correction filters 11 and 12 each independently
perform a convolution operation on the left recorded speech, and the correction filters 13 and 14
each independently perform a convolution operation on the right recorded speech, using the
predetermined filter coefficients.
[0066]
Next, the left audio adder 21 of the reproduction audio processing unit 20 adds the correction
signal corrected by the correction filter 11 and the correction signal corrected by the correction
filter 13 to synthesize the reproduction audio signal (left reproduction signal) to be output from
each of the speakers 3 and 4.
Further, the right audio adder 22 adds the correction signal corrected by the correction filter 12
and the correction signal corrected by the correction filter 14 to synthesize the reproduction
sound signal (right reproduction signal) to be output from the speaker 9 (step S103).
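The correction and synthesis steps (S102, S103) can be sketched as four convolutions followed by two additions. This is an illustrative sketch only: the function name is hypothetical, the four filters are assumed to be FIR filters of equal length, and the right-hand pairing (filters 12 and 14) follows the description of the right reproduction signal given for the right audio adder 22.

```python
import numpy as np

def render_outputs(left_rec, right_rec, h11, h21, h12, h22):
    """Return (left_out, right_out); the four filters are assumed to have equal length."""
    s11 = np.convolve(left_rec, h11)    # correction filter 11 (H11) on the left recording
    s12 = np.convolve(left_rec, h21)    # correction filter 12 (H21) on the left recording
    s13 = np.convolve(right_rec, h12)   # correction filter 13 (H12) on the right recording
    s14 = np.convolve(right_rec, h22)   # correction filter 14 (H22) on the right recording
    left_out = s11 + s13                # adder 21: duplicated to speakers 3 and 4
    right_out = s12 + s14               # adder 22: sent to speaker 9
    return left_out, right_out

# Pass-through check: with H11 = H22 = 1 and H21 = H12 = 0 nothing is mixed.
left = np.array([1.0, 2.0, 3.0])
right = np.array([4.0, 5.0, 6.0])
one, zero = np.array([1.0]), np.array([0.0])
left_out, right_out = render_outputs(left, right, one, zero, zero, one)
```

Non-trivial cross filters (H21, H12) are what inject the cancellation terms into the opposite channel.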
[0067]
Here, the filter function (H12) of the correction filter 13 is the inverse characteristic of the
combined component of the crosstalk component (C12), which is the space transfer
characteristic from the speaker (C) 9 to the microphone 5, and the external component (C13),
which is the space transfer characteristic from the speaker (R) 4 to the microphone 5. When this
filter function is correctly identified, the crosstalk component (C21) and the external component
(C23) are canceled in the left reproduction signal, which the left audio adder 21 synthesizes from
the signals corrected by the filter 11 and the filter 13.
Therefore, only components equivalent to the left recorded audio signal reach the left ear of the
listener 7 and the right ear of the listener 8.
That is, correction is performed such that a signal equivalent to the left recorded audio signal
reaches the left ear of the listener 7 and the right ear of the listener 8.
[0068]
Further, the filter function (H21) of the correction filter 12 is the inverse characteristic of the
combined component of the crosstalk component (C21), which is the space transfer
characteristic from the speaker (L) 3 to the microphone 6, and the external component (C23),
which is the space transfer characteristic from the speaker (R) 4 to the microphone 6. When this
filter function is correctly identified, the crosstalk component (C12) and the external component
(C13) are canceled in the right reproduction signal, which the right audio adder 22 synthesizes
from the signals corrected by the filter 12 and the filter 14.
Therefore, only components equivalent to the right recorded audio signal reach the right ear of
the listener 7 and the left ear of the listener 8.
That is, correction is performed such that a signal equivalent to the right recorded audio signal
reaches the right ear of the listener 7 and the left ear of the listener 8.
[0069]
Next, the left audio adder 21 duplicates the synthesized left reproduction signal and inputs it to
the speaker 3 and the speaker 4 and also to the delay correction unit 25.
Further, the right audio adder 22 inputs the synthesized right reproduction signal to the speaker
9 and also inputs it to the delay correction unit 25 (step S104).
[0070]
The speakers 3, 4 and 9 respectively reproduce (voice output) the input reproduction signal, and
the microphones 5 and 6 pick up the sound.
The microphones 5 and 6 send the collected signals to the filter characteristic deriving unit 30 as
collected signals Y1 (n) and Y2 (n), respectively (step S105).
[0071]
Here, the filter coefficient calculation means 311, 321, 312, 322 of the filter characteristic
deriving unit 30 each perform adaptive signal processing based on the following [Equation 1],
and calculate the filter coefficients (H11, H21, H12, and H22) of the correction filters 11 to 14
from the comparison between the collected signals and the delay processed signals D1 (n-d) and
D2 (n-d) sent from the delay correction unit 25 (step S106).
[0072]
[Equation 1]
C = Y / D
H = 1 / C
(C represents a space transfer function of sound. Also, D represents a signal delayed by space
transmission with the voice input / output device.)
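As a worked illustration of [Equation 1], the two ratios can be evaluated bin by bin in the frequency domain. This is a sketch under stated assumptions, not the adaptive processing of the embodiment: the function name, the FFT length, the regularization constant `reg` (added to avoid division by zero), and the example path `c` are all hypothetical.

```python
import numpy as np

def inverse_filter_response(y, d, n_fft=256, reg=1e-6):
    """Estimate C = Y / D and H = 1 / C per frequency bin (regularized divisions)."""
    Y = np.fft.rfft(y, n_fft)
    D = np.fft.rfft(d, n_fft)
    C = Y * np.conj(D) / (np.abs(D) ** 2 + reg)   # C = Y / D
    H = np.conj(C) / (np.abs(C) ** 2 + reg)       # H = 1 / C
    return C, H

d = np.zeros(64)
d[0] = 1.0                          # reference signal D: a unit impulse, for simplicity
c = np.array([1.0, 0.5])            # assumed space transfer path (illustrative only)
y = np.convolve(d, c)               # collected signal Y = C applied to D
C, H = inverse_filter_response(y, d)
product = H * C                     # close to 1 in every bin when H is a good inverse
```

The product `H * C` staying close to 1 is the frequency-domain counterpart of the converged state HC = 1 mentioned in the description.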
[0073]
Here, it is assumed that the calculated filter functions are instantaneously set to the
corresponding correction filters 11 to 14, respectively.
[0074]
Thereby, even if the positional relationship between the listener 7 and the speakers 3 and 4 and
their mutual orientation change, the filter characteristics (filter coefficients) of the correction
filters 11 to 14 for crosstalk cancellation can be determined by adapting to the change.
Note that the delay processing signal D1 (n-d) is assumed to be the left output speech that has
undergone correction processing and has then been delayed, and the delay processing signal
D2 (n-d) is assumed to be the right output speech that has undergone correction processing and
has then been delayed.
[0075]
Here, on the basis of the determined filter coefficients, the correction filters 11 and 12 each
independently process the left recorded speech, and the correction filters 13 and 14 each
independently process the right recorded speech. Correction processing adapted to the variation
of the corresponding space transfer characteristic is thus performed (return to step S102).
[0076]
Thereafter, the filter characteristic deriving unit 30 performs adaptive signal processing at high
speed and follows in real time the fluctuation of the space transfer characteristics caused by
changes in the positional relationship between the speakers 3, 4, and 9 and the listener's ear
positions (that is, the microphones 5, 6), so that the state close to HC = 1 (converged) is
maintained as described above.
[0077]
As described above, in the present embodiment, the space transfer function corresponding to
the positional relationship between the listener's ears and the speakers can be identified in real
time. The crosstalk components (C12, C21) between the speakers 3, 9 and the microphones 5, 6
and the external components (C13, C23) between the speakers 9, 4 and the microphones 5, 6 can
therefore each be canceled. As a result, the left recorded audio signal and the right recorded
audio signal reproduce the sense of localization at the time of recording as stereophonic sound
at the left and right ears of the listener 7, and the stereophonic sound heard by the listener 7 is
reproduced with left and right inverted at the left and right ears of the listener 8. Transaural
reproduction is thus established at the same time at the ears of each of the side-by-side
listeners 7 and 8.
[0078]
[Second Embodiment] Next, a binaural voice reproduction system 101 according to a second
embodiment of the present invention will be described.
Here, the same parts as in the first embodiment described above are given the same reference
numerals.
[0079]
Whereas the first embodiment simultaneously provides transaural reproduction sound to two
listeners, the second embodiment simultaneously provides transaural reproduction sound to
four listeners (a, b, c, d) equally spaced on the circumference of a circle, as shown in the figure.
[0080]
In the binaural voice reproduction system 101 according to the second embodiment, four
listeners (a, b, c, d) are arranged in this order at equal intervals on the circumference of a circle
of constant radius (with center O). The speakers R1, L1, R2 and L2 are arranged on the
circumference at the midpoints between the listeners a and b, b and c, c and d, and d and a,
respectively.
Here, the left reproduction signal (left reproduction voice) output from the addition unit 21 is
reproduced and output from the speakers L1 and L2, and the right reproduction signal (right
reproduction voice) output from the addition unit 22 is reproduced and output from the
speakers R1 and R2.
[0081]
In the second embodiment, the left output voice output from the left voice adder 21 is copied and
output from each of the speakers L1 and L2, and the right output voice output from the right
voice adder 22 is copied and output from each of the speakers R1 and R2.
[0082]
In the second embodiment, the left and right ear canal entrances (ear canal areas) of the listener
(a), who is located at the midpoint between the loudspeaker L2 and the loudspeaker R1 on the
circumference, are equipped with left and right sound collection microphones (left microphone,
right microphone) 5 and 6 for collecting the sound output from the loudspeakers L1, L2, R1,
and R2.
[0083]
Here, the listeners (a) and (c) face each other across the center O, as do the listeners (b) and
(d), each seated in a chair installed in advance.
[0084]
Thus, the positional relationship between the listener (a) and the listener (b) is the same as in
the first embodiment. The common speaker placed on the right ear side of the listener (a) and
the left ear side of the listener (b) (in this case, speaker R1, corresponding to speaker (C) 9) is at
an equal distance from the listeners (a) and (b). In addition, the distance between the listener (a)
and the left speaker L2 (corresponding to the speaker (L) 3) installed on the left ear side of the
listener (a) is equal to the distance between the listener (b) and the speaker L1 (corresponding
to the speaker (R) 4) installed on the right ear side of the listener (b).
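The equal-distance relationships in this arrangement can be checked numerically. The sketch below is illustrative only: it assumes a unit-radius circle, places the listeners a, b, c, d at 0, 90, 180, and 270 degrees, and places the speakers R1, L1, R2, L2 at the midpoints between them; the angles, the direction of travel around the circle, and the helper name are assumptions.

```python
import math

def point_on_circle(angle_deg, radius=1.0):
    """Cartesian position of a point on the circle centred at O (hypothetical helper)."""
    a = math.radians(angle_deg)
    return (radius * math.cos(a), radius * math.sin(a))

# Listeners a, b, c, d in order around the circle; speakers at the midpoints between them.
listeners = {name: point_on_circle(ang) for name, ang in zip("abcd", (0, 90, 180, 270))}
speakers = {
    "R1": point_on_circle(45),    # between a and b (shared speaker)
    "L1": point_on_circle(135),   # between b and c
    "R2": point_on_circle(225),   # between c and d
    "L2": point_on_circle(315),   # between d and a
}

# Shared speaker R1 is equidistant from listeners a and b.
a_to_r1 = math.dist(listeners["a"], speakers["R1"])
b_to_r1 = math.dist(listeners["b"], speakers["R1"])

# Listener a to L2 matches listener b to L1.
a_to_l2 = math.dist(listeners["a"], speakers["L2"])
b_to_l1 = math.dist(listeners["b"], speakers["L1"])
```

The same check can be repeated for the pair (c, d) by swapping the listener and speaker names accordingly.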
[0085]
Thus, when the right reproduction audio is output from the speaker R1 and the left reproduction
audio is output from each of the speakers L1 and L2, the listener (a) can listen to the
three-dimensional sound in which the crosstalk component is canceled. Furthermore, the
listener (b) can simultaneously listen to the stereophonic sound being listened to by the
listener (a) with the left and right reversed.
[0086]
In the second embodiment, the right reproduction audio is also output from the speaker R2 and
may be picked up by the microphones 5 and 6 worn by the listener (a). However, the positional
relationship between the microphones 5 and 6 and the speaker R2 is symmetrical with respect
to the listener (a), so the right reproduction audio output from the speaker R2 is also treated as
an external component and is canceled in the same manner as the left playback audio from L1.
[0087]
Also, the positions of the listeners (c) and (d) are symmetrical to the positions of the
listeners (a) and (b) with respect to the center O, and the positions of the speakers L1, R2 and L2
are symmetrical to the positions of the speakers L2, R1 and L1, respectively, with respect to the
center O. Therefore, the listener (c) can listen to the same stereophonic sound as the
listener (a), and the listener (d) can listen to the stereophonic sound heard by the listener (c)
with left and right reversed (that is, the same acoustic component as the listener (b) is
listening to).
[0088]
For this reason, in the second embodiment, a space transfer function corresponding to the
positional relationship between the position of the ears of the listener (a) and the speakers (L1,
L2, R1, R2) is identified, and correction based on this is performed. Thus, crosstalk components
(C12, C21) between the speakers 3, 9 and the microphones 5, 6 and external components (C13,
C23) between the speakers 9, 4 and the microphones 5, 6 are effectively canceled.
As a result, the left recorded audio signal and the right recorded audio signal reproduce the
sense of localization at the time of recording as stereophonic sound at the left and right ears of
the listener (a), and at the same time transaural reproduction is established at the ears of each
of the four listeners (a, b, c, d).
[0089]
Further, in the second embodiment, an additional speaker (symmetrical speaker) for reproducing
the left or right reproduction signal may be newly provided. When one point facing the left (or
right) speaker and the additional speaker and another point facing the right speaker and the
additional speaker are symmetrical with respect to the additional speaker, stereophonic sound
can be provided to a listener (m) located at the one point and a listener (n) located at the other
point.
For this reason, by newly providing additional speakers that each form a symmetrical
relationship between two listeners, stereophonic sound can be provided simultaneously to a
plurality of (but an even number of) listeners located at points that are symmetrical with respect
to the additional speakers.
[0090]
The gist of the novel technical content of the above-mentioned embodiments is summarized as
follows.
Although a part or all of the embodiments is summarized below as novel techniques, the present
invention is not necessarily limited to this.
[0091]
(Supplementary Note 1) A binaural sound reproduction system comprising: a correction unit that
performs adaptive correction on a left recorded signal and a right recorded signal recorded in
binaural; a reproduction voice processing unit that combines the corrected signals to generate
left and right output sound signals, which are sound signals for reproduction; left and right
speakers that each output one of the left and right output sound signals; and a middle speaker
that is disposed at a midpoint between the left and right speakers and outputs the other of the
left and right output sound signals, the system reproducing and outputting the left and right
output sound signals to one listener located at a point facing the left and middle speakers and to
the other listener located at a point facing the right and middle speakers, the two points being
symmetrical with respect to the middle speaker, wherein the correction unit includes first and
second correction filters that perform filter correction, based on first and second filter functions
set in advance, on the left recorded signal that has been replicated and input, and third and
fourth correction filters that perform filter correction, based on third and fourth filter functions
set in advance, on the right recorded signal that has been replicated and input; the system
further comprises a filter function deriving unit that simultaneously derives the first to fourth
filter functions based on the space transfer characteristics from the left, right, and middle
speakers to left and right sound collection microphones that are provided in the left and right
ear canal regions of the one listener and pick up the output left and right sound signals; and the
positional relationship between the one listener and the left speaker and the positional
relationship between the other listener and the right speaker are symmetrical with respect to
the position of the middle speaker.
[0092]
(Supplementary Note 2) The binaural voice reproduction system according to Supplementary
Note 1, further comprising a symmetrical speaker that corresponds to the middle speaker and
outputs the same left or right output sound signal as the middle speaker, wherein, when one
point facing the left speaker and the symmetrical speaker and the other point facing the right
speaker and the symmetrical speaker are symmetrical with respect to the symmetrical speaker,
the system reproduces and outputs the left and right output sound signals to a third listener
located at the one point and a fourth listener located at the other point.
[0093]
(Supplementary Note 3) The binaural voice reproduction system according to supplementary
note 1, wherein the filter function deriving unit comprises a characteristic calculation adaptive
filter that calculates each of the first to fourth filter functions such that the contrast between
each of the left and right output sound signals output from the reproduction device and each of
the left and right collected sound signals collected by the left and right sound collection
microphones approximates to 1.
[0094]
(Supplementary Note 4) The binaural voice reproduction system according to supplementary
note 1, wherein the filter function deriving unit has a filter function update calculation function
of newly calculating and updating the first to fourth filter functions based on the difference
between the first to fourth filter functions calculated in advance and the newly input left and
right collected signals.
[0095]
(Supplementary Note 5) The binaural voice reproduction system according to supplementary
note 1 or 2, further comprising a delay processing unit that, when the filter function deriving
unit derives the first to fourth filter functions, performs delay correction on the left and right
output audio signals based on a delay time related to the reproduction output of the left and
right output audio signals and a delay time related to sound collection by the left and right sound
collection microphones.
[0096]
(Supplementary Note 6) A binaural sound reproduction method in a binaural voice reproduction
system having: a correction unit that performs filter correction on a left recorded signal and a
right recorded signal recorded in binaural; a reproduction voice processing unit that combines
the filter-corrected signals to generate left and right output sound signals, which are sound
signals for reproduction; left and right speakers that each output one of the left and right
output sound signals; and a middle speaker that is installed at the midpoint between the left and
right speakers and outputs the other of the left and right output sound signals, the method
reproducing and outputting the left and right output sound signals to one listener located facing
the left speaker and the middle speaker and to the other listener located facing the right speaker
and the middle speaker when the positions of the one and the other listeners are symmetrical
with respect to the middle speaker, wherein: left and right microphones provided at the left and
right auricles of the one listener pick up the output sound signals output from the left, right, and
middle speakers; a filter function deriving unit acquires the picked-up left and right collected
sound signals and the left and right output sound signals and, based on the contrast between
them, simultaneously derives the first to fourth filter functions of the correction unit relating to
the space transfer characteristics from the left, right, and middle speakers to the left and right
microphones; the correction unit performs filter correction based on the first and second filter
functions on the left recorded audio signal and filter correction based on the third and fourth
filter functions on the right recorded audio signal; and the reproduction voice processing unit
generates the left output audio signal by combining, for crosstalk cancellation, the audio signals
corrected by the first and third correction filters, and generates the right output audio signal by
combining, for crosstalk cancellation, the audio signals corrected by the second and fourth
correction filters.
[0097]
The present invention can be usefully applied to an arcade game or the like using a transaural
system in which auditory information of a virtual world recorded in binaural is superimposed on
surrounding sounds in the real world and provided to a listener.
[0098]
DESCRIPTION OF SYMBOLS
1 recorded audio holding unit
2 reproduction device
3 left speaker
4 right speaker
5 left microphone (left sound collection device)
6 right microphone (right sound collection device)
7, 8 listener
10 filter unit (correction unit)
20 reproduction sound processing unit
30 filter characteristic deriving unit