Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2010203800
[PROBLEMS] To provide a method and an apparatus capable of reliably estimating a sound source even when a sudden or intermittent sound is generated. [SOLUTION] Sound and video are sampled simultaneously using an audio/video sampling unit in which a plurality of microphones and a camera are integrated, and the resulting sound pressure waveform data and image data are temporarily stored in a buffer. When the control unit issues a measurement start command at time t, the data saved in the buffer over the period from time t − Tz to time t + (Tw − Tz), where Tz is a preset retrogressive time length and Tw is a preset analysis time length, are extracted to create a sound file and a moving image file, which are saved in memory. The phase differences between the sound pressure signals sampled by the plurality of microphones are then calculated from the sound pressure waveform data of the saved sound file to estimate the sound source direction. [Selected figure] Figure 2
Sound source estimation method and apparatus
[0001]
The present invention relates to a method and apparatus for estimating a sound source using
information of sound collected by a plurality of microphones and information of an image
photographed by a photographing means.
[0002]
Conventionally, as a method of estimating the direction of arrival of sound, a so-called acoustic method has been devised in which a microphone array consisting of a large number of microphones arranged at equal intervals is constructed, and the
04-05-2019
1
sound source direction, i.e., the direction of arrival of the sound waves, is estimated from the phase difference of each microphone with respect to a reference microphone (see, for example, Non-Patent Document 1).
On the other hand, a method has also been proposed that, rather than using the phase differences of the output signals of many microphones arranged at the measurement point, forms a plurality of microphone pairs from microphones arranged on straight lines intersecting each other, and estimates the sound source direction from the ratio of the arrival time difference (corresponding to the phase difference) between the two microphones of one pair to the arrival time difference between the two microphones of the other pair (see, for example, Patent Documents 1 to 3).
[0003]
Specifically, as shown in FIG. 6, four microphones M1 to M4 are arranged at predetermined intervals on two straight lines orthogonal to each other, forming two microphone pairs (M1, M3) and (M2, M4). The horizontal angle θ between the measurement point and the position of the sound source is estimated from the ratio of the arrival time difference of the sound pressure signals input to the microphones M1 and M3 constituting the pair (M1, M3) to the arrival time difference of the sound pressure signals input to the microphones M2 and M4 constituting the pair (M2, M4). Further, a fifth microphone M5 is placed off the plane formed by the microphones M1 to M4, forming four additional microphone pairs (M5, M1), (M5, M2), (M5, M3), and (M5, M4), and the elevation angle φ between the measurement point and the position of the sound source is estimated from the arrival time differences between the microphones constituting these pairs.
[0004]
As a result, the sound source direction can be estimated accurately with fewer microphones than when a microphone array is used.
Also, if an image capturing means such as a CCD camera is provided to capture an image in the estimated sound source direction, and the image data and the sound source direction data are then combined so that the estimated sound source direction and the sound pressure level are displayed graphically in the image, the sound source can be grasped visually.
There is also a method of estimating a sound source in which, simultaneously with the collection of sound, images are continuously taken by the image collection means and stored together with the sound information as a moving image in a computer; the sound source direction is then calculated and estimated, and a figure indicating the estimated sound source direction and sound pressure level is displayed in the image.
[0005]
JP 2002-181913 A, JP 2006-324895 A, JP 2008-224259 A
[0006]
Jiro Oga, Yoshio Yamazaki, Yutaka Kanada; Acoustic System and Digital Processing, Corona, 1995
[0007]
In the above-mentioned conventional methods, the sound source direction is calculated after the sound information and the image information collected over a preset measurement time (for example, 120 seconds) are taken into the computer.
Therefore, when a sudden or intermittent noise occurs, the measurement may fail, or only unnecessary data may be analyzed, so that efficient measurement is difficult.
[0008]
The present invention has been made in view of these conventional problems, and it is an object of the present invention to provide a method and an apparatus capable of reliably estimating a sound source even when a sudden or intermittent sound is generated.
[0009]
The invention according to claim 1 of the present application is a sound source estimation method in which sound and video are collected simultaneously using an audio/video sampling unit in which a plurality of microphones and an image pickup means are integrally incorporated, and the sound pressure waveform data of the collected sound and the image data of the collected video are stored in a temporary storage file. When a command to start measurement of the sound source direction is issued, the sound pressure waveform data and the image data stored in the temporary storage file are extracted over the period from the time obtained by tracing back a predetermined retrogressive time length from the time the command was issued until the time a preset analysis time length has elapsed from that traced-back time, and these data are stored in storage means as a sound file and a moving image file, respectively. The phase differences between the sound pressure signals of the sounds collected by the plurality of microphones are calculated from the sound pressure waveform data of the stored sound file to estimate the sound source direction, and a sound source position estimation image, in which a figure indicating the estimated sound source direction is drawn, is created by combining the estimated sound source direction with the image data captured within the time used for the estimation and stored in the moving image file; the sound source is then estimated using this sound source position estimation image.
[0010]
The invention according to claim 2 is a method for estimating a sound source, comprising: a first step of collecting sound information and image information using a sound/image collecting unit in which a plurality of microphones and a photographing means are integrated; a second step of A/D converting the sound pressure signals of the collected sound and the video signal and storing the resulting sound pressure waveform data and image data for a predetermined period in a first storage means; a third step of issuing a command to start measurement of the sound source direction; a fourth step of extracting from the first storage means sound pressure waveform data and image data for the analysis time length used for estimating the sound source, and storing the extracted sound pressure waveform data and image data in a second storage means as a sound file and a moving image file; a fifth step of extracting from the sound file sound pressure waveform data for the operation time length used for the sound source direction estimation calculation, calculating the phase differences between the sound pressure signals of the sounds collected by the plurality of microphones, and estimating the sound source direction; a sixth step of extracting from the moving image file the image data for the time between the start time and the end time of the calculation; and a seventh step of combining the sound source direction estimated in the fifth step with the image data extracted in the sixth step, creating and displaying a sound source position estimation image in which a figure indicating the estimated sound source direction is drawn, and estimating the sound source using the sound source position estimation image. The sound pressure waveform data and the image data for the analysis time length extracted from the first storage means in the fourth step consist of the sound pressure waveform data and image data stored in the first storage means from a second time, which precedes the first time at which the measurement start command was issued by the set retrogressive time length, to the first time, and the sound pressure waveform data and image data stored in the first storage means from the first time until a third time, at which a time length equal to the analysis time length minus the retrogressive time length has elapsed from the first time.
[0011]
The invention according to claim 3 is the sound source estimation method according to claim 2, further comprising, between the fourth step and the fifth step, a step of creating and displaying a graph of the time-series waveform of the sound pressure from the sound pressure waveform data stored in the sound file, and a step of designating an arbitrary time on the displayed graph; in the fifth step, sound pressure waveform data corresponding to the operation time length, starting from the designated time, are extracted from the sound file.
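The designated-time extraction of claim 3 can be sketched as follows, treating the sound file as an array of samples at a fixed sampling rate (the function name and this representation are illustrative assumptions, not part of the patent):

```python
def extract_from_designated_time(samples, t_designated, Tc, fs):
    """Return the sound pressure samples for the operation time length Tc
    (seconds) starting at the time designated on the displayed graph.

    samples: sequence of sound pressure samples
    t_designated: time picked on the time-series graph (seconds)
    fs: sampling frequency (Hz)
    """
    i0 = int(round(t_designated * fs))        # index of the designated time
    n = int(round(Tc * fs))                   # number of samples in Tc
    return samples[i0 : i0 + n]
```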
[0012]
The invention according to claim 4 is a sound source estimation apparatus comprising a microphone group having two microphone pairs arranged at predetermined intervals on two straight lines intersecting each other, a photographing means for photographing an image in the sound source direction, and a display means, wherein an image in which a figure indicating the sound source direction is drawn is created from the sound pressure signals of the sound propagating from the sound source collected by the microphone group and the video signal obtained by photographing the sound source direction, and is displayed on the display means. The apparatus comprises: an A/D converter for converting the sound pressure signals collected by the respective microphones and the video signal photographed by the photographing means into digital signals; a first storage means for temporarily storing, for a predetermined period, the A/D converted sound pressure signals as sound pressure waveform data and the A/D converted video signal as image data; a means for outputting a command signal for starting estimation of the sound source direction; a second storage means for storing the sound pressure waveform data and image data used for estimating the sound source; an analysis file creation means which, when the command signal is input, extracts from the first storage means the sound pressure waveform data and image data for the analysis time length used for the estimation, creates a sound file and a moving image file, and outputs these files to the second storage means; a sound source direction estimation means which extracts from the sound file sound pressure waveform data for the operation time length used for the sound source direction estimation calculation, performs frequency analysis on the extracted sound pressure waveform data, obtains the phase difference between the microphones constituting each of the two microphone pairs, and estimates the sound source direction from the ratio of the phase differences obtained for the two microphone pairs; an image data extracting means for extracting from the moving image file the image data at a time between the start time and the end time of the estimation calculation; and a sound source position estimation image generation means for combining the data of the estimated sound source direction with the extracted image data and generating a sound source position estimation image in which a figure indicating the estimated sound source direction is drawn. The analysis file creation means extracts the sound pressure waveform data and image data stored in the first storage means from a second time, which precedes the first time at which the measurement start command was issued by a preset retrogressive time length, to the first time, and the sound pressure waveform data and image data stored in the first storage means from the first time until a third time, at which a time length equal to the analysis time length minus the retrogressive time length has elapsed from the first time, and creates the sound file and the moving image file from these data.
[0013]
The invention according to claim 5 is the sound source estimation apparatus according to claim 4, wherein a fifth microphone not on the plane formed by the two microphone pairs is added to the microphone group, and the sound source direction estimation means estimates the sound source direction using the phase differences between the microphones constituting the two microphone pairs and the phase differences between the microphones constituting four microphone pairs, each consisting of the fifth microphone and one of the four microphones constituting the two microphone pairs.
[0014]
According to the present invention, sound information and image information are collected simultaneously using a sound/image collecting unit integrally incorporating a plurality of microphones and a photographing means, and the sound pressure waveform data of the collected sound and the image data of the video are stored in a temporary storage file of the storage means. When a command to start measurement of the sound source direction is issued, the sound pressure waveform data and the image data stored in the temporary storage file are extracted over the period from the time obtained by tracing back a predetermined retrogressive time length from the time the command was issued until the time a preset analysis time length has elapsed, and these data are stored in the storage means as a sound file and a moving image file. Since the phase differences between the sound pressure signals of the sounds sampled by the plurality of microphones are calculated from the sound pressure waveform data of the stored sound file to estimate the sound source direction, even if a sudden or intermittent sound is generated, the direction of the source of these sounds can be reliably estimated.
Therefore, measurement failures can be eliminated and an increase in unnecessary measurement data can be prevented, so that the sound source position can be estimated efficiently.
Further, when estimating the sound source, the estimated sound source direction and the image data captured within the estimation calculation time are combined, and a sound source position estimation image in which a figure indicating the estimated sound source direction is drawn is generated. Since the sound source is estimated using this sound source position estimation image, the sound source can be reliably estimated.
[0015]
Further, if the sound source is estimated according to the steps described in claim 2, the source of a sudden or intermittent sound can be reliably estimated.
If the displayed time-series waveform of the sound pressure level is a time-series waveform of the magnitude of the sampled sound, the sound source position can be estimated according to the magnitude of the propagating sound.
In addition, a graph of the time-series waveform of the sound pressure is created and displayed from the sound pressure waveform data saved in the sound file, an arbitrary time on this graph is designated, and sound pressure waveform data for the operation time length starting from the designated time are extracted from the sound file to estimate the sound source direction. Since the sound pressure waveform data stored in the sound file can thus be used efficiently and effectively, the sound source position can be estimated efficiently.
[0016]
Further, by using the sound source estimation apparatus according to claim 4, the sources of sudden and intermittent sounds can be reliably estimated.
If, at this time, first to fourth microphones constituting two microphone pairs arranged at predetermined intervals on two straight lines intersecting each other and a fifth microphone not on the plane formed by the two microphone pairs are used, and the sound source direction is estimated using the ratio of the phase differences between the microphones constituting the two microphone pairs and the phase differences between the first to fifth microphones, the horizontal angle θ and the elevation angle φ can be estimated efficiently and accurately with a small number of microphones.
[0017]
The summary of the invention does not enumerate all necessary features of the present
invention, and a subcombination of these feature groups can also be an invention.
[0018]
FIG. 1 is a functional block diagram showing the configuration of a sound source estimation system according to this embodiment.
FIG. 2 is a flowchart illustrating a method of estimating a sound source according to the present invention.
FIG. 3 is a diagram for explaining the method of extracting data in the retrogressive mode.
FIG. 4 is a diagram showing an example of the display screen on which the sound source position estimation screen is displayed.
FIG. 5 is a diagram showing an example of the display screen on which a graph of the time-series waveform of the sound pressure level is displayed.
FIG. 6 is a diagram showing the arrangement of the microphones in the conventional sound source localization method using microphone pairs.
[0019]
Hereinafter, the present invention will be described in detail through an embodiment; however, the following embodiment does not limit the invention according to the claims, and not all combinations of the features described in the embodiment are necessarily essential to the solution of the invention.
[0020]
Hereinafter, embodiments of the present invention will be described based on the drawings.
FIG. 1 is a functional block diagram showing the configuration of a sound source estimation
system. The sound source estimation system includes an audio/video sampling unit 10, a control unit 20, and a sound source position estimation device 30. The sound/image collecting unit 10 includes a sound collecting means 11, a CCD camera (hereinafter referred to as a camera) 12 as an image collecting means, a microphone fixing unit 13, a camera support 14, columns 15, a rotating table 16, and a base 17. The sound collecting means 11 comprises a plurality of microphones M1 to M5. The microphones M1 to M5 are installed on the microphone fixing unit 13, the camera 12 is installed on the camera support 14, and the microphone fixing unit 13 and the camera support 14 are connected by three columns 15. That is, the sound collecting means 11 and the camera 12 are integrated. The microphones M1 to M5 are disposed above the camera 12. The base 17 is a three-legged support member, on which the turntable 16 is installed. The camera support 14 is mounted on a rotating member 16r of the rotating table 16. Therefore, by rotating the rotating member 16r, the sound collecting means 11 and the camera 12 can be rotated as a unit. The microphones M1 to M5 measure the sound pressure level of the sound transmitted from a sound source (not shown).
[0021]
The arrangement of the microphones M1 to M5 is the same as that shown in FIG. 6: four microphones M1 to M4 are arranged on two straight lines orthogonal to each other at predetermined intervals, forming a microphone pair (M1, M3) and a microphone pair (M2, M4), and the fifth microphone M5 is placed off the plane formed by the microphones M1 to M4, specifically at the apex of a quadrangular pyramid whose base is the square formed by the microphones M1 to M4. Thus, four further microphone pairs (M5, M1) to (M5, M4) are formed. In this example, the photographing direction of the camera 12 is set to a direction that passes through the intersection of the two orthogonal straight lines and forms approximately 45° with each of them. Therefore, the direction of the sound/image collecting unit 10 is the direction of the white arrow D in FIG. The camera 12 collects an image according to the direction of the sound/image collecting unit 10.
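The arrangement above can be written down as coordinates, with the M1 to M4 square in the z = 0 plane and M5 at the pyramid apex (the spacing d and height h are free parameters here, not values given in the patent):

```python
def microphone_positions(d, h):
    """Coordinates (x, y, z) of the five microphones: M1..M4 form two
    orthogonal pairs in the z = 0 plane, and M5 sits at the apex of a
    quadrangular pyramid whose base is the M1..M4 square."""
    return {
        "M1": ( d, 0.0, 0.0), "M3": (-d, 0.0, 0.0),  # pair (M1, M3)
        "M2": (0.0,  d, 0.0), "M4": (0.0, -d, 0.0),  # pair (M2, M4)
        "M5": (0.0, 0.0,  h),                        # off the M1..M4 plane
    }
```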
[0022]
The control unit 20 includes a mode switching unit 21, an amplifier 22, an A/D converter 23, an image input/output unit 24, a buffer 25 which is a first storage unit, an analysis time length setting unit 26, a retrogression time length setting unit 27, and a file creation unit 28. The sound source position estimation device 30 includes a memory 31 which is a second storage unit, a display unit 32, a sound pressure waveform data extraction unit 33, a sound source direction estimation unit 34, an image data extraction unit 35, and a data synthesis unit 36.
[0023]
The sound source estimation system of this example has two measurement modes, a normal mode and a backward (retrogressive) mode. In the normal mode, sound source estimation is performed using data for a predetermined analysis time length from the time when the measurement start signal, which is the command signal for starting estimation of the sound source direction, is input. In the retrogressive mode, sound source estimation is performed using data from a predetermined time before the input of the measurement start signal. The mode switching unit 21 includes a mode switching means 21a, a measurement start signal output unit 21b, a measurable display unit 21p, a backward-valid display unit 21q, and a measurement start switch 21S. The mode switching means 21a switches the measurement mode to either the normal mode or the backward mode, and instructs the file creation unit 28 how to extract data from the buffer 25. The measurement start signal output unit 21b outputs a measurement start signal when the measurement start switch 21S is pressed. The measurable display unit 21p indicates to the measurer, by lighting an LED or the like, that enough data for measurement are stored in the buffer 25. The backward-valid display unit 21q indicates to the measurer, by lighting an LED or the like, that enough data for the retrogression are stored in the buffer 25. In the backward mode, the measurement start signal is not output while the LED of the backward-valid display unit 21q is not lit.
[0024]
The amplifier 22 includes a low-pass filter, removes high-frequency noise components from the sound pressure signals of the sounds sampled by the microphones M1 to M5, amplifies the sound pressure signals, and outputs them to the A/D converter 23. The A/D converter 23 creates sound pressure waveform data by A/D converting the sound pressure signals, and outputs the data to the buffer 25. The video input/output means 24 receives the video signal continuously photographed by the camera 12 and outputs image data in the photographing direction to the buffer 25 at predetermined intervals (for example, every 1/30 second). The buffer 25 temporarily stores sound pressure waveform data and image data for a predetermined period. The buffer 25 consists of a first buffer 25a and a second buffer 25b. When the first buffer 25a becomes full, new sound pressure waveform data and image data are stored in the second buffer 25b. Then, when the second buffer 25b becomes full, all data stored in the first buffer 25a are erased, and new sound pressure waveform data and image data are stored in the first buffer 25a. When the sound pressure waveform data and the image data are stored in the first buffer 25a or the second buffer 25b, they may be synchronized and stored, or a well-known method can be used, such as storing time data attached to each of the sound pressure waveform data and the image data.
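The alternating behavior of the two buffers 25a and 25b can be sketched as follows (a simplified model; the sample type, capacity handling, and names are our assumptions, not the patent's implementation):

```python
class DoubleBuffer:
    """Two-buffer store: fill the first buffer, switch to the second when the
    first is full, and when the second is also full, erase the first and
    start filling it again."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.first, self.second = [], []
        self.active = self.first

    def push(self, sample):
        if self.active is self.first and len(self.first) >= self.capacity:
            self.active = self.second        # first full: switch to second
        elif self.active is self.second and len(self.second) >= self.capacity:
            self.first.clear()               # second full: erase and reuse first
            self.active = self.first
        self.active.append(sample)
```

With capacity 2, pushing samples 1..5 leaves [5] in the first buffer (erased and refilled) and [3, 4] in the second.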
[0025]
The analysis time length setting unit 26 sets the analysis time length Tw, which is the length of time over which the sound pressure waveform data and the image data are analyzed to estimate a sound source. The retrogression time length setting means 27 sets the retrogression time length Tz measured back from the first time t0, which is the time when the measurement start signal for estimating the sound source is issued. The file creation means 28 extracts from the buffer 25 the sound pressure waveform data and the image data stored in the buffer 25 between the time t1 = t0 − Tz, which is the second time, and the time t2 = t0 + (Tw − Tz), which is the third time, creates the sound file 31a from the sound pressure waveform data and the moving image file 31b from the image data, and stores the files 31a and 31b in the memory 31.
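The extraction window used by the file creation means 28 is fixed by t0, Tw, and Tz; a one-line sketch (the function name is ours):

```python
def analysis_window(t0, Tw, Tz):
    """Return the second time t1 and third time t2 bracketing the data that
    the file creation means extracts from the buffer.

    t0: first time (measurement start signal), Tw: analysis time length,
    Tz: retrogression time length (all in seconds)."""
    t1 = t0 - Tz             # second time: start of the backward data
    t2 = t0 + (Tw - Tz)      # third time: end of the residual data
    return t1, t2
```

For example, with Tw = 30 s and Tz = 10 s, a start signal at t0 = 100 s extracts the data stored between 90 s and 120 s.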
[0026]
The memory 31 stores the sound file 31a and the moving image file 31b created by the file creation means. The memory 31 is composed of RAM and is rewritable. The display means 32 has a display screen 32M comprising an image display section 32a, which displays the sound source position estimation image described later for estimating the sound source position, and a sound pressure level display section 32b, which shows the relationship between the horizontal angle θ of the sound source direction and the sound pressure level. The sound pressure waveform data extracting means 33 extracts from the sound file 31a stored in the memory 31 the sound pressure waveform data for performing the sound source direction estimation calculation, that is, sound pressure waveform data for a predetermined operation time length, and outputs the data to the sound source direction estimation means 34. The sound source direction estimating means 34 obtains the phase differences among the microphones M1 to M5 from the extracted sound pressure waveform data, estimates the sound source direction from the obtained phase differences, and outputs the estimation result to the data combining means 36. The details of the estimation of the sound source direction will be described later.
[0027]
The image data extracting means 35 extracts from the moving image file 31b stored in the memory 31 the image data at a time between the start time and the end time of the analysis, and outputs the image data to the data combining means 36. The data synthesizing means 36 combines the data of the sound source direction estimated by the sound source direction estimating means 34 with the image data output from the image data extracting means 35, creates a sound source direction estimation image in which a figure indicating the sound source direction is drawn in the image, and outputs it to the display means 32.
[0028]
Next, a method of estimating the sound source direction using the sound source estimation system will be described with reference to the flowchart of FIG. 2. First, after connecting the sound/video sampling unit 10, the control unit 20, and the sound source position estimation device 30, the sound/video sampling unit 10 is set at the measurement point (step S11). Next, after setting the analysis time length Tw and the retrogression time length Tz (step S12), the measurement mode is selected (step S13). First, the case where the measurement mode is set to the backward mode in step S13 will be described. After the measurement mode is selected, the shooting direction of the camera 12 is directed toward the planned measurement location, sounds are collected by the microphones M1 to M5, and an image of the planned measurement location is captured by the camera 12 (step S14). In this example, since the sound was generated intermittently at the planned measurement location, the sound/image collecting unit 10 was fixed during measurement. When the measurement is performed near a fountain, the entire fountain does not fit within the field of view, so sound and images may be collected while slowly reciprocating the unit around the center of the fountain at, for example, 3°/sec; a rotation range of about ±60° is appropriate. Next, the output signals of the microphones M1 to M5 and the video signal of the camera 12 are A/D converted, and the sound pressure waveform data and image data (hereinafter referred to as data) are stored in the buffer 25 (step S15).
[0029]
When data corresponding to the backward time length Tz are stored in the buffer 25, the LED of the backward-valid display unit 21q is turned on, and the measurer determines from the lit LED whether measurement in the backward mode is possible (step S16). When the LED of the backward-valid display unit 21q is not lit, data corresponding to the backward time length Tz are not yet stored in the buffer 25; since measurement in the retrogressive mode cannot be performed, storage of data is continued until the LED of the backward-valid display unit 21q is turned on. When the LED of the backward-valid display unit 21q is lit, measurement in the retrogressive mode is possible, so the measurement start signal can be output at any time.
[0030]
Next, it is determined whether the measurement start signal has been output (step S17). If the measurement start signal has not been output, data storage is continued. Even after the measurement start signal is output, the operation of storing the output signals of the microphones M1 to M5 and the video signal of the camera 12 in the buffer 25 as sound pressure waveform data and image data is continued (step S18). Then, it is determined from the lighting of the LED of the measurable display unit 21p whether data corresponding to the analysis time length Tw are stored in the buffer 25 (step S19). When data corresponding to the analysis time length Tw are not stored in the buffer 25, data storage is continued. When data corresponding to the analysis time length Tw are stored in the buffer 25, the LED of the measurable display unit 21p is turned on, so the data are extracted from the buffer 25 to create the sound file 31a and the moving image file 31b, and the files 31a and 31b are stored in the memory 31 (step S20). As shown in FIG. 3, the data extracted from the buffer 25 in step S20 consist of the backward data Dz, which are the data stored in the buffer 25 from the second time t1 = t0 − Tz to the first time t0, which is the measurement start time, and the residual measurement data Dr, which are the data stored in the buffer 25 between the first time t0 and the third time t2 = t0 + (Tw − Tz). That is, the waiting time from the output of the measurement start signal until the data for the analysis time length Tw are stored in the buffer 25 is (Tw − Tz).
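The composition of the extracted data as backward data Dz and residual measurement data Dr can be sketched as follows (the buffer is modeled as a time-to-sample mapping; this representation and the function name are illustrative assumptions):

```python
def split_extracted_data(buffer, t0, Tw, Tz):
    """Split the samples extracted from the buffer into the backward data Dz
    (from t1 = t0 - Tz up to t0) and the residual measurement data Dr
    (from t0 up to t2 = t0 + (Tw - Tz)).

    buffer: dict mapping sample time (seconds) -> sample value
    """
    t1, t2 = t0 - Tz, t0 + (Tw - Tz)
    times = sorted(buffer)
    Dz = [buffer[t] for t in times if t1 <= t < t0]   # before the start signal
    Dr = [buffer[t] for t in times if t0 <= t <= t2]  # after the start signal
    return Dz, Dr
```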
[0031]
Next, sound pressure waveform data having a preset calculation time length Tc is extracted from the sound file to estimate the sound source direction (step S21). To estimate the sound source direction, the sound pressure waveform data is frequency-analyzed by FFT, the phase difference between the microphones M1 to M5 is determined for each frequency, and the sound source direction is estimated for each frequency from the determined phase differences. In the present example, the horizontal angle θ and the elevation angle φ are obtained using the arrival time difference, a physical quantity proportional to the phase difference, instead of the phase difference itself. The method of calculating the horizontal angle θ and the elevation angle φ in step S21 is described later.
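The relationship between the per-frequency phase difference and the arrival time difference can be illustrated with two simulated microphone signals. The sampling rate, tone frequency, and delay below are example values, not taken from the patent.

```python
import numpy as np

FS = 48_000                 # assumed sampling rate (Hz)
f0 = 1000.0                 # target frequency f (Hz), example value
true_delay = 2.0e-4         # simulated arrival time difference (s)

# two microphone signals: the second one lags by true_delay
t = np.arange(FS) / FS
x1 = np.sin(2 * np.pi * f0 * t)
x2 = np.sin(2 * np.pi * f0 * (t - true_delay))

# frequency-analyze both signals by FFT and take the phase difference
# at the bin of the target frequency
X1, X2 = np.fft.rfft(x1), np.fft.rfft(x2)
k = int(round(f0 * len(t) / FS))          # FFT bin of f0
psi = np.angle(X1[k] * np.conj(X2[k]))    # phase difference (rad)
D = psi / (2 * np.pi * f0)                # arrival time difference (s)
assert abs(D - true_delay) < 1e-9
```

Because the phase difference is proportional to the delay at each frequency, dividing by 2πf recovers the arrival time difference used in place of the raw phase.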
[0032]
Next, the image data Gc located at the central time tm of the calculation period, that is, the time tm = tc + (Tc / 2), half the calculation time length Tc after the calculation start time tc, is extracted from the moving image file (step S22). The calculated sound source direction data (θf, φf) for each frequency and the image data Gc are synthesized, and the resulting sound source position estimation screen is displayed on the image display unit 32a provided on the display screen 32M of the display means 32 (step S23). FIG. 4 shows an example. A sound source position estimation screen 38, on which a figure 37 (a mesh-patterned circle) representing a sound source direction is drawn, is displayed on the image display section 32a. The horizontal axis of the sound source position estimation screen 38 is the horizontal angle θf, and the vertical axis is the elevation angle φf. The size of each circle represents the sound pressure level. It is also possible to display the estimated sound source direction for each preset frequency band; in this case, the color of the mesh-patterned circle 37 may be set for each frequency band. The sound pressure level display unit 32b displays a sound pressure level display screen 39 whose horizontal axis is the horizontal angle θ (deg.) and which shows the sound pressure level (dB). Finally, the sound source is estimated from the sound source position estimation screen 38 (step S24). On the sound source position estimation screen 38, the object shown in the portion where the figure 37 representing the sound source direction is drawn is estimated to be the sound source.
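Selecting the frame at the central time tm of the calculation window is a one-line computation; a minimal sketch follows, where the frame rate is an assumption (the patent does not state one).

```python
def central_frame_index(tc, Tc, frame_rate):
    """Index of the frame at tm = tc + Tc/2, the center of the
    calculation window, used to pick the image data Gc from the
    moving image file. The frame rate is an assumed parameter."""
    tm = tc + Tc / 2.0                 # central time of the window
    return int(round(tm * frame_rate))

# e.g. a calculation window starting at 2.0 s with Tc = 1.0 s,
# at an assumed 30 frames per second
assert central_frame_index(tc=2.0, Tc=1.0, frame_rate=30) == 75
```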
[0033]
When the normal mode is selected in step S13, the process proceeds to step S25: the shooting direction of the camera 12 is aimed at the planned measurement location, sound is collected by the microphones M1 to M5, the planned measurement location is photographed by the camera 12, the output signals of the microphones M1 to M5 and the video signal of the camera 12 are A/D converted, and the sound pressure waveform data and the image data are stored in the buffer 25. Next, it is determined whether a measurement start signal has been output (step S26). If the measurement start signal has not been output, data storage is continued. Even after the measurement start signal is output, the operation of storing the output signals of the microphones M1 to M5 and the video signal of the camera 12 in the buffer 25 as sound pressure waveform data and image data is continued (step S27). Then, whether data corresponding to the analysis time length Tw has been stored in the buffer 25 is determined from the lighting of the LED of the measurable display portion 21p (step S28). While data corresponding to the analysis time length Tw has not yet been stored in the buffer 25, data storage continues. When data corresponding to the analysis time length Tw has been stored in the buffer 25, the LED of the measurable display unit 21p is lit, so the data is extracted from the buffer 25 to create the sound file 31a and the moving image file 31b, and the files 31a and 31b are stored in the memory 31 (step S29). In step S29, the data extracted from the buffer 25 is the data stored in the buffer 25 from the measurement start time t0 until the time tw = t0 + Tw, at which the analysis time length Tw has elapsed. The subsequent processing steps of estimating the sound source direction, extracting the image data, creating and displaying the sound source position estimation screen, and estimating the sound source are the same in the normal mode and the backward mode; therefore, after the sound file 31a and the moving image file 31b are stored in the memory 31, the process proceeds to step S21, and the processing from step S21 to step S24 is performed to estimate the sound source.
[0034]
The calculation method of the horizontal angle θ and the elevation angle φ in step S21 is as follows. Assuming that the arrival time difference between the microphone Mi and the microphone Mj of each microphone pair (Mi, Mj) is Dij, the horizontal angle θ and the elevation angle φ, which define the sound incident direction, are given by the following equations (1) and (2). Since the output signals of the microphones M1 to M5 are frequency-analyzed using FFT and the arrival time difference Dij between the microphones Mi and Mj at the target frequency f is calculated, the horizontal angle θ and the elevation angle φ can be obtained. That is, the horizontal angle θ between the measurement point and the sound source position is estimated from the ratio of the arrival time difference D13 of the sound pressure signals input to the microphones M1 and M3, constituting the microphone pair (M1, M3), to the arrival time difference D24 of the sound pressure signals input to the microphones M2 and M4, constituting the microphone pair (M2, M4), the two pairs being arranged at predetermined intervals on two mutually orthogonal straight lines. The elevation angle φ formed between the measurement point and the sound source position is estimated from the arrival time differences D13 and D24 together with the arrival time differences D5j (j = 1 to 4) between the fifth microphone M5 and the other microphones M1 to M4.
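The ratio-based estimate of the horizontal angle can be sketched as follows. Equations (1) and (2) of the patent are not reproduced in the text, so the exact form below is a common far-field assumption, not the patent's formula: with the pairs (M1, M3) and (M2, M4) on two orthogonal axes, their arrival time differences are proportional to cos θ and sin θ respectively.

```python
import math

def horizontal_angle_deg(d13, d24):
    """Far-field sketch: the delays of the two orthogonal pairs are
    proportional to cos(theta) and sin(theta), so their ratio yields
    the horizontal angle. This form is an assumption, since the
    patent's equations (1)-(2) are not reproduced in the text."""
    return math.degrees(math.atan2(d24, d13))

# demo: a distant source at 30 degrees produces delays proportional
# to cos(30 deg) and sin(30 deg)
c, L = 343.0, 0.2                      # assumed speed of sound (m/s) and pair spacing (m)
theta_true = math.radians(30.0)
d13 = (L / c) * math.cos(theta_true)
d24 = (L / c) * math.sin(theta_true)
assert abs(horizontal_angle_deg(d13, d24) - 30.0) < 1e-9
```

Using atan2 of the ratio keeps the quadrant information that a plain division D24/D13 would lose.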
[0035]
The arrival time difference Dij is obtained by computing the cross spectrum Pij(f) of the signals input to the microphone pair (Mi, Mj) and then using the phase angle Ψ (rad) at the target frequency f, as given by the following equation (3). The sound source direction is estimated for each set of image data in the shooting direction stored at predetermined time intervals.
[0036]
As described above, in the present embodiment, sound and video are simultaneously sampled using the audio/video sampling unit 10, in which the plurality of microphones M1 to M5 and the camera 12 are integrated, and are then A/D converted and buffered. When the command of the measurement start signal is issued from the control unit 20, the sound pressure waveform data and the image data stored in the buffer 25 between the second time t1 = t0 − Tz, which precedes by a predetermined backward time length Tz the first time t0 at which the command is issued, and the third time t2 = t0 + (Tw − Tz) are extracted; the sound file 31a is created from the sound pressure waveform data and the moving image file 31b from the image data, and the files 31a and 31b are stored in the memory 31. Using the sound pressure waveform data of the stored sound file 31a, the phase differences between the sound pressure signals of the sounds collected by the microphones M1 to M5 are calculated to estimate the horizontal angle θ and the elevation angle φ as the sound source direction. Therefore, even when a sudden sound or an intermittent sound occurs, the direction of its source can be reliably estimated, and measurement failures can be eliminated. Also, when estimating the sound source, the estimated sound source direction (θ, φ) and the image data Gc captured within the calculation time length Tc are synthesized, and the sound source position estimation screen 38, on which the figure 37 indicating the estimated sound source direction is drawn, is displayed; the sound source can therefore be reliably estimated.
[0037]
In the above embodiment, the sound source direction was estimated from the beginning of the sound file. Alternatively, the sound pressure waveform data may be read from the sound file, a graph of the time-series waveform of the sound pressure level created, and the time position of the sound pressure waveform data of calculation time length Tc to be extracted from the sound file specified on this graph. FIG. 5 shows an example of such a graph. When the graph of the time-series waveform of sound pressure levels is displayed on the display screen 32M, it appears on the sound pressure level display unit 32b, where the sound pressure level display screen 39 is otherwise shown. The horizontal axis of the graph is time (seconds), and the vertical axis is sound pressure level (dB). By specifying a particular point on the graph (here, a peak 3 seconds after the start of measurement), the time position at which the analysis of the sound source direction is performed can be specified. The graph of the time-series waveform may show the time variation of the sound pressure level over all frequencies, or it may show the time variation of the sound pressure level in a preset frequency band. In FIG. 5, the sound pressure level increases with every water spouting, so by designating a peak on the graph and estimating the sound source direction, it can be estimated from which part of the fountain the sound of the designated peak was generated.
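The peak designation described above amounts to selecting a time index on the level series; a minimal sketch follows, where the frame rate of the level series and the example values are assumptions.

```python
import numpy as np

def peak_time_seconds(levels_db, frames_per_second):
    """Time position of the highest sound pressure level in the
    time-series graph, i.e. the point a user would designate to
    choose where the direction analysis of length Tc starts.
    The frame rate of the level series is an assumed parameter."""
    return int(np.argmax(levels_db)) / frames_per_second

# demo: level series at 10 values per second with a burst at 3 s,
# like the fountain peaks shown in FIG. 5
levels = np.full(100, 60.0)
levels[30] = 85.0                      # peak 3 s after measurement start
assert peak_time_seconds(levels, 10) == 3.0
```

In an interactive implementation the user would click the peak rather than take the global maximum, but the mapping from graph position to analysis start time is the same.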
[0038]
Further, in the above example, the horizontal angle θ and the elevation angle φ formed between the measurement point and the sound source position are estimated using the five microphones M1 to M5. However, when the horizontal angle θ alone suffices to locate the sound source position, the microphone M5 may be omitted, and only the two microphone pairs (M1, M3) and (M2, M4), disposed at predetermined intervals on two mutually intersecting straight lines, may be used.
[0039]
While the present invention has been described above using an embodiment, the technical scope of the invention is not limited to the scope described in that embodiment. It will be obvious to those skilled in the art that various changes or modifications can be made to the above embodiment, and it is apparent from the scope of the claims that embodiments incorporating such alterations or improvements are also included in the technical scope of the present invention.
[0040]
As described above, according to the present invention, the sound source can be reliably estimated even when a sudden sound or an intermittent sound is generated, so that measurement failures can be eliminated and estimation can be performed efficiently.
[0041]
DESCRIPTION OF SYMBOLS: 10 sound/image sampling unit, 11 sound collecting means, M1 to M5 microphones, 12 CCD camera, 13 microphone fixing part, 14 camera support, 15 post, 16 rotating base, 17 base, 20 control unit, 21 mode switching means, 21a mode switching unit, 21b measurement start signal output unit, 21p measurable display unit, 21q retroactive validity display unit, 22 amplifier, 23 A/D converter, 24 image input/output means, 25 buffer, 25a first buffer, 25b second buffer, 26 analysis time length setting means, 27 backward time length setting means, 28 file creation means, 30 sound source position estimation device, 31 memory, 31a sound file, 31b moving image file, 32 display means, 32M display screen, 32a image display unit, 32b sound pressure level display unit, 33 sound pressure waveform data extraction means, 34 sound source direction estimation means, 35 image data extraction means, 36 data synthesis means, 37 figure representing sound source direction, 38 sound source position estimation screen, 39 sound pressure level display screen.