JP2012133250
Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2012133250
A user can intuitively determine the position of a sound source and the information of the sound
emitted from the sound source. A microphone 101 for obtaining sound collection information
(sound signal) is disposed on the front of a transmissive head mounted display. The signal
processing unit 104 generates display data based on the sound collection information of the
microphone 101. Based on the display data, the display unit 105 superimposes and displays
sound information on the visual image at a position in the visual image corresponding to the
position information. For example, the level information of the sound emitted from the sound
source is displayed in the size of a circle. Also, the frequency information of the sound emitted
from the sound source is displayed in the color attached to the circle. When the sound source
position is in the visual image, sound information is displayed at or near the sound source
position. On the other hand, when the sound source position is out of the display surface, sound
information is displayed at the end of the visual image near the sound source position. [Selected
figure] Figure 3
Sound information display device, sound information display method and program
[0001]
The present invention relates to a sound information display device, a sound information display
method, and a program, and more particularly to a sound information display device that
displays sound information of a sound source superimposed on a visual image on a head-mounted display (HMD).
[0002]
03-05-2019
1
Patent Document 1 describes a technique for estimating the sound source position and the sound
intensity with a plurality of microphones.
Further, Patent Document 2 describes a head mount display for a hearing impaired person who
recognizes the direction of arrival of a sound source by a plurality of microphones and outputs a
word or an onomatopoeia corresponding to the recognition result as character information.
Patent Document 3 describes a visualization device that extracts sound information from an
input video, classifies and identifies what the sound is, and associates the sound with an image in
the video.
[0003]
Japanese Patent Application Laid-Open No. 2004-021031
Japanese Patent Application Laid-Open No. 2007-334149
Japanese Patent Application Laid-Open No. 08-179791
[0004]
A technique for estimating the sound source direction and a technique for obtaining the sound
intensity therefrom (see Patent Document 1) have already been established and are mainly used
for measurement.
However, since the field of measurement requires precision, such apparatuses currently tend to
be large-scale. On the other hand, such a sound source direction estimation technique could also
serve as a sound source search tool for the general public and as a hearing assistance tool for
hearing-impaired people.
[0005]
The technology described in the above-mentioned Patent Document 2 is a glasses-type head-mounted
display designed with general pedestrians in mind. However, the direction of arrival of
the sound source is displayed as text, which is not intuitive. Furthermore, although the
recognition result is expressed using onomatopoeia, the verbal expression of a given sound may
differ from person to person.
[0006]
Further, the technology described in the above-mentioned Patent Document 3 requires a huge
database, and it is difficult to identify a sound generator that is hidden in a video or a sound
generator that comes from outside the video.
[0007]
An object of the present invention is to enable a user to intuitively determine the position of a
sound source and the information of the sound emitted from the sound source.
[0008]
The concept of the present invention includes: a plurality of sound collecting units; an
information acquiring unit that acquires position information and sound information of a sound
source based on the sound collection information of the plurality of sound collecting units; a
display data generation unit that generates display data for displaying the sound information,
superimposed on a visual image, at a position in the visual image corresponding to the position
information acquired by the information acquiring unit; and a head-mounted image display unit
that, based on the display data generated by the display data generation unit, displays the
sound information of the sound source superimposed on the visual image at the position
corresponding to the sound source in the visual image.
[0009]
In the present invention, a plurality of sound collecting units, for example, a plurality of
microphones are provided.
The information acquisition unit acquires the position information and the sound information of
the sound source based on the sound collection information of the plurality of sound collection
units.
For example, the sound information may be level information of the sound emitted from the
sound source or frequency information of the sound emitted from the sound source.
[0010]
The display data generation unit generates display data for displaying the sound information
acquired by the information acquisition unit.
The display data is generated so as to display sound information superimposed on the visual
image at a position in the visual image corresponding to the position information acquired by the
information acquisition unit.
[0011]
For example, this display data is generated so as to display the level information of the sound
emitted from the sound source as the size of a predetermined shape such as a circle. In this
case, it is possible to determine a sense of perspective: for example, it can be determined from
a gradually growing circle that the sound source is approaching. Also, for example, this display
data is generated so as to display the frequency information of the sound emitted from the sound
source as a color given to a predetermined shape such as a circle. In this case, a specific
sound source can be searched for based on the color.
[0012]
Based on the display data generated by the display data generation unit, the sound information
of the sound source is displayed on the head mounted image display unit. In this case, the sound
information is displayed superimposed on the visual image at a position corresponding to the
sound source in the visual image. For example, when the sound source position is in the visual
image, sound information is displayed at or near the sound source position. Also, for example,
when the sound source position is out of the visual image, the sound information is displayed at
the end of the visual image near the sound source position. For example, the head-mounted
image display unit is a transmissive image display unit. In this case, the visual image is, for
example, a real image that the user can observe through the image display unit.
[0013]
Further, for example, the head-mounted image display unit is a non-transmissive image display
unit. In this case, an imaging unit for obtaining video data of a visual image is disposed in the
image display unit, and the display data generated by the display data generation unit is
superimposed on the video data obtained by the imaging unit. Then, based on the superimposed
data, a visual image is displayed on the image display unit, and sound information of the sound
source is superimposed and displayed at a position corresponding to the sound source in the
visual image. That is, the visual image in this case is a display image displayed on the image
display unit.
[0014]
As described above, in the present invention, since the sound information of the sound source is
superimposed and displayed at the position corresponding to the sound source in the visual
image, the user can intuitively determine the position of the sound source and the information
of the sound emitted from it. Further, in the present invention, the position information and
sound information of the sound source are acquired based on the sound collection information of
the plurality of sound collection units, so that even for a sound source hidden behind a
predetermined object in the visual image, the sound information of the sound source can be
superimposed and displayed at the position corresponding to the sound source.
[0015]
In the present invention, for example, the plurality of sound collecting units are disposed in
the image display unit such that the surface formed by their arrangement positions is not
orthogonal to the display surface of the image display unit. In this case, the sound source
position can easily be obtained on the display surface of the image display unit, that is, on a
two-dimensional plane. For example, the surface formed by the arrangement positions of the
plurality of sound collecting units may be parallel to the display surface of the image display
unit. In this case, the calculation for acquiring the sound source position on the display
surface is simplified.
[0016]
Further, in the present invention, for example, the plurality of sound collecting units are
composed of a plurality of omnidirectional sound collecting units and a plurality of directional
sound collecting units. The information acquiring unit acquires first direction information of
the sound source based on the sound collection information of the plurality of omnidirectional
sound collecting units, controls the directivity directions of the plurality of directional
sound collecting units based on this first direction information, acquires second direction
information of the sound source from the arrangement positions and sound collection information
of the plurality of directional sound collecting units, and acquires the position information of
the sound source based on the second direction information. In this case, the acquisition
accuracy of the position information of the sound source can be improved without increasing the
number of sound collection units.
[0017]
Furthermore, in the present invention, for example, the information acquisition unit further
includes a sound source identification unit that identifies the sound sources for which position
information and sound information are to be acquired. For example, the sound source
identification unit specifies a target sound source by its frequency. In this case, only the
sound information of the specific sound source is displayed in the visual image, which makes the
search for that sound source easy.
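The frequency-based identification above can be sketched as a simple filter over the set of detected sources. This is an illustrative sketch only; the tuple layout and the default band are assumptions, not values from the patent.

```python
def select_sources(sources, band=(600.0, 1200.0)):
    """Sound-source identification by frequency: of all detected sources,
    keep only those whose dominant frequency lies in the target band.
    Each source is a (position, level, freq_hz) tuple; both the tuple
    layout and the default band are illustrative assumptions."""
    lo, hi = band
    return [s for s in sources if lo <= s[2] <= hi]
```

With this in place, the display stage would receive only the sources of interest, so only their marks appear in the visual image.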
[0018]
According to the present invention, since the sound information of the sound source is
superimposed and displayed at the position corresponding to the sound source in the visual
image, the user can intuitively determine the position of the sound source and the information
of the sound emitted from it. For example, it becomes possible to provide visual hearing
assistance for a hearing-impaired person or the like.
[0019]
FIG. 1 is a diagram showing the appearance of a transmissive head-mounted display (transmissive HMD) as a first embodiment of the invention.
FIG. 2 is a diagram for explaining the relationship between the surface formed by the arrangement positions of the four microphones and the display surface.
FIG. 3 is a block diagram showing a configuration example of the transmissive HMD system as the first embodiment of the invention.
FIG. 4 is a block diagram showing a configuration example of the signal processing unit constituting the transmissive HMD system.
FIG. 5 is a diagram for explaining an example of the method of calculating the arrival angle required for sound source position detection.
FIG. 6 is a flowchart showing the processing procedure of the signal processing unit constituting the transmissive HMD system.
FIG. 7 is a diagram showing a display example in which sound information is superimposed and displayed at the position corresponding to the sound source in the visual image.
FIG. 8 is a diagram for explaining another example of microphone arrangement on the transmissive HMD.
FIG. 9 is a diagram showing the appearance of a transmissive head-mounted display (transmissive HMD) as a second embodiment of the invention.
FIG. 10 is a block diagram showing a configuration example of the transmissive HMD system as the second embodiment of the invention.
FIG. 11 is a block diagram showing a configuration example of the signal processing unit constituting that transmissive HMD system.
FIG. 12 is a flowchart showing the processing procedure of the signal processing unit constituting that transmissive HMD system.
FIG. 13 is a diagram showing the appearance of a non-transmissive head-mounted display (non-transmissive HMD) as a third embodiment of the invention.
FIG. 14 is a block diagram showing a configuration example of the non-transmissive HMD system as the third embodiment of the invention.
FIG. 15 is a diagram for explaining another example of arrangement of the imaging element (camera) on the non-transmissive HMD.
FIG. 16 is a block diagram showing a configuration example of the transmissive HMD system as a fourth embodiment of the invention.
[0020]
Hereinafter, modes for carrying out the invention (hereinafter referred to as "embodiments")
will be described. The description will be made in the following order. 1. First embodiment
2. Second embodiment 3. Third embodiment 4. Fourth embodiment 5. Modified example
[0021]
<1. First Embodiment> [Configuration Example of Transmission-Type Head-Mounted Display]
FIG. 1 shows the appearance of a transmission-type head-mounted display (transmission-type
HMD) 10 according to a first embodiment. Four omnidirectional microphones 101 are disposed
on the front surface of the transmission HMD 10. In order to improve the accuracy of the sound
source position detection, the microphones are disposed at regular intervals. Here, the
microphone 101 constitutes a sound collection unit.
[0022]
In this case, the surface formed by the arrangement positions of the four microphones 101 is
not orthogonal to the display surface of the transmissive HMD 10. That is, among the four
microphones 101 there are some whose arrangement positions differ in the horizontal direction
of the display surface and some whose arrangement positions differ in the vertical direction of
the display surface.
[0023]
FIG. 2A shows this state. In FIG. 2A, the solid square schematically represents the display
surface SFa, and the broken square represents the surface SFb formed by the arrangement
positions of the four microphones 101, shown as projected onto the display surface SFa.
Therefore, it becomes easy to obtain the sound source position on the display surface SFa, that
is, on the two-dimensional plane, based on the sound collection information (sound signals) of
the four microphones 101, as described later.
[0024]
As shown in FIG. 2B, four microphones 101 may be disposed in the transmissive HMD 10 such
that the surface SFb is parallel to the display surface SFa. In this case, the calculation at the time
of acquiring the sound source position on the display surface SFa, that is, on the two-dimensional
plane is simplified based on the sound collection information of the four microphones 101
described later.
[0025]
FIG. 3 shows the system configuration of a transmissive HMD 10 according to the first
embodiment. The transmission type HMD 10 includes four microphones 101, an amplifier 102,
an A / D converter 103, a signal processing unit 104, and a display unit 105. The amplifier 102
amplifies sound collection information (sound signal) of the four microphones 101. The A / D
converter 103 converts the sound collection information (sound signal) of the four microphones
101 amplified by the amplifier 102 from an analog signal to a digital signal.
[0026]
The signal processing unit 104 acquires the position information and sound information of the
sound source based on the sound collection information (sound signals) of the four microphones
101 obtained by the A/D converter 103, and generates display data for displaying the sound
information. The display data is for displaying the sound information superimposed on the visual
image at a position in the visual image corresponding to the position information. Here, the
position in the visual image corresponding to the position information includes not only the
position indicated by the position information but also positions near it. By allowing a nearby
position, for example, the user can view the sound source image in the visual image without
being disturbed by the display of the sound information. The signal processing unit 104
constitutes an information acquisition unit and a display data generation unit.
[0027]
The sound information is, in this embodiment, level information and frequency information of
sound coming from the sound source. The signal processing unit 104 generates display data so
as to display the level information of the sound emitted from the sound source in a
predetermined shape, in this embodiment, the size of a circle (including an ellipse). Further, the
signal processing unit 104 generates display data so as to display the frequency information of
the sound emitted from the sound source in the color attached to the circle described above.
[0028]
Based on the display data generated by the signal processing unit 104, the display unit 105
superimposes the sound information of the sound source on the visual image and displays it at
the position corresponding to the sound source in the visual image. When the sound source
position is in the visual image, the sound information is displayed at or near the sound source
position. On the other hand, when the sound source position is out of the visual image, the
sound information is displayed at the end of the visual image near the sound source position.
The display unit 105 is configured of, for example, a display with a transmissive structure in
which the backlight unit has been removed from a liquid crystal display (LCD).
[0029]
FIG. 4 shows the detailed configuration of the signal processing unit 104. The signal processing
unit 104 includes a digital filter 111, a gain adjustment unit 112, a sound source position
detection unit 113, a level analysis unit 114, a frequency analysis unit 115, and a display data
generation unit 116. The digital filter 111 performs filter processing for removing or reducing
frequency components such as wind noise and rubbing noise included in the sound collection
information (sound signals) S1 to S4 of the four microphones 101. These frequency components
adversely affect the sound source position detection process and the like.
[0030]
The digital filter 111 is configured by, for example, an FIR (Finite Impulse Response) filter or
an IIR (Infinite Impulse Response) filter. For example, the digital filter 111 constitutes a
high-pass filter or a band-pass filter. Also, for example, the digital filter 111 may constitute
a notch filter that blocks part of the band when there is noise of a specific frequency that
should not be detected.
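A minimal sketch of such a filter, assuming a first-order IIR high-pass; the sampling rate and cutoff in the usage note are illustrative choices, and a real implementation would use a properly designed FIR/IIR filter as described above.

```python
import math

def highpass(samples, fs, cutoff_hz):
    """First-order IIR high-pass filter: attenuates low-frequency
    components such as wind noise while passing higher frequencies."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / fs
    alpha = rc / (rc + dt)                   # closer to 1 -> lower cutoff
    out, prev_x, prev_y = [], samples[0], 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)    # standard RC high-pass recurrence
        out.append(y)
        prev_x, prev_y = x, y
    return out
```

For example, `highpass(signal, 8000, 100.0)` would suppress sub-100 Hz rumble while leaving speech-band content largely intact.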
[0031]
The gain adjustment unit 112 cuts low-level signals, such as reflected sound or stationary
noise, from the sound collection information (sound signals) S1 to S4 of the four microphones
101. These low-level signals adversely affect the sound source position detection process and
the like. The gain adjustment unit 112 is configured of, for example, an automatic gain control
circuit that performs gain control according to the input signal level.
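A simple level-tracking gate conveys the idea; the threshold and smoothing coefficient are illustrative assumptions, and an actual automatic gain control circuit would adapt gain continuously rather than gate.

```python
def noise_gate(samples, threshold, attack=0.9):
    """Suppress low-level content such as reflected sound or stationary
    noise: track a smoothed level estimate and zero samples while it
    stays below the threshold (threshold and smoothing are assumed)."""
    out, env = [], 0.0
    for x in samples:
        env = attack * env + (1.0 - attack) * abs(x)  # level follower
        out.append(x if env >= threshold else 0.0)
    return out
```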
[0032]
The sound source position detection unit 113 detects the position information of the sound
source based on the sound collection information (sound signals) S1 to S4 of the four
microphones 101. The sound source position detection unit 113 detects position information with
the display surface of the transmissive HMD 10, that is, the display surface (two-dimensional
plane) of the display unit 105, as an XY coordinate system. The sound source position detection
process in the sound source position detection unit 113 is divided into three stages: (1)
determining the arrival time difference of the sound from the sound source to each microphone
101, (2) calculating the arrival angle of the sound from the sound source, and (3) estimating
the sound source position. The processing of each stage will be described below.
[0033]
(1) The process of obtaining the arrival time difference will be described. This process is
performed by a conventionally known method and is not described in detail here. For example, a
method using a cross-correlation function, or the Fourier-transform-based CSP (Cross-power
Spectrum Phase analysis) method, is generally used.
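A direct sketch of the cross-correlation approach named above, finding the lag that maximizes the correlation between two channels. It is deliberately un-optimized (O(n²)); a practical system would use the FFT-based CSP method instead.

```python
def arrival_time_difference(sig_a, sig_b, fs):
    """Estimate the arrival time difference between two microphone
    signals from the peak of their cross-correlation.  Returns seconds;
    a positive value means sig_b lags sig_a."""
    n = len(sig_a)
    best_lag, best_val = 0, float("-inf")
    for lag in range(-n + 1, n):
        acc = sum(sig_a[i] * sig_b[i + lag]
                  for i in range(n) if 0 <= i + lag < n)
        if acc > best_val:
            best_val, best_lag = acc, lag
    return best_lag / fs
```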
[0034]
(2) The process of calculating the arrival angle will be described. The sound source position
detection unit 113 performs an arrival angle calculation process on each of the pair of
microphones extracted from the four microphones 101. FIG. 5A shows an example of the arrival
angle calculation method. The arrival time difference between the pair of microphones M1 and
M2 is τ, as shown in FIG. 5 (b). When the distance between the microphones M1 and M2 is d
and the sound speed is c, the arrival angle θ of the sound from the sound source is calculated by
the following equation (1).
[0035]
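Equation (1) is an image in the original document and is not reproduced by the machine translation. From the quantities defined above (arrival time difference τ, microphone spacing d, sound speed c), the standard far-field relation is most likely intended, though whether the angle is measured from broadside (sin⁻¹) or from the microphone axis (cos⁻¹) cannot be confirmed from the translation:

```latex
\theta = \sin^{-1}\!\left(\frac{c\,\tau}{d}\right) \qquad (1)
```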
[0036]
(3) The process of estimating the sound source position will be described.
The sound source position detection unit 113 combines the arrival angles calculated for each
pair of microphones to estimate the position of the sound source, that is, its position on the
two-dimensional plane including the display surface. The sound source position estimated in
this way falls into one of two kinds: a position within the display surface (within the visual
image) or a position deviating from the display surface (outside the visual image).
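The translation does not spell out how the per-pair arrival angles are combined. One plausible sketch, under the assumption that each microphone pair yields a bearing line on the display plane, intersects two such lines:

```python
import math

def intersect_bearings(p1, theta1, p2, theta2):
    """Estimate a 2-D sound-source position from two arrival angles.
    p1 and p2 are the midpoints of two microphone pairs on the display
    plane; theta1 and theta2 are bearings in radians from the x-axis.
    This is an illustrative stand-in for the combination step, not the
    patent's exact method."""
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    denom = d1[0] * d2[1] - d1[1] * d2[0]    # 2x2 determinant
    if abs(denom) < 1e-12:
        return None                          # bearings (nearly) parallel
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t = (dx * d2[1] - dy * d2[0]) / denom    # solve p1 + t*d1 = p2 + s*d2
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])
```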
[0037]
The sound source position detection process in the sound source position detection unit 113
described above is a process using the arrival time difference of the sound from the sound
source. However, the sound source position detection process in the sound source position
detection unit 113 may be another process, for example, a process using an amplitude
characteristic and a phase characteristic (see Patent Document 1).
[0038]
Referring back to FIG. 4, the level analysis unit 114 analyzes the level (loudness) of the sound
from each sound source whose position has been detected by the sound source position detection
unit 113, and obtains level information as the sound information of that sound source. As
described above, arrival time differences occur in the sound traveling from the sound source to
each microphone 101. The level analysis unit 114 adds the sound collection information (sound
signals) S1 to S4 of the four microphones 101 in consideration of the arrival time differences,
and obtains the level information of the sound source based on the added signal.
[0039]
The frequency analysis unit 115 analyzes the frequency of the sound from the sound source for
each sound source whose sound source position is detected by the sound source position
detection unit 113, and obtains frequency information as sound information of the sound source.
For example, the frequency analysis unit 115 can perform frequency analysis using a plurality of
digital filters that extract frequency components for each type of sound source. Also, for example,
the frequency analysis unit 115 can perform frequency analysis on the sound from the sound
source by FFT (Fast Fourier Transform) processing.
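The FFT-based analysis mentioned above can be illustrated with a naive DFT that returns the strongest frequency component. A real implementation would use an FFT; this direct form is only for clarity.

```python
import math

def dominant_frequency(samples, fs):
    """Return the strongest frequency component (Hz) of a signal via a
    naive DFT: skip DC, scan bins up to Nyquist, keep the largest
    magnitude."""
    n = len(samples)
    best_k, best_mag = 0, -1.0
    for k in range(1, n // 2):
        re = sum(samples[i] * math.cos(2 * math.pi * k * i / n) for i in range(n))
        im = -sum(samples[i] * math.sin(2 * math.pi * k * i / n) for i in range(n))
        mag = re * re + im * im
        if mag > best_mag:
            best_mag, best_k = mag, k
    return best_k * fs / n
```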
[0040]
The display data generation unit 116 generates display data Dds for displaying sound
information superimposed on the visual image at a position in the visual image corresponding to
the position information. The display data generation unit 116 generates the display data Dds
from the position information of the sound source detected by the sound source position
detection unit 113, the level information of the sound from the sound source obtained by the
level analysis unit 114, and the frequency information of the sound from the sound source
obtained by the frequency analysis unit 115.
[0041]
The display data generation unit 116 generates display data Dds so as to display the level
information of the sound emitted from the sound source in the size of a circle. In this case, the
larger the level, the larger the circle. Further, the display data generation unit 116 generates the
display data Dds so as to display the frequency information of the sound emitted from the sound
source in the color attached to the circle. Thus, when the frequency components of the sound are
different for each type of sound source, each type of sound source can be identified by the color
attached to the circle.
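The level-to-size and frequency-to-color mapping can be sketched as follows. The radius scale and the frequency-to-color bands are illustrative assumptions; the patent states only that a larger level means a larger circle and that frequency is shown as the circle's color.

```python
def circle_for_source(level, freq_hz, min_r=5, max_r=60):
    """Map a normalized sound level in [0, 1] to a circle radius, and a
    dominant frequency to one of three example colours."""
    level = max(0.0, min(1.0, level))
    radius = min_r + (max_r - min_r) * level
    if freq_hz < 300:
        color = "red"       # low-frequency sources (e.g. engines) - assumed band
    elif freq_hz < 2000:
        color = "green"     # mid range (e.g. speech) - assumed band
    else:
        color = "blue"      # high range (e.g. alarms) - assumed band
    return radius, color
```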
[0042]
As described above, the position information of the sound source detected by the sound source
position detection unit 113 indicates one of two cases: a position within the display surface
(within the visual image) or a position outside the display surface (outside the visual image).
When the sound source position is within the display surface (within the visual image), the
display data generation unit 116 generates the display data Dds so as to display the sound
information at or near the sound source position. When the sound source position is outside the
display surface (outside the visual image), the display data generation unit 116 generates the
display data Dds so as to display the sound information at the end of the display surface
(visual image) nearest the sound source position.
[0043]
The processing of each unit of the signal processing unit 104 shown in FIG. 4 is executed by
software processing by a computer (CPU), for example. In this case, the computer functions as
each unit of the signal processing unit 104 shown in FIG. 4 based on the processing program. Of
course, part or all of each part of the signal processing unit 104 shown in FIG. 4 can be
configured by hardware.
[0044]
The flowchart of FIG. 6 shows the processing procedure of the signal processing unit 104 shown
in FIG. The signal processing unit 104 periodically repeats this processing procedure to
sequentially update the display data Dds. The signal processing unit 104 starts processing in step
ST1, and then proceeds to processing in step ST2.
[0045]
In step ST2, the signal processing unit 104 performs filter processing for removing or reducing
frequency components, such as wind noise and clothes-rubbing noise, contained in the sound
collection information (sound signals) S1 to S4 of the four microphones 101. Then, in step ST3,
the signal processing unit 104 performs gain adjustment processing that cuts low-level signals,
such as reflected sound or stationary noise, from the sound collection information (sound
signals) S1 to S4 of the four microphones 101.
[0046]
Next, in step ST4, the signal processing unit 104 detects the position information of the sound
source based on the sound collection information (sound signals) S1 to S4 of the four
microphones 101. Further, in step ST5, the signal processing unit 104 analyzes, for each sound
source whose sound source position has been detected, the level of the sound from that sound
source (sound magnitude), and obtains level information as sound information of the sound
source. Furthermore, in step ST6, the signal processing unit 104 analyzes the frequency of the
sound from the sound source for each sound source whose sound source position has been
detected, and obtains frequency information as sound information of the sound source.
[0047]
Next, in step ST7, the signal processing unit 104 generates display data based on the position
information of the sound source obtained in step ST4, the level information of the sound from
the sound source obtained in step ST5, and the frequency information of the sound from the
sound source obtained in step ST6. That is, in step ST7, the signal processing unit 104
generates display data for displaying the sound information superimposed on the visual image at
a position in the visual image corresponding to the position information. After the process of
step ST7, the signal processing unit 104 ends the processing in step ST8.
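The whole ST2-ST7 pass can be sketched in miniature for two channels. Every numeric choice and the output record layout are this sketch's own assumptions; the frequency-analysis step (ST6) is omitted for brevity, and the cross-correlation lag stands in for full position detection.

```python
def process_frame(mics, fs, gate_threshold=0.01):
    """One miniature pass over steps ST2-ST7 for two microphone channels:
    crude high-pass (first difference), noise gate, cross-correlation lag
    as a stand-in for position detection, and an RMS level."""
    clean = []
    for s in mics:
        hp = [s[i] - s[i - 1] for i in range(1, len(s))]      # ST2: filter
        clean.append([x if abs(x) >= gate_threshold else 0.0  # ST3: gate
                      for x in hp])
    a, b = clean[0], clean[1]
    n = len(a)
    lag = max(range(-n + 1, n),                               # ST4: localize
              key=lambda L: sum(a[i] * b[i + L]
                                for i in range(n) if 0 <= i + L < n))
    level = (sum(x * x for x in a) / n) ** 0.5                # ST5: level
    return {"lag_samples": lag, "level": level}               # ST7: record
```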
[0048]
The operation of the transmissive HMD 10 shown in FIGS. 1 and 3 will be described. The sound
collection information (sound signal) of the four microphones 101 is amplified by the amplifier
102, and further converted by the A / D converter 103 from an analog signal to a digital signal,
and then supplied to the signal processing unit 104. In the signal processing unit 104, based on
sound collection information (sound signals) of the four microphones 101, position information
and sound information (level information, frequency information) of a sound source are acquired.
[0049]
Further, the signal processing unit 104 generates display data for displaying sound information
based on the acquired position information and sound information of the sound source. The
display data is for displaying sound information so as to be superimposed on the visual image at
a position in the visual image corresponding to the position information. In this case, display data
is generated such that the level information of the sound emitted from the sound source is
displayed in the size of a circle. Also, in this case, display data is generated such that the
frequency information of the sound emitted from the sound source is represented by the color
attached to the circle.
[0050]
The display data generated by the signal processing unit 104 is supplied to the display unit
105. Based on the display data, the display unit 105 displays the sound information of the
sound source superimposed on the visual image at the position corresponding to the sound source
in the visual image. When the sound source position is within the display surface (within the
visual image), the sound information is displayed at or near the sound source position. On the
other hand, when the sound source position deviates from the display surface (outside the
visual image), the sound information is displayed at the end of the display surface (visual
image) near the sound source position.
[0051]
FIG. 7A shows an example of a visual image (actual image) observed by the user through the
transmissive HMD 10. FIG. 7B shows an example of the sound information (level information,
frequency information) displayed on the display unit 105. In FIG. 7B, the frequency information
of the sound of each source is indicated by a fill pattern instead of a color. The user observes
an image as shown in FIG. 7C, in which the sound information display of FIG. 7B is superimposed
on the visual image of FIG. 7A. In this video, the sound information of each sound source is
superimposed and displayed at the position corresponding to that sound source in the visual
image.
[0052]
In the transmissive HMD 10 shown in FIGS. 1 and 3, sound information of the sound source is
superimposed and displayed at a position corresponding to the sound source in the visual image.
Therefore, the user can intuitively determine the position of the sound source and the
information of the sound emitted from the sound source.
[0053]
Further, in the transmissive HMD 10 shown in FIG. 1 and FIG. 3, position information and sound
information of a sound source are acquired based on sound collection information (sound
signals) of the four microphones 101. Therefore, even with respect to the sound source hidden
by the predetermined object in the visual image, the sound information of the sound source can
be superimposed and displayed at the position corresponding to the sound source in the visual
image.
[0054]
Further, in the transmissive HMD 10 shown in FIGS. 1 and 3, since the level information of the
sound of the sound source is displayed in the size of a circle, it is possible to perceive a sense of perspective. For example, a circle that gradually grows larger indicates that the sound source is approaching. Further, in the transmissive HMD 10 shown
in FIG. 1 and FIG. 3, since the frequency information of the sound of the sound source is
displayed in a color attached to a circle, a specific sound source can be searched based on the
color.
[0055]
Further, in the transmissive HMD 10 shown in FIGS. 1 and 3, when the sound source position
deviates from within the visual image, sound information is displayed at the end of the visual
image near the sound source position. Therefore, the sound information can be displayed also for
the sound source at a position out of the visual image, and the user can intuitively know in which
direction the sound source is located with respect to the visual image.
[0056]
Further, in the transmissive HMD 10 shown in FIGS. 1 and 3, the surface SFb formed by the four microphones 101 disposed in the transmissive HMD 10 is configured so as not to be orthogonal to the display surface SFa of the transmissive HMD 10. In this case, since the surface SFb can be projected onto the display surface SFa, the sound source position on the display surface SFa, that is, on the two-dimensional plane, can be easily obtained.
[0057]
In the transmissive HMD 10 shown in FIGS. 1 and 3, analog microphones 101 are used. However, a configuration is also possible in which digital microphones such as MEMS microphones are used, so that the amplifier and the A / D converter are not required.
[0058]
Further, although four microphones 101 are used in the transmissive HMD 10 shown in FIGS. 1
and 3, the number of microphones 101 is not limited to four. For example, FIG. 8A shows an
example in which two microphones 101 are disposed on the front surface of the transmissive
HMD 10. FIG. 8B shows an example in which three microphones 101 are disposed on the front
surface of the transmissive HMD 10.
[0059]
In order to improve the accuracy of the sound source position detection, the microphones need
to be disposed at a constant interval. Therefore, when two microphones 101 are disposed on the
front surface of the transmissive HMD 10, they are disposed on the left and right ends as shown
in FIG. 8A, for example. When three microphones 101 are disposed on the front surface of the
transmissive HMD 10, for example, as shown in FIG. 8B, they are disposed to form a triangle.
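The effect of microphone spacing on position-detection accuracy can be illustrated with a rough calculation: the maximum arrival-time difference between two microphones is spacing/c, and the more samples this delay spans, the finer the resolvable angle. A sketch with assumed values (14 cm spacing, 48 kHz sampling; these numbers are not taken from the embodiment):

```python
import math

def max_delay_samples(spacing_m, fs_hz, c=343.0):
    """Maximum arrival-time difference between two microphones, in samples.

    spacing_m: distance between the microphones; c: speed of sound (m/s).
    The larger this value, the finer the angular resolution of a
    cross-correlation based delay estimate.
    """
    return spacing_m / c * fs_hz

# Hypothetical values: mics at the left/right ends of the HMD front (14 cm apart).
print(round(max_delay_samples(0.14, 48000), 1))  # 19.6 samples
# A 1 cm spacing leaves barely one sample of delay: too coarse.
print(round(max_delay_samples(0.01, 48000), 1))  # 1.4 samples
```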
[0060]
Further, in the transmissive HMD 10 shown in FIGS. 1 and 3, the microphone 101 is integrally
disposed on the front surface of the transmissive HMD 10. However, the microphone 101 may be
independent of the transmissive HMD 10. In such a case, a mechanism for transferring
information on the distance between the microphone 101 and the transmissive HMD 10 and
information on the distance between the microphones 101 to the signal processing unit 104 is
required.
[0061]
In this case, it is desirable that the positions of the microphones 101 be close to the transmissive HMD 10 and be fixed. The transmissive HMD 10 may also be integrated with another device for the purpose of sound collection. When two microphones are used and installed near the pinnae, it is also possible to estimate the sound source position by a method using the interaural correlation or a method using the interaural phase difference (see Japanese Patent Laid-Open No. 2004-325284).
[0062]
Each of the above modifications is similarly applicable to the other embodiments described later.
[0063]
<2. Second Embodiment> [Configuration Example of Transmission-Type Head-Mounted Display] FIG. 9 illustrates an appearance of a transmission-type head-mounted display (transmission-type
9 illustrates an appearance of a transmission-type head-mounted display (transmission-type
HMD) 10A according to a second embodiment. Similar to the transmissive HMD 10 shown in FIG.
1, four omnidirectional microphones 101 are disposed on the front surface of the transmissive
HMD 10A. In addition, three directional microphones 101a are disposed on the front surface of
the transmissive HMD 10A. Here, the microphones 101 and 101a constitute a sound collecting
unit.
[0064]
The plane formed by the arrangement positions of the four microphones 101 is the same as that
of the transmissive HMD 10 shown in FIG. 1, and is not orthogonal to the display surface of the transmissive HMD 10A. Similarly, the surface formed by the arrangement positions of the three microphones 101a is also made not to be orthogonal to the display surface of the transmissive HMD 10A. That is, in this case, the three microphones 101a include ones whose arrangement positions differ in the horizontal direction of the display surface and ones whose arrangement positions differ in the vertical direction of the display surface.
[0065]
FIG. 10 shows the system configuration of a transmissive HMD 10A according to the second
embodiment. In FIG. 10, parts corresponding to FIG. 3 are given the same reference numerals,
and the detailed description thereof will be omitted as appropriate. The transmissive HMD 10A
includes four omnidirectional microphones 101, three directional microphones 101a, amplifiers
102 and 106, A / D converters 103 and 107, a signal processing unit 104A, and a display unit 105. For example, each directional microphone 101a is configured as a microphone array including a plurality of microphones, and can dynamically scan its directivity direction.
[0066]
The amplifier 102 amplifies sound collection information (sound signal) of the four microphones
101. The A / D converter 103 converts the sound collection information (sound signal) of the
four microphones 101 amplified by the amplifier 102 from an analog signal to a digital signal.
The amplifier 106 amplifies the sound collection information (sound signals) of the three microphones 101a. The A / D converter 107 converts the sound collection information (sound signals) of the three microphones 101a amplified by the amplifier 106 from analog signals to digital signals.
[0067]
The signal processing unit 104A acquires the position information and sound information of the sound source based on the sound collection information (sound signals) of the four omnidirectional microphones 101 supplied from the A / D converter 103 and the sound collection information (sound signals) of the three directional microphones 101a supplied from the A / D converter 107.
In addition, the signal processing unit 104A generates display data for displaying sound
information based on the position information and sound information of the sound source. The
display data is for displaying sound information so as to be superimposed on the visual image at
a position in the visual image corresponding to the position information. The signal processing
unit 104A constitutes an information acquisition unit and a display data generation unit.
[0068]
The signal processing unit 104A acquires the position information of the sound source according
to the following processing procedure. That is, the signal processing unit 104A obtains the first
direction information of the sound source based on the sound collection information (sound
signal) of the four omnidirectional microphones 101. The first direction information is
information roughly indicating the direction of the sound source. Next, the signal processing unit 104A controls the pointing directions of the three directional microphones 101a based on the first direction information, and acquires the second direction information of the sound source at the arrangement positions of the three directional microphones 101a.
[0069]
In this case, although a control line from the signal processing unit 104A to the directional microphones 101a is not shown in FIG. 10, the signal processing unit 104A controls the directivity direction of each directional microphone 101a so as to scan the range indicated by the first direction information. The signal processing unit 104A then takes the directivity direction at which the level of the sound collection information (sound signal) of the directional microphone 101a is maximum as the second direction information of the sound source at the arrangement position of that directional microphone 101a. The second direction information is information indicating the direction of the sound source with high accuracy. Then, the signal processing unit 104A acquires the position information of the sound source based on the second direction information of the sound source at the arrangement positions of the three directional microphones 101a.
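The scan-and-pick-maximum step can be sketched as follows (the angle grid and level profile are hypothetical; a real implementation would measure levels from the microphone array):

```python
def estimate_direction(scan_range, level_at):
    """Scan candidate pointing directions and return the one with the
    highest collected sound level, as the signal processing unit does
    with each directional microphone.

    scan_range: iterable of candidate angles (degrees) taken from the
    coarse first direction estimate; level_at: callable angle -> level.
    """
    return max(scan_range, key=level_at)

# Hypothetical level profile peaking at 30 degrees.
profile = {20: 0.2, 25: 0.5, 30: 0.9, 35: 0.6, 40: 0.1}
best = estimate_direction(profile.keys(), profile.get)
print(best)  # 30
```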
[0070]
Based on the display data generated by the signal processing unit 104A, the display unit 105
superimposes sound information of the sound source on the visual image and displays the sound
information at a position corresponding to the sound source in the visual image. In this case,
when the sound source position is in the visual image, sound information is displayed at or near
the sound source position. In this case, when the sound source position is out of the visual image,
sound information is displayed at the end of the visual image near the sound source position.
[0071]
FIG. 11 shows the detailed configuration of the signal processing unit 104A. In FIG. 11, parts
corresponding to FIG. 4 are assigned the same reference numerals, and the detailed description
thereof will be omitted as appropriate. The signal processing unit 104A includes digital filters 111 and 118, gain adjustment units 112 and 119, a sound source direction estimation unit 117, a sound source position detection unit 113A, a level analysis unit 114, a frequency analysis unit 115, and a display data generation unit 116.
[0072]
The digital filter 111 performs filter processing to remove or reduce frequency components such
as wind noise and clothes rubbing sound included in the sound collection information (sound
signals) S1 to S4 of the four microphones 101. This frequency component adversely affects the
sound source direction estimation processing and the like. The gain adjustment unit 112 cuts low
level signals such as reflected sound or stationary noise from the sound collection information
(sound signals) S1 to S4 of the four omnidirectional microphones 101. This low level signal
adversely affects the sound source direction estimation processing and the like.
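A minimal stand-in for this filtering stage is a one-pole high-pass filter, which attenuates low-frequency wind noise while passing the frequency range useful for direction estimation (the cutoff value and signal parameters below are assumptions for illustration):

```python
import math

def high_pass(signal, fs, cutoff_hz):
    """One-pole high-pass filter: a minimal stand-in for digital filter 111,
    which attenuates low-frequency wind noise and clothes-rubbing sounds
    before direction estimation."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    alpha = rc / (rc + 1.0 / fs)
    out, prev_x, prev_y = [], signal[0], 0.0
    for x in signal[1:]:
        y = alpha * (prev_y + x - prev_x)  # y[i] = a*(y[i-1] + x[i] - x[i-1])
        out.append(y)
        prev_x, prev_y = x, y
    return [0.0] + out

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

fs = 8000
rumble = [math.sin(2 * math.pi * 20 * t / fs) for t in range(4000)]   # 20 Hz "wind"
voice = [math.sin(2 * math.pi * 800 * t / fs) for t in range(4000)]   # 800 Hz tone

# The 20 Hz rumble is strongly attenuated; the 800 Hz tone passes almost intact.
print(round(rms(high_pass(rumble, fs, 200)), 2))
print(round(rms(high_pass(voice, fs, 200)), 2))
```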
[0073]
The sound source direction estimation unit 117 roughly estimates the sound source direction based on the sound collection information (sound signals) S1 to S4 of the four microphones 101 subjected to the filter processing and the gain adjustment processing. Of the three processing steps performed by the signal processing unit 104 in the transmissive HMD 10 shown in FIG. 3, the sound source direction estimation unit 117 performs the first two: (1) obtaining the arrival time difference of the sound from the sound source to each microphone, and (2) calculating the arrival angle of the sound from the sound source.
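These two steps can be sketched as follows, using a cross-correlation delay estimate and the far-field relation delay = spacing * sin(angle) / c (the toy signals, spacing, and sample rate are assumptions; this is not the embodiment's actual code):

```python
import math

def tdoa_samples(sig_a, sig_b):
    """Step (1): arrival-time difference (in samples) of sig_b relative to
    sig_a, found as the lag maximizing their cross-correlation."""
    n = len(sig_a)
    best_lag, best_val = 0, float("-inf")
    for lag in range(-n + 1, n):
        val = sum(sig_a[i] * sig_b[i + lag]
                  for i in range(max(0, -lag), min(n, n - lag)))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

def arrival_angle(delay_s, spacing_m, c=343.0):
    """Step (2): arrival angle (degrees, 0 = broadside) from the delay."""
    return math.degrees(math.asin(max(-1.0, min(1.0, delay_s * c / spacing_m))))

# Toy signals: b is a copy of a delayed by 2 samples.
a = [0.0, 1.0, 0.5, -0.3, 0.0, 0.0, 0.0, 0.0]
b = [0.0, 0.0, 0.0, 1.0, 0.5, -0.3, 0.0, 0.0]
lag = tdoa_samples(a, b)
print(lag)                                       # 2
print(round(arrival_angle(lag / 48000, 0.14), 1))  # 5.9 (degrees)
```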
[0074]
The sound source position detection unit 113A detects the position information of the sound source based on the plurality of pieces of arrival angle information obtained by the sound source direction estimation unit 117 and the sound collection information (sound signals) Sa1 to Sa3 of the three directional microphones 101a. The sound source position detection unit 113A detects position information in XY coordinates on the display surface of the transmissive HMD 10A, that is, the display surface (two-dimensional plane) of the display unit 105.
[0075]
The sound source position detection unit 113A first obtains the sound source direction at the
arrangement position for each of the three directional microphones 101a. In this case, the sound source position detection unit 113A controls the directivity direction of each directional microphone 101a so as to scan the predetermined range indicated by the plurality of pieces of arrival angle information (the first direction information of the sound source) obtained by the sound source direction estimation unit 117. Then, the sound source position detection unit 113A takes the
directivity direction at which the level of the sound collection information (sound signal) of the
directional microphone 101a is maximum as the sound source direction at the arrangement
position of the directional microphone 101a.
[0076]
Next, the sound source position detection unit 113A acquires the position information of the
sound source based on the sound source direction information (second direction information of
the sound source) at the arrangement positions of the three directional microphones 101a. That
is, the sound source position detection unit 113A combines the sound source directions at the
arrangement positions of the three directional microphones 101a to estimate the position of the
sound source, that is, the position on the two-dimensional plane including the display surface of
the sound source. As the sound source position estimated in this way, two kinds of positions in
the display surface (in the visual image) or positions deviated from the display surface (in the
visual image) can be considered.
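Combining the per-microphone directions into one position amounts to intersecting bearing lines; the following least-squares sketch illustrates the idea (the coordinates, angles, and function names are hypothetical):

```python
import math

def locate_source(mics, angles_deg):
    """Least-squares intersection of the bearing lines from each
    directional microphone (positions in display-plane coordinates,
    angles measured from the +x axis). A sketch of combining the
    per-microphone sound source directions into one 2-D position."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), ang in zip(mics, angles_deg):
        ux, uy = math.cos(math.radians(ang)), math.sin(math.radians(ang))
        # Projector onto the line's normal direction: I - u u^T
        m11, m12, m22 = 1 - ux * ux, -ux * uy, 1 - uy * uy
        a11 += m11; a12 += m12; a22 += m22
        b1 += m11 * px + m12 * py
        b2 += m12 * px + m22 * py
    det = a11 * a22 - a12 * a12
    # Solve the 2x2 normal equations by Cramer's rule.
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

# Hypothetical mic layout; all three bearings actually point at (2, 3).
mics = [(0.0, 0.0), (4.0, 0.0), (2.0, -1.0)]
angles = [math.degrees(math.atan2(3 - py, 2 - px)) for px, py in mics]
x, y = locate_source(mics, angles)
print(round(x, 6), round(y, 6))  # 2.0 3.0
```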
[0077]
For each sound source whose position is detected by the sound source position detection unit 113A, the level analysis unit 114 analyzes the level (loudness) of the sound of the sound source collected by, for example, the four omnidirectional microphones 101, and obtains level information as sound information of the sound source. As described above, an arrival time difference occurs in the sound traveling from the sound source to each of the microphones 101.
[0078]
The level analysis unit 114 adds the collected sound information (sound signals) S1 to S4 of the
four microphones 101 in consideration of the arrival time difference, and obtains the level
information of the sound source based on the added signal. For each sound source whose position is detected by the sound source position detection unit 113A, the frequency analysis unit 115 analyzes the frequency of the sound of the sound source collected by, for example, the three directional microphones 101a, and obtains frequency information as sound information of the sound source.
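The delay-and-sum level measurement and the frequency analysis can be sketched as follows (toy tone and assumed sample rate; the naive DFT merely stands in for whatever frequency analysis the unit actually performs):

```python
import cmath, math

def source_level(signals, delays):
    """Delay-and-sum: align each microphone signal by its arrival-time
    difference (in samples), add them, and return the RMS level."""
    n = len(signals[0])
    summed = [sum(sig[(i + d) % n] for sig, d in zip(signals, delays))
              for i in range(n)]
    return math.sqrt(sum(v * v for v in summed) / n)

def dominant_frequency(signal, fs):
    """Dominant frequency via a naive DFT (fine for a sketch)."""
    n = len(signal)
    mags = [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(1, n // 2)]
    return (mags.index(max(mags)) + 1) * fs / n

# Toy input: a 1 kHz tone sampled at 8 kHz, seen by two mics 3 samples apart.
fs, n = 8000, 64
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(n)]
delayed = tone[-3:] + tone[:-3]        # circular shift stands in for the delay
level = source_level([tone, delayed], [0, 3])
print(round(level, 3))                  # 1.414 (two aligned unit tones)
print(dominant_frequency(tone, fs))     # 1000.0
```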
[0079]
The display data generation unit 116 generates display data Dds for displaying sound
information at a position in the visual image corresponding to the position information so as to
be superimposed on the visual image. The display data generation unit 116 generates the display data Dds from the position information of the sound source detected by the sound source position detection unit 113A, the level information of the sound from the sound source obtained by the level analysis unit 114, and the frequency information of the sound from the sound source obtained by the frequency analysis unit 115.
[0080]
The display data generation unit 116 generates display data Dds so as to display the level
information of the sound emitted from the sound source in the size of a circle. In this case, the
larger the level, the larger the circle. Further, the display data generation unit 116 generates the
display data Dds so as to display the frequency information of the sound emitted from the sound
source in the color attached to the circle. Thus, when the frequency components of the sound are
different for each type of sound source, each type of sound source can be identified by the color
attached to the circle.
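A minimal sketch of this mapping (the radius scaling and the frequency-to-color bands are made-up illustration values, not values from the embodiment):

```python
def circle_for_source(level, freq_hz, min_r=5, max_r=60):
    """Map sound level to circle radius and dominant frequency to a color,
    as the display data generation unit 116 does. The level-to-radius and
    frequency-to-color bands here are hypothetical."""
    radius = min(max_r, min_r + int(level * 10))     # louder -> larger circle
    if freq_hz < 300:
        color = "red"        # e.g. engine rumble
    elif freq_hz < 2000:
        color = "green"      # e.g. speech
    else:
        color = "blue"       # e.g. birdsong, sirens
    return radius, color

print(circle_for_source(2.0, 150))    # (25, 'red')
print(circle_for_source(8.0, 1000))   # (60, 'green') -- capped at max_r
```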
[0081]
As described above, the position information of the sound source detected by the sound source position detection unit 113A indicates one of two kinds of positions: a position within the display surface (within the visual image) or a position out of the display surface (out of the visual image). When the sound source position is within the display surface (within the visual image), the display data generation unit 116 generates the display data Dds so as to display the sound information at or near the sound source position. When the sound source position is out of the display surface (out of the visual image), the display data generation unit 116 generates the display data Dds so as to display the sound information at the end of the display surface (visual image) near the sound source position.
[0082]
The processing of each unit of the signal processing unit 104A illustrated in FIG. 11 is executed
by software processing by a computer (CPU), for example. In this case, the computer functions as
each unit of the signal processing unit 104A shown in FIG. 11 based on the processing program.
Of course, part or all of each part of the signal processing unit 104A shown in FIG. 11 can be
configured by hardware.
[0083]
The flowchart of FIG. 12 shows the processing procedure of the signal processing unit 104A
shown in FIG. The signal processing unit 104A periodically repeats this processing procedure to
sequentially update the display data Dds. In step ST10, the signal processing unit 104A starts the
process, and then proceeds to the process of step ST11.
[0084]
In step ST11, the signal processing unit 104A performs filter processing to remove or reduce frequency components such as wind noise and clothes-rubbing sounds included in the sound collection information (sound signals) S1 to S4 of the four omnidirectional microphones 101. Then, in step ST12, the signal processing unit 104A performs gain adjustment processing to cut low-level signals such as reflected sound or stationary noise from the sound collection information (sound signals) S1 to S4 of the four omnidirectional microphones 101.
[0085]
Next, in step ST13, the signal processing unit 104A roughly estimates the sound source direction
based on the sound collection information (sound signals) S1 to S4 of the four microphones 101
subjected to the filter processing and the gain adjustment processing. In this case, the signal
processing unit 104A performs a two-step process of (1) obtaining an arrival time difference of
the sound from the sound source to each microphone, and (2) calculating an arrival angle of the
sound from the sound source.
[0086]
Next, in step ST14, the signal processing unit 104A performs filter processing to remove or reduce frequency components such as wind noise and clothes-rubbing sounds included in the sound collection information (sound signals) Sa1 to Sa3 of the three directional microphones 101a. Then, in step ST15, the signal processing unit 104A performs gain adjustment processing to cut low-level signals such as reflected sound or stationary noise from the sound collection information (sound signals) Sa1 to Sa3 of the three directional microphones 101a.
[0087]
Next, in step ST16, the signal processing unit 104A detects the position information of the sound
source. In this case, the signal processing unit 104A detects the position information of the sound source based on the plurality of pieces of arrival angle information obtained in step ST13 and the sound collection information (sound signals) Sa1 to Sa3 of the three directional microphones 101a subjected to the filter processing and the gain adjustment processing.
[0088]
Next, in step ST17, the signal processing unit 104A analyzes the level (sound magnitude) of the
sound from the sound source for each sound source whose sound source position has been
detected, and obtains level information as sound information of the sound source. Further, in
step ST18, the signal processing unit 104A analyzes the frequency of the sound from the sound
source for each sound source whose sound source position is detected, and obtains frequency
information as sound information of the sound source.
[0089]
Next, in step ST19, the signal processing unit 104A generates display data based on the position information of the sound source obtained in step ST16, the level information of the sound from the sound source obtained in step ST17, and the frequency information of the sound from the sound source obtained in step ST18. That is, in step ST19, the signal processing unit
104A generates display data for displaying sound information so as to be superimposed on the
visual image at a position in the visual image corresponding to the position information. After the
process of step ST19, the signal processing unit 104A ends the process in step ST20.
[0090]
The operation of the transmissive HMD 10A shown in FIGS. 9 and 10 will be described. The sound collection information (sound signals) of the four omnidirectional microphones 101 is amplified by the amplifier 102, converted from an analog signal to a digital signal by the A / D converter 103, and supplied to the signal processing unit 104A. Similarly, the sound collection information (sound signals) of the three directional microphones 101a is amplified by the amplifier 106, converted from an analog signal to a digital signal by the A / D converter 107, and supplied to the signal processing unit 104A.
[0091]
In the signal processing unit 104A, position information and sound information of a sound
source are acquired based on sound collection information (sound signals) of four
omnidirectional microphones 101 and sound collection information (sound signals) of three
directional microphones 101a, and display data for displaying the sound information is generated.
The display data is for displaying sound information so as to be superimposed on the visual
image at a position in the visual image corresponding to the position information.
[0092]
In this case, the signal processing unit 104A first acquires the first direction information of the sound source (information roughly indicating the direction of the sound source) based on the sound collection information (sound signals) S1 to S4 of the four omnidirectional microphones 101. Next, the signal processing unit 104A controls the pointing directions of the three directional microphones 101a based on the first direction information. Then, based on the sound collection information (sound signals) Sa1 to Sa3 of the three directional microphones 101a, the second direction information of the sound source (information indicating the direction of the sound source with high accuracy) at the arrangement positions of the three directional microphones 101a is acquired. Further, the signal processing unit 104A acquires the position information of the sound source based on the second direction information of the sound source at the arrangement positions of the three directional microphones 101a.
[0093]
The display data generated by the signal processing unit 104A is supplied to the display unit
105. In the display unit 105, based on the display data, the sound information of the sound
source is displayed superimposed on the visual image at a position corresponding to the sound
source in the visual image (see FIG. 7C). When the sound source position is within the display surface (within the visual image), the sound information is displayed at or near the sound source position. When the sound source position deviates from the display surface (out of the visual image), the sound information is displayed at the end of the display surface (visual image) near the sound source position.
[0094]
The transmissive HMD 10A shown in FIGS. 9 and 10 is configured in the same manner as the
transmissive HMD 10 shown in FIGS. 1 and 3 described above, so that the same effect can be
obtained. Further, in the transmissive HMD 10A shown in FIGS. 9 and 10, the signal processing unit 104A acquires the direction information of the sound source by a two-stage process based on the sound collection information of the four omnidirectional microphones 101 and the sound collection information of the three directional microphones 101a. Therefore, it is possible to improve the acquisition accuracy of the position information of the sound source without increasing the number of microphones so much.
[0095]
<3. Third embodiment> [Configuration example of non-transmission type head mounted
display] Fig. 13 illustrates an appearance of a non transmission type head mounted display
(HMD) 10B according to a third embodiment. Similar to the transmissive HMD 10 shown in FIG.
1, four omnidirectional microphones 101 are disposed on the front of the non-transmissive HMD
10B. Further, an imaging element (camera) 131 for obtaining video data of a visual video is
disposed at the center of the front of the non-transmissive HMD 10B. Here, the microphones 101 constitute a sound collection unit. The plane formed by the arrangement positions of the
four microphones 101 is the same as that of the transmissive HMD 10 shown in FIG. 1, and is
not orthogonal to the display surface of the non-transmissive HMD 10B.
[0096]
FIG. 14 shows the system configuration of the non-transmissive HMD 10B according to the third embodiment. In FIG. 14, parts corresponding to those in FIG. 3 are given the same reference numerals, and the detailed description thereof will be omitted as appropriate. The non-transmissive HMD 10B includes four omnidirectional microphones 101, an amplifier 102, an A / D converter 103, a signal processing unit 104, an imaging element (camera) 131, an imaging signal processing unit 132, a superimposing unit 134, and a display unit 105B.
[0097]
The amplifier 102 amplifies sound collection information (sound signal) of the four microphones
101. The A / D converter 103 converts the sound collection information (sound signal) of the
four microphones 101 amplified by the amplifier 102 from an analog signal to a digital signal.
The signal processing unit 104 acquires the position information and sound information of the sound source based on the sound collection information (sound signals) of the four microphones 101 obtained by the A / D converter 103, and generates display data for displaying the sound information.
[0098]
The imaging device (camera) 131 captures an object corresponding to the field of view of the
user. The imaging signal processing unit 132 processes an imaging signal obtained by the
imaging element 131 and outputs video data of a visual image. In this case, the imaging signal
processing unit 132 also performs a process of correcting the deviation between the imaged
image generated according to the arrangement position of the imaging element 131 and the
actual field of view of the user. Here, the imaging element 131 and the imaging signal processing
unit 132 constitute an imaging unit.
[0099]
The superimposing unit 134 superimposes the display data generated by the signal processing
unit 104 on the video data of the visual image obtained by the imaging signal processing unit
132. The display unit 105B displays the visual image based on the output data of the
superimposing unit 134, and superimposes the sound information of the sound source on the
visual image at a position corresponding to the sound source in the visual image. This display
unit 105B, unlike the display unit 105 of the transmissive HMD 10 shown in FIG. 3, is configured as, for example, a normal liquid crystal display (LCD) from which the backlight unit is not removed.
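The superimposing operation can be sketched as a per-pixel alpha blend (a stand-in for illustration; the embodiment does not specify the actual blending rule):

```python
def superimpose(video_pixel, overlay_pixel, alpha):
    """Blend one overlay (display data) pixel onto one video pixel, a
    minimal stand-in for superimposing unit 134:
    out = alpha * overlay + (1 - alpha) * video.
    Pixels are (R, G, B) tuples with 0-255 channels."""
    return tuple(round(alpha * o + (1 - alpha) * v)
                 for v, o in zip(video_pixel, overlay_pixel))

# A red sound marker at 50 % opacity over a gray camera pixel.
print(superimpose((128, 128, 128), (255, 0, 0), 0.5))  # (192, 64, 64)
```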
[0100]
The operation of the non-transmissive HMD 10B shown in FIGS. 13 and 14 will be described. The
sound collection information (sound signal) of the four microphones 101 is amplified by the
amplifier 102, and further converted by the A / D converter 103 from an analog signal to a
digital signal, and then supplied to the signal processing unit 104. The signal processing unit 104
acquires position information and sound information of a sound source based on sound
collection information (sound signals) of the four microphones 101, and generates display data
for displaying the sound information. The display data is for displaying sound information so as
to be superimposed on the visual image at a position in the visual image corresponding to the
position information.
[0101]
Further, the imaging device 131 captures an object corresponding to the field of view of the user.
The imaging signal output from the imaging element 131 is supplied to the imaging signal
processing unit 132. The imaging signal processing unit 132 processes the imaging signal to
generate video data of a visual image. The imaging signal processing unit 132 also performs a
process of correcting the deviation between the imaging image generated according to the
arrangement position of the imaging element 131 and the actual field of view of the user.
[0102]
The image data of the visual image obtained by the imaging signal processing unit 132 is
supplied to the superimposing unit 134. Further, the display data generated by the signal
processing unit 104 is supplied to the superimposing unit 134. In the superimposing unit 134,
display data is superimposed on the video data of the visual image. The superimposed data is
supplied to the display unit 105B.
[0103]
The display unit 105B displays the visual image based on the output data (superimposed data) of the superimposing unit 134, and the sound information of the sound source is displayed superimposed on the visual image at a position corresponding to the sound source in the visual image (see FIG. 7C). When the sound source position is within the display surface (within the visual image), the sound information is displayed at or near the sound source position. When the sound source position deviates from the display surface (out of the visual image), the sound information is displayed at the end of the display surface (visual image) near the sound source position.
[0104]
The non-transmissive HMD 10B shown in FIGS. 13 and 14 is configured in the same manner as the transmissive HMD 10 shown in FIGS. 1 and 3 described above, except that the visual image is also displayed on the display unit 105B in addition to the sound information of the sound source, so that the same effect can be obtained. Further, in the non-transmissive HMD 10B shown in FIGS. 13 and 14, the imaging signal processing unit 132 corrects the deviation between the captured image generated according to the arrangement position of the imaging element 131 and the actual field of view of the user. Therefore, a good visual image corresponding to the field of view can be displayed on the display unit 105B.
[0105]
In the non-transmissive HMD 10B shown in FIGS. 13 and 14, the imaging element (camera) 131 is integrally disposed at the center of the front surface of the non-transmissive HMD 10B, but the arrangement position of the imaging element 131 is not limited to this. For example, as shown in FIG. 15, it is also conceivable to dispose it at a side end of the front surface of the non-transmissive HMD 10B, or at a position independent of the non-transmissive HMD 10B. Even in such a case, since the imaging signal processing unit 132 corrects the deviation between the captured image generated according to the arrangement position of the imaging element 131 and the actual field of view of the user as described above, the display unit 105B can display a good visual image corresponding to the field of view.
[0106]
<4. Fourth embodiment> [Configuration example of transmissive head mounted display] FIG.
16 illustrates the configuration of a transmissive head mounted display (HMD) 10C according to a
fourth embodiment. Although not shown, the appearance of the transmissive HMD 10C is similar
to that of the transmissive HMD 10 shown in FIG. 1. In FIG. 16, parts corresponding to those in FIG. 3
are assigned the same reference numerals, and detailed description thereof is omitted as
appropriate. The transmissive HMD 10C includes four omnidirectional microphones 101, an
amplifier 102, an A/D converter 103, a signal processing unit 104C, a display unit 105, and a
sound source identification unit 135.
[0107]
The amplifier 102 amplifies the sound collection information (sound signals) of the four microphones 101. The A/D converter 103 converts the sound collection information (sound signals) amplified by the amplifier 102 from analog signals to digital signals. The signal processing unit 104C acquires the position information and sound information of the sound source based on the sound collection information (sound signals) of the four microphones 101 obtained through the A/D converter 103, and generates display data for displaying the sound information.
[0108]
The sound source identification unit 135 specifies the sound source to be targeted by the signal processing unit 104C for acquiring position information and sound information. The sound source identification unit 135 includes a sound source selection button (not shown) and the like so that the user can perform the sound source identification operation. The target sound source can be specified, for example, by the frequency or level of its sound; in this embodiment, it is specified by frequency.
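Frequency-based selection of a target source could be realized by checking whether a collected sound's dominant frequency falls in a user-chosen band. This is a minimal sketch under that assumption; the function and parameters are illustrative and not taken from the patent.

```python
import numpy as np

def matches_target(signal, fs, f_low, f_high):
    """Return True if the dominant frequency of the collected sound signal
    lies within the user-specified band [f_low, f_high] in Hz."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    dominant = freqs[np.argmax(spectrum)]
    return f_low <= dominant <= f_high

fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440.0 * t)        # 440 Hz test tone
print(matches_target(tone, fs, 400, 500))   # True: 440 Hz is in band
```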
[0109]
Based on the display data generated by the signal processing unit 104C, the display unit 105 displays the sound information of the sound source superimposed on the visual image at the position corresponding to the sound source in the visual image. The display unit 105 is configured of, for example, a transmissive-structure display in which the backlight unit is removed from a liquid crystal display (LCD).
[0110]
The operation of the transmissive HMD 10C shown in FIG. 16 will be described. The sound collection information (sound signals) of the four microphones 101 is amplified by the amplifier 102, converted from analog signals to digital signals by the A/D converter 103, and then supplied to the signal processing unit 104C. The signal processing unit 104C acquires the position information and sound information (level information, frequency information) of the sound source based on the sound collection information (sound signals) of the four microphones 101. In this case, the signal processing unit 104C acquires position information and sound information only for the sound source specified by the sound source identification unit 135.
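Although this passage does not spell out the detection algorithm, direction estimation from multiple microphones is commonly based on inter-microphone time delays. The following two-microphone cross-correlation sketch is purely illustrative (the patent uses four microphones, and practical systems would use methods such as GCC-PHAT); all names and parameters are assumptions.

```python
import numpy as np

def tdoa_direction(sig_a, sig_b, fs, mic_spacing, c=343.0):
    """Estimate the arrival angle in radians (0 = broadside) from the delay
    between two microphones, found as the cross-correlation peak.
    A positive lag means sig_a is delayed relative to sig_b."""
    corr = np.correlate(sig_a, sig_b, mode="full")
    lag = np.argmax(corr) - (len(sig_b) - 1)   # delay in samples
    tau = lag / fs                             # delay in seconds
    sin_theta = np.clip(c * tau / mic_spacing, -1.0, 1.0)
    return np.arcsin(sin_theta)

# usage: white noise delayed by 3 samples between the two microphones
rng = np.random.default_rng(0)
b = rng.standard_normal(1024)
a = np.concatenate([np.zeros(3), b[:-3]])
angle = tdoa_direction(a, b, fs=8000, mic_spacing=0.2)
print(angle)
```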
[0111]
Further, the signal processing unit 104C generates display data for displaying the sound information based on the acquired position information and sound information of the sound source. The display data causes the sound information to be displayed superimposed on the visual image at the position in the visual image corresponding to the position information. In this case, the display data is generated such that the level information of the sound emitted from the sound source is represented by the size of a circle, and the frequency information of the sound is represented by the color attached to the circle.
[0112]
The display data generated by the signal processing unit 104C is supplied to the display unit 105. Based on the display data, the display unit 105 displays the sound information of the sound source superimposed on the visual image at the position corresponding to the sound source in the visual image (see FIG. 7C). When the sound source position is within the display surface (within the visual image), the sound information is displayed at or near the sound source position. When the sound source position falls outside the display surface, the sound information is displayed at the edge of the display surface (visual image) nearest the sound source position.
[0113]
The transmissive HMD 10C shown in FIG. 16 is configured in the same manner as the transmissive HMD 10 shown in FIGS. 1 and 3 described above, so the same effects can be obtained. Further, in the transmissive HMD 10C shown in FIG. 16, the signal processing unit 104C acquires position information and sound information only for the sound source specified by the sound source identification unit 135. Therefore, only the sound information of the specified sound source is displayed in the visual image, which makes it easy to search for that sound source.
[0114]
<5. Modified examples> In the above embodiments, the sound information of the sound source
is level information and frequency information. However, other information is also conceivable as the
sound information. For example: (1) character information representing words determined by
speech recognition; (2) onomatopoeia indicating the sound of an object, obtained from an
environmental sound (for a train: "gatan-goton", etc.); (3) …; (4) the time-axis waveform,
power spectrum, or frequency spectrum of the sound signal.
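Of the alternatives listed above, the power spectrum is straightforward to compute from a frame of the collected sound signal. A minimal sketch (windowing choice and normalization are illustrative assumptions):

```python
import numpy as np

def power_spectrum(signal, fs):
    """Compute the one-sided power spectrum of a sound frame, suitable
    for display alongside the sound source marker."""
    window = np.hanning(len(signal))          # reduce spectral leakage
    spectrum = np.fft.rfft(signal * window)
    power = (np.abs(spectrum) ** 2) / len(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return freqs, power

fs, n = 8000, 1024
t = np.arange(n) / fs
tone = np.sin(2 * np.pi * 1000.0 * t)         # 1 kHz tone, bin-aligned
freqs, power = power_spectrum(tone, fs)
print(freqs[np.argmax(power)])                # peak at 1000.0 Hz
```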
[0115]
Moreover, in the above embodiments, an example was shown in which the level information of the sound of the sound source is displayed as the size of a circle. However, display shapes other than circles may be used: for example, (1) a polygon, (2) an arrow, (3) a speech balloon, or (4) a font chosen according to the type of the sounding body (human voice, other living things, environmental sounds, etc.).
[0116]
Moreover, in the above embodiments, an example was shown in which the frequency information of the sound of the sound source is displayed as the color attached to the circle. However, it is also conceivable to display (1) the gender of a person, (2) the identity of a person's voice, or (3) other types of sounding bodies, such as living things and environmental sounds, in different colors, and further to display the magnitude of the sound in shades of color.
[0117]
The present invention can be applied to, for example, a hearing support device using a head
mounted display.
[0118]
10, 10A, 10C: transmissive head mounted display (transmissive HMD); 10B: non-transmissive head mounted display (non-transmissive HMD); 101: microphone (omnidirectional microphone); 101a: directional microphone; 102: amplifier; 103: A/D converter; 104, 104A, 104C: signal processing unit; 105, 105B: display unit; 106: amplifier; 107: A/D converter; 111: digital filter; 112: gain adjustment unit; 113, 113A: sound source position detection unit; 114: level analysis unit; 115: frequency analysis unit; 116: display data generation unit; 117: sound source direction estimation unit; 118: digital filter; 119: gain adjustment unit; 131: imaging element (camera); 132: imaging signal processing unit; 134: superimposing unit; 135: sound source identification unit