JP2015164273

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2015164273
Abstract: PROBLEM TO BE SOLVED: To provide a program for solving the problem that mutual
interference of voices in a conference hinders smooth communication. SOLUTION: A conference
apparatus includes a headset connection unit (1141), to which a headset (115) integrating
headphones and a microphone can be connected, and a voice control unit (114). The voice control
unit (114) switches the voice of each conference participant to be output to the headphones of the
headset connected to the headset connection unit, based on a predetermined condition, according to
the utterances of the conference participants who use the headset connection units of the
conference apparatus and of the other conference apparatus. [Selected figure] Figure 3
Conference apparatus, control method and program in conference apparatus
[0001]
The present invention relates to a conferencing apparatus, and a control method and program in
the conferencing apparatus.
[0002]
Conventionally, when a conference using a conference apparatus is held in an environment where
third parties are nearby, such as a meeting corner or another place that is not a conference room
partitioned by walls, it is known to use a headset (headphones with a microphone) so that the
voices of the conference do not disturb those third parties and so that ambient sounds do not
interfere with the conference.
09-05-2019
1
[0003]
As a conference apparatus using a headset, Patent Document 1, for example, discloses an apparatus
that uses headphones and a pair of microphones for each participant. Its purpose is to solve the
speaker identification problem, in which it is difficult to identify the speaker in a voice
conference apparatus intended for a plurality of participants, and the echo problem.
In this apparatus, the speaker identification problem can be solved because participants listen in
stereo to the sound captured by a pair of microphones.
In addition, since the voice of the other party is output from the headphones, that voice is not
picked up again by the microphone, and therefore no echo problem occurs. Also, since headphones
are worn, ambient noise is unlikely to interfere with the speech.
[0004]
However, since a headset is a device used by one person, a conference attended by a plurality of
persons requires a plurality of headsets. In such a situation, when the voices of the participants
are picked up by the microphones of the headsets and the respective audio signals are mixed in the
conference apparatus, the voices interfere with one another and the participants cannot
communicate smoothly.
[0005]
Although the apparatus disclosed in Patent Document 1 solves the echo problem in a conference
apparatus intended for a plurality of participants, its microphones are placed at a position away
from the conference participants. The participants must therefore speak relatively loudly, so the
problem of keeping the conference voices from annoying third parties is not solved. Moreover,
because the microphones also pick up the surrounding sound, the problem of keeping ambient sound
from interfering with the conference is not solved either.
[0006]
The present invention is made in view of the above, and an object of the present invention is to
solve the problem that mutual interference of voices interferes with smooth communication.
[0007]
In order to solve the problems described above and achieve the object, the conference apparatus
of the present invention is a conference apparatus that enables a conference by communicating
with other conference apparatuses via a communication network, and includes: headset connection
means to which a headset, in which headphones and a microphone are integrated, can be connected;
and audio control means for switching, based on a predetermined condition, the audio of each
conference participant to be output to the headphones of the headset connected to the headset
connection means, according to the speech of each conference participant who uses the headset
connection means of the conference apparatus and of the other conference apparatuses.
[0008]
According to the present invention, it is possible to eliminate the interference of voices of a
plurality of participants.
[0009]
FIG. 1 is a diagram showing a schematic configuration of a video conference system using a
conference apparatus according to an embodiment.
FIG. 2 is a block diagram showing an example of the hardware configuration of the terminal
(conference apparatus) of the embodiment.
FIG. 3 is a block diagram showing a configuration example of the voice control unit of the
conference apparatus of the embodiment.
FIG. 4 is a flow chart for explaining the operation of the voice control unit of the transmitting
side terminal of the embodiment.
FIG. 5 is a flow chart for explaining the operation of the voice control unit of the receiving side
terminal of the embodiment.
[0010]
Hereinafter, embodiments of the conferencing apparatus will be described in detail with
reference to the attached drawings.
[0011]
FIG. 1 is a diagram showing a schematic configuration of a video conference system as an
example of a conference apparatus according to an embodiment.
The video conference system described here is a system suitable for performing a conference in
which a plurality of people participate in an environment in which a third party exists in the
vicinity.
[0012]
A video conference system 10 shown in FIG. 1 includes terminals 1, each serving as a conference
apparatus provided in a conference place, such as a meeting corner, used by the users
participating in the video conference, and a server computer (hereinafter referred to as a
server) 2. The terminals 1 and the server 2 are connected to a communication network 3 such as the
Internet. In the present video conference system 10, it is assumed that the conference places used
by the participants are remote from one another, that a terminal 1 is provided for each conference
place, and that there are a plurality of terminals 1 as shown in the figure. Of course, terminals 1
may be installed as needed.
[0013]
The server 2 has a data relay function of transferring the video information and the audio
information transmitted from each terminal 1 to the terminal 1 in another conference location.
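
The relay role of the server 2 can be pictured with a minimal sketch. The function name `relay`, the terminal identifiers, and the choice to forward to every other terminal are assumptions made for illustration; the description only states that data is transferred to the terminal 1 in another conference location.

```python
from typing import List, Tuple


def relay(sender: str, payload: bytes, terminals: List[str]) -> List[Tuple[str, bytes]]:
    """Forward one payload from the sending terminal to all other terminals.

    Hypothetical model: the server keeps a list of connected terminal ids and
    returns (destination, payload) pairs instead of performing network I/O.
    """
    return [(terminal, payload) for terminal in terminals if terminal != sender]


if __name__ == "__main__":
    # Terminal "A" sends audio data; the server forwards it to "B" and "C".
    for destination, data in relay("A", b"audio-frame", ["A", "B", "C"]):
        print(destination, data)
```

The sender is excluded from the destinations, since its own voice is handled locally by the voice control unit 114 rather than returned over the network.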
[0014]
As described above, the video conference system 10 of the present embodiment is a system in which
the plurality of terminals 1 are connected via the communication network 3, and the video
information and audio information of the conference participants acquired by each of the
terminals 1 are exchanged via the server 2, so that the terminals 1 can communicate with each
other and hold a video conference.
Note that transmission and reception of video information and audio information between the
terminals 1 may be directly performed using a dedicated line or the like without passing through
the server 2.
[0015]
Next, the configuration of the terminal 1 as a conference apparatus will be described. FIG. 2 is a
block diagram showing an example of the hardware configuration of the terminal 1.
[0016]
FIG. 2 is a hardware configuration diagram of the terminal 1 as the conference apparatus
according to the present embodiment. As shown in FIG. 2, the terminal 1 of the present
embodiment includes a central processing unit (CPU) 101 that controls the overall operation of
the terminal 1, a ROM (Read Only Memory) 102 that stores programs used to drive the CPU 101 such
as an initial program loader (IPL), a RAM (Random Access Memory) 103 used as the main memory of
the CPU 101, a flash memory 104 that stores various data such as terminal programs, image data,
and audio data, a solid state drive (SSD) 105 that controls reading and writing of various data
to the flash memory 104 under the control of the CPU 101, a media drive 107 that controls reading
and writing (storage) of data to a recording medium 106 such as a flash memory, operation buttons
108 operated when, for example, selecting a destination of the terminal 1, a power switch 109 for
switching the power of the terminal 1 ON and OFF, and a network I/F (Interface) 111 for
transmitting data using the communication network 3.
[0017]
The terminal 1 also includes a built-in camera 112 that captures a subject and obtains image data
under the control of the CPU 101, an imaging element I/F 113 that controls driving of the camera
112, a voice control unit 114 (details will be described later) that controls input and output of
audio signals and to which a plurality of headsets 115 are connected, a voice input/output unit
116 that processes input and output of voice signals to and from the voice control unit 114 under
the control of the CPU 101, a display I/F 117 that transmits image data to an external display 120
under the control of the CPU 101, an external device connection I/F 118 for connecting various
external devices, and a bus line 110, such as an address bus or a data bus, that electrically
connects the above components as shown in the figure.
[0018]
The display 120 is a display unit, configured of a liquid crystal or an organic EL, that displays
an image of a subject, operation icons, and the like.
The display 120 is connected to the display I/F 117 by a display cable 119. The display cable 119
may be a cable for analog RGB (VGA) signals or for component video, or may be a cable for HDMI
(High-Definition Multimedia Interface) or DVI (Digital Visual Interface) signals.
[0019]
The camera 112 includes a lens and a solid-state imaging device that converts light into electric
charge to digitize an image of a subject. A complementary metal oxide semiconductor (CMOS)
sensor, a charge coupled device (CCD), or the like is used as the solid-state imaging device.
[0020]
An external device such as an external camera can be connected to the external device
connection I/F 118 by a USB (Universal Serial Bus) cable or the like.
When an external camera is connected, the external camera is driven in preference to the built-in
camera 112 under the control of the CPU 101.
[0021]
The recording medium 106 is configured to be attachable to and detachable from the terminal 1.
Further, instead of the flash memory 104, an EEPROM (Electrically Erasable and Programmable ROM)
or the like may be used, as long as it is a nonvolatile memory from which data is read or to which
data is written under the control of the CPU 101.
[0022]
Furthermore, the terminal program may be recorded and distributed in a computer readable
recording medium such as the recording medium 106 in the form of an installable or executable
file. The terminal program may be stored in the ROM 102 instead of the flash memory 104.
[0023]
In the present embodiment, in the headset 115, headphones and a microphone are integrally
configured as in a headphone with a microphone.
[0024]
Next, details of the voice control unit 114 will be described.
FIG. 3 is a block diagram showing a configuration example of the voice control unit 114 of the
terminal 1. As shown in FIG. 3, the voice control unit 114 includes a headset connection unit
1141, a transmission voice detection unit 1142, a transmission voice selection unit 1143, a
transmission voice transmission unit 1144, a reception voice reception unit 1145, a reception
voice detection unit 1146, and a reception voice selection unit 1147.
[0025]
The headset connection unit 1141 is the unit to which the headset 115 is connected. It receives
the transmission voice from the microphone of the headset 115 and outputs the transmission voice
to a transmission voice detection unit 1142 described later. Here, the transmission voice is a
voice signal input to the own apparatus through the headset 115. The headset connection unit 1141
also receives a reception voice from a reception voice selection unit 1147 described later, and
outputs the reception voice to the headphones of the headset 115. Here, the reception voice is a
voice signal input from another apparatus via the voice input/output unit 116.
[0026]
In the present embodiment, as shown in FIG. 3, a plurality of headsets 115 can be connected to the
headset connection unit 1141. It is therefore possible to connect a plurality of headsets 115 to
one terminal 1 and use them simultaneously. However, the connection form of the headset 115 is
not limited to this, and a single headset 115 may be connected to the headset connection unit
1141 and used. Note that the function of the transmission voice selection unit 1143 may be
realized using a computer including a CPU. In that case, the control program executed by the CPU
is provided by being incorporated in advance in a ROM or the like.
[0027]
Next, the operation of each part of the voice control unit 114 will be described using FIG. 4 and
FIG. 5.
[0028]
[Transmission Side] FIG. 4 is a flow chart for explaining the operation of the voice control unit
114 of the transmission side terminal 1.
[0029]
First, the transmission voice detection unit 1142 detects, for each headset 115, the presence or
absence of a transmission voice signal, which is a voice signal input to the own apparatus through
the headset connection unit 1141, and its signal level, and outputs this information together
with the transmission voice signal to the transmission voice selection unit 1143 (S101).
The transmission voice detection unit 1142 determines that a transmission voice signal is present
when the signal level of the transmission voice signal from the headset 115 is equal to or higher
than a predetermined value.
Alternatively, only the signal level of the transmission voice signal of each headset 115 may be
output to the transmission voice selection unit 1143 as the information. In that case, the
presence or absence of the transmission voice signal is determined by the transmission voice
selection unit 1143 based on the signal level of the transmission voice signal.
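
As an illustration of step S101, the following is a minimal sketch of per-headset level detection. The use of the RMS value as the "signal level", the threshold value `PRESENCE_THRESHOLD`, and the function and variable names are all assumptions; the description only states that presence is decided by comparing the signal level with a predetermined value.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical value; the description only says "a predetermined value".
PRESENCE_THRESHOLD = 0.05


def signal_level(frame: List[float]) -> float:
    """Return the RMS level of one frame of samples (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(sample * sample for sample in frame) / len(frame))


def detect_transmission_voice(
    frames: Dict[str, List[float]],
    threshold: float = PRESENCE_THRESHOLD,
) -> Dict[str, Tuple[bool, float]]:
    """For each headset id, report (present, level) for its current frame."""
    result = {}
    for headset_id, frame in frames.items():
        level = signal_level(frame)
        # A signal counts as present when its level reaches the threshold.
        result[headset_id] = (level >= threshold, level)
    return result


if __name__ == "__main__":
    frames = {"headset-1": [0.5, -0.5, 0.5, -0.5], "headset-2": [0.0, 0.0]}
    print(detect_transmission_voice(frames))
```

Returning the level alongside the presence flag mirrors the variant in which only the level is passed on and presence is decided by the selection unit.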
[0030]
Next, based on the information from the transmission voice detection unit 1142, the
transmission voice selection unit 1143 selects the transmission voice signal with the highest
signal level and outputs the selected transmission voice signal to the transmission voice
transmission unit 1144 (S102).
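
Step S102 can be sketched as a simple maximum search. The representation of each headset's detection result as a `(present, level)` pair and the name `select_loudest` are assumptions for illustration. The description does not say how ties are resolved; in this sketch the first headset encountered wins.

```python
from typing import Dict, Optional, Tuple


def select_loudest(detections: Dict[str, Tuple[bool, float]]) -> Optional[str]:
    """Return the id of the present headset with the highest level, or None.

    Headsets whose signal was judged absent are ignored.
    """
    present = {
        headset_id: level
        for headset_id, (is_present, level) in detections.items()
        if is_present
    }
    if not present:
        return None  # nobody is speaking at this terminal
    return max(present, key=present.get)


if __name__ == "__main__":
    detections = {
        "headset-1": (True, 0.2),
        "headset-2": (True, 0.6),
        "headset-3": (False, 0.01),
    }
    print(select_loudest(detections))
```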
[0031]
Then, the transmission voice transmission unit 1144 outputs the transmission voice signal input
from the transmission voice selection unit 1143 to the voice input/output unit 116 and the
reception voice detection unit 1146 (S103).
Note that the voice input/output unit 116 performs A/D conversion of the voice signal input from
the transmission voice transmission unit 1144 and transmits the resulting voice data to another
conference apparatus via the network I/F 111, the communication network 3, and the server 2.
Alternatively, the A/D conversion of the voice signal may be performed by the transmission voice
transmission unit 1144 of the voice control unit 114, and the A/D-converted voice data may be
output to the voice input/output unit 116.
[0032]
[Receiving Side] Next, the operation of the voice control unit 114 on the receiving side will be
described. FIG. 5 is a flowchart for explaining the operation of the voice control unit 114 of the
terminal 1 on the receiving side.
[0033]
First, the reception voice reception unit 1145 adjusts the signal level and the like (for
example, ) of the reception voice signal, which is the voice signal from another conference
apparatus (another terminal 1) input through the voice input/output unit 116, and outputs it to
the reception voice detection unit 1146 (S201). The voice input/output unit 116 receives voice
data from the other conference apparatus via the communication network 3, the server 2, and the
network I/F 111, performs D/A conversion of the voice data, and outputs the result to the
reception voice reception unit 1145 as a voice signal. Alternatively, the D/A conversion of the
voice data may be performed by the reception voice reception unit 1145 of the voice control unit
114, in which case the voice input/output unit 116 outputs the received voice data as it is to the
reception voice reception unit 1145.
[0034]
Next, the reception voice detection unit 1146 detects the presence or absence of the transmission
voice signal input from the transmission voice transmission unit 1144 and of the reception voice
signal input from the reception voice reception unit 1145, as well as their signal levels, and
outputs this information together with the transmission voice signal and the reception voice
signal to the reception voice selection unit 1147 (S202). The information sent to the reception
voice selection unit 1147 may be only the signal levels of the transmission voice signal and the
reception voice signal.
[0035]
Next, the reception voice selection unit 1147 selects, as a predetermined condition, a voice signal
(one of the transmission voice signal and the reception voice signal) with the highest signal level
and outputs it to the headset connection unit 1141 (S203). Thereby, the voice of each conference
participant can be switched. In the present embodiment, as the predetermined condition, the
highest signal level is used, but the present invention is not limited to this.
[0036]
Then, the headset connection unit 1141 distributes and outputs the input audio signal to the
plurality of headsets 115 connected to the headset connection unit 1141 (S204).
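
Steps S203 and S204 can be sketched together as follows. The `(level, samples)` tuple used to represent a signal and the names `select_output` and `distribute` are assumptions for illustration, and the highest-level rule is only the condition used in this embodiment. When both levels are equal, this sketch keeps the local signal, which is a choice the description does not specify.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation of a signal: (level, samples).
Signal = Tuple[float, List[float]]


def select_output(local: Optional[Signal], remote: Optional[Signal]) -> Optional[Signal]:
    """S203: pick whichever of the local and remote signals has the higher level.

    A missing signal is passed as None; None is returned if both are missing.
    """
    candidates = [signal for signal in (local, remote) if signal is not None]
    if not candidates:
        return None
    return max(candidates, key=lambda signal: signal[0])


def distribute(signal: Optional[Signal], headset_ids: List[str]) -> Dict[str, List[float]]:
    """S204: send the same selected samples to every connected headset."""
    samples = signal[1] if signal is not None else []
    return {headset_id: list(samples) for headset_id in headset_ids}


if __name__ == "__main__":
    local = (0.2, [0.2, -0.2])   # loudest voice picked up at this terminal
    remote = (0.7, [0.7, -0.7])  # voice received from the other terminal
    print(distribute(select_output(local, remote), ["headset-1", "headset-2"]))
```

Because the local transmission voice takes part in the comparison, participants at the same terminal hear one another through their headphones whenever the remote side is silent or quieter.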
[0037]
As described above, in the conference apparatus of this embodiment, when a plurality of
conference participants speak at almost the same time, only the audio signal from the headset
115 having the highest audio signal level is output to the subsequent stage, and the other
signals are cut.
This control prevents voice interference between the conference participants. In addition, even
if the voice of the same speaker is picked up by multiple microphones at the same venue, the
signals with a low signal level are cut, so the occurrence of echo can be suppressed.
[0038]
The speech of a conference participant, that is, the transmission voice, is always compared
with the reception voice, and is output from the headset 115 instead of the reception voice when
there is no reception voice or when the comparison shows the reception voice to be the lower of
the two. This also enables conversations between conference participants in the same venue.
[0039]
In the present embodiment, each participant is assumed to wear a headset 115. Therefore, neither
the voices of the conference participants on the own terminal side (who generally do not speak
loudly) nor the voice from the other party's terminal, which is output to the headphones of the
headset 115, is likely to bother other people nearby. In addition, since the voice to be output is
switched appropriately according to the speech of each participant (according to the magnitude of
the signal level of the voice signal), the problem of voice interference caused by overlapping of
multiple voices and the like can be solved, and a smooth meeting is possible because conversation
between the participants is not hindered even while they are wearing headphones. As described
above, when holding a teleconference in which a plurality of persons participate in an
environment where third parties are present nearby, it is possible to prevent ambient sound from
interfering with the conference. In addition, it is possible to prevent the voices in the
conference from annoying third parties, and to eliminate the interference of the voices of a
plurality of participants. Furthermore, echo problems such as those that arise when using a
conventional microphone and speaker can be eliminated.
[0040]
(Other Embodiment 1) The embodiment described above is a video conference system, but as
another embodiment, it is also possible to configure an audio conference system using a
conference apparatus in which the camera 112, the imaging element I/F 113, and the display I/F 117
are omitted from the terminal 1. The same effect as that of the video conference system 10
described above can be obtained.
[0041]
(Other Embodiment 2) The terminal in the video conference system 10 shown in FIG. 1
exemplifies a dedicated terminal provided in a conference location, but it may be a dedicated
terminal that can be carried.
Moreover, it is also possible to use not only a dedicated terminal but a terminal capable of
performing information processing such as a notebook computer, a tablet computer, a
smartphone, etc. as a terminal in the video conference system 10. Even in this case, it goes
without saying that the same effect as that of the video conference system 10 described above
can be obtained.
[0042]
1 terminal (conference device)
2 server
3 communication network
10 video conference system
101 CPU
102 ROM
103 RAM
104 flash memory
105 SSD
106 recording medium
107 media drive
108 operation button
109 power switch
110 bus line
111 network I/F
112 camera
113 imaging element I/F
114 voice control unit
115 headset
116 voice input/output unit
117 display I/F
118 external device connection I/F
119 display cable
120 display
1141 headset connection unit
1142 transmission voice detection unit
1143 transmission voice selection unit
1144 transmission voice transmission unit
1145 reception voice reception unit
1146 reception voice detection unit
1147 reception voice selection unit
[0043]
Unexamined-Japanese-Patent No. 2006-237839