Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2016177782
Abstract: [Problem] To provide a wearable translation device that is less likely to lose the naturalness of conversation when translating a conversation between speakers of different languages and retranslating the translation result. [Solution] A wearable translation device (1) includes a microphone device (13) that acquires speech in a first language from a user and converts it into a speech signal of the first language; a control circuit (11) that acquires a speech signal of a second language converted from the speech signal of the first language and acquires a speech signal of the first language reconverted from the speech signal of the second language; and an audio processing circuit (16) that executes predetermined processing on the reconverted speech signal of the first language. A speaker device (15) converts the speech signal of the second language into sound and outputs it, and a speaker device (17) converts the processed speech signal of the first language into sound and outputs it. The audio processing circuit (16) executes the processing so that the sound output from the speaker device (17) is directed toward the user's hearing organ, based on the relative position of the user's hearing organ with respect to the speaker device (17). [Selected figure] Figure 1
Wearable device and translation system
[0001]
The present disclosure relates to a wearable device that is worn on a user's body and used to automatically translate, in real time, conversations between speakers of different languages.
[0002]
With the development of techniques for speech recognition, machine translation, and speech synthesis, translation devices are known that automatically translate, in real time, speech between speakers of different languages.
Some such translation devices are portable or wearable.
[0003]
For example, when a speech from a speaker of the first language is translated for a speaker of the second language using a translation device, the speaker of the first language may wish to confirm whether the content of the translated speech is correct. Thus, for example, Patent Documents 1 and 2 disclose translation apparatuses that retranslate an utterance translated into the second language back into the first language and feed it back to the speaker of the first language. The translation apparatuses of Patent Documents 1 and 2 feed back the retranslation result to the speaker of the first language by display on a display or by voice.
[0004]
Patent Document 1: JP-A-2001-222531; Patent Document 2: JP-A-2007-272260; Patent Document 3: International Publication No. 2013/105413; Patent Document 4: JP-A-2012-093705
[0005]
In order to improve the convenience of the translation device, it is desirable, for example, that the speaker and the listener be made as little aware of the presence of the translation device as possible while it is in use, and that the conversation be perceived as natural by the speaker and the listener even when the translation device intervenes.
[0006]
In a portable or wearable translation device, when the retranslation result is fed back to the speaker of the first language, providing a display for displaying the retranslation result increases the size of the translation device.
Therefore, feedback may be provided by voice alone, without providing a display.
However, if the speaker of the second language hears the first-language speech output as a result of the retranslation together with the translated second-language speech, the conversation may be disturbed.
[0007]
The present disclosure provides a wearable device and a translation system that preserve the naturalness of conversation when a conversation between speakers of different languages is translated and the translation result is retranslated.
[0008]
A wearable device according to an aspect of the present disclosure is a wearable device that can be worn at a predetermined position on a user's body, and includes a microphone device that acquires speech in a first language from the user and converts it into a speech signal of the first language.
The wearable device further includes a control circuit that acquires a speech signal of a second language converted from the speech signal of the first language and acquires a speech signal of the first language reconverted from the speech signal of the second language, and an audio processing circuit that executes predetermined processing on the reconverted speech signal of the first language. The wearable device further includes a first speaker device that converts the speech signal of the second language into sound and outputs it, and a second speaker device that converts the processed speech signal of the first language into sound and outputs it. The audio processing circuit executes the processing of the speech signal of the first language so that the sound output from the second speaker device is directed toward the user's hearing organ, based on the relative position of the user's hearing organ with respect to the second speaker device.
[0009]
The wearable translation device and the translation system according to the present disclosure
are effective in preserving the naturalness of conversation when translating a conversation
between speakers of different languages and retranslating the translation result.
[0010]
FIG. 1 is a block diagram showing the configuration of a translation system according to the first embodiment.
FIG. 2 is a diagram showing a first example of a state in which a user wears the wearable translation device of the translation system according to the first embodiment.
FIG. 3 is a diagram showing a second example of a state in which a user wears the wearable translation device of the translation system according to the first embodiment.
FIG. 4 is a diagram showing a third example of a state in which a user wears the wearable translation device of the translation system according to the first embodiment.
FIG. 5 is a sequence diagram showing a first part of the operation of the translation system according to the first embodiment.
FIG. 6 is a sequence diagram showing a second part of the operation of the translation system according to the first embodiment.
FIG. 7 is a diagram explaining measurement of the relative position of the user's hearing organ with respect to the speaker device of the wearable translation device of the translation system according to the first embodiment.
FIG. 8 is a diagram illustrating the directions of the sounds output from the respective speaker devices when the wearable translation device of the translation system according to the first embodiment is used.
FIG. 9 is a block diagram showing the configuration of a translation system according to the second embodiment.
FIG. 10 is a block diagram showing the configuration of a translation system according to the third embodiment.
FIG. 11 is a sequence diagram showing the operation of the translation system according to the third embodiment.
FIG. 12 is a block diagram showing the configuration of a wearable translation device according to the fourth embodiment.
[0011]
Hereinafter, embodiments will be described in detail with reference to the drawings as
appropriate.
However, more detailed description than necessary may be omitted.
For example, detailed description of already well-known matters or redundant description of
substantially the same configuration may be omitted. This is to avoid unnecessary redundancy in
the following description and to facilitate understanding by those skilled in the art.
[0012]
It is to be understood that the attached drawings and the following description are provided to
enable those skilled in the art to fully understand the present disclosure, and they are not
intended to limit the claimed subject matter.
[0013]
First Embodiment A wearable translation apparatus according to a first embodiment will be described below with reference to FIGS. 1 to 8.
[0014]
[1−1. Configuration]
FIG. 1 is a block diagram showing a configuration of translation system 100 according to the first embodiment.
The translation system 100 includes a wearable translation device 1, an access point device 2, a
speech recognition server device 3, a machine translation server device 4, and a speech synthesis
server device 5.
[0015]
The wearable translation device 1 can be attached to a predetermined position of the user's body.
The wearable translation device 1 is attached to, for example, the chest or abdomen of a user.
The wearable translation device 1 wirelessly communicates with the access point device 2. The
access point device 2 communicates with the speech recognition server device 3, the machine
translation server device 4, and the speech synthesis server device 5 via, for example, the
Internet. Therefore, the wearable translation device 1 communicates with the speech recognition
server device 3, the machine translation server device 4, and the speech synthesis server device
5 via the access point device 2. The speech recognition server device 3 converts the speech
signal into text. The machine translation server device 4 converts the text of the first language
into the text of the second language, and converts the text of the second language into the text of
the first language. The speech synthesis server device 5 converts the text into a speech signal.
[0016]
The speech recognition server device 3, the machine translation server device 4, and the speech synthesis server device 5 are computer devices each having a control circuit such as a CPU and a memory. In the speech recognition server device 3, the control circuit executes processing for converting a speech signal of the first language into text of the first language according to a predetermined program. In the machine translation server device 4, the control circuit executes processing for converting the text of the first language into text of the second language according to a predetermined program. In the speech synthesis server device 5, the control circuit converts the text of the second language into a speech signal of the second language according to a predetermined program. In the present embodiment, the speech recognition server device 3, the machine translation server device 4, and the speech synthesis server device 5 are configured as separate computer devices, but they may be configured as a single server device. Alternatively, their functions may be distributed over and executed by a plurality of server devices.
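Purely for illustration, the division of labor among the three server devices can be summarized as three conversion interfaces. The sketch below is a hypothetical Python outline; the class and method names are not taken from this disclosure, and the actual recognition, translation, and synthesis algorithms are left abstract.

    from abc import ABC, abstractmethod

    class SpeechRecognizer(ABC):
        """Role of the speech recognition server device 3: speech signal -> text."""
        @abstractmethod
        def recognize(self, audio: bytes, language: str) -> str: ...

    class MachineTranslator(ABC):
        """Role of the machine translation server device 4: text -> text."""
        @abstractmethod
        def translate(self, text: str, source: str, target: str) -> str: ...

    class SpeechSynthesizer(ABC):
        """Role of the speech synthesis server device 5: text -> speech signal."""
        @abstractmethod
        def synthesize(self, text: str, language: str) -> bytes: ...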
[0017]
In the present embodiment, the case where the user of the wearable translation apparatus 1 is a
speaker of the first language and talks with the speaker of the second language facing the user
will be described. Further, in the present embodiment, the case where the first language is
Japanese and the second language is English will be described. In the following description, it is
assumed that the speaker in the second language does not speak but participates in the
conversation only as a listener. Also, re-translation means translating the result of translating one
language into a different language back to the original language.
[0018]
The wearable translation device 1 includes a control circuit 11, a position measurement device
12, a microphone device 13, a wireless communication circuit 14, a speaker device 15, an audio
processing circuit 16, and a speaker device 17. The position measurement device 12 measures the relative position of the hearing organ (for example, the right ear, the left ear, or both ears) of the user 31 with respect to the speaker device 17. The microphone device 13 acquires speech in the first language from the user and converts it into a speech signal of the first language.
The wireless communication circuit 14 communicates with the voice recognition server device 3,
the machine translation server device 4, and the voice synthesis server device 5 outside the
wearable translation device 1 via the access point device 2. Via the wireless communication circuit 14, the control circuit 11 acquires, from the speech recognition server device 3, the machine translation server device 4, and the speech synthesis server device 5, a speech signal of the second language translated from the speech signal of the first language, and also acquires a speech signal of the first language output as a result of retranslating the speech signal of the second language. The speech processing circuit 16 executes predetermined processing on the speech signal of the first language output as a result of the retranslation. The speaker device 15 converts the speech signal of the second language into sound and outputs it. The speaker device 17 converts the processed speech signal of the first language into sound and outputs it.
[0019]
The wearable translation device 1 includes a plurality of speakers for converting a voice signal of
the second language and / or a processed voice signal of the first language into voice and
outputting the voice. At least one of the plurality of speakers constitutes the first speaker device
15, and at least two of the plurality of speakers constitute the second speaker device 17.
[0020]
FIG. 2 is a diagram showing a first example of a state in which the user 31 wears the wearable translation device 1 of the translation system 100 according to the first embodiment. The wearable translation device 1 is worn on the chest or abdomen of the user 31 by being hung around the neck of the user 31 with, for example, the strap 21. As illustrated in FIG. 2, the microphone device 13 includes, for example, a microphone array including at least two microphones disposed at a predetermined distance from each other in the direction perpendicular to the ground when the user 31 wears the wearable translation device 1. The microphone device 13 has a beam in the direction from the microphone device 13 toward the speech organ 31a of the user (for example, the mouth). Here, the speech organ is not limited to the user's mouth; it may be a region including the area around the mouth, such as the user's chin and the area below the nose, from which distance information with respect to the speaker device 17 can be obtained. As shown in FIG. 8, the speaker device 15 is provided so as to output sound toward a listener facing the user 31 when the user 31 wears the wearable translation device 1. As shown in FIG. 8, the speaker device 17 is provided so as to output sound toward the hearing organ 31b (for example, the right ear, the left ear, or both ears) of the user 31 when the user 31 wears the wearable translation device 1. When the user 31 wears the wearable translation device 1 as shown in FIG. 2, for example, the speaker device 15 is provided on the front surface of the wearable translation device 1 and the speaker device 17 is provided on the upper surface of the wearable translation device 1.
[0021]
FIG. 3 is a diagram showing a second example of a state in which the user 31 wears the wearable translation device 1 of the translation system 100 according to the first embodiment. The wearable translation device 1 may be attached to the chest or abdomen area of the clothing of the user 31 with a pin or the like. The wearable translation device 1 may be configured, for example, as a name tag type device.
[0022]
FIG. 4 is a diagram showing a third example of a state in which the user 31 wears the wearable
translation device 1 of the translation system 100 according to the first embodiment. The
wearable translation device 1 may be attached to the arm of the user 31 by a belt 22, for
example.
[0023]
In the wearable translation device 1 of FIG. 1, the audio processing circuit 16 processes the speech signal of the first language output as a result of retranslation (directivity processing) so that the sound output from the speaker device 17 is directed toward the hearing organ 31b of the user 31, based on the relative position of the hearing organ 31b of the user 31 with respect to the speaker device 17.
[0024]
[1−2. Operation]
FIG. 5 is a sequence diagram showing a first part of the operation of translation
system 100 according to the first embodiment. When a voice signal of Japanese (first language)
is input from the user 31 via the microphone device 13, the control circuit 11 sends the input
voice signal to the voice recognition server device 3. The voice recognition server device 3
performs voice recognition on the input voice signal to generate recognized Japanese text and
sends it to the control circuit 11. When the Japanese text is sent from the speech recognition
server device 3, the control circuit 11 sends the Japanese text to the machine translation server
device 4 together with a control signal instructing translation from Japanese to English. The
machine translation server device 4 performs machine translation of Japanese text to generate
translated English (second language) text and sends it to the control circuit 11. When the English
text is sent from the machine translation server device 4, the control circuit 11 sends the English
text to the speech synthesis server device 5. The speech synthesis server device 5 performs
speech synthesis of English text to generate a synthesized English speech signal, and sends it to
the control circuit 11. When an English voice signal is sent from the voice synthesis server device
5, the control circuit 11 converts the English voice signal into voice by the speaker device 15 and
outputs it.
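For reference, the forward half of the sequence in FIG. 5 can be sketched in Python along the following lines. This is a simplified, hypothetical sketch that reuses the interfaces outlined after paragraph [0016]; `control.play_on_listener_speaker` is an assumed placeholder, and transport via the access point device 2 and audio I/O details are omitted.

    def forward_translation_round(control, recognizer, translator, synthesizer, mic_audio):
        """Sketch of FIG. 5: Japanese speech in, synthesized English speech out of speaker device 15."""
        ja_text = recognizer.recognize(mic_audio, language="ja")            # speech recognition server device 3
        en_text = translator.translate(ja_text, source="ja", target="en")  # machine translation server device 4
        en_audio = synthesizer.synthesize(en_text, language="en")          # speech synthesis server device 5
        control.play_on_listener_speaker(en_audio)                         # speaker device 15, toward the listener
        return en_audio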
[0025]
FIG. 6 is a sequence diagram showing a second part of the operation of the translation system 100 according to the first embodiment. FIG. 6 shows the operation following FIG. 5. When a voice signal
of English (second language) is sent from the voice synthesis server device 5, the control circuit
11 sends the voice signal of English to the voice recognition server device 3 for re-translation.
The speech recognition server device 3 performs speech recognition on an English speech signal,
generates recognized English text, and sends it to the control circuit 11. When the English text is
sent from the speech recognition server device 3, the control circuit 11 sends the English text to
the machine translation server device 4 together with a control signal instructing retranslation
from English to Japanese. The machine translation server device 4 performs machine translation
of the English text, generates Japanese (first language) text output as a result of retranslation,
and sends it to the control circuit 11. When the Japanese text is sent from the machine
translation server device 4, the control circuit 11 sends the Japanese text to the speech synthesis
server device 5. The speech synthesis server device 5 performs speech synthesis of Japanese
text, generates a synthesized Japanese speech signal, and sends it to the control circuit 11. When
the Japanese speech signal is sent from the speech synthesis server device 5, the control circuit
11 sends the Japanese speech signal to the speech processing circuit 16. Based on the relative position of the hearing organ 31b of the user 31 with respect to the speaker device 17, the speech processing circuit 16 processes the speech signal of the first language output as a result of the retranslation so that the sound output from the speaker device 17 is directed toward the hearing organ 31b of the user. The speech processing circuit 16 converts the processed speech signal into sound and outputs it from the speaker device 17.
[0026]
Note that when it is not detected that the hearing organ 31b is located within a predetermined distance from the wearable translation device 1, or when it is not detected that the hearing organ 31b is located in a predetermined direction with respect to the wearable translation device 1 (the direction in which the speaker device 17 faces, for example, the upward direction), the audio processing circuit 16 may end the processing and refrain from outputting sound.
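The retranslation half of the sequence in FIG. 6, together with the guard condition of the preceding paragraph, could be sketched as follows. Again the names are only illustrative; `control.estimate_ear_position`, `audio_proc.steer_towards`, and `control.play_on_user_speaker` are assumed placeholders for the position measurement device 12, the predetermined processing of the speech processing circuit 16, and output from the speaker device 17.

    def retranslation_round(control, recognizer, translator, synthesizer, audio_proc, en_audio):
        """Sketch of FIG. 6: English speech back to Japanese, output toward the user's hearing organ."""
        en_text = recognizer.recognize(en_audio, language="en")
        ja_text = translator.translate(en_text, source="en", target="ja")  # retranslation English -> Japanese
        ja_audio = synthesizer.synthesize(ja_text, language="ja")

        ear_pos = control.estimate_ear_position()  # relative position of hearing organ 31b vs. speaker device 17
        if ear_pos is None:
            return None  # hearing organ not detected at the expected distance/direction: skip output
        directed = audio_proc.steer_towards(ja_audio, ear_pos)  # predetermined (directivity) processing
        control.play_on_user_speaker(directed)                  # speaker device 17, toward the user's ear
        return directed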
[0027]
FIG. 7 is a view for explaining measurement of the relative position of the hearing instrument
31b of the user 31 with respect to the speaker device 17 of the wearable translation device 1 of
the translation system 100 according to the first embodiment. The position measurement device
12 is provided on the upper surface of the wearable translation device 1 when the user 31 wears
the wearable translation device 1 as shown in FIG. 7, for example. The position measurement
device 12 includes a speaker and a microphone. The position measurement device 12 emits an
impulse signal toward the head of the user 31 by the speaker of the position measurement
device 12, and receives an impulse signal reflected by the lower jaw of the user 31 by the
microphone of the position measurement device 12. Thus, the position measurement device 12
measures the distance D from the position measurement device 12 to the lower jaw of the user
31. The relative position of the speaker device 17 with respect to the position measurement device 12 is known. The relative positions of the right ear and the left ear with respect to the lower jaw do not differ significantly between users and can therefore be set in advance. Therefore, when the user 31 wears the wearable translation device 1 as shown in FIG. 7, the relative position of the hearing organ 31b of the user 31 with respect to the speaker device 17 can be obtained by measuring the distance D.
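One straightforward way to turn the emitted impulse and its echo into the distance D is a time-of-flight calculation, sketched below. This is an assumption-laden illustration rather than the disclosed implementation: it presumes the emission and echo instants are available as sample indices and uses a nominal speed of sound of about 343 m/s.

    SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature

    def distance_from_echo(emit_sample: int, echo_sample: int, sample_rate_hz: int) -> float:
        """Estimate the distance D (meters) from the round-trip delay of the impulse reflected by the lower jaw."""
        round_trip_s = (echo_sample - emit_sample) / sample_rate_hz
        return SPEED_OF_SOUND_M_S * round_trip_s / 2.0  # halve: the sound travels out and back

    # Example: an echo detected 84 samples after emission at 48 kHz gives D of roughly 0.30 m.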
[0028]
Here, measuring the distance from the speaker device 17 to the lower jaw of the user 31 has been described as an example of detecting the position of the hearing organ 31b of the user 31 with respect to the speaker device 17, but another detection method may be used. That is, it is only necessary to be able to detect the position of the hearing organ 31b of the user 31 so that the sound of the speaker device 17 can be directed toward the hearing organ 31b of the user 31.
[0029]
The position measurement device 12 may measure the relative position of the hearing
instrument of the user 31 with respect to the speaker device 17 using, for example, the
technology of Patent Document 3 or 4.
[0030]
FIG. 8 is a diagram for explaining the directions of sounds output from the speaker devices 15
and 17 when the wearable translation device 1 of the translation system 100 according to the
first embodiment is used.
The user 31 is a speaker of the first language, and the user 31 faces the listener 32, who is a speaker of the second language. In a typical situation in which the user 31 and the listener 32 talk, the user 31 and the listener 32 face each other approximately 1 to 3 m apart while standing or sitting. When the user 31 wears the wearable translation device 1 as shown in FIG. 2, for example, the wearable translation device 1 is located below the hearing organ 31b of the user 31, somewhere in the range from immediately below the neck to the waist. Also, the hearing organs (both ears) 31b and 32b of the user 31 and the listener 32 lie in a horizontal plane parallel to the ground. In this case, in order to direct the sound output from the speaker device 17 toward the hearing organ of the user 31, for example, a technique of stereo dipole reproduction can be used. The speaker device 17 includes two speakers arranged in close proximity to each other in order to perform stereo dipole reproduction. Based on the relative position of the hearing organ 31b of the user 31 with respect to the speaker device 17 and the head-related transfer function of the user 31, the speech processing circuit 16 filters the speech signal of the first language output as a result of retranslation so that the sound output from the speaker device 17 is directed toward the hearing organ 31b of the user 31.
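One common way to realize the stereo dipole (transaural) reproduction mentioned above is crosstalk cancellation: the two-channel signal intended for the user's ears is pre-filtered with a regularized inverse of the 2x2 matrix of transfer functions from the two closely spaced speakers to the two ears, so that each ear mainly receives its own channel. The numpy sketch below is illustrative only; it assumes the four speaker-to-ear transfer functions have already been derived from the measured relative position and the user's head-related transfer function, sampled on the rfft frequency grid.

    import numpy as np

    def crosstalk_cancel(binaural_lr, h_matrix_f, beta=1e-3):
        """binaural_lr: (2, n) time-domain signals intended for the left/right ear.
        h_matrix_f: (n//2 + 1, 2, 2) transfer functions H[bin][ear, speaker].
        Returns (2, n) feeds for the two closely spaced speakers of speaker device 17."""
        n = binaural_lr.shape[1]
        ears_f = np.fft.rfft(binaural_lr, n=n, axis=1)
        feeds_f = np.zeros_like(ears_f)
        for k in range(ears_f.shape[1]):
            h = h_matrix_f[k]
            # Regularized inverse: feeds = (H^H H + beta*I)^-1 H^H ears, so that H @ feeds ~ ears.
            c = np.linalg.solve(h.conj().T @ h + beta * np.eye(2), h.conj().T)
            feeds_f[:, k] = c @ ears_f[:, k]
        return np.fft.irfft(feeds_f, n=n, axis=1)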
[0031]
The audio processing circuit 16 may perform the following processing instead of stereo dipole reproduction. The speaker device 17 includes a plurality of speakers disposed at a predetermined distance from one another. The audio processing circuit 16 distributes the retranslated speech signal of the first language into a plurality of audio signals corresponding to the plurality of speakers, and directs the sound output from the speaker device 17 toward the hearing organ 31b of the user 31. For example, the audio processing circuit 16 may adjust the phases of the distributed audio signals so that their arrival times at the left and right ears coincide; thereby, the direction of the sound output from the speaker device 17 can be changed.
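The alternative described in this paragraph amounts to delay (phase) steering across the speaker array: each speaker's copy of the retranslated signal is delayed so that the wavefronts add up coherently at the target position. A hypothetical sketch, assuming the speaker coordinates and the target position of the hearing organ are known in meters:

    import numpy as np

    def steer_towards_ear(signal, sample_rate_hz, speaker_positions, ear_position, c=343.0):
        """Return one delayed copy of `signal` per speaker so that the copies arrive in phase at `ear_position`."""
        speaker_positions = np.asarray(speaker_positions, dtype=float)   # (n_speakers, 3)
        ear = np.asarray(ear_position, dtype=float)                      # (3,)
        distances = np.linalg.norm(speaker_positions - ear, axis=1)
        delays_s = (distances.max() - distances) / c       # delay the closer speakers
        delays_smp = np.round(delays_s * sample_rate_hz).astype(int)
        out = np.zeros((len(speaker_positions), len(signal) + int(delays_smp.max())))
        for i, d in enumerate(delays_smp):
            out[i, d:d + len(signal)] = signal
        return out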
[0032]
The speaker device 15 may include a plurality of speakers arranged at a predetermined distance from each other, and may have a beam in a direction from the speaker device 15 toward a virtual person (for example, the listener 32) facing the user 31.
[0033]
The wearable translation device 1 may include a gravity sensor for detecting whether the wearable translation device 1 is substantially stationary.
If the wearable translation device 1 is not stationary, the accurate relative position of the hearing organ of the user 31 with respect to the speaker device 17 cannot be measured, so the measurement of the relative position may be stopped. Alternatively, when the wearable translation device 1 is not stationary, the relative position of the hearing organ of the user 31 with respect to the speaker device 17 may be measured roughly. The speech processing circuit 16 may then process the speech signal of the first language output as a result of the retranslation so that the sound output from the speaker device 17 is directed toward the hearing organ of the user, based on the roughly measured relative position.
[0034]
The position measurement device 12 may first measure the relative position of the hearing organ of the user 31 with respect to the speaker device 17 roughly (for example, when the user 31 puts on the wearable translation device 1). The speech processing circuit 16 may process the speech signal of the first language output as a result of the retranslation so that the sound output from the speaker device 17 is directed toward the hearing organ 31b of the user 31, based on the roughly measured relative position. Thereafter, the position measurement device 12 may measure the relative position of the hearing organ 31b of the user 31 with respect to the speaker device 17 more accurately. The speech processing circuit 16 may then process the speech signal of the first language output as a result of the retranslation so that the sound output from the speaker device 17 is directed toward the hearing organ of the user 31, based on the more accurate relative position.
[0035]
[1−3. Effects etc.]
The wearable device corresponding to the wearable translation device 1 according to the first embodiment can be worn at a predetermined position on the body of the user 31, and includes a microphone device 13 that acquires speech in the first language from the user 31 and converts it into a speech signal of the first language. It further includes a control circuit 11 that acquires a speech signal of the second language converted from the speech signal of the first language and acquires a speech signal of the first language reconverted from the speech signal of the second language, and an audio processing circuit 16 that executes predetermined processing on the reconverted speech signal of the first language. The wearable translation device 1 further includes a first speaker device, corresponding to the speaker device 15, that converts the speech signal of the second language into sound and outputs it, and a second speaker device, corresponding to the speaker device 17, that converts the speech signal of the first language subjected to the predetermined processing into sound and outputs it. The audio processing circuit 16 executes the processing of the reconverted speech signal of the first language so that the sound output from the second speaker device is directed toward the hearing organ 31b of the user, based on the relative position of the hearing organ of the user 31 with respect to the second speaker device. As a result, even when a conversation between speakers of different languages is translated, the translation result is retranslated, and feedback is provided only by voice without providing a display for displaying the retranslation result, it is possible to provide a wearable device, corresponding to the wearable translation device 1, that is less likely to lose the naturalness of the conversation. As a result, it is possible to provide the user with a translation experience, such as simplicity and lightness, unique to a wearable translation device. In addition, since the reconverted voice is reproduced pinpointed at the user's ear, the user 31 can easily recognize the reconverted voice and can confirm by voice alone whether the content of the translated speech is correct, without providing a display.
[0036]
The wearable translation device 1 according to the first embodiment may be worn on the chest or abdomen of the user 31. As a result, it is possible to provide the user with a translation experience, such as simplicity and lightness, unique to a wearable translation device.
[0037]
In the wearable device corresponding to the wearable translation device 1 according to the first embodiment, the second speaker device corresponding to the speaker device 17 may include two speakers arranged in close proximity to each other and may perform stereo dipole reproduction. In addition, the audio processing circuit 16 may filter the reconverted speech signal of the first language based on the relative position of the hearing organ 31b of the user 31 with respect to the second speaker device corresponding to the speaker device 17 and the head-related transfer function of the user 31. This enables pinpoint reproduction of the reconverted voice at the user's ear using the existing technique of stereo dipole reproduction.
[0038]
In the wearable device corresponding to the wearable translation device 1 according to the first embodiment, the second speaker device corresponding to the speaker device 17 may include a plurality of speakers arranged at a predetermined distance from each other. Further, the audio processing circuit 16 may distribute the reconverted speech signal of the first language into a plurality of audio signals corresponding to the plurality of speakers and adjust the phases of the distributed audio signals. This allows pinpoint reproduction of the reconverted voice at the user's ear using existing beamforming techniques.
[0039]
In the wearable device corresponding to the wearable translation device 1 according to the first embodiment, the microphone device 13 may include a plurality of microphones arranged at a predetermined distance from each other. In addition, the microphone device 13 may have a beam in the direction from the microphone device 13 toward the speech organ 31a of the user 31. As a result, the microphone device 13 becomes less susceptible to sounds other than the speech of the user 31 (for example, the speech of the listener 32 in FIG. 8).
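The receive-side counterpart of such a beam is a delay-and-sum beamformer over the microphone array: the microphone signals are time-aligned for a wavefront arriving from the speech organ 31a and then averaged, which attenuates sound arriving from other directions such as the listener's voice. The sketch below is illustrative only and assumes the array geometry and the mouth position are known.

    import numpy as np

    def delay_and_sum(mic_signals, sample_rate_hz, mic_positions, source_position, c=343.0):
        """mic_signals: (n_mics, n_samples); mic_positions: (n_mics, 3); source_position: (3,) = mouth 31a."""
        mic_positions = np.asarray(mic_positions, dtype=float)
        distances = np.linalg.norm(mic_positions - np.asarray(source_position, dtype=float), axis=1)
        # Advance the farther microphones so the mouth's wavefront lines up across channels.
        delays_smp = np.round((distances - distances.min()) / c * sample_rate_hz).astype(int)
        # np.roll wraps at the edges; a real implementation would pad instead of wrapping.
        aligned = np.stack([np.roll(sig, -int(d)) for sig, d in zip(mic_signals, delays_smp)])
        return aligned.mean(axis=0)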
[0040]
In the wearable device corresponding to the wearable translation device 1 according to the first embodiment, the first speaker device corresponding to the speaker device 15 may include a plurality of speakers arranged at a predetermined distance from each other. Further, the first speaker device corresponding to the speaker device 15 may have a beam in a direction from the speaker device 15 toward the virtual person facing the user 31. As a result, the user 31 is less likely to be affected by the translated second-language speech and can more easily recognize the reconverted first-language speech.
[0041]
The wearable device corresponding to the wearable translation device 1 according to the first embodiment may further include a position measurement device 12 for measuring the relative position of the hearing organ 31b of the user 31 with respect to the second speaker device corresponding to the speaker device 17. Thereby, the reconverted voice can be reproduced pinpointed at the user's ear based on the actually measured relative position of the hearing organ of the user 31 with respect to the speaker device 17.
[0042]
The translation system 100 according to the first embodiment may include a wearable device corresponding to the wearable translation device 1 that further includes a communication circuit corresponding to the wireless communication circuit 14, and may include, outside the wearable device, the speech recognition server device 3, the machine translation server device 4, and the speech synthesis server device 5. The speech recognition server device 3 may convert the speech signal of the first language into text of the first language and convert the speech signal of the second language into text of the second language. The machine translation server device 4 may convert the text of the first language into text of the second language and reconvert the text of the second language into text of the first language. The speech synthesis server device 5 may convert the text of the second language into a speech signal of the second language and convert the text of the first language into a speech signal of the first language. The control circuit 11 may acquire the speech signal of the second language and the reconverted speech signal of the first language from the speech synthesis server device 5 via the communication circuit corresponding to the wireless communication circuit 14. Thereby, the configuration of the wearable translation device 1 can be simplified. For example, the speech recognition server device 3, the machine translation server device 4, and the speech synthesis server device 5 may be provided by a third party (a cloud service) different from the manufacturer or seller of the wearable translation device 1. By using such a cloud service, for example, a multilingual wearable translation device can be provided at low cost.
[0043]
Second Embodiment A wearable translation apparatus according to a second embodiment will now be described with reference to FIG. 9.
[0044]
The same components as those of the translation system 100 and the wearable translation device
1 according to the first embodiment may be denoted by the same reference numerals, and the
description thereof may be omitted.
[0045]
[2−1. Configuration]
FIG. 9 is a block diagram showing a configuration of translation system 200 according to the second embodiment.
The wearable translation device 1A of the translation system 200 according to the present embodiment includes a user input device 18 in place of the position measurement device 12 of FIG. 1. Otherwise, the wearable translation device 1A of FIG. 9 is configured in the same manner as the wearable translation device 1 of FIG. 1.
[0046]
[2−2. Operation]
The user input device 18 acquires a user input specifying the relative position of the hearing organ 31b (FIG. 7) of the user 31 with respect to the speaker device 17. The user input device 18 includes, for example, a touch panel, buttons, and the like.
[0047]
In the wearable translation apparatus 1A, a plurality of predetermined distances (for example, far
(60 cm), middle (40 cm), and near (20 cm)) corresponding to the distance D in FIG. 7 are set to
be selectable. The user can select one of these distances using the user input device 18. The
control circuit 11 obtains the relative position of the hearing instrument 31 b of the user 31 with
respect to the speaker device 17 based on the distance input from the user input device 18 in
this manner.
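The selected label can simply be mapped to a preset distance D and combined with preset jaw-to-ear offsets to approximate the position of the hearing organ 31b. The distances below are the example values from this paragraph; the offset vector and coordinate convention are hypothetical assumptions, not part of this disclosure.

    PRESET_DISTANCES_M = {"far": 0.60, "middle": 0.40, "near": 0.20}  # distance D of FIG. 7

    def ear_position_from_selection(selection, jaw_to_ear_offset=(0.02, 0.07, 0.0)):
        """Approximate (up, side, front) position of the hearing organ relative to speaker device 17,
        assuming the jaw lies straight above the device at distance D and a fixed jaw-to-ear offset."""
        d = PRESET_DISTANCES_M[selection]
        up, side, front = jaw_to_ear_offset
        return (d + up, side, front)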
[0048]
[2−3. Effects etc.]
The wearable device corresponding to the wearable translation device 1A according to the second embodiment may further include a user input device 18 that acquires a user input specifying the relative position of the hearing organ 31b of the user 31 with respect to the second speaker device corresponding to the speaker device 17. By removing the position measurement device 12 of FIG. 1, the configuration of the wearable translation device 1A of FIG. 9 is simplified compared with the wearable translation device 1 of FIG. 1.
[0049]
Third Embodiment A wearable translation apparatus according to a third embodiment will now be described with reference to FIGS. 10 and 11.
[0050]
The same components as those of the translation system 100 and the wearable translation device
1 according to the first embodiment may be denoted by the same reference numerals, and the
description thereof may be omitted.
[0051]
[3−1. Configuration]
FIG. 10 is a block diagram showing the configuration of the translation system 300 according to the third embodiment.
Translation system 300 includes wearable translation device 1, access point device 2, and
translation server device 41. The translation server device 41 includes a speech recognition
server device 3A, a machine translation server device 4A, and a speech synthesis server device
5A. The wearable translation device 1 and the access point device 2 of FIG. 10 are configured in
the same manner as the wearable translation device 1 and the access point device 2 of FIG. 1,
respectively. The speech recognition server device 3A, the machine translation server device 4A, and the speech synthesis server device 5A in FIG. 10 have the same functions as the speech recognition server device 3, the machine translation server device 4, and the speech synthesis server device 5 in FIG. 1, respectively. The access point device 2 communicates with the
translation server device 41 via, for example, the Internet. Accordingly, the wearable translation
device 1 communicates with the translation server device 41 via the access point device 2.
[0052]
[3−2. Operation]
FIG. 11 is a sequence diagram showing an operation of translation system
300 according to the third embodiment. When an audio signal of Japanese (first language) is
input from the user 31 via the microphone device 13, the control circuit 11 sends the input
audio signal to the translation server device 41. The voice recognition server device 3A of the
translation server device 41 performs voice recognition on the input voice signal, generates a
recognized Japanese text, and sends it to the machine translation server device 4A. The machine
translation server device 4A performs machine translation of Japanese text to generate
translated English (second language) text and sends it to the speech synthesis server device 5A.
The voice synthesis server device 5A performs voice synthesis of English text to generate a
synthesized English voice signal and sends it to the control circuit 11. When an English speech
signal is sent from the speech synthesis server device 5A, the control circuit 11 converts the
English speech signal into speech by the speaker device 15 and outputs the speech.
[0053]
When a voice signal of English (second language) is sent from the voice synthesis server device
5A, the control circuit 11 sends the voice signal of English to the translation server device 41 for
re-translation. The voice recognition server device 3A of the translation server device 41
performs voice recognition on the English voice signal to generate a recognized English text, and
sends it to the machine translation server device 4A. The machine translation server device 4A
performs machine translation of the English text to generate a retranslated Japanese (first
language) text and sends it to the speech synthesis server device 5A. The speech synthesis server
device 5 A performs speech synthesis of Japanese text, generates a synthesized Japanese speech
signal, and sends it to the wearable translation device 1. The control circuit 11 sends a Japanese
voice signal to the voice processing circuit 16 when a voice signal of Japanese is sent from the
voice synthesis server device 5A. The voice processing circuit 16 is output as a result of
retranslation so that the voice output from the speaker device 17 is directed to the direction of
the user 31 hearing device based on the relative position of the user 31 hearing device 31b to
the speaker device 17. Processing of the first language speech signal. The voice processing
circuit 16 converts the processed voice signal into voice by the speaker device 17 and outputs it.
[0054]
[3−3. Effects etc.]
The translation system 300 according to the third embodiment may
include the speech recognition server device 3A, the machine translation server device 4A, and
the speech synthesis server device 5A as an integrated translation server device 41. As a result,
the number of times of communication can be reduced compared to the translation system 100
including the wearable translation device 1 according to the first embodiment, and time and
power consumption for communication can be reduced.
[0055]
Fourth Embodiment A wearable translation apparatus according to a fourth embodiment will now be described with reference to FIG. 12.
[0056]
The same components as those of the translation system 100 and the wearable translation device
1 according to the first embodiment may be denoted by the same reference numerals, and the
description thereof may be omitted.
[0057]
[4−1. Configuration]
FIG. 12 is a block diagram showing the configuration of the wearable translation device 1B according to the fourth embodiment.
The wearable translation device 1B of FIG. 12 has the functions of the speech recognition server device 3, the machine translation server device 4, and the speech synthesis server device 5 of FIG. 1.
The wearable translation device 1B includes a control circuit 11B, a position measurement device
12, a microphone device 13, a speaker device 15, a voice processing circuit 16, a speaker device
17, a voice recognition circuit 51, a machine translation circuit 52, and a voice synthesis circuit
53. The position measurement device 12, the microphone device 13, the speaker device 15, the
audio processing circuit 16, and the speaker device 17 in FIG. 12 are respectively configured in
the same manner as the corresponding components in FIG. 1. The speech recognition circuit 51,
the machine translation circuit 52, and the speech synthesis circuit 53 respectively have the
same functions as the speech recognition server device 3, the machine translation server device 4, and the speech synthesis server device 5 of FIG. 1. Using the speech recognition circuit 51, the machine translation circuit 52, and the speech synthesis circuit 53, the control circuit 11B obtains the speech signal of the second language translated from the speech signal of the first language, and also obtains the speech signal of the first language output as a result of retranslating the speech signal of the second language.
[0058]
[4−2. Operation]
When an audio signal of Japanese (first language) is input from the user
31 via the microphone device 13, the control circuit 11B sends the input audio signal to the
speech recognition circuit 51. The speech recognition circuit 51 performs speech recognition on
the input speech signal to generate recognized Japanese text and sends it to the control circuit
11B. When the Japanese text is sent from the speech recognition circuit 51, the control circuit
11B sends the Japanese text to the machine translation circuit 52 together with a control signal
instructing translation from Japanese to English. The machine translation circuit 52 performs
machine translation of Japanese text to generate translated English (second language) text and
sends it to the control circuit 11B. When the English text is sent from the machine translation
circuit 52, the control circuit 11B sends the English text to the speech synthesis circuit 53. The
speech synthesis circuit 53 performs speech synthesis of English text to generate a synthesized
English speech signal and sends it to the control circuit 11B. When an English speech signal is
sent from the speech synthesis circuit 53, the control circuit 11B converts the English speech
signal into speech by the speaker device 15 and outputs the speech.
[0059]
When a speech signal in English (second language) is sent from the speech synthesis circuit 53,
the control circuit 11B sends the speech signal in English to the speech recognition circuit 51 for
retranslation. The speech recognition circuit 51 performs speech recognition on an English
speech signal to generate recognized English text and sends it to the control circuit 11B. When
the English text is sent from the speech recognition circuit 51, the control circuit 11B sends the
English text to the machine translation circuit 52 together with a control signal instructing
retranslation from English to Japanese. The machine translation circuit 52 performs machine
translation of the English text to generate a retranslated Japanese (first language) text and sends
it to the control circuit 11B. When the Japanese text is sent from the machine translation circuit
52, the control circuit 11B sends the Japanese text to the speech synthesis circuit 53. The speech
synthesis circuit 53 performs speech synthesis of Japanese text, generates a synthesized
Japanese speech signal, and sends it to the control circuit 11B. When the Japanese speech signal
is sent from the speech synthesis circuit 53, the control circuit 11B sends the Japanese speech
signal to the speech processing circuit 16. Based on the relative position of the hearing organ 31b of the user 31 with respect to the speaker device 17, the voice processing circuit 16 processes the speech signal of the first language output as a result of the retranslation so that the sound output from the speaker device 17 is directed toward the hearing organ 31b of the user. The voice processing circuit 16 converts the processed voice signal into voice by the speaker device 17 and outputs it.
[0060]
After the speech recognition circuit 51 performs speech recognition to generate a recognized
first language text, the speech recognition circuit 51 may send it to the machine translation
circuit 52 instead of the control circuit 11B. Similarly, after machine translation circuit 52
performs machine translation to generate translated or retranslated text, it may send it to speech
synthesis circuit 53 instead of control circuit 11B.
[0061]
[4−3. Effects etc.]
The wearable device corresponding to the wearable translation device 1B according to the fourth embodiment may further include a speech recognition circuit 51 that converts a speech signal of the first language into text of the first language, a machine translation circuit 52 that converts the text of the first language into text of the second language, and a speech synthesis circuit 53 that converts the text of the second language into a speech signal of the second language. Further, the control circuit 11B may acquire the speech signal of the second language from the speech synthesis circuit 53; the speech recognition circuit 51 may convert the speech signal of the second language into text of the second language; the machine translation circuit 52 may reconvert the text of the second language into text of the first language; and the speech synthesis circuit 53 may convert the reconverted text of the first language into a speech signal of the first language. The control circuit 11B may then acquire the reconverted speech signal of the first language from the speech synthesis circuit 53. As a result, the wearable translation device 1B can translate a conversation between speakers of different languages without communicating with an external server device.
[0062]
Other Embodiments As described above, the first to fourth embodiments have been described as examples of the technology disclosed in the present application. However, the technology in the present disclosure is not limited to these embodiments, and is also applicable to embodiments in which changes, replacements, additions, omissions, and the like are made as appropriate. Moreover, the components described in the first to fourth embodiments may be combined to form a new embodiment.
[0063]
Therefore, other embodiments will be exemplified below.
[0064]
In the first to third embodiments, the wireless communication circuit 14 has been described as an example of the communication circuit of the wearable translation device, but the communication circuit may be any circuit that can communicate with the external speech recognition server device, machine translation server device, and speech synthesis server device. Therefore, the wearable translation device may be connected by wire to the external speech recognition server device, machine translation server device, and speech synthesis server device.
[0065]
In the first to fourth embodiments, the control circuit, the communication circuit, and the audio processing circuit of the wearable translation device are shown as separate blocks, but these circuits may be configured as a single integrated circuit chip. Also, the functions of the control circuit, the communication circuit, and the audio processing circuit of the wearable translation device may be implemented by a program executed on a general-purpose processor.
[0066]
In the first to fourth embodiments, the case where only one user (speaker) uses the wearable translation apparatus has been described, but each of a plurality of speakers of different languages who wish to talk to each other may use a wearable translation apparatus.
[0067]
In the first to fourth embodiments, the speech signal of the first language output as a result of retranslation is processed so that the sound output from the speaker device 17 is directed toward the hearing organ 31b of the user 31. However, the speech signal of the first language output as a result of retranslation may instead be processed so that the sound output from the speaker device 17 is directed in a direction other than toward the hearing organ of the user 31.
[0068]
In the first to fourth embodiments, the case where the first language is Japanese and the second language is English has been described, but the first language and the second language may each be any other language.
[0069]
In the first and second embodiments, it has been described that the speech recognition server device 3 performs speech recognition of both the first language and the second language, that the machine translation server device 4 performs translation from the first language into the second language and from the second language into the first language, and that the speech synthesis server device 5 performs speech synthesis of both the first language and the second language.
However, separate speech recognition server devices may be used to perform speech recognition of the first language and speech recognition of the second language.
Separate machine translation server devices may be used to perform translation from the first language into the second language and translation from the second language into the first language.
Separate speech synthesis server devices may be used to perform speech synthesis of the first language and speech synthesis of the second language. The same applies to the translation server device 41 of the third embodiment and to the speech recognition circuit 51, the machine translation circuit 52, and the speech synthesis circuit 53 of the fourth embodiment.
[0070]
In the first to fourth embodiments, the translated speech signal of the second language is converted into sound by the speaker device 15 and output, and then the retranslation of the speech signal of the second language is performed. However, the control circuit 11 may defer converting the speech signal of the second language into sound and outputting it from the speaker device 15 until the speech signal of the first language output as a result of retranslation is acquired. The control circuit 11 may then perform the step of converting the speech signal of the second language into sound and outputting it from the speaker device 15 and the step of converting the processed, retranslated speech signal of the first language into sound and outputting it from the speaker device 17 substantially simultaneously. The wearable translation apparatus may further include a user input device, and the control circuit 11 may output the translated speech of the second language only after the speech of the first language output as a result of the retranslation has been output and a user input indicating that its content is appropriate has been obtained through the user input device. In this case, when a user input indicating that the content of the first-language speech output as a result of retranslation is not appropriate is obtained through the user input device, the control circuit 11 may acquire another candidate for the translated text of the second language from the machine translation server device 4, and may output the speech signal of the first language obtained by retranslating that text of the second language.
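The confirm-before-output variant described above could be organized as in the sketch below. The control flow mirrors the paragraph, but every name is an illustrative placeholder: `translator.candidates` stands in for requesting alternative translation candidates from the machine translation server device 4, and `control.wait_for_user_approval` stands in for the user input device.

    def confirmed_translation_round(control, recognizer, translator, synthesizer, mic_audio):
        """Output the translated speech only after the user approves the retranslation result."""
        ja_text = recognizer.recognize(mic_audio, language="ja")
        for en_text in translator.candidates(ja_text, source="ja", target="en"):  # best candidate first
            back_ja = translator.translate(en_text, source="en", target="ja")     # retranslation
            control.play_on_user_speaker(synthesizer.synthesize(back_ja, language="ja"))
            if control.wait_for_user_approval():  # user judges whether the retranslated content is appropriate
                control.play_on_listener_speaker(synthesizer.synthesize(en_text, language="en"))
                return en_text
        return None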
[0071]
As described above, the embodiment has been described as an example of the technology in the
present disclosure. For that purpose, the attached drawings and the detailed description are
provided.
[0072]
Therefore, among the components described in the attached drawings and the detailed description, not only components essential for solving the problem but also components that are not essential for solving the problem may be included in order to exemplify the above-mentioned technology. Therefore, the fact that those non-essential components are described in the attached drawings or the detailed description should not be taken to mean that those non-essential components are essential.
[0073]
Moreover, since the above-mentioned embodiments are for illustrating the technology in the present disclosure, various changes, substitutions, additions, omissions, and the like can be made within the scope of the claims or their equivalents.
[0074]
According to the present disclosure, it is possible to provide a wearable translation device that is
less likely to lose the naturalness of conversation when translating a conversation between
speakers of different languages and retranslating the translation result.
[0075]
1, 1A, 1B Wearable translation device
2 Access point device
3, 3A Speech recognition server device
4, 4A Machine translation server device
5, 5A Speech synthesis server device
11, 11B Control circuit
12 Position measurement device
13 Microphone device
14 Wireless communication circuit
15, 17 Speaker device
16 Speech processing circuit
18 User input device
21 Strap
22 Belt
31 User (speaker)
32 Listener
41 Translation server device
51 Speech recognition circuit
52 Machine translation circuit
53 Speech synthesis circuit