JPH0713582

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JPH0713582
[0001]
BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to a
portable voice recognition output assisting apparatus that is easy to carry and suitable for use by
people who produce unclear speech, such as people whose vocal cords have been removed and
people with physical disabilities. In particular, it relates to a portable voice recognition output
assisting device that can properly recognize unclear speech and assist the user in speaking.
[0002]
2. Description of the Related Art A conventional speech recognition apparatus stores in advance
a large number of standard speech patterns corresponding to speech produced by healthy
persons. A speech pattern uttered by a healthy person is compared and collated with the stored
standard speech patterns, and if a standard speech pattern matches the uttered pattern, the
utterance is recognized from that standard pattern.
[0003]
On the other hand, speech synthesizers, which are apparatuses for outputting speech, use one of
two methods. In the first, speech uttered by an announcer is recorded, compressed to a low bit
rate by an analysis method, and reproduced at output time. In the second, a rule-based synthesis
method, single sounds corresponding to kana characters are combined, and accent and
intonation are superimposed on them.
01-05-2019
1
The former is used as the output of voice response devices, for example in order entry
applications in combination with push-button (PB) input from a touch-tone phone. For the latter,
technology has been developed to convert Japanese and English sentences directly into speech,
and much is expected of its future development.
[0004]
However, a conventional speech recognition apparatus recognizes speech from patterns uttered
from the mouth of a healthy person. It cannot recognize at all the speech of a non-healthy person
who produces unclear speech, for example a person whose vocal cords have been surgically
removed or a person who has lost the tongue to tongue cancer. This is not only because the
unclear speech itself cannot be recognized, but also because the apparatus detects the air
vibration of speech uttered from the mouth. A person whose vocal cords have been removed or
who has lost the tongue does not utter speech from the mouth in the first place, so the apparatus
cannot be applied.
[0005]
With future technological progress, it will become possible to recognize the speech of an
unspecified number of speakers, and various devices using speech recognition will come into
daily use. In any case, however, these devices are expected to be developed for healthy people.
It therefore seems very difficult for non-healthy people with various disabilities, whose speech is
unclear or whose rate of utterance is slow, to make full use of such new devices.
[0006]
On the other hand, the speech synthesizers described above do not produce a speech signal that
carries the many words or the emotions uttered by an individual, so they are still insufficient
from the viewpoint of conversation.
[0007]
The present invention has been made in view of the above-described circumstances, and an
object of the present invention is to provide a portable voice recognition output assisting device
capable of reliably inputting a signal corresponding to voice even for a person who cannot utter
voice from the mouth.
[0008]
In addition, another object of the present invention is to provide a portable voice recognition
output assisting device that correctly recognizes unclear speech uttered by a non-healthy person
and realizes speech synthesis that includes the emotions of the person who utters the speech.
[0009]
Furthermore, another object of the present invention is to provide a portable voice recognition
output assisting device that generates an appropriate voice signal while considering the
condition of the body of a non-healthy person.
Yet another object of the present invention is to provide a portable voice recognition output
assisting apparatus that a non-healthy person can wear easily and that is highly operable.
[0010]
In order to solve the above-mentioned problems, the invention according to claim 1 is a portable
voice recognition output assisting device having a voice input/output device in which flat voice
input means, which detects the vibration generated from a vibration generating body and
converts it into an electrical vibration frequency signal, is attached to the back side of a
strip-shaped mounting body that is formed of a sound absorbing cloth and is wound around and
fixed to the vibration generating body, and flat voice output means, which outputs a voice signal
corresponding to the vibration frequency signal, is attached to the front side of the mounting
body.
[0011]
Next, the invention according to claim 2 is a portable voice recognition output assisting device
comprising: a voice input/output device having voice input means for detecting vibration
generated from a vibration generating body and outputting an electrical vibration frequency
signal, and voice output means for outputting a voice signal according to the vibration frequency
signal input by the voice input means; a voice recognition unit for recognizing the voiceprint, the
strength and weakness of the sound, and the generated sound from the vibration frequency
signal input from the voice input means; and a voice code determination unit that stores
standard voice patterns and the voice codes corresponding to them, compares the voice code of
the generated sound recognized by the voice recognition unit with the stored voice codes, and,
when the two voice codes match, reads out the standard voice pattern corresponding to the
voice code and outputs voice information including the standard voice pattern, the voiceprint,
the strength and weakness and the pitch of the sound, and the like.
[0012]
Next, the invention corresponding to claim 3 adds the following to the constituent features of
the invention corresponding to claim 2: a voice synthesis unit that synthesizes the standard voice
pattern output from the voice code determination unit with the voiceprint and creates a
synthesized voice by further adding the strength and weakness and the pitch of the sound; and a
voice conversion output unit that converts the synthesized voice created by the voice synthesis
unit into a voice signal and outputs it from the voice output means.
[0013]
Furthermore, the invention corresponding to claim 4 adds the following to the constituent
features of the invention corresponding to claim 2: a voice synthesis unit that synthesizes the
standard voice pattern output from the voice code determination unit with the voiceprint and
creates a synthesized voice by adding the strength and weakness and the pitch of the sound; a
voice storage unit that stores the synthesized voice created by the voice synthesis unit; a voice
conversion output unit that converts the synthesized voice stored in the voice storage unit into a
voice signal; a voice repetition switch for reading out the synthesized voice stored in the voice
storage unit and causing the voice output means to output it repeatedly; voice speed variable
means for varying the speed of the voice signal; and voice strength variable means for varying
the level of the voice signal output from the voice conversion output unit to add strength and
weakness.
[0014]
Further, the invention according to claim 5 is a portable voice recognition output assisting device
divided into: a voice input/output device portion having voice input means and voice output
means; a main device portion having a voice recognition unit, a voice code determination unit, a
voice conversion output unit, and the like; and a voice adjustment portion having a voice
repetition switch for reading out the synthesized voice stored in a storage unit and repeatedly
outputting it from the voice output means, voice speed variable means for varying the speed of
the voice signal output from the voice conversion output unit, and voice strength variable means
for varying the level of the voice signal output from the voice conversion output unit to add
strength and weakness.
[0015]
Therefore, according to the invention corresponding to claim 1, with the above measures, a
sound absorbing cloth is used for the mounting body that is wound around and fixed to the
vibration generating body, for example the neck of a non-healthy person, and flat voice input
means and voice output means are attached separately to the back side and the front side of the
mounting body. This prevents the influence of voice uttered from the mouth and of noise coming
from outside, reduces the burden on the throat of the non-healthy person, and makes it possible
to reliably input the vibration produced directly by the throat.
[0016]
Next, in the invention corresponding to claim 2, the voice recognition unit recognizes the
voiceprint, the strength and weakness of the sound, the pitch of the sound, and the generated
sound from the vibration frequency signal input from the voice input means, and sends them to
the voice code determination unit.
The voice code determination unit stores in advance a plurality of standard voice patterns and
the voice codes corresponding to them, and compares the voice code of the generated sound
sent from the voice recognition unit with the stored voice codes. When the two voice codes
match, it reads out the standard voice pattern corresponding to the voice code and outputs voice
information including the standard voice pattern, the voiceprint, the strength and weakness of
the sound, and the like. It is therefore possible to correctly recognize even unclear voice uttered
by a non-healthy person, and to easily output a standard voice pattern that converts a short
word uttered by the non-healthy person into a long phrase used in daily conversation and the
like.
[0017]
Furthermore, the invention corresponding to claim 3 operates in the same way as the invention
corresponding to claim 2 and, in addition, the voice synthesis unit synthesizes the voiceprint
with the standard voice pattern sent from the voice code determination unit. Because the
synthesized voice is then created by adding the strength and weakness and the pitch of the
sound, voice synthesis that includes emotion can be performed. The synthesized voice is
converted into a voice signal by the voice conversion output unit and output from the voice
output means, so a voice signal accompanied by emotional expression can be output.
[0018]
Furthermore, the invention corresponding to claim 4 operates in the same way as the inventions
corresponding to claims 2 and 3. In addition, operating the voice repetition switch reads out the
synthesized voice from the voice storage unit again and outputs the voice signal repeatedly from
the voice output means, so that when the other party asks to hear it again, the same voice signal
can be output without the user uttering the voice from the beginning.
Further, by changing the output speed of the voice signal with the voice speed variable means, it
is possible to output the voice signal at a speed that a healthy person can easily understand.
Further, because the voice strength variable means changes the voice signal level to add
strength and weakness before output, a voice signal that a healthy person can easily understand
can likewise be output.
[0019]
Furthermore, the invention according to claim 5 is divided into a voice input/output device
portion having voice input means and voice output means, a main device portion having a voice
recognition unit, a voice code determination unit, a voice conversion output unit, and the like,
and a voice adjustment portion. If the voice input/output device portion is wound around the
neck of the non-healthy person, the main device portion is hung at the waist or the like, and the
voice adjustment portion is held and operated in the hand, the device is easy to carry and easy to
operate.
[0020]
Embodiments of the present invention will be described hereinbelow with reference to the
drawings.
FIG. 1 is a block diagram showing the configuration of the device of the present invention.
In the figure, reference numeral 1 denotes a voice input/output device. As shown in the
drawings, it has a cloth mounting body 11, for example one like the corset that is wound around
the neck in cases of whiplash and the like. Voice input means 12 for directly taking in the
vibration produced by the throat and voice output means 13 for outputting a voice signal are
attached at appropriate places on the mounting body 11, and Velcro tapes 14a and 14b are also
attached.
In addition, various conventionally known fixing means other than the Velcro tapes 14a and 14b,
for example hooks, may be used for fixing.
[0021]
The mounting body 11 is made of, for example, a cloth with excellent sound absorbing
properties, like a curtain that blocks external noise. It thereby absorbs voice uttered from the
mouth and noise coming from outside so that they do not affect the voice input means 12.
The voice input means 12 is mounted flat on the back (inner) surface portion of the mounting
body 11, converts the vibration produced by the throat into an electrical signal, and outputs it.
Making it flat in this way lets it fit the mounting body 11 easily and puts no pressure on the
throat, so the burden on the throat is reduced.
The voice output means 13, on the other hand, is a flat speaker attached in the same way to the
side opposite the voice input means 12, that is, to the front (outer) surface portion of the
mounting body 11.
Attaching the flat speaker at the front, on the same vertical line as the mouth, reduces the burden
on the throat and, from the other party's point of view, makes the sound appear to come from the
mouth. In addition, the voice output means 13 can be made inconspicuous and natural looking by
covering it with a material of similar color or another appropriate material, or by choosing the
color of the voice output means 13 itself appropriately.
[0022]
A voice recognition unit 2 recognizes the voiceprint characteristic of the individual, the strength
and weakness of the sound, its pitch, and the correct generated sound from the vibration
frequency signal input from the voice input means 12. As shown in FIG. 3, the voice recognition
unit 2 comprises speech spectrum conversion means 21, sound quality judgment means 22,
voiceprint judgment means 23, speech recognition means 24, and the like. The speech spectrum
conversion means 21 samples the vibration frequency signal, for example as shown in FIG. 4A,
at a predetermined period and converts it into a speech spectrum as shown in FIG. 4B. The
sound quality judgment means 22 judges the strength and weakness and the pitch of the sound
from the speech spectrum. For strength and weakness, a predetermined reference level is set in
advance, and the judgment indicates how far each component of the speech spectrum lies above
or below the reference level. The pitch of a sound depends on its frequency, but here it is
represented solely by the level of each component of the speech spectrum. The voiceprint
judgment means 23 extracts the frequency component levels of the speech spectrum. The speech
recognition means 24 determines the generated sound from the distribution of the speech
spectrum, converts it into the character code corresponding to that sound, for example a code
for "A" or "i", and outputs it. The data determined by the judgment means 22 to 24 are output in
time series and sent to the voice code determination unit 3.
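The spectrum conversion and the comparison against a reference level described above can be
sketched in software as follows. This is only an illustrative sketch: the patent does not specify
the transform, so a plain discrete Fourier transform is assumed here, and the names
speech_spectrum, deviation_from_reference, SAMPLE_COUNT and the reference level of 0.25 are
hypothetical choices made for the example.

```python
import cmath
import math


def speech_spectrum(samples):
    """Return the magnitude of each frequency component (DFT bins 0..N/2).

    Assumption: a plain discrete Fourier transform stands in for the
    spectrum conversion; the source does not name the transform it uses.
    """
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(acc) / n)
    return spectrum


def deviation_from_reference(spectrum, reference_level):
    """Signed distance of each spectral component from the preset reference level.

    Positive values lie above the reference (strong), negative values below (weak).
    """
    return [level - reference_level for level in spectrum]


# Hypothetical input: one full sine cycle sampled at 8 evenly spaced points.
SAMPLE_COUNT = 8
tone = [math.sin(2 * math.pi * t / SAMPLE_COUNT) for t in range(SAMPLE_COUNT)]

spectrum = speech_spectrum(tone)
strength = deviation_from_reference(spectrum, reference_level=0.25)
strongest_bin = max(range(len(spectrum)), key=lambda k: spectrum[k])
```

In this sketch the single sine cycle concentrates its energy in bin 1, which therefore lies above
the assumed reference level while the remaining bins lie below it.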
[0023]
The voice code determination unit 3 stores in advance standard voice patterns and the voice
codes corresponding to them. It takes the character code (voice code) of the correct generated
sound recognized by the speech recognition means 24, compares it with the stored voice codes,
and, when the two voice codes are identical, outputs the corresponding standard voice pattern.
Specifically, as shown in FIG. 5, it comprises voice pattern storage means 31 for storing standard
voice patterns, voice code storage means 32 for storing the voice codes corresponding to the
respective standard voice patterns in the voice pattern storage means 31, and voice code judging
means 33.
[0024]
The voice code judging means 33 holds the strength and weakness data from the sound quality
judging means 22 and the voiceprint feature data from the voiceprint judging means 23 in a
buffer memory in a waiting state. It compares the voice code of the correct generated sound
recognized by the speech recognition means 24 with the large number of voice codes stored in
the voice code storage means 32. If the code is identical to a stored voice code, it extracts the
standard voice pattern corresponding to that voice code from the voice pattern storage means 31
and stores it in the voice information storage unit 4 together with the data waiting in the buffer
memory. At this time, the voice code of the generated sound from the speech recognition means
24 may be stored as well. On the other hand, when the voice code recognized by the speech
recognition means 24 does not match any stored voice code, the voice code of the recognized
generated sound is output.
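The matching step described above amounts to a table lookup with a fallback. The sketch below
illustrates it under stated assumptions: the table STANDARD_PATTERNS, the short codes "O", "A"
and "SA", and the function judge_voice_code are hypothetical and are not taken from the patent.
Only the example phrases appear in the description.

```python
# Hypothetical table: short voice code -> standard voice pattern (long phrase).
# The codes are invented for illustration; only the phrases come from the text.
STANDARD_PATTERNS = {
    "O": "Good morning",
    "A": "Thank you",
    "SA": "Goodbye",
}


def judge_voice_code(voice_code, strength_data, voiceprint_data):
    """Compare a recognized voice code with the stored codes.

    On a match, return the standard voice pattern together with the buffered
    strength and voiceprint data. Without a match, return the recognized
    code itself, as the description states for the non-matching case.
    """
    pattern = STANDARD_PATTERNS.get(voice_code)
    matched = pattern is not None
    return {
        "matched": matched,
        "output": pattern if matched else voice_code,
        "strength": strength_data,
        "voiceprint": voiceprint_data,
    }


hit = judge_voice_code("O", strength_data=[0.2], voiceprint_data=[0.5, 0.1])
miss = judge_voice_code("KA", strength_data=[0.2], voiceprint_data=[0.5, 0.1])
```

A dictionary is used here simply because the description calls for an exact match between codes.
A real device would need some tolerance for recognition errors, which this sketch does not
attempt.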
[0025]
The standard voice patterns stored in the voice pattern storage means 31 are patterns
corresponding to words used in daily conversation, for example "Good morning", "Thank you",
and "Goodbye". That is, by converting short voice codes into long phrases, the patterns let
non-healthy persons communicate sufficiently without having to utter every word.
[0026]
The voice information storage unit 4 temporarily stores voice information such as the voiceprint
features, the strength and weakness and the pitch of the sound, the standard voice pattern, and,
as necessary, the voice code of the recognized generated sound, and sends it to the voice
synthesis unit 5.
[0027]
As shown in FIG. 6, the voice synthesis unit 5 comprises voice information storage means 51 for
storing the voice information sent from the voice information storage unit 4, and voice synthesis
means 52. The voice synthesis means 52 synthesizes the standard voice pattern and the
voiceprint feature data from the voice information stored in the voice information storage means
51, adds the strength and weakness and the pitch of the sound to the result to create a
completely demodulated synthesized sound, and stores it in the subsequent voice storage unit 6.
[0028]
A voice conversion output unit 7 has a function of reading out the synthesized voice information
stored in the voice storage unit 6, converting it into an analog signal that can be output as voice,
and outputting the voice from the voice output means 13.
[0029]
Further, an audio output adjustment unit 8 is provided in the present apparatus.
The voice output adjusting unit 8 is provided so that the contents of the conversation can be
properly conveyed to the other party according to the condition of the non-healthy person.
That is, the voice output adjusting unit 8 is provided with a voice repetition switch 81 which,
when the other party asks to hear again a voice signal once output from the voice output means
13, performs a read operation that causes the synthesized sound to be output again from the
voice storage unit 6.
This reduces the burden on the non-healthy person, for whom uttering the same speech again
from the beginning is very difficult.
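The repeat behaviour described above can be sketched as a store that keeps the last synthesized
sound and a switch handler that reads it out again. The class VoiceStorage, the function
press_repeat_switch and the list used as a stand-in for the speaker are hypothetical names
introduced only for this illustration.

```python
class VoiceStorage:
    """Holds the most recent synthesized sound (stand-in for voice storage unit 6)."""

    def __init__(self):
        self._synthesized = None

    def store(self, synthesized):
        # Keep a copy so later changes by the caller do not alter the stored sound.
        self._synthesized = list(synthesized)

    def read(self):
        if self._synthesized is None:
            raise LookupError("no synthesized sound has been stored yet")
        return list(self._synthesized)


def press_repeat_switch(storage, output):
    """Read the stored sound again and pass it to the output callable."""
    output(storage.read())


# Hypothetical usage: a list collects what would be sent to the speaker.
played = []
storage = VoiceStorage()
storage.store([0.1, 0.2, 0.3])
press_repeat_switch(storage, played.append)
press_repeat_switch(storage, played.append)
```

Each press replays the same stored sound, so the user does not have to utter anything again.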
[0030]
Further, the voice output adjusting unit 8 is provided with a voice speed variable device 82 and a
voice strength variable device 83. An analog first-order lag circuit using a capacitor or the like is
incorporated in advance on the voice conversion output unit 7 side, and the speed of the voice
signal is varied by appropriately shorting the first-order lag circuit with the voice speed variable
device 82. Because the speech of a non-healthy person is not necessarily fast, the output speed
of the synthesized sound output from the voice output means 13 is changed appropriately to
make it easy for a healthy person to hear. The voice strength variable device 83 adds strength
and weakness to the voice signal by varying its level on the voice conversion output unit 7 side
or by varying the amplification factor. This makes the voice signal output from the voice output
means 13 stronger or weaker so that it is easy to hear even in a place with a lot of external
noise.
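The two adjustments described above can be illustrated digitally. Note that this is a swapped-in
technique and not the patent's method: the description uses an analog first-order lag circuit,
whereas the sketch below changes speed by linear-interpolation resampling (which also shifts the
pitch) and changes level by a simple gain. The function names change_speed and change_level
and the sample values are hypothetical.

```python
def change_speed(samples, factor):
    """Resample by linear interpolation; factor > 1 plays faster (fewer samples).

    Assumption: digital resampling stands in for the analog circuit in the
    description. This simple approach also raises or lowers the pitch.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if len(samples) < 2:
        return list(samples)
    last = len(samples) - 1
    out_len = max(2, round(len(samples) / factor))
    step = last / (out_len - 1)
    out = []
    for i in range(out_len):
        pos = i * step
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def change_level(samples, gain):
    """Scale the signal level; gain > 1 makes the output stronger."""
    return [s * gain for s in samples]


# Hypothetical signal: a ramp of 8 samples.
signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
faster = change_speed(signal, 2.0)
louder = change_level([0.5, -0.5], 2.0)
```

With a factor of 2.0 the 8-sample ramp is reduced to 4 samples spanning the same range, and the
gain of 2.0 doubles each sample value.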
[0031]
Next, the operation of the apparatus configured as described above will be described. First, the
non-healthy person wraps the mounting body 11 of the voice input/output device 1 around the
neck and then presses together the Velcro portions provided on the opposing surfaces of the
mounting body 11 to fix it. At this time, the voice output means 13 attached to the mounting
body 11 is positioned at the front, and the voice input means 12 is positioned where the throat
vibration is most easily picked up, for example at the side of the neck. Because the voice input
means 12 and the voice output means 13 are flat, they fit the neck easily and the burden on the
throat is greatly reduced.
[0032]
When a non-healthy person generates voice in this state, the vibration of the throat of the
non-healthy person is taken in by the voice input means 12, converted into an electrical vibration
frequency signal, and sent to the voice recognition unit 2.
[0033]
Here, the voice recognition unit 2 converts the vibration frequency signal input from the voice
input means 12 into a speech spectrum with the speech spectrum conversion means 21 and
sends the spectrum to the sound quality judgment means 22, the voiceprint judgment means 23,
and the speech recognition means 24.
Each of the judgment means 22 to 24 determines the strength and weakness of the sound, the
pitch of the sound, the voiceprint features, and the correct generated sound in accordance with
the determination conditions described above. The generated sound is converted into a
character code (voice code) and sent to the voice code determination unit 3 together with the
strength data, the pitch data, and the voiceprint feature data.
[0034]
In the voice code determination unit 3, standard voice patterns are stored in advance in the
voice pattern storage means 31, and the voice codes corresponding to the standard voice
patterns are stored in the voice code storage means 32. The standard voice patterns in particular
are stored as patterns corresponding to words used in daily conversation, for example
"Good morning", "Thank you", and "Goodbye".
[0035]
Therefore, when the voice code determination unit 3 receives the character code (voice code) of
the correct generated sound recognized by the voice recognition unit 2, it compares that voice
code with the stored voice codes. When the two voice codes are identical, it reads out the
corresponding standard voice pattern and sends it to the voice synthesis unit 5 via the voice
information storage unit 4, together with the strength and weakness data from the sound quality
judgment means 22 and the voiceprint feature data from the voiceprint judgment means 23.
[0036]
Here, the voice synthesis unit 5 temporarily stores the voice information sent from the voice
information storage unit 4, such as the standard voice pattern, the strength and weakness of the
sound, the pitch, and the voiceprint, in the voice information storage means 51, and then
performs voice synthesis.
In this synthesis, the standard voice pattern and the voiceprint feature data in the voice
information are synthesized, and the strength and weakness and the pitch of the sound are
added to the result to produce a completely demodulated synthesized sound. The synthesized
sound is stored in the voice storage unit 6 and then sent to the voice conversion output unit 7.
The voice conversion output unit 7 reads out the synthesized sound information stored in the
voice storage unit 6, converts it into an analog signal that can be output as voice, and outputs
the voice from the voice output means 13.
[0037]
At this time, if the other party asks to hear the voice again, for example, the non-healthy person
operates the voice repetition switch 81. The synthesized voice information is then read out again
from the voice storage unit 6, converted by the voice conversion output unit 7 into an analog
signal that can be output as voice, and output from the voice output means 13, so the
appropriate voice signal, that is, the contents of the conversation, can be conveyed to the other
party. In addition, when the speech of the non-healthy person is slow, increasing the output
speed of the voice signal appropriately with the voice speed variable device 82 makes it easier
for a healthy person or other listener to hear. Likewise, where there is a lot of external noise,
operating the voice strength variable device 83 raises the level of the voice signal output from
the voice output means 13, which again makes it easier for healthy people to hear.
[0038]
Therefore, according to the configuration of the embodiment described above, because the
mounting body 11 that forms the main body of the voice input/output device 1 is made of a cloth
or similar material with excellent sound absorption, it not only fits well but also absorbs voice
uttered from the mouth and noise coming from outside, so that the voice input means 12 can
properly input the vibration produced by the throat. Furthermore, because flat voice input means
12 and voice output means 13 are attached to the surfaces of the mounting body 11, the device
is lightweight and convenient to carry, there is no pressure on the throat, and the burden on the
throat is reduced. Further, the voice recognition unit 2 recognizes the voiceprint features, the
strength and weakness of the sound, the pitch of the sound, and the generated sound from the
vibration frequency signal input from the voice input means 12, and sends the voice code of the
generated sound, together with the voiceprint, strength, and pitch information, to the voice code
determination unit 3. There the voice code is compared with a large number of voice codes
stored in advance, and when the two voice codes match, the standard voice pattern
corresponding to the voice code, which represents words used in daily conversation such as
"Thank you" and "Goodbye", is read out and sent to the voice synthesis unit 5 together with the
voiceprint features, the strength and weakness of the sound, the pitch of the sound, and so on. A
standard voice pattern for a long phrase of daily conversation can thus be output from a short
utterance by the non-healthy person, which sufficiently eases the burden of speaking.
[0039]
Furthermore, the various voice information sent from the voice code determination unit 3 is
temporarily stored in the voice synthesis unit 5, the voiceprint features in that information are
then synthesized with the standard voice pattern, and the strength and weakness and the pitch
of the sound are added, so a synthesized sound that includes the emotions of the non-healthy
person can be created.
[0040]
Furthermore, since the voice repetition switch 81 for repeatedly outputting the voice signal, and
the voice speed variable device 82 and the voice strength variable device 83 for varying the
speed and strength of the voice signal, are provided, an appropriate voice signal can be output
by operating them as the situation requires.
[0041]
In the above embodiment, the overall configuration has been described. From the viewpoint of
easy carrying and easy operation by a non-healthy person, however, the following divided
configuration is preferable.
That is, the device is divided into a voice input/output device portion having the voice input
means 12 and the voice output means 13; a device main body portion consisting of the voice
recognition unit 2, the voice code determination unit 3, the voice information storage unit 4, the
voice synthesis unit 5, the voice storage unit 6, the voice conversion output unit 7, and the like,
and including a power supply portion; and a voice output adjustment portion with the voice
repetition switch 81, the voice speed variable device 82, the voice strength variable device 83,
and the like. In this case, the voice input/output device portion can be wound around and fixed
to the neck, the device main body portion can be hung at the waist, and the voice output
adjustment portion can be carried in the hand, which makes the device easier to carry and
easier to operate.
[0042]
Moreover, although cloth is used for the mounting body 11, the material is not particularly
limited as long as it is sound absorbing, such as a paper-based material.
In addition, the present invention can be implemented with various modifications without
departing from the scope of the invention.
[0043]
As described above, according to the present invention, the following effects can be obtained.
According to the first aspect of the present invention, even a person who cannot utter voice from
the mouth can reliably input a signal corresponding to voice, and the vibration of the throat can
be input properly without pressing on the throat of the non-healthy person.
[0044]
The inventions of claims 2 and 3 can correctly recognize the unclear speech uttered by a
non-healthy person and, by synthesizing the voice pattern, the voiceprint, the strength and
weakness of the sound, and the like, can realize voice synthesis that includes the emotion of the
person who utters the speech.
[0045]
Next, according to the fourth aspect of the present invention, a proper voice signal can be
generated by operating the voice adjustments appropriately, taking into account the physical
condition of the non-healthy person and how well the other party can hear. Furthermore,
according to the fifth aspect of the present invention, dividing the configuration appropriately
makes the device easy for the non-healthy person to wear and improves its operability.