JP2016504622

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2016504622
The present invention relates to a method of calculating at least two individual signals from at
least two output signals formed from at least two source signals. A mixing ratio of the at least
two source signals in the output signals is specified. At least two subtraction signals are
calculated from the output signals by multiplying an output signal by a factor formed from the
mixing ratio, so that one source signal is eliminated in each subtraction signal. The subtraction
signals are Fourier transformed to give transformed signals, a remainder signal is determined
from the transformed signals, at least two individual signals are calculated based on the
transformed signals and the remainder signal, and the individual signals are inverse Fourier
transformed.
Method for calculating at least two individual signals from at least two output signals
[0001]
The invention relates to a method for calculating at least two individual signals according to the
preamble of claim 1.
[0002]
It is known that stereo signals can be generated by recording two or more sound sources.
A level difference or arrival time difference is used for this. A stereo signal creates a three-dimensional sound impression.
03-05-2019
1
[0003]
Known stereo techniques are intensity stereo in the form of AB-, XY- and MS-methods as well as
compromise techniques such as ORTF- and OSS-methods.
[0004]
In recent years, sound systems with more than two loudspeakers have become increasingly
common on the market.
Such systems have, for example, been installed in cars for many years; these cars have at least
two loudspeakers in the side doors and two further loudspeakers elsewhere in the car.
[0005]
For this reason, several methods have been developed that, going beyond stereo, provide more
than two separate signals. Broadly, there are two different approaches.
[0006]
First, there are methods that calculate the individual source signals from several output signals.
Here, each instrument, each singing voice and each speaking voice is understood as a source
signal. A chorus or a group of singers can also be regarded as one source signal. For example,
even if there are 20 female singers, the "soprano" can be treated as a single singing voice.
[0007]
All in all, the problem is to extract an arbitrary number of source signals from two output signals,
as are present in a stereo signal, or from more output signals.
[0008]
Such source separation methods are known under the heading of "blind source separation".
There are two major types. On the one hand, statistical methods are used, such as independent
component analysis, principal component analysis, the maximum a posteriori method, the
maximum likelihood method and the like. On the other hand, recursive optimization methods are
also known. The known methods either calculate any number of source signals from a smaller
number of output signals only at high computational cost or, for real-time calculation, require
at least as many output signals as there are source signals.
[0009]
The object of the present invention is therefore to develop a method having the features of the
preamble of claim 1 in such a way that source separation is possible in real time even with
fewer output signals than source signals.
[0010]
This task is solved by the features of claim 1.
Advantageous developments of the invention are given by the subclaims.
[0011]
The core of the present invention is that subtraction signals are calculated from the output
signals, the subtraction signals are Fourier transformed, and the individual signals are calculated
from the transformed signals obtained in this way. This eliminates the need for statistical
methods and recursion and allows the individual signals to be calculated in real time. Here, real
time means that, apart from a time offset that arises when the first block of the signal is
calculated, the signal to be reproduced can be provided without interruption while the data
stream (Datenstrom) is arriving. Thus, for example, a particular singing voice and, for example,
a piano can be extracted from a recording and assigned to particular loudspeakers at a
selectable mixing ratio, without the output signals having to be preprocessed. It should thus be
possible, for example, to insert a CD in a car, select the distribution of the song and, where
applicable, the signals to be extracted, and start playing the song or CD without a noticeable
delay.
[0012]
The mixing ratio of the source signals in the output signals can be calculated by determining
the direction. To that end, the output signals are Fourier transformed and amplitude values are
calculated. These amplitude values are collected in a histogram: point pairs are formed from the
Fourier-transformed output signals, and the point pairs are grouped in the histogram according
to their angles. The angle of a point pair takes a value between 0° and 90° and is obtained as
follows.
[0013]
[0014]
Here, $\tilde{X}_{1,\mathrm{org}}$ denotes the Fourier-transformed first output signal $X_1$, and
$\tilde{X}_{2,\mathrm{org}}$ denotes the Fourier-transformed second output signal $X_2$; $a_x$ and
$a_y$ denote vectors that contain the magnitudes.
The index $l$ is the running index of the output signals $X_1$, $X_2$ or of the Fourier-transformed
output signals $\tilde{X}_{1,\mathrm{org}}$, $\tilde{X}_{2,\mathrm{org}}$ at which the point pair is
located. The Fourier-transformed output signals are simply vectors; the first value of
$\tilde{X}_{1,\mathrm{org}}$ and the first value of $\tilde{X}_{2,\mathrm{org}}$ thus form the first
point pair, with $l = 1$. Of course, the index $l$ may also start from 0, in which case it runs up
to $L-1$. Here, $L$ is the number of Fourier-transformed points of each of the output signals
$X_1$ and $X_2$ (for example, 4096 points). $\theta$ is an angle between 0° and 90°, and $\Psi_l$
is the magnitude of the point pair with index $l$.
[0015]
The vector $\Psi$ contains the sum of the magnitudes of the vectors $a_x$ and $a_y$.
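The angle and magnitude of a single point pair might be computed as in the following minimal sketch. The formulas themselves are not reproduced in this text, so the arctangent of the magnitude ratio is an assumption; it keeps the angle between 0° and 90° and is consistent with the example of 18° and V = 0.325 given further on in the description. The function name `point_pair_angle` is hypothetical.

```python
import math

def point_pair_angle(x1_org, x2_org):
    """Return (theta_deg, psi) for one pair of complex Fourier coefficients.

    Assumption (not taken from the source): theta = atan(|x2| / |x1|).
    psi is the sum of the two magnitudes, as stated in the text.
    """
    a_x = abs(x1_org)   # magnitude of the first transformed output signal
    a_y = abs(x2_org)   # magnitude of the second transformed output signal
    theta_deg = math.degrees(math.atan2(a_y, a_x))  # lies in [0, 90]
    psi = a_x + a_y
    return theta_deg, psi

# Both coefficients have magnitude 5, so the pair lies on the 45 degree line.
theta, psi = point_pair_angle(3 + 4j, 0 + 5j)
```

Using `atan2` rather than a plain division avoids a division by zero when the first magnitude vanishes.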
[0016]
[0017]
The Fourier transform is performed block by block.
The number of data points in each input block is a power of two.
Exponents of 10, 11 or 12, i.e. 1024, 2048 or 4096 data points, have proved particularly
efficient. 4096 data points are particularly preferred, since this gives the best trade-off
between calculation time and computational cost.
[0018]
The histogram is preferably divided into 1° steps, i.e. it comprises 90 intervals. To limit the
range of values of the histogram, the angle $\theta$ is rounded (gerundet) to an integer
value.
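A minimal sketch of such a histogram follows. It assumes one bin per integer degree from 0 to 90 (91 bins, matching the index range i = 0, …, 90 used for the flattened histogram) and, by default, counts each point pair once; the optional `weights` argument, which could carry the magnitudes, is an illustrative assumption and not taken from the source.

```python
def angle_histogram(thetas_deg, weights=None):
    """Accumulate rounded angles into bins 0..90, one bin per degree."""
    hist = [0.0] * 91
    for i, theta in enumerate(thetas_deg):
        # Round to the nearest integer degree and clamp to the valid range.
        bin_index = min(90, max(0, int(round(theta))))
        hist[bin_index] += 1.0 if weights is None else weights[i]
    return hist

# Three point pairs near 18 degrees and one at 72 degrees.
hist = angle_histogram([17.6, 18.2, 18.4, 72.0])
```

Note that Python's `round` rounds exact halves to the nearest even integer, which may differ from the rounding rule used in the original method.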
[0019]
Finally, the histogram is flattened (smoothed) so that its maxima are easy to identify. The
following function is used as the flattening function.
[0020]
[0021]
The parameter $T$ is an integer and specifies how many adjacent points are included in the
flattening.
Thus, an average over several data points is formed. At the ends of the histogram, where fewer
than $T$ adjacent points are available, a value of 0 is used for each missing position.
When calculating the first value (Zahlenwert) of the histogram, a value of 0 is therefore used
eight times on one side, while the existing frequencies are used on the other side. In the
flattened histogram, all local maxima are identified and sorted by height, i.e. by frequency.
As described above, each position in the histogram corresponds to one angle, so that each
maximum corresponds to one angle. The angles belonging to the maxima found are determined,
and either a preset number of maxima is used or all maxima whose frequency exceeds a preset
threshold are used.
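The flattening and the search for maxima can be sketched as follows. The flattening function is not reproduced in this text, so a centred moving average with zero padding is assumed here; the default `T=8` is inferred from the remark that a value of 0 is used eight times at the edge. Both function names are hypothetical.

```python
def flatten(hist, T=8):
    """Centred moving average over T neighbours on each side (assumed form)."""
    n = len(hist)
    out = []
    for i in range(n):
        total = 0.0
        for j in range(i - T, i + T + 1):
            # Neighbours outside the histogram contribute a value of 0.
            total += hist[j] if 0 <= j < n else 0.0
        out.append(total / (2 * T + 1))
    return out

def local_maxima(hist):
    """Return (index, value) pairs of local maxima, highest first."""
    peaks = []
    for i, v in enumerate(hist):
        left = hist[i - 1] if i > 0 else 0.0
        right = hist[i + 1] if i < len(hist) - 1 else 0.0
        if v > left and v >= right:
            peaks.append((i, v))
    return sorted(peaks, key=lambda p: p[1], reverse=True)

# Toy histogram with a strong bump around 18 degrees and a weaker one at 72.
raw = [0.0] * 91
raw[17], raw[18], raw[19] = 5.0, 10.0, 5.0
raw[71], raw[72], raw[73] = 2.0, 4.0, 2.0
peaks = local_maxima(flatten(raw, T=1))
```

With real data a larger `T` removes spurious local maxima at the cost of angular resolution; the small `T=1` above only keeps the toy example easy to follow.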
[0022]
The flattened histogram-vector is thus:
[0023]
[0024]
Here, $i = 0, \ldots, 90$.
A problem that can arise is that a source signal is not present over the entire duration of the
output signals, for example because a particular instrument or singing voice pauses. So that
such a pause does not cause errors in determining the angle, two or more histograms are
determined and, once the corresponding angles have been identified, each subsequent histogram
can be multiplied by a weighting function, for example a low pass. The low pass can be
expressed by the following equation.
[0025]
[0026]
Here, $a = 0.1$ and $b = 0.9$.
The index $n$ is the index of the histogram; in particular, $n = 1$ for the first histogram, i.e.
for the first 4096 data points. Applying the low pass gives a histogram $h_{\mathrm{gl,TP}}$.
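The low-pass equation is not reproduced in this text. The sketch below therefore assumes a common first-order recursive form, in which each new histogram is weighted with a = 0.1 and the previous low-pass result with b = 0.9; this assignment of the coefficients and the treatment of the first histogram are assumptions, and the function name is hypothetical.

```python
def lowpass_histograms(histograms, a=0.1, b=0.9):
    """Yield low-pass-weighted histograms.

    Assumed form: h_tp[n] = a * h[n] + b * h_tp[n - 1], with h_tp[1] = h[1].
    """
    previous = None
    for hist in histograms:
        if previous is None:
            current = list(hist)   # the first histogram passes through unchanged
        else:
            current = [a * h + b * p for h, p in zip(hist, previous)]
        previous = current
        yield current

# A source that is present in the first block and then pauses for two blocks.
sequence = [[10.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
weighted = list(lowpass_histograms(sequence))
```

In this toy sequence the peak decays slowly (10, 9, 8.1) instead of vanishing as soon as the source pauses, which is the stabilizing effect described in the text.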
[0027]
This weighting stabilizes the identification of the angle.
[0028]
Based on the calculated angle, the mixing ratio of the two source signals is determined, in which
case the angle
[0029]
[0030]
is substituted into the following equation.
[0031]
[0032]
Assuming that the maximum of the histogram is at, for example, 18 °, a mixing ratio of V =
0.325 is obtained.
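The equation for the mixing ratio is not reproduced in this text. The numerical example is, however, consistent with the tangent of the angle, since tan(18°) ≈ 0.325. The following sketch assumes that relation; the function name is hypothetical.

```python
import math

def mixing_ratio(theta_deg):
    """Mixing ratio for a histogram maximum at theta_deg (assumed: V = tan(theta))."""
    return math.tan(math.radians(theta_deg))

v_18 = mixing_ratio(18.0)   # approximately 0.325, as in the example above
v_72 = mixing_ratio(72.0)   # greater than one
```

Under this assumption, angles below 45° give mixing ratios smaller than one and angles above 45° give mixing ratios greater than one.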
[0033]
Based on the mixing ratio, a subtraction signal is calculated, and in the case of two output signals,
it is given as follows.
[0034]
[0035]
Here, $N = 1, 2, \ldots$.
$N$ is the index of the source signal that is to be filtered out.
[0036]
The subtraction signal is calculated differently depending on whether the mixing ratio is
greater or less than one.
More than two output signals can be used, but this merely increases the computational cost.
If more output signals are available than the two of a stereo signal, the two output signals in
which the source signal to be extracted appears most strongly are preferably selected.
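Since the subtraction formula is not reproduced in this text, the following sketch uses an assumed form: a source that appears in the second output signal with V times its level in the first is cancelled by subtracting the scaled first signal from the second (V ≤ 1) or the scaled second signal from the first (V > 1). The function name and the toy signals are hypothetical.

```python
def subtraction_signal(x1, x2, v):
    """Cancel the source whose level ratio x2/x1 equals v (assumed formula)."""
    if v <= 1.0:
        return [b - v * a for a, b in zip(x1, x2)]
    return [a - b / v for a, b in zip(x1, x2)]

# Two toy sources mixed into two output signals with ratios 0.5 and 2.0.
s_a = [1.0, -2.0, 3.0]
s_b = [4.0, 0.5, -1.0]
x1 = [a + b for a, b in zip(s_a, s_b)]
x2 = [0.5 * a + 2.0 * b for a, b in zip(s_a, s_b)]

without_a = subtraction_signal(x1, x2, 0.5)   # a scaled copy of s_b remains
without_b = subtraction_signal(x1, x2, 2.0)   # a scaled copy of s_a remains
```

Choosing the branch by the size of V keeps the scaling factor at or below one, which avoids amplifying the signal that is subtracted.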
[0037]
In each of the subtraction signals calculated in this manner, one source signal is suppressed.
[0038]
These subtraction signals are then Fourier transformed.
This is done block by block, with each subsequent block starting half a block length after the
start of the previously transformed block.
This means that the first half of each block has already been Fourier transformed as the second
half of the preceding block.
[0039]
This method allows continuous signal processing and is known as the overlap-add method.
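The block layout with 50 % overlap can be sketched as follows; the function name `overlapping_blocks` is hypothetical, and only the splitting step is shown, not the transform or the later recombination.

```python
def overlapping_blocks(signal, block_size=4096):
    """Split a signal into blocks that each start half a block after the previous one."""
    hop = block_size // 2
    return [signal[start:start + block_size]
            for start in range(0, len(signal) - block_size + 1, hop)]

# Small example: 16 samples, blocks of 8 samples, hop of 4 samples.
demo_blocks = overlapping_blocks(list(range(16)), block_size=8)
```

In the example, the first half of the second block is identical to the second half of the first block, as described above. Trailing samples that do not fill a complete block are ignored in this sketch.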
[0040]
Each input block is multiplied by a window function (e.g. a Hanning window) to minimize
leakage effects.
The window function $f(n)$ for $N$ data points is
[0041]
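The window formula is not reproduced in this text. A common definition of the Hann (Hanning) window, assumed here, is f(n) = 0.5 · (1 − cos(2πn / (N − 1))) for n = 0, …, N − 1. The sketch below applies it to one block; whether the original uses exactly this symmetric variant is not known.

```python
import math

def hann_window(n_points):
    """Symmetric Hann window: f(n) = 0.5 * (1 - cos(2*pi*n / (N - 1)))."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / (n_points - 1)))
            for n in range(n_points)]

window = hann_window(5)                      # approximately 0, 0.5, 1, 0.5, 0
block = [2.0, 2.0, 2.0, 2.0, 2.0]
windowed = [w * x for w, x in zip(window, block)]
```

The window tapers each block to zero at its edges, which reduces the spectral leakage caused by cutting the signal into blocks.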
[0042]
A remainder signal is calculated from the Fourier-transformed subtraction signals $\tilde{X}$
(hereinafter referred to as "transformed signals").
For two transformed signals $\tilde{X}$, the remainder signal is given as the first transformed
signal minus the second transformed signal, that is:
[0043]
$\tilde{X}_3 = \tilde{X}_1 - \tilde{X}_2$
[0044]
In the following, the method starts from two transformed signals $\tilde{X}_1$, $\tilde{X}_2$
and one remainder signal $\tilde{X}_3$, giving a total of three signals.
These three signals are compared with one another in order to extract the individual signals.
[0045]
For each data point of the signals $\tilde{X}_1$, $\tilde{X}_2$, $\tilde{X}_3$, each of which
typically has 4096 data points, the extrema are determined from the amplitudes of the three
signals.
[0046]
The starting point is an array with 3 × 4096 data points.
The number 3 is the number of transformed signals and remainder signals, and 4096 is the
number of Fourier-transformed data points in one block.
Considering the first frequency bin, i.e. the first data point of the vectors $\tilde{X}_1$,
$\tilde{X}_2$, $\tilde{X}_3$, three values are available for comparison.
At the position of the smallest of these three values, the largest of the three values is set,
and the other two values are set to zero.
[0047]
In order to explain this clearly, numerical examples are shown.
[0048]
It is assumed that the first value of $\tilde{X}_1$ is 5, the first value of $\tilde{X}_2$ is 10,
and the first value of $\tilde{X}_3$ is 15.
Then the value 15 is set at the position of the value 5, and the first values of $\tilde{X}_2$
and $\tilde{X}_3$ are set to 0. The values are processed in this way for each frequency bin. An
array of 4096 columns and three rows is obtained, in which two thirds of the values are zero.
The values different from zero are distributed irregularly across the individual vectors. After
this processing, the vectors for $\tilde{X}_1$ and $\tilde{X}_2$ are referred to as $\tilde{S}_1$
and $\tilde{S}_2$; they are the Fourier transforms of the calculated source signals $S_1$ and
$S_2$. The vector resulting for the remainder signal $\tilde{X}_3$ is of no further importance.
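The rule from the numerical example can be sketched as follows, working on magnitudes only. The function name `assign_maxima` is hypothetical; ties between equal minima are resolved in favour of the first signal, which is an arbitrary choice of this sketch.

```python
def assign_maxima(x1_mag, x2_mag, x3_mag):
    """Per bin: put the largest magnitude in the row of the smallest, zero the rest."""
    rows = [[0.0] * len(x1_mag) for _ in range(3)]
    for k, values in enumerate(zip(x1_mag, x2_mag, x3_mag)):
        row_of_min = values.index(min(values))   # which signal holds the minimum
        rows[row_of_min][k] = max(values)        # the maximum moves to that row
    return rows   # rows[0]: S1 magnitudes, rows[1]: S2 magnitudes, rows[2]: remainder

# First bin as in the example (5, 10, 15); the second bin is made up.
s1_mag, s2_mag, s3_mag = assign_maxima([5.0, 9.0], [10.0, 2.0], [15.0, 7.0])
```

In the first bin the value 15 ends up in the row of the first signal, exactly as in the example; in the second bin the minimum lies in the second signal, which therefore receives the value 9.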
[0049]
The calculation of $\tilde{S}_1$ and $\tilde{S}_2$ is given by the following equation.
[0050]
[0051]
Here, $k$ is the index of the data points or frequency bins; with 4096 data points it ranges
from 1 to 2048.
Only half of the frequency bins need to be covered because, for symmetry reasons, each value
occurs twice in the Fourier transform.
In the example above, $k = 1$.
[0052]
From the rows modified in this way (those corresponding to the signals $\tilde{X}_1$ and
$\tilde{X}_2$), i.e. from the individual signals $\tilde{S}_1$ and $\tilde{S}_2$, the calculated
source signals $S_1$ and $S_2$ can be obtained by inverse Fourier transform. The phases of the
signals $\tilde{X}_1$ and $\tilde{X}_2$ can be adopted unchanged: each phase from the vectors
$\tilde{X}_1$ and $\tilde{X}_2$ is assigned to $\tilde{S}_1$ and $\tilde{S}_2$ respectively. This
assignment is of course made according to the index $k$.
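The phase assignment can be sketched with complex numbers as follows: the new magnitude of each bin is combined with the phase of the corresponding bin of the transformed signal. The function name `attach_phase` is hypothetical, and the subsequent inverse Fourier transform (for example with an FFT library) is not shown.

```python
import cmath

def attach_phase(magnitudes, transformed):
    """Combine new magnitudes with the phases of the transformed signal, bin by bin."""
    return [cmath.rect(mag, cmath.phase(x)) if mag != 0.0 else 0j
            for mag, x in zip(magnitudes, transformed)]

x1_tilde = [3 + 4j, 1 - 1j]                       # toy transformed signal, two bins
s1_tilde = attach_phase([15.0, 0.0], x1_tilde)    # magnitudes from the min/max step
```

The first bin keeps the direction of 3 + 4j in the complex plane but is scaled from magnitude 5 to magnitude 15; the second bin stays zero.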
[0053]
The individual signals $\tilde{S}_1$, $\tilde{S}_2$ and the inverse-transformed individual
signals $S_1$ and $S_2$ may differ slightly from the source signals owing to limited calculation
accuracy or calculation errors. The source signals are therefore not reproduced perfectly, but
the differences are slight and usually go unnoticed.
[0054]
In order to improve the separation of the individual signals, the following steps are possible.
[0055]
To avoid jumps in the identification of the minimum, the minimum of the signals $\tilde{X}_1$,
$\tilde{X}_2$, $\tilde{X}_3$ may be determined as a function of the minimum determinations made
earlier.
For example, a conditional low-pass filter (bedingter Tiefpassfilter) may be used.
[0056]
[0057]
$P^{t}_{\mathrm{hold}}(k)$ is the relevant value for frequency bin $k$, where $k$ is again the
running index.
The parameter $b$ can be set freely between 0 and 1; with $b = 0$ the sensing of the frequency
is switched off.
[0058]
$\tilde{S}_1$ and $\tilde{S}_2$ are then calculated using the following equation.
[0059]
[0060]
The parameter $\eta$ ($0 \le \eta \le 1$) specifies how strongly the low-pass-filtered signal
enters into the individual signals $\tilde{S}_m$, $m = 1, 2, 3, \ldots$.
[0061]
Furthermore, each of the transformed signals $\tilde{X}_m$, $m = 1, 2, 4, 5, \ldots$, can be
compared separately with the remainder signal $\tilde{X}_3$; that is, the column minimum
$E_{\min}(k)$ and the column maximum $E_{\max}(k)$ are determined between each transformed
signal $\tilde{X}_m$ and the remainder signal.
In this way, the vectors $E_{\min 1}$, $E_{\min 2}$, $E_{\max 1}$ and $E_{\max 2}$ are calculated.
The minimum is multiplied by a factor $\beta$ ($0 \le \beta \le 2$).
Different effects arise depending on the choice of this factor: for values of $\beta$ below 1,
undesired frequencies are suppressed, and for values of $\beta$ above 1, harmonic tone formation
is obtained.
[0062]
The individual signals are as follows.
[0063]
[0064]
In addition, the signals can be separated on the basis of phase conditions.
If the amplitudes of the transformed signals $\tilde{X}_m(k)$ are approximately equal and the
remainder signal $\tilde{X}_3(k)$ has the minimum value, the phases are considered.
If the phases are the same, the maximum value is assigned to $\tilde{S}_3(k)$; otherwise it is
assigned to $\tilde{S}_1(k)$ and $\tilde{S}_2(k)$. This check can, of course, be carried out
separately for each frequency bin $k$.
[0065]
The following holds as appropriate.
[0066]
[0067]
In addition, the Fourier-transformed output signals $\tilde{X}_{m,\mathrm{org}}$ can also be
taken into account.
If the minimum lies at $\tilde{X}_{1,\mathrm{org}}(k)$, the value is likewise assigned to
$\tilde{S}_1(k)$.
[0068]
[0069]
The invention will be explained in more detail on the basis of the embodiments described in the
drawings.
[0070]
FIG. 1 is a block diagram of a method according to the invention for stereo signals.
FIG. 2 is a block diagram of a method according to the invention in a second embodiment.
FIG. 3 is a flowchart for determining the mixing ratio.
FIG. 4 is a flowchart for source separation.
FIG. 5 shows an arrangement for recording a stereo signal.
FIG. 6 is a scatter plot of data pairs.
FIG. 7 is a scatter plot of Fourier-transformed data pairs.
FIG. 8 shows a histogram.
FIG. 9 illustrates the overlap-add method.
FIG. 10 shows the 3D amplitude spectra of two transformed signals and one remainder signal.
FIG. 11 shows the 3D amplitude spectrum of the individual signals.
FIG. 12 shows a transformed signal and an individual signal in vector form.
[0071]
FIG. 5 shows an arrangement for recording a stereo signal. By way of example only, an
arrangement for recording source signals by means of the XY stereo method is shown; in
principle, however, the method according to the invention can be used not only with all other
stereo techniques but also with techniques in which more than two output signals are generated.
[0072]
Four signal sources are shown, each of which generates one source signal 1, 2, 3, 4. Source
signal 1 is, for example, a single singing voice, i.e. one male or female singer. Source signal
2 is a backing chorus consisting of a number of male or female singers who, however, sing the
same lyrics and the same notes. Source signal 3 is a single instrument such as a piano. Source
signal 4 is a group of instruments that, like the chorus, play the same notes; an example would
be a group of violins playing the same melody. This example shows that a source signal need not
consist of a single singer or a single instrument, but can also be composed of a number of
singers or instruments.
Owing to the broad directivity of the microphones 5, 6, the source signals 1, 2, 3, 4 are
recorded at different levels, so that the mixing ratios of the source signals 1, 2, 3, 4 always
differ.
[0073]
The two output signals of the stereo signal are thus obtained from the source signals 1, 2, 3, 4.
These output signals are the starting point of the method according to the invention; the
original source signals are not used any further.
[0074]
Of course, considerably more source signals can contribute to the output signals $X_m$; for the
method to be carried out meaningfully, however, at least two source signals are needed.
[0075]
In the following, reference is repeatedly made to two output signals $X_1$, $X_2$.
However, the method according to the invention is not limited to stereo signals, but can in
principle be used for any number of output signals $X_m$, $m = 1, 2, 3, \ldots$.
[0076]
FIG. 1 shows a block diagram of a first embodiment of the method according to the invention.
In this embodiment, the two output signals $X_1$ and $X_2$ are used to determine the mixing
ratios for two source signals (e.g. source signals 1 and 2). One possible way of determining
the mixing ratios is described in more detail below. For example, a mixing ratio $V_1$ is
obtained for source signal 1 and a mixing ratio $V_2$ for source signal 2. From the output
signals $X_1$, $X_2$, the subtraction signals $\hat{X}_1$, $\hat{X}_2$ are then calculated by
subtracting the output signals $X_1$, $X_2$ from one another according to the mixing ratios, as
follows:
[0077]
[0078]
Here, N represents each index of the source signal.
[0079]
The subtraction signals $\hat{X}_1$ and $\hat{X}_2$ calculated in this manner are Fourier
transformed using the overlap-add method.
[0080]
The output signals $X_1$ and $X_2$ are, of course, digital data, i.e. simple sequences of
numerical values or data points.
An output signal can also be represented as a vector with a very large number of values
(usually tens of thousands).
The same applies to the subtraction signals.
The output signals $X_1$, $X_2$ and the subtraction signals are processed block by block in
order to keep the computational requirements low and, above all, to keep the time until the
first individual signals are obtained short. The Fourier transform is applied, for example, to
the first 4096 data points of each of the subtraction signals $\hat{X}_1$ and $\hat{X}_2$.
Preferably, the number of data points that are Fourier transformed is a power of two, because
the fast Fourier transform (FFT) can then be applied. The number 4096 is ideal in terms of
computation time and resource consumption. Smaller or larger data blocks (e.g. blocks of 1024
or 2048 data points) can also be used. In each case, consecutive data points are of course
meant. Fourier transforming the subtraction signals $\hat{X}_1$ and $\hat{X}_2$ gives the
transformed signals $\tilde{X}_1$ and $\tilde{X}_2$. Subtracting the transformed signal
$\tilde{X}_2$ from the transformed signal $\tilde{X}_1$ gives the remainder signal $\tilde{X}_3$.
[0081]
The remainder signal $\tilde{X}_3$ contains as many data points (e.g. 4096) as the transformed
signals $\tilde{X}_1$ and $\tilde{X}_2$. Because these data points lie in the frequency domain,
they are usually also referred to as frequency bins. The frequency bins are thus the data points
of the transformed signals and of the remainder signal. In this specific case, each transformed
signal and the remainder signal therefore has 4096 frequency bins.
[0082]
Next, the minimum among the signals $\tilde{X}_1$, $\tilde{X}_2$, $\tilde{X}_3$ is sought, since
a minimum indicates the suppressed signal. To that end, for each frequency bin the minimum of
the amplitudes of the signals $\tilde{X}_1$, $\tilde{X}_2$, $\tilde{X}_3$ is determined; the
largest value of that frequency bin is set at the position of the minimum, and the other values
of that frequency bin are set to zero. The values different from zero are assigned their phases,
giving the individual signals $\tilde{S}_1$, $\tilde{S}_2$. These individual signals must be
transformed back into the time domain in order to obtain the calculated source signals $S_1$,
$S_2$.
[0083]
FIG. 2 shows an embodiment for calculating more than the two calculated source signals $S_1$,
$S_2$. In this embodiment, more than two mixing ratios are determined from the output signals
$X_1$, $X_2$, $X_3$, $X_4$, which yields more subtraction signals and accordingly more
transformed signals. For the transformed signals $\tilde{X}_1$, $\tilde{X}_2$, $\tilde{X}_4$,
$\tilde{X}_5$ and the remainder signal $\tilde{X}_3$ calculated from them, the minima are
likewise determined, from which the individual signals $\tilde{S}_1$, $\tilde{S}_2$,
$\tilde{S}_3$, $\tilde{S}_4$ are calculated, and from these the calculated source signals $S_1$,
$S_2$, $S_3$ and $S_4$.
[0084]
The following shows one possible way to identify the mixing ratio. Basically, any method can be
used to specify the mixing ratio.
[0085]
FIG. 3 shows a flow diagram for determining the mixing ratio on the basis of the direction.
In step S1, the output signals $X_1$ and $X_2$ are Fourier transformed, giving the transformed
output signals $\tilde{X}_{1,\mathrm{org}}$, $\tilde{X}_{2,\mathrm{org}}$. In step S2, the
amplitude value is calculated for each data point or frequency bin of each of the
Fourier-transformed output signals $\tilde{X}_{1,\mathrm{org}}$, $\tilde{X}_{2,\mathrm{org}}$
according to the following equation.
[0086]
[0087]
In the next step S3, the angle $\theta$ and the magnitude $\Psi_l$ are calculated from the
previously calculated values.
The magnitudes of the Fourier-transformed output signals $\tilde{X}_{1,\mathrm{org}}$,
$\tilde{X}_{2,\mathrm{org}}$ can be calculated data point by data point or as vectors; in either
case, value pairs of angle $\theta$ and magnitude $\Psi_l$ can be calculated.
[0088]
In step S4, a histogram is calculated from the value pairs of $\theta$ and $\Psi_l$, and this
histogram is flattened in step S5.
[0089]
In step S6, maxima of the histogram are identified: either a predetermined number of maxima or
all maxima that exceed a predetermined threshold.
[0090]
Each of these maxima is assigned an angle, each calculated from the maxima according to the
equation given above.
In step S7, next, the mixing ratio to be obtained is specified based on the calculated angle.
[0091]
FIG. 4 shows a flow diagram for source separation.
For the transformed signals $\tilde{X}_1$, $\tilde{X}_2$ and, where applicable, $\tilde{X}_4$
and $\tilde{X}_5$, together with the remainder signal $\tilde{X}_3$, the minimum and the maximum
are calculated for each frequency bin (step S8). In another embodiment, the Fourier-transformed
output signals $\tilde{X}_{1,\mathrm{org}}$, $\tilde{X}_{2,\mathrm{org}}$ and, where applicable,
$\tilde{X}_{3,\mathrm{org}}$ and $\tilde{X}_{4,\mathrm{org}}$ are also taken into account.
[0092]
In step S9, as described above, the maximum of each frequency bin, i.e. of each column, is set
at the position of the minimum, and all other values are set to zero or are assigned the hold
value $P^{t}_{\mathrm{hold}}$.
[0093]
In step S10, each non-zero value of the individual signals $\tilde{S}_1$, $\tilde{S}_2$,
$\tilde{S}_4$, $\tilde{S}_5$ is assigned the corresponding phase of the transformed signals
$\tilde{X}_1$, $\tilde{X}_2$, $\tilde{X}_4$, $\tilde{X}_5$.
[0094]
The assignment is made according to the numbering of the frequency bins.
If frequency bin 28 of the individual signal $\tilde{S}_1$ has a value different from zero, it
receives the phase of frequency bin 28 of the transformed signal $\tilde{X}_1$.
Because the maximum is moved to the position of the minimum in each frequency bin and the
remaining values are set to zero, none of the values of the individual signals equals the value
of the associated transformed signal. The phase can nevertheless be adopted.
[0095]
Instead of setting the values other than the minimum to zero, they can be set to the hold value
$P_{\mathrm{hold}}$. This prevents dropouts in the individual signals.
[0096]
FIG. 6 shows a scatter plot 9 of the amplitudes of the two output signals $X_1$, $X_2$ in the
mixed state; in this example the output signals $X_1$, $X_2$ are sinusoidal signals offset
relative to one another. The scatter plot represents value pairs, which are formed according to
the numbering of the data points. The first data point of the output signal $X_1$ has a first
amplitude, and the first data point of the output signal $X_2$ has the same or a different
amplitude. These two amplitudes are used to plot one point in the scatter plot 9. Using a
number of data points of each of the output signals $X_1$ and $X_2$ yields the same number of
data point pairs. The amplitude of the output signal $X_1$ is plotted on the axis 10, and the
amplitude of the output signal $X_2$ on the axis 11. For example, using 4096 data points of the
output signals $X_1$ and $X_2$ gives 4096 data point pairs. These form a point cloud 12 in the
scatter plot, made up of a plurality of points 13, each determined from one data point pair.
[0097]
FIG. 7 shows a correspondingly formed scatter plot 14, for which, however, the output signals
$X_1$, $X_2$ were Fourier transformed and the amplitudes of the signals were calculated as follows.
[0098]
[0099]
Correspondingly, the axis 15 shows the magnitude of the amplitude of $\tilde{X}_{1,\mathrm{org}}$,
i.e. of the Fourier transform of the output signal $X_1$, and the axis 16 shows the magnitude of
the amplitude of $\tilde{X}_{2,\mathrm{org}}$.
Calculating the slopes of the straight lines 17 and 18 gives one mixing ratio for each line.
[0100]
In this simple way, however, the mixing ratio can be calculated only for pure sinusoidal
signals, which do not occur in real recordings.
Therefore, instead of creating a scatter plot and determining the slopes of straight lines, the
output signals $X_1$ and $X_2$ are Fourier transformed and the angle $\theta$ and the magnitude
$\Psi$ are calculated from each point pair of the Fourier-transformed output signals
$\tilde{X}_{1,\mathrm{org}}$, $\tilde{X}_{2,\mathrm{org}}$. Instead of a scatter plot, a
histogram is then determined from the pairs of angle $\theta$ and magnitude $\Psi$: the angle is
rounded to an integer value, and the corresponding frequency for that angle is calculated. Each
point pair thus contributes to the frequency of one angle in the histogram.
[0101]
The histogram is flattened using a flattening function to reduce the number of local minima
and maxima. The flattening function described above can be used for this, for example.
[0102]
[0103]
The histogram thus flattened is shown in FIG.
The angle is plotted in degrees on the axis 19, i.e. from 0° to 90°, and the frequency is
plotted on the axis 20. The histogram 21 has an absolute maximum 22 and several local maxima
23, 24 and 25.
[0104]
In terms of frequency, the maximum 22 has the largest value, followed by the maxima 25, 24 and
23. Of these maxima, either a predetermined number of the largest maxima is selected, for
example the two maxima 22 and 25, or maxima are selected as long as their frequency exceeds a
threshold. If this threshold is set to 10, for example, the maxima 22, 25 and 24 are used in
the subsequent calculations, but the maximum 23, which lies below the threshold, is not.
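The two selection rules can be sketched as follows. The function name `select_maxima` and the frequency values are hypothetical; only the angles 18° and 72° for the maxima 22 and 25 are taken from the description.

```python
def select_maxima(peaks, count=None, threshold=None):
    """peaks: list of (angle_deg, frequency) pairs.

    Keep the `count` highest maxima, or all maxima above `threshold`.
    """
    ordered = sorted(peaks, key=lambda p: p[1], reverse=True)
    if count is not None:
        return ordered[:count]
    if threshold is not None:
        return [p for p in ordered if p[1] > threshold]
    return ordered

# Made-up frequencies for four maxima; the last one lies below the threshold of 10.
found = [(18, 40.0), (72, 25.0), (50, 12.0), (35, 6.0)]
top_two = select_maxima(found, count=2)
above_threshold = select_maxima(found, threshold=10.0)
```

With a fixed count the number of extracted sources is known in advance, whereas the threshold rule adapts to how many sources are actually present in the block.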
[0105]
Each maximum is associated with one angle. The maximum 22 lies, for example, at 18° and the
maximum 25 at 72°.
[0106]
At this time, the angle is given by the following equation.
[0107]
[0108]
From the angles specified in this way, one mixing ratio is obtained for each angle according to
the following equation.
[0109]
[0110]
N is the index of the source signal to be extracted.
This completes the specification of the mixing ratio based on the direction information.
[0111]
The subtraction signals $\hat{X}_1$ and $\hat{X}_2$ are calculated according to the formula
described above.
In particular, the subtraction signals $\hat{X}_1$ and $\hat{X}_2$ can also be calculated block
by block.
The calculation can thus be carried out for a predetermined number of data points at a time;
for example, each group of 4096 data points of the output signals $X_1$ and $X_2$ can be
converted into the subtraction signals $\hat{X}_1$ and $\hat{X}_2$ using the mixing ratios.
The subtraction signals $\hat{X}_1$ and $\hat{X}_2$ are Fourier transformed by the overlap-add
method as shown in FIG.
[0112]
Data blocks 26, 27, 28, 29 are shown, each consisting of 2048 data points of the subtraction
signal $\hat{X}_1$ or of the subtraction signal $\hat{X}_2$. The value 2048 is half the number
of data points used for the Fourier transform. The number of data points in the data blocks 26
to 29 thus follows from the number of data points in the data blocks 30 to 33.
[0113]
Data block 30 is composed of data blocks 26 and 27 and accordingly has twice the number of
data points of data block 26 or 27. The data block 30 is a data block to be Fourier transformed.
The data block 31 consists of data blocks 27 and 28, and the data block 32 consists of data
block 28 and the following data block 29. The data block 27 is thus contained in the data
blocks 30 and 31, and the data block 28 in the data blocks 31 and 32.
[0114]
Prior to the Fourier transform, data blocks 30, 31, 32, 33 are further multiplied by Hanning
windows. This can minimize leakage effects during Fourier transform.
[0115]
FIG. 10 shows a 3D-amplitude spectrum of the converted signals X <˜> 1, X <˜> 2 and of the
remainder signal X <˜> 3. The axis 35 shows the numbers of the frequency bins, i.e. the
consecutively numbered points in the vector of the signal, while the axis 36 shows the
magnitudes. The respective signals are arranged along the axis 37. At position 38 there is the
signal |X <˜> 2|, at position 39 the signal |X <˜> 3| and at position 40 the signal |X <˜> 1|.
[0116]
The signals shown in FIG. 10 were derived from two pure sinusoidal signals. In a real signal more
or less all frequency bins are occupied, but for the sake of clarity pure sinusoidal signals are
particularly suitable. As can be seen immediately from FIG. 10, the signals |X <˜> 1|, |X <˜> 2|,
|X <˜> 3| consist only of peaks with just three different heights, and are zero everywhere
else.
[0117]
FIG. 11 shows the 3D-amplitude spectrum of the individual signals S <˜> 1, S <˜> 2 and of the
remainder signal mixture S <˜> 3. Axis 35 again shows the frequency bins and axis 36 the
magnitudes. The axis 42, on the other hand, lists the signals involved. At position 43 there is
the individual signal |S <˜> 1|, at position 44 the remainder signal |S <˜> 3| and at position 45
the individual signal |S <˜> 2|. To enable the inverse Fourier transform of the individual
signals S <˜> 1, S <˜> 2, the respective phases of the converted signals X <˜> 1 and X <˜> 2 are
additionally assigned to those positions of the individual signals |S <˜> 1|, |S <˜> 2| that
differ from zero.
[0118]
FIG. 12 schematically shows how the individual signals are obtained from the converted signals
X <˜> 1, X <˜> 2 and the remainder signal X <˜> 3. In this example, the vector 46 shows an excerpt
of the values of the converted signal X <˜> 1; correspondingly, the vector 47 shows the values of
the converted signal X <˜> 2 and the vector 48 those of the remainder signal X <˜> 3. The vector
49 indicates the corresponding frequency bins.
[0119]
From these, the individual signal S <˜> 1 shown in the vector 50, the individual signal S <˜> 2
shown in the vector 51, and the remainder signal S <˜> 3 shown in the vector 52 are obtained as
follows.
[0120]
For frequency bin 1, the values 5 and 7 are greater than 1, so the minimum lies in vector 48.
In frequency bin 1, the corresponding vectors 50, 51, 52 are therefore set as follows: at the
position of the minimum value 1, the maximum value 7 is entered. This value is therefore written
to frequency bin 1 of vector 52. The remaining entries of frequency bin 1, i.e. the values in the
vectors 50, 51, are set to zero. The minimum is thus determined row by row, that is, per data
point or per frequency bin; these expressions are synonymous.
[0121]
The values for the frequency bins 2, 3, 4, 5, etc. are calculated correspondingly, resulting in
the values shown in FIG. 12.
[0122]
The vector 50 thus collects the values for those frequency bins in which the vector 46 holds the
minimum, the vector 51 those for the vector 47, and the vector 52 those for the vector 48.
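The per-bin rule of paragraphs [0120] to [0122] can be sketched as follows. The function name
split_by_minimum and the example numbers are hypothetical; only the values 5, 7 and 1 for
frequency bin 1 come from the text, and their assignment to the vectors 46 and 47 is an
assumption. The text does not say how ties between equal minima are handled; this sketch simply
takes the first one.

```python
import numpy as np


def split_by_minimum(mag_x1, mag_x2, mag_x3):
    """Per frequency bin, write the maximum value at the position of the minimum, zeros elsewhere."""
    # Rows correspond to the vectors 46, 47, 48; columns to the frequency bins.
    mags = np.vstack([mag_x1, mag_x2, mag_x3]).astype(float)
    idx_min = np.argmin(mags, axis=0)   # which vector holds the minimum in each bin
    val_max = np.max(mags, axis=0)      # largest value in each bin
    out = np.zeros_like(mags)           # rows correspond to the vectors 50, 51, 52
    out[idx_min, np.arange(mags.shape[1])] = val_max
    return out[0], out[1], out[2]


# Assumed example: the first column reproduces frequency bin 1 of the text (5, 7, 1);
# the other two columns are made-up values for illustration.
v46 = [5, 1, 4]
v47 = [7, 6, 0]
v48 = [1, 3, 2]
v50, v51, v52 = split_by_minimum(v46, v47, v48)
# v50 -> [0, 6, 0], v51 -> [0, 0, 4], v52 -> [7, 0, 0]
```

In the first column the minimum 1 lies in vector 48, so the maximum 7 is written to vector 52 and
the entries of vectors 50 and 51 stay zero, as described for frequency bin 1.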
[0123]
The phases of the converted signals X <˜> 1 and X <˜> 2 still have to be assigned to the vectors
50 and 51 of the individual signals S <˜> 1 and S <˜> 2.
This is described for frequency bin 4 of vector 50 by way of example only.
Frequency bin 4 of vector 50 receives the phase of frequency bin 4 belonging to vector 46. This
is because vector 50 is formed from the minima of vector 46, so that a numerical value is present
there. The phase of the same frequency bin is thus carried over correspondingly.
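The phase assignment of paragraph [0123] can be sketched as follows, assuming that the converted
signal is available as a complex spectrum and the individual signal as a vector of magnitudes.
The function name attach_phase and all numbers are hypothetical and serve only as illustration.

```python
import numpy as np


def attach_phase(mag_s, x_tilde):
    """Give each bin of the magnitude vector the phase of the same bin of the converted signal."""
    mag_s = np.asarray(mag_s, dtype=float)
    phase = np.angle(np.asarray(x_tilde, dtype=complex))
    # Bins with magnitude zero stay zero regardless of the phase.
    return mag_s * np.exp(1j * phase)


# Made-up converted signal (complex) and magnitude vector of an individual signal.
x_tilde_1 = np.array([1 + 1j, -2 + 0j, 0 + 3j, 0.5 - 0.5j])
mag_s_1 = np.array([0.0, 6.0, 0.0, 2.0])
s_tilde_1 = attach_phase(mag_s_1, x_tilde_1)
```

The resulting complex vector keeps the magnitudes of the individual signal and can then be passed
to an inverse transform such as numpy.fft.irfft.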
[0124]
In all the embodiments shown in the figures, more than two signals can be used instead of the two
output signals, the two converted signals or the two individual signals specified above.
Furthermore, all configurations mentioned at the beginning of the description can also be used
in connection with the description of the figures, provided this is not explicitly excluded.
[0125]
1 source signal
2 source signal
3 source signal
4 source signal
5 microphone
6 microphone
7 mixing ratio specification
8 signal separation
9 scatter diagram
10 axis
11 axis
12 point cloud
13 point
14 scatter diagram
15 axis
16 axis
17 straight line
18 straight line
19 axis
20 axis
21 histogram
22 maximum
23 maximum
24 maximum
25 maximum
26 data block
27 data block
28 data block
30 data block
31 data block
32 data block
33 data block
34 3D-amplitude spectrum
35 axis
36 axis
37 axis
38 position
39 position
40 position
41 3D-amplitude spectrum
42 axis
43 position
44 position
45 position
46 vector
47 vector
48 vector
49 frequency bin
50 vector
51 vector
52 vector