Digital-Signal-Processing_04_Random Signals and LTI-Systems_Linear Prediction
English · notebook · Signal Processing
喇叭花 · Published on 2023-08-02
Recommended image: digital-signal-processing:Digital-Signal-Processing-Lecture
Recommended machine type: c2_m4_cpu
Random Signals and LTI-Systems
Linear Prediction
Theory
Example - Linear Prediction of a Speech Signal
Computation of prediction filter
Forward prediction
Prediction from residual error
Application - Speech synthesis using an artificial glottal signal

Random Signals and LTI-Systems

🏃🏻 Getting Started Guide
This document can be executed directly on the Bohrium Notebook. To begin, click the Connect button at the top of the interface, then select the digital-signal-processing:Digital-Signal-Processing-Lecture image and your desired machine configuration to proceed.

📖 Source
This notebook is taken from a well-known signal processing lecture. The notebooks constitute the lecture notes to the master's course Digital Signal Processing read by Sascha Spors, Institute of Communications Engineering, Universität Rostock.

This Jupyter notebook is part of a collection of notebooks on various topics of Digital Signal Processing. Please direct questions and suggestions to Sascha.Spors@uni-rostock.de.


Linear Prediction

In general, predictive modeling aims at predicting outcomes using (statistical) models of the underlying processes. Applied to discrete-time signals, it refers to predicting future values from past ones. Different signal models are used for this purpose. Forward linear prediction aims at predicting the current value of a discrete random signal from a linear combination of its past values. It is frequently applied in the lossless and lossy coding of signals. For instance, linear predictive coding (LPC) is a major building block of many speech coding and speech synthesis techniques.


Theory

It is assumed that $x[k]$ is a real-valued discrete-time random signal. The prediction $\hat{x}[k]$ of the current sample $x[k]$ is computed from a weighted superposition of the $N$ past samples

$$\hat{x}[k] = \sum_{n=1}^{N} h_n[k] \; x[k-n]$$

where $N$ denotes the order of the predictor and $h_n[k]$ the weights of the past samples at time-instant $k$. The weights might differ for different time-instants. This constitutes a moving average (MA) model or, equivalently, a linear filter with finite impulse response (FIR). The weights $h_n[k]$ can therefore be interpreted as the coefficients of a time-variant non-recursive FIR filter.

It is beneficial to express the sum above as an inner product of vectors. Combining the filter coefficients and the past samples of the signal into the column vectors

$$\mathbf{h}[k] = \begin{bmatrix} h_1[k] \\ h_2[k] \\ \vdots \\ h_N[k] \end{bmatrix} \qquad \mathbf{x}[k-1] = \begin{bmatrix} x[k-1] \\ x[k-2] \\ \vdots \\ x[k-N] \end{bmatrix}$$

yields

$$\hat{x}[k] = \mathbf{h}^T[k] \; \mathbf{x}[k-1]$$

for the sum above.
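
As a minimal numerical sketch of this inner product (the coefficient and sample values below are made-up for illustration, not taken from the recording used later):

import numpy as np

# Inner-product form of the predictor for order N = 3 (made-up values).
h_demo = np.array([0.5, -0.2, 0.1])   # filter coefficients h_1, ..., h_N
x_past = np.array([0.8, 0.6, 0.3])    # past samples x[k-1], ..., x[k-N]

x_hat = h_demo @ x_past  # \hat{x}[k] = h^T x[k-1]
print(x_hat)             # 0.5*0.8 - 0.2*0.6 + 0.1*0.3 = 0.31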

The aim of linear prediction is to determine the filter coefficients such that the predicted signal $\hat{x}[k]$ matches the signal $x[k]$ as closely as possible. In order to quantify the deviation between the signal and its predicted value at time-instant $k$, the error signal

$$e[k] = x[k] - \hat{x}[k] = x[k] - \mathbf{h}^T[k] \; \mathbf{x}[k-1]$$

is introduced. The error equals zero if the signal is perfectly predicted. In general, this cannot be achieved with a finite predictor order $N$. The problem of determining suitable filter coefficients could be approached by directly minimizing the average error. However, this constitutes a non-convex optimization problem. Instead, the quadratic average of the error is typically used. This measure is known as the mean squared error (MSE). It is defined as

$$J = E\{ e^2[k] \} = E\left\{ \left( x[k] - \mathbf{h}^T[k] \; \mathbf{x}[k-1] \right)^2 \right\}$$

The equation above is referred to as the cost function of the optimization problem. We aim at minimizing the cost function, hence minimizing the MSE between the signal $x[k]$ and its prediction $\hat{x}[k]$. The solution of this convex optimization problem is referred to as the minimum mean squared error (MMSE) solution. The cost function is minimized by computing its gradient with respect to the filter coefficients [Haykin], using results from matrix calculus

$$\nabla_{\mathbf{h}[k]} J = -2 \mathbf{r}[k] + 2 \mathbf{R}[k] \, \mathbf{h}[k]$$

where

$$\mathbf{R}[k] = E\{ \mathbf{x}[k-1] \, \mathbf{x}^T[k-1] \} \qquad \mathbf{r}[k] = E\{ x[k] \, \mathbf{x}[k-1] \}$$

denote the auto-correlation matrix and the cross-correlation vector, respectively. The elements of the auto-correlation matrix can be interpreted by expanding the outer product and using the definition of the auto-correlation function (ACF) $\varphi_{xx}[k_1, k_2] = E\{ x[k_1] \, x[k_2] \}$

$$\mathbf{R}[k] = \begin{bmatrix}
\varphi_{xx}[k-1, k-1] & \varphi_{xx}[k-1, k-2] & \dots & \varphi_{xx}[k-1, k-N] \\
\varphi_{xx}[k-2, k-1] & \varphi_{xx}[k-2, k-2] & \dots & \varphi_{xx}[k-2, k-N] \\
\vdots & \vdots & \ddots & \vdots \\
\varphi_{xx}[k-N, k-1] & \varphi_{xx}[k-N, k-2] & \dots & \varphi_{xx}[k-N, k-N]
\end{bmatrix}$$

The matrix $\mathbf{R}[k]$ is composed of the values of the ACF for all combinations of the time indexes $k-1, \dots, k-N$. It is symmetric with respect to its main diagonal. The elements on the main diagonal constitute the quadratic means of the signal at the respective time indexes. The elements of the cross-correlation vector are given as

$$\mathbf{r}[k] = \begin{bmatrix} \varphi_{xx}[k, k-1] \\ \varphi_{xx}[k, k-2] \\ \vdots \\ \varphi_{xx}[k, k-N] \end{bmatrix}$$

As the MMSE solution constitutes a convex optimization problem, the minimum of the cost function is found by setting its gradient to zero, $\nabla_{\mathbf{h}[k]} J = 0$. Rearranging terms, this results in the Wiener-Hopf equation

$$\mathbf{R}[k] \, \mathbf{h}[k] = \mathbf{r}[k]$$

If $\mathbf{R}[k]$ has full rank (which should hold due to the random signal assumption), the optimum filter coefficients in the MMSE sense are given as

$$\mathbf{h}[k] = \mathbf{R}^{-1}[k] \, \mathbf{r}[k]$$

i.e. by solving the Wiener-Hopf equation through direct matrix inversion.

Note that various alternative solutions to the Wiener-Hopf equation have been developed which avoid the direct inversion of the auto-correlation matrix, for instance the least-mean-squares (LMS) filter, which uses steepest gradient descent to solve for $\mathbf{h}[k]$. This is also known as the Widrow-Hoff solution.
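
For illustration, a minimal LMS sketch (an addition, not part of the lecture's code; the toy signal and the step size mu are arbitrary assumptions):

import numpy as np

# LMS sketch: adapt the predictor coefficients by stochastic gradient
# descent on the squared error instead of solving the Wiener-Hopf
# equation directly.
rng = np.random.default_rng(0)
s = np.convolve(rng.standard_normal(8192), np.ones(5)/5, mode='same')  # correlated toy signal

N = 8        # predictor order
mu = 0.05    # step size, chosen small enough for convergence
h_lms = np.zeros(N)

for k in range(N, len(s)):
    sk = s[k-N:k][::-1]              # past samples s[k-1], ..., s[k-N]
    e = s[k] - h_lms @ sk            # prediction error
    h_lms = h_lms + 2 * mu * e * sk  # stochastic gradient step on e^2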


Example - Linear Prediction of a Speech Signal

In the following example, the principle of forward linear prediction is applied to the recording of a speech signal. The recording contains the stationary part of the vowel 'o'. This constitutes a random signal which can reasonably well be considered as wide-sense stationary (WSS). It then follows from the properties of the ACF that the auto-correlation matrix, the cross-correlation vector and the filter coefficients do not depend on the time-instant $k$

$$\mathbf{R}[k] = \mathbf{R} \qquad \mathbf{r}[k] = \mathbf{r} \qquad \mathbf{h}[k] = \mathbf{h}$$

where $h_n$ denote the time-invariant weights of the predictor. As we only have access to one recorded instance of the vowel, we inherently assume wide-sense ergodicity. The expectations are computed by averaging with respect to time.


Computation of prediction filter

First, the signal is loaded, conditioned and truncated.

[2]
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile

L = 32768 # total number of samples of signal
N = 8 # order of predictor (number of FIR coefficients)

# read and truncate audio file
fs, x = wavfile.read('/digital-signal-processing-lecture/data/vocal_o_8k.wav')
x = np.asarray(x, dtype=float)/2**15 # 16 Bit integer -> float
x = x[:L]

The auto-correlation matrix $\mathbf{R}$ and the cross-correlation vector $\mathbf{r}$ are estimated by averaging with respect to time. For instance

$$\hat{\mathbf{R}} = \frac{1}{L - N} \sum_{k=N}^{L-1} \mathbf{x}[k-1] \, \mathbf{x}^T[k-1]$$

where $L$ denotes the total number of samples of $x[k]$.

[3]
R = np.zeros(shape=(N, N))
r = np.zeros(shape=(N, 1))

# estimate R and r by averaging over time (L-N terms in the sum)
for k in np.arange(N, L):
    xk = np.expand_dims(np.flip(x[k-N:k]), 1)  # x[k-1], ..., x[k-N]
    R += 1/(L-N) * xk * xk.T
    r += 1/(L-N) * xk * x[k]
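
Since the signal is assumed to be wide-sense stationary, the estimate $\hat{\mathbf{R}}$ is (approximately) Toeplitz. As a sketch of an equivalent shortcut not used in the original code, the ACF can be estimated once for the lags $0, \dots, N$ and the matrix and vector assembled from it:

from scipy.linalg import toeplitz

# WSS shortcut: estimate the ACF for lags 0, ..., N and build R and r from it.
acf = np.array([np.mean(x[:L-m] * x[m:]) for m in range(N+1)])
R_wss = toeplitz(acf[:N])    # R[i, j] = phi_xx[|i-j|]
r_wss = acf[1:, np.newaxis]  # r[i] = phi_xx[i+1]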

Let's inspect the estimated auto-correlation matrix by plotting its values as a matrix (image) plot.

[4]
plt.figure(figsize=(3, 3))
plt.matshow(R, fignum=0)  # draw into the current figure instead of a new one
plt.title(r'Estimate of auto-correlation matrix $\mathbf{R}$')
plt.colorbar(fraction=0.045)
plt.xlabel('sample shift')
plt.ylabel('sample shift');

As expected, the elements on the main diagonal have the largest values, and the values decay with increasing distance from the main diagonal. Hence, the statistical dependencies between samples at two different time instants of the signal decrease with an increasing distance between the samples. These statistical dependencies are exploited for the prediction of the current signal value from past ones.

Now the filter coefficients are computed by solving the Wiener-Hopf equation, and plotted.

[5]
h = np.linalg.inv(R) @ r
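# equivalent and numerically preferable: h = np.linalg.solve(R, r)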
h = np.squeeze(h)

plt.figure(figsize=(8, 3))
plt.stem(h, basefmt='C0:')
plt.xlabel(r'$k$')
plt.ylabel(r'$h_k$')
plt.grid()

Forward prediction

The value $\hat{x}[k]$ of the signal at time-instant $k$ is predicted from its past values by evaluating the equation given above. This essentially constitutes a linear convolution of the signal with the forward prediction filter

$$h_{\text{FP}}[k] = \begin{cases} 0 & \text{for } k = 0 \\ h_k & \text{for } 1 \leq k \leq N \\ 0 & \text{otherwise} \end{cases}$$

such that $\hat{x}[k] = h_{\text{FP}}[k] * x[k]$. In order to evaluate the predictor, the predicted signal $\hat{x}[k]$ is computed for all time instants and plotted in comparison with the original signal $x[k]$.

[6]
hfp = np.concatenate(([0], h))
xp = np.convolve(x, hfp)[:L]

plt.figure(figsize=(8, 4))
plt.plot(x, label=r'signal $x[k]$', linestyle='--')
plt.plot(xp, label=r'predicted signal $\hat{x}[k]$', alpha=.8)
plt.xlabel(r'$k$')
plt.axis([10000, 10500, -.6, .6])
plt.legend()
plt.grid()

As the difference between the actual signal and its prediction appears quite small, the error $e[k] = x[k] - \hat{x}[k]$ is computed and plotted for closer inspection.

[7]
e = x - xp

plt.figure(figsize=(8, 4))
plt.plot(e)
plt.xlabel(r'$k$')
plt.ylabel(r'$e[k]$')
plt.axis([10000, 10500, -.015, .015])
plt.grid()

The amplitude of the error is much lower than the amplitude of the signal. This shows that the signal can be predicted well by the computed predictor. In order to quantify this, the variances of the signal and the error are computed as a measure of their average power. Note that both signals are assumed to be zero-mean.

[8]
print('Variance of signal {:2.6f}'.format(np.var(x)))
print('Variance of error {:2.6f}'.format(np.var(e)))
Variance of signal 0.060221
Variance of error 0.000011
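
A common way to summarize this result in a single figure of merit is the prediction gain, the ratio of the two variances in decibels (this extra step is an addition, not part of the original notebook):

# Prediction gain in dB: how much weaker the residual is than the signal.
G = 10 * np.log10(np.var(x) / np.var(e))
print('Prediction gain {:2.1f} dB'.format(G))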

Prediction from residual error

The error signal may be rewritten as

$$e[k] = x[k] - \sum_{n=1}^{N} h_n \; x[k-n] = h_{\text{PE}}[k] * x[k]$$

where

$$h_{\text{PE}}[k] = \begin{cases} 1 & \text{for } k = 0 \\ -h_k & \text{for } 1 \leq k \leq N \\ 0 & \text{otherwise} \end{cases}$$

denotes the prediction-error filter. Taking the $z$-transform of the error signal yields

$$E(z) = H_{\text{PE}}(z) \cdot X(z)$$

where for instance $E(z) = \mathcal{Z}\{ e[k] \}$. From this result it can be concluded that the signal $x[k]$ can be computed from the error signal by inverse filtering with the prediction-error filter

$$X(z) = \frac{1}{H_{\text{PE}}(z)} \cdot E(z)$$

The inverse of the prediction-error filter constitutes a pole-only (recursive) filter, since the prediction-error filter itself is a zero-only (non-recursive) filter. In the following, the signal $x[k]$ is computed from the error signal $e[k]$ and plotted.

[9]
from scipy.signal import lfilter

hpe = np.concatenate(([1], -h))
xe = lfilter([1], hpe, e)

plt.figure(figsize=(8, 4))
plt.plot(xe)
plt.xlabel(r'$k$')
plt.ylabel(r'reconstructed signal $x[k]$')
plt.axis([10000, 10500, -.6, .6])
plt.grid()
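
As a quick sanity check (an addition to the original code), the stability of the recursive inverse filter can be verified by confirming that all zeros of $H_{\text{PE}}(z)$ lie inside the unit circle; the optimum prediction-error filter is minimum-phase, so this should hold:

# The inverse filter 1/H_PE(z) is stable if all zeros of H_PE(z)
# have magnitude smaller than one.
print('Largest zero magnitude: {:.4f}'.format(np.max(np.abs(np.roots(hpe)))))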

Application - Speech synthesis using an artificial glottal signal

Linear prediction is used in the form of linear predictive coding (LPC) for the coding and synthesis of speech signals. The latter application is briefly presented in the sequel. It is based on the source-filter model of human speech production, which models speech production as a sound source filtered by a linear acoustic filter. In the case of vowels, the sound source is the vocal cords, which generate the periodic glottal signal; the vocal tract may be interpreted as an acoustic filter. Vowels are articulated by changing the air flow through the vocal cords, their state, as well as the configuration of the vocal tract. Due to its acoustic properties, the vocal tract can be represented as a pole-only filter.

The basic idea of LPC-based speech synthesis is that the inverse of the prediction-error filter is a pole-only filter. Interpreting the error signal $e[k]$ as the glottal signal in the source-filter model, the inverse of the prediction-error filter can be used to generate the speech signal $x[k]$. A synthetic vowel may be generated by using a synthetic glottal signal. A simple model for the glottal signal is a mean-free periodic rectangular signal with a duty cycle of 1:10. The signal is generated and plotted for illustration.

[10]
from scipy.signal import square


def glottal_signal(f0):
    '''Generates a synthetic glottal signal with fundamental frequency f0 in Hertz.'''
    w0 = 2 * np.pi * f0
    t = 1/fs * np.arange(L)
    g = 1/2 * (1 + square(w0*t, duty=1/10))

    return -g + 0.1


g = 0.0055 * glottal_signal(100)

plt.figure(figsize=(8, 4))
plt.stem(g[:100])
plt.xlabel(r'$k$')
plt.ylabel(r'glottal signal $g[k]$')
plt.grid()

The spectral contents of the error signal and the synthetic glottal signal are compared by estimating their power spectral densities (PSDs) using Welch's method.

[11]
from scipy.signal import welch

f, Pee = welch(e, fs, nperseg=1024)
f, Pgg = welch(g, fs, nperseg=1024)
plt.plot(f, Pee,
         label=r'$\hat{\Phi}_{ee}(e^{j \Omega})$ error signal')
plt.plot(f, Pgg,
         label=r'$\hat{\Phi}_{gg}(e^{j \Omega})$ synthetic glottal signal')

plt.xlim([0, 1000])
plt.xlabel('frequency in Hertz')
plt.legend()
plt.grid()

Due to their periodicity, both signals exhibit line-like spectra. However, the levels of the spectral lines differ, most prominently around 600 Hertz. The synthetic glottal signal is now filtered with the inverse prediction-error filter in order to generate the synthetic speech signal.

[12]
xa = lfilter([1], hpe, g)

plt.figure(figsize=(8, 4))
plt.plot(xa)
plt.xlabel(r'$k$')
plt.ylabel(r'synthetic signal $x[k]$')
plt.axis([10000, 10500, -1, 1])
plt.grid()

Visual comparison with the recorded speech signal shown above reveals that the coarse structure of both signals is quite similar. The differences are a consequence of the simple glottal signal model used. Both signals may also be compared aurally by listening to them.

[13]
wavfile.write('artificial_vocal.wav', fs, np.int16(xa*2**15))

Recorded vowel

/digital-signal-processing-lecture/data/vocal_o_8k.wav

Synthesized vowel

/artificial_vocal.wav
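
Since the pitch of the synthesized vowel is determined solely by the glottal signal, a different fundamental frequency can be synthesized with the same vocal-tract filter. A small sketch (the frequency of 150 Hertz and the output filename are arbitrary choices):

# Same inverse prediction-error filter, different fundamental frequency.
xa2 = lfilter([1], hpe, 0.0055 * glottal_signal(150))
wavfile.write('artificial_vocal_150Hz.wav', fs, np.int16(xa2 * 2**15))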


Copyright

This notebook is provided as Open Educational Resource. Feel free to use the notebook for your own purposes. The text is licensed under Creative Commons Attribution 4.0, the code of the IPython examples under the MIT license. Please attribute the work as follows: Sascha Spors, Digital Signal Processing - Lecture notes featuring computational examples.
