LPC Speech Synthesizer

Vowel speech synthesis based on Linear Predictive Coding (LPC) and the source-filter model of speech production, implemented in MATLAB.

This project was completed as part of an MSc Artificial Intelligence programme at the University of Surrey.

Overview

The source-filter model of speech production treats the vocal tract as a time-varying linear filter excited by a periodic source. This project implements that model by:

1. Loading real male and female vowel recordings ("heed")
2. Extracting a ~100 ms quasi-stationary segment from the vowel nucleus
3. Estimating the mean fundamental frequency F0 using MATLAB's pitch() function
4. Fitting an all-pole LPC filter to the segment using the autocorrelation method (lpc())
5. Extracting the first three formant frequencies from the LPC filter roots
6. Generating a periodic impulse train at frequency F0 as the synthetic source
7. Synthesizing speech by filtering the impulse train through the LPC filter
8. Comparing the synthesized signal with the original in both time and frequency domains
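
The segment extraction and F0 estimation steps can be sketched as follows. This is a minimal illustration, not the code from the scripts: it uses a synthetic harmonic signal instead of the "heed" recordings, assumes a 16 kHz sample rate, and swaps pitch() for a plain autocorrelation peak search so that it runs in base MATLAB. All variable names are hypothetical.

```matlab
% Synthetic stand-in for a vowel recording (assumption: the real scripts
% load heed_f.wav / heed_m.wav instead of generating a signal).
Fs = 16000;                          % sample rate in Hz (assumed)
t  = (0:Fs-1) / Fs;                  % 1 s time axis
x  = sin(2*pi*200*t) + 0.5*sin(2*pi*400*t) + 0.25*sin(2*pi*600*t);

% Take a 100 ms segment from the middle of the signal.
segLen   = round(0.100 * Fs);        % 1600 samples at 16 kHz
segStart = floor((numel(x) - segLen) / 2) + 1;
seg      = x(segStart : segStart + segLen - 1);

% Autocorrelation via the FFT, zero-padded to avoid circular wrap-around.
nfft = 2^nextpow2(2 * segLen);
r    = real(ifft(abs(fft(seg, nfft)).^2));   % r(1) holds lag 0

% Pick the strongest autocorrelation peak among lags for 50-400 Hz.
minLag   = floor(Fs / 400);
maxLag   = ceil(Fs / 50);
[~, idx] = max(r(minLag + 1 : maxLag + 1));
bestLag  = minLag + idx - 1;         % lag in samples
f0_est   = Fs / bestLag;             % F0 estimate in Hz

fprintf('Estimated F0: %.1f Hz\n', f0_est);
```

On a real recording the segment position would be chosen around the vowel nucleus rather than the file midpoint, and pitch() returns one estimate per analysis window, which the scripts average.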

Repository Structure

   LPC-Speech-Synthesizer/
   ├── speech_synthesis_female.m   # LPC synthesis pipeline for female voice
   ├── speech_synthesis_male.m     # LPC synthesis pipeline for male voice
   └── README.md

Note: Speech sample .wav files are not included. To reproduce results, download male/female vowel recordings (e.g. from the IViE Corpus) and place heed_f.wav and heed_m.wav in the same directory as the scripts.

Key Concepts

Linear Predictive Coding (LPC)

LPC models the vocal tract as an all-pole (AR) filter. Each speech sample is approximated as a linear combination of the previous k samples, weighted by prediction coefficients estimated with the autocorrelation (Yule-Walker) method. The LPC order k sets how many past samples are used; higher orders capture more spectral detail.
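
To make the autocorrelation method concrete, the sketch below solves the Yule-Walker normal equations by hand on a synthetic AR(2) signal whose coefficients are known. The signal, the coefficient vector a_true, and the variable names are assumptions for illustration; the scripts themselves call lpc(), which should give the same coefficients up to numerical precision under the same sign convention.

```matlab
% Generate a synthetic AR(2) process with known coefficients (hypothetical
% test signal; poles have radius sqrt(0.9), so the filter is stable).
a_true = [1 -1.5 0.9];
x = filter(1, a_true, randn(1, 20000));

k = 2;                               % LPC order
N = numel(x);

% Biased autocorrelation estimates r(0..k).
r = zeros(k + 1, 1);
for lag = 0:k
    r(lag + 1) = sum(x(1:N-lag) .* x(1+lag:N)) / N;
end

% Yule-Walker normal equations: R * alpha = [r(1); ...; r(k)],
% where x[n] is predicted as sum_i alpha(i) * x[n-i].
R     = toeplitz(r(1:k));
alpha = R \ r(2:k+1);

% Same sign convention as lpc(): A(z) = 1 - sum_i alpha(i) z^-i.
a_est = [1, -alpha.'];

disp(a_est);
```

With enough samples, a_est lands close to a_true; the small residual difference is estimation error from the finite signal length.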

Formant Estimation

Formant frequencies (the resonances of the vocal tract) are extracted from the poles of the LPC filter. Each complex-conjugate pole pair in the z-plane corresponds to one resonance: the pole angle gives its frequency and the pole radius gives its bandwidth. Roots with bandwidth below 400 Hz and frequency above 90 Hz are classified as formants. The first three formants (F1, F2, F3) characterise the vowel quality.
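
The root-to-formant mapping can be sketched as below. To keep the example self-contained, the all-pole filter is built from three hypothetical resonances rather than fitted to speech, and the bandwidth uses the common 3 dB approximation B = -(Fs/pi)*ln|z|. The scripts may scale bandwidth differently, so treat the formula and the 16 kHz sample rate as assumptions.

```matlab
Fs = 16000;                          % sample rate in Hz (assumed)

% Build a known all-pole filter from hypothetical formants and bandwidths.
F_true  = [300 2200 2900];           % Hz
BW_true = [60 90 120];               % Hz
p = exp(-pi * BW_true / Fs) .* exp(1j * 2*pi * F_true / Fs);
a = real(poly([p conj(p)]));         % denominator coefficients, a(1) = 1

% Recover frequency and bandwidth from the filter roots.
rts = roots(a);
rts = rts(imag(rts) > 0);            % keep one pole of each conjugate pair
freqs = angle(rts) * Fs / (2*pi);    % pole angle -> frequency in Hz
bws   = -(Fs / pi) * log(abs(rts));  % pole radius -> bandwidth in Hz

[freqs, sortIdx] = sort(freqs);
bws = bws(sortIdx);

% Apply the thresholds described above.
isFormant = (freqs > 90) & (bws < 400);
formants  = freqs(isFormant).';      % row vector, ascending

fprintf('F%d = %.1f Hz\n', [1:numel(formants); formants]);
```

Because the filter here is constructed from the target resonances, the recovered values match F_true to within numerical precision; on real speech, the thresholds are what reject broad or near-DC roots that do not correspond to formants.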

Speech Synthesis

A 1-second periodic impulse train is generated at the estimated F0 and passed through the LPC all-pole filter. The filter acts as a model of the vocal tract and shapes the flat harmonic spectrum of the impulse train into a vowel-like sound. The result approximates the original vowel recording.
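
A minimal sketch of this synthesis step follows. The filter coefficients come from hypothetical formant values instead of an lpc() fit, and the sample rate and F0 are assumed, so the output only illustrates the mechanism.

```matlab
Fs = 16000;                          % sample rate in Hz (assumed)
F0 = 220;                            % fundamental frequency in Hz (assumed)

% Stand-in vocal tract filter built from hypothetical formants.
F  = [300 2200 2900];                % Hz
BW = [60 90 120];                    % Hz
p  = exp(-pi * BW / Fs) .* exp(1j * 2*pi * F / Fs);
a  = real(poly([p conj(p)]));        % all-pole denominator, a(1) = 1

% 1-second impulse train: one unit impulse every pitch period.
% Rounding to whole samples quantises F0 (here to Fs/73, about 219.2 Hz).
period = round(Fs / F0);             % 73 samples
src = zeros(1, Fs);
src(1:period:end) = 1;

% Source-filter synthesis: excite the all-pole filter with the train.
y = filter(1, a, src);
y = y / max(abs(y));                 % normalise to avoid clipping

% soundsc(y, Fs);                    % uncomment to listen
% audiowrite('synth.wav', y, Fs);    % hypothetical output file name
```

The normalisation matters in practice because the gain of a sharply resonant all-pole filter can be large, which would clip when the signal is written to a .wav file.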

Results Summary

Parameter	Female ("heed")	Male ("heed")
Mean F0 (Hz)	~220	~130
F1 (Hz)	~350	~300
F2 (Hz)	~2200	~2100
F3 (Hz)	~2900	~2700
LPC order used	40	40

These values are consistent with published formant data for the /iː/ vowel (Peterson & Barney, 1952).

Usage

Requirements

MATLAB R2019b or later
Signal Processing Toolbox (for lpc())
Audio Toolbox (for pitch())

Steps

1. Clone this repository:

   git clone https://github.com/BilalAhmadSami/LPC-Speech-Synthesizer.git
   cd LPC-Speech-Synthesizer

2. Place your input .wav files (heed_f.wav, heed_m.wav) in the project directory, or update the AUDIO_FILE variable at the top of each script.
3. Open MATLAB and run:

   % For female voice:
   run('speech_synthesis_female.m')

   % For male voice:
   run('speech_synthesis_male.m')

4. When prompted, enter the LPC order k. A value of 14 is a good starting point; try 40 or 100 to observe how spectral resolution changes.
5. The script prints the estimated F0 and formant frequencies, generates five plots, and saves the original and synthesized audio as .wav files.

Effect of LPC Order

LPC order k	Effect
~14 (rule of thumb: Fs/1000 + 2, with Fs in Hz)	Captures main formants; smooth spectral envelope
40	Better spectral detail; more formant peaks resolved
100+	Models fine spectral structure such as individual pitch harmonics; risk of fitting noise rather than the vocal-tract envelope
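
As a quick check of the rule of thumb, the snippet below evaluates Fs/1000 + 2 for a few common sample rates. The list of rates is an assumption for illustration, since the sample rate of the recordings is not stated here.

```matlab
% Rule-of-thumb LPC order for several sample rates (Fs in Hz).
Fs_list = [8000 12000 16000 44100];
k_list  = round(Fs_list / 1000) + 2;

for i = 1:numel(Fs_list)
    fprintf('Fs = %5d Hz -> k = %d\n', Fs_list(i), k_list(i));
end
```

An order of 14 therefore corresponds to a 12 kHz sample rate under this rule; recordings at higher rates call for a proportionally higher starting order.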

Plots Generated

Each script produces five figures:

Speech Waveforms — full signal and the extracted quasi-stationary segment
Impulse Train — the periodic source signal at frequency F0
Filter Response vs. Amplitude Spectrum — LPC filter response overlaid with the segment's dB spectrum
Pole-Zero Diagram — poles of the LPC filter in the z-plane
Synthesized vs. Original — time-domain comparison of the synthesized and original signals
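
The data behind the filter-response overlay can be computed along the lines below. This is a sketch under assumptions: the filter is built from hypothetical formants, the "segment" is a synthetic 100 ms signal, and the Hamming window is written out by hand so the example needs no toolbox. The scripts may compute the overlay differently, for example with freqz().

```matlab
Fs   = 16000;                        % sample rate in Hz (assumed)
nfft = 4096;                         % FFT length for both curves

% Stand-in LPC filter from hypothetical formants.
F  = [300 2200 2900];                % Hz
BW = [60 90 120];                    % Hz
p  = exp(-pi * BW / Fs) .* exp(1j * 2*pi * F / Fs);
a  = real(poly([p conj(p)]));

% Synthetic 100 ms segment: impulse train through the filter.
src = zeros(1, 1600);
src(1:73:end) = 1;
seg = filter(1, a, src);

% Amplitude spectrum of the Hamming-windowed segment, in dB.
n = 0:numel(seg)-1;
w = 0.54 - 0.46 * cos(2*pi*n / (numel(seg) - 1));
S = fft(seg .* w, nfft);

half = 1:nfft/2 + 1;                 % non-negative frequencies only
f    = (half - 1) * Fs / nfft;       % frequency axis in Hz
S_dB = 20 * log10(abs(S(half)) + eps);

% LPC envelope: |1/A(e^{jw})| in dB, evaluated on the same grid.
A    = fft(a, nfft);
H_dB = -20 * log10(abs(A(half)));

[~, iPeak] = max(H_dB);
fprintf('Strongest envelope peak near %.0f Hz\n', f(iPeak));
```

Calling plot(f, S_dB) followed by hold on and plot(f, H_dB) would draw the overlay; the two curves differ by a gain offset unless the envelope is scaled by the prediction error.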

References

Peterson, G. E., & Barney, H. L. (1952). Control methods used in a study of the vowels. Journal of the Acoustical Society of America, 24(2), 175–184.
Makhoul, J. (1975). Linear prediction: A tutorial review. Proceedings of the IEEE, 63(4), 561–580.
Rabiner, L. R., & Schafer, R. W. (2010). Theory and Applications of Digital Speech Processing. Pearson.

MSc Artificial Intelligence, University of Surrey
