LPC Speech Synthesizer

Linear Predictive Coding (LPC) based vowel speech synthesis using the source-filter model of speech production — implemented in MATLAB.

This project was completed as part of an MSc Artificial Intelligence programme at the University of Surrey.


Overview

The source-filter model of speech production treats the vocal tract as a time-varying linear filter excited by a source signal, which is periodic for voiced sounds such as vowels. This project implements that model by:

  1. Loading real male and female vowel recordings ("heed")
  2. Extracting a ~100 ms quasi-stationary segment from the vowel nucleus
  3. Estimating the mean fundamental frequency F0 using MATLAB's pitch() function
  4. Fitting an all-pole LPC filter to the segment using the autocorrelation method (lpc())
  5. Extracting the first three formant frequencies from the LPC filter roots
  6. Generating a periodic impulse train at frequency F0 as the synthetic source
  7. Synthesizing speech by filtering the impulse train through the LPC filter
  8. Comparing the synthesized signal with the original in both time and frequency domains
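Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the code in the scripts. It uses a synthetic harmonic signal in place of the recordings, and it estimates F0 from the strongest autocorrelation peak instead of calling pitch(), so that the block runs in base MATLAB. The sample rate, the 60-400 Hz search range and all variable names are assumptions.

```matlab
% Synthetic stand-in for a vowel recording (assumption: the real scripts
% load heed_f.wav or heed_m.wav instead).
Fs = 16000;                          % sample rate in Hz (assumed)
t  = (0:Fs-1).' / Fs;                % 1 s time axis
trueF0 = 200;                        % known pitch of the test signal
x  = sin(2*pi*trueF0*t) + 0.5*sin(2*pi*2*trueF0*t) + 0.25*sin(2*pi*3*trueF0*t);

% Step 2: take a ~100 ms segment centred on the middle of the signal.
segLen   = round(0.1 * Fs);
startIdx = floor((numel(x) - segLen) / 2) + 1;
seg      = x(startIdx : startIdx + segLen - 1);

% Step 3 (swapped-in method): F0 from the strongest autocorrelation peak
% within an assumed pitch range of 60-400 Hz.
seg    = seg - mean(seg);
minLag = floor(Fs / 400);
maxLag = ceil(Fs / 60);
r = zeros(maxLag, 1);
for lag = 1:maxLag
    r(lag) = sum(seg(1:end-lag) .* seg(1+lag:end));
end
[~, rel] = max(r(minLag:maxLag));
bestLag  = rel + minLag - 1;         % lag of the peak, in samples
F0 = Fs / bestLag;
fprintf('Estimated F0: %.1f Hz\n', F0);
```

On a real recording the segment would normally be chosen by inspecting the waveform for the steady part of the vowel, since the vowel nucleus is not necessarily at the centre of the file.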

Repository Structure

    LPC-Speech-Synthesizer/
    ├── speech_synthesis_female.m   # LPC synthesis pipeline for female voice
    ├── speech_synthesis_male.m     # LPC synthesis pipeline for male voice
    └── README.md

Note: Speech sample .wav files are not included. To reproduce results, download male/female vowel recordings (e.g. from the IViE Corpus) and place heed_f.wav and heed_m.wav in the same directory as the scripts.


Key Concepts

Linear Predictive Coding (LPC)

LPC models the vocal tract as an all-pole (autoregressive, AR) filter. Each speech sample is approximated as a linear combination of the previous k samples, weighted by prediction coefficients estimated with the autocorrelation (Yule-Walker) method. The LPC order k sets how many past samples are used: higher orders capture more spectral detail, at the risk of fitting noise rather than the vocal tract envelope.
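To make the autocorrelation method concrete, here is a small base-MATLAB sketch that solves the Yule-Walker equations directly. The project itself relies on lpc(), so this is an illustration of the idea, not the scripts' code. The AR(2) test signal, its coefficients and the variable names are all assumptions chosen so that the result can be checked against a known answer.

```matlab
% Known AR(2) test signal so the estimate can be checked (assumption:
% the real scripts pass the vowel segment to lpc() instead).
rng(0);                              % reproducible noise
aTrue = [1 -1.5 0.9];                % stable all-pole filter, pole radius ~0.95
x = filter(1, aTrue, randn(20000, 1));

k = 2;                               % LPC order
N = numel(x);

% Biased autocorrelation estimates r(0)..r(k).
r = zeros(k + 1, 1);
for lag = 0:k
    r(lag + 1) = sum(x(1:N-lag) .* x(1+lag:N)) / N;
end

% Yule-Walker normal equations: R * alpha = r(1..k), with R Toeplitz.
R     = toeplitz(r(1:k));
alpha = R \ r(2:k+1);                % predictor weights on past samples
a     = [1; -alpha].';               % polynomial A(z) as a row vector starting with 1
g     = r(1) - alpha.' * r(2:k+1);   % prediction error variance

disp(a);
```

The recovered a should be close to aTrue, and g close to the unit variance of the driving noise. For realistic orders a Levinson-Durbin recursion is usually preferred over a direct solve because it exploits the Toeplitz structure.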

Formant Estimation

Formant frequencies (the resonances of the vocal tract) are extracted from the poles of the LPC filter. Each complex-conjugate pole pair in the z-plane corresponds to one resonance: the pole angle gives its centre frequency and the pole radius gives its bandwidth, with poles closer to the unit circle being narrower. Roots with bandwidth below 400 Hz and frequency above 90 Hz are classified as formants. The first three formants (F1, F2, F3) characterise the vowel quality.
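The root-based procedure described above might look like the sketch below. It builds a test filter with known resonances so that the extraction can be verified. The sample rate, the test frequencies, the pole radius and the variable names are assumptions. The bandwidth scaling is one common convention and may differ from the scripts: some references use -(Fs/pi)*log(|z|), which is twice the value computed here.

```matlab
Fs = 16000;                                  % sample rate in Hz (assumed)

% Build a test all-pole filter with known resonances (illustrative values).
fTrue = [300 2200 2900];                     % resonance frequencies in Hz
rad   = 0.98;                                % pole radius, sets the bandwidth
p     = rad * exp(1j * 2*pi*fTrue / Fs);
a     = real(poly([p conj(p)]));             % denominator coefficients A(z)

% Keep one root from each complex-conjugate pair.
rts = roots(a);
rts = rts(imag(rts) > 0);

% Pole angle -> frequency, pole radius -> bandwidth (assumed convention).
freqs = atan2(imag(rts), real(rts)) * Fs / (2*pi);
bw    = -0.5 * (Fs / (2*pi)) * log(abs(rts));

% Apply the thresholds quoted above, then sort by frequency.
keep     = (freqs > 90) & (bw < 400);
formants = sort(freqs(keep));
fprintf('F%d = %.0f Hz\n', [1:numel(formants); formants.']);
```

With a real LPC fit of order 40 there are many more roots than formants, so the thresholds matter: they discard broad poles that shape the overall spectral tilt and keep only the sharp resonances.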

Speech Synthesis

A 1-second periodic impulse train is generated at the estimated F0 and passed through the LPC all-pole filter. The filter acts as a model of the vocal tract and shapes the flat spectrum of the impulse train into a vowel-like sound. The result approximates the original vowel recording.
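A minimal sketch of this synthesis step is shown below. The filter here is a single hand-built resonance standing in for the LPC coefficients, and the sample rate, F0 value, output file name and variable names are assumptions, not values taken from the scripts.

```matlab
Fs = 16000;                          % sample rate in Hz (assumed)
F0 = 220;                            % pitch in Hz (illustrative value)

% Stand-in vocal tract filter: one resonance near 350 Hz (illustrative;
% the real scripts use the coefficients returned by lpc()).
pole = 0.97 * exp(1j * 2*pi*350 / Fs);
a    = real(poly([pole conj(pole)]));

% 1-second impulse train with one unit pulse every pitch period.
period     = round(Fs / F0);         % pitch period in samples
excitation = zeros(Fs, 1);
excitation(1:period:end) = 1;

% All-pole synthesis filter 1/A(z), then peak-normalise to avoid clipping.
y = filter(1, a, excitation);
y = 0.9 * y / max(abs(y));

% soundsc(y, Fs);                    % uncomment to listen
% audiowrite('synth.wav', y, Fs);    % hypothetical output file name
```

Rounding the period to a whole number of samples quantises the pitch slightly: in this example 16000/73 is about 219.2 Hz and not exactly 220 Hz. The difference is small at typical speech sample rates.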


Results Summary

| Parameter | Female ("heed") | Male ("heed") |
|---|---|---|
| Mean F0 | ~220 Hz | ~130 Hz |
| F1 (Hz) | ~350 | ~300 |
| F2 (Hz) | ~2200 | ~2100 |
| F3 (Hz) | ~2900 | ~2700 |
| LPC order used | 40 | 40 |

These values are broadly consistent with published formant data for the /iː/ vowel (Peterson & Barney, 1952), which show the low F1 and high F2 typical of a close front vowel.


Usage

Requirements

  • MATLAB R2019b or later
  • Signal Processing Toolbox (for lpc())
  • Audio Toolbox (for pitch())

Steps

  1. Clone this repository:

     git clone https://github.com/BilalAhmadSami/LPC-Speech-Synthesizer.git
     cd LPC-Speech-Synthesizer

  2. Place your input .wav files (heed_f.wav, heed_m.wav) in the project directory, or update the AUDIO_FILE variable at the top of each script.

  3. Open MATLAB and run:

     % For female voice:
     run('speech_synthesis_female.m')

     % For male voice:
     run('speech_synthesis_male.m')

  4. When prompted, enter the LPC order k. A value of 14 is a good starting point; try 40 or 100 to observe how spectral resolution changes.

  5. The script prints the estimated F0 and formant frequencies, generates five plots, and saves the original and synthesized audio as .wav files.

Effect of LPC Order

| LPC order k | Effect |
|---|---|
| ~14 (rule of thumb: Fs/1000 + 2) | Captures main formants; smooth spectral envelope |
| 40 | Better spectral detail; more formant peaks resolved |
| 100+ | Models fine spectral structure; risk of overfitting noise |
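The rule of thumb in the first row takes Fs in Hz. As a quick worked example, the snippet below evaluates it for a few common sample rates. The list of rates is an assumption for illustration and says nothing about the sample rate of the recordings used here.

```matlab
% Rule-of-thumb LPC order, k = Fs/1000 + 2, for some common sample rates.
sampleRates = [8000 12000 16000 44100];      % Hz (illustrative values)
ks = round(sampleRates / 1000) + 2;

for i = 1:numel(sampleRates)
    fprintf('Fs = %5d Hz -> k = %d\n', sampleRates(i), ks(i));
end
```

The suggested starting value of 14 corresponds to a 12 kHz sample rate under this rule, so recordings at higher rates would call for a proportionally higher starting order.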

Plots Generated

Each script produces five figures:

  1. Speech Waveforms — full signal and the extracted quasi-stationary segment
  2. Impulse Train — the periodic source signal at frequency F0
  3. Filter Response vs. Amplitude Spectrum — LPC filter response overlaid with the segment's dB spectrum
  4. Pole-Zero Diagram — poles of the LPC filter in the z-plane
  5. Synthesized vs. Original — time-domain comparison of the synthesized and original signals
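A figure like the third one could be produced along the following lines. This is a sketch under stated assumptions, not the plotting code from the scripts. The segment is a synthetic impulse train passed through a known two-pole resonator, the Hamming window is written out by hand so that no toolbox is needed, and the filter response is obtained from an FFT of the coefficients. The sample rate, FFT length and variable names are assumptions.

```matlab
Fs   = 16000;                                 % sample rate in Hz (assumed)
nfft = 2048;                                  % FFT length (assumed)

% Stand-in segment: impulse train through a known two-pole resonator.
pole = 0.97 * exp(1j * 2*pi*500 / Fs);
a    = real(poly([pole conj(pole)]));
exc  = zeros(round(0.1 * Fs), 1);
exc(1:80:end) = 1;                            % 200 Hz source at Fs = 16 kHz
seg  = filter(1, a, exc);

% Amplitude spectrum of the Hamming-windowed segment in dB.
n   = (0:numel(seg)-1).';
win = 0.54 - 0.46 * cos(2*pi*n / (numel(seg)-1));
S   = 20*log10(abs(fft(seg .* win, nfft)) + eps);

% Filter response 1/A(e^{jw}) in dB, evaluated on the same FFT grid.
H   = -20*log10(abs(fft(a(:), nfft)) + eps);

% Positive-frequency axis and the location of the response peak.
f = (0:nfft/2).' * Fs / nfft;
[~, iPeak] = max(H(1:nfft/2+1));
peakHz = f(iPeak);

plot(f, S(1:nfft/2+1), f, H(1:nfft/2+1), 'LineWidth', 1.2);
xlabel('Frequency (Hz)'); ylabel('Magnitude (dB)');
legend('Segment spectrum', 'LPC filter response');
```

The two curves are not gain-matched in this sketch, so the filter response sits at a different level from the harmonic peaks. What matters for the comparison is that the envelope peaks line up with the strongest harmonics.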

References

  • Peterson, G. E., & Barney, H. L. (1952). Control methods used in a study of the vowels. Journal of the Acoustical Society of America, 24(2), 175–184.
  • Makhoul, J. (1975). Linear prediction: A tutorial review. Proceedings of the IEEE, 63(4), 561–580.
  • Rabiner, L. R., & Schafer, R. W. (2010). Theory and Applications of Digital Speech Processing. Pearson.

MSc Artificial Intelligence, University of Surrey
