doco changes

author drowe67 <drowe67@01035d8c-6547-0410-b346-abe4f91aad63>

Tue, 16 Nov 2010 00:14:57 +0000 (00:14 +0000)

committer drowe67 <drowe67@01035d8c-6547-0410-b346-abe4f91aad63>

Tue, 16 Nov 2010 00:14:57 +0000 (00:14 +0000)
diff --git a/codec2/src/phase.c b/codec2/src/phase.c

index c503d3b36449857f831bc8f24047efd0cdbdaf21..5210f64b228bbc0de73a71f13aeef277c62b5220 100644 (file)
--- a/codec2/src/phase.c
+++ b/codec2/src/phase.c
@@ -149,24 +149,26 @@ void aks_to_H(
     This E[m] then gets passed through the LPC synthesis filter to
     determine the final harmonic phase.
       
-   For a while there were prolems with low pitched males like hts1
-   sounding "clicky".  The synthesied time domain waveform also looked
-   clicky.  Many methods were tried to improve the sounds quality of
-   low pitched males. Finally adding a small amount of jitter to each
-   harmonic worked.
-
-   The current result sounds very close to the original phases, with
-   only 1 voicing bit per frame.  For example hts1a using original
-   amplitudes and this phase model produces speech hard to distinguish
-   from speech synthesise with the orginal phases.  The sound quality
-   of this patrtiallyuantised codec (nb original amplitudes) is higher
-   than g729, even though all the phase information has been
-   discarded.
+   Comparing to speech synthesised using original phases:
+
+   - Through headphones speech synthesised with this model is not as 
+     good. Through a loudspeaker it is very close to original phases.
+
+   - If there are voicing errors, the speech can sound clicky or
+     staticy.  If V speech is mistakenly declared UV, this model tends to
+     synthesise impulses or clicks, as there is usually very little shift or
+     dispersion through the LPC filter.
+
+   - When combined with LPC amplitude modelling there is an additional
+     drop in quality.  I am not sure why; one theory is that interformant
+     energy is raised, making any phase errors more obvious.
  
     NOTES:
  
-     1/ This synthesis model is effectvely the same as simple LPC-10
-     vocoders, and yet sounds much better.  Why?
+     1/ This synthesis model is effectively the same as a simple LPC-10
+     vocoder, and yet sounds much better.  Why? Conventional wisdom
+     (AMBE, MELP) says mixed voicing is required for high quality
+     speech.
  
       2/ I am pretty sure the Lincoln Lab sinusoidal coding guys (like xMBE
       also from MIT) first described this zero phase model, I need to look
@@ -182,9 +184,6 @@ void aks_to_H(
       a small delta-W to make phase tracks line up for voiced
       harmonics.
  
-     4/ Why does this sound so great with 1 V/UV decision?  Conventional
-     wisdom says mixed voicing is required for high qaulity speech.
-
  \*---------------------------------------------------------------------------*/
  
  void phase_synth_zero_order(