-Codec2 - Open Source Low Bit Rate Speech Codec
-===============================================
-
-David Rowe, VK5DGR
-
-[[introduction]]
-Introduction
-------------
-
-Codec2 is an open source low bit rate speech codec designed for
-communications quality speech at around 2400 bit/s. Applications
-include low bandwidth HF/VHF digital radio. It fills a gap in open
-source, free-as-in-speech voice codecs beneath 5000 bit/s.
-
-The motivations behind the project are summarised in this
-link:/blog/?p=128[blog post].
-
-You can help and support Codec2 development via a <<help,PayPal donation>>.
-
-[[status]]
-Status
-------
-
-Still in experimental/development stage - no 2400 bit/s codec
-available yet, but we are getting very close!
-
-Progress to date:
-
-1. Unquantised encoder (sinenc) and decoder (sinedec) running under
- Linux/gcc. The decoder (sinedec) is a test-bed for various
- modelling and quantisation options - these are controlled via
- command line switches.
-
-2. LPC modelling is working nicely and there is a first pass LSP
- vector quantiser working at 32 bits/frame with acceptable voice
- quality. Lots could be done to improve this (e.g. improved quality
- and reduced bit rate).
-
-3. A phase model has been developed that uses 0 bits for phase and 1
- bit/frame for voiced/unvoiced decision but delivers high quality
- speech. This works suspiciously well - codecs with a single bit
- voiced/unvoiced decision aren't meant to sound this good. Usually
- mixed voicing at several bits/frame is required.
-
-4. An experimental post filter has been developed to improve
- performance with speech mixed with background noise. The post
- filter delivers many of the advantages of mixed voicing but unlike
- mixed voicing requires zero bits.
-
-5. Non-Linear Pitch (NLP) pitch estimator working OK, and a simple
- pitch tracker has been developed to help with some problem frames.
-
-6. An algorithm for decimating sinusoidal model parameters from the
- native 10ms rate to a 20ms rate has been developed. High quality
-   speech is maintained; some small differences are only audible
- through headphones. A 20ms frame rate is required for 2400 bit/s
- coding.
-
-Current work areas:
-
-1. Reduce CPU Load of the first order phase model fit. Apart from
- that the codec runs very quickly.
-
-2. Write separate encoder and decoder functions and demo programs for
- the alpha 2400 bit/s codec. Currently most of the processing
- happens inside the sinedec program.
-
-[[source]]
-The Source Code
----------------
-
-Browse:
-
-http://freetel.svn.sourceforge.net/viewvc/freetel/codec2/[http://freetel.svn.sourceforge.net/viewvc/freetel/codec2/]
-
-Check Out:
-
- $ svn co https://freetel.svn.sourceforge.net/svnroot/freetel/codec2 codec2
-
-[[support]]
-The Mailing List
-----------------
-
-For any questions, comments, support, or suggestions for applications by
-end users and developers, please post to the https://lists.sourceforge.net/lists/listinfo/freetel-codec2[Codec2 Mailing List].
-
-[[quickstart]]
-Quick Start
------------
-
-To encode the file raw/hts1a.raw to a set of sinusoidal model
-parameters (hts1.mdl), then decode to a raw file (hts1a_uq.raw):
-
- $ cd codec2/src
- $ make
- $ ./sinenc ../raw/hts1a.raw hts1.mdl 300 ../pitch/hts1a.p
- $ ./sinedec ../raw/hts1a.raw hts1.mdl -o hts1a_uq.raw
- $ play -f s -r 8000 -s w ../raw/hts1a.raw
- $ play -f s -r 8000 -s w hts1a_uq.raw
-
-[[plan]]
-Development Roadmap
--------------------
-
- [X] Milestone 0 - Project kick off
- [X] Milestone 1 - Alpha 2400 bits/s codec
- [X] Spectral amplitudes modelled using LPC
- [X] Phase and voicing model developed
- [X] Pitch estimator
- [X] Spectral amplitudes quantised using LSPs
-    [X] Decimation of model parameters from 10ms to 20ms
-    [ ] Refactor to develop separate encoder/decoder functions
- [ ] Complete 2400 bits/s codec demonstrated
- [ ] Reduced complexity voicing estimator
- [ ] Test phone call over LAN
- [ ] Release 0.1 for Alpha Testing
- [ ] Milestone 2 - Beta codec for digital radio
- [ ] Gather samples from the community with different speakers,
- input filtering, and background noise conditions that break
-          the codec.
- [ ] Further develop algorithm based on samples above
- [ ] Design FEC scheme
- [ ] Test over digital radio links
- [ ] Milestone 3 - Fixed point port
- [ ] Milestone 4 - codec2-on-a-chip embedded DSP/CPU port
-
-[[howitworks]]
-How it Works
-------------
-
-Speech is modelled as a sum of sinusoids:
-
-    for(m=1; m<=L; m++)
-      s[n] += A[m]*cos(Wo*m*n + phi[m]);
-
-The sinusoids are multiples of the fundamental frequency Wo
-(omega-naught), so the technique is known as "harmonic sinusoidal
-coding". For each frame, we analyse the speech signal and extract a set of
-parameters:
-
- Wo, {A}, {phi}
-
-where Wo is the fundamental frequency (also known as the pitch), { A
-} is a set of L amplitudes and { phi } is a set of L phases. L is
-chosen to be equal to the number of harmonics that can fit in a 4 kHz
-bandwidth:
-
- L = floor(pi/Wo)
-
-Wo is specified in radians normalised to 4 kHz, such that pi radians
-= 4 kHz. The fundamental frequency in Hz is:
-
- F0 = (8000/(2*pi))*Wo
-
-We then need to encode (quantise) Wo, { A }, { phi } and transmit them to
-a decoder which reconstructs the speech. A frame might be 10-20ms in
-length so we update the parameters every 10-20ms (100 to 50 Hz
-update rate).
-
-The speech quality of the basic harmonic sinusoidal model is pretty
-good, close to transparent. It is also relatively robust to Wo
-estimation errors. Unvoiced speech (e.g. consonants) is well
-modelled by a bunch of harmonics with random phases. Speech corrupted
-with background noise also sounds OK; the background noise doesn't
-introduce any grossly unpleasant artifacts.
-
-As the parameters are quantised to a low bit rate and sent over the
-channel, the speech quality drops. The challenge is to achieve a
-reasonable trade off between speech quality and bit rate.
-
-Bit Allocation
+Codec 2 README
--------------
-[grid="all"]
-`-------------------------------.----------
-Parameter bits/frame
--------------------------------------------
-Spectral magnitudes (LSPs) 32
-Low frequency LPC correction 1
-Energy 5
-Voicing 1
-Fundamental Frequency (Wo) 7
-Spare 2
--------------------------------------------
-Total 48
--------------------------------------------
+Codec 2 is an open source 2400 bit/s speech codec. For more
+information please see:
-At a 20ms update rate 48 bits/frame is 2400 bits/s.
+  http://rowetel.com/codec2.html
-[[challenges]]
-Challenges
+Quickstart
----------
-The tough bits of this project are:
-
-1. Parameter estimation, in particular pitch (Wo) detection.
-
-2. Reduction of a time-varying number of parameters (L changes with Wo
- each frame) to a fixed number of parameters required for a fixed
- bit rate. The trick here is that { A } tend to vary slowly with
- frequency, so we can "fit" a curve to the set of { A } and send
- parameters that describe that curve.
-
-3. Discarding the phases { phi }. In most low bit rate codecs phases
- are discarded, and synthesised at the decoder using a rule-based
- approach. This also implies the need for a "voicing" model as
- voiced speech (vowels) tends to have a different phase structure to
-  unvoiced (consonants). The voicing model needs to be accurate (not
- introduce distortion), and relatively low bit rate.
-
-4. Quantisation of the amplitudes { A } to a small number of bits
- while maintaining speech quality. For example 30 bits/frame at a
- 20ms frame rate is 30/0.02 = 1500 bits/s, a large part of our 2400
- bit/s "budget".
-
-5. Performance with different speakers and background noise
- conditions. This is where you come in - as codec2 develops please
-  send me samples of its performance with various speakers and
- background noise conditions and together we will improve the
- algorithm. This approach proved very powerful when developing
- link:oslec.html[Oslec]. One of the cool things about open source!
-
-[[help]]
-Can I help?
------------
-
-Maybe; check out the <<plan, Development Roadmap>> above and the latest
-version of the
-http://freetel.svn.sourceforge.net/viewvc/freetel/codec2/TODO.txt?view=log[TODO]
-list and see if there is anything that interests you.
-
-Not all of this project is DSP. There are many general C coding tasks
-like refactoring and writing a command line soft phone application for
-testing the codec over a LAN.
-
-I will happily accept *sponsorship* for this project. For example
-research grants, or development contracts from companies interested in
-seeing an open source low bit rate speech codec.
+$ cd codec2/src
+$ make
+$ ./c2enc ../raw/hts1a.raw hts1a_c2.bit
+$ ./c2dec hts1a_c2.bit hts1a_c2.raw
+$ ../scripts/menu.sh ../raw/hts1a.raw hts1a_c2.raw
-You can also donate to the codec2 project via PayPal (which also
-allows credit card donations):
+Programs
+--------
-+++++++++++++++++++++++++++
-<form name="_xclick" action="https://www.paypal.com/cgi-bin/webscr" method="post">
-<input type="hidden" name="cmd" value="_xclick">
-<input type="hidden" name="business" value="david@rowetel.com">
-<input type="hidden" name="item_name" value="Codec2 donation">
-<input type="hidden" name="currency_code" value="USD">
-Donation in US$: <input name="amount" value="10.00">
-<input type="image" src="http://www.paypal.com/en_US/i/btn/btn_donate_LG.gif" border="0" name="submit" alt="Make payments with PayPal - it's fast, free and secure!">
-</form>
-+++++++++++++++++++++++++++
+1/ c2enc encodes a file of speech samples to a file of encoded bits.
+One bit is stored in the LSB of each byte.
-[[patents]]
-Is it Patent Free?
-------------------
+2/ c2dec decodes a file of bits to a file of speech samples.
-I think so - much of the work is based on old papers from the 60s, 70s
-and 80s, and the PhD thesis work [2] used as a baseline for this codec
-was original. A nice little mini project would be to audit the
-patents used by proprietary 2400 bit/s codecs (MELP and xMBE) and
-compare.
+3/ c2sim is a simulation/development version of codec 2. It allows
+selective use of the various codec 2 algorithms, for example
+switching phase modelling or LSP quantisation on and off.
-Proprietary codecs typically have small, novel parts of the algorithm
-protected by patents. However proprietary codecs also rely heavily on
-large bodies of public domain work. The patents cover perhaps 5% of
-the codec algorithms. Proprietary codec designers did not invent most
-of the algorithms they use in their codec. Typically, the patents just
-cover enough to make designing an interoperable codec very difficult.
-These also tend to be the parts that make their codecs sound good.
-
-However there are many ways to make a codec sound good, so we simply
-need to choose and develop other methods.
-
-[[compat]]
-Is Codec2 compatible with xMBE or MELP?
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Nope - I don't think it's possible to build a compatible codec without
-infringing on patents or gaining access to commercial-in-confidence
-information.
-
-[[hacking]]
-Hacking
--------
-
-If you would like to work on the Codec2 code base here are some
-notes:
-
-* src/code.sh performs the several processing steps required to
-  output speech files at various stages of processing, for
-  example:
-
- $ ./code.sh hts1a
-+
-will produce hts1a_uq (unquantised, i.e. baseline sinusoidal model),
-hts1a_phase0 (zero phase model), hts1a_lpc10 (10th order LPC model).
-
-* You can then listen to all of these samples (and the original)
- using:
-
- $ ./listen.sh hts1a
-
-* Specific notes about LPC and Phase modelling are below.
-
-* There are some useful scripts in the scripts directory, for example
- wav2raw.sh, raw2wav.sh, playraw.sh, menu.sh. Note that code.sh and
-  listen.sh are in the src directory as that's where they get used most
- of the time.
-
-[[lpc]]
-LPC Modelling Notes
--------------------
-
-Linear Predictive Coding (LPC) modelling is used to model the
-sine wave amplitudes { A }. The use of LPC in speech coding is
-common, although the application of LPC modelling to frequency domain
-coding is fairly novel. LPCs are mainly used in time domain codecs
-like LPC-10 and CELP.
-
-LPC modelling has a couple of advantages:
-
-* From time domain coding we know a lot about LPC, for example how to
- quantise them efficiently using Line Spectrum Pairs (LSPs).
-
-* The number of amplitudes varies each frame as Wo and hence L vary.
- This makes the { A } tricky to quantise and transmit. However it is
- possible to convey the same information using a fixed number of
- LPCs which makes the quantisation problem easier.
-
-To test LPC modelling:
-
-    $ ./sinedec ../raw/hts1a.raw hts1a.mdl --lpc 10 -o hts1a_lpc10.raw
-
-The blog post [4] discusses why LPC modelling works so well when the
-{ A } are recovered via the RMS method (Section 5.1 of the thesis).
-The equal area model of LPC spectra versus harmonics seems to work
-remarkably well, especially compared to sampling, with SNRs of up to
-30dB on female frames.
-
-There is a problem with modelling the low order (e.g. m=1,
-i.e. fundamental) harmonics for males. The amplitude of the m=1
-harmonic is raised by as much as 30dB after LPC modelling as (I think)
-LPC spectra must have zero derivative at DC. This means it's poor at
-modelling very low freq harmonics which unfortunately the ear is very
-sensitive to. To correct this an extra bit has been added to correct
-LPC modelling errors on the first harmonic. When set this bit
-instructs the decoder to attenuate the LPC modelled harmonic by 30dB.
-
-[[phase]]
-Phase Modelling Notes
----------------------
-
-"Zero order" and "First order" phase models have been developed to
-synthesise phases of each harmonic at the decoder side. These are
-described in source code of
-http://freetel.svn.sourceforge.net/viewvc/freetel/codec2/src/phase.c?view=log[phase.c].
-
-The zero phase model requires just one voicing bit to be transmitted
-to the decoder; all other phase information is synthesised using a
-rule-based model. It seems to work OK for most speech samples, but
-adds a "clicky" artifact to some low pitched speakers. Also see the blog
-posts below for more discussion of phase models.
-
-To determine voicing we attempt to fit a first order phase model, then
-measure SNR of the fit. The frame is declared unvoiced if the SNR is
-beneath a threshold.
-
-Current phase modelling makes no attempt to match harmonic phases at
-frame edges. This area would be worth experimenting with, as it could
-cause roughness. Then again it might be responsible for effective
-mixed voicing modelling.
-
-Unvoiced speech can be represented well by random phases and a Wo
-estimate that jumps around randomly. If Wo is small the number of
-harmonics is large which makes the noise less periodic and more noise
-like to the ear. With Wo jumping around phase tracks are
-discontinuous between frames which also makes the synthesised signal
-more noise like and prevents time domain pulses forming that the ear
-is sensitive to.
-
-Running Phase Models
-~~~~~~~~~~~~~~~~~~~~
-
-In src/phase.c two phase models have been implemented, the algorithms
-are explained in the source code.
-
-Zero Phase model (just one voicing bit/frame):
-
-    $ ./sinedec ../raw/hts1a.raw hts1a.mdl --phase 0 -o hts1a_phase0.raw
-
-First order phase model (when quantised, around 13 bits/frame for
-pulse position, phase, and a voicing bit):
-
-    $ ./sinedec ../raw/hts1a.raw hts1a.mdl --phase 1 -o hts1a_phase1.raw
-
-[[octave]]
-Octave Scripts
---------------
-
-* pl.m - plot a segment from a raw file
-
-* pl2.m - plot the same segments from two different files to compare
-
-* plamp.m - menu based GUI interface to "dump" files, move back and forward
- through file examining time and frequency domain parameters, lpc
- model etc
-
- $ ./sinedec ../raw/hts1a.raw hts1a.mdl --lpc 10 --dump hts1a
- $ cd ../octave
- $ octave
- octave:1> plamp("../src/hts1a",25)
-
-* plphase.m - similar to plamp.m but for analysing phase models
-
- $ ./sinedec ../raw/hts1a.raw hts1a.mdl --phase [0|1] --dump hts1a_phase
- $ cd ../octave
- $ octave
- octave:1> plphase("../src/hts1a_phase",25)
-
-* plpitch.m - plot two pitch contours (.p files) and compare
-
-* plnlp.m - plots a bunch of NLP pitch estimator states. link:/images/codec2/tnlp_screenshot.png[Screenshot]
-
-[[directories]]
Directories
-----------
- script - shell scripts to simply common operations
- speex - LSP quantisation code borrowed from Speex for testing
+ script - shell scripts for playing and converting raw files
src - C source code
- octave - Matlab/Octave scripts
+ octave - Octave scripts used for visualising internal signals
+ during development
pitch - pitch estimator output files
raw - speech files in raw format (16 bits signed linear 8 KHz)
- unittest - Unit test source code
+ unittest - unit test source code
wav - speech files in wave file format
-[[other]]
-Other Uses
-----------
-
-The DSP algorithms contained in codec2 may be useful for other DSP
-applications, for example:
-
-* The
- http://freetel.svn.sourceforge.net/viewvc/freetel/codec2/src/nlp.c?view=log[nlp.c]
- pitch estimator is a working, tested, pitch estimator for human
- speech. NLP is an open source pitch estimator presented in C code
- complete with a GUI debugging tool (plnlp.m
- link:/images/codec2/tnlp_screenshot.png[screenshot]). It can be run
- stand-alone using the
- http://freetel.svn.sourceforge.net/viewvc/freetel/codec2/unittest/tnlp.c?view=log[tnlp.c]
- unit test program. It could be applied to other speech coding
- research. Pitch estimation is a popular subject in academia,
- however most pitch estimators are described in papers, with the fine
- implementation details left out.
-
-* The basic analysis/synthesis framework could be used for high
- quality speech synthesis.
-
-[[refs]]
-References
-----------
-
-[1] http://perens.com/[Bruce Perens] introducing the
- http://codec2.org/[codec2 project concept]
-
-[2] David's PhD Thesis,
- http://www.itr.unisa.edu.au/~steven/thesis/dgr.pdf["Techniques for
- Harmonic Sinusoidal Coding"], used for baseline algorithm
-
-[3] http://www.rowetel.com/blog/?p=128[Open Source Low rate Speech
- Codec Part 1 - Introduction]
-
-[4] http://www.rowetel.com/blog/?p=130[Open Source Low rate Speech
- Codec Part 2 - Spectral Magnitudes]
-
-[5] http://www.rowetel.com/blog/?p=131[Open Source Low rate Speech
- Codec Part 3 - Phase and Male Speech]
-
-[6] http://www.rowetel.com/blog/?p=132[Open Source Low rate Speech
- Codec Part 4 - Zero Phase Model]
-
int phase0;
float ex_phase[MAX_AMP+1];
- int voiced, voiced_1, voiced_2;
int postfilt;
float bg_est;
- int hand_snr;
- FILE *fsnr;
+ int hand_voicing;
+ FILE *fvoicing;
MODEL model_1, model_2, model_3, model_synth, model_a, model_b;
int transition, decimate;
prev_Wo = TWO_PI/P_MAX;
- voiced_1 = voiced_2 = 0;
model_1.Wo = TWO_PI/P_MIN;
model_1.L = floor(PI/model_1.Wo);
for(i=1; i<=model_1.L; i++) {
printf(" [--lsp]\n");
printf(" [--phase0]\n");
printf(" [--postfilter]\n");
- printf(" [--hand_snr]\n");
+ printf(" [--hand_voicing]\n");
printf(" [--dec]\n");
printf(" [--dump DumpFilePrefix]\n");
exit(0);
ex_phase[0] = 0;
}
- hand_snr = switch_present("--hand_snr",argc,argv);
- if (hand_snr) {
- fsnr = fopen(argv[hand_snr+1],"rt");
- assert(fsnr != NULL);
+ hand_voicing = switch_present("--hand_voicing",argc,argv);
+ if (hand_voicing) {
+ fvoicing = fopen(argv[hand_voicing+1],"rt");
+ assert(fvoicing != NULL);
}
bg_est = 0.0;
if (phase0) {
float Wn[M]; /* windowed speech samples */
- float ak_phase[PHASE_LPC+1]; /* autocorrelation coeffs */
- float Rk[PHASE_LPC+1]; /* autocorrelation coeffs */
+ float ak_phase[LPC_ORD+1]; /* autocorrelation coeffs */
+ float Rk[LPC_ORD+1]; /* autocorrelation coeffs */
COMP Sw_[FFT_ENC];
dump_phase(&model.phi[0], model.L);
/* determine voicing */
- snr = est_voicing_mbe(&model, Sw, W, (FS/TWO_PI)*model.Wo, Sw_, &voiced);
+ snr = est_voicing_mbe(&model, Sw, W, (FS/TWO_PI)*model.Wo, Sw_);
dump_Sw_(Sw_);
dump_snr(snr);
for(i=0; i<MAX_AMP; i++)
model.phi[i] = 0;
- if (hand_snr) {
- fscanf(fsnr,"%f\n",&snr);
- voiced = snr > 2.0;
+ if (hand_voicing) {
+ fscanf(fvoicing,"%d\n",&model.voiced);
}
- phase_synth_zero_order(&model, ak_phase, voiced, ex_phase);
+ phase_synth_zero_order(&model, ak_phase, ex_phase);
if (postfilt)
- postfilter(&model, voiced, &bg_est);
+ postfilter(&model, &bg_est);
}
/* optional LPC model amplitudes */
if (dump)
dump_off();
- if (hand_snr)
- fclose(fsnr);
+ if (hand_voicing)
+ fclose(fvoicing);
return 0;
}
/* Estimate pitch */
nlp(c2->Sn,N,M,P_MIN,P_MAX,&pitch,Sw,&c2->prev_Wo);
- prev_Wo = TWO_PI/pitch;
+ c2->prev_Wo = TWO_PI/pitch;
model.Wo = TWO_PI/pitch;
+ model.L = PI/model.Wo;
/* estimate model parameters */
dft_speech(Sw, c2->Sn, c2->w);
two_stage_pitch_refinement(&model, Sw);
estimate_amplitudes(&model, Sw, c2->W);
- est_voicing_mbe(&model, Sw, c2->W, (FS/TWO_PI)*model.Wo, Sw_, &voiced1);
-
+ est_voicing_mbe(&model, Sw, c2->W, (FS/TWO_PI)*model.Wo, Sw_);
+ voiced1 = model.voiced;
+
/* Second Frame - send all parameters --------------------------------*/
/* Read input speech */
/* Estimate pitch */
nlp(c2->Sn,N,M,P_MIN,P_MAX,&pitch,Sw,&c2->prev_Wo);
- prev_Wo = TWO_PI/pitch;
+ c2->prev_Wo = TWO_PI/pitch;
model.Wo = TWO_PI/pitch;
+ model.L = PI/model.Wo;
/* estimate model parameters */
dft_speech(Sw, c2->Sn, c2->w);
two_stage_pitch_refinement(&model, Sw);
estimate_amplitudes(&model, Sw, c2->W);
- est_voicing_mbe(&model, Sw, c2->W, (FS/TWO_PI)*model.Wo, Sw_, &voiced2);
+ est_voicing_mbe(&model, Sw, c2->W, (FS/TWO_PI)*model.Wo, Sw_);
+ voiced2 = model.voiced;
/* quantise */
nbits = 0;
- encode_pitch(bits, &nbits, model.Wo);
+ encode_Wo(bits, &nbits, model.Wo);
encode_voicing(bits, &nbits, voiced1, voiced2);
encode_amplitudes(bits, &nbits, c2->Sn, c2->w);
assert(nbits == CODEC2_BITS_PER_FRAME);
\*---------------------------------------------------------------------------*/
-void codec2_decode(float Sn_[], char bits[])
+void codec2_decode(void *codec2_state, short speech[], char bits[])
{
CODEC2 *c2;
MODEL model;
float ak[LPC_ORD+1];
int voiced1, voiced2;
- int i, nbits, transition;
- MODEL model_fwd, model_back, model_interp;
+ int i, nbits;
+ MODEL model_interp;
assert(codec2_state != NULL);
c2 = (CODEC2*)codec2_state;
nbits = 0;
- model.Wo = decode_pitch(bits, &nbits);
+ model.Wo = decode_Wo(bits, &nbits);
+ model.L = PI/model.Wo;
decode_voicing(&voiced1, &voiced2, bits, &nbits);
decode_amplitudes(&model, ak, bits, &nbits);
assert(nbits == CODEC2_BITS_PER_FRAME);
/* First synthesis frame - interpolated from adjacent frames */
- interp(c2->prev_model, &model, &model_interp, &model_fwd, &model_back, &transition);
- phase_synth_zero_order(&model, ak, voiced1, &c2->ex_phase);
-
- postfilter(&model, voiced, &c2->bg_est);
- if (transition) {
- synthesise(Sn_,&model_a,Pn,1);
- synthesise(Sn_,&model_b,Pn,0);
- }
- else {
- synthesise(Sn_,&model,Pn,1);
- }
-
- /* Save output speech to disk */
-
- for(i=0; i<N; i++) {
- if (Sn_[i] > 32767.0)
- buf[i] = 32767;
- else if (Sn_[i] < -32767.0)
- buf[i] = -32767;
- else
- buf[i] = Sn_[i];
- }
-
+ model_interp.voiced = voiced1;
+    interpolate(&model_interp, &c2->prev_model, &model);
+    phase_synth_zero_order(&model_interp, ak, &c2->ex_phase);
+    postfilter(&model_interp, &c2->bg_est);
+ synthesise(c2->Sn_, &model_interp, c2->Pn, 1);
+
+    for(i=0; i<N; i++) {
+        if (c2->Sn_[i] > 32767.0)
+            speech[i] = 32767;
+        else if (c2->Sn_[i] < -32767.0)
+            speech[i] = -32767;
+        else
+            speech[i] = c2->Sn_[i];
+    }
+
+ /* Second synthesis frame */
+
+ model.voiced = voiced2;
+    phase_synth_zero_order(&model, ak, &c2->ex_phase);
+    postfilter(&model, &c2->bg_est);
+ synthesise(c2->Sn_, &model, c2->Pn, 1);
+
+    for(i=0; i<N; i++) {
+        if (c2->Sn_[i] > 32767.0)
+            speech[i+N] = 32767;
+        else if (c2->Sn_[i] < -32767.0)
+            speech[i+N] = -32767;
+        else
+            speech[i+N] = c2->Sn_[i];
+    }
+
+    memcpy(&c2->prev_model, &model, sizeof(MODEL));
}
void *codec2_create();
void codec2_destroy(void *codec2_state);
-void codec2_encode(void *codec2_state, char bits[], short speech[]);
-void codec2_decode(float Sn_[], char bits[]);
+void codec2_encode(void *codec2_state, char bits[], short speech_in[]);
+void codec2_decode(void *codec2_state, short speech_out[], char bits[]);
#endif
typedef struct {
float Wo; /* fundamental frequency estimate in radians */
int L; /* number of harmonics */
- float v[MAX_AMP]; /* voicing measures (unused at present) */
  float A[MAX_AMP];	/* amplitude of each harmonic */
float phi[MAX_AMP]; /* phase of each harmonic */
+ int voiced; /* non-zero if this frame is voiced */
} MODEL;
#endif
fprintf(fmodel,"%f\t",model->A[l]);
for(l=model->L+1; l<MAX_AMP; l++)
fprintf(fmodel,"0.0\t");
- for(l=1; l<=model->L; l++)
- fprintf(fmodel,"%f\t",model->v[l]);
- for(l=model->L+1; l<MAX_AMP; l++)
- fprintf(fmodel,"0.0\t");
+ fprintf(fmodel,"%d\t",model->voiced);
fprintf(fmodel,"\n");
}
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
+#include <assert.h>
#include <math.h>
#include <string.h>
#include "defines.h"
#include "interp.h"
+float sample_log_amp(MODEL *model, float w);
+
/*---------------------------------------------------------------------------*\
- interp()
+ FUNCTION....: interp()
+ AUTHOR......: David Rowe
+ DATE CREATED: 22/8/10
- Given two frames decribed by model parameters 20ms apart, determines the
- model parameters of the 10ms frame between them.
-
+  Given two frames described by model parameters 20ms apart, determines
+ the model parameters of the 10ms frame between them. Assumes
+ voicing is available for middle (interpolated) frame. Outputs are
+ amplitudes and Wo for the interpolated frame.
+
+ This version can interpolate the amplitudes between two frames of
+ different Wo and L.
+
\*---------------------------------------------------------------------------*/
-void interp(
+void interpolate(
+ MODEL *interp, /* interpolated model params */
MODEL *prev, /* previous frames model params */
- MODEL *next, /* next frames model params */
- MODEL *synth, /* interp model params for cont frame */
- MODEL *a, /* prev frame extended into this frame */
- MODEL *b, /* next frame extended into this frame */
- int *transition /* non-zero if this is a transition frame, this
- information is used for synthesis */
+ MODEL *next /* next frames model params */
)
{
- int m;
-
- if (fabs(next->Wo - prev->Wo) < 0.1*next->Wo) {
-
- /* If the Wo of adjacent frames is within 10% we synthesise a
- continuous track through this frame by linear interpolation
- of the amplitudes and Wo. This is typical of a strongly
- voiced frame.
- */
-
- *transition = 0;
-
- synth->Wo = (next->Wo + prev->Wo)/2.0;
- if (next->L > prev->L)
- synth->L = prev->L;
- else
- synth->L = next->L;
- for(m=1; m<=synth->L; m++) {
- synth->A[m] = (prev->A[m] + next->A[m])/2.0;
- }
+ int l;
+ float w,log_amp;
+
+ /* Wo depends on voicing of this and adjacent frames */
+
+ if (interp->voiced) {
+ if (prev->voiced && next->voiced)
+ interp->Wo = (prev->Wo + next->Wo)/2.0;
+ if (!prev->voiced && next->voiced)
+ interp->Wo = next->Wo;
+ if (prev->voiced && !next->voiced)
+ interp->Wo = prev->Wo;
+ }
+ else {
+ interp->Wo = TWO_PI/P_MAX;
+ }
+ interp->L = PI/interp->Wo;
+
+ /* Interpolate amplitudes using linear interpolation in log domain */
+
+ for(l=1; l<=interp->L; l++) {
+ w = l*interp->Wo;
+ log_amp = (sample_log_amp(prev, w) + sample_log_amp(next, w))/2.0;
+ interp->A[l] = pow(10.0, log_amp);
+ }
+}
+
+/*---------------------------------------------------------------------------*\
+
+ FUNCTION....: sample_log_amp()
+ AUTHOR......: David Rowe
+ DATE CREATED: 22/8/10
+
+ Samples the amplitude envelope at an arbitrary frequency w. Uses
+ linear interpolation in the log domain to sample between harmonic
+ amplitudes.
+
+\*---------------------------------------------------------------------------*/
+
+float sample_log_amp(MODEL *model, float w)
+{
+ int m;
+ float f, log_amp;
+
+ assert(w > 0.0); assert (w <= PI);
+
+ m = floor(w/model->Wo);
+    f = (w - m*model->Wo)/model->Wo;
+    assert(f <= 1.0);
+
+ if (m < 1) {
+ log_amp = f*log10(model->A[1]);
+ }
+ else if ((m+1) > model->L) {
+ log_amp = (1.0-f)*log10(model->A[model->L]);
}
else {
- /*
- transition frame, adjacent frames have different Wo and L
- so set up two sets of model parameters based on prev and
- next. We then synthesise both of them and add them
- together in the time domain.
-
- The transition case is typical of unvoiced speech or
- background noise or a voiced/unvoiced transition.
- */
-
- *transition = 1;
-
- /* a is prev extended forward into this frame, b is next
- extended backward into this frame. Note the adjustments to
- phase to time-shift the model forward or backward N
- samples. */
-
- memcpy(a, prev, sizeof(MODEL));
- memcpy(b, next, sizeof(MODEL));
- for(m=1; m<=a->L; m++) {
- a->A[m] /= 2.0;
- a->phi[m] += a->Wo*m*N;
- }
- for(m=1; m<=b->L; m++) {
- b->A[m] /= 2.0;
- b->phi[m] -= b->Wo*m*N;
- }
+ log_amp = (1.0-f)*log10(model->A[m]) + f*log10(model->A[m+1]);
}
+ return log_amp;
}
#ifndef __INTERP__
#define __INTERP__
-void interp(MODEL *prev, MODEL *next, MODEL *synth, MODEL *a, MODEL *b,
- int *transition);
+void interpolate(MODEL *interp, MODEL *prev, MODEL *next);
#endif
void phase_synth_zero_order(
MODEL *model,
float aks[],
- int voiced,
float *ex_phase /* excitation phase of fundamental */
)
{
float jitter;
G = 1.0;
- aks_to_H(model,aks,G,H,PHASE_LPC);
+ aks_to_H(model,aks,G,H,LPC_ORD);
/*
Update excitation fundamental phase track, this sets the position
/* generate excitation */
- if (voiced) {
+ if (model->voiced) {
/* This method of adding jitter really helped remove the clicky
       sound in low pitched males like hts1a. This moves the onset
of each harmonic over at +/- 0.25 of a sample.
#ifndef __PHASE__
#define __PHASE__
-void phase_synth_zero_order(MODEL *model, float aks[], int voiced, float *ex_phase);
+void phase_synth_zero_order(MODEL *model, float aks[], float *ex_phase);
#endif
void postfilter(
MODEL *model,
- int voiced,
float *bg_est
)
{
of the threshold is to prevent updating during high level
speech. */
- if ((e < BG_THRESH) && !voiced)
+ if ((e < BG_THRESH) && !model->voiced)
*bg_est = *bg_est*(1.0 - BG_BETA) + e*BG_BETA;
/* now mess with phases during voiced frames to make any harmonics
*/
uv = 0;
- if (voiced)
+ if (model->voiced)
for(m=1; m<=model->L; m++)
if (20.0*log10(model->A[m]) < *bg_est) {
model->phi[m] = TWO_PI*(float)rand()/RAND_MAX;
#ifndef __POSTFILTER__
#define __POSTFILTER__
-void postfilter(MODEL *model, int voiced, float *bg_est);
+void postfilter(MODEL *model, float *bg_est);
#endif
/*---------------------------------------------------------------------------*\
FILE........: quantise.c
- AUTHOR......: David Rowe
+ AUTHOR......: David Rowe
DATE CREATED: 31/5/92
Quantisation functions for the sinusoidal coder.
}
}
+
+/*---------------------------------------------------------------------------*\
+
+ FUNCTION....: encode_Wo()
+ AUTHOR......: David Rowe
+ DATE CREATED: 22/8/2010
+
+  Encodes Wo using a WO_BITS bit code and stuffs it into the bits[] array.
+
+\*---------------------------------------------------------------------------*/
+
+void encode_Wo(char *bits, int *nbits, float Wo)
+{
+ int code, bit;
+ float Wo_min = TWO_PI/P_MAX;
+ float Wo_max = TWO_PI/P_MIN;
+ float norm;
+ int i;
+
+ norm = (Wo - Wo_min)/(Wo_max - Wo_min);
+ code = floor(WO_LEVELS * norm + 0.5);
+ if (code < 0 ) code = 0;
+ if (code > (WO_LEVELS-1)) code = WO_LEVELS-1;
+
+ for(i=0; i<WO_BITS; i++) {
+ bit = (code >> (WO_BITS-i-1)) & 0x1;
+ bits[*nbits+i] = bit;
+ }
+
+ *nbits += WO_BITS;
+}
+
+/*---------------------------------------------------------------------------*\
+
+ FUNCTION....: decode_Wo()
+ AUTHOR......: David Rowe
+ DATE CREATED: 22/8/2010
+
+ Decodes Wo using a WO_BITS quantiser.
+
+\*---------------------------------------------------------------------------*/
+
+float decode_Wo(char *bits, int *nbits)
+{
+ int code;
+ float Wo_min = TWO_PI/P_MAX;
+ float Wo_max = TWO_PI/P_MIN;
+ float step;
+ float Wo;
+ int i;
+
+ code = 0;
+ for(i=0; i<WO_BITS; i++) {
+ code <<= 1;
+ code |= bits[i];
+ }
+
+ step = (Wo_max - Wo_min)/WO_LEVELS;
+ Wo = Wo_min + step*(code);
+
+ *nbits += WO_BITS;
+
+ return Wo;
+}
+
+void encode_voicing(char bits[], int *nbits, int voiced1, int voiced2)
+{
+ bits[*nbits] = voiced1;
+ bits[*nbits+1] = voiced2;
+ *nbits += 2;
+}
+
+void decode_voicing(int *voiced1, int *voiced2, char bits[], int *nbits)
+{
+ *voiced1 = bits[0];
+ *voiced2 = bits[1];
+ *nbits += 2;
+}
#ifndef __QUANTISE__
#define __QUANTISE__
+#define WO_BITS 7
+#define WO_LEVELS (1<<WO_BITS)
+
void quantise_init();
-float lpc_model_amplitudes(float Sn[], float w[], MODEL *model, int order,int lsp,float ak[]);
+float lpc_model_amplitudes(float Sn[], float w[], MODEL *model, int order,
+ int lsp,float ak[]);
void aks_to_M2(float ak[], int order, MODEL *model, float E, float *snr);
float get_gmin(void);
+void encode_Wo(char bits[], int *nbits, float Wo);
+float decode_Wo(char bits[], int *nbits);
+
+void encode_voicing(char bits[], int *nbits, int voiced1, int voiced2);
+void decode_voicing(int *voiced1, int *voiced2, char bits[], int *nbits);
+
#endif
/*---------------------------------------------------------------------------*\
FILE........: sine.c
- AUTHOR......: David Rowe
- DATE CREATED: 119/8/2010
+ AUTHOR......: David Rowe
+ DATE CREATED: 19/8/2010
Sinusoidal analysis and synthesis functions.
\*---------------------------------------------------------------------------*/
-void hs_pitch_refinement(MODEL *model, COMP Sw[], float pmin, float pmax, float pstep);
+void hs_pitch_refinement(MODEL *model, COMP Sw[], float pmin, float pmax,
+ float pstep);
/*---------------------------------------------------------------------------*\
COMP Sw[],
COMP W[],
float f0,
- COMP Sw_[], /* DFT of all voiced synthesised signal for f0 */
+ COMP Sw_[] /* DFT of all voiced synthesised signal for f0 */
/* useful for debugging/dump file */
- int *voiced
)
{
int i,l,al,bl,m; /* loop variables */
snr = 10.0*log10(sig/error);
if (snr > V_THRESH)
- *voiced = 1;
+ model->voiced = 1;
else
- *voiced = 0;
+ model->voiced = 0;
return snr;
}
void dft_speech(COMP Sw[], float Sn[], float w[]);
void two_stage_pitch_refinement(MODEL *model, COMP Sw[]);
void estimate_amplitudes(MODEL *model, COMP Sw[], COMP W[]);
-float est_voicing_mbe(MODEL *model, COMP Sw[], COMP W[], float f0, COMP Sw_[], int *voiced);
+float est_voicing_mbe(MODEL *model, COMP Sw[], COMP W[], float f0, COMP Sw_[]);
void make_synthesis_window(float Pn[]);
void synthesise(float Sn_[], MODEL *model, float Pn[], int shift);
-CFLAGS = -I. -I../src -I../speex -Wall -g -DFLOATING_POINT -DVAR_ARRAYS
+CFLAGS = -I. -I../src -Wall -g -DFLOATING_POINT -DVAR_ARRAYS
-all: genres genlsp extract vqtrain tnlp
+all: genres genlsp extract vqtrain tnlp tinterp tquant
genres: genres.o ../src/lpc.o
gcc $(CFLAGS) -o genres genres.o ../src/lpc.o -lm
TCONTPHASE_OBJ = tcontphase.o ../src/globals.o ../src/dump.o ../src/synth.o \
../src/four1.c ../src/initdec.o ../src/phase.o
+TINTERP_OBJ = tinterp.o ../src/interp.o
+
+TQUANT_OBJ = tquant.o ../src/quantise.o ../src/lpc.o ../src/lsp.o \
+ ../src/dump.o ../src/four1.o
+
lsptest: $(LSP_TEST_OBJ)
gcc $(CFLAGS) -o lsptest $(LSP_TEST_OBJ) -lm
tmodel: tmodel.o
gcc $(CFLAGS) -o tmodel tmodel.o -lm
+tinterp: $(TINTERP_OBJ)
+ gcc $(CFLAGS) -o tinterp $(TINTERP_OBJ) -lm
+
+tquant: $(TQUANT_OBJ)
+ gcc $(CFLAGS) -o tquant $(TQUANT_OBJ) -lm
+
%.o : %.c
$(CC) -c $(CFLAGS) $< -o $@
--- /dev/null
+/*---------------------------------------------------------------------------*\
+
+ FILE........: tinterp.c
+ AUTHOR......: David Rowe
+ DATE CREATED: 22/8/10
+
+ Tests interpolation functions.
+
+\*---------------------------------------------------------------------------*/
+
+/*
+ Copyright (C) 2010 David Rowe
+
+ All rights reserved.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License version 2, as
+ published by the Free Software Foundation. This program is
+ distributed in the hope that it will be useful, but WITHOUT ANY
+ WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
+ License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+*/
+
+#include <assert.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <math.h>
+
+#include "defines.h"
+#include "dump.h"
+#include "interp.h"
+
+int main() {
+ MODEL prev, next, interp;
+ int l;
+ FILE *f;
+
+ f = fopen("interp.txt","wt");
+
+ prev.Wo = PI/20;
+ prev.L = PI/prev.Wo;
+ prev.voiced = 1;
+ for(l=1;l<=prev.L; l++)
+ prev.A[l] = 1000.0;
+
+ next.Wo = PI/30;
+ next.L = PI/next.Wo;
+ next.voiced = 1;
+ for(l=1;l<=next.L; l++)
+ next.A[l] = 2000.0;
+
+ interp.voiced = 1;
+ interpolate(&interp, &prev, &next);
+ printf("Wo = %f\n", interp.Wo);
+
+ for(l=1; l<=interp.L; l++)
+ fprintf(f, "%f\n", interp.A[l]);
+
+ fclose(f);
+ return 0;
+}
--- /dev/null
+/*---------------------------------------------------------------------------*\
+
+ FILE........: tquant.c
+ AUTHOR......: David Rowe
+ DATE CREATED: 22/8/10
+
+ Generates quantisation curves for plotting on Octave.
+
+\*---------------------------------------------------------------------------*/
+
+/*
+ Copyright (C) 2010 David Rowe
+
+ All rights reserved.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License version 2, as
+ published by the Free Software Foundation. This program is
+ distributed in the hope that it will be useful, but WITHOUT ANY
+ WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
+ License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+*/
+
+#include <assert.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <math.h>
+
+#include "defines.h"
+#include "dump.h"
+#include "quantise.h"
+
+int main() {
+ int i,c,bit;
+ FILE *f;
+ float Wo,Wo_dec, error, step_size;
+ char bits[WO_BITS];
+ int code, nbits, code_in, code_out;
+
+ /* output pitch quant curve for plotting */
+
+ f = fopen("quant_pitch.txt","wt");
+
+ for(Wo=0.9*(TWO_PI/P_MAX); Wo<=1.1*(TWO_PI/P_MIN); Wo += 0.001) {
+ nbits = 0;
+ encode_Wo(bits, &nbits, Wo);
+ code = 0;
+ for(i=0; i<WO_BITS; i++) {
+ code <<= 1;
+ code |= bits[i];
+ }
+ fprintf(f, "%f %d\n", Wo, code);
+ }
+
+ fclose(f);
+
+ /* check for all pitch codes we get 1:1 match between encoder
+ and decoder Wo levels */
+
+ for(c=0; c<WO_LEVELS; c++) {
+ code_in = c;
+ for(i=0; i<WO_BITS; i++) {
+ bit = (code_in >> (WO_BITS-1-i)) & 0x1;
+ bits[i] = bit;
+ }
+ nbits = 0;
+ Wo = decode_Wo(bits, &nbits);
+ nbits = 0;
+
+ memset(bits, 0, sizeof(bits));
+ encode_Wo(bits, &nbits, Wo);
+ code_out = 0;
+ for(i=0; i<WO_BITS; i++) {
+ code_out <<= 1;
+ code_out |= bits[i];
+ }
+ if (code_in != code_out)
+ printf(" Wo %f code_in %d code_out %d\n", Wo,
+ code_in, code_out);
+ }
+
+ /* measure quantisation error stats and compare to expected. Also
+ plot histogram of error file to check. */
+
+ f = fopen("quant_pitch_err.txt","wt");
+ step_size = ((TWO_PI/P_MIN) - (TWO_PI/P_MAX))/WO_LEVELS;
+
+ for(Wo=TWO_PI/P_MAX; Wo<0.99*TWO_PI/P_MIN; Wo += 0.0001) {
+ nbits = 0; encode_Wo(bits, &nbits, Wo);
+ nbits = 0; Wo_dec = decode_Wo(bits, &nbits);
+ error = Wo - Wo_dec;
+ if (error > (step_size/2.0)) {
+ printf("error: %f step_size/2: %f\n", error, step_size/2.0);
+ exit(1);
+ }
+ fprintf(f,"%f\n",error);
+ }
+ printf("OK\n");
+
+ fclose(f);
+ return 0;
+}