Pitch determination of speech signals : algorithms and devices

Bibliographic Information

Wolfgang Hess

(Springer series in information sciences, 3)

Springer-Verlag, 1983

  • : Germany
  • : U.S.

University library holdings: 66

Notes

Bibliography: p. [590]-684

Includes index

Description and Table of Contents

Description

Pitch (i.e., fundamental frequency F0 and fundamental period T0) occupies a key position in the acoustic speech signal. The prosodic information of an utterance is predominantly determined by this parameter. The ear is more sensitive to changes of fundamental frequency than to changes of other speech signal parameters by an order of magnitude. The quality of vocoded speech is essentially influenced by the quality and faultlessness of the pitch measurement. Hence the importance of this parameter necessitates using good and reliable measurement methods.

At first glance the task looks simple: one just has to detect the fundamental frequency or period of a quasi-periodic signal. For a number of reasons, however, the task of pitch determination has to be counted among the most difficult problems in speech analysis.

1) In principle, speech is a nonstationary process; the momentary position of the vocal tract may change abruptly at any time. This leads to drastic variations in the temporal structure of the signal, even between subsequent pitch periods, and assuming a quasi-periodic signal is often far from realistic.

2) Due to the flexibility of the human vocal tract and the wide variety of voices, there exist a multitude of possible temporal structures. Narrow-band formants at low harmonics (especially at the second or third harmonic) are an additional source of difficulty.

3) For an arbitrary speech signal uttered by an unknown speaker, the fundamental frequency can vary over a range of almost four octaves (50 to 800 Hz).
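The "simple at first glance" version of the task described above, finding the period of a quasi-periodic signal within the 50 to 800 Hz range, can be illustrated with a minimal sketch. It is not taken from the book: it assumes a clean, stationary synthetic signal and uses a plain autocorrelation peak search, and the names `synth_voiced` and `estimate_f0` are hypothetical. The difficulties listed in the description (nonstationarity, strong formants at low harmonics, the wide F0 range) are exactly what such a naive approach does not handle.

```python
import math


def synth_voiced(f0, sample_rate, n_samples, amplitudes=(1.0, 0.5, 0.25)):
    """Synthetic 'voiced' test signal: a sum of harmonics of f0 (an assumption,
    not real speech)."""
    return [
        sum(a * math.sin(2.0 * math.pi * (k + 1) * f0 * i / sample_rate)
            for k, a in enumerate(amplitudes))
        for i in range(n_samples)
    ]


def estimate_f0(samples, sample_rate, f_min=50.0, f_max=800.0):
    """Estimate F0 in Hz as sample_rate / lag, where lag maximizes the
    (unnormalized) autocorrelation inside the search range."""
    lag_min = int(sample_rate / f_max)  # shortest period considered (T0 = 1/f_max)
    lag_max = int(sample_rate / f_min)  # longest period considered (T0 = 1/f_min)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Summing over the overlapping part only tapers large lags slightly,
        # so the first period wins over its multiples on a clean signal.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


if __name__ == "__main__":
    signal = synth_voiced(200.0, 8000, 800)
    print("estimated F0 (Hz):", estimate_f0(signal, 8000))
    print("search range in octaves:", math.log2(800.0 / 50.0))
```

On this idealized 200 Hz input the peak falls at a lag of 40 samples, so the estimate is 200 Hz. On real speech the same search can lock onto a multiple or a fraction of the true period, which is one reason more elaborate methods exist.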

Table of Contents

  • 1. Introduction.- 1.1 Voice Source Parameter Measurement and the Speech Signal.- 1.2 A Short Look at the Areas of Application.- 1.3 Organization of the Book
  • 2. Basic Terminology. A Short Introduction to Digital Signal Processing.- 2.1 The Simplified Model of Speech Excitation.- 2.2 Digital Signal Processing 1: Signal Representation.- 2.3 Digital Signal Processing 2: Filters.- 2.4 Time-Variant Systems. The Principle of Short-Term Analysis.- 2.5 Definition of the Task. The Linear Model of Speech Production.- 2.6 A First Categorization of Pitch Determination Algorithms (PDAs)
  • 3. The Human Voice Source.- 3.1 Mechanism of Sound Generation at the Larynx.- 3.2 Operational Modes of the Larynx. Registers.- 3.3 The Glottal Source (Excitation) Signal.- 3.4 The Influence of the Vocal Tract Upon Voice Source Parameters.- 3.5 The Voiceless and the Transient Sources
  • 4. Measuring Range, Accuracy, Pitch Perception.- 4.1 The Range of Fundamental Frequency.- 4.2 Pitch Perception. Toward a Redefinition of the Task.- 4.2.1 Pitch Perception: Spectral and Virtual Pitch.- 4.2.2 Toward a Redefinition of the Task.- 4.2.3 Difference Limens for Fundamental-Frequency Change.- 4.3 Measurement Accuracy.- 4.4 Representation of the Pitch Information in the Signal.- 4.5 Calibration and Performance Evaluation of a PDA
  • 5. Manual and Instrumental Pitch Determination, Voicing Determination.- 5.1 Manual Pitch Determination.- 5.1.1 Time-Domain Manual Pitch Determination.- 5.1.2 Frequency-Domain Manual Pitch Determination.- 5.2 Pitch Determination Instruments (PDIs).- 5.2.1 Clinical Methods for Larynx Inspection.- 5.2.2 Mechanic PDIs.- 5.2.3 Electric PDIs.- 5.2.4 Ultrasonic PDIs.- 5.2.5 Photoelectric PDIs (Transillumination of the Glottis).- 5.2.6 Comparative Evaluation of PDIs.- 5.3 Voicing Determination - Selected Examples.- 5.3.1 Voicing Determination: Parameters.- 5.3.2 Voicing Determination - Simple Voicing Determination Algorithms (VDAs); Combined VDA-PDA Systems.- 5.3.3 Multiparameter VDAs. Voicing Determination by Means of Pattern Recognition Methods.- 5.3.4 Summary and Conclusions
  • 6. Time-Domain Pitch Determination.- 6.1 Pitch Determination by Fundamental-Harmonic Extraction.- 6.1.1 The Basic Extractor.- 6.1.2 The Simplest Pitch Determination Device - Low-Pass Filter and Zero (or Threshold) Crossings Analysis Basic Extractor.- 6.1.3 Enhancement of the First Harmonic by Nonlinear Means.- 6.1.4 Manual Preset and Tunable (Adaptive) Filters.- 6.2 The Other Extreme - Temporal Structure Analysis.- 6.2.1 Envelope Modeling - the Analog Approach.- 6.2.2 Simple Peak Detector and Global Correction.- 6.2.3 Zero Crossings and Excursion Cycles.- 6.2.4 Mixed-Feature Algorithms.- 6.2.5 Other PDAs That Investigate the Temporal Structure of the Signal.- 6.3 The Intermediate Device: Temporal Structure Transformation and Simplification.- 6.3.1 Temporal Structure Simplification by Inverse Filtering.- 6.3.2 The Discontinuity in the Excitation Signal: Event Detection.- 6.4 Parallel Processing in Fundamental Period Determination. Multichannel PDAs.- 6.4.1 PDAs with Multichannel Preprocessor Filters.- 6.4.2 PDAs with Several Channels Applying Different Extraction Principles.- 6.5 Special-Purpose (High-Accuracy) Time-Domain PDAs.- 6.5.1 Glottal Inverse Filtering.- 6.5.2 Determining the Instant of Glottal Closure.- 6.6 The Postprocessor.- 6.6.1 Time-to-Frequency Conversion; Display.- 6.6.2 f0 Determination With Basic Extractor Omitted.- 6.6.3 Global Error Correction Routines.- 6.6.4 Smoothing Pitch Contours.- 6.7 Final Comments
  • 7. Design and Implementation of a Time-Domain PDA for Undistorted and Band-Limited Signals.- 7.1 The Linear Algorithm.- 7.1.1 Prefiltering.- 7.1.2 Measurement and Suppression of F1.- 7.1.3 The Basic Extractor.- 7.1.4 Problems with the Formant F2. Implementation of a Multiple Two-Pulse Filter (TPF).- 7.1.5 Phase Relations and Starting Point of the Period.- 7.1.6 Performance of the Algorithm with Respect to Linear Distortions, Especially to Band Limitations.- 7.2 Band-Limited Signals in Time-Domain PDAs.- 7.2.1 Concept of the Universal PDA.- 7.2.2 Once More: Use of Nonlinear Distortion in Time-Domain PDAs.- 7.3 An Experimental Study Towards a Universal Time-Domain PDA Applying a Nonlinear Function and a Threshold Analysis Basic Extractor.- 7.3.1 Setup of the Experiment.- 7.3.2 Relative Amplitude and Enhancement of First Harmonic.- 7.4 Toward a Choice of Optimal Nonlinear Functions.- 7.4.1 Selection with Respect to Phase Distortions.- 7.4.2 Selection with Respect to Amplitude Characteristics.- 7.4.3 Selection with Respect to the Sequence of Processing.- 7.5 Implementation of a Three-Channel PDA with Nonlinear Processing.- 7.5.1 Selection of Nonlinear Functions.- 7.5.2 Determination of the Parameter for the Comb Filter.- 7.5.3 Threshold Function in the Basic Extractor.- 7.5.4 Selection of the Most Likely Channel in the Basic Extractor
  • 8. Short-Term Analysis Pitch Determination.- 8.1 The Short-Term Transformation and Its Consequences.- 8.2 Autocorrelation Pitch Determination.- 8.2.1 The Autocorrelation Function and Its Relation to the Power Spectrum.- 8.2.2 Analog Realizations.- 8.2.3 "Ordinary" Autocorrelation PDAs.- 8.2.4 Autocorrelation PDAs with Nonlinear Preprocessing.- 8.2.5 Autocorrelation PDAs with Linear Adaptive Preprocessing.- 8.3 "Anticorrelation" Pitch Determination: Average Magnitude Difference Function, Distance and Dissimilarity Measures, and Other Nonstationary Short-Term Analysis PDAs.- 8.3.1 Average Magnitude Difference Function (AMDF).- 8.3.2 Generalized Distance Functions.- 8.3.3 Nonstationary Short-Term Analysis and Incremental Time-Domain PDAs.- 8.4 Multiple Spectral Transform ("Cepstrum") Pitch Determination.- 8.4.1 The More General Aspect: Deconvolution.- 8.4.2 Cepstrum Pitch Determination.- 8.5 Frequency-Domain PDAs.- 8.5.1 Spectral Compression: Frequency and Period Histogram; Product Spectrum.- 8.5.2 Harmonic Matching. Psychoacoustic PDAs.- 8.5.3 Determination of f0 from the Distance of Adjacent Spectral Peaks.- 8.5.4 The Fast Fourier Transform, Spectral Resolution, and the Computing Effort.- 8.6 Maximum-Likelihood (Least-Squares) Pitch Determination.- 8.6.1 The Least-Squares Algorithm.- 8.6.2 A Multichannel Solution.- 8.6.3 Computing Complexity, Relation to Comb Filters, Simplified Realizations.- 8.7 Summary and Conclusions
  • 9. General Discussion: Summary, Error Analysis, Applications.- 9.1 A Short Survey of the Principal Methods of Pitch Determination.- 9.1.1 Categorization of PDAs and Definitions of Pitch.- 9.1.2 The Basic Extractor.- 9.1.3 The Postprocessor.- 9.1.4 Methods of Preprocessing.- 9.1.5 The Impact of Technology on the Design of PDAs and the Question of Computing Effort.- 9.2 Calibration, Search for Standards.- 9.2.1 Data Acquisition.- 9.2.2 Creating the Standard Pitch Contour Manually, Automatically, and by an Interactive PDA.- 9.2.3 Creating a Standard Contour by Means of a PDI.- 9.3 Performance Evaluation of PDAs.- 9.3.1 Comparative Performance Evaluation of PDAs: Some Examples from the Literature.- 9.3.2 Methods of Error Analysis.- 9.4 A Closer Look at the Applications.- 9.4.1 Has the Problem Been Solved?.- 9.4.2 Application in Phonetics, Linguistics, and Musicology.- 9.4.3 Application in Education and in Pathology.- 9.4.4 The "Technical" Application: Speech Communication.- 9.4.5 A Way Around the Problem in Speech Communication: Voice-Excited and Residual-Excited Vocoding (Baseband Coding).- 9.5 Possible Paths Towards a General Solution
  • Appendix A. Experimental Data on the Behavior of Nonlinear Functions in Time-Domain Pitch Determination Algorithms.- A.1 The Data Base of the Investigation.- A.2 Examples for the Behavior of the Nonlinear Functions.- A.3 Relative Amplitude RA1 and Enhancement RE1 of the First Harmonic.- A.4 Relative Amplitude RASM of Spurious Maximum and Autocorrelation Threshold.- A.5 Processing Sequence, Preemphasis, Phase, Band Limitation.- A.6 Optimal Performance of Nonlinear Functions.- A.7 Performance of the Comb Filters
  • Appendix B. Original Text of the Quotations in Foreign Languages Throughout This Book
  • List of Abbreviations
  • Author and Subject Index

From "Nielsen BookData"
