Practical Assignment: Isolated Spoken Digit Recognition

Published: 2024-05-10

Overview

This individual assignment is about implementing, training and evaluating Hidden Markov Models (HMMs) for recognition of isolated spoken digits. A Viterbi-style HMM training and evaluation framework in Python is provided. The task is to complete the missing parts in the Python programs provided, conduct experiments on isolated digit recognition using the programs, and finally present the findings in a report.

Task Description

The speech data is selected from the TI-Digits corpus and consists of isolated digits spoken by a large number of speakers of both genders. There are 11 digits in total: “zero”, “oh”, and “one” through “nine”.

The data is split into two mutually exclusive sets: a training set and a test set. The training set (100 utterances per digit) is to be used for training the model parameters. The test set (60 utterances per digit) should be used for testing only. The list of utterances in each data set can be found in:

● data/flists/flist_train.txt

● data/flists/flist_test.txt

The main programs are provided in hmm-viterbi.zip. Once unzipped, the hmm-viterbi folder contains the following items (the incomplete files are hmm.py, hmm_train.py and hmm_eval.py):

● hmm.py  Defines a basic left-to-right no-skip HMM and employs Viterbi to find the most probable path through the HMM given a sequence of observations. Function viterbi_decoding is incomplete.

● hmm_train.py   Performs Viterbi-style HMM training using viterbi_decoding from hmm.py. Function viterbi_train is incomplete.

● hmm_eval.py   Performs evaluation of HMMs using viterbi_decoding from hmm.py. Function eval_HMMs is incomplete.

● gmm_demo.py  Demonstrates the use of Gaussian Mixture Models (GMMs) for this task. This program serves as a baseline for comparison with the HMM-based recogniser. You should be able to run this program without any changes.

● test_viterbi.py   Uses a set of pre-trained HMMs to test viterbi_decoding from hmm.py. For a test signal, it computes the log-likelihood score and the most probable state sequence for the corresponding HMM and for a mismatched HMM. The correct HMM should produce a higher score and a more plausible state sequence.

● speechtech/   A folder containing auxiliary functions needed for this task. You don’t need to modify anything in this folder.

The Python scripts contain plenty of comments, and the lines to be completed are clearly marked by “====>>>> FILL WITH YOUR CODE HERE”. To help you complete the assignment, it can be split into three parts: 1) Viterbi Decoding, 2) Viterbi-style HMM Training, and 3) HMM Evaluation.
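To make parts 1 and 3 more concrete, the following is a minimal sketch of log-domain Viterbi decoding for a left-to-right, no-skip HMM, followed by picking the best-scoring model. It is not the code from hmm.py: the function name viterbi_left_to_right, the argument layout (a frame-by-state table of log emission likelihoods and a log transition matrix), and the convention that every path starts in the first state and ends in the last state are all assumptions made for illustration. The toy numbers are invented and are not taken from TI-Digits.

```python
import numpy as np


def viterbi_left_to_right(log_b, log_A):
    """Viterbi decoding for a left-to-right, no-skip HMM (illustrative sketch).

    log_b : (T, N) array, log_b[t, j] = log-likelihood of frame t in state j.
    log_A : (N, N) array of log transition probabilities; only the self-loop
            log_A[j, j] and the forward step log_A[j-1, j] are used.
    Assumption: the path must start in state 0 and end in state N-1.
    Returns (log score of the best path, list of state indices).
    """
    T, N = log_b.shape
    delta = np.full((T, N), -np.inf)    # best log score ending in state j at frame t
    psi = np.zeros((T, N), dtype=int)   # best predecessor state for backtracking

    delta[0, 0] = log_b[0, 0]           # forced start in the first state
    for t in range(1, T):
        for j in range(N):
            stay = delta[t - 1, j] + log_A[j, j]
            move = delta[t - 1, j - 1] + log_A[j - 1, j] if j > 0 else -np.inf
            if move > stay:
                delta[t, j], psi[t, j] = move + log_b[t, j], j - 1
            else:
                delta[t, j], psi[t, j] = stay + log_b[t, j], j

    # Backtrack from the last state at the last frame.
    path = [N - 1]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    path.reverse()
    return float(delta[T - 1, N - 1]), path


# Toy example: 3 states, 5 frames (made-up likelihoods).
A = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0]])
with np.errstate(divide="ignore"):      # log(0) = -inf is intended here
    log_A = np.log(A)

# Frames 0-1 fit state 0, frame 2 fits state 1, frames 3-4 fit state 2.
b_matched = np.array([[0.90, 0.05, 0.05],
                      [0.90, 0.05, 0.05],
                      [0.05, 0.90, 0.05],
                      [0.05, 0.05, 0.90],
                      [0.05, 0.05, 0.90]])
log_b_matched = np.log(b_matched)
log_b_mismatched = log_b_matched[:, ::-1]   # a model whose states fit in reverse order

score_ok, path_ok = viterbi_left_to_right(log_b_matched, log_A)
score_bad, path_bad = viterbi_left_to_right(log_b_mismatched, log_A)

# Recognition: choose the model with the highest Viterbi log score.
scores = {"matched": score_ok, "mismatched": score_bad}
best_model = max(scores, key=scores.get)
```

In this toy setup the matched model scores higher than the mismatched one, which is the same behaviour test_viterbi.py is described as checking. Working in the log domain avoids numerical underflow on long utterances, and restricting each state to two predecessors (itself and the previous state) is what encodes the no-skip topology.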

