audiomate.corpus.io

This module contains classes to read and write corpora from the filesystem in a wide range of formats. They can also be used to convert between formats.

exception audiomate.corpus.io.UnknownDownloaderException[source]
exception audiomate.corpus.io.UnknownReaderException[source]
exception audiomate.corpus.io.UnknownWriterException[source]
audiomate.corpus.io.available_downloaders()[source]

Get a mapping of all available downloaders.

Returns:A dictionary that maps the name of each downloader to its class.
Return type:dict

Example:

>>> available_downloaders()
{
    "voxforge" : audiomate.corpus.io.VoxforgeDownloader
}
audiomate.corpus.io.available_readers()[source]

Get a mapping of all available readers.

Returns:A dictionary that maps the name of each reader to its class.
Return type:dict

Example:

>>> available_readers()
{
    "default" : audiomate.corpus.io.DefaultReader,
    "kaldi" : audiomate.corpus.io.KaldiReader
}
audiomate.corpus.io.available_writers()[source]

Get a mapping of all available writers.

Returns:A dictionary that maps the name of each writer to its class.
Return type:dict

Example:

>>> available_writers()
{
    "default" : audiomate.corpus.io.DefaultWriter,
    "kaldi" : audiomate.corpus.io.KaldiWriter
}
audiomate.corpus.io.create_downloader_of_type(type_name)[source]

Create an instance of the downloader with the given name.

Parameters:type_name – The name of a downloader.
Returns:An instance of the downloader with the given type.
audiomate.corpus.io.create_reader_of_type(type_name)[source]

Create an instance of the reader with the given name.

Parameters:type_name – The name of a reader.
Returns:An instance of the reader with the given type.
audiomate.corpus.io.create_writer_of_type(type_name)[source]

Create an instance of the writer with the given name.

Parameters:type_name – The name of a writer.
Returns:An instance of the writer with the given type.
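
As a hedged illustration of how the factory functions above might be combined to convert between formats, the sketch below loads a corpus with one reader and saves it with one writer. It relies only on create_reader_of_type(), create_writer_of_type(), load() and save() as documented on this page. The helper name convert_corpus and its io_module parameter are assumptions made for this example and are not part of audiomate; io_module exists only so that the function can be exercised with a stand-in module.

```python
def convert_corpus(source_path, target_path, reader_type, writer_type, io_module=None):
    """Load a corpus with one reader type and save it with one writer type.

    convert_corpus and io_module are hypothetical names used for illustration.
    """
    if io_module is None:
        # Imported lazily so the sketch can be defined without audiomate installed.
        from audiomate.corpus import io as io_module

    reader = io_module.create_reader_of_type(reader_type)
    writer = io_module.create_writer_of_type(writer_type)

    corpus = reader.load(source_path)
    writer.save(corpus, target_path)
    return corpus
```

Assuming both directories exist and hold suitable data, a call such as convert_corpus(kaldi_dir, output_dir, 'kaldi', 'default') would read a Kaldi data set and write it in the Default format; the type names are taken from the examples of available_readers() and available_writers() above.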

Base Classes

class audiomate.corpus.io.CorpusDownloader[source]

Abstract class for downloading a corpus.

To implement a downloader for a custom format, programmers are expected to subclass this class and to implement all abstract methods. The documentation of each abstract method details the requirements that have to be met by an implementation.

_download(target_path)[source]

Performs the actual downloading of the corpus.

Parameters:target_path (str) – Path to a directory where the data should be saved to.
download(target_path)[source]

Downloads the data of the corpus and saves it to the given path. The data has to be saved in such a way that the corresponding CorpusReader can load the corpus.

Parameters:target_path (str) – The path to save the data to.
classmethod type()[source]

Returns a string that uniquely identifies the downloader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired downloader through create_downloader_of_type() or get a list of all built-in downloaders with available_downloaders().

Returns:Name of the downloader
Return type:str
class audiomate.corpus.io.CorpusReader[source]

Abstract class for reading a corpus.

To implement a reader for a custom format, programmers are expected to subclass this class and to implement all abstract methods. The documentation of each abstract method details the requirements that have to be met by an implementation.

_check_for_missing_files(path)[source]

Tests whether all required files (like annotations) to read the corpus successfully are present. If files are missing, a list with the paths of the missing files is returned. All paths are relative to path. If no files are missing, an empty list is returned.

Parameters:path (str) – Path to the root directory of the data set
Returns:Paths of all the missing files, relative to the path of the root directory of the data set.
Return type:list
_load(path)[source]

Performs the actual reading of the corpus.

Implementations do not have to call _check_for_missing_files() themselves. This is automatically done by load().

Parameters:path (str) – Path to a directory where the data set resides.
Returns:The loaded corpus
Return type:Corpus
load(path)[source]

Load and return the corpus from the given path.

Parameters:path (str) – Path to the data set to load.
Returns:The loaded corpus
Return type:Corpus
Raises:IOError – When the data set is invalid, for example because required files (annotations, …) are missing.
classmethod type()[source]

Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().

Returns:Name of the reader
Return type:str
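
The contract described above (load() checks for missing files, raises IOError if any are absent, and only then delegates to _load()) can be sketched roughly as follows. This is a minimal stand-in and not audiomate's implementation: SketchReader, TranscriptFolderReader, the file name transcripts.txt and the type name "transcript-folder" are all hypothetical, and the sketch returns a plain list of lines instead of a Corpus.

```python
import os
from abc import ABC, abstractmethod


class SketchReader(ABC):
    """Stand-in for a reader base class; not audiomate's CorpusReader."""

    @classmethod
    @abstractmethod
    def type(cls):
        """Return the unique name of the reader."""

    @abstractmethod
    def _check_for_missing_files(self, path):
        """Return the paths (relative to path) of required files that are missing."""

    @abstractmethod
    def _load(self, path):
        """Read the data set; may assume that all required files exist."""

    def load(self, path):
        # The public method validates first, so subclasses need not do it in _load().
        missing = self._check_for_missing_files(path)
        if missing:
            raise IOError("Invalid data set at {}: missing {}".format(path, ", ".join(missing)))
        return self._load(path)


class TranscriptFolderReader(SketchReader):
    """Hypothetical format: a folder that must contain transcripts.txt."""

    REQUIRED_FILES = ["transcripts.txt"]

    @classmethod
    def type(cls):
        return "transcript-folder"

    def _check_for_missing_files(self, path):
        return [name for name in self.REQUIRED_FILES
                if not os.path.isfile(os.path.join(path, name))]

    def _load(self, path):
        # Returns the non-empty lines; a real reader would build a Corpus here.
        with open(os.path.join(path, "transcripts.txt"), encoding="utf-8") as f:
            return [line.strip() for line in f if line.strip()]
```

Keeping the check in the public method means every subclass gets the same error behaviour without repeating the validation logic.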
class audiomate.corpus.io.CorpusWriter[source]

Abstract class for writing a corpus.

To implement a writer for a custom format, programmers are expected to subclass this class and to implement all abstract methods. The documentation of each abstract method details the requirements that have to be met by an implementation.

_save(corpus, path)[source]

Writes the corpus to disk to the given path.

Parameters:
  • corpus (Corpus) – Corpus to write to disk
  • path (str) – Path of the target directory
save(corpus, path)[source]

Save the corpus to the given path.

Parameters:
  • corpus (Corpus) – The corpus to save.
  • path (str) – Path to save the corpus to.
classmethod type()[source]

Returns a string that uniquely identifies the writer. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired writer through create_writer_of_type() or get a list of all built-in writers with available_writers().

Returns:Name of the writer
Return type:str
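
To illustrate how a custom writer and its type() string might fit together, the following sketch defines a stand-in base class, one concrete writer, and a name-to-class lookup similar in shape to the dictionary returned by available_writers(). SketchWriter, LineWriter, build_registry, the file name utterances.txt and the type name "lines" are hypothetical. Creating the target directory inside save() is an assumption of this sketch, not documented behaviour of CorpusWriter, and the corpus is represented by a plain list of strings.

```python
import os
from abc import ABC, abstractmethod


class SketchWriter(ABC):
    """Stand-in for a writer base class; not audiomate's CorpusWriter."""

    @classmethod
    @abstractmethod
    def type(cls):
        """Return the unique name of the writer."""

    @abstractmethod
    def _save(self, corpus, path):
        """Write the corpus into the directory at path."""

    def save(self, corpus, path):
        # Assumption of this sketch: the public method prepares the target directory.
        os.makedirs(path, exist_ok=True)
        self._save(corpus, path)


class LineWriter(SketchWriter):
    """Hypothetical format: one utterance per line in utterances.txt."""

    @classmethod
    def type(cls):
        return "lines"

    def _save(self, corpus, path):
        with open(os.path.join(path, "utterances.txt"), "w", encoding="utf-8") as f:
            for utterance in corpus:
                f.write(utterance + "\n")


def build_registry(classes):
    """Map each class's type() string to the class itself."""
    return {cls.type(): cls for cls in classes}
```

With such a lookup, build_registry([LineWriter])["lines"]() yields a writer instance, which is the kind of name-based construction that create_writer_of_type() offers for the built-in writers.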

Implementations

Support for Reading and Writing by Format
Format                      Download  Read  Write
Acoustic Event Dataset                x
Broadcast                             x
Default                               x     x
ESC-50                      x         x
Folder                                x
Google Speech Commands                x
GTZAN                                 x
Kaldi                                 x     x
Mozilla DeepSpeech                          x
MUSAN                                 x
TIMIT                                 x
TUDA German Distant Speech            x
Urbansound8k                          x
VoxForge                    x         x

Acoustic Event Dataset

class audiomate.corpus.io.AEDReader[source]

Reader for the Acoustic Event Dataset.

See also

AED
Download page
classmethod type()[source]

Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().

Returns:Name of the reader
Return type:str

Broadcast

class audiomate.corpus.io.BroadcastReader[source]

Reads corpora in the Broadcast format.

classmethod type()[source]

Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().

Returns:Name of the reader
Return type:str

Default

class audiomate.corpus.io.DefaultReader[source]

Reads corpora in the Default format.

classmethod type()[source]

Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().

Returns:Name of the reader
Return type:str
class audiomate.corpus.io.DefaultWriter[source]

Writes corpora in the Default format.

classmethod type()[source]

Returns a string that uniquely identifies the writer. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired writer through create_writer_of_type() or get a list of all built-in writers with available_writers().

Returns:Name of the writer
Return type:str

Folder

class audiomate.corpus.io.FolderReader[source]

Loads all WAV files from the given folder and creates a corpus from them.

classmethod type()[source]

Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().

Returns:Name of the reader
Return type:str

GTZAN

class audiomate.corpus.io.GtzanReader[source]

Reader for the GTZAN music/speech corpus. The corpus consists of 64 music and 64 speech tracks that are each 30 seconds long. The Wave files are 16-bit mono and have a sampling rate of 22050 Hz.

See also

MARSYAS: GTZAN Music/Speech
Download page
classmethod type()[source]

Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().

Returns:Name of the reader
Return type:str

Kaldi

class audiomate.corpus.io.KaldiReader(main_label_list_idx='default', main_feature_idx='default')[source]

Supports reading data sets in Kaldi format.

See also

Kaldi: Data preparation
Describes how a data set has to be structured to be understood by Kaldi and the format of the individual files.
classmethod type()[source]

Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().

Returns:Name of the reader
Return type:str
class audiomate.corpus.io.KaldiWriter(main_label_list_idx='default', main_feature_idx='default')[source]

Supports writing data sets in Kaldi format.

See also

Kaldi: Data preparation
Describes how a data set has to be structured to be understood by Kaldi and the format of the individual files.
static feature_scp_generator(path)[source]

Return a generator over all feature matrices defined in a scp.

static read_float_matrix(rx_specifier)[source]

Return float matrix as np array for the given rx specifier.
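
For orientation, a Kaldi scp file is a text table in which each line holds a key (typically the utterance id) followed by a specifier that names where the data lives, often an ark file path with a byte offset after a colon. The sketch below only parses such lines and does not read any matrices; iter_scp_entries is a hypothetical helper and is not how feature_scp_generator() or read_float_matrix() are implemented.

```python
def iter_scp_entries(scp_path):
    """Yield (utt_id, ark_path, offset) for every non-empty line of an scp file.

    offset is None when the specifier carries no ':<byte offset>' suffix.
    """
    with open(scp_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # The key and the specifier are separated by whitespace.
            utt_id, rx_specifier = line.split(None, 1)
            ark_path, sep, offset = rx_specifier.rpartition(":")
            if sep and offset.isdigit():
                yield utt_id, ark_path, int(offset)
            else:
                yield utt_id, rx_specifier, None
```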

classmethod type()[source]

Returns a string that uniquely identifies the writer. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired writer through create_writer_of_type() or get a list of all built-in writers with available_writers().

Returns:Name of the writer
Return type:str
static write_float_matrices(scp_path, ark_path, matrices)[source]

Write the given dict matrices (utt-id/float ndarray) to the given scp and ark files.

MUSAN

class audiomate.corpus.io.MusanReader[source]

Reader for the MUSAN corpus. MUSAN is a corpus of music, speech, and noise recordings.

See also

MUSAN: A Music, Speech, and Noise Corpus
Paper explaining the structure and characteristics of the corpus
OpenSLR: MUSAN
Download page
classmethod type()[source]

Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().

Returns:Name of the reader
Return type:str

Google Speech Commands

class audiomate.corpus.io.SpeechCommandsReader[source]

Reads the Google Speech Commands dataset.

See also

Launching Speech Commands DS
Blog entry on the release of the Speech Commands dataset.
classmethod type()[source]

Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().

Returns:Name of the reader
Return type:str

DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus

class audiomate.corpus.io.TimitReader[source]

Reader for the TIMIT Corpus.

See also

TIMIT
Download page
classmethod type()[source]

Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().

Returns:Name of the reader
Return type:str

TUDA German Distant Speech

class audiomate.corpus.io.TudaReader[source]

Reader for the TUDA German Distant Speech corpus (german-speechdata-package-v2.tar.gz).

Note

It only loads files ending in -beamformedSignal.wav

static get_ids_from_folder(path, part_name)[source]

Return all ids from the given folder, which have a corresponding beamformedSignal file.
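
One way such a filter could work is sketched below: it lists a folder and keeps the part of each file name that precedes the -beamformedSignal.wav suffix. Treating that prefix as the id is an assumption of this sketch, and ids_with_beamformed_signal is a hypothetical helper with a simpler signature than get_ids_from_folder(); it does not reproduce the reader's actual logic.

```python
import os

SUFFIX = "-beamformedSignal.wav"


def ids_with_beamformed_signal(folder):
    """Return the sorted ids of all files in folder that end with SUFFIX."""
    ids = set()
    for name in os.listdir(folder):
        if name.endswith(SUFFIX):
            # Assumption: the id is everything before the suffix.
            ids.add(name[:-len(SUFFIX)])
    return sorted(ids)
```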

static load_file(folder_path, idx, corpus)[source]

Load speaker, file, utterance, labels for the file with the given id.

classmethod type()[source]

Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().

Returns:Name of the reader
Return type:str

ESC-50

class audiomate.corpus.io.ESC50Downloader(url=None)[source]

Downloader for the ESC-50 dataset.

Parameters:url (str) – The URL to download the dataset from. If not given, the default URL is used.
static download_file_chunked(url, to)[source]

Downloads the file from url to the local path to.
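
A chunked download avoids holding the whole file in memory by copying a fixed number of bytes at a time. The sketch below shows the general idea using only the Python standard library; download_chunked and its chunk_size parameter are hypothetical, and this is not necessarily how download_file_chunked() is implemented.

```python
import shutil
import urllib.request


def download_chunked(url, to, chunk_size=64 * 1024):
    """Stream the resource at url into the local file to, chunk_size bytes at a time."""
    with urllib.request.urlopen(url) as response, open(to, "wb") as target:
        # copyfileobj reads and writes in blocks of chunk_size bytes.
        shutil.copyfileobj(response, target, chunk_size)
```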

classmethod type()[source]

Returns a string that uniquely identifies the downloader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired downloader through create_downloader_of_type() or get a list of all built-in downloaders with available_downloaders().

Returns:Name of the downloader
Return type:str
class audiomate.corpus.io.ESC50Reader[source]

Reader for the ESC-50 dataset (Environmental Sound Classification).

See also

ESC-50
Download page
classmethod type()[source]

Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().

Returns:Name of the reader
Return type:str

Mozilla DeepSpeech

class audiomate.corpus.io.MozillaDeepSpeechWriter(transcription_label_list_idx='default')[source]

Writes files to use for training with Mozilla DeepSpeech (https://github.com/mozilla/DeepSpeech).

Since every utterance is expected to be in a separate file, any utterance that is not in a separate file in the original corpus is extracted into its own file in the subfolder audio of the target path.

Parameters:transcription_label_list_idx (str) – The transcriptions are used from the label-list with this id.
classmethod type()[source]

Returns a string that uniquely identifies the writer. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired writer through create_writer_of_type() or get a list of all built-in writers with available_writers().

Returns:Name of the writer
Return type:str

Urbansound8k

class audiomate.corpus.io.Urbansound8kReader[source]

Reader for the Urbansound8k dataset.

See also

Urbansound8k
Download page
classmethod type()[source]

Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().

Returns:Name of the reader
Return type:str

VoxForge

class audiomate.corpus.io.VoxforgeDownloader(lang='de', url=None)[source]

Downloader for audio files from http://www.voxforge.org/. All .tgz files that are linked from the given url are downloaded and extracted.

Parameters:
  • lang (str) – If no URL is given, the predefined URL for the given language is used, if one is defined.
  • url (str) – The URL to check for available .tgz files.
static available_files(url)[source]

Extract and return urls for all available .tgz files.
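
Collecting archive links from a listing page could look roughly like the sketch below, which scans HTML for anchors whose href ends in .tgz and resolves them against the page URL. It works on an HTML string that has already been fetched; tgz_links and _TgzLinkParser are hypothetical names, and the sketch is not the implementation of available_files().

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _TgzLinkParser(HTMLParser):
    """Collects absolute URLs of all <a href="..."> targets ending in .tgz."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.endswith(".tgz"):
            # Relative links are resolved against the URL of the listing page.
            self.urls.append(urljoin(self.base_url, href))


def tgz_links(html, base_url):
    """Return the URLs of all .tgz files linked from the given HTML text."""
    parser = _TgzLinkParser(base_url)
    parser.feed(html)
    return parser.urls
```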

static download_files(file_urls, target_path)[source]

Download all files and store to the given path.

static extract_files(file_paths, target_path)[source]

Unpack all files to the given path.
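
Unpacking a set of .tgz archives into one directory can be sketched with the standard tarfile module as shown below. extract_archives is a hypothetical helper and may differ from extract_files(); in particular, the use of the "data" extraction filter on Python versions that provide it is a choice made for this sketch.

```python
import os
import tarfile


def extract_archives(file_paths, target_path):
    """Unpack every gzip-compressed tar archive in file_paths into target_path."""
    os.makedirs(target_path, exist_ok=True)
    # Use the safer "data" extraction filter where this Python version offers it.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    for file_path in file_paths:
        with tarfile.open(file_path, "r:gz") as archive:
            archive.extractall(target_path, **extra)
```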

classmethod type()[source]

Returns a string that uniquely identifies the downloader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired downloader through create_downloader_of_type() or get a list of all built-in downloaders with available_downloaders().

Returns:Name of the downloader
Return type:str
class audiomate.corpus.io.VoxforgeReader[source]

Reader for collections of VoxForge audio data. The reader expects extracted .tgz files in the given folder.

See also

http://www.voxforge.org/
Download page
static data_folders(path)[source]

Generator which yields a list of valid data directories (corresponds to the content of one .tgz).

static parse_prompts(etc_folder)[source]

Read prompts and prompts-original and return them as a dictionary (id as key).

static parse_speaker_info(readme_path)[source]

Parse speaker info and return tuple (idx, gender).

classmethod type()[source]

Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().

Returns:Name of the reader
Return type:str