audiomate.corpus.io¶
This module contains classes to read and write corpora from the filesystem in a wide range of formats. They can also be used to convert between formats.
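As a minimal sketch of such a conversion, the helper below loads a corpus with one reader and saves it with a writer. It relies only on create_reader_of_type(), create_writer_of_type(), load() and save(), which are documented on this page. The helper name convert_corpus and its io_module parameter are illustrative assumptions and not part of audiomate.

```python
def convert_corpus(io_module, source_type, source_path, target_type, target_path):
    """Load a corpus in one format and save it in another.

    `io_module` is expected to expose `create_reader_of_type` and
    `create_writer_of_type`, as `audiomate.corpus.io` does. Passing the
    module in keeps this hypothetical helper importable without audiomate.
    """
    reader = io_module.create_reader_of_type(source_type)
    corpus = reader.load(source_path)

    writer = io_module.create_writer_of_type(target_type)
    writer.save(corpus, target_path)
    return corpus


# Intended usage (assumes audiomate is installed and the paths exist):
#
#     from audiomate.corpus import io
#     convert_corpus(io, 'kaldi', '/path/to/kaldi_set', 'default', '/path/to/output')
```

Passing the module as an argument is only a convenience for the sketch; in regular code the functions can be called on audiomate.corpus.io directly.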
-
audiomate.corpus.io.
available_downloaders
()[source]¶ Get a mapping of all available downloaders.
Returns: A dictionary that maps the name of each downloader to its class. Return type: dict Example:
>>> available_downloaders()
{ "voxforge" : audiomate.corpus.io.VoxforgeDownloader }
-
audiomate.corpus.io.
available_readers
()[source]¶ Get a mapping of all available readers.
Returns: A dictionary that maps the name of each reader to its class. Return type: dict Example:
>>> available_readers()
{ "default" : audiomate.corpus.io.DefaultReader, "kaldi" : audiomate.corpus.io.KaldiReader }
-
audiomate.corpus.io.
available_writers
()[source]¶ Get a mapping of all available writers.
Returns: A dictionary that maps the name of each writer to its class. Return type: dict Example:
>>> available_writers()
{ "default" : audiomate.corpus.io.DefaultWriter, "kaldi" : audiomate.corpus.io.KaldiWriter }
-
audiomate.corpus.io.
create_downloader_of_type
(type_name)[source]¶ Create an instance of the downloader with the given name.
Parameters: type_name – The name of a downloader. Returns: An instance of the downloader with the given type.
-
audiomate.corpus.io.
create_reader_of_type
(type_name)[source]¶ Create an instance of the reader with the given name.
Parameters: type_name – The name of a reader. Returns: An instance of the reader with the given type.
-
audiomate.corpus.io.
create_writer_of_type
(type_name)[source]¶ Create an instance of the writer with the given name.
Parameters: type_name – The name of a writer. Returns: An instance of the writer with the given type.
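The relationship between the available_*() and create_*_of_type() functions can be pictured as a registry that maps names to classes. The following is a minimal sketch of that idea with stand-in classes; it is not audiomate's implementation. How the real library reacts to an unknown name is not stated here, so the ValueError is an assumption.

```python
class DefaultReader:
    """Stand-in class; not audiomate's DefaultReader."""

    @classmethod
    def type(cls):
        return 'default'


class KaldiReader:
    """Stand-in class; not audiomate's KaldiReader."""

    @classmethod
    def type(cls):
        return 'kaldi'


# Registry keyed by the string each class reports through type().
_READERS = {cls.type(): cls for cls in (DefaultReader, KaldiReader)}


def available_readers():
    """Return a copy of the name-to-class mapping."""
    return dict(_READERS)


def create_reader_of_type(type_name):
    """Instantiate the reader registered under type_name."""
    try:
        reader_class = _READERS[type_name]
    except KeyError:
        # Assumption: the real library may signal this case differently.
        raise ValueError('Unknown reader type: {}'.format(type_name)) from None
    return reader_class()
```

Keying the registry by type() keeps the name of a reader defined in exactly one place, the class itself.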
Base Classes¶
-
class
audiomate.corpus.io.
CorpusDownloader
[source]¶ Abstract class for downloading a corpus.
To implement a downloader for a custom format, programmers are expected to subclass this class and to implement all abstract methods. The documentation of each abstract method details the requirements that have to be met by an implementation.
-
_download
(target_path)[source]¶ Performs the actual downloading of the corpus.
Parameters: target_path (str) – Path to a directory where the data should be saved to.
-
download
(target_path)[source]¶ Downloads the data of the corpus and saves it to the given path. The data has to be saved in such a way that the corresponding CorpusReader can load the corpus.
Parameters: target_path (str) – The path to save the data to.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the downloader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired downloader through create_downloader_of_type() or get a list of all built-in downloaders with available_downloaders().
Returns: Name of the downloader Return type: str
-
-
class
audiomate.corpus.io.
CorpusReader
[source]¶ Abstract class for reading a corpus.
To implement a reader for a custom format, programmers are expected to subclass this class and to implement all abstract methods. The documentation of each abstract method details the requirements that have to be met by an implementation.
-
_check_for_missing_files
(path)[source]¶ Tests whether all required files (like annotations) to read the corpus successfully are present. If files are missing, a list with the paths of the missing files is returned. All paths are relative to path. If no files are missing, an empty list is returned.
Parameters: path (str) – Path to the root directory of the data set Returns: Paths of all the missing files, relative to the path of the root directory of the data set. Return type: list
-
_load
(path)[source]¶ Performs the actual reading of the corpus.
Implementations do not have to call _check_for_missing_files() themselves. This is done automatically by load().
Parameters: path (str) – Path to a directory where the data set resides. Returns: The loaded corpus Return type: Corpus
-
load
(path)[source]¶ Load and return the corpus from the given path.
Parameters: path (str) – Path to the data set to load. Returns: The loaded corpus Return type: Corpus Raises: IOError – When the data set is invalid, for example because required files (annotations, …) are missing.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str
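The contract between load(), _check_for_missing_files() and _load() described above can be sketched as follows. SketchReader is a self-contained stand-in for CorpusReader, and TranscriptReader, its transcripts.txt file and the dictionary it returns are hypothetical. A real reader would subclass audiomate.corpus.io.CorpusReader and return a Corpus.

```python
import abc
import os


class SketchReader(abc.ABC):
    """Stand-in for the CorpusReader contract described above."""

    @classmethod
    @abc.abstractmethod
    def type(cls):
        """Return the unique name of the reader."""

    @abc.abstractmethod
    def _check_for_missing_files(self, path):
        """Return paths (relative to path) of required files that are missing."""

    @abc.abstractmethod
    def _load(self, path):
        """Read the corpus; only called once all required files exist."""

    def load(self, path):
        # load() runs the file check, so _load() does not have to.
        missing = self._check_for_missing_files(path)
        if missing:
            raise IOError('Invalid data set, missing files: {}'.format(', '.join(missing)))
        return self._load(path)


class TranscriptReader(SketchReader):
    """Hypothetical reader for a folder holding a single transcripts.txt."""

    REQUIRED_FILES = ['transcripts.txt']

    @classmethod
    def type(cls):
        return 'transcript-sketch'

    def _check_for_missing_files(self, path):
        return [name for name in self.REQUIRED_FILES
                if not os.path.isfile(os.path.join(path, name))]

    def _load(self, path):
        # A real reader returns a Corpus; a dict keeps the sketch self-contained.
        corpus = {}
        with open(os.path.join(path, 'transcripts.txt'), encoding='utf-8') as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                utt_id, _, text = line.partition(' ')
                corpus[utt_id] = text
        return corpus
```

Because the check lives in load(), every subclass reports missing files in the same way without repeating the logic.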
-
-
class
audiomate.corpus.io.
CorpusWriter
[source]¶ Abstract class for writing a corpus.
To implement a writer for a custom format, programmers are expected to subclass this class and to implement all abstract methods. The documentation of each abstract method details the requirements that have to be met by an implementation.
-
_save
(corpus, path)[source]¶ Writes the corpus to disk to the given path.
Parameters: - corpus (Corpus) – Corpus to write to disk
- path (str) – Path of the target directory
-
save
(corpus, path)[source]¶ Save the dataset at the given path.
Parameters: - corpus (Corpus) – The corpus to save.
- path (str) – Path to save the corpus to.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the writer. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired writer through create_writer_of_type() or get a list of all built-in writers with available_writers().
Returns: Name of the writer Return type: str
-
Implementations¶
Format | Download | Read | Write |
---|---|---|---|
Acoustic Event Dataset | | x | |
Broadcast | | x | |
Default | | x | x |
ESC-50 | x | x | |
Free-Spoken-Digit-Dataset | x | x | |
Folder | | x | |
Google Speech Commands | | x | |
GTZAN | | x | |
Kaldi | | x | x |
Mozilla DeepSpeech | | | x |
MUSAN | | x | |
TIMIT | | x | |
TUDA German Distant Speech | | x | |
Urbansound8k | | x | |
VoxForge | x | x | |
Acoustic Event Dataset¶
-
class
audiomate.corpus.io.
AEDReader
[source]¶ Reader for the Acoustic Event Dataset.
See also
- AED
- Download page
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str
Broadcast¶
-
class
audiomate.corpus.io.
BroadcastReader
[source]¶ Reads corpora in the Broadcast format.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str
Default¶
-
class
audiomate.corpus.io.
DefaultReader
[source]¶ Reads corpora in the Default format.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str
-
class
audiomate.corpus.io.
DefaultWriter
[source]¶ Writes corpora in the Default format.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the writer. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired writer through create_writer_of_type() or get a list of all built-in writers with available_writers().
Returns: Name of the writer Return type: str
Folder¶
-
class
audiomate.corpus.io.
FolderReader
[source]¶ Loads all wavs from the given folder and creates a corpus from it.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str
GTZAN¶
-
class
audiomate.corpus.io.
GtzanReader
[source]¶ Reader for the GTZAN music/speech corpus. The corpus consists of 64 music and 64 speech tracks that are each 30 seconds long. The Wave files are 16-bit mono and have a sampling rate of 22050 Hz.
See also
- MARSYAS: GTZAN Music/Speech
- Download page
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str
Kaldi¶
-
class
audiomate.corpus.io.
KaldiReader
(main_label_list_idx='default', main_feature_idx='default')[source]¶ Supports reading data sets in Kaldi format.
See also
- Kaldi: Data preparation
- Describes how a data set has to be structured to be understood by Kaldi and the format of the individual files.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str
-
class
audiomate.corpus.io.
KaldiWriter
(main_label_list_idx='default', main_feature_idx='default')[source]¶ Supports writing data sets in Kaldi format.
See also
- Kaldi: Data preparation
- Describes how a data set has to be structured to be understood by Kaldi and the format of the individual files.
-
static
feature_scp_generator
(path)[source]¶ Return a generator over all feature matrices defined in a scp.
-
static
read_float_matrix
(rx_specifier)[source]¶ Return float matrix as np array for the given rx specifier.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the writer. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired writer through create_writer_of_type() or get a list of all built-in writers with available_writers().
Returns: Name of the writer Return type: str
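feature_scp_generator() iterates over the entries of a Kaldi scp file. In Kaldi's table format, each line of an scp file holds a key followed by a file name, which may carry a byte offset into an archive after a colon (for example utt1 /data/feats.ark:12). The sketch below only parses such lines. It does not read any matrices, it ignores piped commands, and it is not the KaldiWriter implementation. The names parse_scp_line and iter_scp are hypothetical.

```python
def parse_scp_line(line):
    """Split one scp line into (key, file_path, offset).

    offset is None when the entry does not point into an archive.
    """
    key, rx_specifier = line.strip().split(None, 1)
    path, sep, offset = rx_specifier.rpartition(':')
    if sep and offset.isdigit():
        return key, path, int(offset)
    return key, rx_specifier, None


def iter_scp(lines):
    """Yield parsed entries for every non-empty line."""
    for line in lines:
        if line.strip():
            yield parse_scp_line(line)
```

Writing the iteration as a generator mirrors the documented behaviour of feature_scp_generator(), which hands out entries one at a time instead of loading everything up front.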
MUSAN¶
-
class
audiomate.corpus.io.
MusanReader
[source]¶ Reader for the MUSAN corpus. MUSAN is a corpus of music, speech, and noise recordings.
See also
- MUSAN: A Music, Speech, and Noise Corpus
- Paper explaining the structure and characteristics of the corpus
- OpenSLR: MUSAN
- Download page
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str
Google Speech Commands¶
-
class
audiomate.corpus.io.
SpeechCommandsReader
[source]¶ Reads the Google Speech Commands dataset.
See also
- Launching Speech Commands DS
- Blog-Entry on the release of the speech commands dataset.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str
DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus¶
-
class
audiomate.corpus.io.
TimitReader
[source]¶ Reader for the TIMIT Corpus.
See also
- TIMIT
- Download page
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str
TUDA German Distant Speech¶
-
class
audiomate.corpus.io.
TudaReader
[source]¶ Reader for the TUDA German distant speech corpus (german-speechdata-package-v2.tar.gz).
Note
It only loads files ending in -beamformedSignal.wav
See also
-
static
get_ids_from_folder
(path, part_name)[source]¶ Return all ids from the given folder that have a corresponding beamformedSignal file.
-
static
load_file
(folder_path, idx, corpus)[source]¶ Load speaker, file, utterance, labels for the file with the given id.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str
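The selection rule from the note above, that only files ending in -beamformedSignal.wav are loaded, could be sketched as follows. This is not the code of get_ids_from_folder(). It assumes that the id of a recording is the part of the file name in front of the suffix, it leaves out the part_name argument, and the function name beamformed_ids is hypothetical.

```python
import os

SUFFIX = '-beamformedSignal.wav'


def beamformed_ids(folder_path):
    """Return the sorted ids of all files in folder_path ending in SUFFIX."""
    ids = []
    for name in os.listdir(folder_path):
        if name.endswith(SUFFIX):
            # Assumption: the id is the file name without the suffix.
            ids.append(name[:-len(SUFFIX)])
    return sorted(ids)
```

Recordings from other microphones that share the same prefix are skipped because their file names do not end in the suffix.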
ESC-50¶
-
class
audiomate.corpus.io.
ESC50Downloader
(url=None)[source]¶ Downloader for the ESC-50 dataset.
Parameters: url (str) – The URL to download the dataset from. If not given, the default URL is used.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the downloader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired downloader through create_downloader_of_type() or get a list of all built-in downloaders with available_downloaders().
Returns: Name of the downloader Return type: str
-
class
audiomate.corpus.io.
ESC50Reader
[source]¶ Reader for the ESC-50 dataset (Environmental Sound Classification).
See also
- ESC-50
- Download page
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str
Free-Spoken-Digit-Dataset¶
-
class
audiomate.corpus.io.
FreeSpokenDigitDownloader
(url=None)[source]¶ Downloader for the Free-Spoken-Digit dataset.
Parameters: url (str) – The URL to download the dataset from. If not given, the default URL is used. It is expected to be a zip file.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the downloader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired downloader through create_downloader_of_type() or get a list of all built-in downloaders with available_downloaders().
Returns: Name of the downloader Return type: str
-
class
audiomate.corpus.io.
FreeSpokenDigitReader
[source]¶ Reader for the Free-Spoken-Digit Corpus.
See also
- Free-Spoken-Digit-Dataset
- Download page
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str
Mozilla DeepSpeech¶
-
class
audiomate.corpus.io.
MozillaDeepSpeechWriter
(transcription_label_list_idx='default')[source]¶ Writes files to use for training with Mozilla DeepSpeech (https://github.com/mozilla/DeepSpeech).
Since every utterance is expected to be in a separate file, utterances that do not have their own file in the original corpus are extracted into separate files in the subfolder audio of the target path.
Parameters: transcription_label_list_idx (str) – The transcriptions are taken from the label-list with this id.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the writer. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired writer through create_writer_of_type() or get a list of all built-in writers with available_writers().
Returns: Name of the writer Return type: str
Urbansound8k¶
-
class
audiomate.corpus.io.
Urbansound8kReader
[source]¶ Reader for the Urbansound8k dataset.
See also
- Urbansound8k
- Download page
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str
VoxForge¶
-
class
audiomate.corpus.io.
VoxforgeDownloader
(lang='de', url=None)[source]¶ Downloader for audio files from http://www.voxforge.org/. All .tgz files that are linked from the given url are downloaded and extracted.
Parameters: - lang (str) – If no URL is given, the predefined URL for the given language is used, if one is defined.
- url (str) – The URL to check for available .tgz files.
-
static
download_files
(file_urls, target_path)[source]¶ Download all files and store to the given path.
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the downloader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired downloader through create_downloader_of_type() or get a list of all built-in downloaders with available_downloaders().
Returns: Name of the downloader Return type: str
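The downloader is described as fetching every .tgz file that is linked from a page. The link discovery step could look roughly like the sketch below, which uses Python's standard html.parser and urllib.parse modules. The names TgzLinkParser and find_tgz_links are hypothetical, the actual download and extraction are left out, and nothing here is taken from the VoxforgeDownloader source.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class TgzLinkParser(HTMLParser):
    """Collect the absolute URLs of all links that end in .tgz."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        for name, value in attrs:
            if name == 'href' and value and value.endswith('.tgz'):
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def find_tgz_links(html, base_url):
    """Return all .tgz links found in the given HTML text."""
    parser = TgzLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

A list of URLs produced this way is the kind of input that download_files(file_urls, target_path) expects as its first argument.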
-
class
audiomate.corpus.io.
VoxforgeReader
[source]¶ Reader for collections of voxforge audio data. The reader expects extracted .tgz files in the given folder.
See also
- http://www.voxforge.org/
- Download page
-
static
data_folders
(path)[source]¶ Generator which yields a list of valid data directories (corresponds to the content of one .tgz).
-
static
parse_prompts
(etc_folder)[source]¶ Read prompts and prompts-original and return them as a dictionary (id as key).
-
classmethod
type
()[source]¶ Returns a string that uniquely identifies the reader. This is usually the name of the corpus, for example musan or timit. Users can use this string to obtain an instance of the desired reader through create_reader_of_type() or get a list of all built-in readers with available_readers().
Returns: Name of the reader Return type: str