audiomate.feeding¶
The audiomate.feeding module provides tools for simple access to data stored in different audiomate.corpus.assets.Container objects.
Datasets¶
-
class audiomate.feeding.Dataset(corpus_or_utt_ids, feature_containers)[source]¶
An abstract class representing a dataset. A dataset provides indexable access to data. An implementation of a concrete dataset should override the methods __len__ and __getitem__.
A sample returned from a dataset is a tuple containing the data for this sample from every container. The data from different containers is ordered in the way the containers were passed to the Dataset.
Parameters: - corpus_or_utt_ids (Corpus, list) – Either a corpus or a list of utterances. This defines which utterances are considered for indexing.
- feature_containers (list, Container) – A single container or a list of containers.
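The __len__/__getitem__ contract described above can be illustrated with a minimal, library-agnostic sketch. The ToyDataset class and its in-memory lists are hypothetical stand-ins for audiomate's HDF5-backed containers, not part of the library:

```python
# Minimal sketch of the Dataset contract, assuming two in-memory
# "containers" (plain lists) standing in for HDF5-backed containers.
class ToyDataset:
    def __init__(self, containers):
        # All containers must hold one entry per sample.
        self.containers = containers

    def __len__(self):
        return len(self.containers[0])

    def __getitem__(self, index):
        # A sample is a tuple with the data from every container,
        # in the order the containers were passed in.
        return tuple(c[index] for c in self.containers)

inputs = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
targets = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
ds = ToyDataset([inputs, targets])
print(len(ds))  # 3
print(ds[1])    # ([0.3, 0.4], [1.0, 0.0])
```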
-
class audiomate.feeding.FrameDataset(corpus_or_utt_ids, container)[source]¶
A dataset wrapping frames of a corpus. A single sample represents a single frame.
Parameters: - corpus_or_utt_ids (Corpus, list) – Either a corpus or a list of utterances. This defines which utterances are considered for indexing.
- container (list, Container) – A single container or a list of containers.
Note
For a frame dataset it is expected that every container contains exactly one value/vector for every frame, so the first dimension of every array in every container has to match.
Example
>>> corpus = audiomate.Corpus.load('/path/to/corpus')
>>> container_inputs = containers.FeatureContainer('/path/to/features.hdf5')
>>> container_outputs = containers.Container('/path/to/targets.hdf5')
>>>
>>> ds = FrameDataset(corpus, [container_inputs, container_outputs])
>>> len(ds)   # Number of frames in the dataset
2938
>>> ds[293]   # Frame (inputs, outputs) with index 293
(
    array([0.58843831, 0.18128443, 0.19718328, 0.25284105]),
    array([0.0, 1.0])
)
-
get_utt_regions()¶
Return the regions of all utterances, assuming all utterances are concatenated. It is assumed that the utterances are sorted in ascending order for concatenation.
A region is defined by offset (in chunks), length (num-chunks) and a list of references to the utterance datasets in the containers.
Returns: List with a tuple for every utterance containing the region info. Return type: list
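The region computation described above can be sketched without audiomate: given per-utterance frame counts sorted by utterance-id, the offset of each region is the running sum of the preceding lengths. Both compute_regions and the length mapping below are hypothetical, not the library's internals:

```python
# Sketch: compute (utt_id, offset, length) regions for concatenated
# utterances. utt_lengths maps utterance-id -> number of frames.
def compute_regions(utt_lengths):
    regions = []
    offset = 0
    # Utterances are assumed to be sorted in ascending order by id.
    for utt_id in sorted(utt_lengths):
        length = utt_lengths[utt_id]
        regions.append((utt_id, offset, length))
        offset += length
    return regions

regions = compute_regions({'utt-b': 4, 'utt-a': 3, 'utt-c': 5})
print(regions)  # [('utt-a', 0, 3), ('utt-b', 3, 4), ('utt-c', 7, 5)]
```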
-
partitioned_iterator(partition_size, shuffle=True, seed=None)[source]¶
Return a partitioning audiomate.feeding.FrameIterator for the dataset.
Parameters: - partition_size (str) – Size of the partitions in bytes. The units k (kibibytes), m (mebibytes) and g (gibibytes) are supported, i.e. a partition_size of 1g equates to \(2^{30}\) bytes.
- shuffle (bool) – Indicates whether the data should be returned in random order (True) or not (False).
- seed (int) – Seed to be used for the random number generator.
Returns: A partition iterator over the dataset.
Return type: FrameIterator
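The unit convention for partition_size strings can be illustrated with a small parser. parse_partition_size is a hypothetical helper for illustration, not audiomate's internal implementation:

```python
# Sketch: convert a partition-size string like '1g' into bytes,
# using binary units (k = 2**10, m = 2**20, g = 2**30).
def parse_partition_size(size):
    units = {'k': 2**10, 'm': 2**20, 'g': 2**30}
    size = str(size).lower()
    if size[-1] in units:
        return int(float(size[:-1]) * units[size[-1]])
    return int(size)

print(parse_partition_size('1g'))    # 1073741824
print(parse_partition_size('512k'))  # 524288
```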
-
class audiomate.feeding.MultiFrameDataset(corpus_or_utt_ids, container, frames_per_chunk, return_length=False, pad=False)[source]¶
A dataset wrapping chunks of frames of a corpus. A single sample represents a chunk of frames.
A chunk doesn't overlap utterance boundaries. So if the utterance length is not divisible by the chunk length, the last chunk of an utterance may be smaller than the chunk size.
Parameters: - corpus_or_utt_ids (Corpus, list) – Either a corpus or a list of utterances. This defines which utterances are considered for iterating.
- container (list, Container) – A single container or a list of containers.
- frames_per_chunk (int) – Number of subsequent frames in a single sample.
- return_length (bool) – If True, the length of the chunk is returned as well. (default False) The length is appended to the tuple as the last element. (e.g. [container1-data, container2-data, length])
- pad (bool) – If True, samples that are shorter are padded with zeros to match frames_per_chunk. If padding is enabled, the lengths are always returned, as if return_length == True.
Note
For a multi-frame dataset it is expected that every container contains exactly one value/vector for every frame, so the first dimension of every array in every container has to match.
Examples
>>> corpus = audiomate.Corpus.load('/path/to/corpus')
>>> container_inputs = containers.FeatureContainer('/path/to/features.hdf5')
>>> container_outputs = containers.Container('/path/to/targets.hdf5')
>>>
>>> ds = MultiFrameDataset(corpus, [container_inputs, container_outputs], 5)
>>> len(ds)   # Number of chunks in the dataset
355
>>> ds[20]   # Chunk (inputs, outputs) with index 20
(
    array([[0.72991909, 0.20258683, 0.30574747, 0.53783217],
           [0.38875413, 0.83611128, 0.49054591, 0.15710017],
           [0.35153358, 0.40051009, 0.93647765, 0.29589257],
           [0.97465772, 0.80160451, 0.81871436, 0.4892925 ],
           [0.59310933, 0.8565602 , 0.95468696, 0.07933512]]),
    array([[0.0, 1.0],
           [0.0, 1.0],
           [0.0, 1.0],
           [0.0, 1.0],
           [0.0, 1.0]])
)
If the length should be returned, pass True to return_length. (Except for chunks at the end of utterances, the length will be equal to frames_per_chunk.)
>>> corpus = audiomate.Corpus.load('/path/to/corpus')
>>> container_inputs = containers.FeatureContainer('/path/to/features.hdf5')
>>> container_outputs = containers.Container('/path/to/targets.hdf5')
>>>
>>> ds = MultiFrameDataset(corpus, [container_inputs, container_outputs], 5, return_length=True)
>>> len(ds)   # Number of chunks in the dataset
355
>>> ds[20]   # Chunk (inputs, outputs, length) with index 20
(
    array([[0.72991909, 0.20258683, 0.30574747, 0.53783217],
           [0.38875413, 0.83611128, 0.49054591, 0.15710017],
           [0.35153358, 0.40051009, 0.93647765, 0.29589257],
           [0.97465772, 0.80160451, 0.81871436, 0.4892925 ],
           [0.59310933, 0.8565602 , 0.95468696, 0.07933512]]),
    array([[0.0, 1.0],
           [0.0, 1.0],
           [0.0, 1.0],
           [0.0, 1.0],
           [0.0, 1.0]]),
    5
)
-
get_utt_regions()[source]¶
Return the regions of all utterances, assuming all utterances are concatenated. It is assumed that the utterances are sorted in ascending order for concatenation.
A region is defined by offset (in chunks), length (num-chunks) and a list of references to the utterance datasets in the containers.
Returns: List with a tuple for every utterance containing the region info. Return type: list
-
partitioned_iterator(partition_size, shuffle=True, seed=None)[source]¶
Return a partitioning audiomate.feeding.MultiFrameIterator for the dataset.
Parameters: - partition_size (str) – Size of the partitions in bytes. The units k (kibibytes), m (mebibytes) and g (gibibytes) are supported, i.e. a partition_size of 1g equates to \(2^{30}\) bytes.
- shuffle (bool) – Indicates whether the data should be returned in random order (True) or not (False).
- seed (int) – Seed to be used for the random number generator.
Returns: A partition iterator over the dataset.
Return type: MultiFrameIterator
Iterator¶
-
class audiomate.feeding.DataIterator(corpus_or_utt_ids, feature_containers, shuffle=True, seed=None)[source]¶
An abstract class representing a data-iterator. A data-iterator provides sequential access to data. An implementation of a concrete data-iterator should override the methods __iter__ and __next__.
A sample returned from a data-iterator is a tuple containing the data for this sample from every container. The data from different containers is ordered in the way the containers were passed to the DataIterator.
Parameters: - corpus_or_utt_ids (Corpus, list) – Either a corpus or a list of utterances. This defines which utterances are considered for iterating.
- containers (list, Container) – A single container or a list of containers.
- shuffle (bool) – Indicates whether the data should be returned in random order (True) or not (False).
- seed (int) – Seed to be used for the random number generator.
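The __iter__/__next__ contract with seeded shuffling can be sketched library-agnostically. ToyIterator and its sample tuples are hypothetical, not part of audiomate:

```python
import random

# Sketch of the data-iterator contract: sequential access over samples,
# optionally in a seeded (and thus reproducible) random order.
class ToyIterator:
    def __init__(self, samples, shuffle=True, seed=None):
        self.samples = list(samples)
        if shuffle:
            # A fixed seed makes the shuffled order reproducible.
            random.Random(seed).shuffle(self.samples)
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= len(self.samples):
            raise StopIteration
        sample = self.samples[self.index]
        self.index += 1
        return sample

it = ToyIterator([('utt-1', 0), ('utt-1', 1), ('utt-2', 0)], shuffle=True, seed=23)
print(sorted(it))  # all samples are yielded exactly once
```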
-
class audiomate.feeding.FrameIterator(corpus_or_utt_ids, container, partition_size, shuffle=True, seed=None)[source]¶
A data-iterator wrapping frames of a corpus. A single sample represents a single frame.
Parameters: - corpus_or_utt_ids (Corpus, list) – Either a corpus or a list of utterances. This defines which utterances are considered for iterating.
- container (list, Container) – A single container or a list of containers.
- partition_size (str) – Size of the partitions in bytes. The units k (kibibytes), m (mebibytes) and g (gibibytes) are supported, i.e. a partition_size of 1g equates to \(2^{30}\) bytes.
- shuffle (bool) – Indicates whether the data should be returned in random order (True) or not (False).
- seed (int) – Seed to be used for the random number generator.
Note
For a FrameIterator it is expected that every container contains exactly one value/vector for every frame, so the first dimension of every array in every container has to match.
Example
>>> corpus = audiomate.Corpus.load('/path/to/corpus')
>>> container_inputs = containers.FeatureContainer('/path/to/features.hdf5')
>>> container_outputs = containers.Container('/path/to/targets.hdf5')
>>>
>>> ds = FrameIterator(corpus, [container_inputs, container_outputs], '1G', shuffle=True, seed=23)
>>> next(ds)   # Next frame (inputs, outputs)
(
    array([0.58843831, 0.18128443, 0.19718328, 0.25284105]),
    array([0.0, 1.0])
)
-
class audiomate.feeding.MultiFrameIterator(corpus_or_utt_ids, container, partition_size, frames_per_chunk, return_length=False, pad=False, shuffle=True, seed=None)[source]¶
A data-iterator wrapping chunks of subsequent frames of a corpus. A single sample represents a chunk of frames.
Parameters: - corpus_or_utt_ids (Corpus, list) – Either a corpus or a list of utterances. This defines which utterances are considered for iterating.
- container (list, Container) – A single container or a list of containers.
- partition_size (str) – Size of the partitions in bytes. The units k (kibibytes), m (mebibytes) and g (gibibytes) are supported, i.e. a partition_size of 1g equates to \(2^{30}\) bytes.
- frames_per_chunk (int) – Number of subsequent frames in a single sample.
- return_length (bool) – If True, the length of the chunk is returned as well. (default False) The length is appended to the tuple as the last element. (e.g. [container1-data, container2-data, length])
- pad (bool) – If True, samples that are shorter are padded with zeros to match frames_per_chunk. If padding is enabled, the lengths are always returned, as if return_length == True.
- shuffle (bool) – Indicates whether the data should be returned in random order (True) or not (False).
- seed (int) – Seed to be used for the random number generator.
Note
For a MultiFrameIterator it is expected that every container contains exactly one value/vector for every frame, so the first (outermost) dimension of every array in every container has to match.
Example
>>> corpus = audiomate.Corpus.load('/path/to/corpus')
>>> container_inputs = containers.FeatureContainer('/path/to/features.hdf5')
>>> container_outputs = containers.Container('/path/to/targets.hdf5')
>>>
>>> ds = MultiFrameIterator(corpus, [container_inputs, container_outputs], '1G', 5, shuffle=True, seed=23)
>>> next(ds)   # Next chunk (inputs, outputs)
(
    array([[0.72991909, 0.20258683, 0.30574747, 0.53783217],
           [0.38875413, 0.83611128, 0.49054591, 0.15710017],
           [0.35153358, 0.40051009, 0.93647765, 0.29589257],
           [0.97465772, 0.80160451, 0.81871436, 0.4892925 ],
           [0.59310933, 0.8565602 , 0.95468696, 0.07933512]]),
    array([[0.0, 1.0],
           [0.0, 1.0],
           [0.0, 1.0],
           [0.0, 1.0],
           [0.0, 1.0]])
)
Partitioning¶
-
class audiomate.feeding.PartitioningContainerLoader(corpus_or_utt_ids, feature_containers, partition_size, shuffle=True, seed=None)[source]¶
Load data from one or more containers in partitions. It computes a scheme to load the data of as many utterances as possible in one partition.
A scheme is initially computed on creation of the loader. To compute a new one, the reload() method can be used. This only has an effect if shuffle == True; otherwise the utterances are always loaded in the same order.
With a given scheme, data of a partition can be retrieved via load_partition_data(). It loads all data of the partition with the given index into memory.
Parameters: - corpus_or_utt_ids (Corpus, list) – Either a corpus or a list of utterances. This defines which utterances are considered for loading.
- containers (container.Container, list) – Either a single or a list of Container objects. From the given containers data is loaded.
- partition_size (str) – Size of the partitions in bytes. The units k (kibibytes), m (mebibytes) and g (gibibytes) are supported, i.e. a partition_size of 1g equates to \(2^{30}\) bytes.
- shuffle (bool) – Indicates whether the utterances should be returned in random order (True) or not (False).
- seed (int) – Seed to be used for the random number generator.
Example
>>> corpus = audiomate.Corpus.load('/path/to/corpus')
>>> container_inputs = containers.FeatureContainer('/path/to/feat.hdf5')
>>> container_outputs = containers.Container('/path/to/targets.hdf5')
>>>
>>> lo = PartitioningContainerLoader(
>>>     corpus,
>>>     [container_inputs, container_outputs],
>>>     '1G',
>>>     shuffle=True,
>>>     seed=23
>>> )
>>> len(lo.partitions)   # Number of partitions
5
>>> lo.partitions[0].utt_ids   # Utterances in the partition with index 0
['utt-1', 'utt-2', ...]
>>> p0 = lo.load_partition_data(0)   # Load partition 0 into memory
>>> p0.info.utt_ids[0]   # First utterance in the partition
'utt-1'
>>> p0.utt_data[0]   # Data of the first utterance
(
    array([[0.58843831, 0.18128443, 0.19718328, 0.25284105], ...]),
    array([[0.0, 1.0], ...])
)
-
load_partition_data(index)[source]¶
Load and return the partition with the given index.
Parameters: index (int) – The index of the partition, which refers to the index in self.partitions.
Returns: A PartitionData object containing the data for the partition with the given index.
Return type: PartitionData
-
class audiomate.feeding.PartitionInfo[source]¶
Class for holding the info of a partition.
Variables: - utt_ids (list) – A list of utterance-ids in the partition.
- utt_lengths (list) – List with lengths of the utterances (outermost dimension in the dataset of the container). Since there may be multiple containers, every item is a tuple of lengths. They correspond to the length of the utterance in every container, in the order of the containers passed to the PartitioningContainerLoader.
- size (int) – The number of bytes the partition will allocate when loaded.
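How the size variable relates to utt_lengths can be sketched with a rough byte estimate. estimate_partition_bytes, the float32 assumption, and the frame dimensions below are all hypothetical, for illustration only:

```python
# Sketch: estimate the bytes a partition allocates, given per-utterance
# lengths (one tuple entry per container, as in utt_lengths) and the
# per-container frame dimensionality. Assumes float32 (4 bytes/value).
def estimate_partition_bytes(utt_lengths, frame_dims, bytes_per_value=4):
    total = 0
    for lengths in utt_lengths:
        for length, dim in zip(lengths, frame_dims):
            total += length * dim * bytes_per_value
    return total

# Two utterances, two containers: 4-dim inputs and 2-dim targets.
size = estimate_partition_bytes([(100, 100), (80, 80)], [4, 2])
print(size)  # (100 + 80) * 4 * 4 + (100 + 80) * 2 * 4 = 4320
```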
-
class audiomate.feeding.PartitionData(info)[source]¶
Class for holding the loaded data of a partition.
Parameters: info (PartitionInfo) – The info about the partition.
Variables: utt_data (list) – A list holding the data-objects for every utterance, in the order of info.utt_ids. The entries are also lists or tuples containing the array for every container.
-
class audiomate.feeding.PartitioningFeatureIterator(hdf5file, partition_size, shuffle=True, seed=None, includes=None, excludes=None)[source]¶
Iterates over all features in the given HDF5 file.
Before iterating over the features, the iterator slices the file into one or more partitions and loads the data into memory. This leads to significant speed-ups even with moderate partition sizes, regardless of the type of disk (spinning or flash). Pseudo-random access is supported with a negligible impact on performance and randomness: the data is randomly sampled (without replacement) within each partition, and the partitions are loaded in random order, too.
The features are emitted as triplets in the form of (utterance name, index of the feature within the utterance, feature).
When calculating the partition sizes, only the size of the features itself is factored in; the overhead of data storage is ignored. This overhead is usually negligible even with partition sizes of multiple gigabytes, because the data is stored as numpy ndarrays in memory (one per utterance). The overhead of a single ndarray is 96 bytes, regardless of its size. Nonetheless the partition size should be chosen to be lower than the total available memory.
Parameters: - hdf5file (h5py.File) – HDF5 file containing the features
- partition_size (str) – Size of the partitions in bytes. The units k (kibibytes), m (mebibytes) and g (gibibytes) are supported, i.e. a partition_size of 1g equates to \(2^{30}\) bytes.
- shuffle (bool) – Indicates whether the features should be returned in random order (True) or not (False).
- seed (int) – Seed to be used for the random number generator.
- includes (iterable) – Iterable of names of data sets that should be included when iterating over the feature container. Mutually exclusive with excludes. If both are specified, only includes will be considered.
- excludes (iterable) – Iterable of names of data sets to skip when iterating over the feature container. Mutually exclusive with includes. If both are specified, only includes will be considered.
Example
>>> import h5py
>>> from audiomate.feeding import PartitioningFeatureIterator
>>> hdf5 = h5py.File('features.h5', 'r')
>>> iterator = PartitioningFeatureIterator(hdf5, '12g', shuffle=True)
>>> next(iterator)
('music-fma-0100', 227, array([-0.15004082, -0.30246958, -0.38708138, ...,
       -0.93471956, -0.94194776, -0.90878332], dtype=float32))
>>> next(iterator)
('music-fma-0081', 2196, array([-0.00207647, -0.00101351, -0.00058832, ...,
       -0.00207647, -0.00292684, -0.00292684], dtype=float32))
>>> next(iterator)
('music-hd-0050', 1026, array([-0.57352495, -0.63049972, -0.63049972, ...,
        0.82490814,  0.84680521,  0.75517786], dtype=float32))
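The memory accounting described for PartitioningFeatureIterator (only feature bytes are counted; the 96-byte per-ndarray overhead is ignored) can be turned into a quick back-of-the-envelope estimate. The helper and all numbers below are hypothetical, for illustration only:

```python
# Sketch: estimate how many float32 feature vectors fit into a partition,
# and the ignored ndarray overhead for a given number of utterances.
def vectors_per_partition(partition_bytes, feature_dim, bytes_per_value=4):
    # Only the raw feature bytes are counted, matching the docs above.
    return partition_bytes // (feature_dim * bytes_per_value)

NDARRAY_OVERHEAD = 96  # bytes per in-memory ndarray (one per utterance)

partition = 1 * 2**30  # a '1g' partition
dim = 13               # e.g. 13 coefficients per feature vector
print(vectors_per_partition(partition, dim))  # 20648881
print(500 * NDARRAY_OVERHEAD)  # overhead for 500 utterances: 48000 bytes
```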