Changelog

Next Version

v6.0.0

Breaking Changes

Drop support for Python 3.5 because a required dependency (llvmlite) no longer supports it.

New Features

Set up a consistent way of logging. (Logging)
Added downloader (audiomate.corpus.io.CommonVoiceDownloader) for the Common Voice Corpora.
Add existence checks for reader (audiomate.corpus.io.CorpusReader) to see if folder exists.
Add existence checks and an option to force a redownload for downloader (audiomate.corpus.io.CorpusDownloader).

v5.2.0

New Features

Added reader (audiomate.corpus.io.LibriSpeechReader) and downloader (audiomate.corpus.io.LibriSpeechDownloader) for the LibriSpeech Dataset.

v5.1.0

New Features

Added downloader (audiomate.corpus.io.SWCDownloader) for the SWC Corpus.
Updated SWC-Reader (audiomate.corpus.io.SWCReader) with its own implementation, so manual preprocessing is no longer needed.
Added conversion class (audiomate.corpus.conversion.WavAudioFileConverter) to convert all files (or files that do not meet the requirements) of a corpus.
Added writer (audiomate.corpus.io.NvidiaJasperWriter) for NVIDIA Jasper.
Created a consistent way to define invalid utterances of a dataset. Invalid utterance ids are defined in a JSON file (e.g. audiomate/corpus/io/data/tuda/invalid_utterances.json). They are loaded automatically in the base reader and can be accessed in the concrete implementation.

v5.0.0

Breaking Changes

Changed audiomate.corpus.validation.InvalidItemsResult so that it can be used not only for utterances, but also for other items such as tracks.
Refactoring and addition of splitting functions in the audiomate.corpus.subset.Splitter.

New Features

Added audiomate.corpus.validation.TrackReadValidator to check for corrupt audio tracks/files.
Added reader (audiomate.corpus.io.FluentSpeechReader) for the Fluent Speech Commands Dataset.
Added functions to check for contained tracks and issuers (audiomate.corpus.CorpusView.contains_track(), audiomate.corpus.CorpusView.contains_issuer()).
Multiple options for controlling the behavior of the audiomate.corpus.io.KaldiWriter.
Added writer (audiomate.corpus.io.Wav2LetterWriter) for the wav2letter engine.
Added module with functions to read/write sclite trn files (audiomate.formats.trn).

Fixes

Improved performance of Tuda-Reader (audiomate.corpus.io.TudaReader).
Added wrapper for the `audioread.audio_open` function (audiomate.utils.audioread) to cache available backends. This speeds up audio-open operations considerably.
Performance improvements, especially for importing utterances, merging, subviews.

v4.0.1

Fixes

Fix audiomate.corpus.io.CommonVoiceReader to use correct file-extension of the audio files.

v4.0.0

Breaking Changes

For utterances and labels, -1 was previously used to indicate that the end is the same as the end of the parent utterance/track. To avoid -1 checks in different methods/places, float('inf') is now used instead. This makes it easier to implement features such as label overlapping.
audiomate.annotations.LabelList is now backed by an interval-tree instead of a simple list. Therefore the labels have no fixed order anymore. The interval-tree provides functionality for operations like merging, splitting, finding overlaps with much lower code complexity.
Removed module audiomate.annotations.label_cleaning, since those methods are available on audiomate.annotations.LabelList directly.
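
To illustrate why float('inf') as an open end removes special cases, here is a minimal sketch. The overlap_duration helper below is hypothetical and is not audiomate's implementation; it only assumes that a segment is described by a start and an end in seconds.

```python
INF = float('inf')


def overlap_duration(a_start, a_end, b_start, b_end):
    """Return the overlap in seconds between two segments (hypothetical helper)."""
    # With inf as the open end, min/max work directly and no -1 check is needed.
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


# A label that runs to the end of its utterance overlaps a bounded label.
print(overlap_duration(2.0, INF, 1.0, 5.0))
```

With -1 as the marker, the same function would need an explicit branch for each open end before the comparison could be made.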

New Features

Added reader (audiomate.corpus.io.RouenReader) and downloader (audiomate.corpus.io.RouenDownloader) for the LITIS Rouen Audio scene dataset.
Added downloader (audiomate.corpus.io.AEDDownloader) for the Acoustic Event Dataset.
[#69] Method to get labels within range: audiomate.annotations.LabelList.labels_in_range().
[#68] Add convenience method to create Label-List with list of label values: audiomate.annotations.LabelList.with_label_values().
[#61] Added function to split utterances of a corpus into multiple utterances with a maximal duration: audiomate.corpus.CorpusView.split_utterances_to_max_time().
Add functions to check for overlap between labels: audiomate.annotations.Label.do_overlap() and audiomate.annotations.Label.overlap_duration().
Add function to merge equal labels that overlap within a label-list: audiomate.annotations.LabelList.merge_overlapping_labels().
Added reader (audiomate.corpus.io.AudioMNISTReader) and downloader (audiomate.corpus.io.AudioMNISTDownloader) for the AudioMNIST dataset.
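
The idea behind merging equal labels that overlap can be sketched as follows. This is an illustrative stand-alone function, not the code of audiomate.annotations.LabelList.merge_overlapping_labels(); representing labels as (value, start, end) tuples and treating touching labels as overlapping are both assumptions made for the example.

```python
def merge_overlapping(labels):
    """Merge (value, start, end) tuples that share a value and overlap."""
    merged = []
    # Sorting by value, then start, places merge candidates next to each other.
    for value, start, end in sorted(labels):
        if merged and merged[-1][0] == value and start <= merged[-1][2]:
            # Same value and overlapping (or touching): extend the previous label.
            prev_value, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_value, prev_start, max(prev_end, end))
        else:
            merged.append((value, start, end))
    return merged


labels = [('a', 0, 2), ('b', 1, 3), ('a', 1, 4), ('a', 6, 7)]
print(merge_overlapping(labels))
```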

Fixes

[#76][#77][#78] Multiple fixes on KaldiWriter

v3.0.0

Breaking Changes

Moved label-encoding to its own module (audiomate.encoding). It now supports processing full corpora and storing the results in containers.
Moved audiomate.feeding.PartitioningFeatureIterator to the audiomate.feeding module.
Added audiomate.containers.AudioContainer to store audio tracks in a single file. All container classes are now in a separate module audiomate.containers.
A corpus now contains Tracks instead of Files. This makes it possible to use different kinds of audio sources. Audio from a file is now included using audiomate.tracks.FileTrack. New is the audiomate.tracks.ContainerTrack, which reads data stored in a container.
The audiomate.corpus.io.DefaultReader and the audiomate.corpus.io.DefaultWriter now load and store tracks that are stored in a container.
All functionality regarding labels was moved to its own module audiomate.annotations.
The class audiomate.tracks.Utterance was moved to the tracks module.

New Features

Introducing the audiomate.feeding module. It provides different tools for accessing container data. Via an audiomate.feeding.Dataset, data can be accessed by indices. With an audiomate.feeding.DataIterator, one can easily iterate over data, such as frames.
Added processing steps for computing Onset-Strength (audiomate.processing.pipeline.OnsetStrength) and Tempogram (audiomate.processing.pipeline.Tempogram).
Introduced the audiomate.corpus.validation module, which is used to validate a corpus.
Added reader (audiomate.corpus.io.SWCReader) for the SWC corpus. It only works for the prepared corpus, however.
Added function (audiomate.corpus.utils.label_cleaning.merge_consecutive_labels_with_same_values()) for merging consecutive labels with the same value.
Added downloader (audiomate.corpus.io.GtzanDownloader) for the GTZAN Music/Speech.
Added audiomate.corpus.assets.Label.tokenized() to get a list of tokens from a label. It basically splits the value and trims whitespace.
Added methods on audiomate.corpus.CorpusView, audiomate.corpus.assets.Utterance and audiomate.corpus.assets.LabelList to get a set of occurring tokens.
Added audiomate.encoding.TokenOrdinalEncoder to encode labels of an utterance by mapping every token of the label to a number.
Created a container base class (audiomate.corpus.assets.Container) that can be used to store arbitrary data per utterance. The audiomate.corpus.assets.FeatureContainer is now an extension of the container, which provides functionality specifically for features.
Added functions to split utterances and label-lists into multiple parts. (audiomate.corpus.assets.Utterance.split(), audiomate.corpus.assets.LabelList.split())
Added audiomate.processing.pipeline.AddContext to add context to frames, using previous and subsequent frames.
Added reader (audiomate.corpus.io.MailabsReader) and downloader (audiomate.corpus.io.MailabsDownloader) for the M-AILABS Speech Dataset.
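
The token-related additions above (Label.tokenized() and TokenOrdinalEncoder) can be pictured with a small stand-alone sketch. The two functions below are hypothetical and do not reproduce audiomate's signatures; the space delimiter, the dropping of empty tokens, and the explicit vocabulary list are assumptions made for the example.

```python
def tokenized(value, delimiter=' '):
    """Split a label value and strip whitespace from each token."""
    # Empty tokens produced by repeated delimiters are dropped (an assumption).
    return [t.strip() for t in value.split(delimiter) if t.strip()]


def ordinal_encode(tokens, vocabulary):
    """Map every token to its index in the given vocabulary."""
    index = {token: i for i, token in enumerate(vocabulary)}
    return [index[token] for token in tokens]


tokens = tokenized('a  c b')
print(tokens)
print(ordinal_encode(tokens, ['a', 'b', 'c']))
```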

Fixes

[#58] Keep track of number of samples per frame and between frames. Now the correct values will be stored in a Feature-Container, if the processor implements it correctly.
[#72] Fix a bug when reading samples from an utterance using a specific duration while the utterance end is not defined.

v2.0.0

Breaking Changes

Update various readers to use the correct label-list identifiers as defined in Data Mapping.

New Features

Added downloader (audiomate.corpus.io.TatoebaDownloader) and reader (audiomate.corpus.io.TatoebaReader) for the Tatoeba platform.
Added downloader (audiomate.corpus.io.CommonVoiceDownloader) and reader (audiomate.corpus.io.CommonVoiceReader) for the Common Voice Corpus.
Added processing steps audiomate.processing.pipeline.AvgPool and audiomate.processing.pipeline.VarPool for computing average and variance over a given number of sequential frames.
Added downloader (audiomate.corpus.io.MusanDownloader) for the Musan Corpus.
Added constants for common label-list identifiers/keys in audiomate.corpus.
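
As a rough picture of what pooling over a given number of sequential frames means for AvgPool and VarPool, consider the sketch below. It is not the pipeline implementation: the pool function, the list-of-lists frame representation, and the use of population variance are assumptions chosen for the example.

```python
from statistics import mean, pvariance


def pool(frames, size, reducer):
    """Reduce each run of `size` sequential frames to one frame, per feature dimension."""
    pooled = []
    for i in range(0, len(frames), size):
        chunk = frames[i:i + size]  # the last chunk may hold fewer frames
        # zip(*chunk) groups the values of each feature dimension together.
        pooled.append([reducer(dim) for dim in zip(*chunk)])
    return pooled


frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
print(pool(frames, 2, mean))       # average over every 2 frames
print(pool(frames, 2, pvariance))  # variance over every 2 frames
```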

v1.0.0

Breaking Changes

The (pre)processing module has moved to audiomate.processing. It now supports online processing in chunks. For this purpose, a pipeline step can require context. The pipeline automatically buffers data until enough frames are ready.
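
The buffering behavior described above can be sketched roughly as follows. The BufferedStep class is hypothetical and much simpler than the actual pipeline: it assumes a step that needs the same number of context frames on each side, and it only emits windows once all of their frames have arrived.

```python
class BufferedStep:
    """Hypothetical step that needs `context` frames on each side of a frame."""

    def __init__(self, context):
        self.context = context
        self.buffer = []

    def process(self, chunk):
        """Accept a chunk of frames and return every window that is complete."""
        self.buffer.extend(chunk)
        size = 2 * self.context + 1
        windows = []
        # Emit windows while enough frames are buffered, then slide by one frame.
        while len(self.buffer) >= size:
            windows.append(self.buffer[:size])
            self.buffer.pop(0)
        return windows


step = BufferedStep(context=1)
print(step.process([1, 2]))  # not enough frames yet
print(step.process([3, 4]))  # two complete windows are now available
```

Frames that are still needed as left context stay in the buffer between calls, so the caller can feed chunks of any size.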

New Features

Added downloader (audiomate.corpus.io.FreeSpokenDigitDownloader) and reader (audiomate.corpus.io.FreeSpokenDigitReader) for the Free-Spoken-Digit-Dataset.

v0.1.0

Initial release