Changelog

Next Version

v6.0.0

Breaking Changes

  • Drop support for Python 3.5 because a required dependency (llvmlite) no longer supports it.

New Features

  • Set up a consistent way of logging. (Logging)
  • Added downloader (audiomate.corpus.io.CommonVoiceDownloader) for the Common Voice Corpora.
  • Add existence checks to the reader (audiomate.corpus.io.CorpusReader) to verify that the folder exists.
  • Add existence checks and an option to force a redownload to the downloader (audiomate.corpus.io.CorpusDownloader).

v5.2.0

New Features

  • Added reader (audiomate.corpus.io.LibriSpeechReader) and downloader (audiomate.corpus.io.LibriSpeechDownloader) for the LibriSpeech Dataset.

v5.1.0

New Features

  • Added downloader (audiomate.corpus.io.SWCDownloader) for the SWC Corpus.
  • Updated the SWC reader (audiomate.corpus.io.SWCReader) with its own implementation, so manual preprocessing is no longer needed.
  • Added conversion class (audiomate.corpus.conversion.WavAudioFileConverter) to convert all files (or files that do not meet the requirements) of a corpus.
  • Added writer (audiomate.corpus.io.NvidiaJasperWriter) for NVIDIA Jasper.
  • Create a consistent way to define invalid utterances of a dataset. Invalid utterance ids are defined in a JSON file (e.g. audiomate/corpus/io/data/tuda/invalid_utterances.json). They are loaded automatically in the base reader and can be accessed in the concrete implementation.

v5.0.0

Breaking Changes

New Features

Fixes

  • Improved performance of Tuda-Reader (audiomate.corpus.io.TudaReader).
  • Added wrapper for the `audioread.audio_open` function (audiomate.utils.audioread) to cache available backends. This considerably speeds up `audio_open` operations.
  • Performance improvements, especially for importing utterances, merging, and subviews.
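
The backend caching mentioned above can be illustrated with a minimal sketch. It does not use audioread or the audiomate wrapper: `available_backends` and `open_audio` are hypothetical stand-ins, and the probe is assumed to be the expensive part that should run only once.

```python
import functools

PROBE_CALLS = 0  # counts how often the (hypothetical) expensive probe runs


@functools.lru_cache(maxsize=None)
def available_backends():
    """Hypothetical stand-in for an expensive backend probe; cached after the first call."""
    global PROBE_CALLS
    PROBE_CALLS += 1
    return ("backend_a", "backend_b")


def open_audio(path):
    """Hypothetical opener that reuses the cached backend list on every call."""
    backends = available_backends()
    return path, backends


for name in ("a.wav", "b.wav", "c.wav"):
    open_audio(name)

print(PROBE_CALLS)  # 1
```

Without the cache, the probe would run once per opened file, which adds up quickly when iterating over a large corpus.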

v4.0.1

Fixes

  • Fix audiomate.corpus.io.CommonVoiceReader to use the correct file extension for the audio files.

v4.0.0

Breaking Changes

  • For utterances and labels, -1 was used to represent that the end is the same as the end of the parent utterance/track. To avoid -1 checks in different methods and places, float('inf') is now used instead. This makes it easier to implement features such as label overlapping.
  • audiomate.annotations.LabelList is now backed by an interval-tree instead of a simple list. Therefore, the labels no longer have a fixed order. The interval-tree supports operations like merging, splitting, and finding overlaps with much lower code complexity.
  • Removed module audiomate.annotations.label_cleaning, since those methods are available on audiomate.annotations.LabelList directly.
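
To show why float('inf') as the end value removes the special cases, here is a minimal sketch in plain Python. It is not audiomate code: `overlaps` is a hypothetical helper, and labels are assumed to be simple (start, end) pairs in seconds.

```python
import math


def overlaps(a, b):
    """Return True if the half-open intervals a and b share any time span."""
    a_start, a_end = a
    b_start, b_end = b
    return a_start < b_end and b_start < a_end


# A label that runs until the end of its parent utterance.
open_ended = (2.5, math.inf)

print(overlaps(open_ended, (4.0, 6.0)))  # True
print(overlaps(open_ended, (0.0, 2.5)))  # False

# With the old -1 sentinel the same comparison silently gives the wrong answer
# unless every caller checks for -1 first.
print(overlaps((2.5, -1), (4.0, 6.0)))  # False
```

Because infinity compares correctly against every finite time, the ordinary interval comparison works unchanged for open-ended labels.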

New Features

Fixes

  • [#76][#77][#78] Multiple fixes on KaldiWriter.

v3.0.0

Breaking Changes

New Features

  • Introducing the audiomate.feeding module. It provides different tools for accessing container data. Via an audiomate.feeding.Dataset, data can be accessed by index. With an audiomate.feeding.DataIterator, one can easily iterate over data, such as frames.
  • Added processing steps for computing Onset-Strength (audiomate.processing.pipeline.OnsetStrength) and Tempogram (audiomate.processing.pipeline.Tempogram).
  • Introduced the audiomate.corpus.validation module, which is used to validate a corpus.
  • Added reader (audiomate.corpus.io.SWCReader) for the SWC corpus. It only works for the prepared corpus, however.
  • Added function (audiomate.corpus.utils.label_cleaning.merge_consecutive_labels_with_same_values()) for merging consecutive labels with the same value.
  • Added downloader (audiomate.corpus.io.GtzanDownloader) for the GTZAN Music/Speech dataset.
  • Added audiomate.corpus.assets.Label.tokenized() to get a list of tokens from a label. It basically splits the value and trims whitespace.
  • Added methods on audiomate.corpus.CorpusView, audiomate.corpus.assets.Utterance and audiomate.corpus.assets.LabelList to get a set of occurring tokens.
  • Added audiomate.encoding.TokenOrdinalEncoder to encode labels of an utterance by mapping every token of the label to a number.
  • Created a container base class (audiomate.corpus.assets.Container) that can be used to store arbitrary data per utterance. The audiomate.corpus.assets.FeatureContainer is now an extension of the container that provides functionality specifically for features.
  • Added functions to split utterances and label-lists into multiple parts. (audiomate.corpus.assets.Utterance.split(), audiomate.corpus.assets.LabelList.split())
  • Added audiomate.processing.pipeline.AddContext to add context to frames, using previous and subsequent frames.
  • Added reader (audiomate.corpus.io.MailabsReader) and downloader (audiomate.corpus.io.MailabsDownloader) for the M-AILABS Speech Dataset.
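
The tokenizing and ordinal encoding described in this list can be sketched in a few lines of plain Python. This is an illustration only, not the audiomate implementation: `tokenize` and `ordinal_encode` are hypothetical helpers, and dropping empty tokens is an assumption.

```python
def tokenize(value, delimiter=" "):
    """Split a label value at the delimiter and trim whitespace around each token."""
    tokens = [token.strip() for token in value.split(delimiter)]
    # Assumption: empty tokens (e.g. from repeated delimiters) are dropped.
    return [token for token in tokens if token]


def ordinal_encode(tokens, vocabulary):
    """Map every token to its position in the given vocabulary."""
    index = {token: i for i, token in enumerate(vocabulary)}
    return [index[token] for token in tokens]


tokens = tokenize("b  a c")
print(tokens)                                   # ['b', 'a', 'c']
print(ordinal_encode(tokens, ["a", "b", "c"]))  # [1, 0, 2]
```

A token that is missing from the vocabulary raises a KeyError in this sketch; a real encoder would need an explicit policy for unknown tokens.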

Fixes

  • [#58] Keep track of the number of samples per frame and between frames. The correct values are now stored in a Feature-Container, provided the processor implements it correctly.
  • [#72] Fix a bug when reading samples from an utterance with a specific duration while the utterance end is not defined.

v2.0.0

Breaking Changes

  • Update various readers to use the correct label-list identifiers as defined in Data Mapping.

New Features

v1.0.0

Breaking Changes

  • The (pre)processing module has moved to audiomate.processing. It now supports online processing in chunks. For this purpose, a pipeline step can require context. The pipeline automatically buffers data until enough frames are ready.
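
The buffering idea can be sketched as follows. This is a simplified illustration, not the audiomate.processing implementation: `ContextBuffer` is a hypothetical class, frames are plain values, and frames at the stream edges that never get full context are simply not emitted.

```python
from collections import deque


class ContextBuffer:
    """Collect frames across chunks and emit windows once full context is available."""

    def __init__(self, left, right):
        # A window holds `left` previous frames, the current frame and `right` following frames.
        self.size = left + 1 + right
        self.frames = deque(maxlen=self.size)

    def process(self, chunk):
        """Feed a chunk of frames; return the windows that became complete."""
        ready = []
        for frame in chunk:
            self.frames.append(frame)
            if len(self.frames) == self.size:
                ready.append(list(self.frames))
        return ready


buffer = ContextBuffer(left=1, right=1)
print(buffer.process([1, 2]))  # []
print(buffer.process([3, 4]))  # [[1, 2, 3], [2, 3, 4]]
print(buffer.process([5]))     # [[3, 4, 5]]
```

The first chunk produces no output because no frame has both neighbours yet; later chunks complete the pending windows, so the caller never has to align chunk boundaries with the context size.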

New Features

  • Added downloader (audiomate.corpus.io.FreeSpokenDigitDownloader) and reader (audiomate.corpus.io.FreeSpokenDigitReader) for the Free-Spoken-Digit-Dataset.

v0.1.0

Initial release