Essentia 2.1 beta5

@dbogdanov dbogdanov released this 05 Sep 14:55
· 1318 commits to master since this release

Essentia 2.1 beta5 is our current preliminary version of the forthcoming 2.1 release. This pre-release includes the following changes:

  • Algorithms updates and bug-fixes

    • Fix the slaneyMel scale implementation in MelBands and MFCC (#849). The slaneyMel option, introduced in 2.1-beta4, was erroneously computing the HTK Mel scale. Set htkMel as the default scale to ensure backward compatibility with all previous versions of MelBands/MFCC.

    • New option unit_tri for triangle area normalization in MelBands, MFCC, and TriangularBands.

    • New parameter silenceThreshold in MFCC and GFCC. Set default threshold to 1e-10 (#543).

    • TriangularBands: faster unit-sum normalization and an improved check for insufficient spectrum resolution (#142).

    • ConstantQ and the related Chromagram and SpectrumCQ are reimplemented from scratch and now function correctly. The maxFrequency parameter is replaced by numberBins.

    • New negativeFrequencies parameter in FFTC to include negative frequencies in the output.

    • New normalize parameter for IFFT size normalization.

    • FFTC now supports KissFFT and Accelerate.

    • PoolAggregator: new aggregation method last to get the last value. Fix possible nan/inf values in kurtosis and skewness (#689). Also apply aggregation to pool values that contain only one vector.

    • New checkRange parameter in Trimmer and StereoTrimmer.

    • PitchFilter: improve consistency between input and output stream types (#674).

    • PitchMelodia: fix missing output pitchConfidence in streaming mode.

    • MultiPitchMelodia: peakFrameThreshold and peakDistributionThreshold parameters now work correctly (they were overridden by hardcoded values).

    • New tolerance parameter in PitchYinFFT. When the pitch confidence is lower than the tolerance value, the output pitch is set to 0. A tolerance of 1 disables this feature.

    • Fix occasional negative values output by Danceability (#483).

    • LoudnessEBUR128:

      • Fix memory leaks and warnings on empty input. Set a larger internal buffer size to avoid buffer resizes.
      • New parameter startFromZero to zero-center the first window for loudness estimation.
    • Fix a memory leak in AudioLoader.

    • BeatTrackerDegara output is now deterministic (#860).

    • ChordDetectionBeats: add new parameter chromaPick and fix a beat segment indexing bug in the case of very close consecutive beats.

    • New minPeakDistance parameter in PeakDetection.

    • Fix invalid memory access in PCA (#727).

    • Update Key and KeyExtractor algorithms with new pitch class profiles and new parameters for detuning correction and low-energy HPCP bin thresholding. Use the new bgate profile by default. Add spectral whitening step to KeyExtractor. Change output key naming. Add a new function equivalentKey to match between equivalent names.

    • Proper mutex implementation for all FFT* algorithms.
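
To make the slaneyMel/htkMel distinction in the MelBands and MFCC fix above concrete, here is a minimal sketch of the two Hz-to-Mel conversions as they are commonly defined (the HTK formula and the Auditory Toolbox convention usually attributed to Slaney). The function names are hypothetical and this is plain Python, not Essentia's own code; the constants are the widely used ones and are assumed, not taken from these release notes.

```python
import math


def hz_to_mel_htk(hz):
    """HTK-style Mel scale: logarithmic over the whole frequency range."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def hz_to_mel_slaney(hz):
    """Slaney-style Mel scale: linear below 1 kHz, logarithmic above."""
    f_sp = 200.0 / 3.0                # Hz per Mel in the linear region
    break_hz = 1000.0                 # start of the logarithmic region
    break_mel = break_hz / f_sp       # Mel value at the break point (15)
    log_step = math.log(6.4) / 27.0   # 27 log-spaced steps between 1 kHz and 6.4 kHz
    if hz < break_hz:
        return hz / f_sp
    return break_mel + math.log(hz / break_hz) / log_step


# Print both conversions side by side for a few frequencies.
for hz in (500.0, 1000.0, 4000.0):
    print(hz, round(hz_to_mel_htk(hz), 2), round(hz_to_mel_slaney(hz), 2))
```

The two scales use different units and different shapes, so band edges derived from them differ; that is why silently computing one in place of the other changes the resulting MelBands/MFCC values.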

  • New algorithms

    • Invertible Constant-Q based on Non-Stationary Gabor frames: NSGConstantQ, NSGIConstantQ, NSGConstantQStreaming.
    • Chromaprinter (fingerprinting) wrapper for the Chromaprint library.
    • NNLSChroma and LogSpectrum (derived from the original NNLS Chroma code).
    • TriangularBarkBands (more configurable than BarkBands) and BFCC (bark-frequency cepstrum coefficients).
    • New algorithms for audio problems detection: ClickDetector, DiscontinuityDetector, FalseStereoDetector, GapsDetector, HumDetector, NoiseBurstDetector, SNR, SaturationDetector, StartStopCut, TruePeakDetector.
    • New algorithms for probabilistic Yin (pYIN) pitch estimation: PitchYinProbabilistic, PitchYinProbabilities, PitchYinProbabilitiesHMM.
    • StereoTrimmer and StereoMuxer.
    • Welch (power spectral density estimation).
    • New algorithm IFFTC for inverse complex STFT.
    • Histogram.
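
As an illustration of what Welch-style power spectral density estimation does, the following is a minimal pure-Python sketch: the signal is cut into overlapping frames, each frame is windowed, and the per-frame power spectra are averaged. All names (`welch_psd`, `frame_size`, `hop_size`) and the scaling convention are assumptions made for this example; it does not reproduce the interface or the exact normalization of the Welch algorithm listed above.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def power_spectrum(frame):
    """Squared DFT magnitude for the non-negative frequency bins (naive O(n^2) DFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def welch_psd(signal, frame_size=64, hop_size=32):
    """Average the windowed power spectra of overlapping frames."""
    window = hann(frame_size)
    norm = sum(w * w for w in window)  # compensate for the window energy
    psd = [0.0] * (frame_size // 2 + 1)
    frames = 0
    for start in range(0, len(signal) - frame_size + 1, hop_size):
        frame = [signal[start + i] * window[i] for i in range(frame_size)]
        for k, p in enumerate(power_spectrum(frame)):
            psd[k] += p / norm
        frames += 1
    if frames == 0:
        raise ValueError("signal is shorter than one frame")
    return [p / frames for p in psd]


# A sinusoid that completes exactly 8 cycles per 64-sample frame.
signal = [math.sin(2.0 * math.pi * 8 * i / 64) for i in range(512)]
psd = welch_psd(signal)
peak_bin = max(range(len(psd)), key=psd.__getitem__)
print(peak_bin)
```

Averaging over frames trades frequency resolution for a lower-variance estimate, which is the main reason to prefer this approach over a single long spectrum for noisy signals.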
  • Updated music and sound feature extractors streaming_extractor_music and streaming_extractor_freesound. Both extractors are now also available as algorithms: MusicExtractor and FreesoundExtractor. New MusicExtractorSVM algorithm allows applying SVM models to the output of MusicExtractor.

    • Fix possible memory leaks in MusicExtractor

    • Proper logging for "out of memory" errors

    • Skip aggregation for some descriptors

    • Add audio length to metadata and remove end_time

    • Add number of audio channels to metadata (number_channels)

    • Better grouping of metadata related to audio analysis

    • Updated key/chords estimation parameters

    • Estimate key using three different key profiles (temperley, krumhansl, edma)

    • Updated descriptors in MusicExtractor:

      • New LoudnessEBUR128 loudness descriptors
      • Add melbands128 high-resolution melbands
      • Compute hpcp_crest
      • Compute bpm_histogram
      • New stdev aggregate statistics in addition to var
    • Updated descriptors in FreesoundExtractor:

      • Add melbands96 high-resolution melbands
      • Add stdev statistic
      • Remove frequency_bands
      • Do not output bpm_confidence when configured to use 'degara' for beat tracking
      • spectral_contrast and scvalleys are now called spectral_contrast_coeffs and spectral_contrast_valleys for consistency with MusicExtractor
      • startFrame and stopFrame are now called sound_start_frame and sound_stop_frame
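
Code that reads FreesoundExtractor output by descriptor name may need updating after the renames listed above. The helper below is a hypothetical migration sketch: the old and new names come from these notes, while the function names, the dotted key layout, and the example namespaces (`lowlevel`, `rhythm`) are assumptions made for illustration only.

```python
# Old -> new descriptor names, as listed in the release notes above.
RENAMED = {
    "spectral_contrast": "spectral_contrast_coeffs",
    "scvalleys": "spectral_contrast_valleys",
    "startFrame": "sound_start_frame",
    "stopFrame": "sound_stop_frame",
}


def migrate_key(key):
    """Rename any dot-separated component of a descriptor key that changed."""
    return ".".join(RENAMED.get(part, part) for part in key.split("."))


def migrate_results(old_results):
    """Return a copy of a flat results dict with renamed descriptor keys."""
    return {migrate_key(key): value for key, value in old_results.items()}


# Hypothetical old-style output; the namespaces are illustrative only.
old = {"lowlevel.scvalleys.mean": [0.1], "startFrame": 0, "rhythm.bpm": 120.0}
new = migrate_results(old)
print(new)
```

Keys that were not renamed pass through unchanged, so the helper can be applied to a whole result set without listing every descriptor.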
  • New extractors

    • Add a new extractor for spectrograms and log-energy Mel-spectrograms (streaming_spectrogram).
  • Python bindings updates

    • Add support for Python 3.
    • Update all tutorials and code examples to Python 3.
    • New essentia.pyutils submodule provides useful functions for a number of use-cases (spectrograms, CQ-grams, batch processing with extractors, etc.)
    • Fix a memory bug in Pool on an isSingleValue check in Python.
    • Faster VECTOR_VECTOR_REAL conversion from Python types.
  • Build scripts updates

    • Add script for Python packaging (python.py) and wheels.
    • Travis CI and build scripts for manylinux wheels.
    • Update Waf to 2.0.10.
    • The code is now partly C++11.
    • Build flags for MSVC.
    • Fixes for cross-compilation with Mingw-w64.
    • Default --prefix=$VIRTUAL_ENV when inside a virtualenv.
    • Read PKG_CONFIG_PATH and add new flag --pkg-config-path for custom lib paths.
    • New flag --only-python to build Python extension separately from libessentia.
    • Link only to libessentia when building examples.
    • Generate a proper essentia.pc pkg-config file.
    • Static builds updates.
      • Replace LibAv with FFmpeg, build with muxers.
      • Update Taglib version to 1.11.1, build with zlib.
      • Update Gaia to 2.4.5.
  • Miscellaneous

    • Fix segfault in the Vamp plugin (#635, #371).
    • Add support for SingleVectorString to Pool.
    • Add support for Bessel functions via the third-party Cephes library.
  • Updated documentation, tutorials, and examples including a significant web redesign.

    • Improve build scripts for documentation.
    • Every algorithm page now has links to related algorithms.
    • An updated list of research works using Essentia.
    • New Python examples.
    • New QA scripts for audio problems detection and HPCPs.
  • The usual assortment of code cleanup, updated and expanded unit tests, and better logging (more informative log and exception messages).