Essentia 2.1 beta5
Pre-release
Essentia 2.1 beta5 is our current preliminary version of the forthcoming 2.1 release. This pre-release includes the following changes:

Algorithms updates and bug-fixes
- Fix the slaneyMel scale implementation in MelBands and MFCC (#849). Introduced in 2.1-beta4, it was erroneously computing the HTK Mel scale. Set htkMel as the default scale to ensure backward compatibility with all previous versions of MelBands/MFCC.
- New option unit_tri for triangle area normalization in MelBands, MFCC, and TriangularBands.
- New parameter silenceThreshold in MFCC and GFCC. Set default threshold to 1e-10 (#543).
- TriangularBands: faster unit-sum normalization and an improved check for insufficient spectrum resolution (#142).
- ConstantQ and the related Chromagram and SpectrumCQ are reimplemented from scratch and now function correctly. The maxFrequency parameter is replaced by numberBins.
- New negativeFrequencies parameter in FFTC to include negative frequencies in the output.
- New normalize parameter for IFFT size normalization.
- FFTC now supports KissFFT and Accelerate.
- PoolAggregator: new aggregation method last to get the last value. Fix possible nan/inf values in kurtosis and skewness (#689). Apply aggregation for pool values that contain only one vector too.
- New checkRange parameter in Trimmer and StereoTrimmer.
- PitchFilter: improve consistency between input and output stream types (#674).
- PitchMelodia: fix missing output pitchConfidence in streaming mode.
- MultiPitchMelodia: peakFrameThreshold and peakFrameThreshold parameters now work correctly (they were overridden by hardcoded values).
- New tolerance parameter in PitchYinFFT. When the pitch confidence is lower than the tolerance value, the output pitch is set to 0. A tolerance of 1 disables this feature.
- Fix occasional negative values output by Danceability (#483).
- LoudnessEBUR128:
  - Fix memory leaks and warnings on empty input. Set a larger internal buffer size to avoid buffer resizes.
  - New parameter startFromZero to zero-center the first window for loudness estimation.
- Fix a memory leak in AudioLoader.
- BeatTrackerDegara output is now deterministic (#860).
- ChordDetectionBeats: add new parameter chromaPick and fix a beat segment indexing bug in the case of very close consecutive beats.
- New minPeakDistance parameter in PeakDetection.
- Fix invalid memory access in PCA (#727).
- Update Key and KeyExtractor algorithms with new pitch class profiles and new parameters for detuning correction and low-energy HPCP bin thresholding. Use the new bgate profile by default. Add spectral whitening step to KeyExtractor. Change output key naming. Add a new function equivalentKey to match between equivalent names.
- Proper mutex implementation for all FFT* algorithms.
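
For readers unfamiliar with the two Mel scale variants named in the MelBands/MFCC fix above, here is a minimal sketch of the commonly cited HTK and Slaney (Auditory Toolbox) frequency-to-Mel conversions. It is plain Python written for illustration and is not taken from the Essentia source; the function names hz_to_mel_htk and hz_to_mel_slaney are hypothetical.

```python
import math


def hz_to_mel_htk(hz):
    """HTK-style Mel scale: one logarithmic curve over the whole range."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def hz_to_mel_slaney(hz):
    """Slaney-style Mel scale: linear below 1 kHz, logarithmic above."""
    lin_slope = 3.0 / 200.0                # Mel units per Hz in the linear region
    min_log_hz = 1000.0                    # start of the logarithmic region
    min_log_mel = min_log_hz * lin_slope   # Mel value at the 1 kHz break point
    if hz < min_log_hz:
        return hz * lin_slope
    log_step = math.log(6.4) / 27.0        # 27 log-spaced steps span a factor of 6.4
    return min_log_mel + math.log(hz / min_log_hz) / log_step


# Print both conversions side by side for a few frequencies.
for hz in (100.0, 1000.0, 6400.0):
    print(hz, round(hz_to_mel_htk(hz), 2), round(hz_to_mel_slaney(hz), 2))
```

The two scales use different units and shapes, so filterbanks built on them place their band edges differently. This is why computing one scale under the other's name changes the resulting MelBands and MFCC values.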

New algorithms
- Invertible Constant-Q based on Non-Stationary Gabor frames: NSGConstantQ, NSGIConstantQ, NSGConstantQStreaming.
- Chromaprinter (fingerprinting) wrapper for the Chromaprint library.
- NNLSChroma and LogSpectrum (derived from the original NNLS Chroma code).
- TriangularBarkBands (more configurable than BarkBands) and BFCC (bark-frequency cepstrum coefficients).
- New algorithms for audio problems detection: ClickDetector, DiscontinuityDetector, FalseStereoDetector, GapsDetector, HumDetector, NoiseBurstDetector, SNR, SaturationDetector, StartStopCut, TruePeakDetector.
- New algorithms for probabilistic Yin (pYIN) pitch estimation: PitchYinProbabilistic, PitchYinProbabilities, PitchYinProbabilitiesHMM.
- StereoTrimmer and StereoMuxer.
- Welch (power spectral density estimation).
- New algorithm IFFTC for inverse complex STFT.
- Histogram.

Updated music and sound feature extractors streaming_extractor_music and streaming_extractor_freesound. Both extractors are now also available as algorithms: MusicExtractor and FreesoundExtractor. New MusicExtractorSVM algorithm allows applying SVM models to the output of MusicExtractor.
- Fix possible memory leaks in MusicExtractor
- Proper logging for "out of memory" errors
- Skip aggregation for some descriptors
- Add audio length to metadata and remove end_time
- Add number of audio channels to metadata (number_channels)
- Better grouping of metadata related to audio analysis
- Updated key/chords estimation parameters
- Estimate key using three different key profiles (temperley, krumhansl, edma)
- Updated descriptors in MusicExtractor:
  - New LoudnessEBUR128 loudness descriptors
  - Add melbands128 high-resolution melbands
  - Compute hpcp_crest
  - Compute bpm_histogram
  - New stdev aggregate statistics in addition to var
- Updated descriptors in FreesoundExtractor:
  - Add melbands96 high-resolution melbands
  - Add stdev statistic
  - Remove frequency_bands
  - Do not output bpm_confidence when configured to use 'degara' for beat tracking
  - spectral_contrast and scvalleys are now called spectral_contrast_coeffs and spectral_contrast_valleys for consistency with MusicExtractor
  - startFrame and stopFrame are now called sound_start_frame and sound_stop_frame
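
The FreesoundExtractor renames listed above can affect scripts that read older analysis files. The following is a minimal migration sketch, assuming results are held in a flat dictionary whose keys are dot-separated descriptor paths. That layout, the lowlevel namespace, the sample values, and the helper names migrate_key and migrate_results are assumptions made for illustration; only the old and new descriptor names come from these notes.

```python
# Old FreesoundExtractor descriptor names mapped to their new names,
# as listed in these release notes.
RENAMED = {
    "spectral_contrast": "spectral_contrast_coeffs",
    "scvalleys": "spectral_contrast_valleys",
    "startFrame": "sound_start_frame",
    "stopFrame": "sound_stop_frame",
}


def migrate_key(key):
    """Rename every dot-separated component of a key that has a new name."""
    return ".".join(RENAMED.get(part, part) for part in key.split("."))


def migrate_results(results):
    """Return a copy of a flat {key: value} dict with migrated keys."""
    return {migrate_key(key): value for key, value in results.items()}


# Made-up sample data; the key layout is an assumption, not extractor output.
old_results = {"lowlevel.scvalleys.mean": [0.1, 0.2], "lowlevel.startFrame": 0}
print(migrate_results(old_results))
```

Matching whole path components rather than substrings keeps a key that already uses spectral_contrast_coeffs from being renamed a second time.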

New extractors
- Add a new extractor for spectrograms and log-energy Mel-spectrograms (streaming_spectrogram).

Python bindings updates
- Add support for Python 3.
- Update all tutorials and code examples to Python 3.
- New essentia.pyutils submodule provides useful functions for a number of use-cases (spectrograms, CQ-grams, batch processing with extractors, etc.).
- Fix a memory bug in Pool on an isSingleValue check in Python.
- Faster VECTOR_VECTOR_REAL conversion from Python types.

Build scripts updates
- Add script for Python packaging (python.py) and wheels.
- Travis CI and build scripts for manylinux wheels.
- Update Waf to 2.0.10.
- The code is now partly C++11.
- Build flags for MSVC.
- Fixes for cross-compilation with Mingw-w64.
- Default --prefix=$VIRTUAL_ENV when inside a virtualenv.
- Read PKG_CONFIG_PATH and add new flag --pkg-config-path for custom lib paths.
- New flag --only-python to build Python extension separately from libessentia.
- Link only to libessentia when building examples.
- Generate a proper essentia.pc pkg-config file.
- Static builds updates.
- Replace LibAv with FFmpeg, build with muxers.
- Update Taglib version to 1.11.1, build with zlib.
- Update Gaia to 2.4.5.

Miscellaneous
- Updated documentation, tutorials, and examples including a significant web redesign.
- Improve build scripts for documentation.
- Every algorithm page now has links to related algorithms.
- An updated list of research works using Essentia.
- New python examples.
- New QA scripts for audio problems detection and HPCPs.
- The usual assortment of code cleanup, updated and expanded unit tests, and better logging (more informative log and exception messages).