Awesome Python Scientific Audio Overview

Curated list of Python software and packages related to scientific research in audio

faroit/awesome-python-scientific-audio · ⭐ 1.4K

Python for Scientific Audio

The aim of this repository is to create a comprehensive, curated list of Python software and tools used for scientific research in audio and music applications.

Audio Related Packages
Tutorials
Books
Scientific Papers
Other Resources
Related lists
Contributing
License

Total number of packages: 66

Read-Write

audiolazy (⭐658) 📦 - Expressive Digital Signal Processing (DSP) package for Python.
audioread (⭐442) 📦 - Cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding.
mutagen (⭐1.3k) 📦 - Reads and writes all kinds of audio metadata for various formats.
pyAV (⭐1.9k) - PyAV is a Pythonic binding for FFmpeg or Libav.
(Py)Soundfile (⭐11) 📦 - Library based on libsndfile, CFFI, and NumPy.
pySox (⭐470) 📦 - Wrapper for sox.
stempeg (⭐79) 📦 - read/write of STEMS multistream audio.
tinytag (⭐606) 📦 - reading music metadata of MP3, OGG, FLAC and Wave files.

Transformations - General DSP

acoustics (⭐449) 📦 - useful tools for acousticians.
AudioTK (⭐241) - DSP filter toolbox (lots of filters).
AudioTSM (⭐75) 📦 - real-time audio time-scale modification procedures.
Gammatone (⭐198) - Gammatone filterbank implementation.
pyFFTW (⭐333) 📦 - Wrapper for FFTW(3).
NSGT (⭐90) 📦 - Non-stationary Gabor transform, constant-Q.
matchering (⭐986) 📦 - Automated reference audio mastering.
MDCT (⭐44) 📦 - MDCT transform.
pydub (⭐7.4k) 📦 - Manipulate audio with a simple and easy high level interface.
pytftb (⭐238) - Implementation of the MATLAB Time-Frequency Toolbox.
pyroomacoustics (⭐1.2k) 📦 - Room Acoustics Simulation (RIR generator)
PyRubberband (⭐133) 📦 - Wrapper for rubberband to do pitch-shifting and time-stretching.
PyWavelets (⭐1.7k) 📦 - Discrete Wavelet Transform in Python.
Resampy (⭐227) 📦 - Sample rate conversion.
SFS-Python (⭐63) 📦 - Sound Field Synthesis Toolbox.
sound_field_analysis (⭐75) 📦 - Analyze, visualize and process sound field data recorded by spherical microphone arrays.
STFT (⭐43) 📦 - Standalone package for Short-Time Fourier Transform.

Feature extraction

aubio (⭐3k) 📦 - Feature extractor, written in C, Python interface.
audioFlux (⭐1.8k) 📦 - A library for audio and music analysis, feature extraction.
audiolazy (⭐658) 📦 - Realtime Audio Processing lib, general purpose.
essentia (⭐2.5k) - Music related low level and high level feature extractor, C++ based, includes Python bindings.
python_speech_features (⭐2.3k) 📦 - Common speech features for ASR.
pyYAAFE (⭐236) - Python bindings for YAAFE feature extractor.
speechpy (⭐879) 📦 - Library for Speech Processing and Recognition, mostly feature extraction for now.
spafe (⭐377) 📦 - Python library for feature extraction from audio files.

Data augmentation

audiomentations (⭐1.4k) 📦 - Audio Data Augmentation.
muda (⭐221) 📦 - Musical Data Augmentation.
pydiogment (⭐76) 📦 - Audio Data Augmentation.

Speech Processing

aeneas (⭐2.2k) 📦 - Forced aligner, based on MFCC+DTW, 35+ languages.
deepspeech (⭐22k) 📦 - Pretrained automatic speech recognition.
gentle (⭐1.3k) - Forced-aligner built on Kaldi.
Parselmouth (⭐894) 📦 - Python interface to the Praat phonetics and speech analysis, synthesis, and manipulation software.
persephone (⭐153) 📦 - Automatic phoneme transcription tool.
pyannote.audio (⭐3.3k) 📦 - Neural building blocks for speaker diarization.
pyAudioAnalysis (⭐5.3k) 📦 - Feature Extraction, Classification, Diarization.
py-webrtcvad (⭐1.7k) 📦 - Interface to the WebRTC Voice Activity Detector.
pypesq (⭐291) - Wrapper for the PESQ score calculation.
pystoi (⭐272) 📦 - Short Term Objective Intelligibility measure (STOI).
PyWorldVocoder (⭐639) - Wrapper for Morise's World Vocoder.
Montreal Forced Aligner (⭐1k) - Forced aligner, based on Kaldi (HMM), English (others can be trained).
SIDEKIT 📦 - Speaker and Language recognition.
SpeechRecognition (⭐7.3k) 📦 - Wrapper for several ASR engines and APIs, online and offline.

Environmental Sounds

sed_eval (⭐119) 📦 - Evaluation toolbox for Sound Event Detection

Perceptual Models - Auditory Models

cochlea (⭐104) 📦 - Inner ear models.
Brian2 (⭐787) 📦 - Spiking neural networks simulator, includes cochlea model.
Loudness (⭐33) - Perceived loudness, includes Zwicker, Moore/Glasberg model.
pyloudnorm (⭐474) - Audio loudness meter and normalization, implements ITU-R BS.1770-4.
Sound Field Synthesis Toolbox (⭐63) 📦 - Sound Field Synthesis Toolbox.

Source Separation

commonfate (⭐17) 📦 - Common Fate Model and Transform.
NTFLib (⭐46) - Sparse Beta-Divergence Tensor Factorization.
NUSSL (⭐531) 📦 - Holistic source separation framework including DSP methods and deep learning methods.
NIMFA (⭐506) 📦 - Several flavors of non-negative matrix factorization.

Music Information Retrieval

Catchy (⭐21) - Corpus Analysis Tools for Computational Hook Discovery.
chord-detection (⭐78) - Algorithms for chord detection and key estimation.
Madmom (⭐1.1k) 📦 - MIR packages with strong focus on beat detection, onset detection and chord recognition.
mir_eval (⭐522) 📦 - Common scores for various MIR tasks. Also includes bss_eval implementation.
msaf (⭐410) 📦 - Music Structure Analysis Framework.
librosa (⭐6k) 📦 - General audio and music analysis.

Deep Learning

Kapre (⭐891) 📦 - Keras Audio Preprocessors
TorchAudio (⭐2.1k) - PyTorch Audio Loaders
nnAudio (⭐867) 📦 - Accelerated audio processing using 1D convolution networks in PyTorch.

Symbolic Music - MIDI - Musicology

Music21 (⭐1.8k) 📦 - Toolkit for Computer-Aided Musicology.
Mido (⭐1.2k) 📦 - Realtime MIDI wrapper.
mingus (⭐785) 📦 - Advanced music theory and notation package with MIDI file and playback support.
Pretty-MIDI (⭐712) 📦 - Utility functions for handling MIDI data in a nice/intuitive way.

Realtime applications

Jupylet (⭐197) - Subtractive, additive, FM, and sample-based sound synthesis.
PYO (⭐1.2k) - Realtime audio dsp engine.
python-sounddevice (⭐836) 📦 - PortAudio wrapper providing realtime audio I/O with NumPy.
ReTiSAR (⭐55) - Binaural rendering of streamed or IR-based high-order spherical microphone array signals.

Web Audio

TimeSide (Beta) (⭐351) - high level audio analysis, imaging, transcoding, streaming and labelling.

Audio Dataset and Dataloaders

beets (⭐12k) 📦 - Music library manager and MusicBrainz tagger.
musdb (⭐130) 📦 - Parse and process the MUSDB18 dataset.
medleydb (⭐157) - Parse medleydb audio + annotations.
Soundcloud API (⭐91) 📦 - Wrapper for Soundcloud API.
Youtube-Downloader (⭐121k) 📦 - Download YouTube videos (and the audio).
audiomate (⭐124) 📦 - Loading different types of audio datasets.
mirdata (⭐290) 📦 - Common loaders for Music Information Retrieval (MIR) datasets.

Wrappers for Audio Plugins

VamPy Host 📦 - Interface for compiled Vamp plugins.

Tutorials

Whirlwind Tour Of Python (⭐3.4k) - fast-paced introduction to Python essentials, aimed at researchers and developers.
Introduction to Numpy and Scipy (⭐3k) - Highly recommended tutorial, covers large parts of the scientific Python ecosystem.
Numpy for MATLAB® Users - Short overview of equivalent Python functions for switchers.
MIR Notebooks (⭐1.1k) - collection of instructional IPython notebooks for music information retrieval (MIR).
Selected Topics in Audio Signal Processing (⭐56) - Exercises as IPython notebooks.
Live-coding a music synthesizer - Live-coding video showing how to use the SoundDevice library to reproduce realistic sounds. Code (⭐17).

Books

Python Data Science Handbook (⭐39k) - Jake VanderPlas, excellent book with accompanying tutorial notebooks.
Fundamentals of Music Processing - Meinard Müller, comes with Python exercises.

Scientific Papers

Python for audio signal processing - John C. Glover, Victor Lazzarini and Joseph Timoney, Linux Audio Conference 2011.
librosa: Audio and Music Signal Analysis in Python, Video - Brian McFee, Colin Raffel, Dawen Liang, Daniel P.W. Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto, SciPy 2015.
pyannote.audio: neural building blocks for speaker diarization, Video - Hervé Bredin, Ruiqing Yin, Juan Manuel Coria, Gregory Gelly, Pavel Korshunov, Marvin Lavechin, Diego Fustes, Hadrien Titeux, Wassim Bouaziz, Marie-Philippe Gill, ICASSP 2020.

Other Resources

Coursera Course - Audio Signal Processing, Python-based course from UPF of Barcelona and Stanford University.
Digital Signal Processing Course - Masters Course Material (University of Rostock) with many Python examples.
Slack Channel - Music Information Retrieval Community.

Related lists

There is already PythonInMusic, but it is not up to date and includes too many special-interest packages that are mostly not relevant for scientific applications. Awesome-Python (⭐174k) is a large curated list of Python packages; however, its audio section is very small.

Contributing

Your contributions are always welcome! Please take a look at the contribution guidelines first.

I will keep some pull requests open if I'm not sure whether those libraries are awesome; you can vote for them by adding 👍 to them.