The International Society for Music Information Retrieval



Full Proceedings

Proceedings of the 22nd International Society for Music Information Retrieval Conference, Online, Nov 7-12, 2021 (ISBN: 978-1-7327299-0-2) [pdf]

Papers
Rohit M A, Amitrajit Bhattacharjee, Preeti Rao
Four-way Classification of Tabla Strokes with Models Adapted from Automatic Drum Transcription 19-26 [pdf]
Taketo Akama
A Contextual Latent Space Model: Subsequence Modulation in Melodic Sequence 27-34 [pdf]
María Alfaro-Contreras, David Rizo, Jose M. Inesta, Jorge Calvo-Zaragoza
OMR-assisted transcription: a case study with early prints 35-41 [pdf]
Stefan A Baumann
Deeper Convolutional Neural Networks and Broad Augmentation Policies Improve Performance in Musical Key Estimation 42-49 [pdf]
Axel Berndt
The Music Performance Markup Format and Ecosystem 50-57 [pdf]
Louis Bigo, David Regnier, Nicolas Martin
Identification of rhythm guitar sections in symbolic tablatures 58-65 [pdf]
Charles Brazier, Gerhard Widmer
On-Line Audio-to-Lyrics Alignment Based on a Reference Performance 66-73 [pdf]
Aaron Carter-Enyi, Gilad Rabinovitch, Nathaniel Condit-Schultz
Visualizing Intertextual Form with Arc Diagrams: Contour and Schema-based Methods 74-80 [pdf]
Francisco J. Castellanos, Antonio-Javier Gallego, Jorge Calvo-Zaragoza
Unsupervised Domain Adaptation for Document Analysis of Music Score Images 81-87 [pdf]
Rodrigo Castellon, Chris Donahue, Percy Liang
Codified audio language modeling learns useful representations for music information retrieval 88-96 [pdf]
Chin-Jui Chang, Chun-Yi Lee, Yi-Hsuan Yang
Variable-Length Music Score Infilling via XLNet and Musically Specialized Positional Encoding 97-104 [pdf]
Yi-Wei Chen, Hung-Shin Lee, Yen-Hsing Chen, Hsin-Min Wang
SurpriseNet: Melody Harmonization Conditioning on User-controlled Surprise Contours 105-112 [pdf]
Vincent K.M. Cheung, Hsuan-Kai Kao, Li Su
Semi-supervised violin fingering generation using variational autoencoders 113-120 [pdf]
Keunwoo Choi, Yuxuan Wang
Listen, Read, and Identify: Multimodal Singing Language Identification of Music 121-127 [pdf]
Shreyan Chowdhury, Gerhard Widmer
On Perceived Emotion in Expressive Piano Performance: Further Experimental Evidence for the Relevance of Mid-level Perceptual Features 128-134 [pdf]
Bas Cornelissen, Willem Zuidema, John Ashley Burgoyne
Cosine Contours: a Multipurpose Representation for Melodies 135-142 [pdf]
Shuqi Dai, Zeyu Jin, Celso Gomes, Roger Dannenberg
Controllable deep melody generation via hierarchical music structure representation 143-150 [pdf]
Emir Demirel, Sven Ahlbäck, Simon Dixon
MSTRE-Net: Multistreaming Acoustic Modeling for Automatic Lyrics Transcription 151-158 [pdf]
Hao-Wen Dong, Chris Donahue, Taylor Berg-Kirkpatrick, Julian McAuley
Towards Automatic Instrumentation by Learning to Separate Parts in Symbolic Multitrack Music 159-166 [pdf]
Sachinda Edirisooriya, Hao-Wen Dong, Julian McAuley, Taylor Berg-Kirkpatrick
An Empirical Evaluation of End-to-End Polyphonic Optical Music Recognition 167-173 [pdf]
Anders Elowsson, Olivier Lartillot
A Hardanger Fiddle Dataset with Performances Spanning Emotional Expressions and Annotations Aligned using Image Registration 174-181 [pdf]
Jeffrey Ens, Philippe Pasquier
Building the MetaMIDI Dataset: Linking Symbolic and Audio Musical Data 182-188 [pdf]
Christoph Finkensiep, Martin A Rohrmeier
Modeling and Inferring Proto-Voice Structure in Free Polyphony 189-196 [pdf]
Francesco Foscarin, Nicolas Audebert, Raphael Fournier-S’Niehotta
PKSpell: Data-Driven Pitch Spelling and Key Signature Estimation 197-204 [pdf]
Dave Foster, Simon Dixon
Filosax: A Dataset of Annotated Jazz Saxophone Recordings 205-212 [pdf]
Giovanni Gabbolini, Derek Bridge
An interpretable music similarity measure based on path interestingness 213-219 [pdf]
Hugo F Flores Garcia, Aldo Aguilar, Ethan Manilow, Bryan Pardo
Leveraging Hierarchical Structures for Few-Shot Musical Instrument Recognition 220-228 [pdf]
Mark R H Gotham, Rainer Kleinertz, Christof Weiss, Meinard Müller, Stephanie Klauk
What if the ‘When’ Implies the ‘What’?: Human harmonic analysis datasets clarify the relative role of the separate steps in automatic tonal analysis 229-236 [pdf]
Juan S. Gómez-Cañón, Estefania Cano, Yi-Hsuan Yang, Perfecto Herrera, Emilia Gomez
Let’s agree to disagree: Consensus Entropy Active Learning for Personalized Music Emotion Recognition 237-245 [pdf]
Curtis Hawthorne, Ian Simon, Rigel Swavely, Ethan Manilow, Jesse Engel
Sequence-to-Sequence Piano Transcription with Transformers 246-253 [pdf]
Ben Hayes, Charalampos Saitis, Gyorgy Fazekas
Neural Waveshaping Synthesis 254-261 [pdf]
Johannes Hentschel, Fabian C. Moss, Markus Neuwirth, Martin A Rohrmeier
A semi-automated workflow paradigm for the distributed creation and curation of expert annotations 262-269 [pdf]
Mojtaba Heydari, Frank Cwitkowitz, Zhiyao Duan
BeatNet: CRNN and Particle Filtering for Online Joint Beat, Downbeat and Meter Tracking 270-277 [pdf]
Yuki Hiramatsu, Eita Nakamura, Kazuyoshi Yoshii
Joint Estimation of Note Values and Voices for Audio-to-Score Piano Transcription 278-284 [pdf]
Yo-Wei Hsiao, Li Su
Learning note-to-note affinity for voice segregation and melody line identification of symbolic music data 285-292 [pdf]
Jui-Yang Hsu, Li Su
VOCANO: A note transcription framework for singing voice in polyphonic music 293-300 [pdf]
Rujing Huang, Bob L. T. Sturm, Andre Holzapfel
De-centering the West: East Asian Philosophies and the Ethics of Applying Artificial Intelligence to Music 301-309 [pdf]
Tun Min Hung, Bo-Yu Chen, Yen Tung Yeh, Yi-Hsuan Yang
A Benchmarking Initiative for Audio-domain Music Generation using the FreeSound Loop Dataset 310-317 [pdf]
Hsiao-Tzu Hung, Joann Ching, Seungheon Doh, Nabin Kim, Juhan Nam, Yi-Hsuan Yang
EMOPIA: A Multi-Modal Pop Piano Dataset For Emotion Recognition and Emotion-based Music Generation 318-325 [pdf]
Kevin Ji, Daniel Yang, Timothy Tsai
Piano Sheet Music Identification Using Marketplace Fingerprinting 326-333 [pdf]
Keunhyoung Kim, Jongpil Lee, Sangeun Kum, Juhan Nam
Learning a cross-domain embedding space of vocal and mixed audio with a structure-preserving triplet loss 334-341 [pdf]
Qiuqiang Kong, Yin Cao, Haohe Liu, Keunwoo Choi, Yuxuan Wang
Decoupling Magnitude and Phase Estimation with Deep ResUNet for Music Source Separation 342-349 [pdf]
Filip Korzeniowski, Sergio Oramas, Fabien Gouyon
Artist Similarity Using Graph Neural Networks 350-357 [pdf]
Jin Ha Lee, Arpita Bhattacharya, Ria Antony, Nicole Santero, Anh Le
“Finding Home”: Understanding How Music Supports Listeners’ Mental Health through a Case Study of BTS 358-365 [pdf]
Harin Lee, Frank Höger, Marc Schönwiesner, Minsu Park, Nori Jacoby
Cross-cultural Mood Perception in Pop Songs and its Alignment with Mood Detection Algorithms 366-373 [pdf]
Jordan Lenchitz
Reconsidering quantization in MIR 374-380 [pdf]
Liwei Lin, Gus Xia, Qiuqiang Kong, Junyan Jiang
A unified model for zero-shot music source separation, transcription and synthesis 381-388 [pdf]
Carlos Lordelo, Emmanouil Benetos, Simon Dixon, Sven Ahlbäck
Pitch-Informed Instrument Assignment using a Deep Convolutional Network with Multiple Kernel Shapes 389-395 [pdf]
Wei-Tsung Lu, Ju-Chiang Wang, Minz Won, Keunwoo Choi, Xuchen Song
SpecTNT: a Time-Frequency Transformer for Music Audio 396-403 [pdf]
Néstor Nápoles López, Mark R H Gotham, Ichiro Fujinaga
AugmentedNet: A Roman Numeral Analysis Network with Synthetic Training Examples and Additional Tonal Tasks 404-411 [pdf]
Vincenzo Madaghiele, Pasquale Lisena, Raphael Troncy
MINGUS: Melodic Improvisation Neural Generator Using Seq2Seq 412-419 [pdf]
Ninon Lizé Masclef, Andrea Vaglio, Manuel Moussallam
User-centered evaluation of lyrics-to-audio alignment 420-427 [pdf]
Naotake Masuda, Daisuke Saito
Synthesizer Sound Matching with Differentiable DSP 428-434 [pdf]
Andrew McLeod, Martin A Rohrmeier
A Modular System for the Harmonic Analysis of Musical Scores using a Large Vocabulary 435-442 [pdf]
Gianluca Micchi, Katerina Kosta, Gabriele Medeot, Pierre Chanquion
A deep learning method for enforcing coherence in Automatic Chord Recognition 443-451 [pdf]
Martin A Miguel, Diego Fernandez Slezak
Modeling beat uncertainty as a 2D distribution of period and phase: a MIR task proposal 452-459 [pdf]
Olof Misgeld, Torbjörn L Gulz, Jūra Miniotaitė, Andre Holzapfel
A case study of deep enculturation and sensorimotor synchronization to real music 460-467 [pdf]
Gautam Mittal, Jesse Engel, Curtis Hawthorne, Ian Simon
Symbolic Music Generation with Diffusion Models 468-475 [pdf]
Faraaz Nadeem
Learning from Musical Feedback with Sonic the Hedgehog 476-483 [pdf]
Javier Nistal, Stefan Lattner, Gaël Richard
DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis With GANs 484-492 [pdf]
Takehisa Oyama, Ryoto Ishizuka, Kazuyoshi Yoshii
Phase-Aware Joint Beat and Downbeat Estimation Based on Periodicity of Metrical Structure 493-499 [pdf]
Yuto Ozaki, John M McBride, Emmanouil Benetos, Peter Pfordresher, Joren Six, Adam Tierney, Polina Proutskova, Emi Sakai, Haruka Kondo, Haruno Fukatsu, Shinya Fujii, Patrick E. Savage
Agreement Among Human and Automated Transcriptions of Global Songs 500-508 [pdf]
Emilia Parada-Cabaleiro, Maximilian Schmitt, Anton Batliner, Bjorn W. Schuller, Markus Schedl
Automatic Recognition of Texture in Renaissance Music 509-516 [pdf]
Ashis Pati, Alexander Lerch
Is Disentanglement enough? On Latent Representations for Controllable Music Generation 517-524 [pdf]
Nicolás Pironio, Diego Fernandez Slezak, Martin A Miguel
Pulse clarity metrics developed from a deep learning beat tracking model 525-530 [pdf]
Verena Praher, Katharina Prinz, Arthur Flexer, Gerhard Widmer
On the Veracity of Local, Model-agnostic Explanations in Audio Classification: Targeted Investigations with Adversarial Examples 531-538 [pdf]
Laure Prétet, Gaël Richard, Geoffroy Peeters
Is there a “language of music-video clips”? A qualitative and quantitative study 539-546 [pdf]
Gowriprasad R, Venkatesh V, Hema A Murthy, R Aravind, Sri Rama Murty K
Tabla Gharana Recognition from Audio music recordings of Tabla Solo performances 547-554 [pdf]
Lindsey Reymore, Emmanuelle Beauvais-Lacasse, Bennett Smith, Stephen McAdams
Navigating noise: Modeling perceptual correlates of noise-related semantic timbre categories with audio features 555-561 [pdf]
Kyle Robinson, Dan Brown
Quantitative User Perceptions of Music Recommendation List Diversity 562-568 [pdf]
Martin A Rohrmeier, Fabian C. Moss
A Formal Model of Extended Tonal Harmony 569-578 [pdf]
Simon Rouard, Gaëtan Hadjeres
CRASH: Raw Audio Score-based Generative Modeling for Controllable High-resolution Drum Sound Synthesis 579-585 [pdf]
Luke O Rowe, George Tzanetakis
Curriculum Learning for Imbalanced Classification in Large Vocabulary Automatic Chord Recognition 586-593 [pdf]
Justin Salamon, Oriol Nieto, Nicholas J. Bryan
Deep Embeddings and Section Fusion Improve Music Segmentation 594-601 [pdf]
Antonia Saravanou, Federico Tomasi, Rishabh Mehrotra, Mounia Lalmas
Multi-Task Learning of Graph-based Inductive Representations of Music Content 602-609 [pdf]
Pedro Pereira Sarmento, Adarsh Kumar, CJ Carr, Zack Zukowski, Mathieu Barthet, Yi-Hsuan Yang
DadaGP: A Dataset of Tokenized GuitarPro Songs for Sequence Models 610-617 [pdf]
Harald Victor Schweiger, Emilia Parada-Cabaleiro, Markus Schedl
Does Track Sequence in User-generated Playlists Matter? 618-625 [pdf]
Simon J Schwär, Sebastian Rosenzweig, Meinard Müller
A Differentiable Cost Measure for Intonation Processing in Polyphonic Music 626-633 [pdf]
Pavan M Seshadri, Alexander Lerch
Improving Music Performance Assessment With Contrastive Learning 634-641 [pdf]
Dougal Shakespeare, Camille Roth
Tracing Affordance and Item Adoption on Music Streaming Platforms 642-649 [pdf]
Zhengshan Shi
Computational analysis and modeling of expressive timing in Chopin’s Mazurkas 650-656 [pdf]
Nithya Nadig Shikarpur, Asawari Keskar, Preeti Rao
Computational analysis of melodic mode switching in raga performance 657-664 [pdf]
Qingwei Song, Qiwei Sun, Dongsheng Guo, Haiyong Zheng
SinTra: Learning an inspiration model from a single multi-track music segment 665-672 [pdf]
Janne Spijkervet, John Ashley Burgoyne
Contrastive Learning of Musical Representations 673-681 [pdf]
Xiaoheng Sun, Qiqi He, Gao Yongwei, Wei Li
Musical Tempo Estimation Using a Multi-scale Network 682-689 [pdf]
Pau Torras, Arnau Baró, Lei Kang, Alicia Fornés
On the Integration of Language Models into Sequence to Sequence Architectures for Handwritten Music Recognition 690-696 [pdf]
Kosetsu Tsukuda, Keisuke Ishida, Masahiro Hamasaki, Masataka Goto
Kiite Cafe: A Web Service for Getting Together Virtually to Listen to Music 697-704 [pdf]
Kosetsu Tsukuda, Masahiro Hamasaki, Masataka Goto
Toward an Understanding of Lyrics-viewing Behavior While Listening to Music on a Smartphone 705-713 [pdf]
Andrea Vaglio, Romain Hennequin, Manuel Moussallam, Gael Richard
The Words Remain the Same: Cover Detection with Lyrics Transcription 714-721 [pdf]
Ziyu Wang, Gus Xia
MuseBERT: Pre-training Music Representation for Music Understanding and Controllable Generation 722-729 [pdf]
Ju-Chiang Wang, Jordan B. L. Smith, Wei-Tsung Lu, Xuchen Song
Supervised Metric Learning For Music Structure Features 730-737 [pdf]
Shiqi Wei, Gus Xia
Learning long-term music representations via hierarchical contextual constraints 738-745 [pdf]
Christof Weiss, Johannes Zeitler, Tim Zunner, Florian Schuberth, Meinard Müller
Learning Pitch-Class Representations from Score-Audio Pairs of Classical Music 746-753 [pdf]
Christof Weiss, Geoffroy Peeters
Training Deep Pitch-Class Representations With a Multi-Label CTC Loss 754-761 [pdf]
Daniel Wolff, Remi Mignot, Axel Roebel
Audio Defect Detection in Music with Deep Networks 762-768 [pdf]
Minz Won, Keunwoo Choi, Xavier Serra
Semi-supervised Music Tagging Transformer 769-776 [pdf]
Minz Won, Justin Salamon, Nicholas J. Bryan, Gautham Mysore, Xavier Serra
Emotion Embedding Spaces for Matching Music to Stories 777-785 [pdf]
Abudukelimu Wuerkaixi, Christodoulos Benetatos, Zhiyao Duan, Changshui Zhang
CollageNet: Fusing arbitrary melody and accompaniment into a coherent song 786-793 [pdf]
Kazuhiko Yamamoto
Human-in-the-Loop Adaptation for Interactive Musical Beat Tracking 794-801 [pdf]
Daniel Yang, Timothy Tsai
Composer Classification With Cross-Modal Transfer Learning and Musically-Informed Augmentation 802-809 [pdf]
Daniel Yang, Kevin Ji, Timothy Tsai
Aligning Unsynchronized Part Recordings to a Full Mix Using Iterative Subtractive Alignment 810-817 [pdf]
Mickael Zehren, Marco Alunno, Paolo Bientinesi
ADTOF: A large dataset of non-synthetic music for automatic drum transcription 818-824 [pdf]
Huan Zhang, Yiliang Jiang, Tao Jiang, Hu Peng
Learn by Referencing: Towards Deep Metric Learning for Singing Assessment 825-832 [pdf]
Jingwei Zhao, Gus Xia
AccoMontage: Accompaniment Arrangement via Phrase Selection and Style Transfer 833-840 [pdf]