ATTENTION: This is a web archive! The IMS Group was split up in 2018 and does not exist anymore. Recent work of former members can be found at the VR/AR Group and the Computer Vision Group.

Interactive Media Systems, TU Wien

Optical Music Recognition in Mensural Notation with Region-Based Convolutional Neural Networks

By Alexander Pacha and Jorge Calvo-Zaragoza

Abstract

In this work, we present an approach for the task of optical music recognition (OMR) using deep neural networks. Our intention is to simultaneously detect and categorize musical symbols in handwritten scores written in mensural notation. We propose the use of region-based convolutional neural networks, which are trained in an end-to-end fashion for that purpose. Additionally, we make use of a convolutional neural network that predicts the relative position of a detected symbol within the staff, so that we cover the entire image-processing part of the OMR pipeline. This strategy is evaluated over a set of 60 ancient scores in mensural notation, with more than 15,000 annotated symbols belonging to 32 different classes. The results reflect the feasibility and capability of this approach, with a weighted mean average precision of around 76% for symbol detection, and over 98% accuracy for predicting the position.
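The abstract reports a weighted mean average precision but does not define the weighting. The sketch below shows one common reading, in which each class's average precision is weighted by its number of annotated instances. That weighting scheme is an assumption, not something confirmed here. The function name, class names, and numbers are hypothetical and are not results from the paper.

```python
from typing import Mapping


def weighted_mean_average_precision(
    ap_per_class: Mapping[str, float],
    count_per_class: Mapping[str, int],
) -> float:
    """Average the per-class AP values, weighting each class by its instance count.

    Assumption: "weighted" means weighted by the number of annotated symbols
    per class, so that frequent classes contribute more to the final score.
    """
    total = sum(count_per_class[name] for name in ap_per_class)
    if total == 0:
        raise ValueError("at least one class must have annotated instances")
    weighted_sum = sum(
        ap_per_class[name] * count_per_class[name] for name in ap_per_class
    )
    return weighted_sum / total


# Illustrative values only (hypothetical classes, not taken from the paper).
ap = {"note": 0.90, "rest": 0.60, "clef": 0.30}
counts = {"note": 70, "rest": 20, "clef": 10}

wmap = weighted_mean_average_precision(ap, counts)
unweighted = sum(ap.values()) / len(ap)
print(f"weighted mAP: {wmap:.2f}, unweighted mAP: {unweighted:.2f}")
```

With a skewed class distribution, as is typical for music symbols, the weighted score is dominated by the frequent classes, which is why it can differ noticeably from the plain mean over classes.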

Reference

A. Pacha, J. Calvo-Zaragoza: "Optical Music Recognition in Mensural Notation with Region-Based Convolutional Neural Networks"; Talk: 19th International Society for Music Information Retrieval Conference, Paris, France; 09-23-2018 - 09-27-2018; in: "Proceedings of the 19th International Society for Music Information Retrieval Conference", (2018), 240 - 247.
