
Deep Learning for Music Information Retrieval: Generative Models

Website for the graduate-level course covering deep learning theory, literature, and practice as applied to digital audio. More specifically, the course covers generative models (including autoencoders).

Taught at the National Autonomous University of Mexico (UNAM) during the spring semester of 2022 (January 31st - May 27th) by Iran R. Roman.

Course content

  1. Introduction and digital audio review
  2. Audio features and musical objects
  3. Datasets and dimensionality reduction
  4. Cross-validation and linear regression
  5. Logistic regression, binary cross-entropy, and evaluation metrics
  6. Softmax classification
  7. Feedforward neural networks
  8. CNNs and optimization techniques
  9. Autoencoders
  10. VAEs and GANs
  11. Transformers and RNNs

Prerequisites

This is a graduate-level course that assumes knowledge of digital audio signal processing, object-oriented programming (we will work with Python 3), differential calculus (chain rule), linear algebra, and basic probability/statistics. To ensure that everybody is on the same page, we will review these concepts as they become relevant to course content. However, if you have never been exposed to these concepts, this course will likely be more challenging than it has to be.

If you need to review these concepts, check out the following:

Or search online for other materials covering these concepts.

Course logistics

The course runs from January 31st to May 27th (2022) and meets Wednesdays from 5PM to 8PM (Eastern Standard Time) over Zoom.

If you are not affiliated with UNAM and are interested in joining the course (it’s free, by the way), please email the instructor. The course will welcome all students who can meet the course prerequisites.

All course materials are in English (since all relevant literature in this field of research is in English). However, the lectures will be delivered in Spanish (remember the class is taught at UNAM). Some guest lectures will be delivered in English. Strong knowledge of English reading, writing, and speaking is assumed.

Submit your homework by sending an email to the instructor with ALL the relevant files.

Getting help

Post your questions on the course subreddit, deeplearningaudio.


© Iran R. Roman 2022