Yusong Wu (吴雨松)

PhD candidate at Mila, University of Montreal, focusing on audio understanding, music generation, and real-time interactive models in music. I will be looking for an industry position in late 2025.

About Me


I am a PhD candidate in Computer Science at the University of Montreal and Mila. I am fortunate to be co-advised by Professor Aaron Courville and Professor Chengzhi Anna Huang. My doctoral research focuses on audio understanding, music generation, creative and compositional generative models, and real-time interactive models in music. I am a percussionist and have performed timpani in orchestral settings; in my spare time, I also enjoy playing the guitar and harmonica.
I will be looking for an industry position in late 2025.

Selected Publications and Manuscripts


Projects


Generative Adversarial Post-Training (GAPT)

Streaming Generation for Music Accompaniment

FLAM: Frame-wise Language-Audio Modeling

ReaLchords and ReaLJam: Real-time Melody-to-chord Accompaniment via RL

CLAP: Large-scale Contrastive Language-audio Model

MusicLDM: Text-to-Music Generation with Mixup

3rd Place at AI Song Contest 2022

Hierarchical Music Generation with Detailed Control

Automatic Audio Captioning with Transformer

Expressive Peking Opera Synthesis

Chinese Guqin Dataset

Experience


Mila, University of Montreal

PhD Candidate in Computer Science

Conducting research on real-time, interactive music accompaniment models using reinforcement learning (RL) and multi-agent RL (MARL).

Interactive Music · Generative Models · Reinforcement Learning

Adobe Research, Co-Creation for Audio, Video, & Animation

Student Researcher

Researched open-vocabulary audio event localization techniques conditioned on text prompts.

Multi-modal Representation Learning · Open-vocabulary Sound Event Detection

Google DeepMind, Magenta Team

Student Researcher

Developed a reinforcement learning-based system, ReaLchords, for real-time melody-to-chord accompaniment, and built ReaLJam, an interactive framework that enables delay-tolerant inference and anticipatory output visualization.

Real-Time Music Interaction · Generative Models

Mila, University of Montreal

Research Master's in Computer Science

Worked on hierarchical music generation models with detailed control.

  • Proposed MIDI-DDSP, a model that controls musical performance and synthesis through a hierarchical representation.
  • MIDI-DDSP enables high-fidelity audio reconstruction, accurate performance prediction, and novel audio generation from note sequences.
Hierarchical Music Models · Audio and Symbolic Music Generation

Tencent AI Lab

Research Intern

Developed expressive singing voice synthesis methods and explored dynamic vocal modeling.

Singing Voice Synthesis

Beijing University of Posts and Telecommunications

Research focused on audio captioning, symbolic music datasets, and computational musicology.

  • Placed 2nd in the DCASE 2020 Challenge for Automatic Audio Captioning.
  • Developed the Chinese Guqin Dataset, a symbolic music dataset.
  • Proposed statistical approaches to distinguish musical genres in computational musicology.
Music Generation · Automatic Audio Captioning · Computational Musicology

Recent News


  • May 2025 — Guest lecture on ReaLchords and ReaLJam in courses taught by Prof. Julian McAuley and Prof. Shlomo Dubnov.
  • Apr 2025 — Invited talk on ReaLchords at the Musical AI Agent meeting, SFU.
  • Dec 2024 — Invited talk on neural audio codecs at Prof. Dina Katabi's lab meeting, MIT.
  • Mar 2024 — Poster presentation on ReaLchords at NEMISIG 2024.
  • 2023 — Released PianorollVis.js, a JavaScript library for piano-roll visualization.
  • 2022 — Delivered a tutorial on designing controllable synthesis systems for musical signals at ISMIR 2022.

Contact