Most real-time procedural audio visualization, such as the imagery generated by the visualizers in mainstream music players or conventional VJ software, represents little more than the beat and amplitude of the music. This is largely because higher-level audio features are difficult to identify and quantify in real time. My Master's Research Project addressed this limitation by developing a suite of analysis tools that extract meaningful musical characteristics from live audio signals. By tracking significant deviations across multiple audio descriptors over extended time windows, the framework detects and predicts audibly distinct segments within an audio stream, enabling visual effects to be triggered dynamically at significant moments during a live musical performance.

A central focus of the project was capturing the musical arc of tension and release, a key element of compelling music visualization. To achieve this, I developed a metric for 'relative complexity': a continuously evolving value derived by aggregating detected onsets across specific frequency bands. This metric makes it possible to identify rising musical intensity.

A practical application of this metric is detecting the build-up and release of major bass drops. When a sustained increase in relative complexity sharply breaks and coincides with a spike in average spectral centroid, the system becomes primed to watch for a third condition: a substantial surge in low-frequency volume. When all three conditions are met, the framework reliably flags a high-impact moment. This behavior was tested and validated on mainstream Pop and EDM tracks, where such dynamics are especially pronounced. Together with other common audio descriptors, the metric allows visualizations to respond not only to rhythm but also to deeper structural and expressive changes within the music.

The framework has been successfully adapted for both DMX lighting systems and real-time rendered visuals, enabling synchronized, reactive experiences without the need for timecoding or manual intervention. To protect ongoing commercialization efforts, the project is not currently available for public distribution. Click here to preview a visualization using the framework.

Software Used: MaxMSP (Bonk~, Digital Orchestra Toolbox, DMXUSBPro, Zsa.Descriptors), Unity (extOSC)
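To make the three-condition drop detection described above concrete, here is a minimal Python sketch of the priming logic. The class name DropDetector, the per-frame inputs (an onset count across bands, a spectral centroid value, and a low-band energy value, which the original patch would derive from objects such as Bonk~ and Zsa.Descriptors in MaxMSP), and every window size and threshold are hypothetical illustrations of the idea, not the project's actual implementation.

```python
from collections import deque
from statistics import mean, pstdev


class DropDetector:
    """Sketch of three-condition bass-drop detection from per-frame features.

    All parameter values below are illustrative guesses, not values
    taken from the original framework.
    """

    def __init__(self, window=64, rise_frames=24, break_frames=8,
                 rise_factor=1.5, break_drop=0.25,
                 centroid_sigma=2.5, bass_ratio=2.0, prime_frames=16):
        self.window = window                  # frames of history kept
        self.rise_frames = rise_frames        # length of the build-up ramp
        self.break_frames = break_frames      # how recently the ramp must break
        self.rise_factor = rise_factor        # required growth over the ramp
        self.break_drop = break_drop          # fractional fall counting as a break
        self.centroid_sigma = centroid_sigma  # centroid spike threshold (std devs)
        self.bass_ratio = bass_ratio          # low-band surge vs. recent mean
        self.prime_frames = prime_frames      # how long the primed state lasts
        self.primed = 0                       # frames remaining in primed state
        self.complexity = 0.0                 # smoothed "relative complexity"
        self.history = deque(maxlen=window)   # complexity trace
        self.centroids = deque(maxlen=window) # spectral-centroid trace
        self.low = deque(maxlen=window)       # low-band energy trace

    def update(self, band_onsets, centroid, low_rms):
        """Feed one analysis frame; returns True when a drop is flagged."""
        # Relative complexity: a leaky accumulator of onsets across bands,
        # so dense onset activity pushes the value up and silence decays it.
        self.complexity = 0.95 * self.complexity + band_onsets
        self.history.append(self.complexity)
        self.centroids.append(centroid)
        self.low.append(low_rms)
        if len(self.history) < self.window:
            return False  # not enough context yet

        trace = list(self.history)
        peak_idx = max(range(len(trace)), key=trace.__getitem__)
        peak = trace[peak_idx]

        # Condition 1: a sustained rise in complexity that sharply breaks.
        # The peak must be recent, preceded by a long ramp, and the current
        # value must have fallen well below it.
        ramp_start = peak_idx - self.rise_frames
        sustained_rise = (
            ramp_start >= 0
            and len(trace) - peak_idx <= self.break_frames
            and peak >= self.rise_factor * max(trace[ramp_start], 1e-9)
        )
        sharp_break = trace[-1] <= (1.0 - self.break_drop) * peak

        # Condition 2: a spike in spectral centroid relative to its
        # recent distribution.
        prior_c = list(self.centroids)[:-1]
        spread = pstdev(prior_c)
        centroid_spike = spread > 0 and (
            self.centroids[-1] > mean(prior_c) + self.centroid_sigma * spread
        )

        # Conditions 1 + 2 "prime" the detector for a limited number of
        # frames; the drop itself is flagged only when condition 3, the
        # low-frequency surge, arrives while primed.
        if sustained_rise and sharp_break and centroid_spike:
            self.primed = self.prime_frames

        if self.primed > 0:
            self.primed -= 1
            prior_low = list(self.low)[:-1]
            if self.low[-1] > self.bass_ratio * max(mean(prior_low), 1e-9):
                self.primed = 0
                return True
        return False
```

In a live setting, update() would be called once per analysis frame, for example from an OSC callback, with a True result used to fire a DMX lighting cue or a visual event in Unity.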