Python Audio Visualization Cheat Sheet #
Create stunning audio visualizations in Python using librosa for audio analysis and moviepy for video creation. This guide provides a complete workflow from loading an audio file to exporting a video.
1. Installation #
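Install the required libraries: librosa for audio analysis, moviepy for video assembly, matplotlib for plotting, numpy for array math, and tqdm for the progress bar used in the complete example. The code in this guide uses the MoviePy 1.x API (moviepy.editor, set_audio, subclip), which MoviePy 2.0 changed, so pin the older release.
pip install librosa "moviepy<2" matplotlib numpy tqdm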
2. Audio Processing with Librosa #
librosa is the core library for analyzing audio and extracting features.
Loading Audio #
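Load an audio file as a floating-point time series (y) together with its sample rate (sr). By default librosa.load resamples to 22050 Hz and mixes down to mono; pass sr=None to keep the file's native sample rate.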
import librosa
file_path = 'your_audio.mp3'
y, sr = librosa.load(file_path)
# y: numpy array with the audio waveform
# sr: sample rate (e.g., 22050 Hz)
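Because y is a plain NumPy array, the clip length and the number of video frames it needs follow directly from len(y) and sr. A minimal sketch, using a synthetic sine tone as a stand-in for the output of librosa.load; the frames_needed helper and the 30 fps target are illustrative assumptions, not part of librosa:

```python
import math
import numpy as np

def frames_needed(y, sr, fps=30):
    """Return (duration in seconds, number of video frames needed to cover it)."""
    duration = len(y) / sr
    return duration, math.ceil(duration * fps)

# Stand-in for `y, sr = librosa.load(...)`: two seconds of a 440 Hz tone
sr = 22050
t = np.arange(2 * sr) / sr
y = 0.5 * np.sin(2 * np.pi * 440 * t).astype(np.float32)

duration, n_frames = frames_needed(y, sr)
print(duration, n_frames)
```

Knowing the frame count up front makes it easier to estimate rendering time and memory before generating any images.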
Feature Extraction #
Analyze the audio to extract meaningful features that can drive the visualization.
- Spectrogram: A visual representation of the spectrum of frequencies as they vary with time.
  import numpy as np
  D = librosa.stft(y)
  S_db = librosa.amplitude_to_db(np.abs(D), ref=np.max)
- Beat Tracking: Find the tempo and the frames where beats occur.
  tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
  beat_times = librosa.frames_to_time(beat_frames, sr=sr)
- Harmonic-Percussive Separation: Separate the audio into harmonic (tonal) and percussive (rhythmic) components.
  y_harmonic, y_percussive = librosa.effects.hpss(y)
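One way to let beats drive a visual is to turn the beat times into a per-frame intensity that jumps to 1.0 on each beat and fades afterwards. The sketch below is one possible approach, not a librosa feature; beat_pulse, its decay parameter, and the sample beat times are hypothetical, and it assumes the beat times are sorted in ascending order.

```python
import numpy as np

def beat_pulse(beat_times, n_frames, fps=30, decay=0.2):
    """Per-frame intensity: 1.0 on each beat, decaying exponentially after it."""
    frame_times = np.arange(n_frames) / fps
    beat_times = np.asarray(beat_times, dtype=float)
    pulse = np.zeros(n_frames)
    if beat_times.size == 0:
        return pulse
    # Index of the most recent beat at or before each frame (-1 if none yet)
    idx = np.searchsorted(beat_times, frame_times, side='right') - 1
    has_beat = idx >= 0
    elapsed = frame_times[has_beat] - beat_times[idx[has_beat]]
    # decay is the time constant in seconds (an illustrative default)
    pulse[has_beat] = np.exp(-elapsed / decay)
    return pulse

# Hypothetical beat times in seconds, standing in for `beat_times` above
pulse = beat_pulse([0.5, 1.0, 1.5], n_frames=60, fps=30)
```

The resulting array can scale a line width, a marker size, or a background brightness when each frame is drawn.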
3. Generating Visualization Frames #
Use matplotlib to render one image per video frame. These images are later compiled into a video.
Plotting with librosa.display #
librosa.display provides easy-to-use functions for plotting audio data.
import matplotlib.pyplot as plt
import librosa.display
fig, ax = plt.subplots(figsize=(10, 4))
librosa.display.waveshow(y, sr=sr, ax=ax)
ax.set_title('Waveform')
plt.show()
Creating a Custom Visualizer Frame #
For a dynamic video, you’ll generate one image per audio frame in a loop. Here’s how to capture a Matplotlib figure as a NumPy array without displaying it.
import numpy as np
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas

def create_frame(S_frame):
    # Build the figure directly (not through pyplot) so nothing is displayed
    fig = Figure(figsize=(5, 5), dpi=100)
    canvas = FigureCanvas(fig)
    ax = fig.gca()
    # Your custom plotting logic here
    ax.plot(S_frame, color='cyan')
    ax.axis('off')
    fig.tight_layout(pad=0)
    # Render the canvas and copy the RGB channels into a numpy array.
    # buffer_rgba() is used because tostring_rgb() is deprecated and
    # has been removed from recent Matplotlib releases.
    canvas.draw()
    image = np.asarray(canvas.buffer_rgba())[..., :3].copy()
    return image
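The "custom plotting logic" is often a bar display rather than a raw line plot of every frequency bin. A minimal sketch of one way to reduce a spectrum column to a handful of bar heights follows; spectrum_to_bars, its log-spaced banding, and the -60 dB floor are illustrative choices rather than anything prescribed by librosa or matplotlib. It assumes ref is a track-wide reference magnitude such as S.max(), so that silent frames produce empty bars.

```python
import numpy as np

def spectrum_to_bars(S_frame, ref, n_bars=16, floor_db=-60.0):
    """Collapse one magnitude-spectrum column into n_bars heights in [0, 1]."""
    S_frame = np.asarray(S_frame, dtype=float)
    n_bins = S_frame.shape[0]
    if not 0 < n_bars < n_bins:
        raise ValueError("n_bars must be positive and smaller than the number of bins")
    # Log-spaced band edges starting at bin 1 (bin 0 is the DC component)
    edges = np.round(np.geomspace(1, n_bins, n_bars + 1)).astype(int)
    # Force strictly increasing edges so every band holds at least one bin
    for k in range(1, edges.size):
        edges[k] = max(edges[k], edges[k - 1] + 1)
    bands = np.array([S_frame[a:b].mean() for a, b in zip(edges[:-1], edges[1:])])
    # Decibels relative to the reference magnitude, rescaled so that
    # floor_db maps to 0.0 and the reference level maps to 1.0
    db = 20.0 * np.log10(np.maximum(bands, 1e-10) / max(ref, 1e-10))
    return np.clip((db - floor_db) / -floor_db, 0.0, 1.0)

# Hypothetical input: a 1025-bin column with one strong peak at bin 100
S_frame = np.full(1025, 1e-4)
S_frame[100] = 1.0
bars = spectrum_to_bars(S_frame, ref=1.0)
```

Inside create_frame, the heights could then be drawn with ax.bar(range(len(bars)), bars) and a fixed ax.set_ylim(0, 1) so the scale does not jump between frames.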
4. Video Creation with MoviePy #
moviepy can take a sequence of images (as file paths or NumPy arrays) and compile them into a video file.
Creating a Clip from Images #
The ImageSequenceClip class is perfect for this task. It takes a list of image frames and a desired frames-per-second (fps) rate.
# MoviePy 1.x import path; MoviePy 2.x exposes these classes from the top-level moviepy package
from moviepy.editor import ImageSequenceClip, AudioFileClip
# Assume `image_frames` is a list of numpy arrays from your visualizer
fps = 30 # Frames per second
video_clip = ImageSequenceClip(image_frames, fps=fps)
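Frames with mismatched sizes or dtypes are a common cause of errors at this step, so it can help to validate the list before building the clip. The check below is a sketch under the assumption that frames are RGB uint8 arrays like those returned by create_frame; check_frames is a hypothetical helper, not a MoviePy function.

```python
import numpy as np

def check_frames(frames):
    """Raise ValueError unless all frames are same-sized (H, W, 3) uint8 arrays."""
    if len(frames) == 0:
        raise ValueError("no frames to encode")
    expected = frames[0].shape
    for i, frame in enumerate(frames):
        if frame.dtype != np.uint8:
            raise ValueError(f"frame {i} has dtype {frame.dtype}, expected uint8")
        if frame.ndim != 3 or frame.shape[2] != 3:
            raise ValueError(f"frame {i} has shape {frame.shape}, expected (H, W, 3)")
        if frame.shape != expected:
            raise ValueError(f"frame {i} has shape {frame.shape}, expected {expected}")
    height, width = expected[:2]
    return width, height

# Hypothetical stand-in for `image_frames`: three blank 500x500 RGB frames
image_frames = [np.zeros((500, 500, 3), dtype=np.uint8) for _ in range(3)]
size = check_frames(image_frames)
```

Running the check once before encoding surfaces a bad frame by index instead of failing somewhere inside the video writer.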
Adding Audio and Exporting #
Set the audio of your video clip to the original audio file and write the final output.
audio_clip = AudioFileClip(file_path)
final_clip = video_clip.set_audio(audio_clip)
final_clip.write_videofile(
    'output_video.mp4',
    codec='libx264',
    audio_codec='aac',
    fps=fps
)
5. Complete Workflow Example #
Here’s a simplified workflow to tie everything together.
import librosa
import numpy as np
from moviepy.editor import ImageSequenceClip, AudioFileClip  # MoviePy 1.x import path
from tqdm import tqdm
# (Include the create_frame function from above)
# 1. Load Audio
# Tip: pass duration=30 to librosa.load to limit memory use while experimenting
audio_path = librosa.example('nutcracker')
y, sr = librosa.load(audio_path)
# 2. Analyze Audio
fps = 30
frame_len = int(sr / fps)  # audio samples per video frame
# Use frame_len as the STFT hop so each STFT column maps to exactly one
# video frame; the default hop (512 samples) would drift out of sync at 30 fps
S = np.abs(librosa.stft(y, hop_length=frame_len))
# 3. Generate Frames
video_frames = []
for i in tqdm(range(S.shape[1])):
    # This is a simplified example; a real visualizer would be more complex.
    frame_data = S[:, i]
    frame_image = create_frame(frame_data)
    video_frames.append(frame_image)
# 4. Create Video
video_clip = ImageSequenceClip(video_frames, fps=fps)
audio_clip = AudioFileClip(audio_path)
# Trim the audio to the video length without reading past the end of the file
end = min(audio_clip.duration, video_clip.duration)
final_clip = video_clip.set_audio(audio_clip.subclip(0, end))
final_clip.write_videofile('animusic.mp4', codec='libx264', audio_codec='aac', fps=fps)
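When the STFT hop does not match the video frame interval (for example librosa's default hop of 512 samples at 22050 Hz gives about 43 columns per second, not 30), showing one column per video frame stretches the visuals relative to the audio. One way to handle this is to pick, for each video frame, the STFT column nearest in time. This is a sketch: stft_column_for_frame is a hypothetical helper, and it assumes the STFT was computed with center=True (librosa's default), so column k is centered at k * hop_length / sr seconds.

```python
import numpy as np

def stft_column_for_frame(n_video_frames, fps, sr, hop_length, n_columns):
    """Index of the STFT column nearest in time to each video frame."""
    frame_times = np.arange(n_video_frames) / fps
    # Convert each frame time to a (fractional) column index, then round
    columns = np.round(frame_times * sr / hop_length).astype(int)
    # Clamp so frames at the very end never index past the last column
    return np.clip(columns, 0, n_columns - 1)

# Two seconds of audio at 22050 Hz with hop 512 yields 87 STFT columns
idx = stft_column_for_frame(n_video_frames=60, fps=30, sr=22050,
                            hop_length=512, n_columns=87)
```

With this mapping the frame loop would read S[:, idx[i]] for video frame i instead of stepping through every column.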