Audio Generation

With a few lines of code, we can produce sound and play it within a Jupyter notebook. Consider incorporating audio components as you're developing a notebook.

We will import a few tools to make this possible, including pylab for plotting and numerical work.

%matplotlib inline
from pylab import *
from IPython.display import Audio

We now set up five seconds of sound, sampled 8000 times per second. We generate two pure tones together, at 440 and 442 Hertz. The 440 Hz tone is the musical note A above middle C. The slight difference in frequencies causes a beating, or fluctuation in loudness, at 442 - 440 = 2 beats per second.

Fs = 8000.0
Len = 5
t = linspace(0, Len, int(Fs*Len), endpoint=False) # the sample count must be an integer
f1 = 442.0
f2 = 440.0
signal = sin(2*pi*f1*t) + sin(2*pi*f2*t)

Audio(data=signal, rate=Fs)
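The beating can be checked numerically. The sketch below is not part of the original notebook; it uses plain NumPy (the names `carrier` and `envelope` are illustrative) and relies on the identity sin(a) + sin(b) = 2 sin((a+b)/2) cos((a-b)/2), which rewrites the two tones as a 441 Hz tone whose loudness is shaped by a slow 1 Hz cosine.

```python
import numpy as np

fs = 8000                      # samples per second, as in the example above
duration = 5                   # seconds
t = np.arange(fs * duration) / fs

f1, f2 = 442.0, 440.0
two_tones = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)

# Same signal written as a product: a 441 Hz carrier times a 1 Hz envelope.
carrier = np.sin(2 * np.pi * (f1 + f2) / 2 * t)
envelope = 2 * np.cos(2 * np.pi * (f1 - f2) / 2 * t)
product = envelope * carrier

# The two forms should agree up to floating-point rounding.
max_error = np.max(np.abs(two_tones - product))

# The loudness follows |envelope|, which peaks twice per 1 Hz cycle.
beats_per_second = abs(f1 - f2)
print(max_error, beats_per_second)
```

The envelope itself oscillates at 1 Hz, but the ear responds to its magnitude, so the loudness rises and falls twice per second.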


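We can analyse this signal with the Fourier transform. Plotting the result, we see the energy is concentrated near 440 Hz, with a mirror image near 8000 - 440 = 7560 Hz. The mirror image appears because the Fourier transform of a real-valued signal is symmetric about half the sampling rate.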

freqs = t*Fs/Len # a list of frequencies, in Hertz
fsig = abs(fft(signal)) # Fourier transform of the signal
plot(freqs,fsig) # amplitude versus frequency
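As a hedged aside, the mirror image can be verified directly, and it can be avoided by using a real-input transform. The sketch below assumes plain NumPy rather than pylab; `np.fft.rfft` and `np.fft.rfftfreq` return only the non-negative frequencies, from 0 up to half the sampling rate.

```python
import numpy as np

fs = 8000                      # samples per second
n = fs * 5                     # five seconds of samples
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 442 * t) + np.sin(2 * np.pi * 440 * t)

# Full transform: for real input, bin k and bin n-k have equal magnitude.
full = np.abs(np.fft.fft(x))
mirror_error = np.max(np.abs(full[1:] - full[1:][::-1]))

# Real-input transform: keeps only bins 0 .. n/2, so no mirror image.
half = np.abs(np.fft.rfft(x))
half_freqs = np.fft.rfftfreq(n, d=1 / fs)   # frequencies in Hertz

print(mirror_error, len(half), half_freqs[-1])
```

Plotting `half` against `half_freqs` would show the same peaks near 440 Hz on an axis that stops at 4000 Hz.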


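By zooming in on the relevant part of the spectrum, we can see the presence of energy at the two frequencies of 440 Hz and 442 Hz. With five seconds of signal, adjacent points on the frequency axis are 1/5 = 0.2 Hz apart, so 440 Hz falls at index 2200 and 442 Hz at index 2210.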

plot(freqs[2100:2300],fsig[2100:2300])
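Instead of reading the peaks off the plot, we can ask for them numerically. This is a minimal sketch, assuming plain NumPy and a signal built as in the example above; the name `top_two` is illustrative. It simply picks the two frequency bins with the largest magnitude.

```python
import numpy as np

fs = 8000                      # samples per second
n = fs * 5                     # five seconds of samples
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 442 * t) + np.sin(2 * np.pi * 440 * t)

mag = np.abs(np.fft.rfft(x))               # magnitude spectrum, 0 .. fs/2
freqs = np.fft.rfftfreq(n, d=1 / fs)       # matching frequencies in Hertz

# Indices of the two largest magnitudes, reported in ascending frequency.
top_two = np.sort(freqs[np.argsort(mag)[-2:]])
print(top_two)
```

This works cleanly here because both tones complete a whole number of cycles in five seconds, so each lands exactly on one frequency bin. For tones that fall between bins, the energy spreads over neighbouring bins and a more careful peak search would be needed.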


Some notes on audio in the computer

The sounds we hear are rapid variations of air pressure on our eardrums, usually generated by the vibration of some object such as a guitar string, a drum head, or human vocal cords. A sound is represented as a function of time. In the computer, we represent this function as a list of numbers, or samples, that give the value of the function at a sequence of time values.

Usually, we sample at uniform time intervals. In the example above, we have 8000 samples per second, for a length of 5 seconds. The Nyquist-Shannon sampling theorem tells us that we need to sample at more than twice the highest frequency that we want to reproduce. At 8000 samples per second, that limits us to frequencies below 4000 Hz.
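To see what goes wrong above that limit, here is a small sketch, assuming NumPy and illustrative frequencies not taken from the original. A 5000 Hz cosine sampled at 8000 samples per second produces exactly the same samples as a 3000 Hz cosine, since 5000 = 8000 - 3000. This effect is called aliasing.

```python
import numpy as np

fs = 8000                          # samples per second; limit is fs/2 = 4000 Hz
t = np.arange(fs) / fs             # one second of sample times

above_limit = np.cos(2 * np.pi * 5000 * t)   # tone above fs/2
alias = np.cos(2 * np.pi * 3000 * t)         # tone at fs - 5000 = 3000 Hz

# Once sampled, the two tones are indistinguishable (up to rounding).
max_difference = np.max(np.abs(above_limit - alias))
print(max_difference)
```

Played back, the 5000 Hz tone would therefore be heard as 3000 Hz.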

Humans with exceptional hearing can hear frequencies up to 20,000 Hz (20 kHz). This suggests we should sample at 40,000 samples per second or more for high quality audio. In fact, a compact disc is sampled at 44,100 samples per second, and digital audio tapes at 48,000 samples per second.

But since computer speakers are often of lower quality, we typically sample at lower rates like 8000, 10000, or 22050 samples per second. These rates give sound that is "good enough" and save on computer memory.
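The memory saving is simple arithmetic. The sketch below uses a hypothetical helper, `audio_bytes`, and assumes uncompressed mono audio stored as 16-bit (2-byte) samples; other storage formats would change the numbers but not the proportions.

```python
def audio_bytes(sample_rate, seconds, bytes_per_sample=2, channels=1):
    """Size in bytes of uncompressed audio (assumes fixed-size samples)."""
    return int(sample_rate * seconds * bytes_per_sample * channels)

low_rate = audio_bytes(8000, 5)     # five seconds at 8000 samples per second
cd_rate = audio_bytes(44100, 5)     # five seconds at the compact disc rate

print(low_rate, cd_rate, cd_rate / low_rate)
```

Under these assumptions, storage grows in direct proportion to the sampling rate, so the compact disc rate needs about five and a half times the memory of the 8000 samples per second used above.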
