Seeing "Realtime FFT Graph of Audio WAV File or Microphone Input with Python…" on python.reddit.com reminded me of one I'd built in Python with shoebot.

While it works OK, I feel like I'm missing a higher-level audio library (especially having seen Minim, for C++ and Java).

To run it in shoebot:

```shell
sbot -w audiobot.bot
```

### audiobot.bot

```python
# Major library imports
import atexit
import pyaudio
from numpy import zeros, short, fromstring, array
from numpy.fft import fft

NUM_SAMPLES = 512
SAMPLING_RATE = 11025


def setup():
    size(350, 260)
    speed(SAMPLING_RATE / NUM_SAMPLES)

_stream = None


def read_fft():
    global _stream
    pa = None

    def cleanup_audio():
        if _stream:
            _stream.stop_stream()
            _stream.close()
            pa.terminate()

    if _stream is None:
        pa = pyaudio.PyAudio()
        _stream = pa.open(format=pyaudio.paInt16,
                          channels=1,
                          rate=SAMPLING_RATE,
                          input=True,
                          frames_per_buffer=NUM_SAMPLES)
        atexit.register(cleanup_audio)

    audio_data = fromstring(_stream.read(NUM_SAMPLES), dtype=short)
    normalized_data = audio_data / 32768.0
    return fft(normalized_data)[1:1 + NUM_SAMPLES / 2]


def flatten_fft(scale=1.0):
    """
    Produces a nicer graph, I'm not sure if this is correct
    """
    for i, v in enumerate(read_fft()):
        yield scale * (i * v) / NUM_SAMPLES


def triple(audio):
    '''return bass/mid/treble'''
    c = audio.copy()
    c.resize(3, 255 / 3)
    return c


def draw():
    '''Draw 3 different colour graphs'''
    global NUM_SAMPLES

    audio = array(list(flatten_fft(scale=80)))
    freqs = len(audio)
    bass, mid, treble = triple(audio)
    colours = (0.5, 1.0, 0.5), (1, 1, 0), (1, 0.2, 0.5)

    fill(0, 0, 1)
    rect(0, 0, WIDTH, 400)
    translate(50, 200)

    for spectrum, col in zip((bass, mid, treble), colours):
        fill(col)
        for i, s in enumerate(spectrum):
            rect(i, 0, 1, -abs(s))
        else:
            translate(i, 0)

    audio = array(list(flatten_fft(scale=80)))
```

You missed a great “with-statement opportunity” in your read_fft() function. Make this function into a generator, put the pa.open() call in a with-statement, then you can iterate on this (call .next() on it) to get each data block. No need for atexit. Use the contextlib.closing wrapper to ensure the stream device is closed when the generator exits.
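A sketch of what that suggestion might look like; the generator shape and the contextlib.closing usage are my reading of the commenter's intent, and the stream parameters are copied from the script above. Note it uses numpy's frombuffer in place of the now-deprecated fromstring:

```python
# Sketch of read_fft() reworked as a generator with contextlib.closing,
# per the suggestion above. The pyaudio import is deferred into the
# function so the module loads even where PortAudio is not installed.
import contextlib
from numpy import short, frombuffer
from numpy.fft import fft

NUM_SAMPLES = 512
SAMPLING_RATE = 11025


def fft_blocks():
    """Yield one block of FFT data per iteration.

    contextlib.closing calls stream.close() when the with-block exits,
    which happens when the generator is .close()d or garbage-collected,
    so no atexit handler is needed.
    """
    import pyaudio
    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16,
                     channels=1,
                     rate=SAMPLING_RATE,
                     input=True,
                     frames_per_buffer=NUM_SAMPLES)
    with contextlib.closing(stream):
        while True:
            audio_data = frombuffer(stream.read(NUM_SAMPLES), dtype=short)
            yield fft(audio_data / 32768.0)[1:1 + NUM_SAMPLES // 2]

# Usage (in place of read_fft()):
#     blocks = fft_blocks()
#     data = next(blocks)    # one FFT block per call
#     blocks.close()         # runs the with-block cleanup
```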

Cheers, I'll have a go at using the context manager tomorrow (it's 12:30am here).

The maths bits are what I'm most shaky on; I'm not really sure if flatten_fft is sane, for instance (and maybe I can use numpy to do that instead?).
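On the numpy side at least, the per-element loop in flatten_fft can become one vectorized expression. This is just a mechanical translation of the existing formula, not a claim that the formula itself is right:

```python
import numpy as np

NUM_SAMPLES = 512  # same constant as in audiobot.bot


def flatten_fft_np(spectrum, scale=1.0):
    """Vectorized version of flatten_fft: scale * i * v / NUM_SAMPLES
    for each bin index i and FFT value v, computed in one numpy step."""
    idx = np.arange(len(spectrum))
    return scale * idx * np.asarray(spectrum) / NUM_SAMPLES
```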

Perhaps the graph would appear better visually if you convert it to dB units by taking the log10 of the FFT data… just a thought!
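For the dB suggestion, something like this might do; the 1e-6 floor is my own assumption, there only to keep log10 away from zero on silent bins:

```python
import numpy as np


def to_db(spectrum, ref=1.0):
    """Convert linear FFT magnitudes to decibels relative to `ref`,
    so a bin at full scale (magnitude == ref) sits at 0 dB."""
    mags = np.abs(spectrum)
    # floor at 1e-6 (i.e. -120 dB) so log10 never sees zero
    return 20 * np.log10(np.maximum(mags, 1e-6) / ref)


print(to_db(np.array([1.0, 0.5, 0.0])))
# full scale -> 0 dB, half scale -> about -6 dB, silence floors at -120 dB
```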

PS: I’m with you on the need for a powerful python audio library!

This may seem a silly question, but what sort of range numbers should I put into the log10 (at the moment I’m using numbers 0 to 1.0) ?
