Audio | COMP101

We’ve looked at one application of the techniques we’ve covered. This week, we’re going to look at another - we’re moving from graphics to audio. There are many interesting concepts we can take from the literature on audio processing which you may find useful in other contexts, specifically anything related to real-time processing.

What is Audio?

When we think of audio, we tend to think of it as being a continuous wave. Computers aren’t very good at dealing with this kind of data - in fact they are absolutely terrible at it! To deal with continuous data, we actually take 'snapshots' at regular points in time and compute what is going on at each snapshot. This approach is very common when we want to represent continuous data for computers.

As is common in computing, this idea can be found everywhere when you know what to look for:

In animation you might call them keyframes,
in video files we call them frames,
in physics engines we call them timesteps,
but in audio we call them samples.

Audio Toy

Open the Audio Toy. I built this to demonstrate the concept at play using audio. The frequency used for the demo is quite a lot lower than the range of human hearing, but if you set it too high the browser will be mad at you!

I’m showing a wave with a frequency of two. In other words, the wave repeats twice in one block of time (where a block is usually a second).
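If it helps to see that as code, here is a minimal Python sketch. It assumes the wave is a sine wave (the toy has other wave types too), and the name wave_value is made up for this example rather than taken from the toy.

```python
import math

def wave_value(frequency, t):
    """Height of a sine wave at time t, where t is measured in blocks of time."""
    return math.sin(2 * math.pi * frequency * t)

# A frequency of 2 means two full cycles per block, so the wave is back
# at its starting height halfway through the block and again at the end.
for t in (0.0, 0.125, 0.25, 0.5, 1.0):
    print(f"t={t:<5} value={wave_value(2, t):+.3f}")
```

Changing the first argument plays the same role as the frequency box in the toy: a bigger number squeezes more cycles into the same block.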

Adjust the frequency to see how this alters the graph.
- Set the value to 1 and then to 20 - does the graph change how you’d expect?

Note	Recall I said computers are bad at representing this kind of thing. There is a bit of trickery going on with this example - the 'continuous' wave is actually just a wave being sampled at a high enough rate that you can’t tell the difference.

Sampling

Let’s say you wanted to represent this wave in a way that you could store in a computer. I’ve already told you how we would do this:

Divide the wave into equal-sized sections
Make a note of the value of the wave at each division
Store this value in a suitable structure (eg, an array)
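The three steps above could be sketched in Python roughly like this. It is only an illustration: it assumes a sine wave, and sample_wave and its parameters are hypothetical names, not part of the toy.

```python
import math

def sample_wave(frequency, sample_rate, duration=1.0):
    """Sample a sine wave at evenly spaced points and return the values as a list."""
    count = int(sample_rate * duration)   # step 1: how many equal sections
    samples = []
    for n in range(count):
        t = n / sample_rate               # time of the n-th snapshot
        value = math.sin(2 * math.pi * frequency * t)  # step 2: value at that point
        samples.append(value)             # step 3: store it
    return samples

samples = sample_wave(frequency=2, sample_rate=8)
print([round(s, 3) for s in samples])
```

A plain list stands in for the array here. Once the samples are stored, the original wave is no longer needed, and everything the computer knows about the sound is in that list.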

My toy can demonstrate this (well, not the array part). Select the 'Show points?' option. The lines on the circles are not important - I was just being lazy when I drew them (they are really 360 degree arcs).

The points in blue are the points in the wave that we would sample. The number of times we do this per 'block' of time is called the sample rate. Adjust the sample rate box to alter the number of points in the plot. When the computer reconstructs the wave from these samples, it doesn’t actually know what the original sound was doing in between them. There are a few different ways in which you could move between the values, and animation makes extensive use of this fact. For this demo, I’m just moving between the values linearly (in a straight line) - you may sometimes see this referred to as lerp (short for linear interpolation).
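Moving between two samples in a straight line is a one-line formula. The sketch below is a generic version in Python, not the toy's actual code; value_between is a hypothetical helper, and it assumes the position is never negative.

```python
def lerp(a, b, fraction):
    """Move in a straight line from a to b; fraction runs from 0.0 to 1.0."""
    return a + (b - a) * fraction

def value_between(samples, position):
    """Estimate the wave at a fractional sample position.

    A position of 1.5 means halfway between samples[1] and samples[2].
    """
    index = int(position)
    if index >= len(samples) - 1:
        return samples[-1]                # nothing after the last sample
    return lerp(samples[index], samples[index + 1], position - index)

samples = [0.0, 1.0, 0.0, -1.0]
print(value_between(samples, 0.5))    # 0.5, halfway from 0.0 to 1.0
print(value_between(samples, 2.25))   # -0.25, a quarter of the way from 0.0 to -1.0
```

Other ways of moving between values (curves rather than straight lines) would swap out the lerp function and leave the rest alone.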

Select the final option on the 'Features' menu - the 'show plot' option. This is what the resulting wave would look like if you lerp’d between each sample. Adjust the sample rate to see how well the two lines match up. You can also hide the 'real' line by unticking the 'show real?' option.
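You can put a number on how well the two lines match. This Python sketch assumes a sine wave at a frequency of 2 and measures the largest gap between the real wave and the straight-line reconstruction over one block. All of the function names are invented for the example.

```python
import math

def real_wave(frequency, t):
    return math.sin(2 * math.pi * frequency * t)

def reconstructed(frequency, sample_rate, t):
    """Straight line between the two samples either side of time t."""
    position = t * sample_rate
    index = math.floor(position)
    left = real_wave(frequency, index / sample_rate)
    right = real_wave(frequency, (index + 1) / sample_rate)
    return left + (right - left) * (position - index)

def worst_gap(frequency, sample_rate, checks=1000):
    """Largest difference between the real and reconstructed wave over one block."""
    return max(
        abs(real_wave(frequency, i / checks)
            - reconstructed(frequency, sample_rate, i / checks))
        for i in range(checks)
    )

for rate in (4, 8, 16, 64):
    print(rate, round(worst_gap(2, rate), 3))
```

With these numbers, a sample rate of 4 is a special case: every sample lands where the wave crosses zero, so the reconstructed line is flat and the gap is as big as the wave itself. The gap shrinks quickly as the sample rate goes up.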

Audio Samples

Audio in the range of 2 or 3 hertz (that’s the unit for frequency when the 'block of time' is a second) would not be audible. The human ear can’t hear sounds that low! Real sounds are more likely to be in the region of hundreds to thousands of hertz. The plot looks far too messy at that kind of frequency to be readable. Let’s look at some different frequencies and how they sound.

Note	You need headphones for this next bit. If you don’t have them you can skip it, although for these sessions going forward you can book out headphones from the GA stores.

We can hear what different frequencies sound like. Notes which have different frequencies are perceived as having different pitches (we hear them as different notes). On the right hand side of the toy there is a button marked 'generate'. This will generate waves with the frequencies listed in the table. Click on 'play' and you should be able to hear the difference between each frequency.
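If you want to try the same idea away from the toy, the sketch below writes a tone to a WAV file using only Python's standard library. It is not how the toy works internally. The 440 and 880 hertz values, the file names, and the write_tone function are assumptions made for the example, and the sample rate is fixed in code rather than set by the audio card.

```python
import math
import struct
import wave

def write_tone(path, frequency, duration=1.0, sample_rate=44100, volume=0.5):
    """Write a sine tone to a 16-bit mono WAV file; returns the number of samples."""
    count = int(sample_rate * duration)
    frames = bytearray()
    for n in range(count):
        value = volume * math.sin(2 * math.pi * frequency * n / sample_rate)
        # Scale from the -1.0..1.0 range to a 16-bit signed integer.
        frames += struct.pack("<h", int(value * 32767))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
    return count

# Example frequencies only; they are not taken from the toy's table.
write_tone("tone_440.wav", 440)
write_tone("tone_880.wav", 880)
```

Playing the two files one after the other should give two clearly different pitches, with the 880 hertz tone sounding higher.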

Tip	Some of the options (like wave type and volume) have an effect on the scale’s play buttons. The frequency and sample rate from the graph don’t affect them though. The frequency is taken from the table, and the sample rate is set by your audio card!

As I mentioned previously, a wave at each of these frequencies is actually represented by samples. The number of samples is also expressed per second - a fairly common sample rate is 44,100 samples per second (so, 44,100Hz) - that’s a lot of samples for a 3 minute song!
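To see just how many, here is the arithmetic as a short Python sketch. The sample rate and song length come from the text above. The stereo and 16-bit figures are assumptions added to estimate an uncompressed size.

```python
SAMPLE_RATE = 44_100        # samples per second, from the text
SONG_SECONDS = 3 * 60       # a 3 minute song

samples_per_channel = SAMPLE_RATE * SONG_SECONDS

# Assumptions not stated in the text: 2 channels (stereo), 2 bytes (16 bits) per sample.
uncompressed_bytes = samples_per_channel * 2 * 2

print(samples_per_channel)                              # 7938000
print(round(uncompressed_bytes / 1_000_000, 1), "MB")   # 31.8 MB
```

Under those assumptions the raw samples alone come to roughly 32 megabytes.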

Loading Audio Files

Just like in graphics, it’s fairly common to use an optimised on-disk format for audio files, such as MP3 or Ogg Vorbis. I’m not going to go into detail about how these formats work - it’s a fascinating discussion for another time. We’re going to use a library to import the audio files that we are going to modify.

That library is called NAudio, and we can install it the same way we installed the graphics library we used for our graphics tool - using NuGet.