Getting Started with Audio in Unity


Audio in Unity uses much the same approach as everything else in Unity. There are game objects which have components that contain data and code. In this session, we’ll be creating a script that we can start adding audio to.

Tip
When working in Unity, make sure to save your work regularly.

Setting up our project

To start our project, we need to use Unity Hub to create a new Unity Project.

  1. Open Unity Hub (Start menu > Unity Hub)

  2. On the projects tab, select the New button.

The blue new project button in Unity Hub

For our new project, we want to create a 3D project. Give it a suitable name (e.g. AudioProject).

Tip
It is important to make sure you use the LTS version of Unity installed in the labs. The version in my screenshots may differ due to bugs in Unity for Linux.

The new unity project page

Once you are happy with the settings, click the Create button.

Populating our scene

Now we have the Unity project set up, Unity should open and present us with a blank scene. For today’s session, we’ll not be adding much to this. We need a suitable game object to attach our audio to.

Unity supports a full 3D audio system, but we’re just going to place a single audio source at the origin (0,0,0) of the scene.

  1. Right-click on the Hierarchy panel (the panel on the left-hand side)

The Unity Hierarchy

  2. Select Create Empty from the list that appears

  3. Give the new object a suitable name, for example AudioManager

Adding Audio

To make audio work in Unity, we will need three things:

  1. an audio listener (representing the player) - this is already in the scene attached to the camera

  2. an audio source

  3. a script to control the audio source

Audio Sources

We will now create the audio source and the script we’ll be using for these workshop sessions. Click on your new game object in the Hierarchy (AudioManager).

On the right-hand side of the window, the Inspector should show the currently selected game object.

  • Check that the transform’s position is set to (0,0,0)

  • Click Add Component

  • Search for Audio Source

  • Select Audio Source from the list when it is shown

Adding an audio source

Our Script

Now all the boilerplate is out of the way, we can start getting to the interesting part: generating audio from a script.

We’ll now create a script to house the code we’ll be using in the rest of Tinkering Audio.

  • Click Add Component again

  • This time, click New script at the bottom of the list that appears

  • Give your new script a suitable name, such as AudioManager

Note
Given I’ve called the game object AudioManager, that might get confusing. Can you think of a better name? If you choose a different name, be careful when adapting the code below.

Adding a new behaviour with AudioManager as the name

Writing our behaviour

We’ve now got a script - but it’s not doing anything yet! Unity uses a callback-based approach. This means that functions get called automatically at certain times during the game loop.

We want to do some setup when the game first loads. We’ll therefore use the Start() method.

Note
Start() gets called when the object is first created - which function in Windows Forms is this similar to?

We need to do a few things to get started with our script:

  1. Create a new property to store an object of type AudioSource

  2. Create a new clip object that we can use to play audio

  3. Create two callbacks for the Unity Audio clip - we’ll talk about these in a second

  4. Create some properties to make our code easier to read

    public int sampleRate = 44100;
    public int durationInSeconds = 1;

    private AudioSource source;
    private AudioClip generatedClip;

    // Start is called before the first frame update
    void Start()
    {
        // Create a 1-channel streaming clip; Unity will call OnAudioRead
        // whenever it needs the next batch of samples.
        int durationInSamples = sampleRate * durationInSeconds;
        generatedClip = AudioClip.Create("GeneratedAudio",
                durationInSamples, 1, sampleRate, true,
                OnAudioRead, OnAudioSetPosition);
        source = GetComponent<AudioSource>();
    }

    void OnAudioRead(float[] data) {
    }

    void OnAudioSetPosition(int newPosition) {
    }

Unity Audio Callbacks

The sample rate sets how many array elements (samples) we need to produce for each second of audio. Use the Audio Toy to experiment with this.
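
For example (a quick sketch using the sampleRate field from our script), converting between a sample index and a time in seconds is just a multiplication or a division:

int samplesForTwoSeconds = sampleRate * 2;    // a 2 second clip needs 44100 * 2 = 88200 elements
float timeOfSample = 22050f / sampleRate;     // sample number 22050 sits at t = 0.5 seconds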

The AudioClip.Create function is a factory method used to create audio clips. It takes a number of arguments, which can be seen in the Unity Docs.

Name

First is the name of the audio clip. I’ve just set this to GeneratedAudio.

Length

We need to tell Unity how long our sample will be. This is expressed as the number of array elements required. The number of elements is the sample rate times the duration in seconds (i.e. if we have a sample rate of 44100 samples per second and our track lasts for 2 seconds, it will be 44100 * 2 = 88200 elements long).

Channels

Next is the number of channels; I’ve hard-coded this to 1 (mono audio). The Unity documentation refers to the following argument as frequency - they really mean sample rate, and we’ll be using frequency to mean something else later in this session.

Streaming

Unity has two modes of operation: streaming and non-streaming. With streaming audio, whenever Unity wants to play audio from our clip, it asks the function we pass in next (OnAudioRead) for the next chunk of the audio. This means we can avoid storing large amounts of data we’re generating on the fly, and it makes it much easier to change things.
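
For comparison, here is a rough sketch of the non-streaming route: generate every sample up front and hand Unity the whole buffer with AudioClip.SetData. This is not part of our script (we’ll stick with streaming), and it borrows the frequency field and SineWave helper we write later in this session:

// Non-streaming sketch: fill a buffer with every sample, then upload it once.
int lengthInSamples = sampleRate * durationInSeconds;
float[] samples = new float[lengthInSamples];

for (int i = 0; i < lengthInSamples; i++) {
	samples[i] = SineWave(frequency, i);
}

// stream = false, so no callbacks are needed
AudioClip preBakedClip = AudioClip.Create("PreBakedAudio", lengthInSamples, 1, sampleRate, false);
preBakedClip.SetData(samples, 0);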

Callbacks

Next there are two callbacks. The Unity docs describe the first function’s type as AudioClip.PCMReaderCallback.

This might sound a bit scary at first, but it just means we’re promising that we’ll provide a function that:

  • returns no value (its return type is void)

  • takes an array of floats (float[]) as its only argument

Because we’re passing the function itself (and not calling it), we don’t put the brackets.

This kind of callback processing approach is very powerful. Lots of languages and UI libraries use it! You even used it during Tinkering Graphics for handling form events (remember the _Click and Mouse Event code).

Finally, there is another callback. Unity can use this to tell us if the playback position is changed somehow outside our code (maybe someone restarts or rewinds the audio clip). We don’t have to implement this, but it’s one line of code so we might as well.
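
In C# terms, the two callback types are delegates that look roughly like this (paraphrased from the AudioClip documentation):

// The shapes Unity expects our callbacks to have:
delegate void PCMReaderCallback(float[] data);        // fill data with the next chunk of samples
delegate void PCMSetPositionCallback(int position);   // react to the playhead being moved

// OnAudioRead and OnAudioSetPosition match these shapes, which is why we can
// pass their names straight into AudioClip.Create.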

Populating our functions

OnAudioSetPosition

Let’s do the easier function first - OnAudioSetPosition. This function is meant to take the new position as an argument, then update the internal position. We need to keep track of our current position when playing the audio, so let’s store that as a variable at the top of the script.

int position = 0;

When the OnAudioSetPosition function runs, we can simply update this to be the new position passed in:

void OnAudioSetPosition(int newPosition) {
	this.position = newPosition;
}

OnAudioRead

The first sound we’ll generate is a sine wave. When the OnAudioRead method is called, we are given an array and are expected to populate it with audio samples. To do this, we need to use a loop:

void OnAudioRead(float[] data) {
	int count = 0;
	while ( count < data.Length ) {
		data[count] = SineWave( frequency, position ); // generate one sample (SineWave is defined below)
		position++;
		count++;
	}
}
Note
I’ve implemented this as a while loop (as that’s what the Unity documentation does). Can you figure out how to make this into a for loop?
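
If you want to check your answer afterwards, one possible for-loop version (same behaviour, just a different loop syntax) looks like this:

void OnAudioRead(float[] data) {
	for (int count = 0; count < data.Length; count++) {
		data[count] = SineWave(frequency, position); // generate one sample per element
		position++;
	}
}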

The Sine Wave

Let’s recall what the equation for generating a sample for a sine wave looks like:

\[s \leftarrow a \sin(2\pi t f)\]

Where:

  • s is the resulting sample

  • a is the maximum amplitude (volume)

  • $2\pi$ is a constant (remember, $2\pi$ radians = one full circle)

  • f is the frequency

  • t is the moment in time of our sample

Luckily for us, this translates fairly nicely into code:

public float volume = 0.5f;
public float frequency = 440;

private float SineWave(float frequency, int wavePosition) {
	float t = (float) wavePosition / sampleRate; // 0 = start, 1.0 = 1 second, etc...
	return volume * Mathf.Sin( 2 * Mathf.PI * frequency * t);
}
Note
I’m casting wavePosition to a float before doing the division. Why am I doing this?

GUI

We’ve got no way to trigger our sound effect at the moment. Let’s add a function which can trigger our sound effect (we could use this from other scripts as well) and a button.

void PlayAudio() {
	source.clip = generatedClip;
	source.Play();
}
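
Note
If you do want another script to trigger the sound later, the method will need to be public (e.g. public void PlayAudio()). For now, OnGUI below can call it directly because it lives in the same class.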

To create a UI, we could add a Canvas and add a button to it, but we’re programmers, so let’s do it in code!

void OnGUI() {
	// Draw a simple IMGUI button; GUI.Button returns true on the frame it is clicked.
	Rect bounds = new Rect(10, 10, 150, 100);
	if ( GUI.Button( bounds, "Play" ) ) {
		PlayAudio();
	}
}

We’re at the point where we can finally test our function! Press the Play button in Unity and let’s see what our scene looks like:

Unity Audio Button

While the game is running, you should be able to change some of the parameters in the Inspector. Try changing the frequency to 500 and playing the audio again; the note should be higher. Try changing the frequency to 300; it should be lower.

Let’s try modifying the duration; it’s currently 1 second. Let’s make it 2 seconds, so the audio should be twice as long. Did that work? If you’re not sure, try an even longer time (e.g. 10 or 30 seconds).

Note
The audio will still be one second long. Why is this?

Try exiting play mode (clicking the Play button so it’s no longer blue), changing the duration, and then pressing the Play button again. Did that work?

Experimenting with audio

There are a few ways we can extend this, and show that we’ve started to get a grasp on what’s going on.

Playing Multiple Notes

So far, we’ve made our scene play a single note. Let’s try making it play two notes, one after another. I’m leaving this one open-ended for you. There are a few ways of going about solving this.

Here are a few hints:

  • Set the duration to 2 seconds.

  • Based on your experimentation, we know that the frequency defines the note, so we need to alter the frequency after one second.

  • OnAudioRead is keeping track of the position in the wave using the position variable

  • Position is measured in samples (i.e. when the value is equal to the sample rate, one second has passed) - if you get stuck, see the sketch below this list
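
If you get stuck, here is a minimal starting sketch for the body of the loop in OnAudioRead. It assumes a second, hypothetical frequency field (frequency2) that you would add alongside frequency:

if (position < sampleRate) {
	data[count] = SineWave(frequency, position);   // first second: the first note
} else {
	data[count] = SineWave(frequency2, position);  // after one second: the second note (frequency2 is a field you would add yourself)
}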

Experimenting with Durations

If you are able to do this, you should hopefully realise that we can use position to track our location in the audio track. One second seems a little long to hold a single note.

  • Can you figure out how to make each note last half a second?

  • What about switching back and forth every half second?

  • What about playing a sequence of 5 or 6 notes?
