Audio Engine

From Hero of Allacrost Wiki

This document describes the use and operation of the audio component of the Allacrost game engine. The audio engine uses the OpenAL library for audio playback and the Ogg Vorbis library for loading and streaming music. Its main goal is to give the API user a powerful, easy-to-use interface. Allacrost audio data comes in two file formats: .wav for sound effects and .ogg for music.


Audio Engine Summary
  • src/engine/audio/
  • audio.h
  • audio.cpp
  • audio_descriptor.h
  • audio_descriptor.cpp
  • audio_effects.h
  • audio_effects.cpp
  • audio_input.h
  • audio_input.cpp
  • audio_stream.h
  • audio_stream.cpp
Include Header
  • #include "audio.h"
Namespace
  • hoa_audio
Classes Defined
  • GameAudio (singleton name: AudioManager)
  • AudioDescriptor
  • SoundDescriptor
  • MusicDescriptor
Libraries Used
  • OpenAL (cross-platform audio playback library)
  • Ogg Vorbis (.ogg file manipulation routines)

OpenAL Concepts

To fully understand how to use all of the features in the audio engine, you need a crash course on OpenAL. OpenAL's API is modeled after that of OpenGL, so calls to the two libraries may appear very similar. OpenAL is a 3D audio library, meaning that audio that is played back has a position, and that position in part determines how the audio plays. For example, the further the player moves away from a running waterfall, the quieter the water sound becomes. OpenAL has three fundamental objects which you must understand: buffers, sources, and listeners.

Buffers are nothing more than containers of audio data. When you load a sound or music file, that audio data is placed into one or more OpenAL buffers (streaming audio may use several). The number of OpenAL buffers that can exist at any time is limited only by the amount of main memory on the system. Sources are exactly what they sound like: a source of audio in three-dimensional space. OpenAL can only create a limited number of sources, depending on the player's operating system and audio card; the number available is typically between 16 and 64. The audio engine automatically manages sources between sounds and music so that the API user should never need to concern themselves with them. Finally, every OpenAL context has a single listener, which describes the player's position, velocity, and orientation in 3D space. The properties of the listener affect the playback of all audio that uses positional playback.

The AudioDescriptor class encapsulates all the OpenAL buffers that a piece of audio needs to hold its data. The audio engine creates as many OpenAL sources as possible (up to an upper limit of 64). These sources are then shared by AudioDescriptor objects in the background so that the user should never need to concern themselves over whether a piece of audio will have access to a source or not. Finally the listener object is retained by the AudioManager singleton, where the API user can access and modify its properties.
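
The way a limited set of sources can be shared among many pieces of audio may be easier to picture with a small pool. The following is a minimal sketch only: it makes no OpenAL calls, plain integers stand in for OpenAL source handles, and the names SourcePool, Acquire, Release, and kNoSource are hypothetical and not taken from the engine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical marker returned when every source is already taken.
const int kNoSource = -1;

// Hypothetical stand-in for the engine's source management.
class SourcePool {
public:
    // 'count' is how many sources could be created; capped at 64 as described above.
    explicit SourcePool(std::size_t count) : in_use_(count > 64 ? 64 : count, false) {}

    // Hands out the first free source ID, or kNoSource if none is free.
    int Acquire() {
        for (std::size_t i = 0; i < in_use_.size(); ++i) {
            if (!in_use_[i]) {
                in_use_[i] = true;
                return static_cast<int>(i);
            }
        }
        return kNoSource;
    }

    // Returns a source to the pool so another piece of audio can use it.
    void Release(int id) {
        assert(id >= 0 && static_cast<std::size_t>(id) < in_use_.size());
        in_use_[static_cast<std::size_t>(id)] = false;
    }

    std::size_t Size() const { return in_use_.size(); }

private:
    std::vector<bool> in_use_;  // one flag per source: true while handed out
};
```

In this sketch a descriptor would call Acquire when it starts playing and Release when it stops, which is one plausible way to keep the API user from ever handling sources directly.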

Differences Between Sound and Music

The audio engine processes sounds and music in nearly the same way, but the following differences are important to note.

  • Separate global volume controls

The GameAudio class has two volume controls, one which affects the volume of all sounds, and the other which affects the volume of all music.

  • Music playback

When a piece of music is played, if any other music is playing, that music is automatically stopped. In effect, only one piece of music is allowed to play at any time, while any number of sounds may be played simultaneously.

  • Looping defaults

By default, sounds have looping disabled while music has looping enabled.

  • Streaming defaults

By default, sounds play statically while music is streamed.
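
The four differences above can be condensed into a small model. This is a sketch under stated assumptions, not the engine's code: Track, Mixer, and every member name are hypothetical, and no audio is actually played.

```cpp
#include <cassert>
#include <string>

// Hypothetical description of one piece of audio.
struct Track {
    std::string name;
    bool is_music;
    bool looping;   // default: off for sounds, on for music
    bool streamed;  // default: static for sounds, streamed for music
    bool playing;

    Track(const std::string& n, bool music)
        : name(n), is_music(music), looping(music), streamed(music), playing(false) {}
};

// Hypothetical stand-in for the global audio manager.
class Mixer {
public:
    Mixer() : sound_volume_(1.0f), music_volume_(1.0f), current_music_(nullptr) {}

    // Separate global volume controls for the two kinds of audio.
    void SetSoundVolume(float v) { assert(v >= 0.0f); sound_volume_ = v; }
    void SetMusicVolume(float v) { assert(v >= 0.0f); music_volume_ = v; }
    float VolumeFor(const Track& t) const { return t.is_music ? music_volume_ : sound_volume_; }

    // Starting a piece of music stops whichever music was playing before;
    // sounds never stop one another.
    void Play(Track& track) {
        if (track.is_music) {
            if (current_music_ != nullptr && current_music_ != &track)
                current_music_->playing = false;
            current_music_ = &track;
        }
        track.playing = true;
    }

private:
    float sound_volume_;
    float music_volume_;
    Track* current_music_;  // the single piece of music allowed to play
};
```

Playing two music tracks in a row through this Mixer leaves only the second one marked as playing, while any number of sound tracks remain active.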


AudioDescriptor is an abstract class that represents audio data, whether it is sound or music. The SoundDescriptor and MusicDescriptor classes derive from it, and their objects are used and manipulated in essentially the same manner (except as noted in the section on differences between sound and music). After constructing either a SoundDescriptor or MusicDescriptor object, you use one of the following functions to initialize the object with audio data from a file:

// For SoundDescriptor objects
bool SoundDescriptor::LoadAudio(const std::string& filename, AUDIO_LOAD load_type = AUDIO_LOAD_STATIC, uint32 stream_buffer_size = private_audio::DEFAULT_BUFFER_SIZE);

// For MusicDescriptor objects
bool MusicDescriptor::LoadAudio(const std::string& filename, AUDIO_LOAD load_type = AUDIO_LOAD_STREAM_FILE, uint32 stream_buffer_size = private_audio::DEFAULT_BUFFER_SIZE); 

First note that the last two arguments have default values, and that the default load type differs depending upon whether the object is a SoundDescriptor or MusicDescriptor. The first argument is the filename of the sound or music file to load. Sound files should have a ".wav" extension, while music files should have a ".ogg" extension. The second argument describes the manner in which the audio should be loaded. The choices are AUDIO_LOAD_STATIC, AUDIO_LOAD_STREAM_FILE, and AUDIO_LOAD_STREAM_MEMORY. Statically loaded audio places the entire contents of the audio data in a single OpenAL buffer, and it cannot support customized looping, which will be explained later. Streaming audio takes the audio data from a source (a file, or memory into which the audio data has been loaded) and streams the data into OpenAL buffers in chunks. The third argument specifies the size of those chunks, so it only has an effect when the second argument does not indicate static loading. The default buffer size should suffice. The function returns true if the audio was loaded successfully and false otherwise.
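
To make the difference between static and streamed loading concrete, the sketch below cuts a block of audio bytes into pieces of a fixed maximum size, the way streamed data would be handed to buffers chunk by chunk. It is an illustration only: SplitIntoChunks and AudioBytes are hypothetical names, and the real engine would decode from a file or memory and queue the chunks on OpenAL rather than return them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<unsigned char> AudioBytes;

// Static loading corresponds to one buffer holding all of 'data'.
// Streaming corresponds to handing the data over in pieces of at most
// 'chunk_size' bytes; the last piece may be shorter.
std::vector<AudioBytes> SplitIntoChunks(const AudioBytes& data, std::size_t chunk_size) {
    assert(chunk_size > 0);
    std::vector<AudioBytes> chunks;
    for (std::size_t offset = 0; offset < data.size(); offset += chunk_size) {
        std::size_t end = std::min(offset + chunk_size, data.size());
        chunks.push_back(AudioBytes(data.begin() + static_cast<std::ptrdiff_t>(offset),
                                    data.begin() + static_cast<std::ptrdiff_t>(end)));
    }
    return chunks;
}
```

A chunk size at least as large as the data yields a single chunk, which is the streamed equivalent of a static load.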

Note that the properties set in the LoadAudio call (load type, buffer size) may not be changed unless the function is called once more. If any audio data is held by the object when LoadAudio is called, it will be released regardless of whether or not the LoadAudio call is successful. To manually release the audio data, make the following call.

void AudioDescriptor::FreeAudio();

This will remove the audio data, as well as reset the playback position and set the audio state to AUDIO_STATE_UNLOADED. When the object is destroyed, the destructor will call FreeAudio to remove any audio data that is still present.
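
The release behavior described above (old data is dropped on every LoadAudio call, whether or not the new load succeeds) can be sketched as follows. FakeDescriptor is a hypothetical stand-in, and its rule that a load "succeeds" when the filename ends in ".wav" or ".ogg" is an assumption made purely so the example can run without real files.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a descriptor; only load/free bookkeeping is modeled.
class FakeDescriptor {
public:
    FakeDescriptor() : loaded_(false), position_(0) {}
    ~FakeDescriptor() { FreeAudio(); }  // the destructor releases remaining data

    bool LoadAudio(const std::string& filename) {
        FreeAudio();  // existing data is released before the new load is attempted
        // Assumed success rule for this sketch only.
        loaded_ = HasSuffix(filename, ".wav") || HasSuffix(filename, ".ogg");
        return loaded_;
    }

    // Drops the data and resets the playback position.
    void FreeAudio() {
        loaded_ = false;
        position_ = 0;
    }

    bool IsLoaded() const { return loaded_; }
    unsigned Position() const { return position_; }

private:
    static bool HasSuffix(const std::string& s, const std::string& suffix) {
        return s.size() >= suffix.size() &&
               s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
    }

    bool loaded_;
    unsigned position_;
};
```

Note the consequence: after a failed reload the object holds no audio at all, so callers should check the return value rather than assume the previous data is still available.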

Note: About Copying

Take care when using the copy constructor or copy assignment operator on any SoundDescriptor or MusicDescriptor object, because the internal audio data, buffers, sources, state, and other members are not copied. It is perfectly fine to copy these objects as long as they do not have valid data loaded (i.e., LoadAudio has not yet been invoked, or FreeAudio was invoked before making the copy). Copying an object that has audio data loaded produces a warning message, and the audio data is not copied over (you will have to call LoadAudio on the copy for it to be valid). This mistake commonly occurs when constructing a container of audio descriptor objects. See the example code below.

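// INCORRECT! The copy pushed onto the back of the my_sounds vector is not loaded
std::vector<SoundDescriptor> my_sounds;
SoundDescriptor sound;
sound.LoadAudio("example.wav"); // placeholder filename
my_sounds.push_back(sound);

// CORRECT! The LoadAudio method is called only after the copy is made
std::vector<SoundDescriptor> my_sounds;
my_sounds.push_back(SoundDescriptor());
my_sounds.back().LoadAudio("example.wav"); // placeholder filename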

State Manipulation

State manipulation covers the principal methods for manipulating audio, such as playing, stopping, and pausing. The AUDIO_STATE enum declares the possible states that a piece of audio may be in.


To retrieve the state of an AudioDescriptor object, make the following call.

AUDIO_STATE AudioDescriptor::GetState();

The purpose of the following functions should be self-explanatory. Note that their effect depends on the state of the audio when they are invoked (for example, trying to pause audio that is already paused results in no operation).

void AudioDescriptor::Play();
void AudioDescriptor::Stop();
void AudioDescriptor::Pause();
void AudioDescriptor::Resume();
void AudioDescriptor::Rewind();
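
How the current state can change the result of these calls is easiest to see as a small state machine. The sketch below is an assumption-laden illustration: the source names only AUDIO_STATE_UNLOADED, so the other states, the transition rules, and the names SketchState, PlaybackSketch, and MarkLoaded are all hypothetical and may not match the engine.

```cpp
#include <cassert>

// Assumed set of states; only the "unloaded" state is named in the text above.
enum SketchState { SKETCH_UNLOADED, SKETCH_STOPPED, SKETCH_PLAYING, SKETCH_PAUSED };

class PlaybackSketch {
public:
    PlaybackSketch() : state_(SKETCH_UNLOADED) {}

    // Stands in for a successful load.
    void MarkLoaded() { if (state_ == SKETCH_UNLOADED) state_ = SKETCH_STOPPED; }

    // Every call is a no-op when the current state makes it meaningless.
    void Play()   { if (state_ != SKETCH_UNLOADED) state_ = SKETCH_PLAYING; }
    void Stop()   { if (state_ != SKETCH_UNLOADED) state_ = SKETCH_STOPPED; }
    void Pause()  { if (state_ == SKETCH_PLAYING)  state_ = SKETCH_PAUSED; }
    void Resume() { if (state_ == SKETCH_PAUSED)   state_ = SKETCH_PLAYING; }

    SketchState GetState() const { return state_; }

private:
    SketchState state_;
};
```

In this model, pausing twice in a row leaves the audio paused, and resuming audio that was stopped rather than paused does nothing.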

Seeking and Looping

It is possible to seek the playback position of audio data to a specific location. You may either seek by sample (where 0 is the first sample of audio data and n-1 is the final sample), or by second (where 2.5f would equal 2.5 seconds). The SeekSecond version actually seeks to the closest sample to the second offset requested, so that the next piece of audio that will be played will be a full sample. If the sample or second arguments are invalid (out of range, or negative), then no seek operation will occur. Note that calling the Rewind method is essentially the same as seeking to the beginning of the audio.

void AudioDescriptor::SeekSample(uint32 sample);
void AudioDescriptor::SeekSecond(float second);
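
The conversion that a seek-by-second has to perform can be sketched as below: multiply the offset by the sample rate, round to the nearest whole sample, and reject anything outside the valid range 0 to n-1. SecondToSample and its parameters are hypothetical; how the engine obtains the sample rate and sample count is not covered here.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Converts a time offset in seconds to the nearest sample index.
// Returns false and leaves 'sample' untouched when the request is invalid
// (negative, not a number, or past the final sample).
bool SecondToSample(float second, std::uint32_t sample_rate,
                    std::uint32_t total_samples, std::uint32_t& sample) {
    if (!(second >= 0.0f) || total_samples == 0)
        return false;
    // Round to the nearest whole sample so playback starts on a full sample.
    double nearest = std::floor(static_cast<double>(second) * sample_rate + 0.5);
    if (nearest > static_cast<double>(total_samples - 1))
        return false;
    sample = static_cast<std::uint32_t>(nearest);
    return true;
}
```

For audio sampled at 44100 Hz, an offset of 2.5 seconds maps to sample 110250, while a negative offset or one beyond the end of the data is rejected, matching the "no seek operation will occur" rule above.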

Looping is enabled by default for music, but is disabled by default for sounds. You may use the IsLooping method to determine if the audio has looping enabled or disabled, and the SetLooping method to enable or disable the looping effect.

bool AudioDescriptor::IsLooping() const;
void AudioDescriptor::SetLooping(bool loop);

Custom Looping

(to be written)

Positional Audio

(to be written)


(to be written)

Audio Cache Management

(to be written)

Audio Effects

(to be written)