FMOD Engine User Manual 2.03
This chapter will introduce you to using 3D sound with the Core API. With it, you can easily implement interactive 3D audio and have access to features such as 5.1 or 7.1 speaker output, automatic attenuation, doppler, and more advanced psychoacoustic 3D audio techniques.
For information specific to the Studio API and FMOD Studio events, see the Studio API 3D Events chapter.
You do not need to set the speaker mode for FMOD. Any sound using FMOD_3D is automatically positioned in a surround speaker system. As long as the player's sound card supports it, and their operating system speaker settings are correct, their audio device will be able to output the sound in 5.1 or 7.1.
When loading a sound or sound bank, the sound must be created with System::createSound or System::createStream using the FMOD_3D flag. For example:
result = system->createSound("../media/drumloop.wav", FMOD_3D, 0, &sound);
if (result != FMOD_OK)
{
    HandleError(result);
}
It is generally best not to switch between 3D and 2D at all. If you need to, however, you can change the Sound or Channel's mode to FMOD_3D_HEADRELATIVE at runtime. This positions the sound relative to the listener, so it follows the listener as the listener moves around and effectively sounds 2D.
A major part of spatialization is attenuating the volume of a channel based on its distance from the listener. The FMOD Engine supports multiple different models for how this should occur.
Inverse roll-off is the default FMOD 3D distance model. All sounds naturally attenuate (fade out) in the real world using an inverse distance attenuation. The flag for this mode is FMOD_3D_INVERSEROLLOFF, but you don't need to set it when loading a sound because it is the default. It is mainly useful for resetting the mode back to the original if you set it to FMOD_3D_LINEARROLLOFF at some later stage.
When FMOD uses this model, the 'mindistance' of a Sound / Channel is the distance at which the sound starts to attenuate. This can simulate the sound being smaller or larger. By default, each time the distance from the source doubles relative to this mindistance, the sound volume halves. This roll-off rate can be changed with System::set3DSettings.
As an example of relative sound sizes, we can compare a bee and a jumbo jet. At only a meter or two away from a bee we will probably not hear it any more. In contrast, a jet will be heard from hundreds of meters away. In this case we might set the bee's mindistance to 0.1 meters, so that after a few meters it falls silent. The jumbo jet's mindistance could be set to 50 meters, so that it could take many hundreds of meters of distance between listener and sound before it falls silent. We now have a more realistic representation of the loudness of each sound, even though each wave file contains a fully normalized 16-bit waveform (that is, if you played them in 2D they would both be the same volume).
The 'maxdistance' does not affect the rate of roll-off; it is simply the distance at which the sound stops attenuating. Don't set the maxdistance to a low number unless you want the sound to artificially stop attenuating, which is usually not wanted. Leave it at its default of 10000.0.
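The behavior described above can be illustrated with a minimal sketch. It assumes the common inverse distance formula, which matches the "volume halves when the distance doubles" description but is not taken from the FMOD source; `inverseRolloffGain` and its parameter names are hypothetical and are not part of the FMOD API.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, NOT an FMOD function: approximates the gain applied by
// an inverse roll-off model. The formula is an assumption for illustration.
float inverseRolloffGain(float distance, float minDistance, float maxDistance,
                         float rolloffScale = 1.0f)
{
    assert(minDistance > 0.0f && maxDistance >= minDistance);

    // No attenuation inside mindistance; attenuation stops changing past maxdistance.
    const float d = std::max(minDistance, std::min(distance, maxDistance));

    // With rolloffScale == 1 this reduces to minDistance / d,
    // so the gain halves each time the distance doubles.
    return minDistance / (minDistance + rolloffScale * (d - minDistance));
}
```

Under these assumptions, a bee with a mindistance of 0.1 heard from 10 meters away has a gain of roughly 0.01, while a jet with a mindistance of 50 is still at full volume at the same distance.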
Inverse tapered roll-off is a combination of the inverse and linear-square roll-off models. At shorter distances, where inverse roll-off would provide greater attenuation, it functions as inverse roll-off mode; at greater distances, where linear-square roll-off would provide greater attenuation, it uses that roll-off mode instead. For this roll-off mode, distance values greater than mindistance are scaled according to the rolloffscale. Inverse tapered roll-off mode approximates realistic behavior while still guaranteeing the sound attenuates to silence at maxdistance.
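One literal reading of this description is "use whichever of the two models attenuates more at the current distance". The sketch below follows that reading. The formulas for both models, and the `inverseTaperedGain` name, are assumptions made for illustration, so the exact values FMOD produces may differ.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, NOT an FMOD function: picks the stronger attenuation of
// an assumed inverse model and an assumed linear-square model.
float inverseTaperedGain(float distance, float minDistance, float maxDistance,
                         float rolloffScale = 1.0f)
{
    assert(minDistance > 0.0f && maxDistance > minDistance);

    const float d = std::max(minDistance, std::min(distance, maxDistance));

    // Assumed inverse model: the distance beyond mindistance is scaled by rolloffScale.
    const float inverse = minDistance / (minDistance + rolloffScale * (d - minDistance));

    // Assumed linear-square model: reaches exactly 0 at maxdistance.
    const float linear = (maxDistance - d) / (maxDistance - minDistance);
    const float linearSquare = linear * linear;

    // The lower gain is the greater attenuation.
    return std::min(inverse, linearSquare);
}
```

Because the linear-square term is zero at maxdistance, the combined gain is guaranteed to reach silence there, which is the property the text highlights.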
Linear and linear-square roll-off are alternative distance models, also available in the FMOD Engine. To use them, add the FMOD_3D_LINEARROLLOFF or FMOD_3D_LINEARSQUAREROLLOFF flag to System::createSound or Sound::setMode / ChannelControl::setMode. While less realistic, these models are more game programmer-friendly, as the attenuation fades linearly between 'mindistance' and 'maxdistance'. In these modes, the mindistance is the same as it is in the inverse model (that is, the minimum distance before the sound starts to attenuate), but the maxdistance is the point where the volume reaches 0 due to 3D distance. The attenuation in between those two points is linear or linear squared, depending on which model is selected.
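As a rough illustration of these two models, the following sketch computes a gain that falls from 1 at mindistance to 0 at maxdistance. The function names are hypothetical and the formulas are the straightforward reading of "linear" and "linear squared", not code taken from FMOD.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, NOT an FMOD function: gain fades in a straight line
// from 1 at minDistance to 0 at maxDistance.
float linearRolloffGain(float distance, float minDistance, float maxDistance)
{
    assert(maxDistance > minDistance);
    const float d = std::max(minDistance, std::min(distance, maxDistance));
    return (maxDistance - d) / (maxDistance - minDistance);
}

// Hypothetical helper: the same fade, squared, so it drops faster at first.
float linearSquareRolloffGain(float distance, float minDistance, float maxDistance)
{
    const float g = linearRolloffGain(distance, minDistance, maxDistance);
    return g * g;
}
```

Halfway between mindistance and maxdistance, the linear sketch yields a gain of 0.5 and the linear-square sketch yields 0.25.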
Custom roll-off allows a FMOD_3D_ROLLOFF_CALLBACK to be set that allows you to calculate how the volume roll-off happens. If a callback is not convenient, the Core API also allows an array of points that are linearly interpolated between, to denote a 'curve', using ChannelControl::set3DCustomRolloff.
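To make the idea of a linearly interpolated curve concrete, here is a small self-contained sketch that evaluates such a curve. `CurvePoint` and `evaluateRolloffCurve` are hypothetical names used for illustration; the sketch assumes the points are sorted by distance and that distances outside the curve take the value of the nearest end point.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type: one point on a roll-off curve (distance -> volume).
struct CurvePoint
{
    float distance;
    float volume;
};

// Hypothetical helper, NOT an FMOD function: linearly interpolates the volume
// at 'distance'. Assumes 'points' is non-empty and sorted by distance.
float evaluateRolloffCurve(const std::vector<CurvePoint>& points, float distance)
{
    assert(!points.empty());

    // Outside the curve, hold the nearest end point's volume.
    if (distance <= points.front().distance) return points.front().volume;
    if (distance >= points.back().distance) return points.back().volume;

    for (std::size_t i = 1; i < points.size(); ++i)
    {
        if (distance <= points[i].distance)
        {
            const CurvePoint& a = points[i - 1];
            const CurvePoint& b = points[i];
            const float t = (distance - a.distance) / (b.distance - a.distance);
            return a.volume + t * (b.volume - a.volume);
        }
    }
    return points.back().volume;
}
```

For a curve with points (0, 1.0), (10, 0.5), and (20, 0.0), a distance of 5 evaluates to 0.75 and a distance of 15 evaluates to 0.25.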
If the player's sound card supports it, any sound using FMOD_3D is automatically positioned in a surround speaker system, so you do not need to set the speaker mode for FMOD. Provided the player has correctly set their operating system's speaker settings, their audio device will be able to output the audio in 5.1 or 7.1.
There are three configurable settings in the FMOD Engine that affect all 3D sounds: the doppler scale, the distance factor, and the rolloff scale.
All three settings can be set with System::set3DSettings. In most games, there is no need to set them.
While spatialization is often enough on its own, some games benefit from more complex 3D behavior. Here are a few ideas.
Controlling a spatializer DSP using the Core API requires setting the data parameter associated with 3D attributes for the Channel. This is a data parameter of type FMOD_DSP_PARAMETER_DATA_TYPE_3DATTRIBUTES or FMOD_DSP_PARAMETER_DATA_TYPE_3DATTRIBUTES_MULTI. When using the Core API System, you must set this DSP parameter explicitly. To do this, use ChannelControl::set3DAttributes with the handle that was returned from System::playSound for the channel. If you are positioning a ChannelGroup in 3D instead, set the ChannelGroup to be 3D once with ChannelControl::setMode, then call ChannelControl::set3DAttributes for that channel group.
Because the effect of a spatializer DSP depends on the position of the channel or channel group relative to the listener, it is also necessary to update the 3D attributes of the listener once per frame with System::set3DListenerAttributes.
Call System::update once per frame so the 3D calculations can update based on the positions and other attributes.
This method works with our pan DSP, the object panner DSP, the Resonance Source and Soundfield spatializers, and any other third party plug-ins that make use of the FMOD spatializers.
Attributes must use a coordinate system with the positive Y axis being up and the positive X axis being right (left-handed coordinate system). FMOD converts passed in coordinates from right-handed to left-handed for the plug-in if the System is initialized with the FMOD_INIT_3D_RIGHTHANDED flag.
The absolute data for FMOD_DSP_PARAMETER_3DATTRIBUTES is straightforward; the relative part, however, requires some work to calculate.
/*
This code supposes the availability of a maths library with basic support for 3D and 4D vectors and 4x4 matrices:
// 3D vector
class Vec3f
{
public:
float x, y, z;
// Initialize x, y & z from the corresponding elements of FMOD_VECTOR
Vec3f(const FMOD_VECTOR &v);
};
// 4D vector
class Vec4f
{
public:
float x, y, z, w;
Vec4f(const Vec3f &v, float w);
// Initialize x, y & z from the corresponding elements of FMOD_VECTOR
Vec4f(const FMOD_VECTOR &v, float w);
// Copy x, y & z to the corresponding elements of FMOD_VECTOR
void toFMOD(FMOD_VECTOR &v);
};
// 4x4 matrix
class Matrix44f
{
public:
Vec4f X, Y, Z, W;
};
// 3D Vector cross product
Vec3f crossProduct(const Vec3f &a, const Vec3f &b);
// 4D Vector addition
Vec4f operator+(const Vec4f &a, const Vec4f &b);
// 4D Vector subtraction
Vec4f operator-(const Vec4f& a, const Vec4f& b);
// Matrix multiplication m * v
Vec4f operator*(const Matrix44f &m, const Vec4f &v);
// 4x4 Matrix inverse
Matrix44f inverse(const Matrix44f &m);
*/
void calculatePannerAttributes(const FMOD_3D_ATTRIBUTES &listenerAttributes, const FMOD_3D_ATTRIBUTES &emitterAttributes, FMOD_DSP_PARAMETER_3DATTRIBUTES &pannerAttributes)
{
    // pannerAttributes.relative is the emitter position and orientation transformed into the listener's space:
    // First we need the 3D transformation for the listener.
    Vec3f right = crossProduct(listenerAttributes.up, listenerAttributes.forward);
    Matrix44f listenerTransform;
    listenerTransform.X = Vec4f(right, 0.0f);
    listenerTransform.Y = Vec4f(listenerAttributes.up, 0.0f);
    listenerTransform.Z = Vec4f(listenerAttributes.forward, 0.0f);
    listenerTransform.W = Vec4f(listenerAttributes.position, 1.0f);
    // Now we use the inverse of the listener's 3D transformation to transform the emitter attributes into the listener's space:
    Matrix44f invListenerTransform = inverse(listenerTransform);
    Vec4f position = invListenerTransform * Vec4f(emitterAttributes.position, 1.0f);
    // Setting the w component of the 4D vector to zero means the matrix multiplication will only rotate the vector.
    Vec4f forward = invListenerTransform * Vec4f(emitterAttributes.forward, 0.0f);
    Vec4f up = invListenerTransform * Vec4f(emitterAttributes.up, 0.0f);
    Vec4f velocity = invListenerTransform * (Vec4f(emitterAttributes.velocity, 0.0f) - Vec4f(listenerAttributes.velocity, 0.0f));
    // We are now done computing the relative attributes.
    position.toFMOD(pannerAttributes.relative.position);
    forward.toFMOD(pannerAttributes.relative.forward);
    up.toFMOD(pannerAttributes.relative.up);
    velocity.toFMOD(pannerAttributes.relative.velocity);
    // pannerAttributes.absolute is simply the emitter position and orientation:
    pannerAttributes.absolute = emitterAttributes;
}
When using FMOD_DSP_PARAMETER_3DATTRIBUTES_MULTI, you must call calculatePannerAttributes for each listener, filling in the appropriate listener attributes.
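If no matrix library is at hand, the same relative attributes can be sketched with plain dot products. This works only under the assumption that the listener's forward and up vectors are unit length and perpendicular, in which case inverting the listener transform reduces to subtracting the listener position and projecting onto the listener's axes. `Vec3`, `Attributes3D`, and `relativeAttributes` are hypothetical stand-ins that mirror the fields of FMOD_3D_ATTRIBUTES so the example compiles without the FMOD headers.

```cpp
#include <cassert>

// Hypothetical stand-ins for FMOD_VECTOR and FMOD_3D_ATTRIBUTES.
struct Vec3 { float x, y, z; };
struct Attributes3D { Vec3 position, velocity, forward, up; };

static Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(const Vec3& a, const Vec3& b)
{
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Project a world-space vector onto the listener's right/up/forward axes.
static Vec3 toListenerSpace(const Vec3& v, const Vec3& right, const Vec3& up, const Vec3& forward)
{
    return {dot(v, right), dot(v, up), dot(v, forward)};
}

// Assumes listener.forward and listener.up are orthonormal (left-handed, +Y up).
Attributes3D relativeAttributes(const Attributes3D& listener, const Attributes3D& emitter)
{
    const Vec3 right = cross(listener.up, listener.forward);

    Attributes3D rel;
    // Positions are translated, then rotated; directions are only rotated.
    rel.position = toListenerSpace(sub(emitter.position, listener.position), right, listener.up, listener.forward);
    rel.velocity = toListenerSpace(sub(emitter.velocity, listener.velocity), right, listener.up, listener.forward);
    rel.forward = toListenerSpace(emitter.forward, right, listener.up, listener.forward);
    rel.up = toListenerSpace(emitter.up, right, listener.up, listener.forward);
    return rel;
}
```

For example, with a listener at the origin facing +X, an emitter at (0, 0, -3) ends up at a relative position of (3, 0, 0), which is directly to the listener's right.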
Pass the data into the DSP using DSP::setParameterData with the index of the parameter whose data type is FMOD_DSP_PARAMETER_DATA_TYPE_3DATTRIBUTES or FMOD_DSP_PARAMETER_DATA_TYPE_3DATTRIBUTES_MULTI. You will need to check with the author of the DSP for that parameter index.
The following is an example of a typical game's audio loop that uses System::update to update the 3D attributes of channels and listeners, as well as the FMOD channel management system, once per frame.
do
{
    UpdateGame(); // here the game is updated and the sources would be moved with channel->set3DAttributes.
    system->set3DListenerAttributes(0, &listener_pos, &listener_vel, &listener_forward, &listener_up); // update 'ears'
    system->update(); // needed to update 3d engine, once per frame.
} while (gamerunning);
Most games usually take the position, velocity and orientation from the camera's vectors and matrix.
Velocity is only required if you want doppler effects. If you do not, you can pass 0 or NULL to both System::set3DListenerAttributes and ChannelControl::set3DAttributes for the velocity parameter, and no doppler effect will be heard.
It is important that the velocity passed to the FMOD Engine is in meters per second and not meters per frame. To get the correct velocity vector, use a method such as calculating it using vectors from your game's physics code. Don't just subtract the last frame's position from the current position, as this is affected by framerate, meaning that the higher the framerate the smaller the position deltas and thus the smaller the doppler effect, which is incorrect.
If the only way you can get the velocity is to subtract this and last frame's position vectors, then remember to time adjust them from meters per frame back up to meters per second. This is done simply by scaling the difference vector obtained by subtracting the two position vectors, by one over the frame time delta.
Here is an example.
velx = (posx-lastposx) * 1000 / timedelta;
vely = (posy-lastposy) * 1000 / timedelta;
velz = (posz-lastposz) * 1000 / timedelta;
timedelta is the time since the last frame in milliseconds. This can be obtained with functions such as timeGetTime(). At 60fps, the timedelta would be 16.67ms. If the source moved 0.1 meters in this time, the actual velocity in meters per second would be:
vel = 0.1 * 1000 / 16.67 = 6 meters per second.
Similarly, at half the framerate (30fps), subtracting the positions gives us twice the distance that it would at 60fps (so the source would have moved 0.2 meters this time).
vel = 0.2 * 1000 / 33.33 = 6 meters per second.
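The calculation above can be wrapped in a small helper. This is a minimal sketch: `Vec3` and `velocityFromPositions` are hypothetical names, and the frame time is assumed to be supplied in milliseconds as in the example.

```cpp
#include <cassert>

// Hypothetical stand-in for FMOD_VECTOR so the example is self-contained.
struct Vec3 { float x, y, z; };

// Hypothetical helper: converts a per-frame position change into meters per second.
// timeDeltaMs is the time since the last frame, in milliseconds.
Vec3 velocityFromPositions(const Vec3& pos, const Vec3& lastPos, float timeDeltaMs)
{
    assert(timeDeltaMs > 0.0f);
    const float perSecond = 1000.0f / timeDeltaMs;
    return {(pos.x - lastPos.x) * perSecond,
            (pos.y - lastPos.y) * perSecond,
            (pos.z - lastPos.z) * perSecond};
}
```

Both cases in the text, 0.1 meters in 16.67ms and 0.2 meters in 33.33ms, come out at approximately 6 meters per second with this helper, which shows that the result no longer depends on the framerate.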
Getting the correct orientation set up is essential if you want the source to move around you in 3D space.
By default, FMOD uses a left-handed coordinate system. If you are using a right-handed coordinate system then FMOD must be initialized by passing FMOD_INIT_3D_RIGHTHANDED to System::init. In either case FMOD requires that the positive Y axis is up and the positive X axis is right. If your coordinate system uses a different convention then you must rotate your vectors into FMOD's space before passing them to FMOD.
Note for plug-in writers: FMOD always uses a left-handed coordinate system when passing 3D data to plug-ins. This coordinate system is fixed to use +X = right, +Y = up, +Z = forward. When the system is initialized to use right-handed coordinates FMOD will flip the Z component of vectors before passing them to plug-ins.
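As one illustration of converting into FMOD's space, the sketch below assumes a game that uses a right-handed, Z-up convention (+X right, +Y forward, +Z up). That source convention, and the `Vec3` and `zUpRightHandedToFmod` names, are assumptions chosen for the example. Swapping the Y and Z components maps such vectors into FMOD's default left-handed space (+X right, +Y up, +Z forward), so FMOD_INIT_3D_RIGHTHANDED is not needed in this case.

```cpp
#include <cassert>

// Hypothetical stand-in for FMOD_VECTOR so the example is self-contained.
struct Vec3 { float x, y, z; };

// Hypothetical helper. ASSUMED source convention: right-handed,
// +X right, +Y forward, +Z up. Output: +X right, +Y up, +Z forward.
Vec3 zUpRightHandedToFmod(const Vec3& v)
{
    assert(v.x == v.x && v.y == v.y && v.z == v.z); // reject NaN components
    return {v.x, v.z, v.y};
}
```

Apply the same conversion to every position, velocity, forward, and up vector before passing it to FMOD, so that all 3D attributes stay consistent with each other.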
Some games have a split screen mode, where different sections of the screen represent cameras in different locations. As the listener is almost always positioned in the same location as the camera, this means that the FMOD Engine has to be able to handle more than one listener at once. This is handled by using System::set3DNumListeners and System::set3DListenerAttributes.
For example, if you have two player split screen, System::set3DNumListeners would be set to two. When updating the positions of the listener, for each 'camera' or 'listener' call System::set3DListenerAttributes with 0 as the listener number of the first camera, and 1 for the listener number of the second camera.
When using multiple listeners in the Core API, 3D Channels have the following behavior:
A stereo sound, when played as 3D, is split into two mono voices internally which are separately 3D positionable. Multi-channel audio formats are also supported, so an eight channel sound (for example) allocates 8 mono voices internally in FMOD. To rotate the left and right part of a stereo 3D sound in 3D space, use the ChannelControl::set3DSpread function. By default, the subchannels position themselves in the same place, therefore sounding 'mono'.
Historically, audio spatialization (the process of taking an audio file and making it sound "in the world") has been all about positioning sound in speakers arranged on a horizontal plane. This arrangement is often seen in the form of 5.1 or 7.1 surround. With the advancement of VR technology, however, more emphasis has been put on making sound as immersive as the visuals. This is achieved by more advanced processing of the audio signals for the traditional horizontal plane as well as the introduction of height spatialization. This has given the rise of the term "spatial audio" which focuses on this more realistic approach to spatialization.
Within FMOD there are several ways you can achieve a more immersive spatialization experience. Depending on your target platform, some may or may not apply. The following sections outline a few general approaches with specific implementation details contained within.
The most traditional way to approach spatialization is by panning signal into virtual speakers, so with the introduction of 7.1.4 (7 horizontal plane speakers, 1 sub-woofer, 4 roof speakers) you can do just this.
Call System::setSoftwareFormat(0, FMOD_SPEAKERMODE_7POINT1POINT4, 0) to select the 7.1.4 speaker mode, then call System::setOutput(FMOD_OUTPUTTYPE_WINSONIC) to select an output capable of rendering it. You can now use System::createSound and System::playSound to play content authored as 7.1.4. If you have the necessary sound system setup (for example, Dolby Atmos) you will hear the sound play back including the ceiling speakers. If you have a headphone based setup (for example, Windows Sonic for Headphones or Dolby Atmos for Headphones) you will hear an approximation of ceiling speakers.
To take an existing horizontal plane signal and push it into the ceiling plane you can create an FMOD spatializer and adjust the height controls.
Create the spatializer with System::createDSPByType(FMOD_DSP_TYPE_PAN) and adjust its height control. Not only will this let you blend to the 0.0.4 ceiling speakers by setting the value between 0.0 and 1.0, it will also let you blend from the 0.0.4 ceiling speakers to the ground plane 7.1.0 by setting the value between 0.0 and -1.0.
The FMOD_OUTPUTTYPE_WINSONIC plug-in supports 7.1.4 output available on Windows, UWP, Xbox One and Xbox Series X|S. Also, the FMOD_OUTPUTTYPE_PHASE plug-in supports 7.1.4 output for iOS devices. Other platforms will fold 7.1.4 down to 7.1.
To get more discrete spatialization of an audio signal you can use the FMOD object spatializer, so named because the audio signal is packaged with the spatialization information (position, orientation, etc.) and sent to an object mixer. Object spatializers are often used to highlight important sounds with strong localization to add interest to a scene, usually in conjunction with the channel based approach, be that 7.1.4 or even simply 5.1 / 7.1.
Select an output that supports object spatialization with System::setOutput(FMOD_OUTPUTTYPE_WINSONIC), System::setOutput(FMOD_OUTPUTTYPE_AUDIO3D), System::setOutput(FMOD_OUTPUTTYPE_AUDIOOUT), or System::setOutput(FMOD_OUTPUTTYPE_PHASE), then create an object spatializer with System::createDSPByType(FMOD_DSP_TYPE_OBJECTPAN). There is no limit to how many FMOD_DSP_TYPE_OBJECTPAN DSPs you can create; however, there is a limit to how many can be processed at a time. This limit is flexible, and varies from platform to platform. When there are more object spatializers in use than there are available resources for, FMOD virtualizes the least significant sounds by processing them with a traditional channel based mix.
An important consideration, when using object spatializers, is signal flow. Unlike most DSPs, after the signal enters an object spatializer DSP it is sent out to the object mixer. Regardless of whether the object mixer is a software library or a physical piece of hardware, the result is that you no longer have access to that signal. Any processing you would like to perform on that signal must therefore be accomplished before it enters the object spatializer DSP. Despite this, to assist mixing, the object spatializer automatically applies any "downstream" ChannelGroup volume settings.
Object spatialization is available via the following output plug-ins:
Other output plug-ins will emulate object spatialization using traditional channel based panning.
In addition to the built-in channel and object based approaches there are third party plug-ins available that can assist too. The FMOD DSP plug-in API (see FMOD_DSP_DESCRIPTION) allows any developer to produce an interface for their spatial audio technology and provide it across all FMOD platforms. Additionally the FMOD output plug-in API (see FMOD_OUTPUT_DESCRIPTION) allows developers to implement a renderer for the FMOD object spatializer extending the functionality to more platforms and more technologies.
Some examples of publicly-available third-party plug-ins: