Understanding Audio Buffer Sizes

When input and output audio have different sampling rates, it can be hard to work out what size each buffer needs to be. This section explains how the sizes are determined and gives examples.

When you initialize the Immersitech Library, the library configuration includes the sampling rate and number of frames as parameters. These values dictate the format of the audio on the OUTPUT side.

When you initialize a participant using any variant of the add participant function, the participant configuration includes the sampling rate and number of channels as parameters. These values dictate the format of the audio on the INPUT side.

You submit audio to the input function in the input format you specified for the participant, and the Immersitech Library converts the output audio to the output format you specified for the library.

The table below works through some examples for 10 millisecond buffers. The number of samples in each buffer is the number of frames multiplied by the number of channels:

Stream                 Sampling Rate   Number of Frames   Number of Channels   Number of Samples
Library Output         48 kHz          480                2                    960
Library Output         32 kHz          320                1                    320
Library Output         24 kHz          240                2                    480
Library Output         16 kHz          160                2                    320
Library Output         8 kHz           80                 1                    80
Participant 1 Input    48 kHz          480                2                    960
Participant 2 Input    48 kHz          480                1                    480
Participant 3 Input    16 kHz          160                2                    320
Participant 4 Input    16 kHz          160                1                    160
Participant 5 Input    8 kHz           80                 2                    160
Participant 6 Input    8 kHz           80                 1                    80
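
The arithmetic behind the table can be sketched in a few lines of Python. This is only an illustration: the function names frames_per_buffer and samples_per_buffer are hypothetical helpers and are not part of the Immersitech Library API.

```python
def frames_per_buffer(sampling_rate_hz: int, duration_ms: int = 10) -> int:
    """Frames (samples per channel) in a buffer of the given duration."""
    frames, remainder = divmod(sampling_rate_hz * duration_ms, 1000)
    if remainder:
        raise ValueError("duration does not map to a whole number of frames")
    return frames


def samples_per_buffer(sampling_rate_hz: int, channels: int, duration_ms: int = 10) -> int:
    """Total samples in the buffer: frames multiplied by channels."""
    return frames_per_buffer(sampling_rate_hz, duration_ms) * channels


# Hypothetical scenario: a 16 kHz mono participant feeding a
# 48 kHz stereo library output, both using 10 ms buffers.
input_samples = samples_per_buffer(16_000, channels=1)
output_samples = samples_per_buffer(48_000, channels=2)
print(input_samples, output_samples)  # prints "160 960"
```

In this scenario the input buffer for the participant holds 160 samples while the output buffer holds 960 samples, even though both cover the same 10 milliseconds of audio.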

Note that the Immersitech Library currently supports output buffer sizes of either 10 milliseconds or 20 milliseconds of audio. At an output sampling rate of 48 kHz, this is 480 or 960 frames; at an output sampling rate of 8 kHz, it is 80 or 160 frames. For audio systems that only use power-of-two buffers, output buffer sizes of 512 or 1024 frames are also supported when the output sampling rate is 48 kHz.
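
As a rough sketch, the rule above could be checked before initializing the library. The helper is_supported_output_frames is a hypothetical name, not a library function, and it assumes that the 512 and 1024 sizes are counted in frames.

```python
def is_supported_output_frames(sampling_rate_hz: int, number_of_frames: int) -> bool:
    """Return True if the frame count matches one of the sizes described above."""
    ten_ms = sampling_rate_hz // 100     # frames in 10 ms of audio
    twenty_ms = sampling_rate_hz // 50   # frames in 20 ms of audio
    allowed = {ten_ms, twenty_ms}
    if sampling_rate_hz == 48_000:
        # Assumption: the power-of-two sizes apply only at 48 kHz output.
        allowed |= {512, 1024}
    return number_of_frames in allowed


print(is_supported_output_frames(48_000, 512))  # prints "True"
print(is_supported_output_frames(8_000, 512))   # prints "False"
```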