With small channel counts (mono and stereo), the channel converter will
no longer allocate memory on the heap for the channel map, but will
instead just store it in the struct directly. For larger channel counts
it will fall back to a heap allocation. This prevents stereo channel
maps from resulting in a heap allocation of 8 bytes.
With this change a stereo passthrough `ma_device` can be initialized
without a heap allocation for the internal data converter.
This only affects passthrough channel conversion. When shuffling or
weights are required, a heap allocation will still be done. This
optimization is specifically for passthrough.
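
The inline-storage idea described above can be sketched roughly as follows.
This is a minimal illustration, not miniaudio's actual code: the
`channel_map_store` type, the `INLINE_CHANNEL_CAP` constant and all function
names are hypothetical, and one byte per channel position is assumed purely
for the sake of the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; these are not miniaudio types. */
#define INLINE_CHANNEL_CAP 2    /* mono and stereo fit in the struct */

typedef struct {
    uint32_t channels;
    uint8_t *map;                            /* inline_map or heap memory */
    uint8_t  inline_map[INLINE_CHANNEL_CAP];
} channel_map_store;

/* Copies src into the store. Returns 0 on success, -1 if malloc fails. */
static int channel_map_store_init(channel_map_store *s, const uint8_t *src,
                                  uint32_t channels)
{
    assert(s != NULL && src != NULL && channels > 0);

    s->channels = channels;
    if (channels <= INLINE_CHANNEL_CAP) {
        s->map = s->inline_map;              /* small case: no heap allocation */
    } else {
        s->map = malloc(channels);           /* large case: heap fallback */
        if (s->map == NULL) {
            return -1;
        }
    }

    memcpy(s->map, src, channels);
    return 0;
}

static int channel_map_store_is_inline(const channel_map_store *s)
{
    return s->map == s->inline_map;
}

static void channel_map_store_uninit(channel_map_store *s)
{
    if (!channel_map_store_is_inline(s)) {
        free(s->map);                        /* only the fallback owns heap memory */
    }
    s->map = NULL;
}
```

Because `map` may point back into the struct itself, a store laid out like
this should not be copied by value without fixing up the pointer afterwards.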
This adds the following APIs:
ma_device_get_playback_format()
ma_device_get_capture_format()
ma_device_get_playback_channels()
ma_device_get_capture_channels()
ma_device_get_playback_channel_map()
ma_device_get_capture_channel_map()
The heap allocation is aligned to MA_CACHE_LINE_SIZE which is an
optimal alignment for ring buffers.
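
For illustration, one portable way to get a cache-line-aligned heap block is
to over-allocate and round the pointer up. This is only a sketch of the
general technique, not how miniaudio does it: `CACHE_LINE_SIZE` and its value
of 64 are assumptions standing in for MA_CACHE_LINE_SIZE, and both function
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE_SIZE 64  /* assumed value; stand-in for MA_CACHE_LINE_SIZE */

/*
 * Over-allocates, rounds up to the next cache line boundary, and stashes
 * the pointer returned by malloc() just before the aligned block so that
 * it can be recovered when freeing.
 */
static void *cache_aligned_malloc(size_t size)
{
    uintptr_t aligned;
    void *raw = malloc(size + sizeof(void *) + CACHE_LINE_SIZE - 1);
    if (raw == NULL) {
        return NULL;
    }

    aligned = ((uintptr_t)raw + sizeof(void *) + CACHE_LINE_SIZE - 1)
            & ~(uintptr_t)(CACHE_LINE_SIZE - 1);

    ((void **)aligned)[-1] = raw;   /* slot sits inside the over-allocation */
    return (void *)aligned;
}

static void cache_aligned_free(void *p)
{
    if (p != NULL) {
        free(((void **)p)[-1]);     /* free the original malloc() pointer */
    }
}
```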
This also reduces the size of the `ma_device` struct for non-duplex
devices, which is the most common setup.
The following functions are removed:
ma_linear_resampler_get_required_input_frame_count()
ma_linear_resampler_get_expected_output_frame_count()
ma_resampler_get_required_input_frame_count()
ma_resampler_get_expected_output_frame_count()
ma_data_converter_get_required_input_frame_count()
ma_data_converter_get_expected_output_frame_count()
These functions were used for calculating the required number of input
frames given an output capacity, and the expected number of
output frames given an input frame count. In practice these have proven
to be extremely annoying and finicky to get right. I myself have had
trouble keeping this working consistently as I make changes to the
processing function and I have zero confidence custom resampling
backends will implement this correctly.
If you need this functionality, take a copy of the resampler from
miniaudio 0.11.x and maintain that yourself.
The idea here is to only update the resampler object once at the end.
This speeds up the problematic s16 mono upsampling path with
Clang, but that same path with GCC is still slow somehow.
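
As a rough sketch of the "update once at the end" pattern: copy the state
into locals, run the hot loop on the locals only, then write back a single
time. The `toy_resampler` struct and the function below are hypothetical and
much simpler than the real linear resampler (no filtering, mono s16 only),
and nothing here is a claim about which compilers it helps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified state; not miniaudio's ma_linear_resampler. */
typedef struct {
    uint32_t in_rate;
    uint32_t out_rate;
    uint32_t time_frac;  /* fractional input position, in 1/out_rate units */
    size_t   in_pos;     /* whole input frame position */
} toy_resampler;

/*
 * Linear interpolation, mono s16. Returns the number of output frames
 * written. Assumes in_rate + out_rate fits in 32 bits.
 */
static size_t toy_resample_s16_mono(toy_resampler *r,
                                    const int16_t *in, size_t in_count,
                                    int16_t *out, size_t out_cap)
{
    /* Copy state into locals so the loop never touches *r. */
    const uint32_t in_rate  = r->in_rate;
    const uint32_t out_rate = r->out_rate;
    uint32_t frac = r->time_frac;
    size_t pos = r->in_pos;
    size_t produced = 0;

    assert(out_rate > 0);

    while (produced < out_cap && pos + 1 < in_count) {
        int32_t a = in[pos];
        int32_t b = in[pos + 1];
        out[produced++] =
            (int16_t)(a + (int32_t)(((int64_t)(b - a) * frac) / out_rate));

        frac += in_rate;
        while (frac >= out_rate) {
            frac -= out_rate;
            pos += 1;
        }
    }

    /* Single write-back of the state at the end. */
    r->time_frac = frac;
    r->in_pos = pos;
    return produced;
}
```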
This makes the s16 mono upsampling path slower somehow. This seems to be
the problem code path for some reason. Other paths don't seem to be so
sensitive to seemingly harmless changes.
The idea here is to have a more clearly defined data dependency
separation between the resampler and the filtering state which I'm
hoping might open up more optimization opportunities. The problem with
this theory is that this commit makes the GCC build slower on the s16
mono upsampling path. It appears to be slightly faster with Clang though.
I tried addressing this, but upon doing so the build was slower. It was
especially bad with Clang, where it was 2x slower(!), and just slightly
slower with GCC.
Not sure exactly what's going on here, but I guess the compiler is
hitting some edge case that's preventing efficient optimization. What's
weirder is that the slowness only affects the mono s16 code path. Other
code paths are totally fine.