
Android NDK: working with OpenSL ES

Good day, Habr readers.

I previously wrote about OpenAL. Later, zagayevskiy wrote a good article on OpenSL ES. In one of our games, in order not to rewrite all the sound code, we did not move to OpenSL ES when porting to Android. That game used few sounds, so OpenAL caused no problems. But our latest game uses a lot of sounds (the gameplay demands it), and here we ran into serious trouble (playback delays being the least of it). So it was decided to rewrite everything on top of OpenSL ES. For this I wrote a couple of wrappers, which I have already talked about. I decided to share them on Habr as well; maybe someone will find them useful.

  1. A brief description of OpenSL ES.
  2. Audio content.
  3. A little about my wrappers.
  4. The principle of working with objects.
  5. Library initialization (context).
  6. Working with sounds.
  7. Playing PCM.
  8. Playing compressed formats.
  9. Conclusion.
  10. Additional information.



A brief description of OpenSL ES

This API is available starting with Android API level 9 (Android 2.3); some features require Android API level 14 (Android 4.0) and higher. OpenSL ES provides a C interface that can also be called from C++, offering roughly the same audio capabilities as the corresponding parts of the Android Java API:

Note : Although it is based on OpenSL ES, this API is not a complete implementation of any profile from OpenSL ES 1.0.1.

The library, as you might have guessed, is written in pure C, so there is no full-fledged OOP. Instead, special structures are used (let's call them pseudo-object-oriented): ordinary C structures containing pointers to functions that receive a pointer to the structure itself as their first argument. Much like C++ methods, but explicit. OpenSL ES has two kinds of such structures: objects and interfaces.

To summarize: objects are used to allocate resources and obtain interfaces, and then we work with the object through those interfaces. One object can expose several interfaces (for changing volume, seeking, and so on), and depending on the device (or the object type) some interfaces may be unavailable. I'll say in advance: you can stream audio from your assets directory using SLDataLocator_AndroidFD, which supports an interface for seeking within a track. Alternatively, you can load the entire file into a buffer (using SLDataLocator_AndroidSimpleBufferQueue) and play it from there; but that object does not support the SL_IID_SEEK interface, so seeking within the track will not be possible =/
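The calling convention of these pseudo-object-oriented structures can be sketched in plain C++ with a toy example (all names here are illustrative, not part of OpenSL ES): the handle is a pointer to a const pointer to a table of function pointers, and every function receives the handle itself as its first argument.

```cpp
#include <cassert>

// Toy "object" in the OpenSL ES style. The handle type is a pointer to a
// const pointer to the function table, just like SLObjectItf.
struct Counter_;
typedef const struct Counter_ * const * CounterItf;

struct Counter_ {
    void (*Increment)(CounterItf self);
    int  (*Get)(CounterItf self);
};

// Concrete object: the function-table pointer must be the first member so
// the handle (which points at that member) can be cast back to the object.
struct CounterObject {
    const Counter_ *itf;
    int value;
};

static CounterObject *toObject(CounterItf self) {
    return (CounterObject *)self;
}

static void counterIncrement(CounterItf self) { toObject(self)->value++; }
static int  counterGet(CounterItf self)       { return toObject(self)->value; }

static const Counter_ counterVtbl = { counterIncrement, counterGet };

// Usage mirrors OpenSL ES calls such as (*engineObj)->Realize(engineObj, ...):
int demoCounter() {
    CounterObject obj = { &counterVtbl, 0 };
    CounterItf counter = (CounterItf)&obj;
    (*counter)->Increment(counter);
    (*counter)->Increment(counter);
    return (*counter)->Get(counter);
}
```

The same shape explains why every OpenSL ES call passes the handle twice: once to dereference the function table, once as the explicit "this" argument.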

Audio content

There are many ways to package audio content into an app:


A little about my wrappers

In general, I'm a fan of OOP, so I try to group related C functions and wrap them in my own classes to make them convenient to work with later. By analogy with what I did for OpenAL, the following classes appeared:

  1. OSLContext. Responsible for initializing the library and creating the required player instances.
  2. OSLSound. The base class for working with sounds.
  3. OSLWav. A class for working with WAV. Inherits from OSLSound to keep a common interface. To work with ogg, you could later create an OSLOgg class, as I did for OpenAL. This distinction exists because these formats have completely different loading processes: WAV is uncompressed, so it is enough to just read the bytes, while ogg must also be decoded with Ogg Vorbis, and I won't even mention mp3 (:
  4. OSLMp3. A class for working with mp3. Inherits from OSLSound to keep a common interface. The class implements almost nothing, because mp3 is simply streamed. But if you want to decode mp3 with the help of some decoder library, you can implement decoding in the load(char * filename) method and use the BufferPlayer.
  5. OSLPlayer. The main class for working with sound. The mechanics of OpenSL ES differ from OpenAL: OpenAL has separate structures for the buffer and the sound source (onto which we attach the buffer), whereas in OpenSL ES everything revolves around players, which come in different kinds.
  6. OSLBufferPlayer. We use this player when we want to load an entire file into memory. As a rule, it is used for short sound effects (a shot, an explosion, etc.). As already mentioned, it does not support the SL_IID_SEEK interface, so seeking within the track is not possible.
  7. OSLAssetPlayer. Allows streaming from the assets directory (that is, without loading the entire file into memory). Use it for long tracks (background music, for example).


The principle of working with objects

The whole cycle of working with objects looks like this:
  1. Create the object, specifying the desired interfaces.
  2. Realize it by calling (*obj)->Realize(obj, async).
  3. Get the required interfaces by calling (*obj)->GetInterface(obj, ID, &itf).
  4. Work with the object through the interfaces.
  5. Destroy the object and free the used resources by calling (*obj)->Destroy(obj).


Library initialization (context)

First you need to add the -lOpenSLES flag to the LOCAL_LDLIBS section of the Android.mk file in the jni directory: LOCAL_LDLIBS += -lOpenSLES, and include two header files:
 #include <SLES/OpenSLES.h>
 #include <SLES/OpenSLES_Android.h>

Now you need to create the object through which we will work with the library (similar to the context in OpenAL) using the slCreateEngine function. The resulting object becomes the central entry point to the OpenSL ES API. Next, we initialize it using the Realize method.
 result = slCreateEngine(&engineObj,         // pointer to the object
                         0,                  // number of elements in the options array
                         NULL,               // array of additional options
                         lEngineMixIIDCount, // interface count
                         lEngineMixIIDs,     // array of interface ids
                         lEngineMixReqs);    // array of required-interface flags
 if (result != SL_RESULT_SUCCESS) {
     LOGE("Error after slCreateEngine");
     return;
 }
 result = (*engineObj)->Realize(engineObj, SL_BOOLEAN_FALSE);
 if (result != SL_RESULT_SUCCESS) {
     LOGE("Error after Realize");
     return;
 }


Now you need to get the SL_IID_ENGINE interface, through which you gain access to the speakers, sound playback, and so on.
 result = (*engineObj)->GetInterface(engineObj, SL_IID_ENGINE, &engine);
 if (result != SL_RESULT_SUCCESS) {
     LOGE("Error after GetInterface");
     return;
 }

It remains to get and initialize the OutputMix object for working with speakers using the CreateOutputMix method:
 result = (*engine)->CreateOutputMix(engine, &outputMixObj,
                                     lOutputMixIIDCount, lOutputMixIIDs, lOutputMixReqs);
 if (result != SL_RESULT_SUCCESS) {
     LOGE("Error after CreateOutputMix");
     return;
 }
 result = (*outputMixObj)->Realize(outputMixObj, SL_BOOLEAN_FALSE);
 if (result != SL_RESULT_SUCCESS) {
     LOGE("Error after Realize");
     return;
 }


In addition to initializing the main objects, the constructor of my OSLContext initializes all the necessary players. The maximum possible number of players is limited; I recommend creating no more than 20.
 void OSLContext::initPlayers(){
     for(int i = 0; i < MAX_ASSET_PLAYERS_COUNT; ++i)
         assetPlayers[i] = new OSLAssetPlayer(this);
     for(int i = 0; i < MAX_BUF_PLAYERS_COUNT; ++i)
         bufPlayers[i] = new OSLBufferPlayer(this);
 }


Working with sounds

Essentially, sounds fall into two categories: pure (uncompressed) PCM data, as contained in WAV, and compressed formats (mp3, ogg, etc.). Mp3 and ogg can be decoded to obtain the same uncompressed PCM audio data. Use BufferPlayer for PCM and AssetPlayer for compressed formats, since decoding the files yourself would be quite expensive: older phones cannot decode mp3 in hardware, and with third-party software decoders the process can take tens of seconds, which, you will agree, is not acceptable. Besides, the resulting PCM data would take up far too much memory.

When the play() method is called, the sound requests a free player from the context (OSLContext): if the sound should loop, it gets an OSLAssetPlayer; otherwise, an OSLBufferPlayer.
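A minimal sketch of this hand-out logic (the class and method names are assumed, not the article's exact code): looping sounds get a streaming slot from the AssetPlayer pool, one-shot effects get a slot from the BufferPlayer pool.

```cpp
#include <cassert>
#include <vector>

// Two pools of players, mirroring initPlayers() above: streaming players for
// looping/background sounds, buffer players for short one-shot effects.
struct Player {
    bool busy;
    bool streaming;
};

class PlayerPool {
public:
    PlayerPool(int assetCount, int bufCount) {
        for (int i = 0; i < assetCount; ++i) players.push_back({false, true});
        for (int i = 0; i < bufCount; ++i)  players.push_back({false, false});
    }

    // Returns the first idle player of the requested kind, or nullptr when
    // the pool is exhausted (the caller can then drop or steal a sound).
    Player *getFreePlayer(bool looping) {
        for (Player &p : players) {
            if (!p.busy && p.streaming == looping) {
                p.busy = true;
                return &p;
            }
        }
        return nullptr;
    }

private:
    std::vector<Player> players;
};
```

Capping the pool sizes up front (the "no more than 20" recommendation above) keeps the number of live OpenSL ES player objects bounded, which matters on devices with limited mixer resources.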

Playing PCM

I will not describe reading WAV again; you can find that in the OpenAL article. Here I will show how to create a BufferPlayer for the resulting PCM data.

Initializing BufferPlayer for PCM
 locatorBufferQueue.locatorType = SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE;
 locatorBufferQueue.numBuffers = 16;

 // PCM data format
 SLDataFormat_PCM formatPCM;
 formatPCM.formatType = SL_DATAFORMAT_PCM;
 formatPCM.numChannels = 2;
 formatPCM.samplesPerSec = SL_SAMPLINGRATE_44_1;        // header.samplesPerSec*1000;
 formatPCM.bitsPerSample = SL_PCMSAMPLEFORMAT_FIXED_16; // header.bitsPerSample;
 formatPCM.containerSize = SL_PCMSAMPLEFORMAT_FIXED_16; // header.fmtSize;
 formatPCM.channelMask = SL_SPEAKER_FRONT_LEFT | SL_SPEAKER_FRONT_RIGHT;
 formatPCM.endianness = SL_BYTEORDER_LITTLEENDIAN;

 audioSrc.pLocator = &locatorBufferQueue;
 audioSrc.pFormat = &formatPCM;

 locatorOutMix.locatorType = SL_DATALOCATOR_OUTPUTMIX;
 locatorOutMix.outputMix = context->getOutputMixObject();
 audioSnk.pLocator = &locatorOutMix;
 audioSnk.pFormat = NULL;

 // requested interfaces
 const SLInterfaceID ids[2] = {SL_IID_ANDROIDSIMPLEBUFFERQUEUE,
                               /*SL_IID_MUTESOLO, SL_IID_EFFECTSEND, SL_IID_SEEK,*/
                               SL_IID_VOLUME};
 const SLboolean req[2] = {SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE};
 result = (*context->getEngine())->CreateAudioPlayer(context->getEngine(), &playerObj,
                                                     &audioSrc, &audioSnk, 2, ids, req);
 assert(SL_RESULT_SUCCESS == result);

 result = (*playerObj)->Realize(playerObj, SL_BOOLEAN_FALSE);
 assert(SL_RESULT_SUCCESS == result);
 if (result != SL_RESULT_SUCCESS) {
     LOGE("Can not CreateAudioPlayer %d", result);
     playerObj = NULL;
 }

 // get the play interface
 result = (*playerObj)->GetInterface(playerObj, SL_IID_PLAY, &player);
 assert(SL_RESULT_SUCCESS == result);

 // get the volume interface
 result = (*playerObj)->GetInterface(playerObj, SL_IID_VOLUME, &fdPlayerVolume);
 assert(SL_RESULT_SUCCESS == result);

 // get the buffer queue interface
 result = (*playerObj)->GetInterface(playerObj, SL_IID_ANDROIDSIMPLEBUFFERQUEUE, &bufferQueue);
 assert(SL_RESULT_SUCCESS == result);



In general, there is nothing complicated here. There is, however, one HUGE problem. Notice the SLDataFormat_PCM structure. Why did I fill in the parameters explicitly instead of reading them from the WAV file headers? Because all my WAV files share the same format: the same number of channels, sample rate, bit depth, and so on. The thing is, if you create the player specifying 2 channels in the parameters and then try to play a track with 1 channel, the application will crash. The only way out is to reinitialize the whole player when a file has a different format. But the whole point is that we initialize the player once and then just swap the buffer on it. So there are two options: either create several players with different parameters, or convert all your .wav files to the same format. Well, or reinitialize the player every time -_-
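One cheap safeguard against that crash is a format guard (the struct and field names below are assumed, modeled on a WAV header): since the buffer player is created once for a fixed PCM layout, check every loaded file against that layout and reject or convert anything that differs.

```cpp
#include <cassert>
#include <cstdint>

// Relevant WAV header fields (assumed names).
struct WavFormat {
    uint16_t numChannels;
    uint32_t samplesPerSec;
    uint16_t bitsPerSample;
};

// The fixed layout the buffer player was created with, matching the
// hard-coded values above: stereo, 44100 Hz, 16-bit samples.
const WavFormat kPlayerFormat = {2, 44100, 16};

// Returns true when the file can be fed to the existing player without
// reinitializing it.
bool matchesPlayerFormat(const WavFormat &f) {
    return f.numChannels   == kPlayerFormat.numChannels
        && f.samplesPerSec == kPlayerFormat.samplesPerSec
        && f.bitsPerSample == kPlayerFormat.bitsPerSample;
}
```

Running this check at load time turns a hard-to-debug native crash into an explicit error (or a trigger for resampling the offending file in the build pipeline).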

In addition to the volume interface, the code above obtains two other interfaces: SL_IID_PLAY, through which playback is controlled, and SL_IID_ANDROIDSIMPLEBUFFERQUEUE, through which PCM buffers are enqueued.



Enqueuing the sound data when a player has been chosen and a sound is assigned to it:
 void OSLBufferPlayer::setSound(OSLSound * sound){
     if(bufferQueue == NULL)
         LOGD("bufferQueue is null");
     this->sound = sound;
     (*bufferQueue)->Clear(bufferQueue);
     (*bufferQueue)->Enqueue(bufferQueue, sound->getBuffer(), sound->getSize());
 }


Playing compressed formats

Keeping all the sounds in WAV is not an option. And not just because the files themselves take up a lot of space (though that too): once you load them all into memory, there simply isn't enough RAM for them (:

I create a class for each format so that later, if necessary, a decoding part can be written for it. For mp3 there is the OSLMp3 class, which essentially just stores the file name so it can later be handed to a player. The same can be done for ogg and other supported formats.
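A minimal sketch of the OSLMp3 idea (the class name follows the article; the method names are assumed): the class only remembers the file name, leaving the actual decoding to the platform via OSLAssetPlayer.

```cpp
#include <cassert>
#include <string>

// Stores only the file name; load() is the hook where software decoding
// could later be added, after which the sound could be fed to a
// BufferPlayer instead of being streamed.
class OSLMp3 {
public:
    explicit OSLMp3(const char *filename) : filename(filename) {}

    // For now "loading" does nothing: mp3 is streamed, not decoded.
    void load() {}

    const std::string &getFileName() const { return filename; }

private:
    std::string filename;
};
```

This keeps the common OSLSound-style interface intact while deferring all the expensive work to the platform decoder.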

Below is the complete initialization method; explanations are in the comments.

Initializing AssetPlayer to work with compressed formats
 void OSLAssetPlayer::init(char * filename){
     SLresult result;

     AAsset* asset = AAssetManager_open(mgr, filename, AASSET_MODE_UNKNOWN);
     if (NULL == asset) {
         return;
     }

     // open a file descriptor for the asset
     off_t start, length;
     int fd = AAsset_openFileDescriptor(asset, &start, &length);
     assert(0 <= fd);
     AAsset_close(asset);

     // configure the audio source
     SLDataLocator_AndroidFD loc_fd = {SL_DATALOCATOR_ANDROIDFD, fd, start, length};
     SLDataFormat_MIME format_mime = {SL_DATAFORMAT_MIME, NULL, SL_CONTAINERTYPE_UNSPECIFIED};
     SLDataSource audioSrc = {&loc_fd, &format_mime};

     // configure the audio sink
     SLDataLocator_OutputMix loc_outmix = {SL_DATALOCATOR_OUTPUTMIX, context->getOutputMixObject()};
     SLDataSink audioSnk = {&loc_outmix, NULL};

     // create the player with the requested interfaces
     const SLInterfaceID ids[3] = {SL_IID_SEEK, SL_IID_MUTESOLO, SL_IID_VOLUME};
     const SLboolean req[3] = {SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE};
     result = (*context->getEngine())->CreateAudioPlayer(context->getEngine(), &playerObj,
                                                         &audioSrc, &audioSnk, 3, ids, req);
     assert(SL_RESULT_SUCCESS == result);

     // realize the player
     result = (*playerObj)->Realize(playerObj, SL_BOOLEAN_FALSE);
     assert(SL_RESULT_SUCCESS == result);

     // get the play interface
     result = (*playerObj)->GetInterface(playerObj, SL_IID_PLAY, &player);
     assert(SL_RESULT_SUCCESS == result);

     // get the seek interface
     result = (*playerObj)->GetInterface(playerObj, SL_IID_SEEK, &fdPlayerSeek);
     assert(SL_RESULT_SUCCESS == result);

     // get the mute/solo interface
     result = (*playerObj)->GetInterface(playerObj, SL_IID_MUTESOLO, &fdPlayerMuteSolo);
     assert(SL_RESULT_SUCCESS == result);

     // get the volume interface
     result = (*playerObj)->GetInterface(playerObj, SL_IID_VOLUME, &fdPlayerVolume);
     assert(SL_RESULT_SUCCESS == result);

     // enable looping if the sound requires it
     result = (*fdPlayerSeek)->SetLoop(fdPlayerSeek,
                                       sound->isLooping() ? SL_BOOLEAN_TRUE : SL_BOOLEAN_FALSE,
                                       0, SL_TIME_UNKNOWN);
     assert(SL_RESULT_SUCCESS == result);
 }



Conclusion

OpenSL ES is quite easy to learn, and it offers plenty of capabilities (for example, you can record audio). It's a pity that cross-platform support is a problem: OpenAL is cross-platform, but it does not behave well on Android. OpenSL ES has its downsides too: odd callback behavior, not all specification features are supported, and so on. But overall, the ease of implementation and stable operation outweigh these disadvantages.

The source code is available on github.com

Additional information


Interesting reading on the topic:
  1. OpenSL ES, The Standard for Embedded Audio Acceleration, on the developer's site.
  2. The Khronos Group Inc. OpenSL ES Specification.
  3. Android NDK: Developing Android applications in C/C++.
  4. Ogg Vorbis.

Source: https://habr.com/ru/post/235795/

