SIP phone on STM32F7-Discovery

Hello.

Some time ago we wrote about how we managed to run a SIP phone on the STM32F4-Discovery (1 MB of ROM and 192 KB of RAM) based on Embox. That version was minimal: it connected two phones directly, without a server, and carried voice in one direction only. So we decided to build a more complete phone, with calls through a server and voice in both directions, while still fitting into as little memory as possible.

For the phone we chose simple_pjsua, an application from the PJSIP library. It is a minimal application that can register on a server and receive and answer calls. First, here is how to run it on the STM32F7-Discovery.

How to run

Configuring Embox

make confload-platform/pjsip/stm32f7cube

In the conf/mods.config file we set the desired SIP account.
```
 include platform.pjsip.cmd.simple_pjsua_imported( sip_domain="server", sip_user="username", sip_passwd="password") 
```
where server is the SIP server (for example, sip.linphone.org), and username and password are the credentials of the account.
Build Embox with the make command. Flashing the firmware is described on our wiki and in a separate article.

Run the command “simple_pjsua_imported” in the Embox console:

 00:00:12.870 pjsua_acc.c ....SIP outbound status for acc 0 is not active
 00:00:12.884 pjsua_acc.c ....sip:alexk2222@sip.linphone.org: registration success, status=200 (Registration succes
 00:00:12.911 pjsua_acc.c ....Keep-alive timer started for acc 0, destination:91.121.209.194:5060, interval:15s

Finally, plug speakers or headphones into the audio output and speak into the two small MEMS microphones next to the display. We call from Linux using the simple_pjsua or pjsua applications. Any other client, such as linphone, works as well.

All this is described on our wiki.

How we got here

First we had to choose a hardware platform. Since it was clear that the STM32F4-Discovery did not have enough memory, we chose the STM32F7-Discovery. It has 1 MB of flash and 256 KB of RAM (plus 64 KB of special fast memory, which we will also use). That is not much for calls through a server, but we decided to try to fit.

We split the task into several stages:

Run PJSIP on QEMU. It was convenient for debugging, plus we already had support for the AC97 codec there.
Voice recording and playback on QEMU and STM32.
Porting the simple_pjsua application from PJSIP. It allows you to register on the SIP server and call.
Deploy our own Asterisk-based server and test on it, then try external ones, such as sip.linphone.org.

Sound in Embox works through PortAudio, which PJSIP also uses. The first problems appeared on QEMU: WAV files played well at 44100 Hz, but at 8000 Hz something clearly went wrong. The cause turned out to be the sample rate setting: the hardware defaulted to 44100 Hz, and our software never changed it.

It is worth explaining briefly how sound playback works. The sound card is given a pointer to a region of memory from which to play, or into which to record, at a predetermined sample rate. When the buffer ends, an interrupt is generated and execution continues with the next buffer. The catch is that each buffer has to be filled in advance, while the previous one is still playing. We will run into this problem again on the STM32F7.
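The buffer scheme described above can be sketched in plain C. This is only an illustration, not the Embox driver: the `pingpong` structure, the function names and the buffer size are all hypothetical, and the "interrupt" is an ordinary function call. It shows the one constraint that matters later: if the producer has not refilled the idle buffer by the time the hardware asks for it, the listener gets silence (an underrun).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAMES_PER_BUF 160  /* hypothetical: 20 ms of mono audio at 8000 Hz */

/* Two buffers: the hardware plays one while software refills the other. */
struct pingpong {
    int16_t  buf[2][FRAMES_PER_BUF];
    int      playing;    /* index of the buffer the hardware is reading */
    int      ready[2];   /* 1 if the buffer holds fresh samples */
    unsigned underruns;  /* times a buffer was needed but not ready */
};

static void pingpong_init(struct pingpong *pp) {
    memset(pp, 0, sizeof(*pp));
}

/* Called by the producer to refill the idle buffer.
 * Returns 0 on success, -1 if the idle buffer is still waiting to be played. */
static int pingpong_fill(struct pingpong *pp, const int16_t *src) {
    int idle = 1 - pp->playing;
    if (pp->ready[idle])
        return -1;
    memcpy(pp->buf[idle], src, sizeof(pp->buf[idle]));
    pp->ready[idle] = 1;
    return 0;
}

/* Stand-in for the "buffer finished" interrupt handler: swap buffers and
 * return the one the hardware should play next. */
static const int16_t *pingpong_on_buffer_done(struct pingpong *pp) {
    int next = 1 - pp->playing;
    pp->ready[pp->playing] = 0;  /* the finished buffer may be refilled */
    if (!pp->ready[next]) {
        pp->underruns++;         /* the producer was too slow */
        memset(pp->buf[next], 0, sizeof(pp->buf[next]));  /* play silence */
    }
    pp->playing = next;
    return pp->buf[next];
}
```

In this sketch the producer has exactly one buffer period to call `pingpong_fill` between two interrupts; a growing `underruns` counter would be the symptom of software that cannot keep up.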

Next, we rented a server and deployed Asterisk on it. Since there was a lot to debug, and I did not want to keep talking into the microphone, I had to set up automatic playback and recording. To do this, we patched simple_pjsua so that files could be substituted for audio devices. In PJSIP this is quite simple, since it has the concept of a port, which can be either a device or a file, and these ports can be flexibly connected to other ports. You can see the code in our pjsip repository. The resulting scheme was as follows. On the Asterisk server I created two accounts, one for Linux and one for Embox. Then the simple_pjsua_imported command is run on Embox, Embox registers on the server, and we call Embox from Linux. At connection time we check on the Asterisk server that the call is fully established, and after a while we should hear the sound from Linux in Embox, while on Linux we save to a file the audio being played from Embox.

After it worked on QEMU, we moved on to porting to the STM32F7-Discovery. The first problem was that the image did not fit into 1 MB of ROM without the “-Os” compiler optimization for size, so we enabled “-Os”. We also patched out C++ support, since it is needed only for pjsua and we use simple_pjsua.

Once simple_pjsua fit, we decided there was now a chance to run it. But first we had to deal with voice recording and playback. The question was where to record. We chose external memory: SDRAM (128 MB). You can try it yourself:

This creates a stereo WAV at 16000 Hz with a duration of 10 seconds:

 record -r 16000 -c 2 -d 10000 -m C0000000

Play it back:

 play -m C0000000

There were two problems. The first was with the WM8994 codec. It has the concept of slots, and there are four of them. By default, if this is not configured, playback goes to all four slots. As a result, at 16000 Hz we got 8000 Hz, and at 8000 Hz playback did not work at all. Once only slots 0 and 2 were selected, it worked as it should. The second problem was the audio interface in STM32Cube, in which the audio output works through SAI (Serial Audio Interface) synchronously with the audio input (we did not dig into the details, but it turns out that they share a common clock, and when the audio output is initialized the audio input is somehow tied to it). So they cannot be started separately, and we did the following: audio input and audio output always run, interrupts included. When nothing is being played, we simply feed an empty buffer to the audio output, and when playback starts, we begin filling it with real data.

Next, we found that recorded voice was very quiet. The reason is that the MEMS microphones on the STM32F7-Discovery do not work well at sample rates below 16000 Hz. So we always set 16000 Hz, even when 8000 Hz is requested. This did require adding software conversion from one sample rate to the other.
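The article does not say which conversion algorithm was used, so the following is only a minimal sketch of what a 2x conversion between 8000 Hz and 16000 Hz could look like. The function names are hypothetical. Upsampling uses linear interpolation, and downsampling averages pairs of samples, which is a very crude low-pass filter; a production resampler would use a proper filter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Upsample by 2 with linear interpolation: 8000 Hz in, 16000 Hz out.
 * out must have room for 2 * n samples. Returns the number of samples written. */
static size_t upsample_x2(const int16_t *in, size_t n, int16_t *out) {
    for (size_t i = 0; i < n; i++) {
        int32_t cur  = in[i];
        int32_t next = (i + 1 < n) ? in[i + 1] : cur;  /* hold the last sample */
        out[2 * i]     = (int16_t)cur;
        out[2 * i + 1] = (int16_t)((cur + next) / 2);
    }
    return 2 * n;
}

/* Downsample by 2 by averaging sample pairs: 16000 Hz in, 8000 Hz out.
 * n must be even. Returns the number of samples written. */
static size_t downsample_x2(const int16_t *in, size_t n, int16_t *out) {
    assert(n % 2 == 0);
    for (size_t i = 0; i < n / 2; i++)
        out[i] = (int16_t)(((int32_t)in[2 * i] + in[2 * i + 1]) / 2);
    return n / 2;
}
```

With this layout, audio captured from the microphones at 16000 Hz would pass through `downsample_x2` before reaching an 8000 Hz stream, and 8000 Hz audio headed for the codec would pass through `upsample_x2`.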

Then we had to increase the size of the heap, which is located in RAM. According to our calculations, PJSIP required about 190 KB, and we had only about 100 KB left. So we had to use some external memory: SDRAM (about 128 KB).

After all these changes, I saw the first packets between Linux and Embox, and I heard sound! But the sound was terrible, nothing like on QEMU, and nothing was intelligible. So we started thinking about what the cause could be. Debugging showed that Embox simply could not fill and drain the audio buffers in time. While PJSIP processed one frame, two buffer-completion interrupts had already fired, which is too many. The first idea for speeding things up was compiler optimization, but it was already enabled in PJSIP. The second was hardware floating point, which we wrote about in a separate article. But in practice the FPU did not give a significant speedup. The next step was thread prioritization. Embox has several scheduling strategies, so I enabled the one that supports priorities and gave the audio threads the highest possible priority. That did not help either.

The next idea came from the fact that we work with external memory, and it would be good to move the most frequently accessed structures out of it. I did a preliminary analysis of when and for what simple_pjsua allocates memory. It turned out that of the 190 KB, the first 90 KB are allocated for PJSIP's internal needs and are not accessed very often. Then, on an incoming call, the pjsua_call_answer function is called, and it allocates the buffers for incoming and outgoing frames. That was about 100 KB more. So we did the following. Before a call, data is allocated in external memory. As soon as a call comes in, we replace the heap with another one, located in RAM. This way all the “hot” data ended up in faster and more predictable memory.
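The heap swap can be illustrated with a toy allocator. This is a sketch under stated assumptions, not the Embox implementation: the `arena` type, the function names and the region sizes are hypothetical, and both regions are ordinary static arrays here, whereas on the board the first would sit in SDRAM and the second in internal RAM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A trivial bump allocator standing in for a real heap. */
struct arena {
    uint8_t *base;
    size_t   size;
    size_t   used;
};

/* Hypothetical sizes, loosely following the 90 KB + 100 KB split in the text. */
static uint8_t slow_mem[96 * 1024];   /* stands in for external SDRAM */
static uint8_t fast_mem[100 * 1024];  /* stands in for internal RAM */

static struct arena slow_heap = { slow_mem, sizeof(slow_mem), 0 };
static struct arena fast_heap = { fast_mem, sizeof(fast_mem), 0 };

/* All allocations go to whichever arena is currently active. */
static struct arena *active_heap = &slow_heap;

static void *heap_alloc(size_t n) {
    struct arena *a = active_heap;
    n = (n + 7u) & ~(size_t)7u;  /* round the size up to a multiple of 8 */
    if (n > a->size - a->used)
        return NULL;
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

/* Called when a call is answered: later allocations land in fast memory. */
static void heap_switch_to_fast(void) {
    active_heap = &fast_heap;
}

/* Helper for checking which region a pointer belongs to. */
static int in_arena(const struct arena *a, const void *p) {
    const uint8_t *q = p;
    return q >= a->base && q < a->base + a->size;
}
```

Everything allocated before `heap_switch_to_fast()` stays in the slow region, while the frame buffers allocated while answering the call come from the fast one. The sketch has no `free`, so it sidesteps the question of returning each block to the heap it came from, which a real implementation has to handle.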

In the end, all of this together allowed us to run simple_pjsua and make calls through our own server, and then through other servers such as sip.linphone.org.

Conclusions

As a result, we managed to run simple_pjsua with voice in both directions through a server. The problem of the extra 128 KB of SDRAM can be solved by using a slightly more powerful Cortex-M7 (for example, the STM32F769NI with 512 KB of RAM), but we have not given up hope of fitting into 256 KB :) We will be glad if someone finds this interesting, and even more so if someone tries it. All sources, as usual, are in our repository.

Source: https://habr.com/ru/post/431134/
