How we developed a technology for detecting nearby devices

This story began with the “Near” feature in one of our mobile apps. We wanted users to be able to quickly create a group chat or add nearby users as friends. We tried to solve this problem using geolocation, Bluetooth, Wi-Fi and ultrasound, but each of these methods had flaws that were critical in our case.

As a result, we came up with a new approach. It is based on matching ambient noise: if devices hear the same thing, they are most likely nearby.
In this article we explain how it works and also look at the advantages and disadvantages of other common methods of detecting nearby devices.

Interaction between devices nearby

People who are next to each other often want to exchange files, add a new acquaintance as a friend, play a game together, transfer money, share an account, or perform other joint actions. Applications built for these scenarios become more convenient when they let the user easily interact with the people or devices around them.

For example, Petrov has just met Ivanov and they are trying to become friends on Facebook. After several unsuccessful attempts to find each other, they are likely to close Facebook, exchange phone numbers and communicate via WhatsApp instead.

By the way, VKontakte has addressed this: its mobile applications for iOS and Android have a “People Nearby” feature, which lets you find other users by geolocation. I will discuss the drawbacks of this method a bit later.


Search for a new friend in FB	Search for a new friend in VK

For the feature to be really convenient for the user, it should work:

On any smartphone.
Anytime and anywhere: on the street, in transport, in the office, etc.
Cross-platform: at least on iOS and Android, and ideally in the browser too.
Reliably: it should identify only devices that are really nearby.
Quickly: a device must be found in less than 10 seconds.
Simply: no additional user action is required.

Ambient noise

Wherever you are (in the office, in transport, in a cafe, on the street, at a meeting or a concert), there is ambient noise: people's voices, music, engine noise, the sound of wheels, the clatter of keys, and so on.

A short sample of natural ambient noise along with the exact time it was recorded is, in most cases, unique to any location on Earth. The coincidence of ambient noise and time means that the recorders are nearby. This is the basis on which the technology works.

Scheme of work

Each device captures sound from the microphone in real time and transforms it into a compact fingerprint using a perceptual hash function. A feature of perceptual hash functions is that small differences in the source data produce small differences in the resulting hash.
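To make this property concrete, here is a toy sketch in Python. It is not the algorithm described in this article: the function names, the frame size and the "did the energy rise?" rule are all assumptions chosen only to show that a small change in the input sound leaves the hash almost unchanged.

```python
import math
import random


def toy_fingerprint(samples, frame_size=256):
    """Return one bit per pair of adjacent frames: 1 if the energy rose, else 0.

    Hypothetical toy hash for illustration only, not the article's algorithm.
    """
    energies = [
        sum(s * s for s in samples[start:start + frame_size])
        for start in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    return [1 if cur > prev else 0 for prev, cur in zip(energies, energies[1:])]


def hamming(bits_a, bits_b):
    """Count the positions where two equal-length bit lists differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


# Synthetic "ambient sound": a tone whose loudness changes from frame to frame.
loudness = [1, 2, 1, 3, 2, 5, 1, 4, 2]
clean = [a * math.sin(2 * math.pi * 8 * i / 256) for a in loudness for i in range(256)]

# A second "device" hears the same sound plus a little microphone noise.
rng = random.Random(0)
noisy = [s + rng.uniform(-0.01, 0.01) for s in clean]

clean_fp = toy_fingerprint(clean)
noisy_fp = toy_fingerprint(noisy)
print(clean_fp, hamming(clean_fp, noisy_fp))
```

A cryptographic hash would behave the opposite way: the slightest noise would change about half of the output bits, which is why it cannot be used for this kind of matching.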

The fingerprint, together with an exact timestamp, is sent to the server. By comparing it with the fingerprints of other devices made at the same point in time, the server can determine how similar the original sounds are. If the similarity score is above a certain threshold, the devices receive each other's identifiers for subsequent interaction.
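The server-side comparison can be sketched roughly as follows. This is a minimal illustration under assumptions, not the actual service: fingerprints are treated as equal-length byte strings, similarity is the fraction of matching bits, and the 0.8 threshold and the function names are hypothetical.

```python
def similarity(fp_a: bytes, fp_b: bytes) -> float:
    """Fraction of matching bits between two equal-length fingerprints."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    if not fp_a:
        return 0.0
    differing_bits = sum(bin(a ^ b).count("1") for a, b in zip(fp_a, fp_b))
    return 1.0 - differing_bits / (8 * len(fp_a))


def find_nearby(device_id, fingerprints, threshold=0.8):
    """Return the devices whose fingerprint is similar enough to device_id's.

    fingerprints maps device id -> fingerprint for ONE time slot, so the
    "same point in time" condition is assumed to be handled by the caller.
    The threshold value is an assumption made for this sketch.
    """
    own = fingerprints[device_id]
    return [
        other
        for other, fp in fingerprints.items()
        if other != device_id and similarity(own, fp) >= threshold
    ]


# Three devices in the same time slot: "b" differs from "a" in a single bit,
# "c" recorded a completely different sound.
slot = {
    "a": bytes([0b10101010] * 4),
    "b": bytes([0b10101010] * 3 + [0b10101011]),
    "c": bytes([0b01010101] * 4),
}
print(find_nearby("a", slot))
```

Grouping fingerprints by time slot first keeps the comparison cheap: a device is compared only with devices that reported at the same moment, not with every fingerprint ever received.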

Example and comparison of fingerprints from two different devices

We had to make sure that this principle works: that it finds matches in sound recorded by different devices several meters apart, and that clearly different sounds do not match. To do this, we manually collected hundreds of hours of sound recorded simultaneously on several devices in many different places.

Using this data, we tried many fingerprint generation algorithms and comparison parameters to achieve the best result. In the end, a 6-second fingerprint detects a device at a distance of up to 5 meters in 96% of cases, with a false positive possible in 0.0039% of cases.

We developed libraries for iOS and Android that hide the entire implementation behind a simple API, and embedded them in our applications.

The disadvantage of this approach is that it does not work in absolute silence. One silence is very similar to any other, so the algorithm intentionally ignores it in order to eliminate false positives. It is worth noting that absolute silence is extremely rare in real conditions. The tapping of keyboard keys or the sound of footsteps is enough for the devices to detect each other.

Sometimes it looks funny: users wait silently for about 10 seconds for detection, after which one of them says something like “It does not work!”. This phrase works like a spell, and a second later the devices detect each other.

One advantage of this approach is that it is cross-platform. The JS version of the library works in Chrome, Safari, Firefox and Edge, including their mobile versions.

Other ways

In our application, the “Near” feature is one of the key ones. We tried to implement it with various existing methods, but ran into limitations and problems that were critical for us.
Let's take a closer look at the alternatives.

Geolocation

This is the most obvious way to solve the problem. When the user opens the “Nearby” section, we get their current location and search the server for the closest users.

If you picture each location as the center of a circle and the coordinate error as its radius, then two users can be represented as follows:

If the distance between devices (d) is less than the sum of the errors (r1 + r2), then there is a probability (P) that users are nearby.
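The check d < r1 + r2 can be sketched in a few lines of Python. This is only an illustration: the function names and the (latitude, longitude, error radius) tuple layout are assumptions, and the distance is computed with the standard haversine formula on a spherical Earth.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def may_be_nearby(fix_a, fix_b):
    """True if the error circles of two fixes overlap, i.e. d < r1 + r2.

    Each fix is a hypothetical (latitude, longitude, error_radius_m) tuple.
    """
    d = distance_m(fix_a[0], fix_a[1], fix_b[0], fix_b[1])
    return d < fix_a[2] + fix_b[2]


# Two fixes about 111 m apart (0.001 degrees of latitude).
print(may_be_nearby((55.750, 37.61, 60), (55.751, 37.61, 60)))  # circles overlap
print(may_be_nearby((55.750, 37.61, 50), (55.751, 37.61, 50)))  # circles do not overlap
```

Note that an overlap only means the users may be nearby: with large error radii, two people hundreds of meters apart pass the same check.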

The search radius must not be less than the coordinate error. As it turned out, the real coordinates of the smartphone can lie outside the reported error radius; on Android, for example, this happens in 32% of cases. So even when they are nearby, users may still not “see” each other.

Coordinates obtained using GPS and GLONASS are accurate, but this method often does not work indoors, and the satellite search may take up to a minute. In addition, a GPS/GLONASS module is not present in all devices (hi, iPad Wi-Fi!) and can be disabled at the OS level (hi, Android!).

In fact, even outdoors, on a densely built-up street, GPS/GLONASS is often wrong because the signal reflects off buildings, and its accuracy can be worse than 100 meters:

Therefore, in most cases you have to use coordinates obtained by triangulating the signals of the surrounding Wi-Fi networks and cell towers. This method is fast and energy-efficient, but the accuracy is much lower: 100 - 1500 meters. In practice, the device often determines the wrong location within the city, and sometimes it can even “teleport” to another city.

We implemented this method and tested it in Moscow: in about 15% of cases the devices did not find each other because of incorrect coordinates. Errors are especially frequent inside the Moscow-City skyscrapers, in the subway and in ground transportation. Also, because of the low accuracy, “extra” users (who are not nearby) often show up in the results.

+ easy to implement
- low accuracy
- does not work well in transport (in motion)

Bump

The Bump team came up with an original way to increase the accuracy of geolocation-based search. Users need to bump their smartphones together; the accelerometer records the exact time of contact and sends it to the server along with the coordinates, and the algorithm searches for a pair only among devices with the same contact time. This simple idea reduces the probability of a false positive by orders of magnitude, which makes it possible to significantly increase the search radius.
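The pairing idea can be sketched as follows. This is not Bump's actual algorithm, which was never described here in detail: the function name, the 50 ms time tolerance, the 1000 m radius and the use of local planar coordinates in meters are all assumptions made for this illustration.

```python
import math
from itertools import combinations


def match_bumps(events, max_dt_ms=50, max_distance_m=1000):
    """Pair devices whose contact times nearly coincide and that are roughly co-located.

    events is a list of hypothetical (device_id, contact_time_ms, x_m, y_m)
    tuples, with positions already projected to a local plane in meters.
    """
    pairs = []
    for (id_a, t_a, x_a, y_a), (id_b, t_b, x_b, y_b) in combinations(events, 2):
        same_moment = abs(t_a - t_b) <= max_dt_ms
        close_enough = math.dist((x_a, y_a), (x_b, y_b)) <= max_distance_m
        if same_moment and close_enough:
            pairs.append((id_a, id_b))
    return pairs


events = [
    ("a", 1000, 0, 0),
    ("b", 1020, 300, 400),   # 20 ms later, 500 m away by (inaccurate) coordinates
    ("c", 5000, 10, 10),     # right next to "a", but bumped 4 seconds later
    ("d", 1010, 50000, 0),   # same moment, but 50 km away
]
print(match_bumps(events))
```

Because the time window is so narrow, the distance limit can be generous: in the example, "a" and "b" are paired even though their reported coordinates are 500 m apart.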

But in 2013 Google acquired the company, and in 2014 the project was shut down, even though the Bump SDK was built into many third-party applications and the Bump file sharing application had received hundreds of millions of downloads. The further fate of the technology is unknown.

The main disadvantage of the technology is that a single “bump” connects only one pair of devices. To bring a group of users together, you need to make a lot of “bumps”.

+ high accuracy
- devices have to be bumped together
- detects only one pair at a time
- the project is closed

Bluetooth, BLE and Wi-Fi

iOS and Android do not get along well over Bluetooth. Transferring data between these platforms is not a trivial task: Apple allows an application to connect only to certified (Made For iPhone) Bluetooth devices.

To let devices detect each other, the following method is used: iOS acts as a Bluetooth Low Energy peripheral, setting its token as the name of the BLE device. Android temporarily changes the smartphone's Bluetooth name to its token and turns on discovery mode. Then, to discover the devices around it, Android scans Bluetooth for Android devices and BLE for iOS devices. iOS scans only BLE to detect other iOS devices, since Bluetooth scanning is not possible through a public API. To detect Android devices, iOS receives via the cloud the identifiers of the surrounding Android devices that have discovered its BLE token.

In some cases, the surrounding Wi-Fi networks help to detect that devices are nearby: an iOS application can get the BSSID of the Wi-Fi access point the user is currently connected to, and an Android application can get the BSSIDs of all visible access points. If a match is found, the users are nearby.
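On the server, this Wi-Fi check reduces to a set membership test. A minimal sketch, assuming each platform has already reported its BSSIDs as strings; the function name and the case normalization are assumptions made for this example.

```python
def share_access_point(ios_connected_bssid, android_visible_bssids):
    """True if the access point the iOS device is connected to is visible to Android.

    BSSIDs are compared case-insensitively, because platforms may report the
    same MAC address in different letter case (an assumption of this sketch).
    """
    if ios_connected_bssid is None:  # the iOS device is not on Wi-Fi
        return False
    visible = {bssid.lower() for bssid in android_visible_bssids}
    return ios_connected_bssid.lower() in visible


android_scan = ["00:1A:2B:3C:4D:5E", "66:77:88:99:AA:BB"]
print(share_access_point("00:1a:2b:3c:4d:5e", android_scan))
print(share_access_point("de:ad:be:ef:00:01", android_scan))
```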

Implementing this method properly yourself is not easy, partly because of the many quirks of the BLE stack in different versions of Android and iOS. There are libraries that hide the complex implementation “under the hood”.

We tried Google Nearby. Finding an iOS - Android pair is slow: on average the search takes 20 seconds, and in some cases it lasts up to 40 seconds. This turned out to be the main blocker.

Another caveat is that Bluetooth is turned off on most smartphones, so each time they use the feature, iOS users have to give the right answer to the question “Allow an application to use Bluetooth?”.

Also, it is worth remembering that using Bluetooth (on Android) significantly affects battery consumption. Google warns that Google Nearby increases energy consumption by 2.5 - 3.5 times.

+ proof of proximity (guarantee that the devices are nearby)
- slow detection
- high energy consumption

Information sharing through sound

All smartphones have a speaker and a microphone. You can encode any identifier into sound on one device, play it through the speaker, decode it on the devices within earshot, and thus unite the devices into a group.

Example spectrogram of a Chirp.io signal

In the audible range, the signal mixes with voices, music and ambient noise, so to increase the likelihood of correct decoding you have to play the sound at maximum volume. FSK and PSK modulation are most commonly used; they produce a sound similar to a whistle or noise (depending on data density), which annoys many people (example of sound). This method is implemented in the Chirp.io project.

- does not work in noisy places
- annoys people around
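To show what FSK modulation means in practice, here is a minimal binary FSK sketch in Python. It is not Chirp.io's protocol: the sample rate, the two tone frequencies, the bit length and the function names are assumptions chosen so that each bit contains a whole number of cycles. Each bit becomes a short tone, and the receiver decides which of the two tones is stronger in each bit-long chunk.

```python
import math

SAMPLE_RATE = 44_100     # samples per second (assumed)
SAMPLES_PER_BIT = 441    # 10 ms per bit (assumed)
FREQ_ZERO_HZ = 1_200     # tone for bit 0 (assumed)
FREQ_ONE_HZ = 2_200      # tone for bit 1 (assumed)


def fsk_modulate(bits):
    """Turn a list of 0/1 bits into audio samples in the range [-1, 1]."""
    samples = []
    for bit in bits:
        freq = FREQ_ONE_HZ if bit else FREQ_ZERO_HZ
        samples.extend(
            math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
            for i in range(SAMPLES_PER_BIT)
        )
    return samples


def tone_power(chunk, freq):
    """Power of one frequency in a chunk (a single-bin discrete Fourier transform)."""
    re = sum(s * math.cos(2 * math.pi * freq * i / SAMPLE_RATE) for i, s in enumerate(chunk))
    im = sum(s * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i, s in enumerate(chunk))
    return re * re + im * im


def fsk_demodulate(samples):
    """Recover bits by checking which tone dominates each bit-long chunk."""
    bits = []
    for start in range(0, len(samples) - SAMPLES_PER_BIT + 1, SAMPLES_PER_BIT):
        chunk = samples[start:start + SAMPLES_PER_BIT]
        one_is_stronger = tone_power(chunk, FREQ_ONE_HZ) > tone_power(chunk, FREQ_ZERO_HZ)
        bits.append(1 if one_is_stronger else 0)
    return bits


message = [1, 0, 1, 1, 0, 0, 1, 0]
print(fsk_demodulate(fsk_modulate(message)) == message)
```

The sketch assumes the receiver knows exactly where the first bit starts and hears a clean signal. A real acoustic link also has to find the start of the message and survive noise and echo, which is where the limitations listed above come from.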

You can use the 18-20 kHz range: it is usually not noisy, and most adults will not hear the annoying sound. Unfortunately, some smartphones also pick it up poorly, reflection and interference become a problem, and the range of stable communication drops to 0.5 - 3 meters. This method is implemented in Google Nearby and Chirp.io, but has to be enabled separately.

- works at too short distances

Instead of conclusion

We have been testing the technology in our own application for over 2 years. During this time we have become convinced of its effectiveness and convenience in “combat” conditions. Very soon we want to give any developer the opportunity to quickly embed it in their own application.

I hope the article was informative and useful. If the topic turns out to be interesting, in the following articles I plan to talk in more detail about the algorithms for creating and comparing ambient sound fingerprints, as well as the difficulties we had to face.

I will be happy to answer your questions in the comments!

Source: https://habr.com/ru/post/347954/
