
A new approach to touch-free interface design

This is a translation of the original New Design Practices for Touch-free Interactions article.


Touch interfaces have virtually conquered developed markets, changing both user expectations and how UX specialists think about human-computer interaction (HCI). Now, following touch interfaces, technologies for touch-free gestural and natural language interaction (NLI) are beginning to enter the industry. Their spread promises changes throughout the UX field, from the heuristics that guide us to design patterns and deliverables.

Human-machine interfaces: awaiting change


Touch interfaces have made user interaction with computing devices more natural and intuitive. As touch technology spread, new interaction concepts began to emerge. Through the efforts of Microsoft and Apple, touch-free gestural and natural language (NLI) interfaces, respectively, have waited in the wings and are now finally entering the industry. If these technologies take root, we will be able to take the next step toward a natural user interface (NUI).

Touch-free gesture interfaces


This interaction model gained popularity thanks to the Kinect sensors Microsoft created for the Xbox gaming platform. These devices were later adapted for use with Windows-based computers and Samsung Smart TV devices. In this way, touch-free gesture interfaces stepped from computer games straight into everyday life.

Kinect for Windows ships with an interesting feature called Near Mode. In Near Mode, users can control a PC with touch-free gestures while seated at it, without having to stand up. Touch-free gestural interaction also makes it possible to reduce the number of interface elements in productivity applications, because objects displayed on screen can be treated almost like real physical objects. The technology also allows a computer to be used where touching it is undesirable for any reason, for example in the kitchen or in the operating room.

Natural language interaction


The idea of speaking directly to a computer is not new, but the success of Siri on the iPhone has finally brought this technology to the forefront of the industry. The main advantage of natural language interaction is that it imitates the communication style that each of us learns in early childhood.

Sophisticated natural language interfaces do more than make human-computer interaction natural: through speech, the user humanizes the computer and perceives it as a kind of social actor. This gives designers and content authors tremendous opportunities to build genuinely deep relationships with users.

Principles of human-machine interaction


As technology develops, there are ever more opportunities to improve interaction, but we must not forget that human capabilities are not limitless, and this applies here as well. As we step toward new interaction models, we are simultaneously building a body of knowledge that lets UX specialists take advantage of them. The principles of human-machine interaction form its theoretical basis.

According to the human-machine interaction model developed by Bill Verplank, user interaction with any system consists of three components, or human factors: perceiving information (the senses), processing information (cognition), and transmitting information (motor actions).

By paying attention to these three elements of the user experience during design, we can improve the system as a whole. These human factors can serve as a theoretical basis for predicting and evaluating new heuristics and interface design patterns.

Heuristics


Nobody is canceling the well-known heuristics. However, it is worth extending the list to make more effective use of touch-free gestural and natural language interaction. Here are several examples of such heuristics:
  1. Recognize imprecise gestures and accidental movements (efficiency of information transfer). Even on a screen surface the human hand cannot reproduce gestures with absolute precision, let alone gestures in open space. Do not expect high-precision gestures: otherwise the user will run into constant errors. Set reasonable accuracy limits and forgive the user minor mistakes (see the recognition sketch after this list).
  2. A system's "personality" must match its functionality (efficiency of information processing). When communicating with an inanimate object, people automatically attribute certain personal qualities to it, and the responses they receive help them form a more precise portrait of that "personality." You must agree that if productivity applications like Excel or Numbers communicated with us in a friendly manner, they would be much more pleasant to use. Drivers would trust a car navigation system more if it spoke in a confident, firm voice. And thanks to Siri, users now enjoy routine tasks like daily planning, all because Apple's developers programmed a sense of humor into it.
  3. Do not force the user to repeat gestures many times or to gesture for long stretches, unless that is the goal (efficiency of information transfer). Constant repetition of gestures and prolonged activity exhaust users: muscle tension grows, accuracy drops, and work efficiency suffers. Of course, this heuristic does not apply if the goal is to give users exercise.
  4. Gesture and voice commands should be appropriate in the context where the user resorts to them (efficiency of perception and transmission of information). Gesture and voice commands are perfectly visible and audible from the outside, and a user will never use them if they would look ridiculous. For example, if an application is meant for office use, sweeping movements and shouting strange commands would be rather inappropriate. In the context of a children's game, on the other hand, funny gestures and commands can be very helpful.
  5. Calls to action should be understandable, and the interaction itself logical and consistent (efficiency of information processing). According to research by Josh Clark and Dan Saffer, gestural control can be more effective than graphical or touch interface elements, but its techniques are less obvious to users. The same goes for voice control. Gestures and voice commands take human-machine interaction to a new level, and here the clarity of calls to action and the consistency of operations become more important than ever.
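
To make the first heuristic concrete, here is a minimal sketch, in Python, of a recognizer that forgives imprecise gestures by matching an observed hand path against stored templates within a tolerance threshold. Every name and number here is a hypothetical illustration, not part of any real Kinect API.

import math

def path_distance(path, template):
    # Mean point-to-point distance between a drawn path and a template,
    # both given as equal-length lists of (x, y, z) points.
    return sum(math.dist(p, t) for p, t in zip(path, template)) / len(template)

def recognize(path, templates, tolerance=0.15):
    # Return the name of the closest template, accepting the gesture even
    # when it deviates from the ideal shape, up to `tolerance`.
    best_name, best_score = None, float("inf")
    for name, template in templates.items():
        score = path_distance(path, template)
        if score < best_score:
            best_name, best_score = name, score
    # Forgive minor inaccuracy instead of rejecting the gesture outright.
    return best_name if best_score <= tolerance else None

The key design choice is the explicit tolerance: rather than demanding a perfect match, the system accepts the nearest plausible gesture and rejects only clear outliers.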


Patterns


With the widespread adoption of touch interfaces, gesture libraries have emerged to help designers working on touch interaction. Touch-free gesture interfaces, no longer limited to the two dimensions of a flat screen, let designers make effective use of a third dimension, depth, as well as body movements.

Add voice control to this and you get almost unlimited possibilities: for example, the user can control one element of the system through gestures while controlling another through voice commands.
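
As a sketch of how such multimodal control might be wired up (the element names here are invented for illustration), a simple router can send gesture events to one target and voice commands to another:

class MapView:
    def pan(self, dx, dy):
        print(f"map panned by ({dx}, {dy})")

class Playlist:
    def handle(self, command):
        print(f"playlist received voice command: {command!r}")

class MultimodalRouter:
    # Routes each input channel to its own on-screen element.
    def __init__(self, gesture_target, voice_target):
        self.gesture_target = gesture_target
        self.voice_target = voice_target

    def on_gesture(self, dx, dy):
        self.gesture_target.pan(dx, dy)

    def on_voice(self, command):
        self.voice_target.handle(command)

router = MultimodalRouter(MapView(), Playlist())
router.on_gesture(12, -4)        # a hand swipe moves the map...
router.on_voice("next track")    # ...while speech controls the music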

Gestures for Near Mode (direct input)


Initially the Kinect sensor recognized movements made with the whole body, but with the addition of Near Mode in the new version, its capabilities have expanded significantly. Below are examples of gestures that can be used while sitting at a PC, with illustrations from the Think Moto Gesture Library:
  1. Shift, stretch, and shrink: these are the basic touch-free control gestures, similar to the gestures used with touch interfaces.
  2. Push and pull: these gestures can be used to zoom objects on the screen in or out.
  3. Grab and release: since the stretch and shrink gestures described above are used for scaling, a grasping gesture can be used to grab objects on the screen. Having "grabbed" such an object, the user can manipulate it with secondary gestures (see the state-machine sketch after this list).
  4. Twist: a twist is an example of a secondary gesture. Because touch-free gesture interfaces add a third dimension to the usual two, a "grabbed" object can be rotated, changing its shape or position (for example, flipping a map over or rotating a cube).
  5. Throw: another secondary gesture. The user can "throw" an object on the screen to quickly move it out of the way. This gesture can be associated with deleting an object or with moving it in 3D space.
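
One way to think about the grab pattern is as a small state machine: secondary gestures like twist and throw apply only while an object is held. The sketch below is a hypothetical illustration, not code from any real gesture library.

class GestureController:
    def __init__(self):
        self.held_object = None

    def grab(self, obj):
        self.held_object = obj
        print(f"grabbed {obj}")

    def release(self):
        print(f"released {self.held_object}")
        self.held_object = None

    def twist(self, degrees):
        # Secondary gesture: only meaningful while something is held.
        if self.held_object is None:
            return  # ignore stray twists
        print(f"rotated {self.held_object} by {degrees} degrees")

    def throw(self):
        # Secondary gesture: delete the object or fling it away in 3D space.
        if self.held_object is None:
            return
        print(f"threw {self.held_object} away")
        self.held_object = None

controller = GestureController()
controller.grab("cube")
controller.twist(90)
controller.throw()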


Reading gestures (indirect input)


Besides recognizing deliberate control gestures, the Kinect for Windows sensor can read other user gestures that reveal fatigue or mood. For example, more active movements (say, more sweeping and sharp gestures) can be interpreted by the system as a sign that the user is agitated, and the system can adjust its behavior accordingly. For users of productivity applications, such behavior may signal frustration, and the system can try to help the user calm down.

Another indicator is gesture accuracy. Sluggish, imprecise gestures can be interpreted as a sign of fatigue, in which case the system might suggest taking a break from work. In addition, different sets of functions can be made available depending on whether the user is sitting at the PC or standing at full height (although this is not strictly an indirect signal).
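
A toy sketch of this kind of indirect input might classify the user's state from average gesture speed and accuracy; the thresholds below are invented for illustration, not taken from the Kinect SDK.

def infer_user_state(avg_speed, avg_accuracy):
    # avg_speed in meters per second; avg_accuracy in 0..1
    # (how closely movements match the intended gesture templates).
    if avg_speed > 1.5 and avg_accuracy < 0.6:
        return "agitated"   # sweeping, sharp, sloppy movements
    if avg_speed < 0.3 and avg_accuracy < 0.6:
        return "tired"      # slow, imprecise movements
    return "neutral"

state = infer_user_state(avg_speed=0.2, avg_accuracy=0.5)
if state == "tired":
    print("You have been working for a while. Time for a break?")
elif state == "agitated":
    print("Switching to a calmer, more forgiving response style.")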

Voice control


Because of the complexity of natural languages, creating patterns for NLI interfaces is harder than for gestures. Nevertheless, certain constructions inherent in natural languages can serve as the basis for such patterns.

To begin with, users provide voice input mainly in two ways: by asking questions (and receiving answers from the system) and by giving commands (which causes the system to perform an operation). Individual sentences can further be divided into phrases, each a separate semantic unit. A number of publications, including the book Speech Technology edited by Fang Chen and Kristiina Jokinen, cover what developers of natural language interfaces can learn from linguistics and communications.
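
As a toy illustration of the question/command split and phrase segmentation described above (real NLI systems use full linguistic parsers; the keyword rules here are purely illustrative):

import re

QUESTION_WORDS = {"what", "when", "where", "who", "why", "how",
                  "is", "are", "do", "does", "can"}

def classify_utterance(text):
    # Questions expect an answer; commands trigger an operation.
    first_word = text.lower().split()[0]
    return "question" if first_word in QUESTION_WORDS else "command"

def split_phrases(text):
    # Naive segmentation into semantic units on commas and conjunctions.
    return [p.strip() for p in re.split(r",| and then | and ", text) if p.strip()]

utterance = "remind me tomorrow at nine and then open my calendar"
print(classify_utterance(utterance))  # -> command
print(split_phrases(utterance))       # -> ['remind me tomorrow at nine', 'open my calendar']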

Deliverables


When introducing innovative interaction models, perhaps the hardest stage is communicating with stakeholders who have not yet had a chance to become familiar with the innovation. Visualizing things that do not yet exist is quite difficult, so UX designers will have to think carefully about how to convey the right information.

Specs



Interaction design specifications have traditionally needed only two dimensions. For some touch-free gestures, however, variables like "distance to the screen" or "movement along the Z axis" may be required, and these are visualized more effectively in 3D.

Speech interaction is even harder to specify. Now that interaction with the system literally becomes a dialogue, designers must account for many additional factors, such as the user's intonation, accent, or choice of words for the same command. A natural language interface should accommodate such variations as fully as possible.

In addition, many variables determine how the system delivers its answers to the user. Intonation, word choice, modulation, timbre: all these and many other factors influence how the user perceives the system.

Personas


Thanks to voice features, computers become members of the user's social circle, and the "personality" the user perceives in them is an extremely important aspect of design. Fortunately, we do not need to reinvent the wheel to address this.

UX specialists have long used personas to classify users. The same approach can be applied to computers endowed with speech, to define the type of personality the system models. Creating such personas will ease the work of everyone involved in building the voice interface: the copywriters who write the scripts, the developers coding the text-to-speech, and the voice actors.

The system can be programmed for empathy by teaching it to recognize changes in the user's speech: excitement, irritation, or anxiety. If the user is upset about something, the system, having recognized this, can switch from an authoritative persona (projecting reliability and inspiring confidence) to a caring, nurturing one (capable of reassuring the user).
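
A minimal sketch of such empathic persona switching, assuming the emotion itself has already been detected upstream (the persona attributes here are invented for illustration):

PERSONAS = {
    "authoritative": {"pitch": "low",  "rate": "steady", "wording": "direct"},
    "nurturing":     {"pitch": "warm", "rate": "slow",   "wording": "reassuring"},
}

def choose_persona(detected_emotion):
    # If the user sounds anxious or upset, shift to the caring persona.
    if detected_emotion in {"anxious", "irritated", "sad"}:
        return "nurturing"
    return "authoritative"

persona = choose_persona("anxious")
print(persona, PERSONAS[persona])  # -> nurturing {...}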

Prototyping


To implement recognition of users' body movements and speech patterns effectively, no detail can be missed, and information must flow clearly between stakeholders. Prototypes are becoming common for the same reason as specifications: seeing once is better than hearing a hundred times. The same applies to application testing and development.

At the moment there is no software for simulating Kinect touch-free gestural interaction; all one can do is download the SDK and build applications directly. For voice interfaces, however, there are several free tools, such as the CSLU Toolkit, that let developers quickly assemble a voice interface for modeling and testing.

In general, until prototyping tools become fast, flexible, and efficient enough, we are bound to stick to our roots and use proven techniques: paper, demo materials, storyboards, and the "Wizard of Oz."
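
The "Wizard of Oz" technique in particular needs no special tooling at all: the participant believes they are talking to the system, while a hidden operator types the responses. A minimal console harness might look like this (a generic sketch, not tied to any specific tool):

def wizard_of_oz_session():
    print("Voice assistant prototype. Say 'quit' to end the session.")
    while True:
        utterance = input("Participant says: ")
        if utterance.strip().lower() == "quit":
            break
        # The 'wizard' (a hidden human operator) plays the system's role.
        reply = input("[wizard] System replies: ")
        print(f"SYSTEM: {reply}")

if __name__ == "__main__":
    wizard_of_oz_session()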

Keeping the wheel rolling


Since the days of vacuum tubes and punch cards, computer user interfaces have undergone many changes, each bringing new capabilities and new challenges. Thanks to touch-free gestural and natural language interfaces, communication between human and computer is becoming much more efficient and ... more human. If UX specialists intend to make full use of the opportunities opening up before them, they need to keep moving in the same direction.

I believe we are all ready to adopt this new paradigm of human-computer interaction, one that will let us get closer to our users than ever before.

Posted by: Brian Pagán

Translator's note: this article is also interesting in light of the 10th anniversary of the release of the film Minority Report, which showed an intriguing concept of just such an interface.

Source: https://habr.com/ru/post/147475/

