
Pavel Guay, Android developer at KODE
Hi, my name is Pavel (pavelgvay). I work at KODE, a mobile application development studio in Kaliningrad. About a year ago I dived into developing applications for Google Assistant and got hooked on the interface design stage, which became a real creative outlet after all the lines of code.
Having developed a dozen projects, spoken at several conferences, met the developers of Google Assistant (which, by the way, will soon speak Russian), and shared experience with developers, studios and even the author of the book, I started thinking seriously about optimizing the process of designing and testing voice applications, which can now be built for Alice as well.
That thought gave me a motivational kick, sent me on a long journey through the existing tools and their shortcomings, and led to the expected conclusion. More on that at the end of the article; for now, back to the present.
For those who have not yet seen conversational interfaces from the inside, I'll explain what designing such an application involves.
A good conversational application differs from a chatbot in that it does not force the user to use specific commands: the user has a free dialogue with the service, much like talking to a real person. Voice and text come first, but if the device has a screen, the application can add visual accompaniment in the form of cards, carousels and lists to convey information better.
Take a pizza order as an example: imagine how many different phrases you can use to tell an application that you want pizza. The user can name a specific pizza, ask for advice on options with mushrooms and ham, ask to hear the entire menu and choose from it, or maybe just say that they are hungry.
All of these are possible plot developments. Here is what we have to consider: every single step on every possible path of each of the application's scenarios. That is a lot of ground to cover, and we still have not ordered the pizza!
Designing a conversational interface, regardless of the platform, goes through a standard set of steps. Detailed guidelines are available from the developers of Google Assistant, Amazon Alexa and Microsoft Cortana; I have also summarized them in a short checklist:
We build a dialogue tree: to account for every variant of the course of events and every step that leads the user to the hypothetical pizza order, all actions need to be visualized.
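To make the idea of a dialogue tree more concrete, here is a minimal sketch in Python. The step names and the structure of the map are invented for illustration and are not taken from a real project; the sketch also assumes the map has no loops. It stores the map as "step -> possible next steps" and lists every path a conversation can take.

```python
from typing import Dict, List

# Hypothetical dialogue map for the pizza example: step -> possible next steps.
DIALOGUE_TREE: Dict[str, List[str]] = {
    "greeting": ["names_pizza", "asks_for_advice", "asks_for_menu", "says_hungry"],
    "names_pizza": ["confirm_order"],
    "asks_for_advice": ["suggest_mushroom_ham"],
    "suggest_mushroom_ham": ["confirm_order"],
    "asks_for_menu": ["read_menu"],
    "read_menu": ["names_pizza"],
    "says_hungry": ["suggest_mushroom_ham"],
    "confirm_order": [],
}


def all_paths(tree: Dict[str, List[str]], start: str) -> List[List[str]]:
    """Return every path from `start` to a final step (assumes no loops)."""
    next_steps = tree.get(start, [])
    if not next_steps:
        # A step without successors ends the dialogue.
        return [[start]]
    paths = []
    for step in next_steps:
        for tail in all_paths(tree, step):
            paths.append([start] + tail)
    return paths


if __name__ == "__main__":
    for path in all_paths(DIALOGUE_TREE, "greeting"):
        print(" -> ".join(path))
```

Even this toy map yields four separate dialogues to write and test, which hints at how quickly the number grows in a real application.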
The root of all the problems of a conversational interface designer is the sheer mass of information: scenarios, the ways through them, dialogue trees, and steps, of which even a small application can have a hundred. All of this has to be stored somewhere, synthesized, verified, tested, handed over to development and shown to the customer, yet the guidelines from the voice assistant developers offer no recommendations on choosing a tool.
After designing my first applications, I boiled all my pain points down to a core set of problems:
The result of this constant struggle is not only stretched development time, but also a loss of quality due to inattention, fatigue and, of course, lost motivation.
A number of tools that are supposed to ease the process have already appeared online, but their functionality is quite limited.
So that my analysis and subjective criticism would not be unfounded, I followed the best traditions of scientific research: I took the same part of a real application I had worked on and tried to implement it with each of the tools.
I summarized the results in a table and rated each set of services on a 5-point scale against three main criteria:
Let's start with the “classic” approach: we build the dialogue map on a whiteboard, or rather on its digital counterpart, Realtimeboard. The character description and phrases are stored in Google Docs.

Before building a map you have to work out your own notation, which again costs time. While building the map, each step is drawn and aligned manually: it is slow, but the map turns out more readable.
Collecting materials for testing takes a lot of time. It goes like this: look at the map, take a phrase from the table, write it into the document. No flexibility, just routine and constant switching between tools.
The map is easy to edit: steps can be swapped, whole branches moved, and individual elements grouped. But the map has to be synchronized with the table of phrases manually, which again brings a nagging feeling of lost data.
Realtimeboard gets a “good” for clarity and for letting the designer adapt the working method flexibly. It gets a wag of the finger for the lengthy testing process and the manual synchronization of the phrase table with the map.
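The manual synchronization problem can be illustrated with a small consistency check. This is only a sketch under assumptions: it presumes the step names from the map and the phrase table have been exported into plain Python structures by hand, and the function name `find_sync_gaps` and the sample data are made up for the example. Neither Realtimeboard nor Google Docs is accessed here.

```python
from typing import Dict, Iterable, List, Set, Tuple


def find_sync_gaps(
    map_steps: Iterable[str], phrases: Dict[str, List[str]]
) -> Tuple[Set[str], Set[str]]:
    """Compare map steps with the phrase table.

    Returns (steps that have no phrases, phrase entries whose step
    no longer exists on the map).
    """
    steps = set(map_steps)
    # Only steps with at least one written phrase count as covered.
    with_phrases = {step for step, variants in phrases.items() if variants}
    missing_phrases = steps - with_phrases
    orphan_phrases = set(phrases) - steps
    return missing_phrases, orphan_phrases


if __name__ == "__main__":
    # Hypothetical data, copied by hand from the map and the table.
    steps_on_map = ["greeting", "read_menu", "confirm_order"]
    phrase_table = {
        "greeting": ["Hi! What pizza would you like?"],
        "confirm_order": [],
        "old_step": ["This step was deleted from the map."],
    }
    missing, orphans = find_sync_gaps(steps_on_map, phrase_table)
    print("Steps without phrases:", sorted(missing))
    print("Phrases without a step:", sorted(orphans))
```

A check like this does not remove the double bookkeeping, but it shows what "lost data" means in practice: steps that never got phrases and phrases left behind after a step was removed.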
The map and phrases live inside Sayspring; information about the character and persona stays in Google Docs.

The map is built step by step: there are designations for the user and the interface, and the map can be divided into scenarios. While building it you run into minor inconveniences, such as the need to save changes constantly. Also, the map is strictly linear: transitions are not displayed at all (the links and forks in the screenshot were added by hand).

The service lets you test scenarios by voice, but the text of the phrases is not shown, you cannot go back a couple of steps (you have to start over), and speech recognition is available for only three languages and works poorly. This mode is useless for testing: since the dialogue history cannot be viewed, you still have to collect the dialogues into a file.
Fortunately, collecting dialogues is easier here. At the click of a button, the tool shows you the possible dialogues. There are many problems and inconveniences (for example, you cannot combine two scenarios into one file, and you cannot download the file, only view it in the tool), but this already saves testing time.

Every phrase is attached to a specific logical step in the map, which removes the need to switch between tools and synchronize their state.
Changing the map is inconvenient: elements can be dragged only within one scenario, and grouping is not available.
Sayspring eliminates the routine of collecting test materials and of synchronizing the phrase table with the map, since phrases are attached to steps. Those are its only advantages.
The map is unattractive, and working with it is difficult and inconvenient. Voice testing works but is useless, since you cannot read the phrases or view the history, and exporting dialogues is limited.
The next tool differs in the format of its main screen: you build the dialogue first, and the map is drawn automatically. Phrases and the character description are stored in Google Docs.
Forks and connections between steps are clearly visible on the map. It is interactive: clicking a step opens the element for editing.
There is no division into scenarios, which leads to many repetitions and a huge, confusing flowchart.

Testing takes the form of a chat, which lets you get a feel for the phrases and see the history.
However, you cannot choose the steps: in effect we do not control the process but watch a video, which makes the mode useless.
Since the phrases and the map are stored separately, the synchronization problem remains. Editing the map is fairly convenient thanks to drag-and-drop, but you cannot select several elements and apply one action to all of them.
By the way, the service implements a so-called build mode: you can embed variables in phrases and access them through the API, so the tool can act as a content store. What exactly it would store is unclear, though, because only one version of each phrase can be specified.
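To show why a single version per phrase is limiting, here is a rough sketch of what a content store with variables and several variants per step could look like. It is not the service's API: the `PHRASES` table, the `render` function and the sample texts are all hypothetical, and the sketch relies only on Python's standard `string.Template` and `random` modules.

```python
import random
from string import Template
from typing import Dict, List

# Hypothetical content store: each step keeps several phrase variants,
# and each variant may contain $variables.
PHRASES: Dict[str, List[Template]] = {
    "confirm_order": [
        Template("Your $pizza pizza is on its way."),
        Template("Got it, one $pizza coming up."),
    ],
}


def render(step: str, rng: random.Random, **variables: str) -> str:
    """Pick one variant for the step and fill in its variables."""
    template = rng.choice(PHRASES[step])
    # substitute() raises KeyError if a variable is missing,
    # which surfaces content errors early.
    return template.substitute(**variables)


if __name__ == "__main__":
    print(render("confirm_order", random.Random(), pizza="Margherita"))
```

With several variants per step the assistant does not repeat itself word for word, which is exactly what a store limited to one phrase per step cannot provide.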
The tool was most likely created for rapid prototyping of simple applications rather than for full-fledged design. Testing does not work, so the problem of collecting materials remains open. Dialogues can be downloaded only in MP4, GIF or AVI format.
The next tool lets you build maps but does not specialize in designing conversational interfaces. The character description and phrases are stored in Google Docs.

The map can be divided into scenarios. It is built quickly thanks to convenient hotkeys, and the tool takes care of alignment for us.
The connections between steps are poorly implemented: the curves cannot be changed, and they are drawn on top of everything, greatly reducing the map's readability.
As with Realtimeboard, you have to work out a legend before building a map.
The tool offers nothing for collecting test materials; that problem is not solved at all.
Working with the map is convenient: elements can be selected and dragged. Since the phrases are stored separately, the synchronization problem remains.
Building a map is very convenient and the map itself is fairly readable, but the connections between steps are a problem. Testing and synchronization of the phrase table with the map are not addressed.
Clearly the study did not cover every available option (I welcome your suggestions in the comments), but from the services analyzed we can draw a clear conclusion: none of the tools is the Holy Grail. My personal temporary solution is a combination of Realtimeboard, Google Sheets and Google Docs.
However, I was not willing to accept the loss of time and energy on design, so I set myself the goal of developing my own tool, Tortu.
The development of its functionality depends directly on the opinion of interested developers. I have prepared a few questions that will help me find my bearings, and I would be grateful if you filled out the form. It will take no more than 5-7 minutes.
If you are interested in conversational interfaces and want to learn more about their design and development, or if you have any questions, join my Telegram chat dedicated to conversational interfaces, where a small community of developers and designers has already gathered.
Source: https://habr.com/ru/post/352136/