Planning usability testing. Part 2

Hi, Habr! This is Natalia Sprogis again, head of UX research at Mail.Ru Group. This is the second part of the article on planning usability testing. In the first part, I covered formulating goals and hypotheses, choosing a test method and data-recording tools, and organizational issues for the researcher. This part is devoted to compiling the test protocol (script): thinking through which tasks you will give the respondent, which questions you will ask, and which questionnaires you will ask them to fill in.

The protocol of any usability testing consists of the following parts:

Briefing (greeting, description of the session, signing of documents)
Introductory interview (checking the recruiting, a short interview about product usage, context and scenarios)
Work with the product (testing tasks)
Collecting final product impressions based on testing experience

Briefing

Regardless of what is being tested, any study begins the same way. You need to:

Create an atmosphere. Meet the respondent, offer tea, coffee or water, and show where the restroom is. Try to put the respondent at ease, because people are often nervous before the session. Ask whether it was easy to find you and how they are feeling.
Describe the process. Tell the respondent what to expect: how long the session will take, what parts it consists of, and what you will be doing. Be sure to stress that their contribution will help improve the product and that you are not testing their abilities. If you are recording video, warn the respondent and reassure them that the recording will not appear online. I say something like this:

“We are in the office of Mail.Ru Group. Today we will talk about the project XXX. It will take about an hour. First we will talk a little, then I will ask you to try doing a few things in the project itself, and then we will discuss your impressions. We will record video of what happens in the room and on the computer screen. The recording is needed solely for analysis; you will not see yourself on the Internet. We are conducting this research to make the project XXX better, to understand what needs to be fixed in it and in which direction it should develop. So please share any comments openly, both positive and negative. Do not be afraid to offend us. If something in the project does not work out for you, do not worry: it means we have found a problem that the project team needs to fix. Most importantly, remember: we are not testing you, you are testing the product. If you are ready, I suggest we start.”
Sign the documents. As a rule, this is a consent to the processing of personal data and sometimes a non-disclosure agreement about the test. For tests with minors, parental consent to the child's participation is required. We usually send it to the parents in advance and ask them to bring it along. Be sure to explain why you are asking for signatures, and allow time to read the documents. In Russia, people are wary of any papers that need to be signed.
Set up the equipment. If you are using eye tracking, biometric equipment, or simply recording video, now is the time to turn it all on. Tell the respondent when you start recording.

Introductory interview

It solves the following tasks:

Check the recruiting. Always start with this, just in case, even if you trust the agency or the person who found the respondent. More than once we have discovered during the test itself that the respondent had misunderstood the screening questions and in fact does not use the product quite the way we need. Try to avoid a formal tone and do not repeat the questions from the screening questionnaire: the person may already know how to answer them.
Scenarios and context of product use. Even if you have little time for the test, do not skip this item. Ask the respondent, at least in general terms, what tasks they solve with the product, whether they use similar products, in what conditions they interact with them and on which devices. Knowing how the respondent uses the product will help you better understand the reasons for their behavior and, if you use a flexible script, formulate suitable tasks. If there is enough time, ask the respondent to show what they usually do and how. This is a source of further questions and insights.
Expectations and attitudes. The start of the test is a good time to find out what the respondent knows about the product, how they feel about it, and what they expect from it. After the test, you will be able to compare these expectations with the final impression.

This structure of the introductory interview suits most tests. However, if you are testing a new product, you may want to skip the introductory questions. Discussing the topic in too much detail can shape the user's expectations of the product. In that case, keep only a couple of the most general questions to establish contact with the respondent and move straight on to the tasks. Scenarios, attitudes and context are better discussed after the user has first explored the product.

Working with the product: writing tasks

Types of tasks

Let's say you want to test an online store. You have important scenarios (searching for and choosing a product, the ordering process), known problems (frequent errors in the payment form), and even a hypothesis that the designer got something wrong with the price filter. How do you formulate the tasks?

Focused tasks. The obvious choice seems to be something like this:

“Choose a dishwasher 45 centimeters wide with the “beam on the floor” function that costs no more than 30 thousand rubles.”

This motivates the respondent to use filters and compare products with each other. You can check the price filter on every respondent and observe the key product selection scenario. Such tasks are quite true to life and are good for testing specific hypotheses (as with the price filter). However, if the entire test consists of them, you run the following risks:

Spot-checking the interface. You will only find problems related to the details of the task (the price and width filters). You will not see other problems, for example with sorting or with other filters, unless you mention them explicitly. And you can hardly write tasks for every element of the site.
Lack of engagement. Users often perform such tasks mechanically: having seen the first product that fits the criteria, they stop. Perhaps the respondent has never chosen a dishwasher in their life and does not care what a “beam on the floor” is. The more a task resembles a real-life situation and the clearer its context is to the user, the higher the chances of engaging the respondent, who then imagines actually choosing a product. An engaged user “lives through” the interface more fully and leaves more comments, so the chances of finding problems and learning something useful about the behavior and characteristics of the audience increase.
Narrowed range of insights. In real life, the user might choose the product in a completely different way: for example, without using filters at all (whereas you pointed to them), or by criteria that the site does not offer. With rigid, focused tasks you will not learn about the real context of product use, will not discover scenarios the project team may not have foreseen, and will not collect data on content and functionality needs.

Tasks with context. One way to engage users better is to add a real story and context to a dry task. For example, instead of “Find a recipe for plum pie” on a recipe site, suggest the following:

“Guests are coming in an hour. Find something you can bake in that time. You have everything for a sponge cake in the fridge, as well as a few plums. But, unfortunately, there is no butter.”

A similar approach can be used with an online store. For example: “Imagine that you are choosing a gift for your sister. Her hair dryer recently broke, and she would be happy to get a new one. Your budget is 7 thousand rubles.” It is important that the respondent picks a real person for whom the gift will be “bought” (if they have no sister, suggest another relative or a friend). The key to such tasks is a realistic and clear context. It is easy to imagine choosing a gift for a family member; it is much harder to imagine that you are an “accountant preparing the annual report”.

A striking example of this approach is the “Bollywood method”, devised by the Indian UX expert Apala Lahiri Chavan. She argues that Indians, like many Asians, find it difficult to express their opinion of an interface openly. But when they imagine themselves as heroes of fictional dramatic situations (as in their favorite films), they open up and start taking a lively part in the test. So tasks for Indian respondents should look something like this:

“Imagine that your beloved young niece is about to get married. And then you learn that the groom-to-be is a fraud who is already married! You urgently need to buy two tickets for a flight to Bangalore for yourself and the cheater's wife in order to stop the wedding and save the family from shame. Hurry up!”

Tasks based on the respondents' experience. A reminder: for testing to succeed, respondents must match the project's audience. So to test an online store of household appliances, we recruit people who have recently chosen an appliance or are choosing one now. This is what we rely on when writing tasks based on the respondents' experience. There are two ways to use this approach:

Respondents' parameters. Here you adapt fixed tasks to each respondent. For example, for a home appliance store and a task about working with filters, ask the person what they recently purchased. You find out the criteria (price, features, etc.) and ask them to repeat the “purchase” on your site.
Respondents' scenarios. Tasks are formed entirely from the participants' experience. To understand which scenarios to check, the moderator finds out exactly how the person solved the problem in real life and asks them to do the same on the site. For example, the buyer spent a long time comparing several models before choosing. Even if the site has no suitable feature, ask the respondent to compare products in order to understand which parameters they rely on. You may get ideas about what a comparison feature should look like and how to adapt the product page to this scenario.

Such tasks provide many real-life examples of the basic operations in the product. This often reveals a much wider range of problems and findings. It also lets you check the product against new scenarios that you did not consider central or had not thought through at all. For example, when we tested the Real Estate Mail.Ru project, it was the tasks based on the respondents' experience that led us to many discoveries. We saw that when searching for an apartment in the Moscow region, people enter terminal metro stations in the geofilter, meaning stations that can be reached from the area. We, on the other hand, had assumed that a metro filter means searching for an apartment near the station. We also learned how the scenarios of searching for new buildings and for resale housing differ, which later helped us move the search for new buildings into a separate section of the site, with its own filters and its own way of describing apartments.

I also recommend the excellent article by Jared Spool about the benefits of such tasks.

Tasks without tasks. Sometimes it is better not to give users any tasks at all, but to watch how they get to know the product on their own. Give the respondent an introduction: “Imagine that you have decided to try this product. I will leave you for a few minutes. Do what you would do in real life. I am not giving you any tasks.”

It is important that the moderator leaves the room for this time. Otherwise the user is tempted to ask or clarify something right away: “Do I have to register? How do I do that?” and so on.

This type of task is useful for completely new products. We often use it for mobile applications and games. This way we see whether users read the tutorial materials, which details catch their attention first, what people understand about the product concept, and how they describe it afterwards. Specific scenarios are given only after the free task.

Another area where free tasks apply is content projects. If you want to understand how your articles are read, where readers linger, what they skip, which elements of the page they notice, and so on, just leave the respondent alone with the project for a few minutes. Only without a moderator looking over their shoulder will the user relax and read the text the way they usually do in real life. This is how we test Mail.Ru News, Lady Mail.Ru and other projects. The approach allowed us to identify different behaviors on the site and different patterns of reading articles, and to understand which types of materials should be designed differently.

Writing good tasks

The first task is simple. Start the test with introductory, uncomplicated tasks. The respondent needs to get used to the test format, especially if you use the “think aloud” method: during the first task a person gets accustomed to voicing their thoughts and feelings. Do not throw all the pain and suffering of the interface at them right away.

Do not prompt. Word the tasks so that they do not suggest the correct actions to the respondent. For example, if you want to test adding products to favorites in an online store, avoid the task “Add this TV to your favorites,” especially if that is what the button is called. Having read the task, the respondent will simply find the button with that label on the screen, perhaps without even understanding what they are doing. It is better to explain the meaning of the task without resorting to terms from the interface. For example: “The site lets you save products you like and later choose which of them to order. Let's try to do that with this TV.”

Watch your terminology. Do not use words and abbreviations the respondent may not understand. This seems obvious, but once we get used to certain terms, we often forget that few people outside the IT crowd know them. For example, testing the new threads feature (chains of related emails) in Mail.Ru Mail was quite a challenge for us: users unfamiliar with such functionality simply have no word in their heads for threads. In the end, we did not name them at all. We simply showed respondents a mailbox with emails grouped into chains and discussed the new feature. During the discussion, we let users pick their own word for threads. This later helped us use the clearest possible wording in onboarding and promotional materials. Watch not only the tasks but also the moderator's questions, especially those that come from the team during testing. You should not, for example, use the word “toolbar” when discussing functions: not everyone knows it. A few years ago, not all users even knew the word “browser”. How best to word the tasks depends on the test audience. Do not go to the other extreme and explain every term: experienced gamers, for example, do not need to be told what a “buff”, a “frag” or a “respawn” is.

Less test data. It is often very tempting to create a test account for the respondent and run the test on it. After all, you can try everything out in that account in advance, avoid mishaps, and save the time needed for the respondent to register or log in. It is also often technically much easier to enable a new design on test data than on real data. However, with this approach you risk getting far less useful results. Test actions have no real consequences. The situation becomes completely artificial, and users find it hard to relate it to their real experience. For example, when working in their own social network account, respondents will be as careful as in real life with everything their friends can see (publishing links, sending messages). When working in their own mailbox, they will try not to delete important emails. When testing online stores, an approach is sometimes used in which the respondent must spend their participation reward right during the test. In that case, the respondent will not click on the first product that fits the task but will pick what they really need.

In addition, with test data alone you will find only the problems related to that data and will not check how the functionality behaves across different variations. For example, when we tested the social panel of the Amigo browser, one respondent connected his VKontakte account to the panel and immediately noted that reading his feed this way was inconvenient. Almost his entire feed consisted of posts from groups with erotic pictures, and in the narrow panel there was simply nothing to see in them.

Another problem with test data is that the system is harder to understand because everything in it is unfamiliar. For example, a social network user is used to recognizing their page by their own photo. Even when testing prototypes, we try to personalize them as much as possible. For example, when testing clickable prototypes in Odnoklassniki, we always adapt them for each user, inserting their name and photo into the page, and sometimes their latest news.

Do not limit yourself to the interface. Interaction with a product often goes beyond a single interface. If possible, test related products or services and the connections between them. For example, when testing games, we try to check not only the game itself but also its website, downloading and registration, and searching for information on the forum. And when testing one online store, I also checked the operator's call after the order was placed, which yielded recommendations for the call center.

Think about timing. For a good script, it is important to prioritize tasks. Most likely, if the system is large and the test has many goals, you will want to include a lot of tasks. However, a tired respondent will no longer be useful. A good test lasts no more than an hour and a half; two hours is the maximum. Games are the exception. And remember that the session includes not only tasks but also interviews, questionnaires, equipment setup and signing documents. All of this usually takes at least half an hour. If there are too many tasks and you do not want to give any of them up, you can put the lowest-priority ones into rotation, that is, show each of them to only some of the respondents. Or make part of the test mandatory for everyone and give the rest only to those who have enough time. Keep in mind, though, that those will most likely be the most successful respondents.
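The rotation of low-priority tasks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name plan_sessions, the task names, and the minute estimates are all hypothetical, and the 90-minute session with 30 minutes of overhead simply mirrors the figures mentioned in the text.

```python
def plan_sessions(mandatory, rotated, respondents, session_min=90, overhead_min=30):
    """Build a task list per respondent.

    mandatory / rotated: lists of (task_name, estimated_minutes) tuples.
    session_min and overhead_min are illustrative defaults, not fixed rules.
    """
    budget = session_min - overhead_min  # minutes left for the tasks themselves
    core_time = sum(minutes for _, minutes in mandatory)
    if core_time > budget:
        raise ValueError("mandatory tasks alone do not fit into the session")

    plans = {}
    for i, respondent in enumerate(respondents):
        tasks = [name for name, _ in mandatory]
        left = budget - core_time
        # Each respondent starts at a different rotated task, so every
        # low-priority task is seen by roughly the same number of people.
        for k in range(len(rotated)):
            name, minutes = rotated[(i + k) % len(rotated)]
            if minutes <= left:
                tasks.append(name)
                left -= minutes
        plans[respondent] = tasks
    return plans


# Hypothetical example: two mandatory tasks and three rotated ones.
mandatory = [("find a product", 15), ("place an order", 20)]
rotated = [("favorites", 10), ("compare", 10), ("reviews", 10)]
plans = plan_sessions(mandatory, rotated, ["R1", "R2", "R3"])
for respondent, tasks in plans.items():
    print(respondent, tasks)
```

With these made-up numbers, every respondent gets both mandatory tasks plus two of the three rotated ones, and each rotated task is covered by two respondents. Real task durations vary a lot between people, so treat the estimates as a planning aid rather than a schedule.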

Evaluate the usefulness of each task. Consider whether it really tests your hypotheses. For example, you want to check the news subscription feature on a site. The task “Sign up for the newsletter” only lets you check whether people who are deliberately looking for the subscription can find it. However, we know that people rarely come to a site in order to subscribe to news, so the task has little to do with real life. What you need to find out is whether people who are busy with entirely different tasks notice the option to subscribe. You can check this in different ways, depending on how the feature is implemented. If the respondent worked on tasks during which they could have come across the subscription option, ask whether the site has one. Just be sure to ask where they saw it or how it works, to make sure that the respondent is not simply agreeing with you. If the subscription offer is built into the registration or checkout process, see whether the respondent takes it up, and then discuss it after the task. The chances that people will actually sign up for mailing lists in lab conditions are slim, but you can check whether a person noticed the option, what they expect from the mailing list, and so on.

Collecting final impressions

The purpose of the final phase of the test is to collect impressions of working with the product, to understand what the user liked and what upset them, and to assess subjective satisfaction. This part of the test typically combines an interview with the moderator and filling out formal questionnaires.

Interview with moderator

In the final interview, we always ask respondents roughly the same questions: “What were your impressions?”, “What did you like and what did you not?”, “Was there anything that seemed difficult or inconvenient?”, “What was missing?”, “What would you like to change in the product?” and so on. This is also the time to clarify anything unclear in the respondent's behavior, if you have not done so during the test. And if before the test you asked users about the brand or product and their expectations of it, find out whether anything has changed. When interviewing, pay attention to the following:

Social desirability. Treat the interview results with great care. During the test you often hear spontaneous comments prompted by the problems people run into, but in the final interview social desirability is in full bloom. Some feel that by talking about problems in the product they are admitting their own incompetence. Others simply do not want to upset a pleasant moderator. Very often respondents (especially women) who struggled throughout the test say that everything was, on the whole, fine. Negative feedback can also be driven by social desirability: if the respondent is sure that the purpose of the test is to find flaws, they diligently try to find them.
Quotes and priorities. Although everything participants say in the final interview often has to be divided by two, or even by ten, this does not make it useless. From the way respondents summarize their impressions, you can draw conclusions about priorities. The product “sucks”? What exactly led to that verdict? Which of the many problems did the respondent remember best and find most annoying? However, allow for the fact that the last task is remembered best. It is also very useful to note which adjectives respondents use to describe the product in conversation and what they compare their experience with.
Do not forget about the good. Very often a usability testing report is a long list of problems found during the test. Finding problems is indeed one of the main goals. However, please do not forget about the positive aspects of the product. First, a report without positive results simply demotivates the team. Second, it is useful to know what users like about the product; otherwise the next redesign might remove a feature that everyone is fond of. So be sure to ask respondents about the positive aspects of the product, even if they criticized the interface throughout the test.
Attitude to feature requests. Most likely, in addition to their impressions, respondents will also voice wishes and ideas. Your task is to understand what problem lies behind each proposal, because the solutions that users suggest are unlikely to work for you as they are. Test participants are not designers, and they are not aware of the specifics and constraints of development. However, behind any such request there is a need, which you must record. If the respondent says that a big green button is definitely needed here, be sure to ask why.

Measuring satisfaction

From what a respondent says in the final interview, it is often hard to tell whether they ultimately liked the product or not. It is even harder to compare the attitudes of several respondents, each of whom noted both advantages and disadvantages. This is where questionnaires come to the researcher's aid. First, when a respondent fills out a questionnaire (especially before talking to the moderator), the influence of the notorious social desirability is somewhat smaller, although you will not get rid of it completely. Second, a questionnaire gives you clear metrics that let you compare scenarios, products or project stages.

Designing a good questionnaire is a separate and very big topic. The wording, the scales, and much more matter here. That is why ready-made, proven questionnaires can be a great help: they have been refined and tested many times. The only problem is that almost none of them has an official Russian translation. Naturally, you can translate them yourself, but from a methodological point of view translations must be tested to verify that the wording is correct. Even so, such questionnaires can serve as a guide when you compile your own.

There are questionnaires that are given after each assignment in order to assess satisfaction with specific scenarios. For example:

After Scenario Questionnaire (ASQ). Three questions: about the ease of completing the task, the time it took, and the help provided by the system.
Single Ease Question (SEQ). One question about how difficult the scenario was.
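As an illustration of how per-task ratings make scenarios comparable, here is a minimal Python sketch that averages SEQ answers by task. It assumes the commonly used 7-point SEQ scale (1 = very difficult, 7 = very easy); the function name seq_summary and the sample ratings are hypothetical.

```python
from statistics import mean


def seq_summary(ratings):
    """Return (task, mean SEQ score) pairs, hardest task first.

    ratings: dict mapping a task name to a list of answers on a 1..7 scale.
    """
    summary = []
    for task, scores in ratings.items():
        if any(not 1 <= score <= 7 for score in scores):
            raise ValueError(f"SEQ answers must be between 1 and 7: {task}")
        summary.append((task, round(mean(scores), 2)))
    # Lower mean = harder scenario, so sort ascending by score.
    return sorted(summary, key=lambda pair: pair[1])


# Hypothetical answers from three respondents per task.
ratings = {
    "find a product": [6, 7, 5],
    "place an order": [3, 2, 4],
}
for task, score in seq_summary(ratings):
    print(f"{task}: {score}")
```

With only a handful of respondents, such averages are a rough signal for prioritizing problems rather than a statistically reliable measurement.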

And there are questionnaires that are used in the final phase of testing. Here are some examples that we use when necessary:

System Usability Scale (SUS) and Post-Study System Usability Questionnaire (PSSUQ). Two classic and popular questionnaires created more than 20 years ago. Both consist of statements, and respondents indicate how much they agree with each. The statements characterize the usability of the product from different angles. For example: “I could easily find the necessary information”, “The various possibilities of the system are easily accessible”, etc.
Microsoft Desirability Toolkit. A questionnaire that often helps us in tests. The user is given a set of adjectives and chooses those that, in their opinion, characterize the product. As a result, you get a cloud of words describing your project. This technique often brings very interesting results.
Game Experience Questionnaire (GEQ). Classic usability questionnaires cannot be applied to games: engagement in the gameplay matters much more than the clarity of the interface. So for games you should always design special questionnaires or use the GEQ. It contains several modules: the core module, the in-game module, the post-game questionnaire, and the questionnaire on the social aspects of the game.

You can use the proposed questionnaires or create your own questionnaires that are relevant to your product.
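To show how a ready-made questionnaire turns answers into a comparable number, here is a minimal Python sketch of the standard SUS scoring rule: ten statements rated from 1 to 5, where odd-numbered items contribute (answer - 1), even-numbered items contribute (5 - answer), and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name sus_score and the sample answers are hypothetical, and the rule assumes the original item order and wording; a self-made translation would need the validation discussed above.

```python
def sus_score(answers):
    """Compute the SUS score (0..100) for one respondent.

    answers: ten integers from 1 (strongly disagree) to 5 (strongly agree),
    in the original order of the SUS statements.
    """
    if len(answers) != 10 or any(a not in (1, 2, 3, 4, 5) for a in answers):
        raise ValueError("SUS needs exactly ten answers on a 1..5 scale")

    total = 0
    for number, answer in enumerate(answers, start=1):
        if number % 2 == 1:
            total += answer - 1   # odd items are positively worded
        else:
            total += 5 - answer   # even items are negatively worded
    return total * 2.5


# Hypothetical answers from one respondent.
print(sus_score([4, 2, 4, 2, 5, 1, 4, 2, 4, 3]))
```

Scores are usually averaged across respondents and compared between products or design iterations; a single respondent's score says little on its own.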

Conclusion

Good planning will let you conduct research that precisely meets the customer's goals and is as useful as possible for the project team. A well-thought-out plan also reduces the time needed to prepare the report. However, no matter how well you prepare, be sure to run a pilot test with a colleague or an acquaintance. This will help you avoid incidents with incorrectly configured equipment and will show whether you fit into the planned time, whether the task texts are clear, and whether there are typos in the questionnaires.

Source: https://habr.com/ru/post/308054/
