
Hi, Habr! My name is Natalia Sprogis, and I lead UX research at Mail.Ru Group. Today I would like to talk about the people without whom we could not help our projects at all. Finding participants for UX research is the cornerstone of the whole endeavor. And judging by the number of "So how do you find respondents?" questions our researchers get at every conference, this is a hot topic. Let's try to figure out whom to invite, how many there should be, and where to find them. This article focuses on recruiting for qualitative research, in particular for usability testing.
Part 1. Who?
Who to invite to the study?
Perhaps the most important thing is to determine who exactly is needed for the study. This is no less important than writing a good test scenario. If you show your product to a programmer from a neighboring department and to your mom, you will get completely different results. And even though we assume that mom will have more problems, it does not at all follow that her problems will be the ones you actually need to fix, because both your mom and the programmer next door may simply not be your users.
Do not start from demographics. Often, when we work with a product for the first time and discuss the test audience with its manager, we get something like: "60% women, mostly aged 25–35, Mail.Ru users." This is valuable and useful data, but it cannot serve as a recruiting profile. Imagine we decided to test the design of the compose page in Mail.Ru Mail and invited users who have a Mail.Ru mailbox, distributed by gender and age according to the statistics. Most likely it would turn out that one of them does not write letters at all and only receives mailings, someone uses only the mobile client, and for someone this mailbox is not their main one. The experience of these people would be useless to us. This does not mean that knowledge of demographics should be ignored altogether, but these criteria should not be the primary ones.
We go from the objectives of the study. The main parameter for selecting respondents should be experience with the product or functionality under study. By this characteristic, three groups may interest us:
- experienced users of the product;
- users with similar experience (a competing product, or solving the same problem in some other way);
- inexperienced users.
Suppose we want to test a new TV guide design on the Mail.Ru Kino project. It is important to invite users of the current version of the project to make sure they can comfortably perform their usual scenarios. We may also be interested in users of competing online TV guides. In addition, it is useful to look at people who check the TV schedule in newspapers and magazines, that is, who have no experience with such online services.
Not just "product experience". For successful recruiting, you need to specify what you mean by "user of the product". The more precisely you describe which users suit you for this particular test, the better your chances of getting interesting results. First of all, think about activity (frequency of visits, session length, which features they use) and the specifics of how they use the product. In the TV guide example above, we decided it was important to invite people with different TV interests. Those who regularly watch TV series have their own scenarios and interface requirements: read the description of a missed episode, find out when it will be repeated. Those who watch sports broadcasts have completely different ones (broadcasts happen at night, it is important not to miss the interesting ones, to have reminders).
Be careful with inexperienced users. It is often important to see how completely inexperienced users interact with our product: we imitate the moment a new audience first meets the project. But the "inexperienced" audience must also be chosen thoughtfully. Ideally, these should not be just people who do not use the current product, but people who potentially have a need for it. For example, one of our tests of registration in Odnoklassniki once fell apart: the participant simply refused to perform the tasks. It turned out she deliberately did not register in any social network for fear of "Big Brother" surveillance, and was not ready to do so even in a test. This is, of course, an extreme case. But if you invite people with no interest in the subject area at all, they will most likely perform tasks mechanically and bring you much less knowledge. For example, to test the "Unpack Mailbox" service in Mail.Ru Mail, which automatically sorts incoming mailings, we deliberately invited people who suffered from mailboxes overflowing with subscriptions. The participants sorted out their own mailboxes, not test ones created for them. So we saw from live examples whether the service really solves their problem with mailings, rather than just mechanically checking the clarity of each form.
Other common parameters. What other requirements for participants you will have always depends on the specific tasks. Besides experience and demographics, the following often come up:
- Mobile platform. When testing mobile applications and sites, even if the version has a single interface across platforms, it is important to test it with users of each. Practice shows that problems on iOS and Android often differ greatly, and users of these platforms have different skills and habits.
- "Advancedness". Five years ago we used a criterion of "computer literacy" or "Internet experience" when recruiting for tests of sites and services. By now the average level of "Internet experience" has grown, if we leave the older audience aside, so we no longer use this criterion when testing web services. But the question of "advancedness" is still relevant for mobile platforms, and we often include it in the requirements. By our own convention, we consider advanced those users who "know how to download and install applications on their own".
- "Hardcoreness". This criterion comes up when testing games. Any game producer will tell you that there are casual, mid-core and hardcore players. The difficulty is that there is no clear definition of who belongs to which group, so for each game you have to invent your own; as a rule, it combines hours spent in the game and preferences among similar games in the genre.
Tests on friends and colleagues
Periodically every researcher encounters a customer who says: "Our product is for absolutely everyone! Let's not spend money and test on colleagues." The idea seems very tempting: colleagues and friends can help without remuneration, and recruiting them is easy. After all, even among them there are people who suit us by their usage experience. Such tests do have a right to exist, especially as intermediate ones, but keep the following dangers in mind:
- "I am unrepresentative". Most likely, your social circle is quite advanced in technology. When someone suggests, "Let's invite HR or marketing," I remind them that even the non-developers in our company are somewhat geeky. Keep in mind that these will be fairly advanced users, and they may simply not run into some of the problems.
- Skewed loyalty. Our experience of testing on the company's own employees shows that they are either very loyal to the home product (reasoning along the lines of: "Oh well, I know the guys on the team, they tried really hard. And this doesn't work because we suffered with the same thing in our project"), or, on the contrary, unnecessarily demanding and picky, expecting the best from their colleagues. The same happens with friends: to avoid offending us, they may turn a blind eye to the product's shortcomings, or, wanting to help us find more problems, start nitpicking everything in a row. Any good UX specialist, of course, remembers to watch the user's behavior rather than listen to his words. But such reviews can give a false impression to the project team observing the test, and you will have to spend a lot of effort explaining that an overly positive or negative impression during the test does not imply the same reaction from the real target audience.
- Experts. It is acceptable to take colleagues and friends as respondents, but avoid the danger of inviting experts in the product's domain. Such people cannot abstract away from their experience and knowledge and look at the product purely as users. If your mobile application is tested by a developer, even from another team, he will start paying attention to compliance with guidelines, loading speed and other technical details. The problem is not only with developers: a marketer will look at the banner placements, a game producer will professionally compare the game with others in the genre, and so on. For example, on one of our tests of a game site, a web developer argued at length that "this parallax scrolling really spoils the user experience." As a result, instead of user experience you get an expert assessment. Such assessments are also needed and useful, but do not confuse them with user testing, especially since you already have experts on your own team.
Still, it is better to test on colleagues and friends than not to test at all. Such tests are usually quick and cheap, which is undoubtedly their advantage. The main thing is to try to pick people as close as possible to the product's audience, and not to pay too much attention to words and assessments, concentrating instead on problems and behavior (which, in principle, is relevant for any test).
Part 2. How much?
A question that worries many: how many respondents should be taken so that the results of the study can be trusted?
"Just a handful of users"
Usability guru Jakob Nielsen began popularizing the idea more than 20 years ago that five people are enough for any usability test (Jakob Nielsen and Thomas Landauer, 1993). This statement has become firmly entrenched in many heads. Some started always taking five people, regardless of the task. Others became skeptical of all usability studies, because they are based on such small samples.
Let's see where the magic number five came from. Practice shows that after the first few participants of a usability test, the number of newly detected problems drops off. Nielsen postulates that a sample of five people makes it possible to detect 85% of interface problems (Why You Only Need to Test with 5 Users). That is, taking more people may simply be impractical.
In fact, the 85% Nielsen speaks of covers the most serious and frequent problems (the figure holds for problems affecting 31% of the audience). Any interface most likely contains a number of frequent problems that many users encounter (for example, registration problems in a game), as well as a variety of rarer but perhaps quite critical ones. For example, a problem faced by 15% of the audience will be detected by five people with a probability of only about 50%. Increasing the sample raises the chances of finding these rarer problems. Jeff Sauro, who actively studies statistical validity in usability, examines this question in detail in his articles.
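The arithmetic behind these figures is simple: if a problem affects a fraction p of users, the chance that at least one of n independent participants runs into it is 1 − (1 − p)^n. A minimal sketch (the function name is ours, not from any of the cited works):

```python
# Chance of seeing a problem at least once among n participants,
# if the problem affects a share p of the audience.
def detection_probability(p: float, n: int) -> float:
    return 1 - (1 - p) ** n

# Nielsen's figure: a problem hitting 31% of users is caught by
# five participants with probability ~0.84 (commonly quoted as 85%).
print(round(detection_probability(0.31, 5), 2))  # 0.84
# A rarer problem (15% of users) with the same five people: ~0.56
print(round(detection_probability(0.15, 5), 2))  # 0.56
```

Plugging in other values shows why rarer problems need larger samples: at 15% frequency, even ten participants find the problem only about 80% of the time.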
Despite the popularity of the five-respondents claim, many researchers prefer to invite more participants. In 2012, Jeff Sauro surveyed about a hundred readers of his blog, trying to find out how many respondents are actually invited to studies in various companies. The survey showed that more than 81% of studies use samples larger than five, with a median of 10 (How many users do people actually test?).
What determines the number of people
When five people are enough. Small samples are good for testing prototypes and concepts, when it is important to find the most critical problems and the amount of functionality included in the test is small. Just remember that additional tests will be needed after release, since prototype testing reveals only the most obvious problems. Samples of three to five people also work well in iterative testing: we test the functionality, send it back for revision, and then test it again. This is generally a very effective method for products that have a permanent team of researchers: in each iteration we can see whether previously found problems have been fixed, and find something new.
When you need more people.
- A variety of audience. Five people cannot cover the diversity of experience, preferences, interaction patterns, and even demographics of your audience. And often it is precisely this diversity that gives us the most knowledge. To understand how many people you need, first write out all your requirements for the audience, and then take at least three people from each group. The simplest example: to test a mobile application, we need two groups by platform (iOS and Android) and two groups by experience (novices and users of similar applications). That already gives us 12 people.
- The volume of functionality under test. Only a limited number of scenarios fit into one test. A good test usually lasts about an hour. You can torment a person for up to two, but this is tiring and ineffective. So if we were asked "to test the entire VKontakte interface", we would obviously split the project into several "subtests", since we would have to look at the feed, photos, groups, and much more. The right solution is to include the most critical scenarios in every test and rotate the rest. Each scenario in the rotation must be performed by at least three people.
- You rarely conduct tests. If you release a product and contact an external agency for one-time final usability testing, do not skimp on the number of respondents. Take at least ten people. This will increase not only the budget, but also the probability of seeing more problems. Small samples are good for frequent iterative tests. If you only have one chance, use it correctly.
- More knowledge. Often a study aims not only to find flaws in the interface, but also to collect additional data about the target audience: interaction patterns, the context of product use, and so on. For these purposes it is good to have a sample with a diverse background, which again means more people.
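The group arithmetic from the "variety of audience" point above can be sketched by crossing every requirement and multiplying by the per-group minimum (the group names here are illustrative):

```python
from itertools import product

# Cross every audience requirement to count recruiting cells,
# then take at least three people per cell.
platforms = ["iOS", "Android"]
experience = ["novice", "user of similar apps"]
min_per_group = 3

groups = list(product(platforms, experience))
total = len(groups) * min_per_group
print(total)  # 2 platforms x 2 experience levels x 3 people = 12
```

Adding one more binary criterion doubles the number of cells, which is exactly why requirements should be written out before promising a budget.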
Do not overdo it. Having given many arguments for increasing the sample, I want to warn you against the opposite extreme. Remember that we are talking about qualitative lab research. Such projects rarely really need more than 24 people. On average, 10–12 people are enough to cover the main audience groups and find most of the problems. Demographic parameters, for example, are rarely a reason to greatly increase the sample.
Myths and fears
"Not serious research." Although we have seen that five respondents are not always enough, UX research remains mostly qualitative, and its results are often met with skepticism from outsiders. After all, costly product changes are proposed on the basis of conversations with a dozen people. On this score I really like an argument I read in a book; it goes something like this: "If you ask ten people what should be changed in the design of this door, you will most likely get ten different answers. If you ask ten people to go out through this door, you will see whether the door itself makes it clear that you need to pull or push." The point is that UX studies examine not people's opinions but their behavior. If opinions are what we need, a dozen users will not be enough, because "so many people, so many opinions". But if we want to find problems of interaction with the product, a small sample lets us identify behavioral problems and patterns characteristic of the mass of users.
"And what if your respondent is a fool?" When a user struggles with a task or shows illiteracy in some respect, the team watching the test often asks: "Maybe he is the idiot, and the rest are normal?" The answer is simple: every instance of questionable behavior should be checked on other people. When it turns out that not one but many people "struggle", the doubts disappear. For example, on one test it came as a shock to the team of a social-network link-sharing service that many users do not know how to copy a link from the browser. To share an article on the social network, test participants tried to copy the full text of the article, shared screenshots, or simply copied the title, saying: "Well, a friend will find it on the Internet later." The first such person still raised doubts: perhaps such inexperience was specific to him. But after the fifth user, all doubts disappeared.
Part 3. How?
Search methods
We understood how many and which users we would like to invite to our research. But how are we going to look for them? What are the options and what's special about them?
Special Recruitment Agencies
This method is the most convenient for researchers, since it takes a lot of headaches off their hands. However, there are pitfalls. First, you will have to set aside a fairly large budget, because the agency's work must be paid for. Second, you will inevitably run into the problem of "walkers". Market research has existed for a long time, and a whole stratum of people has emerged in this industry who try to make a living by going to surveys, focus groups, and the like. For these people, the most important thing is to get the reward. They often lie about their experience to fit the quotas. And during the test itself they try to behave so as to please you (so that you invite them again). As a result, they may start vigorously praising or scolding your product, because they do not understand the goals of the study. It does not matter exactly how they adjust their behavior; the point is that it becomes unnatural. Ways of dealing with these people could fill a separate article. Here are a few tips:
- Closed screening. Always coordinate with the agency a set of questions that will be asked to potential study participants. For example, if you need Odnoklassniki users, the question should sound not “Do you use Odnoklassniki?”, But “What social networks do you use?”.
- Check your agency. Even well-proven agencies sometimes work with non-professional field recruiters. Therefore, from time to time it is worth checking whether the announcement of your research with open requirements appeared on special “survey sites”.
- Check respondents' data. Verify potential respondents as thoroughly as possible before the test. If they are users of your product, ask for their account in advance and see whether their experience really meets your requirements. We always check the activity of users of our mail, games or social networks. For example, people who had registered an account just the day before tried to get into our Odnoklassniki testing.
- Keep a database of respondents. Use it to check whether the same person is being brought to you again. The industry has a "six months" rule: if a person takes part in studies no more than twice a year, he is not considered a "walker".
- Train your visual memory. One person tried to break through to us three times in a few months, and under different surnames. But we just remembered his face and quickly found records with his tests.
Panels
If you are doing online research, you can recruit through panels. These services have a huge pool of registered users who regularly answer online surveys or take online tests. Your task is to create a screening questionnaire by which participants will be selected. Just as with an agency, it is important to construct these questions well, because panel members may also try to fit the requirements. Sometimes it is even useful to include deliberately false answer options (for example, a non-existent product) and cut off those who choose them. Remember that for panel members your research is almost a job: they take several surveys a day. Among these users there are also "hackers" who try to click through your survey and grab the reward. They can be spotted by suspiciously fast completion, missing answers to open questions, and so on. Despite these quirks, panels let you recruit quickly and get a large base of answers and geographic coverage.
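As an illustration of the trap-question idea, here is a minimal sketch of filtering panel responses; the field names, the fake product, and the thresholds are all invented for this example and are not part of any real panel's API:

```python
# Illustrative screener filter for panel responses.
FAKE_PRODUCT = "MailoMatic Pro"  # a trap option: this product does not exist

def passes_screener(response: dict) -> bool:
    # Reject anyone who claims to use a non-existent product.
    if FAKE_PRODUCT in response.get("products_used", []):
        return False
    # Reject suspiciously fast completions (likely clicking through).
    if response.get("seconds_to_complete", 0) < 60:
        return False
    # Reject empty answers to open-ended questions.
    if not response.get("open_answer", "").strip():
        return False
    return True

honest = passes_screener({
    "products_used": ["Odnoklassniki", "VKontakte"],
    "seconds_to_complete": 240,
    "open_answer": "I check my mail every morning before work.",
})
print(honest)  # True
```

The same three checks (trap options, completion speed, open-question quality) apply whether you filter automatically or read the responses by hand.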
Self recruiting
You can recruit users yourself. It is important to understand that this is laborious. Even once you find people willing to come, a lot of time and effort goes into phone calls to clarify the criteria, agree on a convenient time, explain the details, send reminders, and so on.
- Friends and colleagues. Posting a call for participants on Facebook or the corporate portal seems one of the easiest options. But we have already discussed the dangers of testing on people you know. First, your circle will most likely be a fairly advanced audience. Second, it will be hard for them to express their opinion to you openly. Nevertheless, it is always better than nothing.
- The project's group on a social network / forum. This is a loyal audience interested in improving the product. They are often easier to invite than your other users, and they may even be willing to participate for free. But understand that these will be quite advanced users of your product. They are probably the most demanding of all and may ask for functionality that the majority does not need. This is also where it is easiest to catch "freaks". Whether recruiting through a forum or group is appropriate also depends on the project. For games, for example, the forum is a good place to recruit active, loyal players. But on the Mail.Ru Children project we found that forum users spend little time on the main site and can only help us test the forum itself.
- Mailing list. You can send your users a letter inviting them to participate in the study. This option is good because the mailing can be targeted by certain criteria (activity in the project, geography, etc.). Try to make the text of the letter personal and reasonably informal; this increases the number of responses. We usually include a link to an online survey that checks whether the person fits the criteria, and at the end of the survey we ask them to leave contact details. But be prepared for a small response (the percentage depends on the loyalty of the audience, but I would not expect more than 2–3%). Moreover, even those who leave a phone number may end up declining to participate. And some who agree will get cold feet and not show up. The first time, we were very surprised that the turnout with this form of recruiting was about 50% of the scheduled interviews.
- Banner / form on the site. Also a good option for recruiting. The banner, like the link in the letter, should lead to an online questionnaire, where you check the selection criteria and collect contact information. It's great if you can also show the banner only for the audience you need to study: for example, in the section of the site that interests you or after certain user actions.
- Interest groups. In the case when you do not have your product yet or you are interested in the users of competitors, you can try to recruit respondents in thematic communities and groups (with the consent of their administration). For example, to test some games, we published ads on the Games.Mail.Ru news project or in the Cc-combo Breaker VKontakte group.
- "Snowball". Field recruiters from agencies use this method: they ask people already invited to the study whether they have acquaintances who would also agree to participate. For this to work as the main search method, you need a fairly large contact base and a lot of time. However, nothing stops you from asking the people you have already found whether they have suitable friends, and recruiting additional respondents that way. The main thing is to make sure they will not tell each other what happened during the study, so as not to spoil your results. Also, people who know each other are a poor fit for a single study if you are not just testing the product but collecting data on different behavior patterns. For example, students from the same class will likely tell you much the same story about how they use email for their studies.
- Sites to find a part-time job. On sites like YouDo or Workzilla, you can easily find people looking for a one-time job. However, I would recommend using this type of recruiting only if the rest do not work. On these sites there are many people who earn by going to surveys, and are more likely to try to lie to you.
Field recruiting
For simple and fast tests that take no more than 15 minutes, you can recruit and run the study right in places with large crowds, such as cafes or shopping centers. This option works if your audience is wide enough. For example, in a shopping center we collected young people's associations with different name variants for one project. If the product allows, look for places where your target audience is concentrated: for example, we ran small game studies in computer clubs. To get fewer refusals, prepare a small reward (money or a souvenir).
Organizational issues
No matter how you look for respondents, there are a number of organizational points to keep in mind. Sometimes even the little things can disrupt the interview.
- Warn people about what will happen. As a rule, respondents have a very poor idea of what actually happens during testing. It is important that the person is ready to be recorded on video and knows how long the event will take. Be sure to warn about any documents they will have to sign (consent to the processing of personal data, a non-disclosure agreement). We once had a woman walk out of a test because she was not ready to sign an agreement. Also warn if any non-standard equipment will be used, such as eye tracking or psychophysiological sensors.
- Additional checks. As already mentioned in the block on recruiting through an external agency, it is better to verify respondents' data, and this recommendation applies to any type of recruiting. Remember: "everybody lies".
Conclusion

Recruiting is painstaking work, but it is what the success of the whole study rests on. Think through who exactly you need, how many of them, and where you will find them, and the rest of the research will be worth the effort.