Online education is a popular topic these days: everyone has heard of Coursera, Udacity, and edX. These are excellent educational platforms with many useful courses. But can they be made more intelligent? Research on intelligent tutoring systems has been going on for a long time, and the scientific community has something to offer practicing programmers. This article gives a popular-science overview of the results and conclusions the research community has obtained on building a particular type of intelligent tutoring system. It touches on subsystems for checking problem solutions, student models, and algorithms for managing the learning process.

Introduction
Tutorials with multiple-choice tests are widespread: each question has one correct answer and several distractors containing typical errors. The theoretical basis for such systems was developed in the 1950s by the famous psychologist B. F. Skinner and the researcher N. A. Crowder. The concepts they proposed have been repeatedly criticized. In particular, critics pointed out that not only the answers should be checked, but also the paths leading to them. Indeed, the main goal of training is not to memorize correct answers but to form rational methods for solving typical problems of the subject being studied. Scientific thought therefore took a new direction. First, researchers began to create training programs that can recognize not only the final answer but also assess the course of the student's reasoning while performing a task (see part 1 of this article). Second, they began to develop tools for measuring the characteristics of students that are important for managing the learning process (the so-called "student models", see part 2), along with algorithms for controlling the learning process itself (see part 3).
1. "Tracking" intelligent tutorials
An interesting, and perhaps the most promising, type of intelligent tutoring program is the "tracking" intelligent tutorial. "Trackers" are training programs intended for teaching natural sciences (such as mathematics or physics) that are able to:
- evaluate each step of the student's solution as "right" or "wrong",
- provide a hint indicating what is wrong with the solution step just entered, or what needs to be done next,
- assign a grade for the solution.
Such programs are called "trackers" because, to check the student's solution for completeness and correctness, they compare the steps of the student's solution with the steps of their own solutions. The solutions available to them can be generated automatically by some algorithm or entered into the database by the teacher.
Perhaps the best-known and most developed "tracking" intelligent tutorial is Andes Physics Tutor [1-2] (Fig. 1). Its knowledge base covers several areas of physics: statics, kinematics, and work and energy. Students enter solution steps in special fields. If a solution step is correct, the program colors the corresponding formula green; if it is wrong, red. Hints received by the student are displayed in the lower left part of the program window.

Fig. 1. “Tracking” intelligent tutorial Andes Physics Tutor.
How is solution checking implemented in such training programs? The developers of Andes Physics Tutor propose to solve this problem in two stages:
- checking the correctness of the formula entered by the student,
- measuring how far the student has advanced after entering the solution step, i.e. measuring progress in the solution.
It is the correctness information that determines whether the program colors a formula red or green. The progress information is used by the program when forming hints and grading the solution.
Checking correctness is very simple: substitute the known numerical values of the variables into the formula entered by the student. If the substitution yields an identity, the formula is correct. For example, if the student entered the formula a = b + 2, and it follows from the conditions of the problem that a = 4 and b = 2, then, since 4 = 2 + 2, the entered formula is correct.
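This substitution check can be sketched in a few lines of Python (a hypothetical helper, not the actual Andes code; the function name, the use of `eval`, and the floating-point tolerance are all assumptions made for illustration):

```python
# Sketch of correctness checking by numeric substitution: evaluate both
# sides of the student's equation with the known variable values and
# compare the results.

def is_correct(lhs: str, rhs: str, values: dict) -> bool:
    """Return True if lhs == rhs after substituting the known values."""
    left = eval(lhs, {"__builtins__": {}}, values)    # builtins disabled
    right = eval(rhs, {"__builtins__": {}}, values)
    return abs(left - right) < 1e-9                   # float tolerance

# The example from the text: a = b + 2 with a = 4, b = 2.
print(is_correct("a", "b + 2", {"a": 4, "b": 2}))   # True: 4 = 2 + 2
print(is_correct("a", "b + 3", {"a": 4, "b": 2}))   # False: 4 != 5
```

A real system would parse the formula instead of using `eval`, both for safety and to support physical units, but the idea of the check is the same.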
Measuring progress in solving the problem is much more difficult. The easiest approach is to represent the problem solutions known to the program as lists of formulas and compare the formulas entered by the student against these lists. To measure progress, one would first select the known solution closest to the student's, and then see what percentage of its formulas appear among the student's solution steps: the greater this percentage, the greater the progress. Unfortunately, with such a "naive" way of storing and processing solutions, even the simplest task requires entering far too many possible solutions that differ from each other by one or a few formulas. The developers of Andes Physics Tutor therefore took a different path.
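The "naive" measure just described can be sketched as follows (the formulas, the string-based comparison, and the function names are assumptions; a real tracker would compare formulas algebraically, not as text):

```python
# Naive progress measure: each known solution is a list of formulas;
# progress is the best fraction of matched formulas over all known
# solutions for the task.

def normalize(formula: str) -> str:
    """Crude normalization: ignore whitespace differences."""
    return formula.replace(" ", "")

def progress(student_steps, known_solutions) -> float:
    entered = {normalize(f) for f in student_steps}
    best = 0.0
    for solution in known_solutions:
        matched = sum(1 for f in solution if normalize(f) in entered)
        best = max(best, matched / len(solution))
    return best

# Two made-up kinematics solutions for the same task.
known = [["v = v0 + a*t", "s = v0*t + a*t**2/2"],
         ["s = (v**2 - v0**2)/(2*a)"]]
print(progress(["v=v0+a*t"], known))  # 0.5: one of two formulas matched
```

The weakness noted in the text is visible here: any algebraically equivalent rewriting of a formula would fail the string match, so every variant has to be stored explicitly.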
In Andes Physics Tutor, information about possible solutions is stored as a small list of "basic" equations. The solution steps entered by students are also treated as equations. To determine which "basic" equations the student used to produce a given solution step, a special algorithm is applied that solves systems of equations and computes partial derivatives. However, this algorithm cannot cope with every step students enter: the more variables are replaced by numeric values in the student's step, and the more it has been simplified arithmetically, the harder it is to determine which "basic" equations the student used.
Suppose that during the solution the student needs to calculate the value of a, which can be found in two ways: a = b + c or a = d / 2. If the student simply enters a = 6, how can we tell which way was used? Moreover, some students can "cheat": for example, if they work in a computer class and heard from a neighbor that a is 6 and somehow depends on b, then, knowing that b = 2, they can enter a solution step of the form a = 8 - b. In this case, the progress-measuring algorithm will still conclude that the student derived the formula using one of the two methods above. Of course, one can invent heuristics that clarify the situation in some cases, but not in all. It therefore cannot be claimed that the solution-checking algorithms of training programs will ever recognize any student's solution step with 100% certainty. At best, training programs can correctly recognize, with high probability, the majority of students' solution steps for specific classes of problems in particular natural-science disciplines.
2. Student models
Another way to make training programs more intelligent is to use student models. Recall that a student model is a means of measuring the characteristics of the student that are important for managing the learning process, as well as the results of measuring those characteristics. Student models are of two types:
- models reflecting the level of the student's knowledge and skills,
- models characterizing the student's mental state while performing tasks in the training program.
Overlay models are most often used to characterize the level of students' knowledge and skills. An overlay model assumes that the expert's representation of knowledge coincides with the student's, except that the student's knowledge is less complete (see Fig. 2). The expert's knowledge is divided into small, simple parts, and the student either knows each particular part or does not (or knows it to some degree).

Fig. 2. Overlay model of the student (the parts of the expert's "knowledge" that are present in the student are shaded).
Currently, overlay models are most often implemented as hierarchical structures covering all the concepts of the training course and/or the skills corresponding to it. Fig. 3 shows the hierarchical structure of concepts for the subject "Geometry" that formed the basis of a model of the student's knowledge and skills. The nodes of the structure correspond to definitions, axioms, or theorems. An arrow from one node to another indicates a connection between the corresponding portions of theoretical material. This connection can be interpreted as "before studying A, you need to know B", or as "if A is known, then B is also known". Taking these connections into account reduces the number of exercises needed to calculate the student's level of knowledge and skills.

Fig. 3. Fragment of the hierarchical structure of concepts from the subject area "Geometry".
Each node of the structure is associated with a label, "learned" or "not learned". A label can change after the student reads theoretical material or performs a practical task. Various algorithms and methods can be used for the calculations that implement label changes: Bayesian networks, fuzzy-logic methods, etc.
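A minimal sketch of such an overlay model (the concept names, the link structure, and the binary labels are invented for illustration; real systems use probabilistic labels and Bayesian updates rather than this simple closure):

```python
# Overlay model sketch: concepts carry a "learned" label, and links of
# the form "if A is known, then B is also known" let the model infer
# labels without assigning extra exercises.

# implies[A] = concepts that may be assumed learned once A is learned
implies = {
    "triangle similarity": ["triangle definition", "angle definition"],
    "triangle definition": ["angle definition"],
}

def infer_learned(observed: set) -> set:
    """Close the set of learned concepts under the implication links."""
    learned = set(observed)
    stack = list(observed)
    while stack:
        concept = stack.pop()
        for implied in implies.get(concept, []):
            if implied not in learned:
                learned.add(implied)
                stack.append(implied)
    return learned

# Observing mastery of one advanced concept labels its prerequisites too.
print(sorted(infer_learned({"triangle similarity"})))
```

This is exactly how the links save exercises: a correct answer on "triangle similarity" removes the need to separately test the two prerequisite concepts.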
To build a model of the student's mental state, various data sources can be used: video cameras (to recognize facial expressions), sensors measuring pulse rate, and so on. The source that is least intrusive for the student, and therefore the most convenient in practice, is the history of the user's work in the training program [3-4] (see Fig. 4).

Fig. 4. A fragment of the student’s work history.
As an example of a model that diagnoses the student's mental state from the history of their work, we describe a model developed in Laboratory 17 of the Institute of Control Sciences of the Russian Academy of Sciences (IPU RAS) with the participation of members of the Faculty of Psychology of Moscow State University [5]. The model assumes that the student's mental state is characterized by the values of three indicators: "Independence", "Effort", and "Frustration Behavior". The values are recalculated every n seconds (for example, n = 300 seconds, i.e. 5 minutes). The indicator values for a given period are formed mainly from numerical characteristics of the events that occurred during that period. Events can be instantaneous (for example, "the student checked a solution step for correctness; the step was correct") or prolonged (for example, "the student read the reference material"). The numerical characteristics of events include "count", "average duration", "total duration", etc.
While building the model, one event in the work history turned out to be particularly interesting: "inactivity in the training program for longer than 7 seconds". It turned out that this event can indicate both a favorable state of the student and a very undesirable one. Most often, the unfavorable state appeared after the student entered a solution step and received a message that it was incorrect: the student stopped working in the program and tried to get help from neighbors or teachers, or even froze for a while. The favorable state appeared after the program marked the entered step as correct: the student started on the next step of the solution and did the calculations with paper and pen (so during that time no actions were recorded in the program). Therefore, to measure the student's mental state, the model counts events such as "the number of inactivity periods longer than 7 seconds preceded by the input of a correct solution step" and "the number of inactivity periods longer than 7 seconds preceded by the input of an incorrect solution step".
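Counting these two kinds of inactivity events can be sketched as follows (the log format, event names, and function are assumptions; the 7-second threshold is the one from the text):

```python
# Work-history sketch: a list of (timestamp_seconds, event) pairs.
# Count inactivity gaps longer than the threshold, split by whether the
# preceding event was a correct or an incorrect solution step.

def count_inactivity(history, threshold=7):
    counts = {"after_correct": 0, "after_incorrect": 0}
    for (t1, event1), (t2, _) in zip(history, history[1:]):
        if t2 - t1 > threshold:              # a gap with no recorded actions
            if event1 == "step_correct":
                counts["after_correct"] += 1
            elif event1 == "step_incorrect":
                counts["after_incorrect"] += 1
    return counts

history = [(0, "step_correct"), (20, "step_incorrect"), (50, "help_request")]
print(count_inactivity(history))  # {'after_correct': 1, 'after_incorrect': 1}
```

Per the observations above, the first count tends to signal productive paper-and-pen work, while the second tends to signal frustration.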
Before using the model, it must be tuned, i.e. the coefficients of the functions that compute the current values of the indicators "Independence", "Effort", and "Frustration Behavior" must be selected. Without going into details, we note that the selection process is iterative: one of its main steps is minimizing an error function that reflects the discrepancy between the expert estimates and the estimates produced by the model with the coefficients obtained in the previous iteration, on part of the experimental data.
Let us explain how the data for tuning the model was collected. An experiment was conducted in which student volunteers solved one or two tasks in the training program. After the experiment, expert evaluations were collected: experts assessed the students' mental state from video recordings showing both the student's screen and the student's face at the same time. Every 5 minutes the playback was paused, and the expert entered an assessment into special fields of the program window. Then, for each 5-minute fragment of each student's work history, a vector was formed: one component was the expert's assessment of the student's mental state during that period, and the other components were the numerical characteristics of the events in the work history during that period. From this data set (a set of vectors), the coefficients of the model were selected using well-known machine-learning methods.
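A toy version of this tuning step might look as follows. Here the indicator is modeled as a linear function of a single event characteristic (a made-up feature), and its coefficients are found by gradient descent on the squared-error function against expert ratings; the data, the linear form, and the hyperparameters are all illustrative assumptions, not the actual model from [5]:

```python
# Fit indicator = w0 + w1 * x to expert ratings by minimizing the
# squared error with stochastic gradient descent.

def fit(xs, ys, lr=0.01, epochs=2000):
    w0, w1 = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = (w0 + w1 * x) - y
            w0 -= lr * err          # gradient of (pred - y)^2 / 2 wrt w0
            w1 -= lr * err * x      # gradient of (pred - y)^2 / 2 wrt w1
    return w0, w1

# Hypothetical vectors: feature value per 5-minute period -> expert rating
xs = [1, 2, 5, 6]
ys = [0.2, 0.3, 0.6, 0.7]
w0, w1 = fit(xs, ys)
print(round(w0, 2), round(w1, 2))  # 0.1 0.1 (recovers the exact fit)
```

The real model uses several indicators and many event characteristics per period, but the shape of the procedure — iterate, compare with expert estimates, minimize the error — is the same.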
3. Management of the educational process using student models
Now that we have seen the various types of student models and how to build them, let us return to intelligent learning management. The immediate question is: at what moments can the learning process be managed, and how can the program influence the student? First, the program can intelligently select educational material (theory and/or tasks to solve) for the next lesson, taking into account the knowledge and skills recorded in the corresponding model. Second, it can interactively support the problem-solving process in the training program. A "smart" control algorithm will choose the timing and frequency of tutoring actions such as:
- providing help (for example, brief text hints about the next expected solution step, or links to theoretical material useful for solving the current problem),
- refusing to provide help requested by the student,
- recommending other educational material instead of the current problem (for example, a simpler task if the student cannot cope with the current one),
- recommending a temporary break from the program (as a reminder of the need for periodic rest, or an acknowledgment that "the student is not in shape today"),
- displaying various motivating messages (for example, "you have almost completed this task!").
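One can imagine a simple rule-based version of such a control algorithm, driven by the indicator values from the student model described earlier. Everything here is hypothetical: the thresholds, the action names, and the rules themselves are invented for illustration:

```python
# Rule-based sketch: map the student model's indicator values (all in
# [0, 1]) to the next tutoring action.

def choose_action(independence: float, effort: float, frustration: float) -> str:
    if frustration > 0.8:
        return "recommend_break"        # "the student is not in shape today"
    if frustration > 0.5:
        return "offer_simpler_task"
    if effort > 0.6 and independence > 0.6:
        return "withhold_hint"          # let the student work it out alone
    if effort > 0.6:
        return "show_hint"
    return "show_motivating_message"    # e.g. "you have almost finished!"

print(choose_action(independence=0.7, effort=0.7, frustration=0.2))  # withhold_hint
print(choose_action(independence=0.3, effort=0.2, frustration=0.9))  # recommend_break
```

As the next paragraphs argue, such an algorithm should remain advisory: note that the "withhold_hint" branch is exactly the kind of behavior students resent when it is imposed on them.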
But how "persistent" and "decisive" should such an algorithm be? To answer this question, consider the research of Benedict du Boulay [6], a British professor under whose guidance much research has been done on the automated management of students' emotions while they work with a training program. He and his colleagues noticed that students in technical fields may react negatively to automated management of the learning process, especially when they are refused help. Some of the students most displeased with the program's behavior in du Boulay's experiments even said things like "this is just a program, and it must do exactly what it is told!"
As we can see, the success of automated management of the learning process strongly depends on how much the student trusts the intelligent program. This trust, in turn, depends essentially on how well the program understands the student's actions, in particular, what exactly the student entered at a given moment as a solution step. As mentioned earlier, it cannot be claimed that the solution-checking algorithms of training programs will ever recognize any student's solution step with 100% certainty. Therefore, the automated learning-management algorithm should be advisory only, and it should be possible to disable it at the student's request.
However, in cases of irrational student behavior (for example, abuse of short text hints while solving problems), it is still possible to influence the learning process, albeit by involving the course teacher. The teacher can "penalize" students based on automatically generated reports about their behavior in the training program. Thus, intelligent tutorials can significantly ease the process for both student and teacher, but with "cunning" students who have the wrong motivation (not interested in studying the subject, only in getting good grades), the teacher's participation remains indispensable.
Useful links
For a fairly complete and relatively recent review of the state of research on intelligent tutoring systems, I recommend the book: Woolf, Beverly Park (2009). Building Intelligent Interactive Tutors. Morgan Kaufmann. ISBN 978-0-12-373594-2.
The Intelligent Tutoring Systems (ITS) conference takes place almost every year: ITS 2010 was held in Pittsburgh (USA), ITS 2012 in Crete. The community also periodically holds AIED conferences.
1. Shapiro J. A. An Algebra Subsystem for Diagnosing Students' Input in a Physics Tutoring System.
2. VanLehn K. et al. The Andes Physics Tutoring System: Lessons Learned.
3. Baker R. S. J. d. et al. Towards Sensor-Free Affect Detection in Cognitive Tutor Algebra.
4. Baker R. S. J. d. et al. Labeling Student Behavior Faster and More Precisely with Text Replays.
5. Smirnova N. V. Automated analysis of the mental state of students based on the history of their work in a tracking intelligent tutoring system.
6. du Boulay B., del Soldato T. Implementation of motivational tactics in tutoring systems.