Use code highlighting

This is a review and partial translation of the paper “The Impact of Coloring on Program Comprehension” by Advait Sarkar of the Computer Laboratory, University of Cambridge.

Summary

The study took 10 "computer science" master's students at random from the University of Cambridge (in effect 7, since 3 of them wore glasses that were partially incompatible with the Tobii X120 eye tracker). The participants were given short computational tasks in Python, for which they had to state the correct result of running the code. Task completion time was measured. Participants were not allowed to write anything down on paper; they could only think aloud. Their eye movements were recorded with the eye tracker mentioned above.

At the end, the participants were asked to rate their own programming experience (yes, keeping the possible Dunning-Kruger effect in mind).

Findings

Highlighting helps you understand code faster. Based on 6 tasks with a total solution time of 13-20 minutes:
- an 8.4 second difference between the median task completion times (with the hypothesis that the effect becomes more noticeable on larger tasks)
- a significant reduction (by 23) in the number of attention switches (moving the gaze away from the place being read and fixating on a different area of the task)
- the benefit of highlighting is inversely related to experience (but not linearly, i.e. it has not been established that highlighting eventually stops helping)
- highlighting only works if you know what each color stands for
- the brain can ignore the highlighting cues when you are in "free search" mode (i.e. highlighting does not get in the way of thinking)

A vivid example of how highlighting lets you focus on the content of the code while paying less attention to familiar keywords:

In short: all the best, and colors for everyone!

Graphs, figures and more detailed conclusions from the article:

1. Time to complete tasks

The bar chart compares, for each participant and task, the completion time with highlighted code against the time with plain code.
Because the resulting distribution was not normal, the medians of the solution times were compared using the Wilcoxon signed-rank test (WSRT: T = 136, p = 0.047). The difference was 8.4 seconds in favor of highlighting.
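The paired comparison described here can be sketched in a few lines of Python. This is only an illustration: the times below are invented (they are not the study's data), the function name `wilcoxon_signed_rank` is made up for this example, and the article does not say which variant of the statistic T it reports, so the sketch simply computes both rank sums and takes the smaller one.

```python
from statistics import median

def wilcoxon_signed_rank(plain, highlighted):
    """Return (W_plus, W_minus) for paired samples, dropping zero differences."""
    diffs = [p - h for p, h in zip(plain, highlighted) if p != h]
    # Order the pairs by absolute difference; ties get the average of their ranks.
    ordered = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(ordered):
        j = i
        while (j + 1 < len(ordered)
               and abs(diffs[ordered[j + 1]]) == abs(diffs[ordered[i]])):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[ordered[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus

# Hypothetical completion times in seconds (NOT the study's data).
plain_times = [142.0, 98.5, 201.0, 77.0, 160.0, 120.0]
highlighted_times = [130.0, 101.0, 180.5, 70.0, 151.0, 118.0]

w_plus, w_minus = wilcoxon_signed_rank(plain_times, highlighted_times)
t_statistic = min(w_plus, w_minus)
median_gap = median(plain_times) - median(highlighted_times)
print(t_statistic, median_gap)
```

A small T means that almost all of the ranked differences point the same way; turning T into a p-value requires a table or a statistics library, which this sketch leaves out.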

2. The number of "switching attention"

An attention switch was defined as follows: it is a fixation (moving the gaze to an area and holding it there for a certain time) on an area different from the area of the previous fixation.

For example, while the code is being read, a glance at the “conditions” section (usually at the area with the argument values) followed by a return to the code counts as 2 attention switches.
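This definition translates directly into code. The sketch below assumes that each fixation has already been labeled with the screen area it landed on; the labels and the function name `count_attention_switches` are hypothetical and do not come from the paper.

```python
def count_attention_switches(fixation_areas):
    """Count fixations that land on a different area than the previous fixation."""
    switches = 0
    for previous, current in zip(fixation_areas, fixation_areas[1:]):
        if current != previous:
            switches += 1
    return switches

# Hypothetical sequence of fixations, each labeled with the area it hit.
fixations = ["code", "code", "conditions", "conditions", "code", "code"]
print(count_attention_switches(fixations))  # 2: code -> conditions -> code
```

Consecutive fixations inside the same area do not add to the count, which is why the glance at the conditions and back is worth exactly 2 switches.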

This bar chart compares the number of attention switches.

As before, each pair of bars is the result for a specific participant and a specific pair of tasks (with highlighted and with plain code).
There is less data here, because the readings of participants who wore glasses had to be excluded: glare from the lenses reduced the accuracy of eye-movement recognition.

The medians of the number of switches differ by 23 switches in favor of highlighted code (WSRT: T = 13.5, p = 0.045).
For the other eye-tracking measures that are presumably relevant to understanding and solving a task (in particular, fixation duration, the number of fixations on an object, and the number of times the task description was consulted), it was not possible to determine how much highlighting influenced them.

3. The effect of qualifications

To compensate for the non-normal distribution of the data, the analysis used a “time advantage” value: the ratio of the time taken on a task with unhighlighted code to the time taken on its counterpart with highlighted code. Each point in the chart is one pair of tasks completed by one participant.

On the x axis, participants are sorted by level of competence. Note that the y axis is logarithmic: the linear correlation became noticeable only after this logarithmic normalization (r = −0.39, p = 0.033). Strictly speaking, this means that code highlighting matters more for novice programmers than for experienced ones. However, this is only a correlation: the result may be a consequence of how short the tasks were, and establishing a causal relationship requires further research.
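The “time advantage” and its correlation with experience can be sketched as follows. All numbers are invented for illustration (they are not the study's data), experience is assumed to be a simple numeric self-rating, and `pearson_r` is a hand-written helper rather than anything taken from the paper.

```python
import math

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: self-rated experience and task times in seconds.
experience = [1, 2, 3, 4, 5]
plain_times = [200.0, 150.0, 120.0, 100.0, 90.0]
highlighted_times = [100.0, 100.0, 100.0, 100.0, 100.0]

# Ratio above 1 means the highlighted version was solved faster.
time_advantage = [p / h for p, h in zip(plain_times, highlighted_times)]
# Taking the logarithm mirrors the logarithmic y axis of the chart.
log_advantage = [math.log10(ratio) for ratio in time_advantage]

r = pearson_r(experience, log_advantage)
print(r)
```

With this made-up data the coefficient is negative, the same sign as the reported r = −0.39: the more experienced the participant, the smaller the advantage from highlighting.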

Conclusions and assumptions

This small study puts numbers on the intuitive idea that highlighting helps you understand code faster.

There is no coherent theoretical explanation of this fact ~~yet~~.

A simple hypothesis goes like this: the “overall mental effort” needed to understand unhighlighted code is greater than the effort needed to understand the same code with highlighting, because highlighting carries an additional layer of meaning in the form of color coding.

This extra effort probably causes additional overhead and pushes some items out of working memory ... such as the values of the input arguments (which could also explain the larger number of attention switches with unhighlighted code).

Further support for this hypothesis comes from data showing that syntax highlighting allowed some participants to focus on a smaller area of the code (as seen in the heat map of gaze fixations):

Source: https://habr.com/ru/post/271841/
