Research software in universities in the UK

From the translator

This article briefly covers:

How many researchers use research software at universities in the UK?
What kind of software is used?
How many researchers develop their own research software, how many are only users, and how does this depend on discipline, gender and other factors?
Which operating systems do developers and users of research software choose?

You will also find a link to the file with “raw” and detailed research results, such as a list of universities studied, the number of people surveyed from each university, their area of work, and so on. This will be especially useful for those who wish to independently analyze the results.

This note is a translation of the post “It is impossible to conduct research without software, say 7 out of 10 UK researchers” by Simon Hettrick, in which he briefly describes the results of a statistical survey of the software used in research at several UK universities. Simon is the Deputy Director of the Software Sustainability Institute, on whose behalf the University of Edinburgh conducted the survey.
This is a literary translation from English into Russian. If you have suggestions or spot an error, please let me know. After publishing this translation I will send Simon a link to it (I already have the author's permission to translate the post and to use the charts from his report).

Acknowledgements:
I would like to thank Simon Hettrick of the EPSRC-supported Software Sustainability Institute for his help in preparing this material (“Using Software Sustainability Institute”).

The rest of the text is a translation of Simon Hettrick's post.

No one knows how much software is used in research. Look in any laboratory and you will find both standard and custom-written software, used across all disciplines and by researchers at all levels. Software is clearly a fundamental part of research, but we cannot prove this without evidence. That lack of evidence is why we ran a survey of research software use at fifteen Russell Group universities.

Highlights

92% of respondents use research software;
69% of them said that their research would be impractical without such software;
56% develop their own software (worryingly, 21% of them have no training in software development);
70% of male researchers develop their software and only 30% of female researchers do it.

Data

The data collected in this survey is available for download and licensed under a Creative Commons Attribution license (attribution to the University of Edinburgh on behalf of the Software Sustainability Institute).

Software is much more important for research than anyone knows.

If we do not know how much we rely on software, we cannot be sure that researchers have the tools and skills they need to remain leaders in research. For the first time on this scale, we collected data on the use of research software, its development and the level of training researchers have. We also collected demographic data, so that we can answer questions such as “Are men more likely than women to develop software?” (The answer, as it turned out, is yes, although women and men use research software in equal proportions.)

Our team

Thanks to Mario Antonioletti, Neil Chue Hong, Steve Crouch, Devasena Inupakutika and Tim Parkinson for their help in designing the survey, writing the necessary code and analyzing the results. Thanks also to the Institute's Fellows for acting as “guinea pigs” while the survey was being drafted.

Survey scale

The results described here are based on the responses of 417 researchers, randomly selected from fifteen Russell Group universities. We achieved good representation across disciplines, career levels and genders. This number of responses is large enough to give a statistically meaningful picture of the views of staff at research-intensive UK universities.
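As a rough check on what 417 responses can support, the worst-case margin of error for a reported percentage can be sketched as follows. This is not a calculation from the original survey: it assumes simple random sampling and the normal approximation at about 95% confidence, and the function name `margin_of_error` is made up for illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a normal-approximation interval for a proportion.

    n: number of responses; p: assumed proportion (0.5 is the worst case);
    z: critical value (1.96 corresponds to roughly 95% confidence).
    """
    return z * math.sqrt(p * (1 - p) / n)

# 417 is the number of responses reported in the survey.
moe = margin_of_error(417)
print(f"Worst-case margin of error: +/-{moe * 100:.1f} percentage points")
```

Under these assumptions the margin is a little under five percentage points for the sample as a whole. Subgroups (a single funder, one gender, one operating system) contain fewer respondents and so have wider margins, and the sketch says nothing about non-response bias.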

Limitations

The main problem with a “blind” survey is that it has to be short if we want as many respondents as possible to answer. This meant we could establish the facts about software use, but had no room to investigate individual cases. We will carry out further research to do that.

Translator's note on the diagrams above:
in the upper left - the percentage of respondents who do or do not use research software;
in the upper right - how often each answer was chosen for the question “What would happen if you could not use research software?”;
in the lower left - the percentage of respondents who do or do not develop their own research software;
in the lower right - the percentage of those developers who have or have not received training in software development.

How many researchers use software in research?

It is no exaggeration to say that software is vital to research. If software were to disappear from research by some kind of magic, 7 out of 10 researchers would be left without work.

92% of respondents indicated that they used research software. Moreover, 70% of respondents indicated that “carrying out my work would be impractical in the absence of such software”.

Dependence on the level of respondents

The use of research software depends little on the professional level of the respondents.

Professional level is difficult to measure, so we simply asked respondents how many years they had worked in research. The variation between groups is small: 12 percentage points.

Use was highest, at 98%, among those who had worked in research for 6-10 years, and lowest, at 86%, among those who had worked for more than 20 years.

In the first two categories, those who had worked for less than a year and those who had worked for one to five years, 91-92% use research software. Use peaks over the following ten years and then falls among researchers with 15-20 years of experience or more.

There are several possible explanations for this variation, but unfortunately our results cannot confirm them. It seems likely that early- and mid-career researchers are the “workhorses” of research: they probably generate the most results and are therefore the most likely to use research software. Once researchers become more senior, they tend to take on management roles, which makes them less likely to use research software themselves.

What software is used?

A wide range of products is in use: we recorded 566 different software packages, some mentioned by respondents only once and some much more often. The most popular are Matlab (used by 20% of respondents), R (16%), SPSS (15%) and Excel (12%). To show the list of packages in use as a diagram, we put together the tag cloud shown at the top of this page.

Many researchers develop their software products, even without a sufficient level of training.

It is not only proprietary products that are used. Many researchers, 56% of them, write their own code. This is great news, because the real power of software lies in developing it to do more work in less time and to make new research possible.

Many researchers are developing their own research software, but is this development in safe hands?

55% of respondents have received some training in software development (15% through self-study and 40% through relevant courses). Worryingly, 21% of the respondents who develop software have had no training in software development. That is one in five.
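To put that one in five into the context of the whole sample, the reported shares can be combined as in the sketch below. It assumes that the 56% is taken over all 417 respondents and the 21% over those who develop software; the article does not publish the underlying counts here, so the result is only an estimate.

```python
# Figures reported in the survey.
total_responses = 417
share_developing = 0.56   # respondents who develop their own software
share_untrained = 0.21    # of those developers, the share with no training

# Assumption: each percentage is taken over the base named in its comment.
share_of_all = share_developing * share_untrained
approx_people = round(total_responses * share_of_all)

print(f"Untrained developers: {share_of_all:.1%} of all respondents")
print(f"Roughly {approx_people} of {total_responses} respondents")
```

On these assumptions, about 12% of everyone surveyed writes research code without any training in software development.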

Software developed without adequate training is unlikely to be software that can be relied upon. Researchers are, by nature, intelligent people who pick up new skills quickly. But there are many pitfalls on the way to good code (that is, code whose results will not later lead to the retraction of published papers). And that is only reliability! We also need results that can be reproduced, which requires further skills in writing reproducible code. And we want to protect the investment made in research, which requires yet more skills to write software that can be reused in the future.

Variation from discipline to discipline

Information about the main funder of a respondent's research is a convenient way to divide respondents into disciplines. About half of the respondents were funded by the EPSRC (the Engineering and Physical Sciences Research Council - translator's note), by university funds or by other sources (which cover a wide range of funders, from private to overseas organizations). The other half were split fairly evenly between the remaining research councils, EU funds and large charities.

The use of research software is almost uniform among the respondents: regardless of funder, it falls roughly in the range 87-100%. The notable exception is respondents whose main funder is AHRC, of whom only 60% use research software.

Differences begin to appear when we look at the respondents who write their own software. These respondents fall into three groups. The leaders are researchers funded by STFC and EPSRC, among whom 93%, 90% and 79%, respectively, develop their own research software. In the next group, which includes researchers funded from other sources, approximately 50% develop software. The third group consists of respondents funded by the National Institute for Health Research (31%), industrial organizations (17%) and AHRC (10%).

It is probably not surprising that the percentage of researchers who received training in software development, in some form, follows the percentage of those who develop software. The variation between these two categories is within ± 10%.

Software development costs are not included in the project budget.

Many researchers believe that including software development costs in a project proposal will weaken it. Feedback we have received from the research councils suggests that this is not the case, and this is what we are trying to convince researchers of. But we may be losing that battle.

When we asked the researchers who are responsible for writing project proposals whether they included software development costs, 22% answered that they did, 57% that they did not, and 20% that they did not even know that software development could be part of the budget! (Note that, because of rounding, the percentages for these groups sum to 99%.)
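The rounding effect mentioned in the note can be reproduced with a small sketch. The counts below are hypothetical, chosen only so that the rounded shares match the reported 22%, 57% and 20%; the survey's raw counts for this question are not given in the article.

```python
# Hypothetical counts for illustration only (not the survey's raw data).
counts = {
    "included costs": 224,
    "did not include costs": 573,
    "did not know it was possible": 203,
}
total = sum(counts.values())  # 1000 hypothetical respondents

# Exact shares add up to 100%, but each one is rounded down here.
exact = {answer: 100 * n / total for answer, n in counts.items()}
rounded = {answer: round(share) for answer, share in exact.items()}

for answer in counts:
    print(f"{answer}: {exact[answer]:.1f}% -> {rounded[answer]}%")
print("Sum of rounded shares:", sum(rounded.values()))
```

Each exact share loses a few tenths of a percentage point when rounded to a whole number, and the three small losses together account for the missing 1%.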

Differences in software usage by gender

36% of respondents were female and 62% male. The remainder chose “other” or “prefer not to say”, or left the question unanswered (the question about gender was optional).

There is no difference in the percentage of software use between male and female respondents - 92% for both groups. This is exciting news!

Differences in software development by gender

Although there is no difference in the use of research software among representatives of different genders, there is a huge difference as soon as it concerns its development. 70% of male respondents develop their research software, while only 30% of female respondents do it.

This male dominance in development is reflected, as might be expected, in the amount of training. Only 39% of female respondents have received training in software development, compared with 63% of male respondents.

What does the choice of operating system tell us about developers?

There is a difference, though a small one, when it comes to simply using research software: 88% of Windows users also use research software, compared with 93% of OS X users and a notable 98% of Linux users.

When it comes to developing research software, the difference becomes pronounced. Only 41% of Windows users develop research software, again behind OS X users at 53%. Linux users are in a league of their own: 90% of them develop their own research software.
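The figures from the two paragraphs above can be set side by side to show how far development lags behind use on each platform. This is only a restatement of the reported percentages; the variable names are illustrative.

```python
# Percentages reported in the article, per operating system.
usage = {"Windows": 88, "OS X": 93, "Linux": 98}        # use research software
development = {"Windows": 41, "OS X": 53, "Linux": 90}  # develop their own

# Gap, in percentage points, between using and developing.
gap = {name: usage[name] - development[name] for name in usage}

for name in usage:
    print(f"{name:8} use {usage[name]}%  develop {development[name]}%  "
          f"gap {gap[name]} points")
```

The gap is 47 points for Windows, 40 for OS X and only 8 for Linux: most people who use research software do so on systems where relatively few develop it.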

There is a potentially important lesson for the software community. If you want users to use your software, then it is better to be sure that it works in OS X and Windows, as well as in the native Linux environment.

How did we collect the data?

We needed results that could represent the research community, so we contacted 1,000 randomly selected researchers at each of the 15 Russell Group universities. From 15,000 invitations we received 417 responses, a response rate of 3%, which is quite normal for a “blind” survey.
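The response rate follows directly from the two reported totals, as this short check shows.

```python
invitations = 15_000  # invitations sent, as reported
responses = 417       # responses received, as reported

response_rate = responses / invitations
print(f"Response rate: {response_rate:.1%}")
```

The exact figure is just under 2.8%, which the article rounds to 3%.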

We asked respondents about “research software”, which was defined as follows:

Software that is used to generate, process or analyze results that you plan to publish (in a journal, conference publication, monograph, book or abstract). Research software can be anything from a few lines of self-written code to a professionally developed software package. Software that does not generate, process or analyze results, such as text editors or software used to search the Internet, is not considered research software in this study.

We used Google Forms to collect responses from respondents. Subsequently, the results were transferred to Excel files for analysis and uploaded to Google Drive for further distribution.

Source: https://habr.com/ru/post/245171/
