
Are you already using R in business?

This publication contains no code or pictures, since the question at hand is somewhat broader, and specific questions can always be answered in the comments.


Over the past couple of years I have applied R to very diverse tasks in different verticals. Naturally, using R implies solving problems that involve some kind of mathematical processing of digital data, and the diversity of the tasks was determined primarily by the subject areas in which these applied problems arose. Some of these tasks were briefly mentioned in previous publications. The subject areas ranged from the ground (the agro-industrial complex), through applied tasks involving aircraft, all the way up to space.


Accumulated practice suggests that the initial trust placed in R, its surrounding ecosystem, and its community was fully justified. There was not a single case that could not be solved with R in a reasonable time.


Independent confirmation of this thesis comes from the exponential growth in the successful use of R in ordinary (non-IT) business in the West. For example, almost half of the talks at the EARL 2017 conference (Enterprise Applications of the R Language), held in September of this year, present cases of using R to solve business problems. The talks include examples of data analysis in real estate, automation of auditors' work, analysis of transport systems, analysis of sewage systems, and many other industries.


Business cases where the use of R is justified can generally be characterized as follows: given a set of heterogeneous internal and external sources, it is necessary to quickly obtain information on potential trouble spots that require human intervention. It is also desirable to provide the full set of data slices and views that help a person make the best decision.


Clearly, with the task framed this way, one must not only answer standard questions but also be ready to quickly provide everything needed for a one-off request. The emphasis shifts somewhat from methodically shoveling through all the information stored in the corporate system to locally assembling, from different data sources, the elements relevant to the question at hand.


What functionality is usually in demand?


  1. Importing data from various sources: txt/csv, xls, web scraping, RDBMS.
  2. Basic data processing (grouping, aggregation).
  3. Time-based analysis (as a rule, 80% of the data comes with timestamps).
  4. Advanced processing (elements of higher mathematics, including machine learning). The most in-demand are anomaly detection, various classifiers, recommendations, forecasting, and the currently trendy topic of “process mining”.
  5. Visualization by methods X, Y, Z (fill in the blanks).
  6. Integration with external information systems for exporting computed data.
  7. Export to human-readable formats: pdf, html, xls, doc, ppt.
  8. A web-based workplace for the analyst or ordinary user.

All of this functionality is available within the R ecosystem, with no particular need to install additional third-party components. The optimal open-source stack looks like this:
    • RStudio IDE - for development and ad-hoc analysis;
    • CRAN/GitHub packages - to extend functionality as the task requires;
    • Shiny Server - to create interactive web-based analytical applications;
    • Plumber API - to publish R analytics functions for use by third-party applications.

All of the above has been discussed fairly extensively in previous publications.


Using R lets you postpone worries about the physical implementation. Practical confidence that any business requirement can be implemented lets you focus on what matters most: business needs, technological and business processes, and physical constraints (if we are talking about the real sector of the economy), and delve into the subject area. Freedom from the limitations of IT technologies and products!
And it often turns out that what is needed is not to listen to users but to work with a process engineer and study the physics and chemistry of the processes, in order to understand the real problem area and propose a more adequate solution.


From a business point of view, the R toolkit can be considered almost perfect, and here is why:


  1. There is no financial barrier to getting started:
    • No initial investment in licenses is needed.
    • There are no licensing restrictions and no potential problems with scaling up.
    • There are no annual fees for license support.
    • Everything works fine on Linux, so there is no need to purchase additional operating systems.
  2. If external systems provide the necessary information, that is enough to start the project. No related development projects are required; everything can be done at the analytics level.
  3. There is already proven practice of applying R in business in virtually all verticals.
  4. There is no need to plan a global project; it is enough to start with individual problem areas. Projects are compact and fast, and results are easily converted into money (earned or saved). The results obtained make it possible to look at existing tasks from a different angle, to detect the real problems, and to set priorities more accurately.

However, as always, there is a fly in the ointment.


In the West, R and Python dominate data-related tasks. Anyone with an interest in the field has at least heard of these languages/platforms. In Russia, a vanishingly small group of people know or have heard about R. Step to the left, step to the right - and we find ourselves in the world of 1C, C++, Java. Difficult, long, expensive. Endless development and a “fat client” severely limited in functionality.


The Western R community can be considered established. A Russian R community cannot appear out of nowhere. Maybe it makes sense to look around and try solving problems in a different way? After successfully solving several business problems, it will be hard to force yourself to return to the old methods. Too much will have changed.


Previous Post: “Digital Economy and Ecosystem R”



Source: https://habr.com/ru/post/340316/

