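A package is installed with:

    install.packages("Name_Of_R_Package")

Widely used packages include, for example, dplyr, ggplot2 and reshape2. Of course, this is not a complete list. In this article we will focus on the packages used in machine learning.

The following small data set has a few values deliberately set to NA:

    dataset <- data.frame(var1 = rnorm(20, 0, 1), var2 = rnorm(20, 5, 1))
    dataset[c(2, 5, 7, 10), 1] <- NA   # four missing values in var1
    dataset[c(4, 8, 19), 2] <- NA      # three missing values in var2
    summary(dataset)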
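Before imputing anything, it can help to check how many values are actually missing. A minimal base R sketch, assuming the same data set layout as above; the call to set.seed() and the names na_counts and n_complete are illustrative additions, not part of the original example:

```r
# Rebuild the example data so the snippet runs on its own.
set.seed(42)  # assumption: any seed works, it only makes the random draws repeatable
dataset <- data.frame(var1 = rnorm(20, 0, 1), var2 = rnorm(20, 5, 1))
dataset[c(2, 5, 7, 10), 1] <- NA
dataset[c(4, 8, 19), 2] <- NA

# Number of missing values in each column.
na_counts <- colSums(is.na(dataset))
print(na_counts)

# Number of rows without any missing value.
n_complete <- sum(complete.cases(dataset))
print(n_complete)
```

Because the NA positions are fixed, four values are missing in var1 and three in var2, which leaves 13 of the 20 rows complete.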
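The mice package imputes the missing values:

    install.packages("mice")
    library(mice)
    dataset2 <- mice(dataset)        # build the imputation model
    dataset2 <- complete(dataset2)   # extract a completed data set
    summary(dataset2)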
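The call above relies on the defaults of mice(). As a hedged sketch, the same imputation can be written with the main arguments spelled out. The values chosen here (five imputations, predictive mean matching, a fixed seed) are assumptions for illustration, not settings taken from the original article:

```r
library(mice)

# Rebuild the example data so the snippet runs on its own.
set.seed(42)
dataset <- data.frame(var1 = rnorm(20, 0, 1), var2 = rnorm(20, 5, 1))
dataset[c(2, 5, 7, 10), 1] <- NA
dataset[c(4, 8, 19), 2] <- NA

# m: number of imputed data sets; method "pmm": predictive mean matching;
# seed: makes the imputation repeatable; printFlag = FALSE: silence the log.
imp <- mice(dataset, m = 5, method = "pmm", maxit = 5,
            seed = 123, printFlag = FALSE)

# Take the first of the five completed data sets.
completed <- complete(imp, 1)
print(colSums(is.na(completed)))
```

complete() returns an ordinary data frame, so the result can be passed directly to the modelling functions discussed below.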
rpart: let's divide the data

The rpart package in R is used to build classification and regression models with a two-step procedure, and the result is represented as a binary tree. The easiest way to draw a tree built with rpart is to call plot(). On its own, plot() does not always give an attractive result, so there is an alternative: prp(), a powerful and flexible function from the rpart.plot package that is often called a real Swiss Army knife for plotting trees.

The rpart() function lets you establish a relationship between a dependent variable and the independent variables, showing how the dependent variable varies with the independent ones. For example, if an online training company wants to know how sales (the dependent variable) are affected by promotion in social networks, newspapers, referral links, word of mouth and so on, rpart has several functions that can help analyze this.

    rpart(formula, data=, method=, control=)

Here formula combines the dependent and independent variables; data is the name of the data set; method depends on the goal, for example "class" for a classification tree; control holds your requirements for growing the tree, for example the minimum number of observations a node must contain before it is split.

As an example, take the built-in iris data set.
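The iris data set ships with R, so it can be inspected directly. A short sketch using only base functions (class_counts is just an illustrative name):

```r
# iris is part of the 'datasets' package that is loaded with R by default.
data(iris)

# First rows and column types.
print(head(iris))
str(iris)

# Number of flowers per species.
class_counts <- table(iris$Species)
print(class_counts)
```

The data set has 150 rows: four numeric measurements per flower plus the factor Species, with 50 flowers of each of the three species.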
    library(rpart)
    rpart_tree <- rpart(formula = Species ~ ., data = iris, method = "class")
    summary(rpart_tree)
    plot(rpart_tree)
    text(rpart_tree)   # plot() draws only the branches; text() adds the labels

[Figure: the fitted classification tree]
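A text view of such a tree can be obtained with print(). The sketch below also passes an explicit control argument through rpart.control(); the particular values of minsplit and maxdepth are illustrative assumptions, not recommendations from the article:

```r
library(rpart)

# minsplit: minimum number of observations in a node before a split is tried.
# maxdepth: maximum depth of the tree. Both values are illustrative.
ctrl <- rpart.control(minsplit = 10, maxdepth = 3)

fit <- rpart(Species ~ ., data = iris, method = "class", control = ctrl)

# Text representation of the splits and of the complexity table.
print(fit)
printcp(fit)

# Count the terminal nodes (leaves) of the fitted tree.
n_leaves <- sum(fit$frame$var == "<leaf>")
print(n_leaves)
```

Tightening or loosening the control values is a simple way to see how the size of the tree reacts to them.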
To classify new data with the fitted tree, call predict(tree_name, new_data, type = "class"), which returns the predicted classes. Without type = "class", a classification tree returns class probabilities instead.

party: let's divide the data again

The party package in R is used for recursive partitioning and reflects the ongoing development of ensemble methods. party is another package for building decision trees, this time based on the conditional inference algorithm. ctree() is the main function of the package; it is widely used and reduces training time and possible bias.

party has a syntax similar to other predictive analytics functions in R:

    ctree(formula, data)

The function builds a decision tree using default values for its numerous arguments; you can change them if necessary.

    library(party)
    party_tree <- ctree(formula = Species ~ ., data = iris)
    plot(party_tree)
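Prediction works the same way for a conditional inference tree. A minimal sketch, assuming a simple random split of iris into 100 training and 50 test rows; the split, the seed and the variable names are illustrative and not part of the original article:

```r
library(party)

# Hold out part of the data so predictions are made on unseen rows.
set.seed(42)
train_idx <- sample(nrow(iris), size = 100)
train_set <- iris[train_idx, ]
test_set  <- iris[-train_idx, ]

# Fit the conditional inference tree on the training rows only.
party_tree <- ctree(Species ~ ., data = train_set)

# Predicted species for the held-out rows.
pred <- predict(party_tree, newdata = test_set)

# Cross-tabulate predictions against the true species.
conf_mat <- table(predicted = pred, actual = test_set$Species)
print(conf_mat)
```

The diagonal of the table counts correct predictions; off-diagonal cells show which species get confused with each other.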
caret: Classification And REgression Training

The caret package (Classification And REgression Training) is designed to unify model training and prediction. The package offers many algorithms suited to different tasks. A data analyst cannot always say in advance which algorithm is best for a particular task. caret lets you choose the optimal parameters for an algorithm through controlled experiments. The grid search implemented in the package looks for parameters by combining various methods of estimating model performance; after trying all the candidate combinations, it picks the one that gives the best results.

caret is one of the best packages in R. Its developers understood how hard it is to choose the most suitable algorithm for each task. Sometimes a particular model is used and the quality of the data is doubted, yet the problem most often turns out to lie in the chosen algorithm.

After installing caret, you can run names(getModelInfo()) to see the list of available methods (217 at the time the article was written).

To build predictive models, caret uses the train() function. Its syntax is:

    train(formula, data, method)

Here method is the prediction model you are trying to build. Let's use the iris data set and a linear regression model to predict Sepal.Length:

    library(caret)
    lm_model <- train(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris, method = "lm")
    summary(lm_model)
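The parameter search described in this section can be sketched as follows. This is only an illustration under stated assumptions: the model ("rpart", which needs the rpart package), the 80/20 split, 5-fold cross-validation and tuneLength = 5 are choices made for this example, not settings from the original article:

```r
library(caret)

# Stratified split into training and test rows.
set.seed(42)
idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_set <- iris[idx, ]
test_set  <- iris[-idx, ]

# The "controlled experiment": 5-fold cross-validation on the training rows.
ctrl <- trainControl(method = "cv", number = 5)

# Try 5 candidate values of the rpart complexity parameter cp and
# keep the value with the best cross-validated performance.
fit <- train(Species ~ ., data = train_set,
             method = "rpart",
             trControl = ctrl,
             tuneLength = 5)

print(fit$bestTune)

# Check the chosen model on the held-out rows.
pred <- predict(fit, newdata = test_set)
accuracy <- mean(pred == test_set$Species)
print(accuracy)
```

Replacing the method string is enough to run the same experiment with another algorithm, which is the main convenience of the common train() interface.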
The caret package not only builds models, but also splits the data into training and test sets, applies the necessary transformations, and so on.

Source: https://habr.com/ru/post/305692/