Hello, Habr! We have reviewed cases in which big data technologies helped companies work with customers more effectively or optimize internal processes.
By the way, we will soon launch the first cohort of the Big Data for Executives program, which prepares managers and business owners to use data in their work. You can read more about it here.
Customer orientation
1. Company: Bookmate.
Industry: Subscription content provision - e-books.
Bookmate is a Russian subscription service for reading e-books on mobile devices, with more than 3 million users worldwide. Together with E-Contenta, the company solved the “cold start” problem: making recommendations to new users who have not yet selected any books in the application. To suggest books to such users, a recommender system was built on external data: social network data and DMP (data management platform) data, such as click history, web search queries and other data on user behavior.
Result: views of recommended books by new users increased 2.17-fold, and conversion to paying users increased 1.4-fold.
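To make the cold-start idea more concrete, here is a minimal sketch in Python. It assumes that the external data has already been reduced to a set of interest tags per user and that every book carries tags as well. The function name, tags and titles are hypothetical, and this is not the actual E-Contenta system, which the case does not describe in detail.

```python
def recommend_for_new_user(external_interests, book_tags, top_n=3):
    """Rank books by the number of tags shared with the user's external interest profile.

    external_interests: set of tags derived from external data (hypothetical input).
    book_tags: dict mapping a book title to its set of tags.
    """
    scored = []
    for title, tags in book_tags.items():
        overlap = len(external_interests & tags)
        if overlap:  # skip books that share nothing with the profile
            scored.append((overlap, title))
    # Most shared tags first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_n]]


# Illustrative data only: the user has no reading history yet.
catalog = {
    "Book A": {"sci-fi", "space"},
    "Book B": {"romance"},
    "Book C": {"history"},
}
print(recommend_for_new_user({"sci-fi", "space", "history"}, catalog))
```

Once the user starts reading, such a profile-based ranking would normally give way to recommendations built on in-app behavior.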

2. Company: BikeBerry.
Industry: retail, online store.
BikeBerry.com is an American online store selling bicycles, motorcycles, and parts and accessories for them. With the help of RetentionScience, the store introduced machine learning algorithms and statistical models to track and predict consumer behavior. The models draw on patterns of on-site behavior, purchase history, and demographic and behavioral information. As a result, the store was able to recommend the most relevant products to each customer and to make personalized discount offers only to customers who actually needed them, which increased profitability, more than doubled sales and improved a number of other indicators.
Result: a 133% increase in sales, a 200% increase in user activity, twice as many customers making repeat purchases, and a 30% increase in the average order value of such customers.
3. Company: Red Roof Inn.
Industry: Hospitality.
In the winter of 2014, the American hotel chain Red Roof Inn faced a drop in tourist traffic because of the harsh winter and adverse weather. The same weather, however, caused a large number of flights to be canceled every day, so passengers were stranded at airports for long periods and needed a hotel. Using open data on weather conditions and flight cancellations, the company was able to send passengers of delayed flights personalized offers with the contact details of the hotel nearest to the airport, at the moment when such offers were most in demand.
Result: revenue grew by an additional 10% compared to the previous year, even with reduced tourist traffic.
4. Company: Skillsoft.
Industry: education.
Skillsoft is an American company that develops educational software and content, and one of the world leaders in corporate training programs. In partnership with IBM, the company used internal data on how users interact with the system, both directly in the program and through email campaigns, to personalize their experience, increase engagement and improve learning outcomes. Data on user behavior in the program was used to monitor engagement and to determine the best time and channel of communication for attracting the user's attention. In addition, a recommender system for educational content was built on the preferences of the given user and of other users (84% of users rated the recommendations as relevant), and the system suggested the ways of presenting the material that suited each user best.
Result: a 128% increase in user engagement with content.

5. Company: Huffington Post.
Industry: media, journalism.
Huffington Post is a popular American online publication, aggregator and blog with many localized editions for different regions and languages. The company uses A/B testing to choose the best article headlines and studies the behavior and preferences of its target audience in order to publish content of interest to particular groups during their most active hours (for example, parenting content is published late in the evening, when children are already asleep). It also uses analysis of user behavior in the browser and recommender systems (Gravity technology) to offer users the most interesting content and to make it as accessible and attractive as possible from the site's home page.
Result: in August 2014 the site passed 100 million unique visitors per month and became the most popular online publication in the USA, and the average number of articles viewed per session rose to 10-12.
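For readers who have not run headline A/B tests, here is a minimal sketch of how two headlines can be compared. It assumes that each headline is shown to a random share of visitors and that we count views and clicks. The two-proportion z-test is a standard textbook choice, not necessarily what Huffington Post uses, and the numbers are invented for illustration.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, two-sided p-value) for the difference in click-through rates B - A."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled click-through rate under the null hypothesis "both headlines perform equally".
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if std_err == 0:
        raise ValueError("cannot test: pooled rate is 0 or 1")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value of a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented example: headline A got 120 clicks from 2000 views, headline B got 165 from 2000.
z, p = two_proportion_z_test(120, 2000, 165, 2000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests that the difference between the headlines is unlikely to be random noise. With many headlines tested at once, the threshold should be stricter.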
6. Company: VidiMax.
Industry: content provision - movies.
VidiMax is a Russian service that provides licensed access to feature films and documentaries, TV series, cartoons, sports broadcasts and TV programs. It is available on smart TVs and has about 1 million users. To increase user loyalty during the free two-week trial period, a recommender system was introduced together with E-Contenta, and a block of personal recommendations appeared in the service.
Result: films in the personal recommendations block are watched 2.5 times more often than films in the most-popular selection.
Internal optimization
1. Company: Sberbank.
Industry: banks.
Sberbank uses big data and machine learning in many areas, including credit scoring. For this task the bank relies not only on traditional data, such as socio-demographic parameters, credit history, transaction history and financial statements, but also on a number of other sources. It builds graphs of customer relationships from money transfer data and social network data. For scoring corporate borrowers, it uses news texts that mention them, which are run through automatic sentiment analysis. In 2015 the bank added mobile operator data to its models, which improved classifier quality by 7 percentage points of the Gini coefficient. A large number of active SIM cards with short lifetimes, small and frequent account top-ups, and a suspicious geography of calls point to fraud and reduce the likelihood that a loan application will be approved. For retail customers, machine learning algorithms improved the quality of scoring models by 4 percentage points of the Gini coefficient through more accurate feature selection.
Result: steady improvement in the quality of scoring models, including through the latest innovations.
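The Sberbank case measures model quality in percentage points of the Gini coefficient. In credit scoring this coefficient is usually defined as Gini = 2 * AUC - 1, where AUC is the probability that a randomly chosen defaulter receives a higher risk score than a randomly chosen good borrower. Below is a minimal sketch of this calculation. The labels and scores are invented, and nothing here reproduces the bank's own models.

```python
def auc(labels, scores):
    """Share of (defaulter, non-defaulter) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def gini(labels, scores):
    """Gini coefficient of a scoring model: 2 * AUC - 1."""
    return 2.0 * auc(labels, scores) - 1.0


# Invented data: 1 = borrower defaulted, score = predicted risk of default.
labels = [1, 1, 0, 0, 1, 0]
old_scores = [0.9, 0.40, 0.5, 0.2, 0.7, 0.6]
new_scores = [0.9, 0.55, 0.5, 0.2, 0.7, 0.6]  # one defaulter is now ranked higher

old_gini = gini(labels, old_scores)
new_gini = gini(labels, new_scores)
print(f"Gini before: {old_gini:.3f}, after: {new_gini:.3f}")
print(f"Improvement: {(new_gini - old_gini) * 100:.1f} percentage points")
```

The pairwise loop is quadratic in the number of borrowers. It is fine for illustration, but real portfolios would call for a rank-based computation.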
2. Company: Union Pacific Railroad.
Industry: transport.
Union Pacific Railroad is the largest railway company in the US: it has more than 8,000 locomotives and owns the country's largest rail network. Thermometers, acoustic and visual sensors, and other sensors were installed on the underside of the company's rolling stock. Their data is transmitted to a processing center over fiber optic cables laid along the railway network. The processing center also receives data on weather conditions, the state of braking and other systems, and the GPS coordinates of trains. The collected data and the predictive models built on it make it possible to track the condition of wheels and tracks and to predict derailments days or even weeks before a possible incident. This is enough time to fix problems promptly, avoid damage to the train and avoid delaying other trains.
Result: the company reduced the number of derailments by 75% and avoided significant losses (previously, losses from a single derailment could reach $40 million).
3. Company: Los Angeles Police Department.
Industry: public sector - police.
Using solutions developed by PredPol, the Los Angeles police were able to obtain the most likely times and areas (with high precision, to about 50 sq. m) of various types of crime and send additional police forces there to prevent them. The system takes historical data on the time, type and location of crimes and processes it with spatio-temporal clustering algorithms. Predictive modeling is based on mathematical models of point processes (Self-Exciting Point Process Modeling). No personal data about city residents or their whereabouts is used, which makes it possible to comply with privacy requirements. The reduction in crime has lowered costs for the police, the courts and the penal system.
Result: a 33% reduction in the number of thefts and a 21% reduction in the number of violent crimes.
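The idea behind a self-exciting point process is that every event temporarily raises the expected rate of new events nearby. Here is a minimal sketch of the textbook time-only version (a Hawkes process with an exponential kernel). It ignores the spatial component, the parameter values are invented, and it is an illustration of the general technique rather than PredPol's actual model.

```python
import math


def hawkes_intensity(t, past_events, mu, alpha, beta):
    """Expected event rate at time t.

    mu    - background rate (events per day that happen regardless of history)
    alpha - average number of follow-up events triggered by one event
    beta  - how quickly the triggering effect fades (1 / days)
    """
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - t_i))
        for t_i in past_events
        if t_i < t  # only events in the past can raise the current rate
    )
    return mu + excitation


# Invented history: incidents recorded on days 1.0, 1.5 and 4.0 in one area.
events = [1.0, 1.5, 4.0]
for day in (0.5, 2.0, 4.1, 10.0):
    rate = hawkes_intensity(day, events, mu=0.2, alpha=0.5, beta=1.0)
    print(f"day {day:>4}: expected rate {rate:.3f} events/day")
```

The rate jumps right after a cluster of incidents and decays back to the background level mu, which is why recent hot spots get extra patrols first.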
4. Company: Entro.py.
Industry: building management.
St. Vincent's is a large Australian network of public and private clinics located primarily in Sydney and Melbourne. Entro.py, which manages the clinics' buildings, together with BuildingIQ implemented a solution that analyzes current data on room usage, temperature and weather conditions, along with building characteristics and historical energy consumption data, in order to reduce heating and cooling costs.
Result: in 2014, the cost of climate control decreased by 12%.
5. Company: United Parcel Service (UPS).
Industry: Logistics.
UPS is an American logistics company, the world's largest in package delivery and supply chain management, delivering more than 16.9 million shipments per day to more than 220 countries. UPS uses big data to optimize routes and to reduce fuel costs and environmental impact. The company uses radar to track cargo, collects and analyzes readings from numerous sensors to monitor vehicle condition and driver behavior, and uses mobile CRM data to monitor delivery and customer service quality. To optimize routes and cut costs, the company introduced ORION, one of the world's largest systems built on operations research. Optimal routes are computed in real time using enormous computational power. To do this, the system uses map data, the points of departure and arrival, the size of shipments and the required delivery time.
Result: savings of about 6 million liters of fuel per year, 13 thousand tons less carbon emitted into the atmosphere annually, and faster delivery.
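To give a feel for what route optimization means, here is a deliberately simple sketch: the nearest-neighbor heuristic, which always drives to the closest stop not yet visited. The case does not disclose ORION's algorithms, so this textbook heuristic is only an assumption-level illustration. It ignores road maps, shipment sizes and delivery time windows, and it does not guarantee an optimal route. The coordinates are invented.

```python
import math


def route_length(route, points):
    """Total straight-line length of visiting points in the given order."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(route, route[1:]))


def nearest_neighbor_route(points, start=0):
    """Greedy route: from the current stop, always go to the closest unvisited stop."""
    unvisited = set(range(len(points))) - {start}
    route = [start]
    while unvisited:
        current = points[route[-1]]
        # Ties are broken by the smaller index so the result is deterministic.
        nxt = min(unvisited, key=lambda i: (math.dist(current, points[i]), i))
        route.append(nxt)
        unvisited.remove(nxt)
    return route


# Invented stops along one street; index 0 is the depot.
stops = [(0, 0), (5, 0), (1, 0), (3, 0)]
naive = [0, 1, 2, 3]                   # deliver in the order the parcels were loaded
greedy = nearest_neighbor_route(stops)  # [0, 2, 3, 1]
print("naive :", naive, route_length(naive, stops))    # length 11.0
print("greedy:", greedy, route_length(greedy, stops))  # length 5.0
```

Even this naive heuristic more than halves the distance in the toy example. Production systems add constraints such as time windows and vehicle capacity and search far more thoroughly.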

6. Company: ThyssenKrupp AG.
Industry: mechanical engineering.
ThyssenKrupp AG is one of the world's leading elevator manufacturers and services more than 1.1 million elevators worldwide. In partnership with Microsoft, the company launched the MAX system, which uses the Internet of Things to collect data from numerous sensors installed in the company's elevators (they monitor cabin speed, door operation, motor temperature, etc.) and builds predictive models on the Azure Machine Learning platform. The models make it possible to prevent an incident before it occurs and to send the technician a specific fault code, one of 400 possible, which shortens maintenance time. As a result, maintenance and repair costs fall (a single breakdown costs at least $300) and customers get additional value: elevators become more reliable and safer, and the owners of shops, hotels and other organizations located in the buildings do not incur losses.
Result: the uptime of elevators increased on average by 50%.
Learn about our Big Data for Executives program here. And here is the new enrollment for the Big Data Specialist program; a 15% discount is valid until November 15.