
IBM continues to work with Apache Spark: corporation launches Spark-as-a-service



Several notable announcements were made at the IBM Insight 2015 conference. The main one continues IBM's effort to support the Apache Spark project: the company is launching IBM Analytics on Apache Spark, with Bluemix serving as the cloud platform. Recall that in June, IBM announced its intention to invest more than $300 million in the project over several years. It was also announced earlier that z Systems would support Apache Spark for Linux.

This support will be provided as part of the analytics-on-mainframe project, allowing data mining specialists to run Apache Spark on powerful z Systems mainframes.

Apache Spark will not only run as a service on the Bluemix platform; it will also integrate with other cloud and analytics solutions, including the Cloudant NoSQL database and the dashDB cloud data storage platform. Developers using Bluemix will be able to integrate their projects with IBM's analytics solutions and databases.
Alongside Spark, IBM is also offering Insight Cloud Services, a solution that provides "external data about people, events, companies, business projects from sources like Twitter and The Weather Company." IBM customers will be able to supplement and extend their existing information using Insight Cloud Services, and then run a full analysis of the combined data set with Apache Spark.

Because Spark supports machine learning, natural language recognition, and image processing, along with many other features, IBM sees it as a complete environment for working with data. For example, with the IBM Datacap service, which is part of Insight Cloud Services, a client can automatically classify and recognize the content of a document, including its format and structure, as well as its text and numeric information.

The company considers the tool reliable enough that more than fifteen of IBM's own commercial and analytics products have been moved to Spark. As a result, for example, the number of lines of code in DataWorks was reduced from 40 million to 5 million.

In the near future, IBM plans to extend its support for Apache Spark beyond analytics to all areas of its business.

Source: https://habr.com/ru/post/274575/

