
BlackRock is one of the largest investment companies in the world and the largest in terms of assets under management (USD 5.7 trillion as of July 2017). It is also called the world's largest "shadow bank". With a 30-year history and such impressive figures, the company is also keeping pace with current trends in IT infrastructure. Last Friday, a post on the CNCF blog reported that BlackRock had rolled out a production environment on Kubernetes in 100 days. How did they get there?
Background
BlackRock's potential interest in Kubernetes became publicly known in 2015–2016, when the company began hosting meetups on related topics. For example, in September 2015, representatives of Mesosphere and Calico spoke at an event hosted by BlackRock, and in January 2016, one of the CoreOS London Meetups was held at the company's office.
Aladdin and Docker
However, interest in a technology alone guarantees nothing when it comes to practical application. Then, in May of last year, a note about moving the Aladdin software platform to a cloud architecture appeared on the BlackRock tech blog.
Aladdin (short for Asset, Liability, Debt and Derivative Investment Network) offers investment managers risk analytics combined with portfolio management and trading tools.
Figure: The functionality of Aladdin (for details, see the product website)
The core of the solution is the Aladdin Core Platform, originally created for it and described as "a single unified technology stack for all applications, also called the Aladdin Operating System". This "operating system" ran on every host where the platform's applications worked. This approach made all the capabilities common to the applications available on the host system and did not require a virtualization layer.
However, over time serious problems began to emerge, caused by:
- the need to run third-party Big Data analysis tools (Hadoop and Spark) on the same machines;
- the need for simple scaling when the number of requests surges;
- the need to easily deploy isolated test environments for different development teams.
All this led the company's engineers to decide to switch to a cloud architecture, which makes it possible to allocate a common pool of resources for any workloads (both core applications and third-party analytical software) and to run processes on any hosts (for horizontal scaling of applications).
Figure: Schematic comparison of application architectures, according to BlackRock
For Aladdin, the decision was to implement the cloud architecture using Docker containers. This created the need for an orchestration system, whose key requirements were defined as dynamic scheduling, service discovery, and isolation. However, details about its selection and use were not disclosed at the time.
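To make the "dynamic scheduling" requirement more concrete, here is a minimal sketch of placing workloads onto a shared pool of hosts. It uses a simple first-fit rule, which is an assumption chosen for illustration and not the algorithm of Kubernetes or of any system BlackRock used; the host names, workload names, and resource sizes are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A machine in the shared pool (names and sizes are hypothetical)."""
    name: str
    cpu_free: float
    mem_free_gb: float
    workloads: list = field(default_factory=list)


def schedule(hosts, workload, cpu, mem_gb):
    """Place a workload on the first host with enough free CPU and memory.

    Returns the chosen host name, or None if no host can fit the workload.
    """
    for host in hosts:
        if host.cpu_free >= cpu and host.mem_free_gb >= mem_gb:
            host.cpu_free -= cpu
            host.mem_free_gb -= mem_gb
            host.workloads.append(workload)
            return host.name
    return None


# One pool serves core applications, analytics jobs, and test environments.
pool = [
    Host("host-a", cpu_free=4, mem_free_gb=16),
    Host("host-b", cpu_free=8, mem_free_gb=64),
]

placements = {
    "core-app": schedule(pool, "core-app", cpu=2, mem_gb=8),
    "spark-job": schedule(pool, "spark-job", cpu=6, mem_gb=32),
    "test-env": schedule(pool, "test-env", cpu=2, mem_gb=8),
}
print(placements)
```

The point of the sketch is only that no workload is tied to a particular machine: the scheduler decides at run time where each one lands, which is what allows core applications and third-party analytics to share the same hosts.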
Then, in December of last year, Bitnami reported that BlackRock was already a user of Kubeless, a serverless framework for Kubernetes. Later that month, the first public details about the use of K8s in the same Aladdin platform appeared in various media. Finally, they were complemented by a success story on the Kubernetes website.
Aladdin and Kubernetes
The goal of the new web application for Aladdin, whose infrastructure was built with Kubernetes, was to provide a user-friendly interface giving investors access to the analytical data they need. This approach replaced complex client-side installations (i.e., on user desktops) of Python and R, which interacted with the project's servers where distributed computations ran. To implement it, a team of 20 people from different BlackRock departments (development, infrastructure operations, product operations, and project management) was assembled, and it achieved a successful result in 100 days.
The initial plan for creating the cloud-based web application was proposed by the company's developers and consisted of using Ansible to describe all the necessary infrastructure. However, the operations engineers pointed to the poor prospects of this approach (in their view, it would lead to "the emergence of a completely different product" and "prove to be too expensive") and began looking for another solution.
Having already had experience with "other cloud environments" (unfortunately, it is not specified which ones), the joint team of engineers chose Kubernetes. The two main reasons were its open source code and confidence in the project's long-term prospects. As Uri Morris, vice president of product operations at BlackRock, explained, companies usually choose technologies that will remain relevant in one form or another for the next 5-10 years, and the young Kubernetes already looked quite convincing. In part, this was because the number of non-Google contributors to the K8s codebase had exceeded the number of committers from Google. As the Kubernetes distribution, the team chose an implementation from a commercial vendor: OpenShift from Red Hat.
Figure: Pull request adding BlackRock to the list of OpenShift users
Implementation Features and Results
Implementing the Kubernetes-based architecture for the Aladdin platform was not without challenges. The main ones were:
- corporate firewalls that prevented the installation of various packages by following ready-made instructions (as a result, some improvements were contributed to the Minikube codebase);
- difficulty with service discovery, caused by the use of a proprietary message bus that connects the many Aladdin services and provides an API for quickly building applications;
- integration with existing tools, for example with the same message bus, which was done through a separate gateway that the Kubernetes cluster communicates through and whose built-in mechanisms made it possible to control and throttle incoming requests.
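The source does not say how the gateway regulates incoming requests. As one common way to throttle traffic in front of a backend, the sketch below shows a token bucket; the technique, the class name, and the numbers are assumptions for illustration only, not a description of BlackRock's gateway.

```python
class TokenBucket:
    """Admit about `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate                # tokens added per second
        self.capacity = capacity        # maximum number of stored tokens
        self.tokens = float(capacity)   # the bucket starts full
        self.last = now                 # time of the previous check, in seconds

    def allow(self, now):
        """Return True if a request arriving at time `now` may pass."""
        elapsed = now - self.last
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Time is passed in explicitly so the example is deterministic.
bucket = TokenBucket(rate=1, capacity=2)
decisions = [bucket.allow(t) for t in (0.0, 0.0, 0.0, 1.0)]
print(decisions)
```

Three requests arriving at the same instant exhaust the burst allowance, so the third is rejected; one second later a token has been refilled and the next request passes.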
Note: At the core of the BlackRock Messaging System (BMS) is the request-response paradigm. The server side, BmsServer, is a multi-threaded C++ application, and client libraries are available in Java, C++, Python, JavaScript, Perl, C#, and Julia. Read more about this system on the company's blog.
Figure: Illustrations of the design and operation of the BlackRock Messaging System
In general, the new Kubernetes-based infrastructure for the application was complemented by existing tools already familiar to the operations team, which saved on staffing (it minimized the need to hire new people to maintain the project).
The infrastructure was rolled out in phases, starting with the development environment, followed by a test environment and two production environments. Throughout the project, the team held weekly one-hour meetings with all participants (distributed across different countries), as well as additional shorter discussions on specific technical issues. BlackRock representatives named the initiative of the engineers, who were trusted to do what they themselves considered right, as an important feature of this implementation overall.
The result of the project was the launch of the platform for internal users 100 days after the start of implementation. The infrastructure was designed for 30 users, who showed up within the first hours after launch, so the services were promptly scaled to serve 150 people. The success of the new infrastructure allowed BlackRock to talk about its intention to move other company applications to Kubernetes as well, but before making a final (and large-scale) decision, the company wants to gain operating experience in production, which it estimates will take from six months to a year. In addition, the already mentioned interview confirms that the company has begun working with Kubeless, the serverless framework for Kubernetes.
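The jump from 30 to 150 users mentioned above can be illustrated with a proportional scaling rule. The formula below has the same shape as the one documented for the Kubernetes Horizontal Pod Autoscaler, but the source does not say whether autoscaling was used or how many replicas were running, so the replica counts here are hypothetical and only the user numbers come from the text.

```python
import math


def desired_replicas(current_replicas, current_load, target_load):
    """Proportional rule: scale replicas by the ratio of actual to target load.

    `current_load` and `target_load` are per-replica values of the same metric
    (here: users served per replica).
    """
    return math.ceil(current_replicas * current_load / target_load)


# Assumption: 2 replicas were sized for 30 users, i.e. 15 users per replica.
replicas = 2
target_users_per_replica = 30 / replicas

# With 150 users on the same 2 replicas, each one serves 75 users.
actual_users_per_replica = 150 / replicas

needed = desired_replicas(replicas, actual_users_per_replica, target_users_per_replica)
print(needed)
```

Under these assumed numbers the rule asks for five times as many replicas, matching the fivefold growth in users.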
And here is the technology stack used at BlackRock, as listed in the company's current job opening for a Senior Big Data Engineer:
- Java, Python, Greenplum (SQL, SP, Functions), Cassandra, ESP (Complex Event Processing), Git, Maven, Linux;
- New product development may include Docker, Kubernetes, Spark, Kafka, and Kafka Streams (the same list is specified in the candidate requirements).
The duties also include an item on "deploying, designing and scaling microservices".