
Our experience in creating an API Gateway

Some companies, including our customer, develop their product through a partner network. For example, large online stores are integrated with a delivery service: you order a product and soon receive a tracking number for the package. Another example: you buy an insurance policy or an Aeroexpress ticket together with a plane ticket.

All of this works through a single API, which must be exposed to partners via an API Gateway. We solved this problem, and in this article we will share the details.

Given: an ecosystem and an API portal with an interface where users register, receive information, and so on. We needed to build a convenient and reliable API Gateway. In the process we solved the following tasks: authorization, authentication and modification of the original request; highly available access to customer data; access to Hazelcast from both Lua and .NET; logging and monitoring. Each of these is covered in its own section below.

There are two types of API management:

1. The standard kind works as follows: before connecting, the user tests the capabilities, then pays and embeds the API on their site. It is most often used by small and medium businesses.

2. Large B2B API management: the company first makes a business decision to connect, becomes a partner with contractual obligations, and only then connects to the API. After all the formalities are settled, the company receives test access, passes testing and goes to production. None of this is possible without a management decision to connect.



Our solution


In this part we will talk about how we created our API Gateway.

The end users of the gateway are our customer's partners. For each of them the necessary contracts are already in place; we only need to extend the existing functionality by recording the access granted to the gateway. Accordingly, a controlled process for connection and management is needed.

Of course, we could have taken a ready-made solution for the API Management task in general and the API Gateway in particular, for example Azure API Management. It did not suit us because we already had an API portal and a huge ecosystem built around it. All users were already registered and already knew where and how to get the information they needed. The required interfaces already existed in the API portal; we only needed the API Gateway. So we started developing it.

What we call an API Gateway is a proxy of sorts. Here again we had a choice: write our own proxy or pick something ready-made. We went the second way and chose the nginx + Lua combination. Why? We needed reliable, battle-tested software that supports scaling, and after implementation we wanted to verify only the correctness of our business logic, not the correctness of the proxy itself.

Any web server has a request processing pipeline. In the case of nginx, it looks like this:



(diagram of the request processing phases, from the lua-nginx-module GitHub page)

Our goal was to plug into this pipeline at the point where we can modify the original request.

We wanted a transparent proxy: the functional part of the request should remain as it came. We only control access to the final API and help the request reach it. If the request is incorrect, the final API should return the error, not us. The only reason for us to reject a request is the client's lack of access.

A Lua extension for nginx already exists. Lua is a scripting language; it is very lightweight and easy to learn. So we implemented the necessary logic in Lua.

The nginx configuration where all the work happens (an analogue of application routes) is quite readable. The last directive, post_action, is worth noting.

location /middleware {
    more_clear_input_headers Accept-Encoding;
    lua_need_request_body on;
    rewrite_by_lua_file 'middleware/rewrite.lua';
    access_by_lua_file 'middleware/access.lua';
    proxy_pass https://someurl.com;
    body_filter_by_lua_file 'middleware/body_filter.lua';
    post_action /process_session;
}

Consider what happens in this configuration:
more_clear_input_headers - clears the values of the headers listed after the directive.
lua_need_request_body - controls whether the original request body is read before the rewrite/access phases run. By default nginx does not read the client request body; if you need access to it, this directive must be on.
rewrite_by_lua_file - the path to the script with the logic for modifying the request.
access_by_lua_file - the path to the script with the logic that checks access to the resource.
proxy_pass - the URL to which the request is proxied.
body_filter_by_lua_file - the path to the script with the logic for filtering the response body before it is returned to the client.
And finally, post_action - an officially undocumented directive that lets you perform further actions after the response has been given to the client.
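
For illustration, here is a minimal sketch of what middleware/access.lua could look like. It is not our production code: the client_store module and its lookup function are hypothetical. The idea is to identify the client by its certificate fingerprint and reject unknown clients, which matches the access logic described below.

    -- middleware/access.lua: a minimal access check (sketch).
    -- client_store and find_by_fingerprint are hypothetical names.
    local client_store = require "client_store"

    -- nginx exposes the fingerprint of the verified client certificate
    local fingerprint = ngx.var.ssl_client_fingerprint
    if not fingerprint or fingerprint == "" then
        ngx.exit(ngx.HTTP_FORBIDDEN)
    end

    if not client_store.find_by_fingerprint(fingerprint) then
        -- the only reason we ever reject a request ourselves:
        -- this client has no access
        ngx.exit(ngx.HTTP_FORBIDDEN)
    end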

Below we describe, in order, how we solved each of our tasks.

Authorization / authentication and request modification


Authorization

We built authorization and authentication around access certificates. There is a root certificate, and for each new client of our customer a personal certificate is generated, with which the client can access the API. This is configured in the server section of the nginx settings.

ssl on;
ssl_certificate /usr/local/openresty/nginx/ssl/cert.pem;
ssl_certificate_key /usr/local/openresty/nginx/ssl/cert.pem;
ssl_client_certificate /usr/local/openresty/nginx/ssl/ca.crt;
ssl_verify_client on;

Modification

A fair question arises: what do we do with a certified client if we suddenly want to disconnect it from the system? We would rather not reissue certificates for all the other clients.

This brings us smoothly to the next task: modifying the original request. Generally speaking, the original client request is not valid for the end system, and one of our tasks is to add the missing parts to make it valid. The point is that the missing data differs for each client. We know the client comes to us with a certificate, from which we can take the fingerprint and use it to extract the client's data from the database.

If at some point we need to disconnect a client from our service, we simply remove its data from the database, and it can no longer do anything.
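
A sketch of how such a modification could look in middleware/rewrite.lua. The payload field names and the client_store module are our illustrative assumptions, not the real contract:

    -- middleware/rewrite.lua: a sketch of enriching the original request.
    -- Field names and the client_store module are hypothetical.
    local cjson = require "cjson.safe"
    local client_store = require "client_store"

    local client = client_store.find_by_fingerprint(ngx.var.ssl_client_fingerprint)
    if not client then
        return  -- the access phase will reject such a request anyway
    end

    -- "lua_need_request_body on;" made the body available to this phase
    local body = ngx.req.get_body_data()
    local payload = (body and cjson.decode(body)) or {}

    -- add the client-specific parts that are missing from the request
    payload.partner_id  = client.partner_id
    payload.credentials = client.api_credentials

    ngx.req.set_body_data(cjson.encode(payload))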

Work with customer data


We needed to ensure high availability of the solution, and in particular of the way we obtain customer data. The difficulty is that the source of this data is a third-party service that guarantees neither uninterrupted operation nor sufficiently high speed.

Therefore, we needed to ensure high availability of the customer data itself. As the tool we chose Hazelcast, which gives us:
distributed in-memory storage of the data;
fault tolerance through data replication across the nodes;
a REST API that can be reached from Lua.

We chose the simplest caching strategy: on the first request, the client's data is fetched from the third-party service and put into Hazelcast, and all subsequent requests are served from the cache.
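
Jumping ahead slightly (we access Hazelcast from Lua through its REST API, see below), a sketch of this lazy caching could look like this. The host, map name and the fetch_from_source helper are illustrative, and lua-resty-http is an assumed dependency:

    -- A sketch of the lazy cache lookup through Hazelcast's REST maps API.
    local http = require "resty.http"  -- assumed dependency

    local HZ_MAP = "http://hazelcast:5701/hazelcast/rest/maps/clients/"

    local function get_client_data(fingerprint)
        local httpc = http.new()

        -- try the cache first
        local res = httpc:request_uri(HZ_MAP .. fingerprint)
        if res and res.status == 200 then
            return res.body
        end

        -- cache miss: ask the slow third-party service and remember the answer
        local data = fetch_from_source(fingerprint)  -- hypothetical helper
        if data then
            httpc:request_uri(HZ_MAP .. fingerprint, { method = "POST", body = data })
        end
        return data
    end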



Work with the end system happens within sessions, and there is a limit on their maximum number. If a client does not close its session, we have to do it ourselves.

Data about an open session comes from the end system and is initially processed on the Lua side. We decided to save this data in Hazelcast and process it with a job written in .NET, which periodically checks whether the open sessions are still alive and closes the stale ones.
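
This is what the body_filter_by_lua_file and post_action directives from the config above are wired up for. A rough sketch of the Lua side, assuming the end system returns the session id in a JSON body and that a $session_id variable is declared in the nginx config (both are our assumptions):

    -- middleware/body_filter.lua: pick the session id out of the end
    -- system's response (sketch; response format is assumed).
    local cjson = require "cjson.safe"

    local chunk, eof = ngx.arg[1], ngx.arg[2]
    ngx.ctx.buffered = (ngx.ctx.buffered or "") .. (chunk or "")

    if eof then
        local payload = cjson.decode(ngx.ctx.buffered)
        if payload and payload.session_id then
            -- stash the id so the post_action handler can save it to Hazelcast
            ngx.var.session_id = payload.session_id
        end
    end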

Access to Hazelcast from both Lua and .NET


There is no Lua client for Hazelcast, but Hazelcast has a REST API, which we decided to use. For .NET there is a client, through which we planned to access the Hazelcast data on the .NET side. But things did not go that smoothly.



When data is stored via REST and retrieved via the .NET client, different serializers/deserializers are used. You therefore cannot put data via REST and read it via the .NET client, or vice versa.
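
To make the mismatch concrete: the REST maps API stores the raw request body as the entry value, while the .NET client runs values through its own serializer. A sketch of the REST half from Lua (lua-resty-http assumed):

    -- Writing a plain JSON string into a Hazelcast map via REST.
    -- The .NET client cannot deserialize an entry written this way,
    -- and entries written by the .NET client are unreadable via REST.
    local http = require "resty.http"  -- assumed dependency

    local httpc = http.new()
    httpc:request_uri("http://hazelcast:5701/hazelcast/rest/maps/clients/some-key", {
        method = "POST",
        body   = '{"id": 42}',
    })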

If you are interested, we will describe this issue in more detail in a separate article. Spoiler: the cause lies in the serialization scheme.



Logging and monitoring


Our corporate standard for logging in .NET is Serilog; all logs end up in Elasticsearch, and we analyze them in Kibana. We wanted something similar here. The only Lua client for Elasticsearch that we found broke on the very first require, so we turned to Fluentd.

Fluentd is an open-source solution that provides a unified logging layer. It collects logs from different layers of the application and routes them to a single destination.

The API Gateway runs in Kubernetes, so we decided to add a fluentd container to the same pod and write logs to fluentd's open TCP port.
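
A sketch of shipping one log record from Lua to that fluentd port. It assumes fluentd is configured with an in_tcp source and a JSON parser; the host and port are illustrative. Cosockets are not available in nginx's log phase, so the actual send runs inside a zero-delay timer:

    local cjson = require "cjson"

    local function send_log(premature, record)
        if premature then return end
        local sock = ngx.socket.tcp()
        sock:settimeout(1000)  -- never let logging stall the gateway
        if sock:connect("127.0.0.1", 24224) then
            sock:send(cjson.encode(record) .. "\n")
            sock:close()
        end
    end

    ngx.timer.at(0, send_log, {
        uri    = ngx.var.uri,
        status = ngx.var.status,
        client = ngx.var.ssl_client_fingerprint,
    })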

We also checked how fluentd behaves when it loses its connection to Elasticsearch. For two days requests were continuously sent to the gateway and logs were sent to fluentd, while the Elasticsearch IP was blocked for the fluentd container. After the connection was restored, fluentd flawlessly delivered absolutely all of the accumulated logs to Elasticsearch.

Conclusion


The chosen implementation approach allowed us to deliver a genuinely working product to the production environment in just 2.5 months.

If you ever find yourself working on something similar, we advise you first of all to clearly understand what problem you are solving and what resources you already have. Be realistic about the complexity of integrating with existing API management systems.

Decide for yourself what exactly you are going to develop: only the business logic of request processing or, as in our case, the whole proxy. And do not forget that everything you build yourself must be thoroughly tested afterwards.

Source: https://habr.com/ru/post/446438/
