Data on the frontend: a step towards future applications

For web application developers, the client-server architecture is something like one of the turtles on which, as our ancestors believed, the world rested. It is hard to imagine things any other way. However, countless web applications have created a new need: data management on the frontend. So far there is no single approach or implementation, only separate technologies that make it possible to work with data on the client. Nobody pays much attention to them yet, although it is high time. We talked with Nikita Prokopov, aka tonsky, about what already exists for working with data on the frontend and what will happen next.

- Nikita, hello! Tell us a few words about yourself. What do you do?

- I work for Cognician, a web-based educational and training platform. I do the frontend and part of the backend there.

- The abstract of your HolyJS talk says that you are a Clojure hacker...

- That is because I have been writing Clojure for several years, probably four. All my recent work has been in Clojure, so I have some expertise in it.

- If you browse Habr, you immediately notice that Clojure is much less common than other programming languages. At the same time, it runs on the JVM, integrates with Java in both directions, and has a number of advantages. What can you do with it, and who is Clojure for?

- Yes, indeed, Clojure has some problems with popularity compared to, for example, Scala, which is doing well in this regard, while Clojure stays in the shade. However, it is not some marginal language; it is still relatively widespread, and there are some cool ideas in it.
Clojure pioneered the use of immutability everywhere, it has cool primitives for synchronizing multi-threaded parallel programs, and it can compile to JavaScript. Clojure on the JVM and ClojureScript on JavaScript go hand in hand: you can write code that works on both. It is a general-purpose language, suitable for anyone who does not need super-low latency or super-predictability, as in real-time systems. It is gradually finding its place in the corporate sector: Boeing and Walmart write code in it, so quite serious businesses are deciding to switch to Clojure.

- The stated topic of your talk is "Data on the frontend". What kind of data is this, and who needs it at all?

- The story is this: I started working on the frontend seriously about a year ago; before that, my focus was mainly on the server side. There is now a growing trend that applications can be written in the browser, and they really can. But the frontend culture has not yet grown into the approaches that have long existed on the server. Writing applications on the frontend is a large body of work: what to do and how to do it. For some problems it is already clear how to solve them, for others not yet. One of these problems is how to store data. On the server there are many options: the file system, a database, memory, or caches. On the client it is not very clear: there are ad hoc solutions (done haphazardly, saving data somewhere in local storage or in variables), or you can try to use the databases that exist for the client, but there are far fewer of them. Applications keep growing, and they need to work with data. This is a serious task that requires a systematic answer and a search for an architecture. In the talk, I want to look at which approaches already exist and what their pros and cons are.

- And what will happen to the backend? Will it remain, or will it no longer be needed?

- Of course, it will continue to exist, because web applications are ephemeral: you come in and you leave, but you want your data to be preserved. The client-server architecture remains, perhaps as database-server-client, and maybe at some point in the future a plain database-client setup will appear; that is also one of the possible options. The presence of a server is one of the most interesting aspects: when the server works with data, it is in greenhouse conditions, while the client is in a wild environment, and on top of that the user is looking at it. This is exactly why many questions arise about how to do it well. You want the application to load quickly, work offline, and so on, and it is not very clear how to achieve that. If you make a real effort and build it specifically for yourself, you can get something working at great cost. But sooner or later, I think, there will be systematic solutions that can be taken "off the shelf" and used.

- It is hard for me to imagine. The familiar picture is this: there is an application, for example some kind of web-based CRM, the client sends data to the server, and everything is as usual. And then suddenly you need to manage data on the frontend. When, and for what type of applications, is it time to think about data on the frontend?

- If we are talking about an application (a graphics editor, something highly interactive), then as soon as the site or application becomes dynamic, it immediately becomes necessary to manage data on the frontend.

- Speaking of managing such data, tell us about DataScript. It seems this is your project?

- Well, yes, I am the main developer. There were several pull requests and people helped me, but it is mostly my work. The idea is that if the data is structured and there is relatively a lot of it, it is convenient to store it in a kind of database. DataScript is one example of a very lightweight database. It is not even a full database, but rather an in-memory data structure that you can run queries against and that has transactions. Roughly speaking, it is a fairly convenient data structure: the data is relational, there are links between entities, and inside everything is stored in a flat, indexed form. With such a structure it is convenient to navigate back and forth, as in a graph, and to find the necessary data quickly. On top of that, there is immutability, so you can build React-style architectures where the state is stored in one immutable object and you can do undo / redo; and there are transactions, so you can build reactive systems. There is the Datalog query language with its own syntax and capabilities. The only thing missing is persistence: DataScript works in memory.

DataScript is needed when there is a complex multidimensional state and you want to look at it from different sides and transform it. DataScript is best known in the Clojure community (although it supports Clojure, ClojureScript, and JavaScript, and runs on the JVM as well) and is often used together with Datomic on the server. At Cognician we use DataScript to represent the client session (questions, answers, comments, scripts), and on top of this we synchronize an event log with Datomic (in both directions). The guys from Precursor use the same stack for collaborative drawing (figures, objects, groups), and they also have their own solution for synchronization. There are a couple of inventory-style projects where many small properties of objects are stored in DataScript. There is even an online store. When the data sits neatly in one place, it is a very convenient foundation for writing a systematic solution on top of it, for synchronization, for example.
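To make the idea of a flat, indexed, immutable store more concrete, here is a minimal Python sketch. It is not DataScript's API or implementation: the names `Db`, `transact`, `values`, and `entities` are hypothetical, the facts are plain (entity, attribute, value) tuples, and there is no query language. It only illustrates three properties mentioned above: flat storage with indexes, navigation in both directions, and transactions that return a new snapshot, which makes undo a matter of keeping old snapshots.

```python
from typing import Any, Dict, FrozenSet, Iterable, List, Set, Tuple

# One flat fact: (entity id, attribute, value). Values must be hashable.
Datom = Tuple[int, str, Any]


class Db:
    """Immutable snapshot of a flat set of facts with two lookup indexes."""

    def __init__(self, datoms: Iterable[Datom] = ()) -> None:
        self.datoms: FrozenSet[Datom] = frozenset(datoms)
        # entity -> attribute -> values (forward navigation)
        self.eav: Dict[int, Dict[str, Set[Any]]] = {}
        # (attribute, value) -> entities (reverse navigation)
        self.ave: Dict[Tuple[str, Any], Set[int]] = {}
        for e, a, v in self.datoms:
            self.eav.setdefault(e, {}).setdefault(a, set()).add(v)
            self.ave.setdefault((a, v), set()).add(e)

    def transact(self, add: Iterable[Datom] = (),
                 retract: Iterable[Datom] = ()) -> "Db":
        # A transaction never mutates this snapshot; it builds a new one.
        return Db((self.datoms - frozenset(retract)) | frozenset(add))

    def values(self, entity: int, attr: str) -> Set[Any]:
        return set(self.eav.get(entity, {}).get(attr, set()))

    def entities(self, attr: str, value: Any) -> Set[int]:
        return set(self.ave.get((attr, value), set()))


db0 = Db()
db1 = db0.transact(add=[
    (1, "question/text", "What is a datom?"),
    (2, "answer/text", "A fact about an entity."),
    (2, "answer/question", 1),  # link from answer 2 to question 1
])
db2 = db1.transact(
    add=[(2, "answer/text", "An (entity, attribute, value) fact.")],
    retract=[(2, "answer/text", "A fact about an entity.")],
)

# Keeping old snapshots is all that undo needs.
history: List[Db] = [db0, db1, db2]
undone = history[-2]
```

Rebuilding both indexes on every transaction is wasteful; a real store would presumably share structure between snapshots. The sketch trades that efficiency for brevity.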

- Developers now have a whole zoo of databases at their disposal. Which one should they choose? Or does it make no difference?

- Document-oriented databases are the simplest type. If there is nothing better, those are the ones to take. If we talk specifically about the client, the choice is quite small: there are Minimongo and PouchDB, both document-oriented. You can also write on top of bare local storage. Ideally, you should take a database that offers a lot of capabilities, in particular synchronization with the server. Transparent two-way synchronization with the server removes most of the headache.

- Since we are talking about synchronization... Reactive data synchronization, Swarm.js: what is that?

- Reactive data synchronization is when the server has new information and figures out by itself which clients to send it to. So far, no one really knows how to do it. There is a solution based on RethinkDB and Meteor: you explicitly subscribe on the server to certain objects or collections, and they come to you via server push. This is not a trivial task and there are many problems: first, there are many clients and one server, and second, this list of subscriptions has to be maintained somehow. The flow of changes on the server is constant: there are many clients, all transactions go through the server, and the server must recognize the transactions and understand whom to notify and whom not. These tasks are solved efficiently only by very narrow technologies, if at all.
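The explicit-subscription model described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how Meteor or RethinkDB work internally: `SubscriptionServer`, `subscribe`, `transact`, and `outbox` are hypothetical names, subscriptions are per collection rather than per query, and "pushing" is modeled as appending to a per-client list.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class SubscriptionServer:
    """Routes each change only to clients subscribed to its collection."""

    def __init__(self) -> None:
        # collection name -> ids of subscribed clients
        self.subscribers: Dict[str, Set[str]] = defaultdict(set)
        # client id -> changes waiting to be pushed to that client
        self.outbox: Dict[str, List[Tuple[str, dict]]] = defaultdict(list)

    def subscribe(self, client_id: str, collection: str) -> None:
        self.subscribers[collection].add(client_id)

    def transact(self, author_id: str, collection: str, change: dict) -> None:
        # Every write passes through here, so the server decides
        # whom to notify: all subscribers except the author.
        for client_id in self.subscribers.get(collection, ()):
            if client_id != author_id:
                self.outbox[client_id].append((collection, change))


server = SubscriptionServer()
server.subscribe("alice", "todos")
server.subscribe("bob", "todos")
server.subscribe("carol", "notes")

server.transact("alice", "todos", {"id": 1, "title": "Write talk"})
```

After this transaction only `bob` has a pending change. The hard parts named in the answer start where the sketch stops: subscriptions to arbitrary queries instead of whole collections, and doing the matching for many clients under a constant flow of changes.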

(Image caption: It seems that this picture will soon cease to be relevant.)

And then the question of consistency arises: you would also like to have some order and completeness across all these objects. This is, in fact, the reactive model, and even on the server things are relatively bad in this respect: almost no databases can natively notify about transactions and changes. Server programmers solve this problem with a queue: we do not write directly to the database, but add tasks to a queue, read from this queue, and write to the database; then everyone else who is interested can also read from the same queue. The reactive model is quite new for the architecture of a whole system. Inside a single programming language you can cobble something like this together, but when it comes to the whole system (data, database, server, user interface), the approaches are still only being felt out. The question is how to assemble such a thing.
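The queue approach mentioned above can be illustrated with a small Python sketch. It is an in-process toy, assuming a single append-only list instead of a real message broker; `ChangeLog`, `append`, and `poll` are hypothetical names. The point is that the database writer is just one reader of the queue, and any other interested party can read the same changes at its own pace.

```python
from typing import Any, Dict, List


class ChangeLog:
    """Append-only queue that several independent consumers can read."""

    def __init__(self) -> None:
        self.entries: List[Any] = []
        # consumer name -> index of the next entry it has not seen yet
        self.offsets: Dict[str, int] = {}

    def append(self, change: Any) -> None:
        self.entries.append(change)

    def poll(self, consumer: str) -> List[Any]:
        # Return everything this consumer has not read and advance its offset.
        start = self.offsets.get(consumer, 0)
        batch = self.entries[start:]
        self.offsets[consumer] = len(self.entries)
        return batch


log = ChangeLog()
database: Dict[str, Any] = {}
notifications: List[str] = []

# Producers never touch the database directly; they only enqueue changes.
log.append({"key": "user:1", "value": "Ann"})
log.append({"key": "user:2", "value": "Bob"})

# Consumer 1: the only component that writes to the database.
for change in log.poll("db-writer"):
    database[change["key"]] = change["value"]

# Consumer 2: anyone else who is interested reads the same queue.
for change in log.poll("notifier"):
    notifications.append(f"changed {change['key']}")
```

Because each consumer keeps its own offset, adding a new reader does not require any change to the writers or to the database.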

The server and the clients form a distributed system with many clients and one server. Clients can write to and read from the server. This gives rise to a model in which clients accumulate a pool of changes and we want those changes to reach all the other clients. In theory, this problem has no solution in the general case, but it is solved beautifully for certain kinds of fairly simple data structures. Swarm.js is a library that provides data structures for which the synchronization problem is solved beautifully, together with a protocol for synchronizing them. Roughly speaking, if I open the same Amazon page in two browsers and add one item to the basket in one browser and a different item in the other, the two baskets can be merged with a union operation, and there can be no conflicts. Problems begin once, for example, removal is possible: then a simple merge no longer works and you have to invent something. There is a new trend in distributed systems, CRDT (Conflict-free Replicated Data Type): these are data types that merge easily, and Swarm.js is one implementation of such things. These are all pieces of a puzzle, and our global goal is a fully reactive system in which any change that occurs spreads throughout the whole system, where the system consists of a database, a server, and many clients. The IT industry is searching for such a system. Most likely, it will not be a ready-made solution but an approach that you can follow to build architectures.
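The basket example can be written down as a small Python sketch. It does not use Swarm.js and does not reflect its data structures or protocol; `GSet` and `TwoPhaseSet` here are minimal illustrations of two textbook CRDTs. A grow-only set merges by union, so the order of merges does not matter. The two-phase set is one known way to "invent something" for removal: removed items are remembered as tombstones, at the price that a removed item can never be added back.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class GSet:
    """Grow-only set: the only update is add, and merge is set union."""
    items: FrozenSet[str] = frozenset()

    def add(self, item: str) -> "GSet":
        return GSet(self.items | {item})

    def merge(self, other: "GSet") -> "GSet":
        return GSet(self.items | other.items)


@dataclass(frozen=True)
class TwoPhaseSet:
    """Added items plus tombstones; both parts only grow and merge by union."""
    added: FrozenSet[str] = frozenset()
    removed: FrozenSet[str] = frozenset()

    def add(self, item: str) -> "TwoPhaseSet":
        return TwoPhaseSet(self.added | {item}, self.removed)

    def remove(self, item: str) -> "TwoPhaseSet":
        return TwoPhaseSet(self.added, self.removed | {item})

    def merge(self, other: "TwoPhaseSet") -> "TwoPhaseSet":
        return TwoPhaseSet(self.added | other.added,
                           self.removed | other.removed)

    def value(self) -> FrozenSet[str]:
        # An item is visible only if it was added and never removed.
        return self.added - self.removed


# Two browsers add different items to the same basket.
basket_a = GSet().add("book")
basket_b = GSet().add("lamp")
merged = basket_a.merge(basket_b)

# With removal: one replica removes "lamp" while another adds "pen".
shared = TwoPhaseSet().add("book").add("lamp")
replica_1 = shared.remove("lamp")
replica_2 = shared.add("pen")
reconciled = replica_1.merge(replica_2)
```

Merging in either order gives the same basket, and merging a replica with itself changes nothing, which is what lets clients exchange state without coordination.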

You can hear Nikita speak and discuss your findings and thoughts about the future with him and with other speakers at the HolyJS conference, which will be held in St. Petersburg on June 5, 2016.

Source: https://habr.com/ru/post/283518/
