Our guest today is Dave Cramer, one of the key contributors to PostgreSQL and the author and maintainer of the PostgreSQL JDBC driver for over 15 years.
Dave doesn't speak in public very often. We were very lucky that he agreed to come to PG Day'17 Russia to hold a workshop on the optimization and efficient use of Java with PostgreSQL, together with his fellow pgjdbc contributors Álvaro Hernández Tortosa and Vladimir Sitnikov. Dave's scheduled talk on the hidden features of the driver promises to be no less interesting.
The conversation turned out to be interesting. Dave is a laconic interlocutor who answers briefly and to the point. We were able to talk in detail about the current state of JDBC driver development and Dave's role in it. He also shared his views on using Java to write stored procedures and on the current state of the international PostgreSQL community. And, as is traditional, we did not skip a preview of the upcoming workshop.
PG Day: Dave, tell us briefly about yourself: who are you, what do you do, and how do you spend your free time?
Dave: Hi, my name is Dave Cramer. I have been working with PostgreSQL for 15-16 years, since 2000, and my main activity here is the JDBC driver, because I used to be a Java programmer. I have also worked on some procedural languages, for example PL/R. What interests me in PostgreSQL? Promoting the community and its development.
As for my personal life, I have a wife, two children, two dogs, and a couple of grandchildren, which I never even expected to happen.
I live in Canada, and sometimes in the winter I go to Florida. In my free time I like to drive a car around the track. I always go as fast as I can. I think the car could go even faster, but my own skills limit me.
PG Day: How and why did you start working on the PostgreSQL Java driver?
Dave: In 1999-2000, open source software was not as common as it is now. I had just become an independent contractor, working for myself. At one point I called Microsoft tech support (I had a client with a problem), hoping they would solve it. They replied: "We will contact you in three weeks." That meant three weeks without work, which bothered me somewhat. I began to consider other options and came across open source, which I knew nothing about. I asked people how to get into it, and I was advised to answer questions on the mailing list. That's exactly what I did. The first question took me two or three days to answer. The second one was easier. Gradually I began to answer them fairly quickly.
At that time someone else was looking after the JDBC driver, and at some point he lost interest in it. I was always visible on the list, so Bruce Momjian asked me to take it over. And I said yes. That was 15-16 years ago, and I have been doing it ever since.
PG Day: Are you still working on the code base yourself?
Dave: I mostly delegate that to others. I manage the development. Sometimes I don't work on the driver for a long time and others do it for me. I tend to stay in the background. Other people do more of the work than I do.
My main task is to manage the process and make sure that no one breaks anything. Everyone tries to improve their own part of the code, but not everyone always thinks globally.
By my estimate, thousands of companies rely on the JDBC driver. If we add some option that breaks things for even five hundred of them, a lot of people will suffer. So I'm more like a project manager.
That said, just today I fixed a couple of bugs, so... By the way, a lot of the code comes from Russian contributors. Vladimir Sitnikov, for example, has contributed a great deal of code to the driver.
PG Day: How many people are currently involved in the active development of the driver? How is the workflow organized?
Dave: Frankly, there are not enough people. The number changes from time to time, depending on who wants to solve a particular problem. Most recently, a guy in the UK completely reworked the driver to build with Maven. Vladimir Goryachev, I think his name is, added a logical decoding component to the driver. Vladimir Sitnikov has spent a lot of time optimizing the driver.
We have a lot of people working on the driver, but they come and go. And I'm trying to figure out how to attract even more people.
PG Day: Are there any large companies that sponsor development or support the initiative by paying people to improve the project?
Dave: I have never seen anyone offer to pay for work on the driver. I think Vladimir Sitnikov is paid by his company (Netcracker, ed.) to work on the driver, but I'm not sure. I think Red Hat sponsored some work related to packaging. Pivotal, when I worked for them, did not object to my working on the driver. OpenSCG did not mind either. But a company directly sponsoring someone's work on the driver... That, it seems, has not happened.
PG Day: Do you plan to step down as maintainer and project manager? What are your plans for the future?
Dave: I have not thought much about the future yet, but I do not plan to leave my work. As long as other people help me, I will keep looking after the driver, managing the development, and fixing bugs. So no, for the foreseeable future I plan to continue doing this.
PG Day: Which of the recent JDBC driver enhancements do you find most interesting or impressive?
Dave: The most amazing thing I have seen happen with the driver is that we got logical replication. Why is this so important? Now we can do change data capture (CDC), and we do it in pure Java. Until then we had to write triggers, push the data somewhere (a file, a queue, or something like that), read it asynchronously, and then update the application.
Now we have logical replication built into the driver. We can do real-time updates as the data changes. I would call that the main innovation, although it does not detract from our earlier work on making the driver faster. Overall there were two things there. First, we made the driver faster, and second, we completely reworked the code to make it easier for people who are unfamiliar with the driver to read it and work on it. Those are the two most important things, in addition to logical replication.
PG Day: Are there any problems in the driver's architecture that you would like to fix?
Dave: The only thing the driver lacks is the ability to handle user-defined (custom) types correctly. I would like that to be added to the driver, but it is not requested very often.
As far as I can tell, the driver is fine and works relatively well. The architecture seems fine too.
PG Day: How do people cope with the lack of support for custom types? Are there any ways around the problem?
Dave: There is another PostgreSQL driver that handles this. It is called pgjdbc-ng, the "next generation" driver. It does not get much attention, and I do not think custom types are something the PostgreSQL community uses very often. It seems to me that most Java users work with Hibernate or Spring, and for the most part their data is very simple. They select, insert, and delete, and that is about it.
In my opinion there is one more thing here, and it relates to your earlier question about whether people and companies spend money on driver work. What is the biggest challenge for us as the authors of the JDBC driver (or of any driver)? Drivers are like the wheels on a car. People buy a car and expect it to have wheels. They are round, they do not do much, but they move the car. As long as the wheels keep turning, nobody cares about them. It is not fashionable. People do not spend hours discussing it on the hackers list.
My most important challenge is not to break the driver.
PG Day: Java inside PostgreSQL as a language for writing stored procedures: is this a good idea? What are the typical uses for such a tool?
Dave: Yes, there is a technology called PL/Java: Java as a stored procedure language in PostgreSQL. The problem is that PostgreSQL is process-based. Every time you execute stored procedures written in Java, a JVM has to be started inside the connection.
That becomes rather expensive. So you should use a stored procedure in Java only when it can only be written in Java and it is worth the JVM startup cost. The same can be observed with other languages, for example PL/Python and especially PL/R. We would like to use all these languages to do serious data processing in the JVM, R, or Python, the languages used by data processing and analysis specialists.
Do I think PL/Java is a good idea? I am not very convinced. For a while I worked on another project called PL/J, which shared one JVM among the connections, and I would be happy if someone revived it. Personally, I did not have time for it. The code is still around. It is quite complex.
Oracle has done quite a lot of work on Java inside the database. The JVM inside Oracle is not the JVM that we have.
They have a separate JVM optimized specifically for these tasks. I think it does not have the startup costs that we have in PostgreSQL. In that case, if something can be done only in Java, it is justified.
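The trade-off Dave describes, a one-time JVM startup cost per connection against whatever the Java procedure saves on each call, can be pictured with a little arithmetic. The sketch below is only an illustration: the simple linear cost model, the class and method names, and all the numbers are assumptions made for this example, not measurements of PL/Java.

```java
// Illustrative cost model (an assumption, not PL/Java internals):
// a connection pays the JVM startup cost once, then each call to the
// Java procedure saves some time compared with an alternative.
final class JvmStartupBreakEven {

    /**
     * Number of calls on one connection after which the time saved per call
     * has paid back the JVM startup cost. Returns -1 if the Java procedure
     * is not faster per call, in which case the startup cost is never repaid.
     */
    public static long breakEvenCalls(double jvmStartupMs,
                                      double javaPerCallMs,
                                      double alternativePerCallMs) {
        double savedPerCallMs = alternativePerCallMs - javaPerCallMs;
        if (savedPerCallMs <= 0) {
            return -1;
        }
        return (long) Math.ceil(jvmStartupMs / savedPerCallMs);
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 500 ms startup, 1 ms per Java call,
        // 3 ms per call for the alternative implementation.
        System.out.println(breakEvenCalls(500.0, 1.0, 3.0)); // prints 250
    }
}
```

Under this model, short-lived connections that call the procedure only a few times never recover the startup cost, which is one way to read Dave's advice to use Java procedures only where nothing else will do.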
PG Day: It's no secret that Java is one of the core enterprise technologies. Do you think PostgreSQL is a good choice as the foundation of an enterprise technology stack?
Dave: I would say yes, but I have a personal interest in it. I think the main factor here is price. Nothing stops you from developing with PostgreSQL at minimal cost.
A more serious challenge we face as PostgreSQL developers is adoption: getting people to trust the technology. In North America we have noticed serious interest in PostgreSQL, a surge of interest as people began to pay more attention to the cost of their projects.
It seems to me that a significant part of this is the huge number of databases that large corporations now manage. You often hear that they have thousands of databases, which means that if you pay per core for your databases, it becomes very expensive. Especially now, when everybody is scaling their services and introducing microservice architectures, using a proprietary solution becomes much more expensive.
PG Day: What changes have you noticed in the community over the last 5-10 years? Is it clear in which direction it is moving?
Dave: Over the past five years the popularity of PostgreSQL has grown. The number of people working with it is increasing, as is the number of people writing code for PostgreSQL itself. When I started, very few people wrote code: only 4-5 people actually worked on it.
Today we have about 10 companies, each with about 10 people, all making a significant contribution to the project code.
Sometimes pieces of code are even incompatible with each other. So the main task facing our group is to resolve these conflicts and to choose the right path for the community, so that users get the best possible functionality. Many features are driven by the companies behind them. Their customers have their own reasons for wanting this or that feature. This is perhaps the most difficult task for us.
PG Day: Can you give a brief preview of the workshop you are holding with Álvaro and Vladimir at PG Day'17 Russia? What is in the program, and what problems and tasks will you work on?
Dave: We want to make the course practice-oriented. We are dividing it into several blocks, and each of us is preparing material for his own block. There will be about an hour and a half of theory; the rest is practice.
Our task is to challenge the attendees and make them sweat. It is unlikely that everyone will manage to solve absolutely all the problems, but it will give them a push to develop further. We will, of course, show the right and wrong approaches to solving problems. This will give attendees food for thought and for further practice.
We are currently working on several main topics: data insertion performance, logical decoding, and working with data types. Vladimir is preparing very interesting material on methods of measuring performance in programs. How do you recognize where the “bottleneck” is: at the application, driver, or database level? Vladimir will cover all of this.
These are the core topics. Perhaps we will have time for something else, depending on what the audience wants. We will prepare more material than we need, and afterwards we will share all of it so that attendees can practice after the workshop.
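The interview does not describe the workshop material itself, so the following is only a minimal sketch of one generic way to start looking for a bottleneck: time each stage of a request separately and compare the totals. The `StageTimer` class, its methods, and the stage names are hypothetical and were made up for this illustration; the work inside the lambdas is a stand-in for real application code and real JDBC calls.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;

// Hypothetical helper: accumulates wall-clock time per named stage.
final class StageTimer {
    private final Map<String, Long> nanosByStage = new LinkedHashMap<>();

    /** Runs the work, adds its elapsed time to the stage total, returns its result. */
    public <T> T time(String stage, Callable<T> work) {
        long start = System.nanoTime();
        try {
            return work.call();
        } catch (Exception e) {
            // Checked exceptions (e.g. SQLException) are wrapped so callers stay simple.
            throw new IllegalStateException("stage failed: " + stage, e);
        } finally {
            nanosByStage.merge(stage, System.nanoTime() - start, Long::sum);
        }
    }

    /** Read-only view of the accumulated nanoseconds per stage, in first-use order. */
    public Map<String, Long> totals() {
        return Collections.unmodifiableMap(nanosByStage);
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();
        // Stand-ins: real code would build parameters here and call JDBC below.
        String sql = timer.time("application", () -> "SELECT " + (20 + 22));
        int length = timer.time("driver + database", () -> sql.length());
        timer.totals().forEach((stage, nanos) ->
                System.out.printf("%-20s %10.3f ms%n", stage, nanos / 1e6));
    }
}
```

Client-side timing like this cannot separate the driver from the database on its own. Comparing it with the execution time the server reports, for example from `EXPLAIN ANALYZE`, is one way to estimate how much of a round trip is spent outside the database.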
PG Day: Thank you, Dave!