What I learned from my own bitter experience (over 30 years in software development)

This is a cynical, clinical collection of what I have learned over 30 years of experience in software development. I repeat, some things are very cynical, and the rest is the result of long observations at different places of work.

Software development

First the specs, then the code

If you do not know exactly what you are trying to solve, you do not know what code to write.
Describe how your application should work before you start programming.

“Without requirements or design, programming is the art of adding bugs to an empty text file” - Louis Srygley

Sometimes even a “brief presentation” is enough - no more than two paragraphs describing what your application does.
There have been times when, because the next steps were not written down, I spent more time staring at the code and wondering what to do next than writing it. That is a good sign that it is time to stop and discuss the situation with colleagues. Or, perhaps, to rethink the solution.

Describe stages as comments.

If you don’t know how to get started, describe the high-level data flow of your application in plain language. Then fill the gaps between the comments with code.

Or even better: consider each comment a function, and then write a function that does just that.
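As a rough sketch of this idea, assume a tiny report tool whose flow was first written as three comments (parse the input, total it per customer, format the result). Each comment then became one function. The function names and the "name,amount" input format are invented for illustration.

```python
def load_orders(lines):
    # Step 1: parse each "name,amount" line into a (name, amount) pair.
    pairs = (line.split(",") for line in lines)
    return [(name, float(amount)) for name, amount in pairs]


def total_per_customer(orders):
    # Step 2: add up the amounts for each customer.
    totals = {}
    for name, amount in orders:
        totals[name] = totals.get(name, 0.0) + amount
    return totals


def format_report(totals):
    # Step 3: render one "name: total" line per customer, sorted by name.
    return [f"{name}: {total:.2f}" for name, total in sorted(totals.items())]


def main(lines):
    # The original comments, now readable as a sequence of calls.
    orders = load_orders(lines)
    totals = total_per_customer(orders)
    return format_report(totals)
```

Reading main() gives back the description you started from.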

Gherkin helps you understand the expectations

Gherkin is a format for describing tests, built on the principle: "Given that the system is in a certain state, when something happens, then this is expected." Even if you do not use testing tools that understand Gherkin, writing in this format will give you a good idea of what to expect from the application.
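Even without a Gherkin-aware tool, the Given / When / Then shape can be carried into an ordinary test as comments. The sketch below assumes a hypothetical Cart class invented only for this example.

```python
class Cart:
    """Hypothetical shopping cart used only to illustrate the test shape."""

    def __init__(self):
        self.items = []

    def add(self, name, price):
        self.items.append((name, price))

    def total(self):
        return sum(price for _, price in self.items)


def test_adding_an_item_updates_the_total():
    # Given an empty cart
    cart = Cart()
    # When an item costing 10 is added
    cart.add("book", 10)
    # Then the total is 10
    assert cart.total() == 10
```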

Unit tests - good; integration tests - even better.

In my current job, we only test modules and classes. For example, we write tests only for the presentation layer, then tests only for the controller layer, and so on. This helps us see whether each part is in order, but it does not show the whole picture. For that, integration tests, which check the behavior of the entire system, are more useful.

Tests allow you to improve the API

We program in layers: there is a storage layer that should make our data persistent; a processing layer that must somehow transform the stored data; a presentation layer that holds information about how the data is presented; and so on.

As I said, integration tests are better, but testing the layers themselves helps you understand what their API looks like. You get a clearer picture of what it takes to call something: is the API too complicated? Do I need to keep so much data around to make one call?

Write tests that you can run from the command line.

It is not the command line itself that matters, but knowing the command that runs your tests and being able to automate it, which you can then carry over to a continuous integration tool.
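A minimal sketch of a test that needs nothing but a terminal, using Python's standard unittest module. The slugify function and the file name mentioned below are assumptions made for the example.

```python
import unittest


def slugify(title):
    # Lowercase the title and join its words with dashes.
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    def test_spaces_become_dashes(self):
        self.assertEqual(slugify("Hello Big World"), "hello-big-world")
```

If this were saved as test_slugify.py, running "python -m unittest test_slugify" from the shell would execute it, and the same command could be pasted into a CI job.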

Be ready to throw your code away.

Many people who start with test-driven development (TDD) get annoyed when you tell them they may have to rewrite a lot of code, including what they have already written.

TDD was designed for throwing code away: the more you learn about the problem, the more you understand that what you wrote will not solve it in the long run.

Do not worry about it. Your code is not a wall: if you have to throw it away, it is not wasted material. Of course, you spent time writing that code, but now you understand the problem better.

Good languages come with built-in testing.

I assure you that if a language has a testing framework in its standard library, even a minimal one, its ecosystem will be better tested than that of a language without one, no matter how good the external testing frameworks for that language are.

Thinking about the future means wasting energy.

When developers try to solve a problem, sometimes they try to find a way that will solve all problems, including those that may arise in the future.

I'll tell you what: these future problems will never arise, and you will have to maintain a huge pile of code that is never fully used, or you will end up rewriting everything because of that pile of unused code.

Solve the problem you have now. Then solve the next one. Then the next one. Once you notice a pattern emerging from these solutions, then, and only then, will you find your “one-stop solution”.

Documentation is a love letter to your future self.

We all know what a pain it is to write the damn documentation for functions, classes and modules. But understanding what you were thinking when you wrote a particular function can save your ass in the future.

Function documentation is its contract.

When you start programming by writing the documentation, you are actually creating a contract (perhaps with yourself): “I claim that this function does this, and this is what it does.”

If you later discover that your code does not match the documentation, the problem is in the code, not in the documentation.

If the description of a function contains "and", that is bad

A function should do only one thing. If you write its documentation and notice that you have added an “and”, the function does more than one thing. Split it into two functions and get rid of the “and”.
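As an illustration, suppose a function had been documented as "parse the readings and compute their average". Splitting on the "and" might look like the sketch below. Both function names and the comma-separated input format are invented for the example.

```python
def parse_temperatures(text):
    """Parse comma-separated readings into a list of floats."""
    return [float(part) for part in text.split(",") if part.strip()]


def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)
```

Each docstring now fits in one clause, and each function can be tested and reused on its own.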

Do not use boolean values as parameters

When developing a function, it may be tempting to add a flag. Do not do this.
Let me explain with an example: say you have a messaging system with a getUserMessages function that returns all of a user's messages. But in some situations you need to return either a short summary of each message (say, the first paragraph) or the entire message. So you add a flag, a boolean parameter that you call retrieveFullMessage.

Again, do not do this.

Because whoever reads your code will see getUserMessages(userId, true) and wonder what that is all about.

Instead, you can rename the function to getUserMessageSummaries and add another one called getUserMessagesFull, or something like that. Each of them can simply call the original getUserMessages with true or false, but the interface outside your class / module stays clear.
But do not add flags or boolean parameters to functions.
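A small Python rendering of that advice follows. The names are snake_case adaptations of the ones in the example above, and the in-memory MESSAGES store is purely hypothetical.

```python
# Hypothetical in-memory store: user id -> list of message bodies.
MESSAGES = {
    42: ["First paragraph.\n\nSecond paragraph.", "Only one paragraph."],
}


def _get_user_messages(user_id, retrieve_full_message):
    # Private helper: the boolean never leaks outside this module.
    messages = MESSAGES.get(user_id, [])
    if retrieve_full_message:
        return list(messages)
    # A summary is taken to be the first paragraph of each message.
    return [body.split("\n\n")[0] for body in messages]


def get_user_message_summaries(user_id):
    return _get_user_messages(user_id, retrieve_full_message=False)


def get_user_messages_full(user_id):
    return _get_user_messages(user_id, retrieve_full_message=True)
```

A caller now writes get_user_message_summaries(42), which needs no explanation at the call site.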

Be wary of interface changes.

In the previous paragraph, I mentioned renaming a function. If you control all the source code that uses the function, this is not a problem; it is just a matter of search and replace. But if the function is exposed by a library, you should not change its name on a whim. That will break many applications you do not control and upset many people.

You can create the new function and mark the current one as deprecated, either in the documentation or in code. Then, after a few releases, you can finally kill it.
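One way to mark a function as deprecated in code, sketched with Python's standard warnings module. The two function names are illustrative and not taken from any real library.

```python
import warnings


def get_user_message_summaries(user_id):
    # The new, preferred entry point (body is a stand-in).
    return [f"summary for user {user_id}"]


def get_user_messages(user_id):
    """Deprecated: use get_user_message_summaries instead."""
    warnings.warn(
        "get_user_messages is deprecated; use get_user_message_summaries",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this line
    )
    return get_user_message_summaries(user_id)
```

Old callers keep working but see a warning, which gives them a few releases to migrate.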

Ugly solution: create the new function, mark the current one as deprecated, and add a sleep at the beginning of the old function to force those who still use it to update.

Good languages have documentation built in.

If the language has its own way of documenting functions, classes, modules and everything else, and ships with even a simple documentation generator, then all of these will be well documented (not great, but at least good).

And languages that do not have embedded documentation are often poorly documented.

Language is more than just a language.

You write in a programming language and make things "work." But a language is more than its special words: it has a build system, a dependency management system, ways of making tools, libraries and frameworks work together, a community, and a way of dealing with people.

Do not choose a language just because it is easy to use. Remember that you may find the syntax simple, but by choosing the language you also choose the way its creators deal with its community.

Sometimes it's better to let an application crash than to do nothing.

Although it sounds strange, it is better not to add error handling at all than to silently catch errors and do nothing.

Java has a notorious pattern:

 try {
     something_that_can_raise_exception();
 } catch (Exception ex) {
     System.out.println(ex);
 }

Nothing is done with the exception here; it is only printed.

If you do not know how to handle the error, let it happen, so at least you can find out when it happened.

If you know how to handle it, handle it.

In contrast to the previous point: if you know when an exception, error or result will pop up, and you know how to handle it, then do so. Show an error message, try to save the data somewhere, dump the user's input into a log so it can be processed later - just handle it.
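The two rules can be shown side by side in a short sketch. Both functions are invented for illustration: the first deliberately has no error handling, the second handles the one failure it knows how to recover from.

```python
import json
import sys


def load_settings(text):
    # No try/except: if the text is not valid JSON we do not know how to
    # recover, so json.JSONDecodeError propagates and the failure is visible.
    return json.loads(text)


def parse_port(raw, default=8080):
    # Here we do know what to do: report the problem and use a default.
    try:
        return int(raw)
    except ValueError:
        print(f"invalid port {raw!r}, using {default}", file=sys.stderr)
        return default
```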

Types tell what data you have

Memory is just a sequence of bytes. Bytes are simply numbers from 0 to 255. What these numbers mean is described in the language type system.

For example, in C, the character type (char type) with a value of 65 will most likely be the letter “A”, and an int with a value of 65 will be the number 65.

Keep this in mind when working with your data.
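The same point can be reproduced in Python with the standard struct module: identical bytes, read under three different type interpretations. Little-endian layout is an assumption of this sketch.

```python
import struct

raw = bytes([65, 0, 0, 0])  # four bytes of "memory"

as_text = raw[:1].decode("ascii")       # first byte read as a character: "A"
as_int = struct.unpack("<i", raw)[0]    # four bytes read as a 32-bit integer: 65
as_float = struct.unpack("<f", raw)[0]  # same four bytes read as a 32-bit float:
                                        # a tiny positive number, not 65.0
```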

This is what many people forget when they add up booleans to count the number of True values. I recently came across this JavaScript example:

 console.log(true+true === 2);
 > true
 console.log(true === 1);
 > false

If your data has a schema, store it as a structure.

If the data is simple (for example, only two fields), you can store it in a list (or a tuple, if your language has one). But if the data has a schema, that is, a fixed format, always use some kind of structure or class to store it.
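A brief sketch of the difference, using a dataclass from Python's standard library. The UserMessage fields are invented for the example.

```python
from dataclasses import dataclass

# Two fields with an obvious meaning: a tuple is fine.
point = (3, 4)


# A fixed format with several named fields: give it a structure.
@dataclass(frozen=True)
class UserMessage:
    user_id: int
    subject: str
    body: str
    read: bool = False


message = UserMessage(user_id=42, subject="Hello", body="First paragraph.")
```

With the structure, message.subject is self-describing, whereas message[1] on a tuple would force the reader to remember the field order.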

Recognize cargo cults and stay away from them

The idea of a "cargo cult" is that if someone else did it, then so can we. Most often, a cargo cult is simply an “easy way out” of a problem: why should we think about how to store user data properly if X has already done it?

“If Big Company stores its data this way, then so can we.”

“If Big Company uses it, then it must be good.”

"The right tool for the task" is a way to impose your opinion.

The phrase “the right tool for a task” should mean that there is a right and wrong tool for something. For example, using a specific language or framework instead of the current language or framework.

But every time I hear this expression from someone, people thus push their favorite language / framework instead of, say, the correct language / framework.

The “right tool” is more obvious than you think.

Perhaps you are now working on a project in which you need to process some text. Maybe you want to say, "Let's use Perl, because everyone knows that Perl is very good at text processing."

What you are forgetting: your team specializes in C. Everyone knows C, not Perl.

Of course, if this is a small throwaway project, Perl is fine. But if the project is important to the company, it is better to write it in C.

PS: Your heroic project (more on this below) may fail because of this.

Do not mess with things outside your project.

Sometimes, instead of using the proper extension mechanisms, people start changing external libraries and frameworks. For example, they make changes directly to WordPress or Django.

This is an easy and very quick way to make a project unmaintainable. As soon as a new version is released, you will have to synchronize your changes with the upstream project, and soon you will find that your changes no longer apply and you are stuck on an old version of the external tool, full of security holes.

Data flows beat patterns

This is my personal opinion. If you understand how data should flow through your code, your code will be better for it than if you apply a pile of design patterns.

Design patterns are used to describe solutions, not to search for them.

Again, my personal opinion. According to my observations, most often design patterns are used to find a solution. As a result, the solution — and sometimes the problem itself — is distorted to fit the pattern.

First, solve your problem. Find a good solution, and then look through the patterns to learn the name of your solution.

I have seen it many times: we have a problem, the pattern is close to the correct solution, let's use the pattern, now we need to add a lot of everything to the correct solution so that it matches the pattern.

Learn the basics of functional programming.

You do not need to go deep into questions like “what are monads?” or “is this a functor?” But remember: do not keep mutating data; create new elements with new values (treat data as immutable); and, as far as possible, write functions and classes that do not keep internal state (pure functions and classes).
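A minimal sketch of both habits, assuming a hypothetical Account record: the data is frozen, and the function returns a new value instead of changing the one it was given.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Account:
    owner: str
    balance: int


def deposit(account, amount):
    # Pure: the result depends only on the arguments, and the original
    # account is left untouched; a new Account is returned instead.
    return replace(account, balance=account.balance + amount)
```

After after = deposit(before, 5), the value in before is still what it was, so no other part of the program can be surprised by the change.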

Cognitive effort is the enemy of readability

“Cognitive dissonance” is a fancy way of saying “in order to understand this, I need to keep two (or more) different things in my head at the same time.” And the less directly related those things are, the more effort it takes to keep them in your head.

For example, adding up boolean values to count the True values is a mild case of cognitive dissonance. If you read the code and see a sum() function, which, as you know, adds all the numbers in a list, then you expect to see a list of numbers; yet I have met people who use sum() to count the True values in a list of booleans, which is thoroughly confusing.
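For comparison, here are the implicit form and two forms that say "count" out loud. This is only a sketch; which explicit form reads best is a matter of taste.

```python
flags = [True, False, True, True]

# Implicit: works only because True behaves like the number 1.
implicit = sum(flags)

# Explicit: the reader sees that we are counting, not adding.
explicit = sum(1 for flag in flags if flag)
also_explicit = flags.count(True)
```

All three produce the same number; only the first asks the reader to remember how booleans convert to integers.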

Magic number seven plus or minus two

“The magical number seven, plus or minus two” is a psychology paper that describes how many items a person can hold in short-term memory at once.

If you have a function that calls a function, that calls a function, that calls a function, that calls a function, that calls yet another function, then reading your code is just hell.

Think of it this way instead: I will get the result of this function, pass it to a second function, get its result, pass it to a third, and so on.

Moreover, psychologists today more often talk about the magic number FOUR rather than seven.
Think in terms of “function composition” (for example, “I will call this function, then that one, then that one ...”) rather than “function calls” (for example, “this function will call that one, which will call that one ...”).
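The two ways of thinking can be contrasted in a short sketch. All function names are made up; both variants compute the same sum.

```python
def strip_blank(lines):
    return [line for line in lines if line.strip()]


def to_numbers(lines):
    return [int(line) for line in lines]


# "Function call" style: to learn what sum_lines does, the reader has to
# follow it into sum_clean_lines, and from there into to_numbers.
def sum_clean_lines(lines):
    return sum(to_numbers(lines))


def sum_lines(lines):
    return sum_clean_lines(strip_blank(lines))


# "Composition" style: every step is visible at one level.
def sum_lines_composed(lines):
    clean = strip_blank(lines)
    numbers = to_numbers(clean)
    return sum(numbers)
```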

Shortcuts are good, but only in the short term.

Many languages, libraries, and frameworks offer shortcuts to reduce the number of characters you type.

But later they come back to bite you, and you have to remove the shortcuts and write everything out in full.
So find out what a particular shortcut does before you use it.
You do not need to write everything out in full first and then switch to shortcuts: understand what the shortcut does for you, and you will at least know what can go wrong and how to move between the long form and the short one.

Resist the temptation of "easy"

Of course, an IDE will help you by auto-completing a heap of things and will make it easy to build a project, but do you actually understand what is going on in there?

Do you know how your build system works? If you had to run it without the IDE, could you?

Do you remember the names of your functions without auto-completion? Could something be split up or renamed to make it easier to understand?

Be interested in what happens under the hood.

ALWAYS use time zones in dates

When working with dates, always add time zones. You will always have problems with the mismatch of time zones on computers and servers, and you will lose a lot of time debugging, trying to understand why the wrong time is displayed in the interface.
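In Python, for instance, this means working with "aware" datetime objects. The sketch below uses only the standard library; the chosen date and the UTC-3 offset are arbitrary.

```python
from datetime import datetime, timedelta, timezone

# Store and exchange timestamps with an explicit zone (UTC here).
created = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
stamp = created.isoformat()  # the offset travels with the value

# Convert only at the edge, when showing the time to someone in UTC-3.
local = created.astimezone(timezone(timedelta(hours=-3)))
```

created and local are the same instant, so they compare equal, and the serialized stamp can be parsed back without guessing which zone it was written in.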

ALWAYS use UTF-8

With encodings you will have the same problems as with dates. Therefore, always convert strings to UTF-8, store them in your databases as UTF-8, and return UTF-8 from your APIs.

You can convert to any other encoding, but in the encoding war, UTF-8 won, so it's easiest to stick with it.
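A tiny sketch of being explicit about the encoding at the boundary between text and bytes. The sample string is arbitrary.

```python
text = "naïve café"

# Always name the encoding when text becomes bytes ...
encoded = text.encode("utf-8")

# ... and when bytes become text again.
decoded = encoded.decode("utf-8")
```

The byte string is longer than the text, because each accented letter takes two bytes in UTF-8. That is exactly the kind of detail that goes wrong when the encoding is left implicit.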

Start in a silly way

One of the ways to get away from the IDE is to “start in a silly way”: just take a compiler, ANY editor with code highlighting and - program, build, run.

Yes, it is not easy. But when you later move to an IDE, you will look at each button and think, “Right, it just runs that.” That is all the IDE does.

Logs are for events, not for the user interface.

For a long time I used logs to show users what was happening with the application. Well, you know, because it's much easier to use one thing than two.

To inform users about events, use standard output. To report errors, use standard error. And use logs only to store data that you can easily process later.

Logs are not a user interface, but an entity that you need to parse to retrieve information at the right time. Logs should not be human readable.
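A minimal sketch of keeping the three channels apart. The event names, the JSON-lines format, and the process function are assumptions made for the example.

```python
import json
import sys


def log_event(stream, event, **fields):
    # One JSON object per line: trivial to parse later.
    stream.write(json.dumps({"event": event, **fields}, sort_keys=True) + "\n")


def process(job_id, log_stream):
    print(f"Processing job {job_id}...")                  # for the user: stdout
    log_event(log_stream, "job_started", job_id=job_id)   # for later analysis
    if job_id < 0:
        print("error: job id must not be negative", file=sys.stderr)  # stderr
        log_event(log_stream, "job_rejected", job_id=job_id)
        return False
    log_event(log_stream, "job_finished", job_id=job_id)
    return True
```

In a real program log_stream would typically be an open log file; the user never has to read it.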

Debuggers are overrated

I have heard many people complain that code editors without debuggers are terrible, precisely because they have no debugger.

But when your code is running in production, you cannot run your favorite debugger. Hell, you cannot even run your favorite IDE. But logging ... it works everywhere. You may not have the information you want at the time of the crash (because of the logging level, for example), but you can turn logging up to find the cause later.

I am not saying that debuggers themselves are bad; they simply do not provide the help that many expect of them.

Always use a version control system.

"This is just my silly little application that I am using to learn something" - this does not justify the lack of version control.

If you use such a system right from the start, it will be easier to roll back when you make a mistake.

One change per commit

I have met people who write commit messages like this: "Fixes issues 1, 2 and 3". Unless all these issues are duplicates of each other (in which case two of them should already be closed), there should be three commits instead of one.

Stick to the principle of “one change per commit”. And by change, I do not mean a change to one file: if a change requires touching three files, commit those three files together. Ask yourself: “If I revert this change, what should disappear?”

"git add -p" helps when you have too many changes

This applies only to Git. The "-p" option of git add lets you stage a file partially, so you can pick only the changes that belong together and leave the others for a later commit.

Structure projects by data or type, not by functionality.

Most projects use the following structure:

 .
 +-- IncomingModels
 |   +-- DataTypeInterface
 |   +-- DataType1
 |   +-- DataType2
 |   +-- DataType3
 +-- Filters
 |   +-- FilterInterface
 |   +-- FilterValidDataType2
 +-- Processors
 |   +-- ProcessorInterface
 |   +-- ConvertDataType1ToDto1
 |   +-- ConvertDataType2ToDto2
 +-- OutgoingModels
     +-- DtoInterface
     +-- Dto1
     +-- Dto2

That is, the data is structured by functionality (all input models are in one directory or package, all filters in another directory or package, etc.).

It works fine. But when a project is structured by data, it is much easier to split it into smaller projects, because at some point you may need to do almost exactly what you do now, only with small differences.

 .
 +-- Base
 |   +-- IncomingModels
 |   |   +-- DataTypeInterface
 |   +-- Filters
 |   |   +-- FilterInterface
 |   +-- Processors
 |   |   +-- ProcessorInterface
 |   +-- OutgoingModels
 |       +-- DtoInterface
 +-- Data1
 |   +-- IncomingModels
 |   |   +-- DataType1
 |   +-- Processors
 |   |   +-- ConvertDataType1ToDto1
 |   +-- OutgoingModels
 |       +-- Dto1
 ...

Now you can make a module that works only with Data1, another module that works only with Data2, etc. And then you can separate them into isolated modules.

And when you need to create another project that also contains Data1 and works with Data3, you can reuse most of the code in the Data1 module.

Make libraries

I have often seen developers either create mega-repositories holding different projects, or keep different branches, not as a temporary place for work that will later be merged into the main line, but simply to split the project into smaller parts. (To continue the example of splitting into modules: imagine that instead of building a new project that reuses the Data1 type, I keep a branch with a completely different main function and the Data3 type.)

Why not extract the frequently used parts into libraries that can be pulled into different projects?

Most often, the reason is that people do not know how to create libraries, or they worry about how to “publish” these libraries to a dependency source without making them public (would it not be better to understand how your project management tool fetches dependencies, so that you can set up your own dependency repository?).

Learn to monitor

In a previous life, I added many metrics to understand how the system behaved: how fast things came in, how fast they went out, how much was in flight between input and output, how many jobs were processed ...

This really gives a good idea of how the system behaves. Is throughput dropping? To figure out why, I can check what data is coming into the system. Is it normal for it to slow down at some point?

The fact is that without ongoing monitoring, it is hard to tell how “healthy” the system is. A health check in the style of “it responds to requests” is no longer enough.
Early addition of monitoring will help to understand how the system behaves.

Use configuration files

Imagine: you wrote a function that takes a value to start processing (say, a Twitter account ID). Then you need to do the same for two values, and you simply call the function again with the other value.

It is better to use configuration files and simply run the application twice, with two different configs.

Command line options look weird, but they are useful.

If you move things into configuration files, you can make life easier for your users by adding a command line option that selects which file to open.

Today, for each language there are libraries that work with command line options. They will help you create a good utility by providing a standard user interface for everything.
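As a sketch of how the two ideas combine, the snippet below uses Python's standard argparse and json modules. The --config option, the default file name, and the account_id key are all invented for the example.

```python
import argparse
import json


def parse_args(argv):
    parser = argparse.ArgumentParser(description="Process one account.")
    parser.add_argument(
        "--config",
        default="config.json",
        help="path to a JSON configuration file",
    )
    return parser.parse_args(argv)


def load_account_id(config_text):
    # The configuration is assumed to look like {"account_id": "..."}.
    return json.loads(config_text)["account_id"]
```

Running the same program twice, once with "--config first.json" and once with "--config second.json", then replaces calling the function twice inside the code. The parser also provides a --help screen for free.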

Not just composition of functions, but composition of applications.

Unix uses the following concept: "applications that do one thing and do it well."

I said that you can run one application with two configuration files. And what if you need the results of both runs? Then you can write another application that reads the results of the first and the second and combines them into a single result.

Even when using app composition, start in a silly way.

( ), , «» ( ).
. , . , .

, . «, », , « , ».

, . , .

, , , . , . IS ALWAYS. . , .

, , . Lisp, . , Python yield , , , . , , , , .

, , . , « », « » ..

, .

,

, .

, , «»: , , , . , , . , , , .

, , . , .

, . (« ?»), .

… Google

, . , Google , . , , Google , .

C/C++ — K&R

. :)

Python — PEP8

PEP8. , .

, ? sleep() .

? ?

, . sleepForSecs sleepForMs , , sleep .

, .

«Zen of Python», .

,

, . , - . - , , , .

« , » — .

: , — . , .

, Java , Rust. , Spring , ++.

, — , «» .

, .

—

, , - .

— , — .

« , »

« » « ». , , , , .

, .

,

, , , , .

.

( «»), , , . , AWS SQS ( ), , , RabbitMQ.

- , , , .

Personal

,

, . , . , .

( , ). , , , .

,

- , , . , , , , .

, . , , , « » « , ».

, .

«». , . , . , , . , .

: «, , , , ». , .

.

. - . «» «».

, , , . , .

, ,

. , , - , .

Do not do this.

- , .

- , . , .

, . .

- ,

: - , . , .

«, , » — , .

, , . . , - . . .

, , . , , , - . , , .

,

« ».

- , , , . : « , , ».

.

, , , . .

, .

«»

«» — . , - « », - .

, , , . , , , .

.

, , «»

. - : «, , ?»

, . , , (, , ).

, —

, -, , , , ( ).

… , - , , « !».

:

«» , , , , . , , .

, /, .

, .

- .

« » « »

: - , , , .

« » — , .

.

,

, , - , .

- .

, .

, , .

, , , .

… .

IT

.

, , 15 , 3-4 .

.

.

, , , - , , , , , , .

. , , URL, .

Trello — ,

« , », .

,

, « , », « , ».

. . , - .

, .

.

, , .

…

, . , « ». - « », , .

. .

Github «, » . , - .

.

: Python, , Java Python, .

«, »

, , , «, ».

- , , - . , .

Source: https://habr.com/ru/post/456862/
