
Teremok-style programming

From the translator: I tried to keep the translation as accurate as possible and changed only the name of the company used as the example; in its line of business and the way it operates in retail, it is similar to the one in the original.

Each pancake on the Teremok menu is just a combination of roughly 8 ingredients. With such a simple periodic table of elements, the company earned $1.9 billion last year (no, not Teremok; that figure is Taco Bell's).
The more I program and design systems, the more I understand that in many cases you can get the desired result simply by combining the basic set of tools that Unix gives us. After all, functionality is an asset, and code is a liability. This is the opposite of the absurd DevOps (developer-admins) trend, in which system administrators start writing unit tests and other things to help developers. Teremok-style programming is about developers who know enough about administration (and Unix as a whole) that they do not reinvent the wheel and instead arrive at simple, scalable solutions.

Here is a concrete example: imagine you need to download millions of web pages and write them to disk for later processing. How would you do it? The cool kids will say that you need to write a distributed spider in Clojure and run it on EC2, coordinating the workers through SQS or 0MQ.
The Teremok-style answer? xargs and wget. In the rare case that the network link becomes saturated, you can add split and rsync. A "distributed spider" is really only about 10 lines of shell script.
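As a rough illustration of the xargs and wget approach, here is a minimal sketch. The function name `crawl`, the file `urls.txt`, the directory `crawl_dir`, and the default of 16 parallel downloads are all assumptions made for this example, not details from the article. It also assumes an xargs that supports `-P` (GNU and BSD versions do).

```shell
#!/bin/sh
# Sketch of an xargs + wget crawler. The names and the default
# parallelism are illustrative assumptions, not from the article.

crawl() {
    url_list=$1    # text file with one URL per line
    out_dir=$2     # directory that receives the downloaded pages
    jobs=${3:-16}  # how many wget processes to run at once

    mkdir -p "$out_dir"

    # xargs: -n 1 hands one URL to each wget invocation,
    #        -P runs up to $jobs invocations in parallel.
    # wget:  -q is quiet, -x recreates host/path as directories,
    #        -P (wget's own option) sets the output directory prefix.
    xargs -n 1 -P "$jobs" wget -q -x -P "$out_dir" < "$url_list"
}

# Usage: crawl urls.txt crawl_dir 32
```

Because xargs does the scheduling, there is no worker pool or queue to maintain; raising or lowering the third argument is the only tuning knob in this sketch.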

Moving on: once you have these millions of pages (or even tens of millions), how will you process them? Surely you will need Hadoop MapReduce; after all, that is how Google processes web pages, right?

Bah, to hell with that nonsense:

find crawl_dir/ -type f -print0 | xargs -n1 -0 -P32 ./process

That is 32 parallel processes and zero murky code to maintain. Requirement satisfied.
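The one-liner above relies on a worker that receives exactly one file path per invocation. The article does not show `./process`, so the sketch below swaps in a placeholder worker (a word count written next to each input file) purely to make the pattern runnable. The function name `process_all` and the `.wc` suffix are assumptions for this example.

```shell
#!/bin/sh
# Sketch of the find | xargs fan-out with a stand-in worker.
# Counting words is only a placeholder for real page processing.

process_all() {
    dir=$1         # directory tree holding the crawled pages
    jobs=${2:-32}  # number of parallel worker processes

    # -print0 / -0 keep file names with spaces or newlines intact.
    # ! -name '*.wc' skips result files from this or an earlier run.
    # Each worker gets one path as $1 and writes its own output file,
    # so parallel workers never write to the same place.
    find "$dir" -type f ! -name '*.wc' -print0 |
        xargs -0 -n 1 -P "$jobs" sh -c 'wc -w < "$1" > "$1.wc"' sh
}

# Usage: process_all crawl_dir 32
```

Giving every input its own output file is what keeps this free of locks: the workers share nothing, so xargs is the only coordination needed.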

Every time you write code or bring in a third-party service, you introduce a possibility of failure into your system. I have far more confidence in xargs than in Hadoop. In fact, I trust xargs more than I trust myself to write a multi-threaded processor. I trust syslog to handle asynchronous message delivery far more than I trust a queue service.
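To make the syslog point concrete, here is a small sketch that hands messages to the system logger through the standard `logger` utility instead of a queue. The helper name `log_event`, the tag `crawler`, and the `user.info` priority are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: hand messages to syslog rather than running a queue service.
# log_event, the "crawler" tag, and user.info are assumed names.

log_event() {
    # -t tags each entry with a program name;
    # -p sets the facility.level pair used for routing and filtering.
    # What happens next (files, rotation, forwarding) is decided by
    # the syslog daemon's configuration, not by this script.
    logger -t crawler -p user.info -- "$*"
}

# Usage: log_event "fetched page" "$url"
```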

Teremok-style programming is one of the steps on the path to Unix Zen. It is a path I am only starting on, but it is already paying dividends. To really get on it, you have to throw away a lot of ideas about how systems should be designed: I built most of a SOAP server using static files and Apache mod_rewrite. I could have done the whole thing Teremok style if I had only found the strength to figure out sed, but I got scared and wrote some of it in Python.

If you do not want to think about it in terms of Zen, think like a capitalist: you write code to put food on the table. You can reduce risk by using well-proven tools, or you can step into unknown territory. With the proven tools you will not be invited to speak at conferences, but the work will get done, and your pager will stay quiet at night.

Source: https://habr.com/ru/post/133697/

