⬆️ ⬇️

Twelve Factor App - The Twelve-Factor App

Dear readers! I present to your attention the translation of The Twelve-Factor App web application development methodology from the developers of the Heroku platform. My comments are hidden by spoilers in the course of the article.



Introduction



Nowadays, software is usually distributed in the form of services called web applications (web apps) or software-as-a-service (SaaS). An appendix of twelve factors is a methodology for creating SaaS applications that:





The methodology of the twelve factors can be applied to applications written in any programming language that use any combination of third-party services (backing services) (databases, message queues, cache memory, etc.).



Prerequisites



Contributors to this document were directly involved in the development and deployment of hundreds of applications and indirectly witnessed the development, implementation and scaling of hundreds of thousands of applications during our work on the Heroku platform.

')

This document summarizes our entire experience of using and observing a wide variety of SaaS applications in the wild. The document is a combination of three ideal approaches to application development: focusing on the dynamics of organic growth of an application over time, the dynamics of cooperation between developers working on the application code base and the elimination of the effects of software erosion .



Our motivation is to raise awareness of some system problems that we have encountered in the practice of developing modern applications, as well as to provide common basic concepts for discussing these problems and propose a set of common conceptual solutions to these problems with associated terminology. The format is inspired by the books of Martin Fowler (Martin Fowler) Patterns of Enterprise Application Architecture and Refactoring .



Who should read this document?



Developers who create SaaS applications. Ops to engineers deploying and managing such applications.



Twelve factors



I. Code base

One code base tracked in a version control system - multiple deployments



Ii. Dependencies

Explicitly declare and isolate dependencies



Iii. Configuration

Save the configuration in runtime



Iv. Third Party Services (Backing Services)

Consider third-party services (backing services) connected resources



V. Build, Release, Execution

Strictly separate the assembly and execution stages



Vi. Processes

Run the application as one or more non-stateless processes (stateless)



VII. Port binding

Export services through port mapping



Viii. Parallelism

Scale an application using processes



Ix. Disposability

Maximize reliability with quick startup and proper shutdown



X. Par development / application performance

Keep the development, staging and production deployment environments as similar as possible.



Xi. Logging (Logs)

View the log as a stream of events.



Xii. Admin Tasks

Perform administration / management tasks using one-time processes.



I. Code base



One code base tracked in a version control system - multiple deployments


An application of twelve factors is always tracked in a version control system such as Git , Mercurial or Subversion . A copy of the database of monitored versions is called a code repository , which is often shortened to code repo or simply to repository (repo)



A codebase is one repository (in centralized version control systems, like Subvertion), or multiple repositories that have common initial commits (in decentralized version control systems, like Git).



One Code Base Multiple Deployments



There is always a one-to-one correspondence between the code base and the application:





There is only one code base for each application, but there can be multiple deployments of the same application. Deployed application (deploy) is a running instance of the application. As a rule, this is a working site deployment and one or more intermediate site deployments. In addition, each developer has a copy of the application running in his local development environment, each of which also qualifies as an deployed application (deploy).



The code base should be the same for all deployments, but different versions of the same code base can be executed in each of the deployments. For example, a developer may have some changes that have not yet been added to a staged deployment; intermediate deployment may have some changes that have not yet been added to the production deployment. However, all these deployments use the same code base, so they can be identified as different deployments of the same application.


Ii. Dependencies



Explicitly declare and isolate dependencies


Most programming languages ​​come with a package manager for distributing libraries, such as CPAN in Perl or Rubygems in Ruby. Libraries installed by the package manager can be installed accessible to the entire system (so-called “system packages”) or available only to an application in the directory containing the application (so-called “vendoring” and “bundling”).



The application of the twelve factors never depends on implicitly existing packages available to the whole system. The application declares all its dependencies completely and accurately using the manifest of the dependency declaration . In addition, it uses the dependency isolation tool at run time to ensure that implicit dependencies do not “leak” from the surrounding system. A complete and explicit specification of dependencies applies equally to both the development and operation of the application.



For example, Gem Bundler in Ruby uses Gemfile as a manifest format for declaring dependencies and bundle exec to isolate dependencies. Python has two different tools for these tasks: Pip is used for declarations and Virtualenv for isolation. Even C has Autoconf for declaring dependencies, and static binding can provide dependency isolation. Regardless of which toolset is used, the declaration and isolation of dependencies should always be used together — only one of them is not enough to satisfy the twelve factors.



One of the advantages of explicitly declaring dependencies is that it simplifies application customization for new developers. The new developer can copy the application code base onto his machine, the only requirements for which are the presence of the language runtime environment and the package manager. Everything you need to run the application code can be configured using a specific configuration command . For example, for Ruby / Bundler, the configuration team is bundle install , for Clojure / Leiningen this is lein deps .



The application of the twelve factors also does not rely on the implicit existence of any system tools. An example is the launch of the ImageMagick and curl programs. Although these tools may be present in many or even most systems, there is no guarantee that they will be present on all systems where the application may work in the future, or whether the version found in another system is compatible with the application. If an application needs to run a system tool, then this tool must be included in the application.



Translator Comments
More package managers:



For insulation, you can also use:

  • RVM in Ruby
  • NVM in NodeJS
  • And not to mention Docker , which makes the whole system “local” for the application and allows you to declare the necessary dependencies and isolate them at runtime.









Iii. Configuration



Save the configuration in runtime


Application configuration is all that can change between deployments (development environment, intermediate and production deployment). It includes:





Sometimes applications store configurations as constants in code. This is a violation of the methodology of the twelve factors, which requires a strict separation of configuration and code . The configuration can vary significantly between deployments; the code should not differ.



The litmus test of whether the configuration and application code are properly separated is the fact that the application code base can be freely available at any time without compromising any private data.



Note that this “configuration” definition does not include internal application configurations, such as 'config / routes.rb' in Rails, or how the main modules will be connected in Spring . This type of configuration does not change between deployments and therefore it is best to keep it in code.



Another approach to configuration is to use configuration files that are not saved to the version control system, for example, 'config / database.yml' in Rails. This is a huge improvement before using constants that are saved in the code, but there are still some disadvantages to this method: it is easy to save the configuration file by mistake to the repository; There is a tendency when configuration files are scattered in different places and in different formats, because of this it becomes difficult to view and manage all settings in one place. In addition, the formats of these files are usually specific for a particular language or framework.



An appendix of twelve factors stores configuration in environment variables (often abbreviated to env vars or env ). Environment variables are easy to change between deployments without changing code; unlike configuration files, it is less likely to accidentally save them to the code repository; and unlike custom configuration files or other configuration mechanisms, such as Java System Properties, they are a language and operating system independent standard.



Another approach to configuration management is grouping. Sometimes applications group configurations into named groups (often called “environments”) named for a specific deployment, such as development , test and production environments in Rails. This method is not scalable enough: the more different application deployments are created, the more new environment names are needed, such as staging and qa . With the further growth of the project, developers can add their own special environments, such as joes-staging , resulting in a combinatorial explosion of configurations, which makes the management of application deployments very fragile.



In the application of the twelve factors, environment variables are unrelated controls, where each environment variable is completely independent of the others. They are never grouped together in an “environment”, but instead are managed independently for each deployment. This model is scaled gradually along with the natural appearance of more application deployments during its lifetime.



Translator Comments
Modules for working with environment variables:










Iv. Third Party Services (Backing Services)



Consider third-party services (backing services) connected resources


A third-party service is any service that is available to an application over the network and is necessary as part of its normal operation. For example, data stores (for example, MySQL and CouchDB ), message queuing systems (for example, RabbitMQ and Beanstalkd ), SMTP services for outgoing e-mail (for example, Postfix ), and caching systems (for example, Memcached ).



Traditionally, third-party services, such as databases, are supported by the same system administrator who deploys the application. In addition to local services, an application can use services provided and managed by a third party. Examples include SMTP services (for example, Postmark ), metrics collection services (such as New Relic and Loggly ), binary data storages (eg Amazon S3 ), and the use of various services APIs (such as Twitter , Google Maps and Last.fm ) .



The application code of the twelve factors does not distinguish between local and third-party services. For an application, each of them is a pluggable resource accessible by a URL or a different location / credential pair stored in the configuration . Each deployment of an application of twelve factors should be able to replace the local MySQL database with any third-party managed (for example Amazon RDS ) without any changes to the application code. Similarly, the local SMTP server can be replaced with a third-party (for example, Postmark) without changing the code. In both cases, you only need to change the resource identifier in the configuration.



Every different third-party service is a resource . For example, a MySQL database is a resource; two MySQL databases (used for fragmentation at the application level) are qualified as two separate resources. An appendix of twelve factors considers these databases to be connected resources , indicating that they are weakly linked to the deployment in which they are connected.



Working deployment of an application connected to 4 third-party services.



Resources can optionally be connected to the deployment and disconnected from the deployment. For example, if the application database is not functioning properly due to hardware problems, the administrator can start a new database server restored from the latest backup. The current working database can be disabled, and the new database is connected - all this without any code changes.






V. Build, Release, Execution



Strictly separate the assembly and execution stages


The code base is transformed into deployment (not taking into account the deployment for development) in three stages:





The code becomes an assembly that combines with the configuration to create a release.



The application of the twelve factors uses a strict separation between the stages of assembly, release and execution. For example, it is not possible to make changes to the code at runtime, as there is no way to propagate these changes back to the build phase.



Deployment tools, as a rule, are release management tools, and, importantly, give the opportunity to roll back to the previous release. For example, the Capistrano deployment tool stores releases in subdirectories of a directory named releases , where the current release is a symbolic link to the current release directory. The Capistrano rollback command allows you to quickly roll back to the previous release.



Each release must have a unique identifier, such as a release time stamp (for example 2015-04-06-15:42:17 ) or an increasing number (for example, v100 ). Releases can only be added and each release cannot be changed after its creation. Any changes are required to create a new release.



The build is initiated by the application developer every time a new code is deployed. The execution phase, on the other hand, can occur automatically in such cases as a server restart, or a process manager restarting a fallen process. Thus, the execution phase should be as technically simple as possible, since problems that can prevent an application from starting can occur in the middle of the night when there are no developers available. The build phase may be more complicated, since possible errors are always visible to the developer who launched the deployment.






Vi. Processes



Run the application as one or more processes that do not save the internal state (stateless)


An application runs in the runtime as one or more processes .



In the simplest case, the code is an independent script, the runtime is a developer’s laptop with the language runtime installed, and the process is started from the command line (for example, as python my_script.py ). Another extreme option is a working deployment of a complex application that can use many types of processes, each of which is launched in the required number of instances .



The processes of application of the twelve factors do not preserve the internal state (stateless) and do not have shared data (share-nothing) . Any data that needs to be stored must be stored in a state - storing third-party service , usually in a database.



The memory and file system of the process can be used as a temporary cache for a single transaction. For example, loading, processing and saving a large file in a database. The application of the twelve factors does not imply that something cached in memory or on disk will be available to the following requests or tasks — with a large number of diverse processes, it is highly likely that the next request will be processed by another process. Even with one process running, restarting (caused by deployment, changing configurations or transferring the process to another physical device) will result in the destruction of all local (memory, file system) states.



Asset packers (for example, Jammit or django-compressor ) use the file system as a cache for compiled resources. The application of the twelve factors prefers to do this compilation during the assembly stage , for example, as in the Rails asset pipeline , rather than at run time.



Some web systems rely on “sticky sessions” - that is, they cache user session data in the application's process memory and expect subsequent requests from the same user to be redirected to the same process. Sticky sessions are a violation of twelve factors and should never be used or relied on. User session data are good candidates for a data warehouse that provides a function to limit storage time, for example, Memcached and Redis .






VII. Port binding



Export services through port mapping


Sometimes web applications run inside a web server container. For example, a PHP application can be launched as a module inside Apache HTTPD , or a Java application can be launched inside Tomcat .



The application of the twelve factors is completely self-sufficient and does not rely on the injection of a web server at run time in order to create a web service. The web application exports the HTTP service by binding to a port and listens for requests arriving at this port.



During local development, the developer follows the URL of the form localhost:5000/ localhost:5000/ to access the service provided by its application. When deployed, the routing layer processes requests to the public host and redirects them to the port-bound web application.



This is typically implemented using dependency declarations to add a web server library to an application such as Tornado in Python, Thin in Ruby, and Jetty in Java and other JVM-based languages. This happens entirely in user space , that is, in the application code. The contract with the runtime environment is the binding of the application to the port for processing requests.



HTTP is not the only service that can be exported via port mapping. Almost any type of server software can be run as a process associated with a port and waiting for incoming requests. Examples of this include ejabberd (provides the XMPP protocol ) and Redis (provides the Redis protocol ).



Also note that the port-binding approach means that one application can act as a third-party service for another application by providing the URL of the third-party application as a resource identifier in the configuration of the consuming application.






Viii. Parallelism



Scale application with processes


Any computer program after launch represents one or more running processes. Historically, web applications have taken different forms of process execution. For example, PHP processes will run as Apache child processes and run on demand in the quantity required to service incoming requests. Java processes use the opposite approach, the JVM is a single monolithic meta-process that reserves a large amount of system resources (processor and memory) at startup and manages parallelism within itself using execution threads. In both cases, the running processes are only minimally visible to the application developer.



Scaling is expressed in the number of running processes, the difference in workload is expressed in the types of processes.



In an appendix of twelve factors, processes are first-class entities. The processes in the appendix of twelve factors took strengths from the unix process model for running daemons . Using this model, the developer can design his application in such a way that for processing different workloads, it is necessary to assign each type of work to its own type of process . For example, HTTP requests can be processed by a web process, and lengthy background tasks are processed by a workflow.



/ , EventMachine , Twisted Node.js . ( ), .



, , , . , . (process formation) .



PID . (, Upstart , , Foreman ) , .






Ix. (Disposability)





, , . , .



. , , , . and scaling. In addition, it is more reliable, since the process manager can freely transfer processes to new physical machines if necessary.



Processes must complete properly when they receive a SIGTERM signal from the process manager. For the web process, the correct shutdown is achieved by stopping listening to the service port (thus, abandoning any new requests), which allows you to complete current requests and then end. In this model, it is assumed that HTTP requests are short (no more than a few seconds); in the case of long requests, the client should smoothly try to reestablish the connection when the connection is lost.



, (worker), . , RabbitMQ NACK ; Beanstalkd , . , , Delayed Job , . , , .



. , SIGTERM , . , Beanstalkd, . , . (Crash-only design) .






X. /



, (staging) (production)


( ) ( ). :





An application of twelve factors is designed for continuous deployment by minimizing the differences between application development and operation. Consider the three differences described above:





Summarizing the above table:

Traditional applicationAppendix twelve factors
Time between deploymentsWeeksClock
The author of the code / the one who unfoldsDifferent peopleSame people
/


, , , , . , , . , .

Type ofTongue
DatabaseRuby/RailsActiveRecordMySQL, PostgreSQL, SQLite
Python/DjangoCeleryRabbitMQ, Beanstalkd, Redis
CacheRuby/RailsActiveSupport::Cache, , Memcached


, . SQLite PostgreSQL ; Memcached .



, . , , , , . , . , .



, . , Memcached, PostgreSQL RabbitMQ , Homebrew apt-get . , , Chef Puppet , Vagrant . / .



- , . ( , ) .



Docker , Docker Compose .








Xi. (Logs)





Logging provides a visual representation of the behavior of a running application. Usually in a server environment, the log is written to a disk file (“logfile”), but this is only one of the output formats.



A log is a stream of aggregated, time-ordered events, collected from the output streams of all running processes and auxiliary services. The log in its raw form is usually represented by a text format with one event per line (although exception traces may take several lines). The log has no fixed start and end, the message flow is continuous while the application is running.



The application of the twelve factors never deals with the routing and storage of its output stream. . stdout . , .



, . , . (, Logplex and Fluent ) can be used for this purpose.



The event flow of the application can be redirected to a file or viewed in the terminal in real time. Most importantly, the event stream can be sent to an indexing and analyzing system for logs, such as Splunk , or a general purpose storage system, such as Hadoop / Hive . These systems have great capabilities and flexibility for a thorough analysis of the behavior of the application over time, which includes:







:



: Elasticsearch , Logstash , Kibana (ELK).










XII. Admin Tasks



/


, ( -), . , , :





. , , , . , .



. , - Ruby bundle exec thin start , bundle exec rake db:migrate . Python Virtualenv bin/pythonboth for starting the Tornado web server, and for starting any manage.pyadministration processes.



The methodology of the twelve factors gives preference to languages ​​that provide a REPL shell out of the box, and which make it easy to execute one-time scripts. In a local deployment, developers perform a one-time administration process using a console command within the application directory. In a working deployment, developers can use ssh or another remote command execution mechanism provided by the runtime to start such a process.




Bug fixes and more suitable translations can be sent:





amalinin litchristina

Source: https://habr.com/ru/post/258739/



All Articles