It took me a long time to get around to writing the second part of the article on configuration management. It finally happened because I recently had the chance to speak at the PHPCONF 2009 conference on October 8 (Web Architect Workshop Day) with the master class “Method of organizing the source code repository”. The slides and the text of the talk were prepared in advance. Despite the excellent organization of the event, the materials of the talks in the conference program were not made publicly available. To make up for that, I decided to publish the material I used in my presentation. In addition to this article on configuration management (a logical continuation of the previous one), the presentation slides are available for public viewing.
This article discusses the tools used in configuration management. So I would first like to look at how the tools used in development can influence the software creation process.
- First, each tool is used to solve a number of specific tasks.
- Second, before a tool can solve those tasks, a set of requirements has to be met; without them the tool is ineffective or does not work at all.
- Third, the requirements imposed by the whole variety of tools in use are often not fully met. In addition, the requirements of different tools may conflict with each other. All this reduces the effectiveness of the tools.
Here a tool means anything a programmer uses for development: an IDE, a framework, a version control system, or a particular technology. Every developer explicitly or implicitly faces the problem of managing the tools they use. Most often the problem is either solved intuitively, or the developer consciously chooses one primary tool around which most of the development process is built. Few people manage to combine most of their tools into a single platform, although doing so could help build a streamlined development process.
The approaches used in configuration management can contribute to building such a streamlined process, because this is the main discipline that determines how the working materials of a software project, the changes made to them, and the information about the status of individual tasks and of the whole project are controlled. As noted in the previous article, the following processes fall within the scope of configuration management (each is considered separately below):
- Version control
- Build management
- Unit testing
- Static code analysis
- Documentation generation (for example, phpDoc)
- Continuous integration
- Dependency management
- Database integration
- Bug tracking and issue tracking
As also mentioned earlier, configuration management itself is responsible for the following tasks:
- Release management
- Setting up delivery
- Organizing established development processes
- Coordinating the way in which different parts of the software project interact
One should try to distinguish clearly between CM processes (version control, build management, etc.) and CM tasks (release management, setting up delivery, etc.). To identify the essential features of the CM processes, each process, and its connection with configuration management, has to be considered separately.
Version control
Version control is the primary configuration management process. The chance that a development project can do without version control is practically zero. Most developers are familiar with version control systems (VCS) such as CVS, Subversion, or VSS. Distributed version control systems are also popular: Git, Mercurial, Bazaar. These systems strongly shape how developers think about configuration management, and for this reason developers usually give little thought to how the repository is organized. The simplest, and often the only, layout considered is the one Subversion suggests: a division into trunk, branches, and tags. Although this directory hierarchy can be seen in most Subversion repositories, its parts are not always used equally actively: at best, the branches directory holds a few parallel development branches. The developers' fear of creating an extensive directory structure in the repository is understandable, since it usually brings many problems with merging changes between branches. Still, when a team develops an application that has several versions or runs on different platforms, merges are almost impossible to avoid. Their number can be minimized, however, by regulating the situations in which merging is allowed and those in which it is not. Defining such rules requires a thorough understanding of what happens as the repository is maintained.
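One way to picture such a regulation is as an explicit table of permitted merge directions. The sketch below is only an illustration and is not taken from any version control tool: the branch kinds, the `release-` naming convention, and the rules in `ALLOWED_MERGES` are assumptions made for the example.

```python
# Hypothetical merge policy: which kinds of branches may be merged into which.
# The branch kinds and the rules are assumptions made for this illustration.
ALLOWED_MERGES = {
    ("feature", "trunk"),    # finished work goes back to the main line
    ("trunk", "feature"),    # feature branches may catch up with trunk
    ("release", "trunk"),    # fixes made on a release flow back to trunk
}


def branch_kind(path):
    """Classify a repository path by its conventional top-level directory."""
    parts = path.strip("/").split("/")
    if parts[0] == "trunk":
        return "trunk"
    if parts[0] == "tags":
        return "tag"
    if parts[0] == "branches" and len(parts) > 1:
        # Assumed naming convention: branches/release-1.0, branches/feature-x
        return "release" if parts[1].startswith("release-") else "feature"
    raise ValueError("unrecognized repository path: " + path)


def merge_allowed(source, target):
    """Return True if the policy permits merging source into target."""
    return (branch_kind(source), branch_kind(target)) in ALLOWED_MERGES
```

With these assumed rules, `merge_allowed("branches/feature-login", "trunk")` is permitted, while any merge into a path under `tags` is rejected, which keeps tags as immutable snapshots.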
Build management
Another important configuration management process is build management. Build management automates the following actions:
- Compiling source code
- Deploying the application
- Running unit tests
- Initializing the database
Build files for build management tools usually use XML syntax (Ant, NAnt, Maven, MSBuild, Phing), although there are exceptions (make, nmake, CMake, Rake). Build management tools usually provide build-specific commands, such as:
- Batch file operations
- Compilation
- Deployment
- Interaction with version control systems
These and other features make build management tools easy to integrate with the other tools used in development; build management tools act as the link between most of the processes that configuration management automates.
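The core idea behind tools such as Ant or make, named targets that run in dependency order, can be sketched in a few lines. This is a simplified illustration, not the way any of the listed tools is implemented; the target names in `TARGETS` are hypothetical.

```python
# Minimal sketch of a build tool: named targets with dependencies,
# executed in dependency order. Target names are hypothetical.
TARGETS = {
    "compile": [],
    "init_db": [],
    "test": ["compile", "init_db"],
    "deploy": ["test"],
}


def build_order(goal, targets=TARGETS):
    """Return the targets needed for goal, each after its dependencies."""
    order = []
    visiting = set()

    def visit(name):
        if name in order:
            return
        if name in visiting:
            raise ValueError("circular dependency at " + name)
        visiting.add(name)
        for dep in targets[name]:
            visit(dep)
        visiting.discard(name)
        order.append(name)

    visit(goal)
    return order


def run(goal, actions):
    """Run the action registered for every target that goal depends on."""
    for name in build_order(goal):
        actions[name]()
```

Here `actions` maps each target name to a callable, so the same dependency graph can drive compilation, test runs, database initialization, and deployment.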
Unit testing
Although unit testing cannot be fully classed as a configuration management process, it has important properties that give it a special role in configuration management. For example, using unit tests in a project written in an interpreted programming language creates a need not only for build management but also for a more thorough approach to configuration management as a whole. First, unit testing appears where refactoring is intended. Second, a decision to use unit tests usually takes into account the architectural and logical integrity of the project, which must be maintained at a sufficient level. This means that deciding to use unit testing automatically implies introducing a number of additional processes (and costs) related to how development is organized, although this may not be obvious.
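As a small illustration of how unit tests support refactoring, the sketch below pins down the behavior of one function so that its body can later be rewritten safely. The function `slugify` and its test cases are hypothetical and serve only as an example; `run_tests` shows how a build step might invoke the suite.

```python
import unittest


def slugify(title):
    """Turn a title into a URL fragment (the unit under test)."""
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("Build Management"), "build-management")

    def test_extra_whitespace_is_collapsed(self):
        self.assertEqual(slugify("  Unit   testing "), "unit-testing")


def run_tests():
    """Run the suite programmatically, as a build step would."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
    return unittest.TextTestRunner(verbosity=0).run(suite).wasSuccessful()
```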
Static code analysis
Besides maintaining architectural and logical integrity, large projects need to identify syntax errors in the early stages of development. In addition to finding syntax and logical errors in the source code, it is often necessary to check automatically that the code complies with coding conventions. Static code analysis is more relevant for interpreted languages, because in compiled languages many such errors are detected at the compilation stage. Static analysis can also be used to collect source code metrics and compile the corresponding reports. Examples of static analysis tools are listed in the table:
Programming language | Tools |
Java | PMD, FindBugs |
C / C++ | Cppcheck, Lint |
C# | FxCop, StyleCop, ReSharper |
PHP | PHP_CodeSniffer, PHP-sat, PHP_Depend, Pixy |
Python | PyChecker, Pylint, PyFlakes |
Ruby | Reek, Roodi, Rufus, Flay, Flog |
Language-independent tools | RATS, Yasca |
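To show what such tools do in principle, here is a minimal sketch of a checker built on Python's standard `ast` module. It reports syntax errors and one coding-convention violation without ever executing the code. The snake_case rule is an assumption chosen for the example; real tools such as Pylint check far more.

```python
import ast
import re

SNAKE_CASE = re.compile(r"^[a-z_][a-z0-9_]*$")


def check_source(source):
    """Return a list of findings for source without executing it."""
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        return ["line %s: syntax error: %s" % (err.lineno, err.msg)]

    findings = []
    for node in ast.walk(tree):
        # Assumed convention for this example: function names are snake_case.
        if isinstance(node, ast.FunctionDef) and not SNAKE_CASE.match(node.name):
            findings.append(
                "line %d: function name '%s' is not snake_case"
                % (node.lineno, node.name)
            )
    return findings
```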
Documentation generation
Generating documentation from source code and from comments in javaDoc style, phpDoc style and so on is most relevant for projects whose source code is actively used by others: libraries, reusable components, frameworks, etc.
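Scope | Programming language | Tool |
Language-specific | PHP | phpDocumentor |
Language-specific | Java | Javadoc |
Language-specific | C++ | CppDoc |
Language-specific | Python | pydoc |
Language-specific | Ruby | RDoc |
Language-specific | Delphi | DelphiCodeToDoc |
Language-specific | C# | NDoc |
Language-independent | any | Doxygen |
Language-independent | any | ROBODoc |
Language-independent | any | TwinText |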
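The principle shared by these generators, reading documentation comments out of the source and turning them into a reference, can be sketched with Python's standard `ast` module. This is a toy illustration under the assumption that documentation lives in docstrings; it is not how any of the listed tools works internally.

```python
import ast


def generate_docs(source):
    """Build a plain-text API summary from the docstrings found in source."""
    tree = ast.parse(source)
    lines = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            doc = ast.get_docstring(node) or "(undocumented)"
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            # Only the first line of each docstring goes into the summary.
            lines.append("%s %s: %s" % (kind, node.name, doc.splitlines()[0]))
    return "\n".join(lines)
```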
Continuous integration
Continuous integration (CI) is a software development practice that consists of performing frequent automated builds of the project in order to find and resolve integration problems as early as possible. In a typical project that does not use continuous integration, where developers work independently on separate parts of the application, integration is the final stage. As the concise Wikipedia article describing CI rightly observes, continuous integration cannot be applied to just any project. To adopt the practice, the project must meet several requirements:
- The source code and everything you need to build is stored either in the source code repository, or in an easily accessible place.
- Checking out from the repository, building, and testing the whole project are automated and can easily be invoked from an external program.
Continuous integration tools are usually configured to trigger a build whenever the repository is updated. If a deployment phase is included in the build process, continuous integration can also provide the testing team with a working, test-ready copy of the application. Continuous integration tools that can be used during development include CruiseControl, phpUnderControl, Xinc, CruiseControl.rb, TeamCity, Apache Continuum, Hudson, Parabuild, Atlassian Bamboo, etc.
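The trigger logic can be sketched as a simple polling loop: ask the repository for its latest revision and start a build only when it has changed. The callables `get_revision` and `run_build` are hypothetical hooks that the caller supplies, for example wrappers around a VCS client and a build tool; a real CI server would also wait between polls or react to a repository hook instead.

```python
def poll_once(get_revision, run_build, last_seen):
    """Run the build if the repository head differs from last_seen.

    get_revision and run_build are caller-supplied callables (hypothetical
    hooks). Returns (revision, result); result is None if no build ran.
    """
    revision = get_revision()
    if revision == last_seen:
        return revision, None
    return revision, run_build(revision)


def watch(get_revision, run_build, cycles):
    """Poll the repository a fixed number of times and collect build results."""
    last_seen = None
    results = []
    for _ in range(cycles):
        last_seen, result = poll_once(get_revision, run_build, last_seen)
        if result is not None:
            results.append((last_seen, result))
    return results
```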
Database Integration
Database integration occupies a separate place among configuration management tasks. In most cases, databases are an integral part of applications. Development with agile methodologies involves continuous improvement not only of the program code but also of the database structure and functionality. This usually happens in parallel: along with a change in the program code, changes are made to field types, field names, functions, triggers, and database indexes. The evolution of a database during the life cycle of a project resembles the evolution of program code, and a database treated as an object of configuration management has practically the same features. Although database configuration management and database version control differ significantly from software configuration management, they are an integral part of project management and are reflected in the database integration process.
The SQL part of an application can often be viewed as an application within the application and separated into its own versioned project. Two kinds of artifacts are distinguished in database versioning: Data Manipulation Language (DML) and Data Definition Language (DDL). DML is the subset of SQL used to manipulate data: selects, inserts, updates, and deletes (CRUD operations). DDL is also a subset of SQL, but it is used to describe the database structure: creating tables, indexes, triggers, integrity constraints, etc. When organizing database integration, it is recommended to separate DML and DDL artifacts and manage them separately.
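The separation of DDL and DML artifacts can be sketched as follows, using Python's standard `sqlite3` module. The table names, the numbering of migrations, and the `schema_version` bookkeeping table are assumptions made for this example; the article does not prescribe a particular scheme.

```python
import sqlite3

# Hypothetical versioned DDL scripts, kept apart from DML (seed data).
DDL_MIGRATIONS = {
    1: "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    2: "CREATE INDEX idx_users_name ON users (name)",
}
DML_SEED = [("INSERT INTO users (name) VALUES (?)", ("admin",))]


def schema_version(conn):
    """Read the schema version stored in the database (0 if none yet)."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn):
    """Apply every DDL migration newer than the recorded schema version."""
    current = schema_version(conn)
    for version in sorted(DDL_MIGRATIONS):
        if version > current:
            conn.execute(DDL_MIGRATIONS[version])
            conn.execute("INSERT INTO schema_version VALUES (?)", (version,))
    conn.commit()
    return schema_version(conn)


def seed(conn):
    """Load reference data; DML is managed separately from the structure."""
    for statement, params in DML_SEED:
        conn.execute(statement, params)
    conn.commit()
```

Because the applied version is recorded in the database itself, calling `migrate` a second time changes nothing, so the step can run on every build.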
Issue tracking
The trend among issue-tracking systems (task management systems) is that they increasingly try to integrate software project management tools. Often this happens not directly but indirectly, through plugins (integration with version control systems, with javaDoc, phpDoc or Doxygen documentation, and so on). But even the basic functionality related to change request management (CRM) has elements that need improvement. One such element is versioning. The use of standardized version names, however, is held back by the need to follow the same conventions not only in the issue-tracking system but also in version control and the other CM processes.
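A shared version-naming convention can be checked mechanically. The sketch below assumes a `MAJOR.MINOR.PATCH` convention used identically for versions in the issue tracker and for tag names in the repository; both the convention and the function names are hypothetical.

```python
import re

# Assumed convention: MAJOR.MINOR.PATCH, used identically in the issue
# tracker and as tag names in the repository.
VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def parse_version(name):
    """Return the version as a tuple of ints, or None if it breaks the convention."""
    match = VERSION.match(name)
    return tuple(int(part) for part in match.groups()) if match else None


def unmatched_versions(tracker_versions, repository_tags):
    """List tracker versions that have no conforming tag with the same name."""
    tags = {tag for tag in repository_tags if parse_version(tag)}
    return sorted(v for v in tracker_versions if v not in tags)
```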
Agile
Software development based on an agile methodology has a number of distinctive features that significantly influence how configuration management is organized. These features are:
- changing requirements
- iterative development
- continuous delivery.
When the configuration management processes are considered, these features should be reflected in the individual components of each process or in the results obtained at one of its stages.
Conclusion
This article has reviewed, without going into detail, the processes that make up configuration management. Each process is supported by a separate tool that can be chosen from a variety of alternatives. It was noted that, because the tools are heterogeneous and solve different tasks, it is difficult to achieve their maximum integration and efficiency. The next article will cover exactly this: how introducing additional formalization (organizing the source code repository) makes the tools used in configuration management more effective.
To be continued
References:
- Issue tracking system (wiki)
- Continuous integration (wiki)
- Version Control for Multiple Agile Teams
- Continuous deployment in 5 easy steps
- A tool for building SVN repository metrics
- Code coverage analysis
- Code metrics and their practical implementation in IBM Rational ClearCase
- SLOCCount - tool for counting the number of lines of code
- Continuous Builds with CruiseControl, Ant and PHPUnit - an example of organizing a simple configuration management platform
- My PHPCONF 2009 Report