
In lieu of a preface
I've been programming with Yii for two years now, and recently I started looking closely at the Symfony 2 framework. I'm drawn partly by its well-thought-out architecture, partly by the loose coupling of its components, and partly by the flexibility of the applications built on it. As soon as I understood how the new framework is put together, I wondered whether a CMS could be built on it, or even whether a ready-made one already existed.
I haven't found an out-of-the-box solution yet, but I did wander onto the Symfony CMF project site and was thoroughly impressed by its methodical approach to the very problems I had run into while working on an assembly line, slapping designs onto Drupal. There are no publications about CMF on Habr, and the project itself is still very raw, but in the long run it all looks interesting, even if there are things to complain about in places.
Symfony CMF
The Symfony CMF project is designed to make it easier to develop CMS-style functionality for everyone who uses the Symfony 2 framework in their work.
The main features of the project:
- loose coupling
- scalability
- convenience
- testability
The emphasis belongs on the word CMF: the project is not a CMS in itself, it is a framework. Unlike a CMS, where all components are tightly bound to one another, in Symfony CMF you:
- use whatever you want
- replace what you don't like
- ignore what is not required
That is, you get a set of modular development tools rather than a turnkey application, although basic bundles that provide CMS functionality have already been developed.
Why another CMF?
It's no secret that there are plenty of ready-made products on the market, both paid (1C-Bitrix, UMI) and free (Drupal, MODx, WordPress, Joomla). So it is quite logical that, on seeing the label "Whatever CMS / CMF", a question arises: why make yet another CMS at all? There are more than enough already. And I absolutely agree. As a user.
CMSs really are a dime a dozen. But as a developer I have often shed sweat, blood and tears trying to get something more out of them, something the authors of the core system and of third-party extensions had not provided for.
Because of insufficiently thought-out architecture, working with ready-made solutions means dealing with:
- no clear separation of logic, configuration, content and presentation. Just recall Drupal modules: a pile of files, a jumble of global functions named who knows how, hooks and so on. By the way, there is a good article that discusses this issue
- a lot of legacy code left over from old versions. From time to time the developers try to fix this, promising a kernel rewrite and other joys, but a lot of time can pass before the new (rewritten) version reaches the "usable" stage
- often there is no notion of development or testing stages, and there are no deployment tools
- caching problems: in some places there is no cache at all; in others there is one, but it is not flexible enough, or has invalidation problems, or simply does not help, etc.
- poor performance on large (a relative notion here) data volumes
- difficulty creating your own components or overriding existing ones
- having to choose between predefined data types, or settle for EAV storage on top of a relational DBMS, or worse
- awkward homegrown template engines, wheels reinvented by the CMS authors ...
- ... and all of it as a consequence of NIH syndrome.
The developers of these systems are aware of the flaws and do not deny the criticism, but right now it is impossible to solve all of these problems. Still, rather than scold everyone, let's formulate the set of problems a CMS should solve for the sake of user convenience, and then see how Symfony CMF solves them. The problems are:
- data storage
- template system
- routing, human-readable URLs, and how the user can control all of this
- menu configuration
- content management (editable blocks on the page, front-end editing on a live site, file uploads)
- i18n
- a good admin interface
Let's take the problems in order.
Data storage problem
Going by the very meaning of the term CMS, it is clear that its most important component is data storage. Moreover, the CMS should support storing data with different properties. For example, materials such as BlogPost or NewsItem can share the common fields title and body, but then the differences begin: you may need to attach pictures to a news item.
Imagine an online store. What is stored in the database? At the very least, product descriptions and order history. For the latter it is much easier to design a storage schema than for the former, although obviously the two cannot exist without each other. Hence the next requirement: the CMS should be able to reference content both inside the CMS and in other parts of the system.
Content on a site is most often organized as a tree that in some ways mirrors a file system. At the same time, site authors want to organize content in different ways depending on their needs, and to flexibly adjust menus and the addresses of materials. Thus, the CMS must present data as a tree and be able to maintain several independent trees at the same time.
The information that users enter into a CMS is rarely perfectly structured. Sometimes you need to add one more field, then a second, a third, a tenth: the CMS should not force a single schema on all content, or, even better, should let you define your own schema.
In large organizations, material often passes through several stages of review before it appears on the site, rather than being published with one click: the CMS should support moving and exporting content between trees. And for history's sake it would be nice to keep versions of the content that can be restored at any time.
Users from other countries and regions should not be forgotten either. Although the whole site usually does not need to be translated, the CMS should make it possible to present content in different languages, with an optional fallback according to specified rules.
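The fallback rule from the last requirement can be sketched in a few lines of plain PHP. This is only an illustration of the idea, not how Symfony CMF or PHPCR ODM implement it; the function name `resolveTranslation()` and the shape of the `$fallbacks` map are assumptions made up for this example.

```php
<?php
// Hypothetical helper: return the first available translation,
// trying the requested locale and then its configured fallbacks.
// $translations: locale => text
// $fallbacks:    locale => ordered list of locales to try next
function resolveTranslation(array $translations, string $locale, array $fallbacks): ?string
{
    $candidates = array_merge([$locale], $fallbacks[$locale] ?? []);
    foreach ($candidates as $candidate) {
        if (isset($translations[$candidate])) {
            return $translations[$candidate];
        }
    }

    return null; // no translation found in the whole chain
}

$fallbacks = ['de' => ['en'], 'fr' => ['en'], 'en' => []];
$titles    = ['en' => 'Hello', 'de' => 'Hallo'];

echo resolveTranslation($titles, 'de', $fallbacks), "\n"; // a German title exists
echo resolveTranslation($titles, 'fr', $fallbacks), "\n"; // no French title, English is used
```

Keeping the fallback order in configuration rather than in code is what lets each site define its own rules.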
When there is too much content, you will definitely need full-text search, the ability to define access control rules for subtrees, and support for the process of publishing a document by several authors (everyone's workflow is different).
Content Repository
It becomes clear that MySQL alone won't cut it. Relational databases simply cannot cope with such tasks, although there are techniques such as Materialized Path or Nested Set that let you store a tree structure in flat tables. But even if some implementation works, it will most likely be rigidly tied to a specific engine, and that is bad, because it deprives us of freedom and flexibility. There is no need to blame the RDBMS: it was designed for entirely different tasks and needs clearly described data, not trees of weakly structured elements.
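To make the Materialized Path idea concrete, here is a minimal sketch in plain PHP. The rows stand in for records of a relational table in which every record stores its full path, and the hypothetical `descendants()` function does what a query like `WHERE path LIKE '/cms/pages/%'` would do. The table layout and all names are assumptions for illustration only.

```php
<?php
// Each row stores the full path of its node (the "materialized path").
$rows = [
    ['path' => '/cms',               'title' => 'CMS'],
    ['path' => '/cms/pages',         'title' => 'Pages'],
    ['path' => '/cms/pages/home',    'title' => 'Hello'],
    ['path' => '/cms/pages/contact', 'title' => 'Contact'],
    ['path' => '/cms/pagesets',      'title' => 'Not a child of /cms/pages'],
];

// Return the paths of all descendants of $parentPath.
// The trailing slash in the prefix keeps "/cms/pagesets" out of the result.
function descendants(array $rows, string $parentPath): array
{
    $prefix = rtrim($parentPath, '/') . '/';
    $found  = [];
    foreach ($rows as $row) {
        if (strncmp($row['path'], $prefix, strlen($prefix)) === 0) {
            $found[] = $row['path'];
        }
    }

    return $found;
}

print_r(descendants($rows, '/cms/pages'));
```

Reading a subtree is cheap in this scheme, but moving a node means rewriting the path of every descendant. That is one of the trade-offs that makes a dedicated tree store attractive.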
However, let's not be upset: content repositories were invented long ago. Repositories are designed to provide read, write and search access to data independently of the applications that need it. In essence, a repository is a data store whose emphasis is primarily on the logical side of data handling.
JSR-170
The problem of data storage for document-oriented systems arose many years ago, so back in the early 2000s people from Day Software (namely David Nüscheler) submitted a request through the Java Community Process to adopt the Content Repository API for Java (JCR) specification, which was assigned the number 170. Later versions of the specification were published as JSR-283 (2.0) and JSR-333 (2.1, whose final draft was completed on August 31), but references to the first version are still the most common.
According to the specification, the repository is an object database that provides storage, search and retrieval of hierarchical data. In addition, the API allows you to use data versioning, transactions, change tracking and import / export to XML, and to store binary data and metadata.
Such a repository is organized as a tree of nodes that have properties. The data itself is stored in the properties: numbers, strings, and binary data of arbitrary length. Nodes can be divided into types, have child nodes and certain behavioral characteristics, or simply refer to their neighbors (using a special property and the unique identifier that each node has).
Starting with the second version of the specification, the repository must be able to answer SQL queries, which is more convenient than the XPath equivalents from the first edition.
A vivid example of such an implementation is the Apache Jackrabbit project, an open-source repository written in Java. In addition to all the goodies described above, this project (started back in 2004 as the initial implementation of the JCR API) offers flexible access control for content. It also has clustering, locking mechanisms and so on, but that is not very interesting for us right now, so we'll skip it.
PHPCR
But not everyone writes in Java! (Let's skip the jokes on this topic.)
For people like us, the Content Repository for PHP was created: the JCR API described above, adapted to the style of PHP. Since the API is the same and well specified, it follows that you can write the application once and then simply swap backends (in theory, of course).
An important plus is that we do not reinvent the wheel (as we remember, the problem of data storage in a CMS has already been solved).
Of course, such an initiative could not go unnoticed: David sent a request for PHPCR to be adopted into JCR 2.1. Very nice.
Since you cannot simply take an API and port it from Java to PHP, there are still differences between the implementations. In short, they stem from the fact that PHP is weakly typed and does not support method overloading. So some interfaces and functions were simply dropped as unnecessary, and where Java used overloading, the methods were given optional arguments instead. The differences are described in detail here, but there is nothing scary in them.
Currently PHPCR supports the following functions:
- tree access
- access to nodes by UUID
- node search
- versioning
- capability discovery
- import and export to XML
- locks
- transactions *
- permissions
- access control *
- change tracking
(*) Not yet implemented in Jackalope-Jackrabbit (more on this below), although this information may be slightly out of date.
Key PHPCR concepts:
- all content is stored in a tree of nodes
- nodes have a name and a type
- nodes have child nodes and properties that store values
- property values can be numbers, strings, binary objects, and references to other nodes.
We've heard this somewhere before, haven't we?
Let's see what this repository might look like (schematically, of course):
<root>
  <cms>
    <pages>
      <home title="Hello">
        <block title="News" content="Today: PHPCR presentation"></block>
      </home>
      <contact title="Contact" content="phpcr-users@groups.google.com"></contact>
    </pages>
  </cms>
</root>
So far, nothing supernatural.
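The same schematic tree can be modelled with a toy class to show how names, properties and children fit together. This is deliberately not the PHPCR API: `ToyNode` and its methods `add()` and `get()` are hypothetical names invented for this sketch.

```php
<?php
// Toy in-memory node: a name, a map of properties and named children.
// NOT the PHPCR API, only a model of the concepts listed above.
class ToyNode
{
    public $name;
    public $properties = [];
    public $children = [];

    public function __construct(string $name, array $properties = [])
    {
        $this->name = $name;
        $this->properties = $properties;
    }

    // Attach a child and return it, so that calls can be chained downwards.
    public function add(ToyNode $child): ToyNode
    {
        $this->children[$child->name] = $child;

        return $child;
    }

    // Resolve a path such as "/cms/pages/home" starting from this node.
    public function get(string $path): ?ToyNode
    {
        $node = $this;
        foreach (array_filter(explode('/', $path), 'strlen') as $segment) {
            if (!isset($node->children[$segment])) {
                return null;
            }
            $node = $node->children[$segment];
        }

        return $node;
    }
}

$root  = new ToyNode('');
$pages = $root->add(new ToyNode('cms'))->add(new ToyNode('pages'));
$home  = $pages->add(new ToyNode('home', ['title' => 'Hello']));
$home->add(new ToyNode('block', ['title' => 'News', 'content' => 'Today: PHPCR presentation']));
$pages->add(new ToyNode('contact', ['title' => 'Contact', 'content' => 'phpcr-users@groups.google.com']));

echo $root->get('/cms/pages/home/block')->properties['title'], "\n"; // News
```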
Let's look in a little more detail at what we have to work with.
Nodes
- a node is a named container that always has a parent
- nodes resemble XML elements
- nodes can be created, deleted, modified, copied
- the path to a node consists of the path of the parent node and the name of the current node:
- Path: /cms/pages/home
- Parent path: /cms/pages
- Node name: home
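The relationship between a path, its parent path and the node name can be written out as a small helper. `splitPath()` is a hypothetical function for illustration and assumes an absolute path that starts with a slash.

```php
<?php
// Split an absolute node path into [parent path, node name].
// Assumes $path starts with "/" and is not the root itself.
function splitPath(string $path): array
{
    $pos    = strrpos($path, '/');
    $parent = $pos === 0 ? '/' : substr($path, 0, $pos);
    $name   = substr($path, $pos + 1);

    return [$parent, $name];
}

[$parent, $name] = splitPath('/cms/pages/home');
echo $parent, "\n"; // /cms/pages
echo $name, "\n";   // home
```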
Node properties
- nodes have named properties that store values
- properties resemble XML attributes
- data types: STRING, URI, BOOLEAN, LONG, DOUBLE, DECIMAL, BINARY, DATE, NAME, PATH, WEAKREFERENCE, REFERENCE
- the (WEAK)REFERENCE types create links to other nodes
- nodes and properties can have namespaces: jcr:created, jcr:mimeType, phpcr:class
Primary node types
- define the allowed names and types of properties and child nodes
- each node must have a primary type set
- nt:unstructured is used to store anything at all
- other built-in types include nt:address, nt:folder, nt:file and others
- you can define new node types to create your own schema
Mixin node types
- primary types do not have multiple inheritance
- but there are mixin types that add trait-like functionality to nodes
- mixin types can be assigned to a node during its lifetime
Example: say we have a jcr:uuid property that stores a unique identifier. Given the uuid, we can create the mixin mix:referenceable, and on top of it mix:versionable (which in turn requires the properties jcr:versionHistory, jcr:predecessors, jcr:baseVersion, jcr:isCheckedOut, jcr:mergeFailed, etc.).
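The split between one fixed primary type and any number of mixins added at runtime can be modelled with a toy class. This sketch is not the PHPCR implementation: `TypedNode` is a hypothetical class, and it ignores type inheritance (in a real repository a type can extend other types).

```php
<?php
// Toy model: exactly one primary type, plus mixins that may be added later.
class TypedNode
{
    private $primaryType;
    private $mixins = [];

    public function __construct(string $primaryType)
    {
        $this->primaryType = $primaryType;
    }

    // Mixins can be attached at any point in the node's lifetime.
    public function addMixin(string $mixin): void
    {
        $this->mixins[$mixin] = true;
    }

    // True if $type is the primary type or one of the attached mixins.
    public function isNodeType(string $type): bool
    {
        return $type === $this->primaryType || isset($this->mixins[$type]);
    }
}

$node = new TypedNode('nt:unstructured');
$node->addMixin('mix:referenceable');

var_dump($node->isNodeType('nt:unstructured'));   // bool(true)
var_dump($node->isNodeType('mix:referenceable')); // bool(true)
var_dump($node->isNodeType('mix:versionable'));   // bool(false)
```

Unlike PHP traits, which are fixed when the class is declared, a mixin here is attached to an individual node while the program runs.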
Workspaces
- there can be several workspaces, each with its own node tree
- they resemble a Unix file system and branches in Git / SVN: each can be cloned and merged
- they can be used independently of one another

And now some examples of how to work with all this:
Session creation
use PHPCR\SimpleCredentials;
CRUD operations
$root = $session->getRootNode();
Tree traversal
$node = $session->getNode('/site/content');
foreach ($node->getNodes() as $child) {
    var_dump($child->getName());
}
Versioning
Search
$qm = $workspace->getQueryManager();
Other code examples can be found in this presentation.
But let's return to the tempting idea of different backends.
There are not many implementations yet, but the existing ones are already interesting:
- Midgard2 PHPCR
- Jackalope
- supports Jackrabbit
- supports Doctrine DBAL (data storage in relational databases)
- supports MongoDB (actually not)
Midgard2 PHPCR
Midgard2 is an open source content repository with bindings for C, Python and PHP.
Using terminology slightly different from JCR, Midgard2 provides the same content access functions through Midgard2 PHPCR, which relies on the php5-midgard2 extension. Built on top of the GNOME libgda library, Midgard2 supports an impressive list of relational databases in which you can host your repository.
I'll mention the fly in the ointment right away: the PHP extension is built for a fairly small number of operating systems:
- under Debian 7 Wheezy the package is still in the unstable branch (and deservedly so: it silently crashes PHP-FPM with a segfault)
- for CentOS the packages are either outdated or not available for all architectures (where they exist they most likely work, but I didn't get around to checking)
- Windows builds simply do not exist (possibly because of Midgard2's GNOME roots, although the repository still contains four-year-old files for PHP4)
- I could not test it under Mac OS because I don't have one (but judging by the site, everything installs through brew).
Overall, everything installed successfully on Ubuntu Server 12.04: the packages are fresh and nothing crashes.
However, from talking with the Symfony CMF developers on IRC I learned that this backend provider had been broken for several months, and even its tests had been disabled. The cause lies somewhere on the Midgard2 team's side, although bergie promised to fix it.

Midgard2 PHPCR did not work for me as part of Symfony CMF. Maybe it will for someone else. If not now, then later.
Jackalope
Continuing the hare theme in its naming (Jackrabbit, Jackalope), Jackalope provides access to three kinds of data stores:
- the already familiar Apache Jackrabbit
- the Doctrine Database Abstraction Layer, which lets you use the engines supported by DBAL. That is the theory; in practice only MySQL, PostgreSQL and SQLite have been tested (is anyone using anything else?).
- MongoDB (not updated for two years, most likely broken or irrelevant)
Jackalope (and jackalope-jackrabbit in particular) is fairly stable and is recommended as the most feature-complete implementation of the PHPCR API. This is what we will work with. However, the phpcr-api-tests suite, which checks that the PHPCR API is present and working, is also run against jackalope-doctrine-dbal, which may eventually catch up.
PHPCR Summary
So, we have an API (adapted for PHP) for accessing content repositories that conform to the JCR API standard. Several libraries have been developed for this API that abstract the application code from the data store.
At this point two main questions should remain, and both have answers:
When to use PHPCR?
- When you work with hierarchical navigation structures
- When you have data items that are related to each other
- When data versioning is needed
When NOT to use PHPCR?
For strictly structured content and for aggregate queries, relational databases are recommended. For example, in an online store, product descriptions can be stored in PHPCR and orders in an RDBMS.
PHPCR ODM
The specification is great, but the API is too abstract and inconvenient for everyday use (after all, most people are used to some kind of ORM). This is where the PHPCR ODM project comes in: a combination of PHPCR and an Object Document Mapper.
Doctrine ORM, familiar to developers using SF2 (and not only SF2), implements the Data mapper pattern to access data stored in an RDBMS.
ODM, like Doctrine ORM, uses Data mapper to completely separate business logic from the data storage layer, which in this case is the content repository. The authors honestly admit that ODM is inspired by the ideas of Hibernate.
ODM stores objects as PHPCR nodes and calls them documents. And since PHPCR is already independent of any particular implementation, no new database abstraction layer (DBAL) has to be written.
What is a document in PHPCR ODM terminology?
A document is a plain PHP class that is not required to implement any interfaces (you can always implement some, but the library itself does not require it) and does not inherit from any abstract base class. Such an entity must not have final methods and must not implement the __clone() and __wakeup() methods, or else must implement them very carefully. The entity itself consists of properties persisted in the repository. Since ODM works on top of the Doctrine Common library, which implements the basic functionality (annotations, caching and class autoloading), properties in the data store are mapped to class properties in the familiar way: through annotations in PHP comments or through YAML / XML configuration. Each document here has a title (title) and content (content). All documents are organized as a tree and can refer to other documents. Take a look at a sample document:
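namespace Demo;

use Doctrine\ODM\PHPCR\Mapping\Annotations as PHPCRODM;

/** @PHPCRODM\Document */
class MyDocument
{
    /** @PHPCRODM\Id */
    private $id;
    /** @PHPCRODM\ParentDocument */
    private $parent;
    /** @PHPCRODM\Nodename */
    private $name;
    /** @PHPCRODM\Children */
    private $children;
    /** @PHPCRODM\String */
    private $title;
    /** @PHPCRODM\String */
    private $content;

    // getters and setters omitted
}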
Note that in addition to the usual data types (for example, String), annotations can also specify the type of references to child or parent documents.
To those unfamiliar with the Data mapper pattern, such classes may look a bit like Active record (hello, Rails and Yii folks), but they are not.
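The difference from Active record is easiest to see in code. In the minimal sketch below the document class knows nothing about storage, and a separate mapper object does all the persistence work. `Page` and `PageMapper` are hypothetical names, and a PHP array stands in for the content repository; PHPCR ODM's real DocumentManager is far richer than this.

```php
<?php
// Plain document: no save() method, no base class, no storage knowledge.
class Page
{
    public $path;
    public $title;
}

// The mapper is the only place that knows how a Page is stored.
class PageMapper
{
    private $storage = []; // stands in for the content repository

    public function persist(Page $page): void
    {
        $this->storage[$page->path] = ['title' => $page->title];
    }

    public function find(string $path): ?Page
    {
        if (!isset($this->storage[$path])) {
            return null;
        }
        $page = new Page();
        $page->path  = $path;
        $page->title = $this->storage[$path]['title'];

        return $page;
    }
}

$mapper = new PageMapper();

$page = new Page();
$page->path  = '/cms/pages/home';
$page->title = 'Hello';
$mapper->persist($page);

echo $mapper->find('/cms/pages/home')->title, "\n"; // Hello
```

With Active record the `Page` class itself would carry `save()` and `find()`, which ties the business object to one particular storage.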
How to work with such a document?
require_once '../bootstrap.php';

// Load the document stored at the path /doc
$doc = $documentManager->find(null, "/doc");

echo 'Found '.$doc->getId()."\n";
echo 'Title: '.$doc->getTitle()."\n";
echo 'Content: '.$doc->getContent()."\n";

// Walk over the child documents
foreach ($doc->getChildren() as $child) {
    if ($child instanceof \Demo\MyDocument) {
        echo 'Has child '.$child->getId()."\n";
    } else {
        echo 'Unexpected child '.get_class($child)."\n";
    }
}
One small note: in an ORM it is usual to fetch data with queries, while in ODM you use the hierarchy for this. However, you can still run queries if you really want to.
PHPCR ODM has already implemented two very important functions - versioning and multilingualism. Let's start with the first.
Versioning in PHPCR comes in two kinds: simpleVersionable and versionable. Simple versioning provides checkin / checkout methods and a linear history. A checkin creates a new version of the node and makes the node read-only. To write to it again, you need to do a checkout.
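The checkin / checkout cycle with a linear history can be modelled in a few lines. This toy class only illustrates the behaviour described above and is not the PHPCR Version API; `VersionedDoc` and its methods are hypothetical.

```php
<?php
// Toy model of simple versioning: linear history, checkin makes the
// document read-only, checkout makes it writable again.
class VersionedDoc
{
    private $data = [];
    private $checkedOut = true; // a new document starts out writable
    private $history = [];

    public function set(string $key, $value): void
    {
        if (!$this->checkedOut) {
            throw new LogicException('Document is checked in (read-only); call checkout() first');
        }
        $this->data[$key] = $value;
    }

    // Store a snapshot and return its 1-based version number.
    public function checkin(): int
    {
        $this->history[] = $this->data; // PHP arrays are copied by value
        $this->checkedOut = false;

        return count($this->history);
    }

    public function checkout(): void
    {
        $this->checkedOut = true;
    }

    public function version(int $number): array
    {
        return $this->history[$number - 1];
    }
}

$doc = new VersionedDoc();
$doc->set('title', 'Draft');
$v1 = $doc->checkin();  // version 1, the document is now read-only
$doc->checkout();       // writable again
$doc->set('title', 'Final');
$v2 = $doc->checkin();  // version 2

echo $doc->version($v1)['title'], "\n"; // Draft
```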
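Full versioning additionally supports non-linear history (branching, for which PHPCR-ODM provides no helper methods) and version labels (not implemented in Jackalope).
Full versioning corresponds to the mix:versionable mixin in PHPCR. For anything more complex, refer to the PHPCR Version API: through PHPCR ODM you can reach the PHPCR session and obtain PHPCR\VersionManager from it.
Old versions can be restored or removed with the restoreVersion() and removeVersion() methods.
A document can expose its version information through dedicated fields: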
class MyPersistentClass
{
    private $versionName;
    private $versionCreated;
}
, , , Phpdoc- . , .
$article = new Article();
$article->id = '/test';
$article->topic = 'Test';
$dm->persist($article);
$dm->flush();
Now for multilingualism. Translated documents are loaded through the DocumentManager, for example with find(). An example document:
class MyPersistentClass
{
    private $locale;
    private $publishDate;
    private $topic;
    private $image;
}
, , ( ), Solr/ElasticSearch Doctrine DBAL MongoDB. Jackrabbit ( Oak) , - PHPCR .
Summarize. ODM :
- PHP Content Repository Jackalope Midgard2 ( Jackrabbit )
- PHPCR-ODM Doctrine Common