When you regularly build corporate information systems that work with documents, you keep running into the same task: organizing document storage in a database. The solution has to provide reliable storage of structured and unstructured documents, links between documents, document versioning, advanced document search, bulk import and export of documents, efficient concurrent work by many users on a large number of documents, and access control for documents.
The data in information systems can be very diverse: from completely unstructured (images, multimedia, etc.) to complex structured objects with hundreds of attributes. All of it can be generalized to the concept of a document with its own life cycle, versioning, and security attributes. A document may start out unstructured and become structured as it is processed. A document often has a type (office memo, contract, application form, etc.) that determines the structure of its attributes.
Customers of corporate systems often impose requirements on the DBMS to be used. For quite a long time we used our own object-oriented DBMS, NIKA. Tasks that require integrating information systems at the DBMS level are also not uncommon. In recent projects, we have carried our accumulated experience with documents over to popular relational DBMSs (Microsoft SQL Server, Oracle, MySQL).
These requirements led us to create a document-oriented document storage system, Nexus.
The Storage is a set of interrelated software components and applications that organize document storage in modern industrial DBMSs (MS SQL Server 2005/2008, Oracle, MySQL, etc.). The components support various configurations: they can be embedded in an information system, where they map documents to relational structures and hide the DBMS specifics from the information system, or they can run as a stand-alone lightweight content server.
At a high level of abstraction, the Storage contains a set of collections of typed documents (that is, documents with a fixed structure, although files of arbitrary format can be attached to them). The Storage query language lets you efficiently formulate search conditions over structured documents and include full-text search over documents and their attached files.
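To make the "collections of typed documents" idea more concrete, here is a minimal in-memory sketch in Python. It is not the Nexus API: the class names `DocumentType`, `Document`, `Collection`, and `Storage` are hypothetical, and the rule that each collection holds documents of exactly one type is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentType:
    """Hypothetical document type: a name plus a fixed set of typed fields."""
    name: str
    fields: dict  # field name -> expected Python type


@dataclass
class Document:
    doc_type: DocumentType
    values: dict
    attachments: dict = field(default_factory=dict)  # file name -> raw bytes

    def __post_init__(self):
        # Enforce the fixed structure dictated by the document type.
        expected = self.doc_type.fields
        if set(self.values) != set(expected):
            raise ValueError(f"fields must be exactly {sorted(expected)}")
        for name, value in self.values.items():
            if not isinstance(value, expected[name]):
                raise TypeError(f"{name} must be {expected[name].__name__}")


@dataclass
class Collection:
    # Assumption: one document type per collection.
    doc_type: DocumentType
    documents: list = field(default_factory=list)

    def add(self, doc):
        if doc.doc_type != self.doc_type:
            raise TypeError("document type does not match collection type")
        self.documents.append(doc)


@dataclass
class Storage:
    collections: dict = field(default_factory=dict)  # name -> Collection

    def create_collection(self, name, doc_type):
        self.collections[name] = Collection(doc_type)
        return self.collections[name]


# Usage: a typed contract with an arbitrary file attached to it.
contract = DocumentType("contract", {"number": str, "amount": float})
storage = Storage()
contracts = storage.create_collection("contracts", contract)
doc = Document(contract, {"number": "C-17", "amount": 1500.0},
               attachments={"scan.pdf": b"%PDF-..."})
contracts.add(doc)
```

The structured fields are validated against the type, while attachments stay opaque bytes, which mirrors the split between fixed-structure data and files of arbitrary format.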
The Storage acts as an intermediary between the corporate system and the DBMS: it converts the corporate system's queries, formulated in terms of structured documents (for example, XSD, XML, XPath), into DBMS commands (for example, SQL). At the same time, the Storage provides concurrent access to documents, keeps document versions, enforces the configured access control rules, and logs user actions.
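As an illustration of the intermediary role, the sketch below translates a tiny XPath-like predicate into a parameterized SQL query. It is only a sketch under stated assumptions: the `SCHEMA` mapping, the `translate` function, and the `contracts` table are hypothetical, the supported syntax is a small subset (simple comparisons joined by `and`), and SQLite stands in for an industrial DBMS. It does not reproduce the actual Nexus query language.

```python
import re
import sqlite3

# Hypothetical mapping from a document type to its table and allowed columns.
SCHEMA = {"contract": {"table": "contracts", "columns": {"number", "amount"}}}

QUERY_RE = re.compile(r"^/(\w+)\[(.+)\]$")
COND_RE = re.compile(
    r"^(\w+)\s*(=|!=|<=|>=|<|>)\s*(?:'([^']*)'|(-?\d+(?:\.\d+)?))$"
)


def translate(query):
    """Translate '/type[field op value and ...]' into (sql, params)."""
    match = QUERY_RE.match(query.strip())
    if not match:
        raise ValueError("unsupported query")
    doc_type, predicate = match.groups()
    mapping = SCHEMA[doc_type]
    clauses, params = [], []
    # Limitation of this sketch: string values must not contain ' and '.
    for cond in re.split(r"\s+and\s+", predicate):
        m = COND_RE.match(cond.strip())
        # Only whitelisted columns reach the SQL text; values go into params.
        if not m or m.group(1) not in mapping["columns"]:
            raise ValueError(f"unsupported condition: {cond}")
        column, op, text, number = m.groups()
        clauses.append(f"{column} {op} ?")
        params.append(text if text is not None else float(number))
    sql = f"SELECT id FROM {mapping['table']} WHERE " + " AND ".join(clauses)
    return sql, params


# Usage against an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contracts (id INTEGER PRIMARY KEY, number TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO contracts (number, amount) VALUES (?, ?)",
    [("C-17", 1500.0), ("C-18", 300.0)],
)
sql, params = translate("/contract[amount > 1000 and number != 'C-99']")
rows = conn.execute(sql, params).fetchall()
```

The calling system never sees table or column names beyond the document fields it already knows, and user-supplied values are always passed as bound parameters rather than spliced into the SQL string.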
To be continued…