Often, when solving problems of system integration, it is required to present a certain amount of data in one format or another. In this case, the consumer of data can be anyone, but the source is almost always the corporate database. For example, a manufacturer may require periodic reports from the supplier on the movement of its goods in the XLSX or XML format, etc.
There are many tools for converting data into various formats, and their use depends on the technology stack adopted by the enterprise and the software architecture. At the same time, you always want the chain consisting of various libraries, frameworks, system layers used to transform the source data to be as short as possible. This would reduce the time spent on developing a solution and increase its productive efficiency.
If we accept that, in fact, the SQL query is at the root of the data sampling process, then ideally we would like to see a chain of conversions like this:
Where - initial data, - SQL query for data retrieval, - a function that converts the sample into the required format, - data in the required format. ')
For Oracle PL / SQL, there are a number of built-in and third-party packages that implement this functionality. These are DBMS_XMLGEN, DBMS_XMLQUERY, AS_XLSX, PL / JSON and others.
However, when there was a question about converting data into CSV format, for some reason, there were no ready-made solutions. I had to do it myself, then it will be shown how.
Formulation of the problem
Create a tool (PL / SQL package), which accepts an arbitrary SELECT query as a string or as a cursor variable, and returns a CLOB object that encapsulates data in CSV format. In case of any error, it should return NULL. The CSV format itself does not need to be presented - these are lines whose elements are separated by a certain character, most often ";", but in general, an arbitrary character can act as a separator. We assume that the characters 0x0D + 0x0A are used to separate the lines. The first row in a CSV file is usually the header and defines the column names.
Here are two overloaded procedures, the difference between them is that one of them accepts the request as a string, and the other as a reference to the cursor. The second parameter is the output, this is the desired result in the CLOB object. Finally, the third parameter is the CSV delimiter.
The implementation of these procedures will be helped by the built-in DBMS_SQL package, which allows working with dynamic cursors, when it is not known in advance (at compile stage) how many columns are involved in the sample.
DBMS_SQL features allow you to dynamically parse and execute arbitrary queries. For a procedure that accepts a request as a string, this happens as follows:
AS cur PLS_INTEGER := DBMS_SQL.OPEN_CURSOR; -- DBMS_SQL ignore INTEGER; BEGIN DBMS_SQL.PARSE(cur, stmt, DBMS_SQL.NATIVE); ignore := DBMS_SQL.EXECUTE(cur);
For a procedure that accepts a cursor variable, everything is simpler - starting with version 11 of Oracle, the conversion “cursor variable → number of SQL cursor” has become available.
The DBMS_SQL.TO_CURSOR_NUMBER function converts the REFCURSOR variable (strongly or weakly typed) into the number of the SQL cursor, which can then be passed to the DBMS_SQL subroutines. In this case, the cursor variable must be opened before its transfer to the DBMS_SQL.TO_CURSOR_NUMBER function.
cur PLS_INTEGER := DBMS_SQL.TO_CURSOR_NUMBER(ref_cursor);
Thus, both call options were reduced to obtaining the number of the dynamic cursor and the associated data set. The next step is to get information about the columns of this set and extract the data itself.
The DMBS_SQL package allows you to describe the columns of a dynamic cursor, returning information about each column in an associative array of records.
To do this, you must declare a PL / SQL collection based on the DBMS_SQL.DESC_TAB collection type (or DESC_TAB2, if the query can return column names that are longer than 30 characters). You can then use the collection methods to iterate through the table and retrieve information about the cursor.
AS cols DBMS_SQL.DESC_TAB2; ncols NUMBER; -- col_val_chr VARCHAR2(32767); BEGIN DBMS_SQL.DESCRIBE_COLUMNS2(cur, ncols, cols);
Next, the DBMS_SQL package needs to report the type of each column selected using a dynamic query. This is done by calling DEFINE_COLUMN.
FOR i IN1 .. ncols LOOP DBMS_SQL.DEFINE_COLUMN(cur, i, col_val_chr, 32767); ENDLOOP;
In the second argument DEFINE_COLUMN, a number is passed - the consecutive position of the column in the list. The third argument specifies the data type of the cursor column. It is passed an expression of the appropriate type. In other words, DBMS_SQL.DEFINE_COLUMN is not passed a string with a type name (say, “VARCHAR2”), but a variable defined with type VARCHAR2.
When defining a character type column in the fourth argument, you must specify the maximum length of the values ​​that can be loaded into the cursor.
The preparatory operations are completed, you can now form a header line and start extracting data.
AS cap CLOB; -- CSV- BEGIN DBMS_LOB.CREATETEMPORARY(cap, TRUE, DBMS_LOB.SESSION); FOR i IN 1 .. ncols LOOP DBMS_LOB.APPEND(cap, cols(i).col_name || delimeter); END LOOP;
Data is retrieved line by line using DBMS_SQL.FETCH_ROWS and subsequent calls to DBMS_SQL.COLUMN_VALUE to retrieve the value of individual columns.
AS bod CLOB; -- CSV- c_line_break CONSTANT VARCHAR2(2) := chr(13) || chr(10); BEGIN DBMS_LOB.CREATETEMPORARY(bod, TRUE, DBMS_LOB.SESSION); WHILE DBMS_SQL.FETCH_ROWS(ur) > 0 LOOP FOR i IN 1 .. ncols LOOP DBMS_SQL.COLUMN_VALUE(cur, i, col_val_chr); DBMS_LOB.APPEND(bod, col_val_chr || delimeter); END LOOP; DBMS_LOB.APPEND(bod, c_line_break); END LOOP;