In a previous article, I talked about the release of Solid12, the version of the IEM platform for PostgreSQL. As promised, here is a more detailed look at what developers will have to face (and what we ran into during the migration).
This overview is not exhaustive; rather, it should be treated as a minimal introduction for an Oracle developer who is starting to work with PostgreSQL.
So, here is the list of differences that we recognized as the most significant:
- Names of tables, columns, and stored procedures are folded to lowercase by default (unless the identifier is quoted). This also affects column names in SQL query results (see the first sketch after this list).
- Packages are not supported. The Solid12 kernel uses schemas to group functions into packages, so the kernel is split across several schemas.
- Instead of package variables, session variables are used. Their semantics are similar, although there are some differences (a sketch of one possible mechanism follows the list).
- Temporary tables in PostgreSQL are always local and, unlike in Oracle, cannot be bound to schemas. Solid12 ships a set of functions that emulates Oracle global temporary tables and fully supports the standard syntax for working with them.
- The PL/pgSQL stored procedure language is similar to PL/SQL but still differs from it in many details.
- Stored procedures always use dynamic binding (i.e., symbols are resolved at run time). For example, if a stored procedure executes a query like “SELECT * FROM USERS”, the lookup of the USERS table happens when the procedure runs, not when it is compiled. Dynamic binding makes PostgreSQL procedures quite flexible, but also much more prone to runtime errors than Oracle procedures.
- By default, PostgreSQL functions run with the permissions of the current user (in Oracle, the permissions of the function creator are used by default). This option can be explicitly overridden in every function where it is required.
- Empty strings are not the same as NULL. Unlike in Oracle, concatenating a varchar or text value with NULL in PostgreSQL always returns NULL. Always initialize local variables with empty strings to avoid unintentionally turning the whole result into NULL (see the example after this list).
- DDL operations in PostgreSQL are transactional. Creating tables and functions, changing column types, clearing tables (TRUNCATE), and so on must be committed (illustrated after the list).
- Replacing a function with a new version fails if the signature of the new version differs from the original. In this situation “CREATE OR REPLACE FUNCTION” is not enough: you need to delete (DROP) the old version of the function and create it anew (example after the list).
- Any database error marks the current transaction as failed, after which the transaction cannot execute any commands except ROLLBACK. However, such a transaction can be rolled back to the last named savepoint (SAVEPOINT) established before the error occurred. After rolling back to the savepoint, the transaction can continue working and commit (we use this technique in integration tests, which must keep running despite errors; see the last sketch after the list).
- Several simultaneously active DataReaders on the same database connection are not supported (the so-called MARS mode, Multiple Active Result Sets). To read several data sets, you must either open several connections to the database, one per data set, or execute the queries one after another.
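To make the first point concrete, here is a small sketch of identifier case folding (the table and column names are made up):

-- Unquoted identifiers are folded to lowercase, quoted identifiers keep their case.
create table Accounts ("AccountID" integer, Name text);

select name from accounts;        -- works: both names were folded to lowercase
select "AccountID" from accounts; -- works: the quoted name is stored exactly as written
-- select AccountID from accounts; -- fails: unquoted AccountID folds to accountid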
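For emulating package variables with session state, one possible mechanism (a sketch only, not necessarily the one the platform uses) is PostgreSQL custom configuration parameters:

-- Set a session-scoped value; the "app." prefix is an arbitrary example namespace.
select set_config('app.current_department_id', '42', false); -- false = keep until the session ends

-- Read it back anywhere later in the same session.
select current_setting('app.current_department_id', true);   -- true = return NULL instead of failing if unset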
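The empty-string point is easy to trip over when porting string-building code from Oracle:

select '' is null;                      -- false: in PostgreSQL an empty string is a real value
select 'Hello, ' || null;               -- NULL: concatenation with NULL wipes out the whole result
select 'Hello, ' || coalesce(null, ''); -- 'Hello, ': coalesce (or an initialized variable) avoids that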
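Transactional DDL means a migration script can be rolled back as a whole; a minimal illustration (the table name is hypothetical):

begin;
create table ddl_demo (id integer primary key);
truncate ddl_demo;
rollback;                -- the table disappears together with the rest of the transaction

select * from ddl_demo;  -- ERROR: relation "ddl_demo" does not exist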
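The signature restriction on CREATE OR REPLACE, sketched with a throwaway function:

create or replace function get_flag() returns integer language sql as 'select 1';

-- Changing the return type is rejected:
--   create or replace function get_flag() returns boolean ...
--   ERROR: cannot change return type of existing function

drop function get_flag();
create function get_flag() returns boolean language sql as 'select true';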
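Finally, the savepoint technique looks roughly like this (the table and savepoint names are made up):

begin;
insert into audit_log(message) values ('step 1');

savepoint before_risky_step;
-- A failing statement here (say, a constraint violation) marks the transaction as aborted,
-- and every further command is rejected until we roll back to the savepoint.
rollback to savepoint before_risky_step;

insert into audit_log(message) values ('step 2');
commit;  -- the work done before the error (and after the rollback) is preserved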
It is not always possible to completely abstract away the specifics of a particular database in application code. Often a command or a service needs to build and execute a dynamic SQL query, call a stored procedure, and so on. An application schema may require triggers, views, constraints, or indexes, so an Oracle application developer will need to understand at least the basics of PostgreSQL.
Below are some recipes that can help with some of the difficulties described above.
How to work around dynamic binding in PL/pgSQL code
Dynamic binding is a powerful mechanism that in some cases can replace dynamic query execution (EXECUTE sql). Its downside is fragility: there are no compile-time checks, so the compiler cannot statically verify that a given symbol refers to an existing database object.
When a function refers to a symbol, such as a table or another function, the specific object is looked up by name only while the function is executing. Moreover, this lookup is affected by the contents of the search_path variable, which means the symbol can resolve to an object in any schema, depending on the settings of the current session.
This is usually not what you want.
To keep dynamic binding from misfiring, we follow two simple rules:
- Add “set search_path to <current schema name>” to every function definition, and
- Qualify all tables outside the current schema with their schema names.
This does not make the binding static (PostgreSQL still does not verify that the symbols are valid), but it rules out accidentally binding a symbol to the wrong object.
Here is a sketch of a PL/pgSQL function written according to these rules (the schema, table, and column names below are made up):
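create or replace function core.get_active_user_count() returns bigint as $$
begin
  -- billing.accounts lives outside the current schema, so it is schema-qualified explicitly
  return (select count(*)
          from users u                            -- resolved in the core schema via search_path
          join billing.accounts a on a.user_id = u.id
          where u.is_active);
end;
$$ language plpgsql
   set search_path to core;  -- rule 1: pin the search path to the current schema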
Overriding permissions that apply to a function
By default, PostgreSQL functions are called with the permissions of the current DBMS user, which corresponds to Oracle’s “AUTHID CURRENT_USER” option (Oracle’s own default is the opposite mode, “AUTHID DEFINER”).
To emulate the Oracle behavior, the function must override the security option like this:
create or replace function my_secure_func() returns void as $$
begin
  -- function body goes here
end;
$$ language plpgsql security definer;
Emulate Oracle global temporary tables
The semantics of temporary tables in PostgreSQL differ significantly from Oracle's. Here is a brief overview of the differences:
- Temporary tables in Oracle are permanent objects: their structure is fixed and visible to all users, while the contents are temporary.
- In PostgreSQL, a temporary table is created before each use. Both the structure and the contents of a temporary table are visible only to the DBMS session that created it. Temporary PostgreSQL tables are always deleted, either at the end of the session or at the end of the transaction.
- In Oracle, temporary tables are always located within a particular schema specified during creation.
- In PostgreSQL, temporary tables cannot be placed in an arbitrary schema; they are always created in a special implicit temporary schema.
The pack_temp schema contains a library for emulating Oracle-style temporary tables. We are interested in only two functions:
create_permanent_temp_table(table_name [, schema_name]);
drop_permanent_temp_table(table_name [, schema_name]);
Creating a permanent temporary table is done in two steps:
- Create a regular PostgreSQL temporary table (the ordinary kind that is dropped at the end of the transaction).
- Call the create_permanent_temp_table function to turn this temporary table into a permanent one:
create temporary table if not exists another_temp_table
(
  first_name varchar,
  last_name varchar,
  date timestamp(0) with time zone,
  primary key(first_name, last_name)
) on commit drop;
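The conversion itself is then a single call (a sketch; per the signatures above, the optional second argument specifies the target schema):

select create_permanent_temp_table('another_temp_table');
commit;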
This creates a view that behaves exactly like an Oracle global temporary table. It can later be deleted with the drop_permanent_temp_table function.
Several active DataReaders on the same database connection
This is PostgreSQL’s most annoying limitation: every database connection can have only one open DataReader at a time.
A new query cannot be executed until the previous one has completed and its results have been read.
The problem regularly pops up in application services, LINQ queries and SQL queries in many different forms. Here are a few typical cases:
- A LINQ query references a constant (or calls a service). The query opens the first DataReader, and the constant service tries to open a second one and gets an error. To get rid of the error, read the constant into a local variable before executing the query (or call the service after the query results have been read). Example:
// Before: evaluating Constants.TestAgentID while the query runs opens a second DataReader
// var query = from a in DataContext.GetTable<Agent>()
//             where a.ID == Constants.TestAgentID
//             select a;

// After: read the constant into a local variable first
var testAgentId = Constants.TestAgentID;
var query = from a in DataContext.GetTable<Agent>()
            where a.ID == testAgentId
            select a;
- The results of a LINQ query are processed in a loop, but the loop body executes a second query, whether LINQ or SQL. How to work around the restriction: materialize the results of the first query into a list or an array, and iterate over that list after the first query has completed. Example:
// Before: the loop body issues new queries while the outer DataReader is still open
// foreach (var langId in DataContext.GetTable<Language>().Select(x => x.ID))
// {
//     using (LanguageService.UseLanguage(langId))
//     {
//         // do something language-specific
//     }
// }

// After: materialize the IDs first, then iterate over the in-memory list
foreach (var langId in DataContext.GetTable<Language>().Select(x => x.ID).ToIDList())
{
    using (LanguageService.UseLanguage(langId))
    {
        // do something language-specific
    }
}
- ToArray / ToList / ToIDList is called within a LINQ query. To fix this, split the query into parts:
// Before: ToIDList() inside ToDictionary runs nested queries on the same connection
// var dictionary = DataContext.GetTable<CalendarDayStatus>()
//     .Where(d => dates.Contains(d.DT))
//     .GroupBy(g => g.DT, e => e.StatusID)
//     .ToDictionary(k => k.Key, e => e.ToIDList());

// After: materialize the grouped results first, then build the final dictionary
var dictionary = DataContext.GetTable<CalendarDayStatus>()
    .Where(d => dates.Contains(d.DT))
    .GroupBy(g => g.DT, e => e.StatusID)
    .ToDictionary(p => p.Key);
var dict = dictionary.ToDictionary(p => p.Key, p => p.Value.ToIDList());
Unfortunately, this kind of error is very difficult to detect statically. Therefore, every nontrivial LINQ query needs to be tested thoroughly to make sure it does not try to open several DataReaders at once.
What's next?
We plan to develop closer cooperation with the PostgreSQL team. In fact, most of these restrictions do not look insurmountable, and we may be able to find the resources to make the appropriate changes to the PostgreSQL code.
There are also features that PostgreSQL already offers but we have not used yet, such as support for geodata processing; we hope to take advantage of them in future versions.
In any case, both versions - the light Solid12 and the enterprise Ultimate Solid - will be developed in parallel, and all the important functionality will be supported in both delivery options of the platform.