How do postgreSQL security_barrier views work

You may have noticed that support for security_barrier views has been added to PostgreSQL 9.2. I looked into this code with an eye on adding support for automatic updates for them, as part of the developing line level security work for the AXLE project , and I thought that I would try to explain how they work.

Robert has already explained what the benefits of such views are and what they protect against (besides, this was discussed in " What's new in PostgreSQL 9.2 "). Now I would like to go into how they work and discuss how security_barrier views interact with automatically updated views.

Ordinary views

A simple, regular representation is expanded in macro as a subquery, which is usually later optimized by putting out its predicate and adding it to the conditions of the contained query. This can be made clearer with an example. Given table:

CREATE TABLE t AS SELECT n, 'secret'||n AS secret FROM generate_series(1,20) n;

and presentation:
')

 CREATE VIEW t_odd AS SELECT n, secret FROM t WHERE n % 2 = 1;

query type:

 SELECT * FROM t_odd WHERE n < 4

will be converted inside the request handler to the following form:

 SELECT * FROM (SELECT * FROM t WHERE n % 2 = 1) t_odd WHERE n < 4

which the optimizer will then turn into a query executed at once, putting the subquery and the WHERE conditions into an external query:

 SELECT * FROM t t_odd WHERE (n % 2 = 1) AND (n < 4)

You cannot see instant requests directly and they never exist as real SQL, but you can see this process by including debug_print_parse = on , debug_print_rewritten = on and debug_print_plan = on in postgresql.conf. I will not reproduce the parsing and planning trees here, as they are rather cumbersome and easy to generate, based on the examples above.

The problem with using views for security

You might think that giving someone access to the view, without giving access to the table itself, would not allow them to see the even rows. Actually it looks like the truth:

 regress=> SELECT * FROM t_odd WHERE n < 4; n | secret ---+--------- 1 | secret1 3 | secret3 (2 rows)

but when you look at the plan, you can see a potential problem:

 regress=> EXPLAIN SELECT * FROM t_odd WHERE n < 4; QUERY PLAN --------------------------------------------------- Seq Scan on t (cost=0.00..31.53 rows=2 width=36) Filter: ((n < 4) AND ((n % 2) = 1)) (2 rows)

The subquery of the view was optimized and its determinants were made to the external query.

In SQL, AND and OR are not ordered. The optimizer / executor has complete freedom in choosing which branch to launch, which they consider to be faster in terms of issuing an answer and possibly allowing them to avoid launching other branches. Those. if the scheduler thinks n <4 is much faster than n% 2 , it will start it first. It looks harmless, right? Try:

 regress=> CREATE OR REPLACE FUNCTION f_leak(text) RETURNS boolean AS $$ BEGIN RAISE NOTICE 'Secret is: %',$1; RETURN true; END; $$ COST 1 LANGUAGE plpgsql; regress=> SELECT * FROM t_odd WHERE f_leak(secret) AND n < 4; NOTICE: Secret is: secret1 NOTICE: Secret is: secret2 NOTICE: Secret is: secret3 NOTICE: Secret is: secret4 NOTICE: Secret is: secret5 NOTICE: Secret is: secret6 NOTICE: Secret is: secret7 NOTICE: Secret is: secret8 NOTICE: Secret is: secret9 NOTICE: Secret is: secret10 NOTICE: Secret is: secret11 NOTICE: Secret is: secret12 NOTICE: Secret is: secret13 NOTICE: Secret is: secret14 NOTICE: Secret is: secret15 NOTICE: Secret is: secret16 NOTICE: Secret is: secret17 NOTICE: Secret is: secret18 NOTICE: Secret is: secret19 NOTICE: Secret is: secret20 n | secret ---+--------- 1 | secret1 3 | secret3 (2 rows) regress=> EXPLAIN SELECT * FROM t_odd WHERE f_leak(secret) AND n < 4; QUERY PLAN ---------------------------------------------------------- Seq Scan on t (cost=0.00..34.60 rows=1 width=36) Filter: (f_leak(secret) AND (n < 4) AND ((n % 2) = 1)) (2 rows)

Oops! As you can see, the function with the predicate put by the user was considered cheaper to run than other tests, so it skipped all the lines before the presentation predicate ruled out the wrong ones. A malicious function can use the same trick to copy strings.

Views security_barrier

The security_barrier views fix this by forcing the view conditions to be executed first, before any conditions created by the user are applied. Instead of extending the view and putting any conditions of the view into an external query, they replace the reference to the view with a subquery. This subquery has a security_barrier flag affixed to the entire range of its entry in the table, which tells the optimizer that it should not touch the subquery, or take conditions out of it, as it would have done in the usual case.

Thus, the presentation with a protective barrier:

 CREATE VIEW t_odd_sb WITH (security_barrier) AS SELECT n, secret FROM t WHERE n % 2 = 1;

we'll get:

 regress=> SELECT * FROM t_odd_sb WHERE f_leak(secret) AND n < 4; NOTICE: Secret is: secret1 NOTICE: Secret is: secret3 n | secret ---+--------- 1 | secret1 3 | secret3 (2 rows) regress=> EXPLAIN SELECT * FROM t_odd_sb WHERE f_leak(secret) AND n < 4; QUERY PLAN --------------------------------------------------------------- Subquery Scan on t_odd_sb (cost=0.00..31.55 rows=1 width=36) Filter: f_leak(t_odd_sb.secret) -> Seq Scan on t (cost=0.00..31.53 rows=2 width=36) Filter: ((n < 4) AND ((n % 2) = 1)) (4 rows)

The query execution plan should tell you what is going on, although it does not show the protective barrier attribute in the explanation output. A nested subquery forces t to scan with the conditions of the subquery representation, after which the conditions of the function written by the user are satisfied on the received data.

But. Wait a second. Why does the user predicate n <4 also appear in the subquery? Isn't that a potential security hole? If n <4 is omitted, then why not f_leak (secret) ?

LEAKPROOF operators and functions

The explanation for this is that the < operator is labeled LEAKPROOF . This attribute signals that the given operator or function will not allow information leakage; therefore, it can be safely applied to security_barrier views. For obvious reasons, you cannot set the LEAKPROOF attribute as a regular user:

 regress=> ALTER FUNCTION f_leak(text) LEAKPROOF; ERROR: only superuser can define a leakproof function

Superuser can do whatever they want and they do not need to resort to tricks with information leakage functions in order to pass the protective barrier of ideas.

Why can't you update the view security_barrier

Normal views in PostgreSQL 9.3 are automatically updated , but security_barrier views do not imply “simplicity”. This is due to the fact that the update of the view relies on the ability to remove the view subquery, turning the update into a regular table update. The whole point of security_barrier views is to prevent this view condition exception. UPDATE cannot currently work directly with a subquery, so PostgreSQL will reject any attempts to update the security_barrier view:

 regress = > UPDATE t_odd SET secret = 'secret_haha' || n; UPDATE 10 regress = > UPDATE t_odd_sb SET secret = 'secret_haha' || n; ERROR: cannot UPDATE VIEW "t_odd_sb" DETAIL: SECURITY - barrier views ARE NOT automatically updatable. HINT: TO ENABLE updating the VIEW, provide an INSTEAD OF UPDATE TRIGGER OR an unconditional ON UPDATE DO INSTEAD RULE.

This is exactly the same restriction that I am interested in canceling as part of the work on the development of row level security for the AXLE project . Kohei KaiGai has done a tremendous amount of work with row-level security, and such things as security_barrier and LEAKPROOF have largely sprung from his work in the direction of adding row-level security in PostgreSQL. The next challenge is how to deal with updating the security barrier safely and in such a way that it can be serviced in the future.

Why subqueries?

You may wonder why we use subqueries for this. I was thinking. The short version is we don’t have to, but if we don’t use subqueries, we’ll instead have to create new, sorting sensitive AND and OR operators and teach the optimizer that it cannot pass conditions through them. Since views are already expanded by subqueries, it is much easier to mark subqueries as fences that do not allow to extract / add data to them.

PostgreSQL already has a simplified, ordered operation - CASE . The problem with using CASE is that no operation can cross CASE boundaries, even LEAKPROOF . As well as the optimizer and can not make decisions about the use of indices? based on expressions inside a CASE block. So, if we used CASE as I asked about this here, we would never have been able to use the index to satisfy the user-provided condition.

In code

Security_barrier support was added at 0e4611c0234d89e288a53351f775c59522baed7c and enhanced with LEAKPROOF support in cd30728fb2ed7c367d545fc14ab850b5fa2a4850 . Words of thanks go to commit notes. Thanks to everyone who participated.

Ps. The article is relatively old, but important as an introduction to the translation of the next article.

Source: https://habr.com/ru/post/273167/

All Articles