The problem with bound variables: how to turn the optimizer from enemy to friend

The author of the article is Victor Varlamov ( varlamovVp18 ), OCP.
The original article was published on 07/07/2017.
Special thanks to the author of the translation - brutaltag .

Our reporting system typically runs hundreds of lengthy queries that are triggered by various events. The query parameters are the list of clients and the time interval (daily, weekly, monthly). Because of the irregular data in the tables, a single query can produce either a single row or a million rows, depending on the parameters of the report (for different clients there is a different number of rows in the fact tables). Each report is designed as a package with a main function that accepts input parameters, performs additional transformations, then opens a static cursor with associated variables and returns this open cursor at the end. The database parameter CURSOR_SHARING is set in FORCE.
In such a situation, one has to deal with poor performance, both in the case of re-use of the query plan by the optimizer, and with full parsing of the query with literal parameters. Associated variables may cause a non-optimal query plan.

In his book “Oracle Expert Practices,” Alex Gorbachev tells an interesting story told to him by Tom Kite. Every rainy Monday, users had to deal with a modified query plan. It is hard to believe, but it was so:

“According to the observations of end users, when it rained heavily on Monday, the database performance was terrible. On any other day of the week or on Monday there were no problems without rain. From a conversation with the database administrator Tom Kite learned that the difficulties continued until the forced restart of the database, after which the performance became normal. That was such a workaround: rainy Monday - restart. ”

This is a real case, and the problem was solved completely without any magic, only thanks to the excellent knowledge of how Oracle works. I will show the solution at the end of the article.
Here is a small example of how related variables work.
Create a table with irregular data.

SQL> CREATE TABLE VVP_HARD_PARSE_TEST(C1 NUMBER, C2 NUMBER, C3 VARCHAR2(300)); TABLE created. SQL> INSERT INTO VVP_HARD_PARSE_TEST SELECT ROWNUM C1, CASE WHEN LEVEL < 9 THEN 1 WHEN MOD(ROWNUM, 100)=99 THEN 99 ELSE 1000000 END C2, RPAD('A', 300, 'A') C3 FROM DUAL CONNECT BY LEVEL CREATE INDEX IND_VVP_HARD_PARSE_TEST_C2 ON VVP_HARD_PARSE_TEST(C2); INDEX created. SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => USER, TABNAME => 'VVP_HARD_PARSE_TEST', CASCADE => TRUE, METHOD_OPT=>'FOR ALL INDEXED COLUMNS SIZE 254'); PL/SQL PROCEDURE successfully completed. SQL> SELECT histogram FROM user_tab_columns WHERE table_name = 'VVP_HARD_PARSE_TEST' AND column_name = 'C2'; HISTOGRAM --------- FREQUENCY SQL> SELECT c2, COUNT(*) FROM VVP_HARD_PARSE_TEST GROUP BY c2 ORDER BY 1; C2 COUNT(*) ----------------------- 1 8 99 10000 1000000 989992

')
In other words, we have the table VVP_HARD_PARSE_TEST with a million rows, where in 10,000 cases the field C2 = 99, 8 records with C2 = 1, and the rest with C2 = 1000000. A histogram across the field C2 tells the Oracle optimizer about this data distribution. This situation is known as a non-uniform distribution of data , and a histogram can help choose the right query plan, depending on the requested data.

Observe simple queries to this table. Obviously, for the request

SELECT * FROM VVP_HARD_PARSE_TEST WHERE c2 = :p

if p = 1, then the best choice would be INDEX RANGE SCAN, for the case of p = 1000000 it is better to use FULL TABLE SCAN. Query1 and Query1000000 queries are identical, except for the text in the comments, this is done to get different identifiers of query plans.

 DECLARE p NUMBER; v NUMBER; BEGIN V := 0; p := 1000000; FOR rec IN (SELECT /*+query1000000*/ * FROM VVP_HARD_PARSE_TEST WHERE c2 = p) LOOP V := v + 1; END LOOP; dbms_output.put_line(v); v : =0; p := 1; FOR rec IN (SELECT /*+query1000000*/ * FROM VVP_HARD_PARSE_TEST WHERE c2 = p) LOOP V := v + 1; END LOOP; dbms_output.put_line(v); ----------------- V := 0; p := 1; FOR rec IN (SELECT /*+query1*/ * FROM VVP_HARD_PARSE_TEST WHERE c2 = p) LOOP V := v + 1; END LOOP; dbms_output.put_line(v); v := 0; p := 1000000; FOR rec IN (SELECT /*+query1*/ * FROM VVP_HARD_PARSE_TEST WHERE c2 = p) LOOP V := v + 1; END LOOP; dbms_output.put_line(v); END;

Now look at the query plans:

 SQL> SELECT sql_id, child_number, executions, plan_hash_value, sql_text FROM v$sql WHERE sql_text LIKE 'SELECT % * FROM VVP_HARD_PARSE_TEST WHERE C2%'; SQL_ID CHILD_NUMBER EXECUTIONS PLAN_HASH_VALUE SQL_TEXT ------------------------------------------------- 7rqnhhp6pahw2 0 2 2782757451 SELECT /*+query1000000*/ * FROM VVP_HARD_PARSE_TEST WHERE C2 = :B1 7xwt28hvw3u9s 0 2 2463783749 SELECT /*+query1*/ * FROM VVP_HARD_PARSE_TEST WHERE C2 = :B1 SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id => '7rqnhhp6pahw2', format => 'basic +peeked_binds')); SELECT /*+query1000000*/ * FROM VVP_HARD_PARSE_TEST WHERE C2 = :B1 PLAN hash VALUE: 2782757451 ------------------------------------------------- | Id | Operation | Name | ------------------------------------------------- | 0 | SELECT STATEMENT | | | 1 | TABLE ACCESS FULL| VVP_HARD_PARSE_TEST | ------------------------------------------------- Peeked Binds (IDENTIFIED BY position): -------------------------------------- 1 - :B1 (NUMBER): 1000000 SQl> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id => '7xwt28hvw3u9s', format => 'basic +peeked_binds')); SELECT /*+query1*/ * FROM VVP_HARD_PARSE_TEST WHERE C2 = :B1 PLAN hash VALUE: 2463783749 ------------------------------------------------------------------ | Id | Operation | Name | ------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | 1 | TABLE ACCESS BY INDEX ROWID| VVP_HARD_PARSE_TEST | | 2 | INDEX RANGE SCAN | IND_VVP_HARD_PARSE_TEST_C2 | ------------------------------------------------------------------ Peeked Binds (IDENTIFIED BY position): -------------------------------------- 1 - :B1 (NUMBER): 1

As you can see, a plan for different queries is created only once, at the time of the first execution (only one child cursor with CHILD_NUMBER = 0 exists for each query). Each request is executed twice (EXECUTION = 2). During a hard parse, Oracle retrieves the values of the associated variables and selects a plan according to these values. But he uses the same plan for the next run, despite the fact that the associated variables changed in the second run. Non-optimal plans are used - Query1000000 with variable C2 = 1 uses FULL TABLE SCAN instead of INDEX RANGE SCAN, and vice versa.

Clearly, patching an application and using parameters as literals in a query is the most appropriate way to solve a problem, but it leads to dynamic SQL with its known flaws. Another way is to disable the query of related variables ( ALTER SESSION SET "_OPTIM_PEEK_USER_BINDS" = FALSE ) or delete histograms ( link ).

One possible solution is the alternative use of data access policies, also known as Virtual Private Database (detailed access control, Fine Grained Access Control , row level control). This allows you to change requests on the fly and therefore can cause a complete analysis of the query plan each time the query uses detailed access control. This technique is described in detail in the article Randalph Geist . The disadvantage of this method is the increasing number of complete parses and the inability to manipulate query plans.

See what we do now. After analyzing our data, we decide to divide customers into three categories - Large, Medium and Small (LMS or 9-5-1) - according to the number of transactions or transactions during the year. Also, the number of lines in the report strictly depends on the period: Monthly - Large, Weekly - Middle, Daily - Small or 9-5-1. Further, the solution is simple - we will make the predicate of the security policy dependent on each category and on each period. So, for each request, we get 9 possible child cursors. Moreover, queries with different policies will lead us to the same query identifiers, this makes it possible to implement SQL PLAN MANAGEMENT (sql plan baseline).

 SQL> CREATE TABLE HARD_PARSE_TABLE AS SELECT * FROM dual; TABLE created. SQL> CREATE TABLE CLIENTS_HP_STATISTICS (client_seqno NUMBER, client_id VARCHAR2(255), cnt_year NUMBER); TABLE created. SQL> INSERT INTO CLIENTS_HP_STATISTICS (client_seqno, client_id, cnt_year) VALUES (1, 'SMALL CLIENT', 8); 1 ROW inserted. SQL> INSERT INTO CLIENTS_HP_STATISTICS (client_seqno, client_id, cnt_year) VALUES (99, 'MIDDLE CLIENT', 50001); 1 ROW inserted. SQL> INSERT INTO CLIENTS_HP_STATISTICS (client_seqno, client_id, cnt_year) VALUES (1000000,'LARGE CLIENT', 989992); 1 ROW inserted. SQL> CREATE OR REPLACE PACKAGE FORCE_HARD_PARSE_PKG IS gc_small CONSTANT NUMBER := 1; gc_middle CONSTANT NUMBER := 5; gc_large CONSTANT NUMBER := 9; gc_client_middle CONSTANT NUMBER := 50000; gc_client_large CONSTANT NUMBER := 500000; gc_daterange_middle CONSTANT NUMBER := 10; gc_daterange_large CONSTANT NUMBER := 50; FUNCTION FORCE_HARD_PARSE(in_schema VARCHAR2, in_object VARCHAR2) RETURN VARCHAR2; PROCEDURE SET_PREDICATE (n NUMBER); PROCEDURE SET_PREDICATES (p_daterange NUMBER DEFAULT NULL, p_clientrange NUMBER DEFAULT NULL); PROCEDURE CALC_PREDICATE; PROCEDURE CALC_PREDICATES(p_date_interval NUMBER DEFAULT 1, p_client_seqno NUMBER DEFAULT NULL, p_client_id VARCHAR2 DEFAULT NULL, p_client_seqno_list VARCHAR2 DEFAULT NULL ); END FORCE_HARD_PARSE_PKG; PACKAGE created. SQL> CREATE OR REPLACE PACKAGE BODY FORCE_HARD_PARSE_PKG IS g_predicate NUMBER; -- g_daterange || 0 || g_clientrange g_daterange NUMBER; -- 1 - small, 5 - middle, 9 - large g_clientrange NUMBER; -- 1 - small, 5 - middle, 9 - large -- FUNCTION FORCE_HARD_PARSE(in_schema VARCHAR2, in_object VARCHAR2) RETURN VARCHAR2 IS BEGIN IF NVL(g_predicate, 0) = 0 THEN RETURN NULL; ELSE RETURN TO_CHAR(g_predicate, 'TM') || ' = ' || TO_CHAR(g_predicate, 'TM'); END IF; END FORCE_HARD_PARSE; -- PROCEDURE SET_PREDICATE (n NUMBER) IS BEGIN g_predicate := n; END; PROCEDURE SET_PREDICATES (p_daterange NUMBER DEFAULT NULL, p_clientrange NUMBER DEFAULT NULL) IS BEGIN IF p_daterange IS NOT NULL THEN g_daterange := p_daterange; CALC_PREDICATE; END IF; IF p_clientrange IS NOT NULL THEN g_clientrange := p_clientrange; CALC_PREDICATE; END IF; END SET_PREDICATES; PROCEDURE CALC_PREDICATE IS BEGIN g_predicate := NVL(g_daterange, 0) * 100 + NVL(g_clientrange, 0); END CALC_PREDICATE; PROCEDURE CALC_PREDICATES (p_date_interval NUMBER DEFAULT 1, p_client_seqno NUMBER DEFAULT NULL, p_client_id VARCHAR2 DEFAULT NULL, p_client_seqno_list VARCHAR2 DEFAULT NULL) IS v_cnt NUMBER; BEGIN IF p_date_interval IS NOT NULL THEN g_daterange := CASE WHEN p_date_interval < gc_daterange_middle THEN gc_small WHEN p_date_interval < gc_daterange_large THEN gc_middle ELSE gc_large END; CALC_PREDICATE; END IF; IF COALESCE(p_client_seqno, p_client_id, p_client_seqno_list) IS NOT NULL THEN SELECT NVL(SUM(cnt_year), 0) AS cnt INTO v_cnt FROM CLIENTS_HP_STATISTICS t WHERE 1=1 AND (p_client_seqno IS NULL OR p_client_seqno = t.client_seqno) AND (p_client_id IS NULL OR p_client_id = t.client_id) AND (p_client_seqno_list IS NULL OR t.client_seqno IN (SELECT SUBSTR(s, CASE WHEN LEVEL > 1 THEN INSTR(s, ',', 1, LEVEL - 1 ) + 1 ELSE 1 END, INSTR(s, ',', 1, LEVEL) – CASE WHEN LEVEL > 1 THEN INSTR(s, ',', 1, LEVEL – 1) + 1 ELSE 1 END) FROM (SELECT p_client_seqno_list||',' AS s FROM DUAL) CONNECT BY INSTR(s, ',', 1, LEVEL) > 0)); g_clientrange := CASE WHEN v_cnt > gc_client_large THEN gc_large WHEN v_cnt > gc_client_middle THEN gc_middle ELSE gc_small END; CALC_PREDICATE; END IF; END CALC_PREDICATES; END FORCE_HARD_PARSE_PKG; PACKAGE BODY created. SQL> EXEC DBMS_RLS.ADD_POLICY (USER, 'HARD_PARSE_TABLE', 'HARD_PARSE_POLICY', USER, 'FORCE_HARD_PARSE_PKG.FORCE_HARD_PARSE', 'select'); PL/SQL PROCEDURE successfully completed.

Now, if we want to embed such a technology into a report, we need to add HARD_PARSE_TABLE to the query (this will not spoil it a bit) and call CALC_PREDICATES before the main query is executed.

Let's see how this technique can transform the previous example:

 DECLARE p NUMBER; v NUMBER; BEGIN V := 0; p := 1000000; FORCE_HARD_PARSE_PKG.SET_PREDICATE(1000000); FOR rec IN (SELECT /*+query_hp1000000*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE c2 = p) LOOP V := v + 1; END LOOP; dbms_output.put_line(v); v := 0; p := 1; FORCE_HARD_PARSE_PKG.SET_PREDICATE(1); FOR rec IN (SELECT /*+query_hp1000000*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE c2 = p) LOOP V := v + 1; END LOOP; dbms_output.put_line(v); ----------------- V := 0; p := 1; FORCE_HARD_PARSE_PKG.SET_PREDICATE(1); FOR rec IN (SELECT /*+query_hp1*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE c2 = p) LOOP V := v + 1; END LOOP; dbms_output.put_line(v); v := 0; p := 1000000; FORCE_HARD_PARSE_PKG.SET_PREDICATE(1000000); FOR rec IN (SELECT /*+query_hp1*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE c2 = p) LOOP V := v + 1; END LOOP; dbms_output.put_line(v); END;

Let's look at the execution plans:

 SQL> SELECT sql_id, child_number, executions, plan_hash_value, sql_text, s.* FROM v$sql s WHERE sql_text LIKE 'SELECT % * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE c2%' ORDER BY 1,2; SQL_ID CHILD_NUMBER EXECUTIONS PLAN_HASH_VALUE SQL_TEXT -------------------------------------------------------------------------------- 7wva3uqbgh4qf 0 1 1136240498 SELECT /*+query_hp1000000*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1 7wva3uqbgh4qf 1 1 3246475190 SELECT /*+query_hp1000000*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1 8cju3tfjvwm1p 0 1 3246475190 SELECT /*+query_hp1*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1 8cju3tfjvwm1p 1 1 1136240498 SELECT /*+query_hp1*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1 -- SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id => '7wva3uqbgh4qf', cursor_child_no => 0, format => 'basic +peeked_binds')); SELECT /*+query_hp1000000*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1 PLAN hash VALUE: 1136240498 ---------------------------------------------------- | Id | Operation | Name | ---------------------------------------------------- | 0 | SELECT STATEMENT | | | 1 | MERGE JOIN CARTESIAN| | | 2 | TABLE ACCESS FULL | HARD_PARSE_TABLE | | 3 | BUFFER SORT | | | 4 | TABLE ACCESS FULL | VVP_HARD_PARSE_TEST | ---------------------------------------------------- Peeked Binds (IDENTIFIED BY position): -------------------------------------- 1 - :B1 (NUMBER): 1000000 -- SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id => '7wva3uqbgh4qf', cursor_child_no => 1, format => 'basic +peeked_binds')); SELECT /*+query_hp1000000*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1 PLAN hash VALUE: 3246475190 -------------------------------------------------------------------- | Id | Operation | Name | -------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | 1 | MERGE JOIN CARTESIAN | | | 2 | TABLE ACCESS FULL | HARD_PARSE_TABLE | | 3 | BUFFER SORT | | | 4 | TABLE ACCESS BY INDEX ROWID| VVP_HARD_PARSE_TEST | | 5 | INDEX RANGE SCAN | IND_VVP_HARD_PARSE_TEST_C2 | -------------------------------------------------------------------- Peeked Binds (IDENTIFIED BY position): -------------------------------------- 1 - :B1 (NUMBER): 1 -- SQl> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id => '8cju3tfjvwm1p', cursor_child_no => 0, format => 'basic +peeked_binds')); SELECT /*+query_hp1*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1 PLAN hash VALUE: 3246475190 -------------------------------------------------------------------- | Id | Operation | Name | -------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | 1 | MERGE JOIN CARTESIAN | | | 2 | TABLE ACCESS FULL | HARD_PARSE_TABLE | | 3 | BUFFER SORT | | | 4 | TABLE ACCESS BY INDEX ROWID| VVP_HARD_PARSE_TEST | | 5 | INDEX RANGE SCAN | IND_VVP_HARD_PARSE_TEST_C2 | -------------------------------------------------------------------- Peeked Binds (IDENTIFIED BY position): -------------------------------------- 1 - :B1 (NUMBER): 1 -- SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id => '8cju3tfjvwm1p', cursor_child_no => 1, format => 'basic +peeked_binds')); SELECT /*+query_hp1*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1 PLAN hash VALUE: 1136240498 ---------------------------------------------------- | Id | Operation | Name | ---------------------------------------------------- | 0 | SELECT STATEMENT | | | 1 | MERGE JOIN CARTESIAN| | | 2 | TABLE ACCESS FULL | HARD_PARSE_TABLE | | 3 | BUFFER SORT | | | 4 | TABLE ACCESS FULL | VVP_HARD_PARSE_TEST | ---------------------------------------------------- Peeked Binds (IDENTIFIED BY position): -------------------------------------- 1 - :B1 (NUMBER): 1000000

Looks good! Each query is executed twice, with different child cursors and different plans. For the parameter C2 = 1000000, we see the FULL TABLE SCAN in both queries, and for the parameter C1 = 1, we always see INDEX RANGE SCAN.

At the end I cite a case with rainy Fridays:

“It turns out that every weekend on Sunday there was a cold backup , so that all plans for requests were regenerated on the first execution on Monday morning. One of the employees usually started his work before the others, and his query plan was well implemented for other users during the week. However, if it was raining, this user was late for the start of the working day due to problems with his morning route. Then the batch calculation of reports was first launched, but the query plan was completely bad for other cases due to inappropriate values of the associated variables. ”

And a few useful system views:
• dba_tab_histograms, all_tab_histograms, user_tab_histograms
• v$vpd_policy
• v$sql_bind_capture
• dba_hist_sqlbind

Source: https://habr.com/ru/post/434444/

All Articles

The problem with bound variables: how to turn the optimizer from enemy to friend

More articles: