Good day! We are a team of system analysts in one of Rostelecom's data management divisions. The company has more than 300 heterogeneous data sources; such diversity is necessary to support Rostelecom's work across all its numerous business areas. We study these sources and, where necessary, load parts of their data into the warehouse.
*The Curiosity rover on the surface of Mars: it, too, has many heterogeneous data sources. Picture borrowed from therahnuma.com.*

In this process, two subtasks stand out: choosing a strategy for collecting data from source tables depending on their properties, and preparing the "receiver" tables of the data warehouse. For this we use various GUI and reverse-engineering tools. In addition, while collecting this information, a system analyst gradually accumulates a pool of auxiliary queries against the DBMS dictionary tables (mainly Oracle). In this article I will share the "gentleman's set" of such scripts used by our team.
First, a few notes that apply to all of the scripts below:
- Many scripts use xmlagg to aggregate strings, because listagg cannot handle the overly long strings that result from concatenation (see the sketch after this list).
- In all scripts except "Procedures, functions and packages", the target tables are specified via the filter table in the "with" block, where the schema name and the table names are filled in.
- Each script is accompanied by one or more usage scenarios, a specification of the result set, and a list of the system tables used (so you can assess whether the script can run on a specific database).
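As a minimal sketch of that limitation, assuming a hypothetical table t with a column col_name: on long results listagg raises ORA-01489, while the xmlagg pattern used throughout these scripts returns a CLOB.

```sql
-- Aggregate column values into one comma-separated CLOB via xmlagg;
-- the same query with listagg would fail once the result exceeds 4000 bytes
select rtrim(
         xmlagg(
           xmlelement(e, col_name, ', ').extract('//text()')
           order by col_name
         ).getclobval()
       , ', '
       ) as col_list
  from t;
```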
Script "Information about the tables"
Specification:

| Column name | Comment |
| --- | --- |
| SCHEMA_NAME | Schema name (OWNER) |
| TABLE_NAME | Table name |
| COMMENTS | Table comment |
| HEIGHT | Number of rows in the table (approximate) |
| WIDTH | Number of columns |
| DATETIME_COLUMNS | Columns with date/time data types, plus columns whose names suggest they are timestamps (patterns %period%, %date%, %time%) |
| AVG_ROW_LEN | Average row length in bytes |
| PART_KEY | Columns by which the table is partitioned |
| SUBPART_KEY | Columns by which the table is subpartitioned |
System tables used: all_tab_columns, all_tab_comments, all_tab_statistics, all_part_key_columns, all_subpart_key_columns.
The query is useful for choosing a strategy for extracting data from the source system. If the table in question has a primary key, you can organize the extraction and then isolate the "increment" on that key. If there is a timestamp, for example in technical fields holding information about row insertion or update, you can extract only the records changed or added over a period. Information about the partition structure can be useful when creating a similar "receiver" table.
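For illustration, a timestamp-based incremental extraction could look like this minimal sketch, where src_schema.src_table and the technical column update_dt are hypothetical names:

```sql
-- Hypothetical incremental extraction: take only rows inserted or updated
-- since the last successful load (:last_load_dt is kept by the ETL process)
select *
  from src_schema.src_table
 where update_dt >= :last_load_dt;
```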
Query body:

```sql
with filter (owner, table_name) as (
  -- list specific tables and/or whole schemas to inspect
  select 'SCHEMA_NAME_1', t.*
    from table(
           sys.odcivarchar2list(
             'TABLE_NAME_1'
           , 'TABLE_NAME_2'
           )
         ) t
  union all
  select owner
       , table_name
    from all_tables
   where owner = 'SCHEMA_NAME_2'
)
select a.owner as schema_name
     , a.table_name
     , e.comments
     , b.height
     , c.width
     , d.datetime_columns
     , b.avg_row_len
     , p.part_key
     , s.subpart_key
  from filter a
  -- approximate row count and average row length from optimizer statistics
  left join (
        select owner
             , table_name
             , num_rows as height
             , avg_row_len
          from all_tab_statistics
         where object_type = 'TABLE'
       ) b
    on a.table_name = b.table_name
   and a.owner = b.owner
  left join (
        select owner
             , table_name
             , count(1) as width
          from all_tab_columns
         group by owner
             , table_name
       ) c
    on a.table_name = c.table_name
   and a.owner = c.owner
  -- date/time columns plus columns that look like timestamps by name
  left join (
        select owner
             , table_name
             , listagg(
                 column_name || ' (' || data_type || ')'
               , ', '
               ) within group (order by column_id) as datetime_columns
          from all_tab_columns
         where data_type = 'DATE'
            or data_type like 'TIMESTAMP%'
            or data_type like 'INTERVAL%'
            or lower(column_name) like '%period%'
            or lower(column_name) like '%date%'
            or lower(column_name) like '%time%'
         group by owner
             , table_name
       ) d
    on a.table_name = d.table_name
   and a.owner = d.owner
  left join (
        select owner
             , table_name
             , comments
          from all_tab_comments
         where table_type = 'TABLE'
       ) e
    on a.table_name = e.table_name
   and a.owner = e.owner
  left join (
        select owner
             , name as table_name
             , listagg(column_name, ', ') within group (order by column_position) as part_key
          from all_part_key_columns
         where object_type = 'TABLE'
         group by owner
             , name
       ) p
    on a.owner = p.owner
   and a.table_name = p.table_name
  left join (
        select owner
             , name as table_name
             , listagg(column_name, ', ') within group (order by column_position) as subpart_key
          from all_subpart_key_columns
         where object_type = 'TABLE'
         group by owner
             , name
       ) s
    on a.owner = s.owner
   and a.table_name = s.table_name
 order by a.owner
     , a.table_name
;
```
Script "Partitions and Subpartitions"
Specification:

| Column name | Comment |
| --- | --- |
| SCHEMA_NAME | Schema name (OWNER) |
| TABLE_NAME | Table name |
| PART_KEY | Columns by which the table is partitioned |
| PARTITION_NAME | Partition name |
| PARTITION_POSITION | Partition number |
| PARTITION_HEIGHT | Number of rows in the partition |
| SUBPART_KEY | Columns by which the table is subpartitioned |
| SUBPARTITION_NAME | Subpartition name |
| SUBPARTITION_POSITION | Subpartition number |
| SUBPARTITION_HEIGHT | Number of rows in the subpartition |
System tables used: all_tab_partitions, all_tab_subpartitions, all_part_key_columns, all_subpart_key_columns.
The script is useful for obtaining the characteristics (name, size) of partitions when they are used directly as data sources, as in the example below.
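For example, once the script has returned partition names, a single partition can be extracted directly; the schema, table, and partition names here are hypothetical:

```sql
-- Hypothetical example: read one partition reported by the script
select *
  from src_schema.src_table partition (P_2019_01);
```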
Query body:

```sql
with filter (owner, table_name) as (
  select 'SCHEMA_NAME_1', t.*
    from table(
           sys.odcivarchar2list(
             'TABLE_NAME_1'
           , 'TABLE_NAME_2'
           )
         ) t
  union all
  select owner
       , table_name
    from all_tables
   where owner = 'SCHEMA_NAME_2'
)
select f.owner as schema_name
     , f.table_name
     , p.part_key
     , pc.partition_name
     , pc.partition_position
     , pc.num_rows as partition_height
     , s.subpart_key
     , sc.subpartition_name
     , sc.subpartition_position
     , sc.num_rows as subpartition_height
  from filter f
  join (
        select owner
             , name as table_name
             , listagg(column_name, ', ') within group (order by column_position) as part_key
          from all_part_key_columns
         where object_type = 'TABLE'
         group by owner
             , name
       ) p
    on f.owner = p.owner
   and f.table_name = p.table_name
  left join all_tab_partitions pc
    on p.table_name = pc.table_name
   and p.owner = pc.table_owner
  left join (
        select owner
             , name as table_name
             , listagg(column_name, ', ') within group (order by column_position) as subpart_key
          from all_subpart_key_columns
         where object_type = 'TABLE'
         group by owner
             , name
       ) s
    on p.owner = s.owner
   and p.table_name = s.table_name
  left join all_tab_subpartitions sc
    on f.owner = sc.table_owner
   and f.table_name = sc.table_name
   and pc.partition_name = sc.partition_name
 order by f.owner
     , f.table_name
;
```
Script "Attribute composition of tables"
Specification:

| Column name | Comment |
| --- | --- |
| SCHEMA_NAME | Schema name (OWNER) |
| TABLE_NAME | Table name |
| COLUMN_ID | Ordinal number of the column |
| COLUMN_NAME | Column name |
| DATA_TYPE | Data type |
| COLUMN_COMMENT | Column comment |
| SCALE | Scale (for the NUMBER data type) |
| PRECISION | Precision (for the NUMBER data type) |
| BYTE_LENGTH | Field length in bytes |
| ENCODING | Character set for text data types (CHAR, VARCHAR2, NCHAR, NVARCHAR2) |
| CHAR_LENGTH | Maximum string length for text data types (CHAR, VARCHAR2, NCHAR, NVARCHAR2) |
| NOT_NULL | "Required" flag |
| IS_PRIMARY | "Included in the primary key" flag |
| DEFAULT_VALUE | Default value |
| COLUMN_IMPACT | List of columns of other tables that reference this column |
| COLUMN_DEPEND | List of columns of other tables that this column references |
System tables used: all_tables, all_constraints, all_cons_columns, all_tab_columns, all_col_comments, v$nls_parameters.
This script is useful when preparing the "receiver" tables in the data warehouse, when you need detailed information about a table, its relationships with other tables, and its full attribute composition. The auxiliary table filter2 controls which tables are scanned for references (both from and to). By default, tables from all schemas except the system ones are taken.
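For instance, to narrow the reference search to particular schemas, filter2 can be rewritten; the schema names below are hypothetical, and the fragment is runnable on its own to preview the resulting scope:

```sql
-- Hypothetical variant of filter2: search for references only in two schemas
with filter2 (owner, table_name) as (
  select owner, table_name
    from all_tables
   where owner in ('BILLING', 'CRM')
)
select * from filter2;
```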
Query body:

```sql
with filter (owner, table_name) as (
  select 'SCHEMA_NAME_1', t.*
    from table(
           sys.odcivarchar2list(
             'TABLE_NAME_1'
           , 'TABLE_NAME_2'
           )
         ) t
  union all
  select owner
       , table_name
    from all_tables
   where owner = 'SCHEMA_NAME_2'
)
-- tables scanned for references (from and to); by default all non-system schemas
, filter2 (owner, table_name) as (
  select owner, table_name
    from all_tables
   where owner not in ('MDSYS', 'CTXSYS', 'SYSTEM', 'XDB', 'SYS')
)
-- all foreign key relations resolved down to column pairs
, refs as (
  select b.constraint_type as from_constraint_type
       , b.constraint_name as from_constraint_name
       , d.position as from_position
       , d.column_name as from_column_name
       , b.table_name as from_table_name
       , b.owner as from_owner
       , a.owner as to_owner
       , a.table_name as to_table_name
       , c.column_name as to_column_name
       , c.position as to_position
       , a.constraint_name as to_constraint_name
       , a.constraint_type as to_constraint_type
    from all_constraints a
    left join all_constraints b
      on a.r_constraint_name = b.constraint_name
     and a.r_owner = b.owner
    left join all_cons_columns c
      on a.constraint_name = c.constraint_name
     and a.table_name = c.table_name
     and a.owner = c.owner
    left join all_cons_columns d
      on b.constraint_name = d.constraint_name
     and b.table_name = d.table_name
     and b.owner = d.owner
   where a.constraint_type = 'R'
     and b.constraint_type in ('P', 'U')
     and c.position = d.position
)
, depends as (
  select rtrim(
           xmlagg(
             xmlelement(
               e
             , to_owner || '.' || to_table_name || '.' || to_column_name
             , ', '
             ).extract('//text()')
             order by to_owner
           ).getclobval()
         , ', '
         ) as val
       , from_owner as owner
       , from_table_name as table_name
       , from_column_name as column_name
    from refs
   where (to_owner, to_table_name) in (select * from filter2)
   group by from_table_name
       , from_column_name
       , from_owner
)
, impacts as (
  select rtrim(
           xmlagg(
             xmlelement(
               e
             , from_owner || '.' || from_table_name || '.' || from_column_name
             , ', '
             ).extract('//text()')
             order by from_owner
           ).getclobval()
         , ', '
         ) as val
       , to_owner as owner
       , to_table_name as table_name
       , to_column_name as column_name
    from refs
   where (from_owner, from_table_name) in (select * from filter2)
   group by to_table_name
       , to_column_name
       , to_owner
)
select f.owner as schema_name
     , f.table_name
     , a.column_id
     , a.column_name
     , a.data_type
     , b.comments as column_comment
     , decode(a.data_type, 'NUMBER', nvl(a.data_scale, 0), '') as scale
     , decode(a.data_type, 'NUMBER', nvl(a.data_precision, 38), '') as precision
     , a.data_length as byte_length
     , case
         when a.data_type in ('CHAR', 'VARCHAR2', 'NCHAR', 'NVARCHAR2')
         then d.value
       end as encoding
     , case
         when a.data_type in ('CHAR', 'VARCHAR2', 'NCHAR', 'NVARCHAR2')
         then a.char_length
       end as char_length
       /* The source text of the query breaks off at this point; the remainder
          below is reconstructed from the specification above, so verify it
          against your environment before use. */
     , decode(a.nullable, 'N', 'Y', 'N') as not_null
     , case when pk.column_name is not null then 'Y' else 'N' end as is_primary
     , a.data_default as default_value
     , im.val as column_impact
     , de.val as column_depend
  from filter f
  join all_tab_columns a
    on f.owner = a.owner
   and f.table_name = a.table_name
  left join all_col_comments b
    on a.owner = b.owner
   and a.table_name = b.table_name
   and a.column_name = b.column_name
  cross join (
        select value
          from v$nls_parameters
         where parameter = 'NLS_CHARACTERSET'
       ) d
  left join (
        -- primary key columns
        select c.owner, c.table_name, c.column_name
          from all_constraints k
          join all_cons_columns c
            on k.owner = c.owner
           and k.constraint_name = c.constraint_name
         where k.constraint_type = 'P'
       ) pk
    on a.owner = pk.owner
   and a.table_name = pk.table_name
   and a.column_name = pk.column_name
  left join impacts im
    on a.owner = im.owner
   and a.table_name = im.table_name
   and a.column_name = im.column_name
  left join depends de
    on a.owner = de.owner
   and a.table_name = de.table_name
   and a.column_name = de.column_name
 order by f.owner
     , f.table_name
     , a.column_id
;
```
Script "Procedures, functions and packages"
Specification:

| Column name | Comment |
| --- | --- |
| SCHEMA_NAME | Schema name (OWNER) |
| NAME | Name of the procedure / function / package / package body |
| BODY | Source code of the object |
| TYPE | Object type (PACKAGE BODY, PACKAGE, FUNCTION, PROCEDURE) |
| WRAPPED | Flag indicating whether the body is obfuscated (wrapped) |
System tables used: all_source.
When analyzing a source system, the task of studying the data flows circulating in it may arise. That is almost impossible without the code base of packages, functions, and procedures, especially when documentation is incomplete or missing. For convenience, this script presents the listed objects as a table. Using a console utility, the query result can be written to the output stream and redirected by a simple handler (a bash script) into files for further study in your favorite editor. Moreover, various handlers can be attached to the output stream: beautifiers, unwrappers, and so on.
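As a sketch of that workflow (the file names and the saved query name are assumptions), the query can be run and spooled from SQL*Plus:

```sql
-- Run in SQL*Plus; the settings let CLOB bodies print without truncation
set long 100000000 longchunksize 32767 pagesize 0 linesize 32767 trimspool on
spool all_sources.txt
@sources.sql  -- the "Procedures, functions and packages" query saved to a file
spool off
```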
Query body:

```sql
select t.owner as schema_name
     , t.name as name
     , xmlagg(
         xmlelement(e, t.text, '').extract('//text()')
         order by t.line asc
       ).getclobval() as body
     , f.wrapped
     , t.type as type
  from (
        -- the first source line of a wrapped object contains the keyword "wrapped"
        select owner, name, type
             , case when lower(text) like '%wrapped%' then 1 else 0 end as wrapped
          from all_source
         where type in (
                 'PACKAGE BODY'
               , 'PACKAGE'
               , 'FUNCTION'
               , 'PROCEDURE'
               )
           and line = 1
           and owner not in ('MDSYS', 'CTXSYS', 'SYSTEM', 'XDB', 'SYS')
       ) f
  join all_source t
    on f.owner = t.owner
   and f.name = t.name
   and f.type = t.type
 group by t.owner
     , t.name
     , t.type
     , f.wrapped
 order by t.owner
     , t.name
     , t.type
;
```
Conclusion
The scripts described above free our system analysts from many routine actions in collecting information about a database and let them focus on more creative things, such as the loading strategy and the structure of the "receiver" tables. I hope the scripts will be useful to you as well. It would be interesting to learn how you automate these and similar tasks.