
DBMS_AUTO_PARTITION: Automatic Partitioning in the Autonomous Database

In Autonomous, Cloud, Databases, DBA on September 26, 2022 at 06:33

Choosing a partitioning strategy is often far from straightforward. Range and interval partitioning are useful for organizing data along ranges of values, especially date and time data. Hash partitioning is useful for randomly distributing data across partitions based on a hashing algorithm, rather than grouping similar data. I remember how we managed to boost performance in a huge Siebel database by hash partitioning the big indexes. List partitioning is useful to explicitly map rows to partitions based on a set of discrete values – for example, the states in the USA.
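
For comparison, here is a minimal sketch of what these strategies look like when declared manually (table and column names are made up for illustration):

-- Interval (range) partitioning on a date column: new monthly
-- partitions are created automatically as data arrives
CREATE TABLE orders (
  order_id   NUMBER,
  order_date DATE
)
PARTITION BY RANGE (order_date)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(PARTITION p0 VALUES LESS THAN (DATE '2022-01-01'));

-- List partitioning on discrete values, e.g. US states
CREATE TABLE customers (
  customer_id NUMBER,
  state       VARCHAR2(2)
)
PARTITION BY LIST (state)
(PARTITION p_east  VALUES ('NY', 'NJ'),
 PARTITION p_west  VALUES ('CA', 'WA'),
 PARTITION p_other VALUES (DEFAULT));

-- Hash partitioning to spread rows evenly across 8 partitions
CREATE TABLE sales_hash (
  sale_id NUMBER,
  cust_id NUMBER
)
PARTITION BY HASH (cust_id) PARTITIONS 8;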

But this is a tedious process, and whether you are a DBA or a developer, functional knowledge of the application and of the SQL using the tables and indexes is of paramount importance.

When using the Oracle Autonomous Database, we would rather have things automated – that goes without saying. So, ADB comes with a handy package called DBMS_AUTO_PARTITION.

First, we need to configure auto partitioning with the mode we need. We set it to IMPLEMENT (see the example below), but if you want to enable recommendations without implementing them automatically, then use REPORT_ONLY instead (which is the default value).

You can always disable automatic partitioning with OFF.
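
The configuration calls look roughly like this (a sketch based on the CONFIGURE procedure of DBMS_AUTO_PARTITION):

-- Generate and automatically implement partitioning recommendations
EXEC DBMS_AUTO_PARTITION.CONFIGURE('AUTO_PARTITION_MODE', 'IMPLEMENT');

-- Turn automatic partitioning off again
EXEC DBMS_AUTO_PARTITION.CONFIGURE('AUTO_PARTITION_MODE', 'OFF');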

Automatic partitioning uses a single-column partition key combined with single-level partitioning. Automatic partitioning does not support more complex partitioning strategies such as multi-column partitioned tables or composite partitioning.

Automatic partitioning chooses from the following 3 partition methods:

  • AUTOMATIC INTERVAL: This choice is best suited for ranges of partition key values
  • LIST AUTOMATIC: This partitioning method applies to distinct partition key values
  • HASH: Applies partitioning on the partition key’s hash values

Next, you can use the AUTO_PARTITION_SCHEMA and AUTO_PARTITION_TABLE settings to specify the schemas and tables considered (or not considered) for automatic partitioning. Note that when automatic partitioning is enabled, all tables in user-managed schemas are considered for automatic partitioning as long as both the inclusion and exclusion lists are empty.

Here is an example where I prefer to manually decide on the partitioning strategy in the JULIAN schema – so I want to exclude it from auto partitioning:
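
A sketch of that call (the third argument of CONFIGURE adds the value to the inclusion list when TRUE and to the exclusion list when FALSE):

-- Exclude the schema JULIAN from automatic partitioning
EXEC DBMS_AUTO_PARTITION.CONFIGURE('AUTO_PARTITION_SCHEMA', 'JULIAN', FALSE);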

If I wanted to do the opposite, that is, add only the schema JULIAN to the auto partitioning inclusion list, I would replace FALSE with TRUE in the call above. As soon as the inclusion list is no longer empty, only schemas in the inclusion list are considered.

The analysis and verification of automatic partitioning using RECOMMEND_PARTITION_METHOD is potentially a resource-intensive and long-running operation, so avoid doing it in a production database. I would rather run it against a cloned database. The recommendations can then be implemented with the APPLY_RECOMMENDATION procedure, as sketched in the example below.

It is good to know that the Autonomous Database automatically collects workload information over time in an internally managed SQL workload repository maintained in the SQL Tuning Set (SYS_AUTO_STS). Recommendations of automatic partitioning generated by the RECOMMEND_PARTITION_METHOD function have a time limit, specified by the TIME_LIMIT parameter, with a default of 1 day. If you are analyzing a large system with many candidate tables, a single invocation may not generate a recommendation for all tables and you will have to invoke the procedure repeatedly to generate recommendations for additional tables. 
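
A typical invocation could look like this (a sketch following the package documentation – the function returns a RAW recommendation ID, and JULIAN.SALES is just an illustration; verify the exact signature in your ADB version):

DECLARE
  l_rec_id RAW(16);
BEGIN
  -- Analyze the collected workload for this table, capped at
  -- 30 minutes instead of the default time limit of 1 day
  l_rec_id := DBMS_AUTO_PARTITION.RECOMMEND_PARTITION_METHOD(
                table_owner => 'JULIAN',
                table_name  => 'SALES',
                time_limit  => INTERVAL '30' MINUTE);

  -- Implement the recommendation (needed when running in REPORT_ONLY mode)
  DBMS_AUTO_PARTITION.APPLY_RECOMMENDATION(l_rec_id);
END;
/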

Here are the data dictionary views for automatic partitioning: DBA_AUTO_PARTITION_CONFIG and DBA_AUTO_PARTITION_RECOMMENDATIONS.
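
Both can be queried directly, for example:

-- Current configuration parameter values
SELECT * FROM DBA_AUTO_PARTITION_CONFIG;

-- Generated recommendations, their partition methods and status
SELECT * FROM DBA_AUTO_PARTITION_RECOMMENDATIONS;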

It is important to understand that unlike automatic indexing, automatic partitioning does not run periodically as a background task. Automatic partitioning only runs when you invoke it by using the DBMS_AUTO_PARTITION.RECOMMEND_PARTITION_METHOD function.

Not all tables are candidates for auto partitioning. The function VALIDATE_CANDIDATE_TABLE checks if a given table is a valid candidate for auto partitioning in Autonomous Database. There are several conditions for a table to be a valid candidate; here are a few:

  • Table passes inclusion and exclusion tests specified by AUTO_PARTITION_SCHEMA and AUTO_PARTITION_TABLE configuration parameters
  • Table has up-to-date statistics
  • Table is at least 64 GB
  • Table has 5 or more queries in the SQL tuning set that scanned the table
  • Table does not contain a LONG data type column

As you can see below, my SALES table is not a candidate as the statistics are stale:
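
The check can be run with a simple query (a sketch, assuming the documented TABLE_OWNER and TABLE_NAME parameters; the exact message text may vary):

-- Returns VALID for a good candidate, otherwise the reason for rejection
SELECT DBMS_AUTO_PARTITION.VALIDATE_CANDIDATE_TABLE(
         table_owner => 'JULIAN',
         table_name  => 'SALES')
FROM   dual;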

Last, here are a few underscore/hidden parameters related to auto partitioning (from a 21c database):

_autoptn_costing: DBMS_AUTO_PARTITION is compiling query for cost estimates
_autoptn_flags: DBMS_AUTO_PARTITION flags
_autoptn_translate_table_name: DBMS_AUTO_PARTITION table name translation
_autoptn_workload_id: DBMS_AUTO_PARTITION workload id
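
If you want to list them yourself, a query along these lines works when connected as SYS (the x$ tables are not exposed to normal users):

-- List the _autoptn% underscore parameters with values and descriptions
SELECT i.ksppinm  AS parameter,
       v.ksppstvl AS value,
       i.ksppdesc AS description
FROM   x$ksppi i, x$ksppcv v
WHERE  i.indx = v.indx
AND    i.ksppinm LIKE '\_autoptn%' ESCAPE '\';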

The package itself, though, is missing in 21c:

SQL> desc DBMS_AUTO_PARTITION
ERROR:
ORA-04043: object DBMS_AUTO_PARTITION does not exist

If you are interested in auto partitioning the application as well, then I would recommend a rather scientific paper called Automatic Partitioning of Database Applications. But it is built on the assumption that stored procedures have several disadvantages, and not everyone would agree with that.

How to view the version history of an Oracle database?

In DBA, Oracle database on September 9, 2022 at 10:57

Often an Oracle database, even one now on say 19c, was initially created as 10g and upgraded with scripts over the years. From v$database and v$instance, we can find out the current version, the platform name and when the database was created – but how do we view the version history of that database? Often, simple SQL statements can extract the important data we need; we just have to know which table or view to query.

Two important views, which actually came with Oracle 10g, can shed some light on the question above: DBA_HIGH_WATER_MARK_STATISTICS and DBA_REGISTRY_HISTORY.

Let us see what information they provide:

SELECT * from DBA_REGISTRY_HISTORY where version is not null order by 4 desc;

I said above “shed some light” and not “answer the question”, as the view came only with 10gR2. So we can see the history from 10.2.0.3 until now (19.13), but whether the database was created initially as 10g, 9i or even as v7 remains an open question.

Note that since 12.1.0.1, Oracle uses DBA_REGISTRY_SQLPATCH instead of DBA_REGISTRY_HISTORY to track PSUs and BPs applied to the database. Check the post by Mike Dietrich called DBA_REGISTRY_HISTORY vs DBA_REGISTRY_SQLPATCH.
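
To see the patch history there, a query along these lines does the job (the column names are as of 19c; older releases differ slightly):

SELECT patch_id, action, status, action_time, description
FROM   dba_registry_sqlpatch
ORDER  BY action_time;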

The view DBA_HIGH_WATER_MARK_STATISTICS is based on the table WRI$_DBU_HIGH_WATER_MARK:

create table WRI$_DBU_HIGH_WATER_MARK
(name                 varchar2(64)  not null,
 dbid                 number        not null,
 version              varchar2(17)  not null,
 highwater            number,
 last_value           number,
 error_count          number,
 constraint WRI$_DBU_HIGH_WATER_MARK_PK primary key
    (name, dbid, version)
 using index tablespace SYSAUX
) tablespace SYSAUX
/

Next, we run the following query, which shows us the database version along with the database size:

SELECT * from DBA_HIGH_WATER_MARK_STATISTICS where name = 'DB_SIZE' order by 3 desc;

From DBA_HIGH_WATER_MARK_STATISTICS, we can view several other historical statistics about the database: number of user tables, size of the largest segment, maximum number of partitions belonging to a user table, maximum number of partitions belonging to a user index, number of user indexes, maximum number of concurrent sessions seen in the database, maximum number of datafiles, maximum number of tablespaces, maximum number of CPUs and maximum query length.

If the high-water mark statistics are not populated, then manually execute DBMS_FEATURE_USAGE_INTERNAL.SAMPLE_ONE_HWM. The internal package looks like this:

PROCEDURE CLEANUP_DATABASE
Argument Name                  Type                    In/Out Default?
------------------------------ ----------------------- ------ --------
CLEANUP_LOCAL                  PL/SQL BOOLEAN          IN     DEFAULT 

PROCEDURE EXEC_DB_USAGE_SAMPLING
Argument Name                  Type                    In/Out Default?
------------------------------ ----------------------- ------ --------
CURR_DATE                      DATE                    IN             

PROCEDURE SAMPLE_ONE_FEATURE
Argument Name                  Type                    In/Out Default?
------------------------------ ----------------------- ------ --------
FEAT_NAME                      VARCHAR2                IN             

PROCEDURE SAMPLE_ONE_HWM
Argument Name                  Type                    In/Out Default?
------------------------------ ----------------------- ------ --------
HWM_NAME                       VARCHAR2                IN             

If you would like to update the HWM statistics manually – they are gathered only once a week – here is an example:

SQL> SELECT name, highwater, last_value FROM dba_high_water_mark_statistics WHERE name = 'USER_TABLES';

NAME                            HIGHWATER LAST_VALUE
------------------------------ ---------- ----------
USER_TABLES                           533        533

SQL> CREATE TABLE JMD (c1 json);

Table created.

SQL> exec dbms_feature_usage_internal.sample_one_hwm('USER_TABLES');

PL/SQL procedure successfully completed.

SQL> SELECT name, highwater, last_value FROM dba_high_water_mark_statistics WHERE name = 'USER_TABLES';

NAME                            HIGHWATER LAST_VALUE
------------------------------ ---------- ----------
USER_TABLES                           534        534

As a post scriptum, I can say – without ever being able to prove it scientifically – that a better performing database is one created fresh, not upgraded with scripts. I would always advise creating a new database and transferring the data and all objects from the previous version, rather than just upgrading the database (that is, the data dictionary) with scripts. But with current database sizes and limited downtime, this is getting more and more difficult to achieve.