
Sharding Advisor and Federated Sharding in Oracle Database 20c

In DBA, New features, Oracle database on August 3, 2020 at 05:58

“Shards of glass can cut and wound or magnify a vision.” – Terry Tempest Williams

It is the same with databases: a single shared database can be a performance bottleneck, while a sharded database can magnify your IT vision and strategy.

Oracle RAC and Oracle Data Guard meet more than 99% of the scalability and availability requirements of business applications, but sometimes infinite availability is also required.

Sharding is an architectural concept which aims at high scalability and absolute availability by splitting the database into a collection of independent physical databases. But this is from the perspective of the DBA. From the perspective of an application, a sharded database looks like a single database; the number of shards, and the distribution of data across those shards, are completely transparent to the application.

Sharding was introduced with Oracle 12c and MOS Doc ID 2226341.1 is the master note for Oracle Sharding. What is important to know is that you can shard database tables by consistent hash (system-managed sharding), by range or list (user-defined sharding), or a combination (composite sharding).
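As a minimal sketch of system-managed sharding (all names here are hypothetical), the parent table of a table family is partitioned by consistent hash, and a child table joins the family through a single foreign key with reference partitioning:

-- parent table of the family, sharded by consistent hash
CREATE SHARDED TABLE customers
( cust_id  NUMBER NOT NULL,
  name     VARCHAR2(50),
  CONSTRAINT customers_pk PRIMARY KEY (cust_id)
)
PARTITION BY CONSISTENT HASH (cust_id)
PARTITIONS AUTO
TABLESPACE SET ts1;

-- child table, linked to the parent by one foreign key
CREATE SHARDED TABLE orders
( order_id NUMBER NOT NULL,
  cust_id  NUMBER NOT NULL,
  CONSTRAINT orders_pk PRIMARY KEY (cust_id, order_id),
  CONSTRAINT orders_fk FOREIGN KEY (cust_id) REFERENCES customers (cust_id)
)
PARTITION BY REFERENCE (orders_fk);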

In Oracle 20c, Oracle introduced 2 new advisors: the Oracle Autonomous Database Advisor and the Oracle Sharding Advisor. The Oracle Sharding Reference is a good page to bookmark!

Sharding Advisor is a standalone command-line tool that helps you redesign a database schema so that you can efficiently migrate an existing non-sharded Oracle Database to an Oracle sharding environment. Sharding Advisor analyzes your existing database schema and produces a ranked list of possible sharded database designs.

The Sharding Advisor is an OS command-line tool called gwsadv, and by default the workload is captured from V$SQL_PLAN_STATISTICS_ALL.

Here is how you invoke/run the Oracle Sharding Advisor:

 

[oracle@julian ~]$ gwsadv -n 138.121.142.24:1521 -s novopdb1.laika.oraclevcn.com -u julian -p abc -c -w

********** WELCOME TO THE SHARDING ADVISOR **********

Sharding Advisor: Release 20.0 - Development on Sun Jul 26 2020 07:16:07
Copyright (c) 1982, 2018, Oracle and/or its affiliates. All rights reserved.
  

Altogether there are 10 possible parameters/flags for the tool, of which only 3 are mandatory: the username (-u), the password (-p), and -w, which directs Sharding Advisor to use the query workload for sharding configuration generation and ranking. The -c parameter is required only on a first run or on a changed workload. For the remaining 6 non-mandatory flags, check the Sharding Advisor Usage and Options.

Note that the sch flag specifies the list of schemas for the Sharding Advisor if you want to run it as a different user that you create specifically for the Sharding Advisor. That user needs SELECT privileges on GV$SQL_PLAN and GV$SQL_PLAN_STATISTICS_ALL, and on the DBA_HIST_SQL_PLAN, DBA_HIST_SQLSTAT, and DBA_HIST_SNAPSHOT tables. The user does not need any other special privileges.
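As a sketch, assuming a dedicated user called SHARDADV created only for running the advisor, the minimal setup could look like this:

-- hypothetical user created only for running gwsadv
CREATE USER shardadv IDENTIFIED BY "Str0ng#Passw0rd";
GRANT CREATE SESSION TO shardadv;
-- SELECT on the GV$ views is granted via their GV_$ base objects
GRANT SELECT ON gv_$sql_plan TO shardadv;
GRANT SELECT ON gv_$sql_plan_statistics_all TO shardadv;
GRANT SELECT ON dba_hist_sql_plan TO shardadv;
GRANT SELECT ON dba_hist_sqlstat TO shardadv;
GRANT SELECT ON dba_hist_snapshot TO shardadv;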

Once you start the Sharding Advisor with the command above, you will be asked some questions that help in the sharded database design, the first one being how many shards you are planning to have:

 
** The number of shards in a sharded configuration is an important parameter **
** that can influence scalability, fault isolation, and availability. **

ENTER THE ESTIMATED NUMBER OF PRIMARY SHARDS FOR YOUR CURRENT CONFIGURATION (1-1000): 4

*** CAPTURING QUERY WORKLOAD PREDICATES ... ***

*** ANALYZING USER SCHEMA ... ***
  

Usually, a database schema has some tables with multiple foreign key constraints. Sharding has a limitation that a sharded table can only have one foreign key constraint that links it to another table in the same table family. As a result, for every table that has more than one foreign key constraint, a single constraint must be chosen to be enforced in the sharded schema. You will be asked questions like this:

 
** The 'JULIAN.ORDERS' table has multiple foreign key constraints as shown below. **

** 1. 'JULIAN.ORDERS' is related to primary table 'JULIAN.ORDER_TYPE' by foreign key 'ORDER_TYPE_FK' with a fan-out of '1.8' **
** 2. 'JULIAN.ORDERS' is related to primary table 'JULIAN.CLIENTS' by foreign key 'CLIENTS_FK1' with a fan-out of '1.3' **

ENTER WHICH OF THESE CONSTRAINTS YOU WANT TO CONSIDER FOR TABLE FAMILY ENUMERATION
(AS COMMA-SEPARATED RANGES OR ENTER '0' TO CHOOSE NONE): 1
   

Note: fan-out, in any tree, is the number of pointers to child nodes in a node. In Oracle, in this case, fan-out is the ratio of the number of rows in the referencing table to that of the referenced table (the parent/primary table).
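In the dialog above, for instance, the fan-out of 1.3 means that JULIAN.ORDERS has roughly 1.3 times as many rows as JULIAN.CLIENTS, which you can verify yourself:

-- ratio of rows in the referencing table to rows in the referenced table
SELECT (SELECT COUNT(*) FROM julian.orders) /
       (SELECT COUNT(*) FROM julian.clients) AS fan_out
FROM dual;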

To review the Sharding Advisor recommendations, use the following 6 output tables, of which only the first 3 are documented:

– SHARDINGADVISOR_CONFIGURATIONS
– SHARDINGADVISOR_CONFIGDETAILS
– SHARDINGADVISOR_QUERYTYPES
– SHARDINGADVISOR_PREDS
– SHARDINGADVISOR_ECPREDS
– SHARDINGADVISOR_IMPORTANT_TABS

Check also the following Sharding Advisor Output Review SQL Examples.
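For a first look at the ranked designs, a query along these lines should do (a sketch; see the documentation for the full column lists):

-- list the generated sharding configurations, best-ranked first
SELECT *
FROM shardingadvisor_configurations
ORDER BY rank;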

There are a few restrictions to the Sharding Advisor:

– The source database must be Oracle Database 10g or later
– The Sharding Advisor discovers the table families based on primary key-foreign key relationships, so if the schema does not have any primary key-foreign key constraints, sharding by PARENT clause is recommended
– Sharding Advisor recommends only single-table family, system-managed sharding (sharding by reference) configurations if the source database has foreign key constraints
– Sharding Advisor recommends sharding using the PARENT clause if the source database does not have foreign key constraints

Oracle has declarative rapid deployment automation for Sharding available in the OCI Marketplace as well as on-premises with Terraform and Kubernetes. Check the Oracle blog article Sharding Oracle Database Cloud Service by Shailesh Dwivedi.

===

In Oracle 20c, you can also create a Sharded Database from multiple existing databases – this is called Federated Sharding.

It is now possible to convert a set of existing, independent physical databases running the same application into a sharded database without modifying the database schemas or the application.

Oracle Sharding, in a federated sharding configuration, treats each independent physical database as a shard, and thus you can run multi-shard queries on all the shards.
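For example, connected to the shard catalog acting as the multi-shard query coordinator, a cross-shard aggregation is just ordinary SQL (the table and column names here are hypothetical):

-- fanned out transparently to every shard of the federated configuration
SELECT region, COUNT(*) AS order_cnt
FROM orders
GROUP BY region;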

You can create a federated sharding configuration with minor version mismatches between the shards. For example, the “North America” region could be on Oracle 20.2 and the “Europe” region could be on Oracle 20.3. But all database shards and the shard catalog must be on Oracle Database 20c or later. The databases must have the same schemas and only minor differences are supported. For example, a table can have an extra column or a missing column in one of the databases.

Often there are application upgrades, which means that new tables, new columns, or new constraints are added, or column data types are modified. When part of an overall Federated Sharding configuration, Oracle Sharding handles the schema differences caused by an application upgrade, as long as the overall schema structure stays the same.

To create a Federated Sharding environment using existing databases, you define the database layout just as you would for user-defined sharding, using GDSCTL commands. You need to execute these 7 steps in order to deploy a Federated Sharding configuration.
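A fragment of such a layout definition might look like this (a sketch with hypothetical names, not the full 7-step sequence):

GDSCTL> add shardspace -shardspace north_america
GDSCTL> add shardspace -shardspace europe
GDSCTL> add shard -connect db_na_host:1521/napdb -shardspace north_america
GDSCTL> add shard -connect db_eu_host:1521/eupdb -shardspace europe
GDSCTL> deploy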

Just as an example, here is how you convert tables to duplicated or sharded tables:

 
ALTER TABLE catalog EXTERNALLY DUPLICATED;
ALTER TABLE orders EXTERNALLY SHARDED;

These are the limitations on Federated Sharding:

– There is no concept of chunk in a federated sharding configuration, so the GDSCTL MOVE CHUNK command is not supported
– Application sharding key-based routing is not supported
– The existing databases, before being added to a federated sharding configuration, must be upgraded to Oracle Database 20c or later

If you managed to bear reading until this last paragraph, please check also Partitioning vs. Federation vs. Sharding. Read its last paragraph, too…

PDB Point-in-Time Recovery and Flashback in Oracle 20c

In DBA, New features, Oracle database, RMAN on July 10, 2020 at 14:41

The most significant point about the Oracle 20c database architecture is that non-CDB Oracle Database upgrades to non-CDB architecture are desupported. This means that in 20c you need a container database, and your data will reside within a pluggable database.

But then how about if you need to restore one PDB to any time in the recent past?

In Oracle Database 20c, flashback and PITR (point-in-time recovery) are supported when recovering PDBs to ancestor or orphan PDB incarnations. These operations were not possible in 19c and below. Just as a reminder: in Oracle 12.1, flashback database operations were possible only at the root container level and thus affected all PDBs under the root container. Oracle 12.2 started supporting flashback of a PDB.
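For PITR, here is a sketch of what a PDB point-in-time recovery with RMAN could look like (the SCN and the auxiliary destination are hypothetical):

RMAN> ALTER PLUGGABLE DATABASE novopdb2 CLOSE;
RMAN> RUN {
  SET UNTIL SCN 2834057;
  RESTORE PLUGGABLE DATABASE novopdb2;
  RECOVER PLUGGABLE DATABASE novopdb2 AUXILIARY DESTINATION '/u01/app/oracle/aux';
}
RMAN> ALTER PLUGGABLE DATABASE novopdb2 OPEN RESETLOGS;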

There is one restriction though in 20c: you cannot perform a PDB flashback or PITR operation to a PDB incarnation within an orphan CDB incarnation. Restated: you can flash back a PDB to an orphan PDB incarnation only if it is within the same CDB incarnation or in an ancestor CDB incarnation. Also, flashback of a PDB to an orphan incarnation is supported only when the database uses local undo.

Now, this might be confusing. Let me first briefly explain what an ancestor incarnation and an orphan incarnation are.

Database incarnations have the following relationships to each other:

– The current incarnation is the one in which the database is currently operating
– The incarnation from which the current incarnation originated after an OPEN RESETLOGS operation is the parent incarnation of the current incarnation
– The parent of the parent incarnation is an ancestor incarnation and any parent of an ancestor incarnation is also an ancestor of the current incarnation
– A noncurrent incarnation that is not a direct ancestor of the current incarnation is called an orphan incarnation
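You can see these relationships in the V$DATABASE_INCARNATION view, where the STATUS column shows CURRENT, PARENT, or ORPHAN for every incarnation:

-- one row per incarnation of the database
SELECT incarnation#, resetlogs_change#, status, prior_incarnation#
FROM v$database_incarnation
ORDER BY incarnation#;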

During the flashback of the PDB, Oracle modifies only the data files for that PDB. The data in the other PDBs is not impacted. The point in time for the flashback can be one of the following:

– System Change Number
– Specific time in the past
– CDB restore point
– PDB restore point
– PDB clean restore point
– PDB guaranteed restore point
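Besides the restore point syntax used in the example that follows, the flashback target can also be given directly as an SCN or a timestamp (the values here are made up):

FLASHBACK PLUGGABLE DATABASE novopdb2 TO SCN 2834057;
FLASHBACK PLUGGABLE DATABASE novopdb2 TO TIMESTAMP SYSTIMESTAMP - INTERVAL '1' HOUR;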

Here is an example of how flashback to any time in the recent past works in Oracle 20c.

At 2:30pm we lost a table called RDBMS_BRANDS and a materialized zone map called RDBMS_ZMAP from a pluggable database called NOVOPDB2. We have a restore point called rp1_novo_pdb2, created at 8am, well before the “disaster”. So, let us first flash back and verify that we get the 2 objects back:

SQL> select systimestamp from dual;

SYSTIMESTAMP
-------------------------------------------------------------
05-JUL-20 02.35.30.569344 PM +00:00

SQL> SELECT table_name FROM DBA_TABLES where TABLE_NAME like '%RDBMS%';

no rows selected

SQL> ALTER PLUGGABLE DATABASE novopdb2 CLOSE;

Pluggable database altered.

SQL> FLASHBACK PLUGGABLE DATABASE novopdb2 TO RESTORE POINT rp1_novo_pdb2;

Flashback complete.

SQL> ALTER PLUGGABLE DATABASE novopdb2 OPEN RESETLOGS;

Pluggable database altered.

SQL> SELECT table_name FROM DBA_TABLES where TABLE_NAME like '%RDBMS%';

TABLE_NAME
----------------------------------------------------------------------
RDBMS_BRANDS
RDBMS_ZMAP
  

Well, unfortunately, now we notice that slightly before noon, data was loaded into a new table called RDBMS_HISTORY which was not in the PDB at 8am. All SCNs between 8am and the current time are now on an orphan PDB incarnation. We will flash back again using another restore point, rp2_novo_pdb2, created at 12 o’clock.


SQL> ALTER PLUGGABLE DATABASE novopdb2 CLOSE;

Pluggable database altered.

SQL> FLASHBACK PLUGGABLE DATABASE novopdb2 TO RESTORE POINT rp2_novo_pdb2;

Flashback complete.

SQL> ALTER PLUGGABLE DATABASE novopdb2 OPEN RESETLOGS;

Pluggable database altered.

SQL> SELECT table_name FROM DBA_TABLES where TABLE_NAME like '%RDBMS%';

TABLE_NAME
--------------------------------------------------------------------------------
RDBMS_BRANDS
RDBMS_ZMAP
RDBMS_HISTORY

SQL> select systimestamp from dual;

SYSTIMESTAMP
---------------------------------------------------------------------------
05-JUL-20 02.48.30.569344 PM +00:00

So, we managed to flash back to a point a few hours after we opened with RESETLOGS from the previous flashback, but within the same incarnation of the CDB.

DBAs can follow the process of restore and recovery using the V$SESSION_LONGOPS and V$RECOVERY_PROGRESS views, respectively.

For the restore, the V$SESSION_LONGOPS view’s column OPNAME should be ‘Flashback Database’. Just like this:

SELECT sofar, totalwork, units 
FROM v$session_longops 
WHERE opname = 'Flashback Database';

The column SOFAR shows the data currently read in megabytes while the column TOTALWORK shows the total number of megabytes of flashback logs that must be read.
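For the recovery part, V$RECOVERY_PROGRESS can be queried in a similar fashion:

-- one row per item tracked during media recovery
SELECT item, sofar, total, units
FROM v$recovery_progress;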

Restore points are created with the following command:

SQL> CREATE RESTORE POINT rp17 FOR PLUGGABLE DATABASE novopdb2;

Restore point created.

The CLOUD_VERIFY_FUNCTION in Autonomous Database Cloud

In Autonomous, Cloud, DBA, New features, Oracle internals, Security and auditing on July 1, 2020 at 12:58

Choosing a hard-to-guess but easy-to-remember password is by far the easiest of all the hard tasks!

1. Database passwords and their complexity:

According to GDPR personal data must be processed “in a manner that ensures appropriate security of personal data including protection against unauthorized or unlawful processing and against accidental loss, destruction or damage, using appropriate technical or organizational measures.”

But GDPR does not define any requirements about passwords, such as password length, complexity, or how often passwords should be renewed. Regulation (EU) 2016/679 just stipulates that “a high level of protection of personal data” is required.

One way to enforce strong passwords on database users is by using the following rule:

– a minimum of 1 lower case letter [a-z]
– a minimum of 1 upper case letter [A-Z]
– a minimum of 1 numeric character [0-9]
– a minimum of 1 special character: ~`!@#$%^&*()-_+={}[]|\;:",./?
– passwords must be at least N characters in length
– block the login after N failed attempts
– set password expiration to N days

Oracle follows the above-mentioned rules, and the Oracle script catpvf.sql provides several password functions that take care of the verification process:

– ora_complexity_check
– verify_function
– verify_function_11G
– ora12c_verify_function
– ora12c_strong_verify_function
– ora12c_stig_verify_function

Note that the VERIFY_FUNCTION and VERIFY_FUNCTION_11G password verify functions are desupported in Oracle Database 20c. Also, in Oracle 20c, the IGNORECASE parameter of the orapwd utility is desupported. All newly created password files are case-sensitive.

2. Non-autonomous databases

Now, how about those who prefer to use less complex passwords for database users? How do you bypass that problem first in a non-autonomous environment?

There are several ways to avoid the verification process done by, say, the ora12c_verify_function:

– ALTER PROFILE DEFAULT LIMIT PASSWORD_VERIFY_FUNCTION NULL;
– Create a separate profile for the user (see the sketch after this list)
– Edit the catpvf.sql script to use the password verification function that you want, and then run the script to enable it – it is located in $ORACLE_HOME/rdbms/admin/catpvf.sql
– Modify “CREATE OR REPLACE FUNCTION ora12c_verify_function …” in utlpwdmg.sql, a file which is used to change the DEFAULT profile to use different password complexity functions – it is located in $ORACLE_HOME/rdbms/admin/utlpwdmg.sql (not in 20c though)
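As a sketch of the second option above, assuming a user called JULIAN:

-- a profile with no password verification at all
CREATE PROFILE weak_pwd_profile LIMIT PASSWORD_VERIFY_FUNCTION NULL;
ALTER USER julian PROFILE weak_pwd_profile;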

Note here that the Oracle documentation says clearly: “Do not modify the admin/catpvf.sql script or the Oracle-supplied password complexity functions. You can create your own functions based on the contents of these files.”

3. Autonomous databases

Next, how about Autonomous, where we have no access to the operating system layer?

The Oracle Autonomous Database Cloud offers a new function (unique to ADB) called CLOUD_VERIFY_FUNCTION. It is not available in the non-autonomous releases, not even in Oracle 20c.

The CLOUD_VERIFY_FUNCTION function is specified in the PASSWORD_VERIFY_FUNCTION attribute of the DEFAULT profile. This function internally calls ORA_COMPLEXITY_CHECK and checks the password entered according to the following specifications.

– the password must not contain the username
– the password must contain 1 or more lowercase characters
– the password must contain 1 or more uppercase characters
– the password must contain 1 or more digits
– the password must be at least 12 bytes and at most 60 bytes in length

Let us check first what the function CLOUD_VERIFY_FUNCTION looks like:

 
create or replace FUNCTION cloud_verify_function
(username varchar2,
password varchar2,
old_password varchar2)
RETURN boolean IS
   differ integer;
  db_name varchar2(40);
  i integer;
  reverse_user dbms_id;
  canon_username dbms_id := username;
  len integer := nvl (length(password), 0);
BEGIN
  IF (substr(username,1,1) = '"') THEN
    execute immediate 'begin dbms_utility.canonicalize(:p1,  :p2, 128); end;'
                        using IN username, OUT canon_username;
  END IF;
   IF NOT ora_complexity_check(password, 12, null, 1, 1, 1, null) THEN
     RETURN(FALSE);
   END IF;
-- Check password length
   IF len > 60 THEN
     raise_application_error(-20020, 'Password too long');
   END IF;
-- Check if the password contains the username
   IF regexp_instr(password, canon_username, 1, 1, 0, 'i') > 0 THEN
     raise_application_error(-20002, 'Password contains the username');
   END IF;
RETURN(TRUE);
END;
/

We cannot modify the scripts mentioned above, as we do not have OS access in ADB. Maybe then we can change the default profile or create a new one? But in ATP, the user’s profile will be set to DEFAULT, and you are not allowed to create additional profiles. Autonomous Data Warehouse requires strong passwords: the password must meet the default password complexity rules.


Well, we are stubborn – so let us try in any case:

In ATP:

 
create profile DBA_PROFILE 
LIMIT PASSWORD_REUSE_MAX 10 PASSWORD_REUSE_TIME 30
ORA-01031: insufficient privileges

In ADW:

 
create profile DBA_PROFILE 
LIMIT PASSWORD_REUSE_MAX 10 PASSWORD_REUSE_TIME 30;
Profile DBA_PROFILE created.

alter profile DBA_PROFILE limit PASSWORD_VERIFY_FUNCTION null;
Profile DBA_PROFILE altered.

alter user admin profile DBA_PROFILE;
ORA-01031: insufficient privileges

create user app_user identified by abc profile DBA_PROFILE
ORA-28219: password verification failed for mandatory profile
ORA-20000: password length less than 12 bytes

create user app_user identified by Exadataa2020 profile DBA_PROFILE;
User APP_USER created.

 
alter user app_user identified by abc
ORA-28219: password verification failed for mandatory profile
ORA-20000: password length less than 12 bytes

Well, the password verify function is still used although we set the app_user’s profile to DBA_PROFILE.

Conclusion: there is no way to modify or bypass the CLOUD_VERIFY_FUNCTION in Autonomous Cloud.

In 20c, the script utlpwdmg.sql has been modified: it no longer contains the creation of the verification functions. I really don’t know if that is intentional.

 
[oracle@julian admin]$ cat utlpwdmg.sql
...

-- This script sets the default password resource parameters
-- This script needs to be run to enable the password features.
-- However the default resource parameters can be changed based
-- on the need.
-- A default password complexity function is provided.

Rem *************************************************************************
Rem BEGIN Password Management Parameters
Rem *************************************************************************

-- This script alters the default parameters for Password Management
-- This means that all the users on the system have Password Management
-- enabled and set to the following values unless another profile is
-- created with parameter values set to different value or UNLIMITED
-- is created and assigned to the user.

ALTER PROFILE DEFAULT LIMIT
PASSWORD_LIFE_TIME 180
PASSWORD_GRACE_TIME 7
PASSWORD_REUSE_TIME UNLIMITED
PASSWORD_REUSE_MAX  UNLIMITED
FAILED_LOGIN_ATTEMPTS 10
PASSWORD_LOCK_TIME 1
INACTIVE_ACCOUNT_TIME UNLIMITED
PASSWORD_VERIFY_FUNCTION ora12c_verify_function;

/**
The below set of password profile parameters would take into consideration
recommendations from Center for Internet Security[CIS Oracle 11g].

ALTER PROFILE DEFAULT LIMIT
PASSWORD_LIFE_TIME 90
PASSWORD_GRACE_TIME 3
PASSWORD_REUSE_TIME 365
PASSWORD_REUSE_MAX  20
FAILED_LOGIN_ATTEMPTS 3
PASSWORD_LOCK_TIME 1
PASSWORD_VERIFY_FUNCTION ora12c_verify_function;
*/

/**
The below set of password profile parameters would take into
consideration recommendations from Department of Defense Database
Security Technical Implementation Guide[STIG v8R1].

ALTER PROFILE DEFAULT LIMIT
PASSWORD_LIFE_TIME 60
PASSWORD_REUSE_TIME 365
PASSWORD_REUSE_MAX  5
FAILED_LOGIN_ATTEMPTS 3
PASSWORD_VERIFY_FUNCTION ora12c_strong_verify_function;
*/

Rem *************************************************************************
Rem END Password Management Parameters
Rem *************************************************************************
[oracle@julian admin]$

On a final note: the hashed values of the passwords in 20c can still be found in the data dictionary: look into the SPARE4 column of the SYS.USER$ table. It is similar to 19c; read this article for all the details.
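For example, as SYS:

-- the password hashes live in the SPARE4 column
SELECT name, spare4
FROM sys.user$
WHERE name = 'APP_USER';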