Dontcheff

Archive for 2019|Yearly archive page

Automatic Indexing in 19c

In Autonomous, Database tuning, Databases, DBA, Oracle database on February 18, 2019 at 17:38

One of the most impressive new features of Oracle Database 19c is Automatic Indexing. Arguably, this is the most interesting innovation in the database world for a rather long time.

I remember some years ago when a DBA asked me at an Oracle conference: “Julian, why are half of the presentations at Oracle database conferences only about performance tuning? Is the Oracle database performing that badly that people should tune it all the time?” Sigh…

With 19c and ADB (Oracle Autonomous Database), things look very different now, don’t they? Automatic Indexing provides what database systems need: continuous optimization of the database workload, stable & solid performance and almost no human interaction. Let me share some of my early experience with Automatic Indexing and where human interaction is needed.

For now (February 18th, 2019), Oracle 19c is only available on Exadata (Linux 7.4) and in order to enable Automatic Indexing you need to do the following:

EXEC DBMS_AUTO_INDEX.CONFIGURE('AUTO_INDEX_MODE','IMPLEMENT');

As you can guess from the picture above, the so called expert system of Automatic Indexing runs every 15th minute for as long as one hour. Note that I disabled the job from 4:43 till 5:56. The Resource Manager plan limits the task to 1 CPU only and the next run is skipped if the job has not been completed within the 15 minutes.

Here are details on how Automatic Indexing works but what is most important to remember is as follows:

– The auto index candidates are created as invisible auto indexes
– If the performance of SQL statements is not improved from the auto indexes, then the indexes are marked as unusable and the corresponding SQL statements are blacklisted
– Auto indexes cannot be used for any first time SQL run against the database
– Auto indexes are created as either single, concatenated indexes or function-based indexes and they all use advanced low compression
– The unused auto indexes are deleted after 373 days (can be changed)
– The unused non-auto indexes (manual indexes) are never deleted by the automatic indexing process but can be deleted automatically if needed

The Auto Indexing can be disabled at any time or can be set to set to reporting mode (new auto indexes as created asinvisible indexes, so that they cannot be used in SQL) with the following commands:

EXEC DBMS_AUTO_INDEX.CONFIGURE('AUTO_INDEX_MODE','OFF');

 

EXEC DBMS_AUTO_INDEX.CONFIGURE('AUTO_INDEX_MODE','REPORT ONLY');

Here is a way to ask Oracle to create new auto indexes in a separate tablespace called AUTO_INDEX_TS:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE('AUTO_INDEX_DEFAULT_TABLESPACE','AUTO_INDEX_TS');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01

You can easily check the configuration for Automatic Indexing for the root container and the PDBs from CDB_AUTO_INDEX_CONFIG;

If you need a report of what happened during the expert system activity (either last 3 days or during the last activity), here is a way to generate it:

set long 300000
select DBMS_AUTO_INDEX.REPORT_ACTIVITY(SYSTIMESTAMP-3,SYSTIMESTAMP,'TEXT','ALL','ALL') from dual;
select DBMS_AUTO_INDEX.REPORT_LAST_ACTIVITY('TEXT','ALL','ALL') from dual;

These are the most important views about Auto Indexing:

DBA_AUTO_INDEX_EXECUTIONS: history of execution of automatic indexing tasks
DBA_AUTO_INDEX_STATISTICS: statistics related to auto indexes
DBA_AUTO_INDEX_IND_ACTIONS: actions performed on auto indexes
DBA_AUTO_INDEX_SQL_ACTIONS: actions performed on SQL statements for verifying auto indexes
DBA_AUTO_INDEX_CONFIG: configuration settings related to auto indexes
DBA_AUTO_INDEX_VERIFICATIONS: stats about PLAN_HASH_VALUE, AUTO_INDEX_BUFFER_GETS, etc.

The new package DBMS_AUTO_INDEX can be used for 3 main things:

1. Configuration of the parameters related to Auto Indexing
2. Drop *all* the indexes except the ones used for constraints
3. Report the activity of the “expert system”:

Finally, here are some additional resources:

Automatic Indexing in Oracle Database 19c
Oracle Database 19c is now available!
Managing Auto Indexes

How to check if I have any auto indexes in my database: select auto, count(*) from dba_indexes group by auto;

Advertisements

Few interesting facts about Oracle ADB, Redshift and Snowflake

In Autonomous, Data Warehouse, Databases, DBA on January 14, 2019 at 16:17

Building a new data warehouse in the cloud or migrating an existing one to cloud requires careful consideration and the answer to the question “Which cloud should I use?” is often “It depends”.

An interesting comparison of system properties comparing Amazon Redshift vs. Oracle vs. Snowflake can be found on db-engines.com

There are several other options too: Azure SQL Data Warehouse, Presto, Google BigQuery, etc.

An interesting benchmark paper called “Data Warehouse Benchmark: Redshift, Snowflake, Azure, Presto and BigQuery” by Fivetran is worth reading!

Another comparison called Interactive Analytics: Redshift vs Snowflake vs BigQuery is already more than 2 years old but still interesting.

Recently, things have changed. Oracle’s Autonomous Data Warehouse Cloud has been in GA for almost 1 year (since March 2018). ADW is for enterprise loads and mission critical systems arguably the best solution right now.

Viscosity compared both Oracle Autonomous and Amazon Redshift. The result? Check it here: Amazon vs Oracle: Data Warehouse Services, How do They Compare?

In short, the conclusion of the research above is:

– Oracle’s ADW was able to achieve data retrieval at the lowest latencies, and achieved the highest volume of queries per hour. In terms of serial query execution and multi-user query throughput.
– Oracle’s ADW consistently outperformed Redshift by a factor of 4x in both sets of tests.

And do not ignore the db-engines ranking! Only one of the three is in the Top 10.

What is interesting to know on top of all papers above are these 10 differences or let us call them less known technical facts (in no order of importance) between Oracle Autonomous, Amazon Redshift and Snowflake:

1. Snowflake compute usage is billed on a per-second basis, with a minimum of 60 seconds. Amazon Redshift is based on PostgreSQL 8.0.2 and is built on top of technology from the MPP data warehousing company ParAccel. Oracle Autonomous Database is based on Exadata and 18c.

2. In Oracle Autonomous Cloud, you can provision up to 128 CPUs and 128TB directly from the cloud console but you can provision more if needed.

3. Snowflake manages all aspects of how data is stored in S3 including data organization, file sizes, structure, compression, and statistics.

4. The only things needed for BYOL in Oracle Autonomous Database are Multitenant and RAC (only when using more than sixteen OCPUs). The standby option (not yet available) will require Active Data Guard as well.

5. Snowflake does not disclose the information about processing power and memory. Oracle do disclose the information via internal views but you cannot directly define the SGA or PGA size.

6. Redshift is not built as a high-concurrency database with several concurrent running queries and AWS recommends that you execute no more than 15 queries at a time. The number of concurrent user connections that can be made to a cluster is 500.

7. Oracle ADW and ATP allow you to partition both indexes and tables. In Snowflake partitioning is handled internally. Amazon Redshift does not support tablespaces, table partitioning, inheritance, and even certain constraints. Amazon Redshift Spectrum supports table partitioning using the CREATE EXTERNAL TABLE command.

8. The maximum number of tables in Amazon Redshift is 9,900 for large and xlarge cluster node types and 20,000 for 8xlarge cluster node types. The limit includes temporary tables. An Oracle database does not have a limit for the number of tables.

9. Oracle automatically applies all security updates (and online!) to ensure data is not vulnerable to known attack vectors. Additional in-database features like Virtual Private Database and Data Redaction are also available.

10. There is no operation in Snowflake for collecting database statistics. It is handled by the engine. In Oracle, database statistics collection is allowed. Both Oracle Autonomous and Amazon Redshift monitor changes to your workload and automatically update statistics in the background.

Finally, here are official URLs of all three products:

Oracle Autonomous Database
Amazon Redshift
Snowflake Database