Dontcheff

Few interesting facts about Oracle ADB, Redshift and Snowflake

In Autonomous, Data Warehouse, Databases, DBA on January 14, 2019 at 16:17

Building a new data warehouse in the cloud or migrating an existing one to cloud requires careful consideration and the answer to the question “Which cloud should I use?” is often “It depends”.

An interesting comparison of system properties comparing Amazon Redshift vs. Oracle vs. Snowflake can be found on db-engines.com

There are several other options too: Azure SQL Data Warehouse, Presto, Google BigQuery, etc.

An interesting benchmark paper called “Data Warehouse Benchmark: Redshift, Snowflake, Azure, Presto and BigQuery” by Fivetran is worth reading!

Another comparison called Interactive Analytics: Redshift vs Snowflake vs BigQuery is already more than 2 years old but still interesting.

Recently, things have changed. Oracle’s Autonomous Data Warehouse Cloud has been in GA for almost 1 year (since March 2018). ADW is for enterprise loads and mission critical systems arguably the best solution right now.

Viscosity compared both Oracle Autonomous and Amazon Redshift. The result? Check it here: Amazon vs Oracle: Data Warehouse Services, How do They Compare?

In short, the conclusion of the research above is:

– Oracle’s ADW was able to achieve data retrieval at the lowest latencies, and achieved the highest volume of queries per hour. In terms of serial query execution and multi-user query throughput.
– Oracle’s ADW consistently outperformed Redshift by a factor of 4x in both sets of tests.

And do not ignore the db-engines ranking! Only one of the three is in the Top 10.

What is interesting to know on top of all papers above are these 10 differences or let us call them less known technical facts (in no order of importance) between Oracle Autonomous, Amazon Redshift and Snowflake:

1. Snowflake compute usage is billed on a per-second basis, with a minimum of 60 seconds. Amazon Redshift is based on PostgreSQL 8.0.2 and is built on top of technology from the MPP data warehousing company ParAccel. Oracle Autonomous Database is based on Exadata and 18c.

2. In Oracle Autonomous Cloud, you can provision up to 128 CPUs and 128TB directly from the cloud console but you can provision more if needed.

3. Snowflake manages all aspects of how data is stored in S3 including data organization, file sizes, structure, compression, and statistics.

4. The only things needed for BYOL in Oracle Autonomous Database are Multitenant and RAC (only when using more than sixteen OCPUs). The standby option (not yet available) will require Active Data Guard as well.

5. Snowflake does not disclose the information about processing power and memory. Oracle do disclose the information via internal views but you cannot directly define the SGA or PGA size.

6. Redshift is not built as a high-concurrency database with several concurrent running queries and AWS recommends that you execute no more than 15 queries at a time. The number of concurrent user connections that can be made to a cluster is 500.

7. Oracle ADW and ATP allow you to partition both indexes and tables. In Snowflake partitioning is handled internally. Amazon Redshift does not support tablespaces, table partitioning, inheritance, and even certain constraints. Amazon Redshift Spectrum supports table partitioning using the CREATE EXTERNAL TABLE command.

8. The maximum number of tables in Amazon Redshift is 9,900 for large and xlarge cluster node types and 20,000 for 8xlarge cluster node types. The limit includes temporary tables. An Oracle database does not have a limit for the number of tables.

9. Oracle automatically applies all security updates (and online!) to ensure data is not vulnerable to known attack vectors. Additional in-database features like Virtual Private Database and Data Redaction are also available.

10. There is no operation in Snowflake for collecting database statistics. It is handled by the engine. In Oracle, database statistics collection is allowed. Both Oracle Autonomous and Amazon Redshift monitor changes to your workload and automatically update statistics in the background.

Finally, here are official URLs of all three products:

Oracle Autonomous Database
Amazon Redshift
Snowflake Database

Advertisements

Autonomous Data Warehouse, Autonomous Transaction Processing or Something Else?

In DBA on November 30, 2018 at 14:37

First things first: there is nothing else. Let me explain why.

Both Forbes and the Wall Street Journal wrote about the top 5 industry early adopters of Autonomous Systems.

According to the article, “in the IT industry, the pioneering product is Oracle’s Autonomous Data Warehouse Cloud, a cloud-based database that configures, optimizes and patches itself with minimal human intervention. Oracle Executive Chairman and CTO Larry Ellison says the machine learning technology that underpins the company’s autonomous data warehouse, as well as autonomous integration, developer, mobile and other platform services that will follow, is as revolutionary as the internet.”

To make it clear, the new Autonomous Data Warehouse and the Autonomous Transaction Processing databases are not based on newly written software. It is the same Oracle database with a lot of automation and mathematical algorithms embedded into the original database software. Think of machine learning and computer intelligence.

If you are looking for something similar among other database brands – good luck! Finding all areas of Self-Securing, Self-Automation and Self-Repairing outside Oracle Autonomous Database Cloud is mission impossible. And here are the areas:

Four Areas of Self-Securing of Autonomous Databases:

1. Self-securing starts with the security of the Oracle Cloud infrastructure and database service. Security patches are automatically applied every quarter or as needed, narrowing the window of vulnerability. Patching includes the full stack: firmware, operating system [OS], clusterware, and database. There are no steps required from the customer side.

2. Oracle encrypt customer data everywhere: in motion, at rest, and in backups. The encryption keys are managed automatically, without requiring any customer intervention. And encryption cannot be turned off.

3. Administrator activity on Oracle Autonomous Data Warehouse Cloud is logged centrally and monitored for any abnormal activities. Oracle have enabled database auditing using predefined policies so that customers can view logs for any abnormal access: UNIFIED_AUDIT_TRAIL

4. Built upon Oracle Database Vault, unique to Oracle Cloud, operations personnel have privilege to do all administrative tasks without any ability to ever see any customer data.

Four Areas of Self-Automation of Autonomous Databases:

1. Self-Automation: automatic provisioning of pluggable databases and automatic scaling – PDB resource manager.

2. Automatic tuning: SQL Plan Management, Adaptive Plans, SQL Tuning Advisor – Automatic SQL Tuning, Storage Indexes, Automatic Storage Management, Automatic detection and correction of regressions due to plan changes, Automatically tune memory, process, sessions.

3. Automatic Fault Tolerant Failover: RAC and Data Guard. Automatically kill run-away transactions and SQL. Automatically kill inactive session.

4. Automatic Backup and Recovery: RMAN, Flashback.

Seven Areas of Self-Repairing of Autonomous Databases:

Both Maria Colgan and Franck Pachot wrote on the differences between ADW and ADP:

How does Autonomous Transaction Processing differ from the Autonomous Data Warehouse? by Maria Colgan

ATP vs ADW – the Autonomous Database lockdown profiles by Franck Pachot

But here are in short the four main areas of differences between ADW and ADP:

1. Settings and parameters:
– In ADW: the majority of the memory is allocated to the PGA – joins, aggregations in memory
– In ATP: the majority of the memory is allocated to the SGA – minimize I/O

For DBAs: ADW runs on 94 non-default parameters out of which 35 are underscore. In ATP, the corresponding numbers are 94 and 36. Not same 94 though! And these numbers may slightly vary.

2. Data formats:
– In ADW: data is stored in a columnar format as that’s the best format for analytics processing – ADW uses DBIM option features like in-memory columnar flash cache under the covers
– In ATP: data is stored in a row format

3. Statistics/CBO:
– In ADW: statistics are automatically maintained as part of bulk load and DBMS_CLOUD activities
– In ATP: statistics are automatically gathered when the volume of data changes significantly enough to make a difference to the statistics

4. Client services/connections:
– In ADW: only one service (LOW) automatically runs SQL statements serially, all is parallel
– In ATP: the PARALLEL service does no longer exist (as of 12.11.2018)

FAQ for Oracle Autonomous Database

In order to show the other side of the coin, here are two perspective from IBM and SAP point of view:

Oracle Autonomous Database – is it truly self-driving? by Danny Arnold

How Real is The Oracle Automated Database? by Shaun Snapp

But if you prefer more neutral reading check Oracle’s next chapter: The Autonomous Database and the DBA and Will Autonomous Database Entice Big Business To The Cloud?

Bottom line: if you need extreme high reliability, top-level security, 100% automation of DBA routine tasks and no funny surprises – start testing and using the Oracle Autonomous Database. Really!

PL/SQL quiz

In Oracle database, PL/SQL on October 27, 2018 at 23:42

Years ago, I saw this quiz on dbasupport. Below, we have 2 PL/SQL blocks, have a look:


BEGIN
  WHILE sysdate = sysdate LOOP
    NULL;
  END LOOP;
END;
/

DECLARE
  x DATE;
BEGIN
  LOOP
    BEGIN
      SELECT null INTO x FROM dual WHERE sysdate = sysdate;
    EXCEPTION
      WHEN NO_DATA_FOUND THEN EXIT;
    END;
  END LOOP;
END;
/

Question is: what happens after you run them? Are the loops above both finite, both infinite or is it so that one of them is finite and the other one infinite?

If you cannot answer the question just run them in SQL*Plus, etc. Then, try to explain why – the reason for being finite or infinite.

I will update this blog post after month or so with the answer.

And I have just run this against ATP (~18c) but the output is same in previous versions too:

with SALES as (select /*+ materialize */ 0/0 from DUAL)
select count(*) from SALES;

ORA-01476: divisor is equal to zero
01476. 00000 -  "divisor is equal to zero"
*Cause:
*Action:

with SALES as (select /*+ inline */ 0/0 from DUAL)
select count(*) from SALES;

  COUNT(*)
----------
         1

Updated on October 31st:

As you can see from Eugen Iacob’s answer/comments below, the answer is read consistency. The first loop is finite, the second is infinite. Well done Eugen!

In SQL sysdate is evaluated only once – unlike in PL/SQL.