Dontcheff

DBAs are a wealth of knowledge!

In Autonomous, Cloud, Databases, DBA on October 19, 2020 at 10:53

Fifteen years ago, in 2005, I remember I read the following speculation about the DBA profession: Oracle 2020 – a Glimpse Into the Future of Database Management. I was curious what would be true 15 years later. Just have a look at some of the predictions:

– 2018: Oracle 14m provides inter-instance sharing of RAM resources. All Oracle instances become self-managing.
– 2019: The first 128-bit processors are introduced.
– 2020: Oracle 16ss introduces solid-state, non-disk database management.

Looking even further we read: “Changing Role of the Oracle DBA in 2020: But the sad reality of server consolidation was that thousands of mediocre Oracle DBAs lost their jobs to this trend. The best DBAs continued to find work, but DBAs who were used for the repetitive tasks of installing upgrades on hundreds of small servers were displaced.” And further: “Inside Oracle 2020: The world of Oracle management is totally different today than it was back in 2004. We no longer have to worry about applying patches to Oracle software, all tuning is fully automated and hundreds of Oracle instances all reside within a single company-wide server.” ADB rings a bell?

Four years ago, another forecast article called The 2020 DBA: A Look Into the Future appeared in dzone.com predicting that despite the evolving role of the DBAs, DevOps has actually made database administrators more relevant than ever before.

True, as the DBA profession is still among the Top 10 jobs in Technology. DBA is #4 in Best Technology Jobs, #15 in Best STEM Jobs and #30 in 100 Best Jobs.

Now, going back to the title of this blog post. Jeff Smith’s, Russ Lowenthal’s and Chris Saxon’s Oracle DBA 2020 Data Masterclasses are something I recommend now to every DBA. Jeff raised an important topic but let us take a step back.

Last month (September 2020), FlashDBA discussed the Evolution of the DBA from 1.0 through 2.0 until 3.0:

DBA 1.0: The (Good) Old Days – clearly the old days are over. Regardless if they were good or bad is a memory-lane discussion.

DBA 2.0: The IT Generalist – I remember when saw this paper for the first time about 10-11 years ago: Oracle DBA 2.0. ASM, Direct NFS, Clusterware, VMware, Flash, Linux. The DBA had to learn OS, Storage and Network administration. With Exadata, I even heard the term DBA 2.1

DBA 3.0: The Cloud DevOps DBA – new game, rather new set! “A DBA building a database in the public cloud is making decisions which have a direct affect on the (quite possibly massive) monthly bill from AWS/Azure/GCP/OCI”.

Checking the Database Management Predictions from 2019, we can see they now we are close to:

DBA 4.0: Autonomous DBA – more attention on data management, data security, data architecture, machine learning, devops:

“Oracle Autonomous Database can quickly provision, resize, and relocate databases with little human interaction. However, as more database provisioning tasks are automated, DBAs will still need to classify the data.” – Michelle Malcher

“CEOs will force DBAs to step into more-important roles—such as data architects, data managers, and chief data offcers—as a company’s data and machine learning algorithms become important drivers of the stock price.” — Rich Niemiec

“Oracle Enterprise Manager Cloud and other third-party tools, in conjunction with Oracle Multitenant will make managing large numbers of databases easier. DBAs will be able to manage 10 times or more databases after consolidation with Oracle Multitenant.” — Anuj Mohan

“DBAs need to understand that there is a true sea change afoot, and there’s no way to stop these market forces. Hopefully, we’ll all be able to embrace this tidal wave and avoid being caught up in the undertow.” — Jim Czuprynski

I believe the adoption of a hybrid infrastructure is inevitable, and the new skills needed (such as cloud set-up, configuration, and monitoring) are mostly cloud-related. DBAs will need to be able to assess the databases and define what they are best suited for. Replication of data and databases will become more complex in hybrid environments—especially when different clouds are involved.

That is coming in few years at most (when most companies will adopt multi-cloud) and call the profession DBA 5.0 if you prefer or just DBA. Every decade has its challenges for IT professionals. Challenges are becoming more complex and DBAs are often on the front line with every new fashionable IT concept. And we can quote now Larry Ellison who said that the computer industry is the only industry that is more fashion-driven than women’s fashion.

Here are finally few additional recent articles on the topic of the future of the DBA profession (if interested to read more on the topic):

My Three Beliefs About The Future Of The DBA Job
The Future for the DBA
Will automated databases kill the DBA position?
What Does the Future Hold for DBAs?
For DBAs In 2020: Understand Your Worth, Seize The Moment
What Happens to DBAs When We Move to the Cloud?
The Future of The DBA in The Era of The Autonomous Database

How is Oracle Autonomous JSON Database different from Oracle ATP and MongoDB?

In Autonomous, Cloud, DBA, Oracle database on October 1, 2020 at 09:09

Oracle Autonomous JSON Database is an Oracle Cloud service that is specialized for developing NoSQL-style applications that use JavaScript Object Notation (JSON) documents. Autonomous JSON Database (in short AJD) stores the JSON documents in a native tree-oriented binary format making it highly optimized for fast reads (avoiding linear scans) and partial updates (reducing redo/undo log sizes).

Like ADW and ATP, AJD delivers also auto-scaling, automated patching, upgrades, maximum security and auto-tuning. I do agree with Philipp Salvisberg that AJD is a special version of the Autonomous Transaction Processing (ATP).

The leader of pure document stores (as of September 2020) is MongoDB. Amazon DynamoDB and Microsoft Azure Cosmos DB are also extremely popular.

Autonomous JSON Database provides all of the same features as Autonomous Transaction Processing and, unlike MongoDB, allows developers to also store up to 20 GB of non-JSON data. There is no storage limit for JSON collections.

An excellent comparison of Oracle AJD and MongoDB can be found in the article entitled Introducing Oracle Autonomous JSON Database for application developers by Beda Hammerschmidt:

Oracle AJD is very similar to Oracle ATP with the major difference that AJD is meant for document databases containing lots of JSON format documents. You can think of ATP as more of a hybrid version of the Autonomous database.

Looking at the init.ora parameters I observed something after comparing an ATP and an AJD spfile parameters. Except the obvious ones like instance_name or service_names I found only the following differences (both are with 1 OCPU and 1TB of storage):

cpu_count: 6 for AJD and 2 for ATP
db_recovery_file_dest_size: 88406716M for AJD and 123789329M for ATP
gcs_server_processes: 4 for AJD and 5 for ATP
pdb_lockdown: JDCS for AJD and OLTP for ATP
resource_manager_cpu_allocation: 92 for AJD and 100 for ATP
shared_pool_reserved_size: 2254857830 for AJD and 3248069017 for ATP
transactions: 66083 for AJD and 66110 for ATP

Clearly, this is what I did not expect. Obviously, the biggest difference between AJD and ATP comes from the lockdown profiles.

So, here are the three differences between Oracle AJD and ATP:

1. Lockdown profiles
2. AJD can store at most 20 GB of non-JSON data
3. Small differences in the init.ora parameters

And here are the three differences between Oracle AJD and MongoDB Atlas:

1. Autonomous JSON Database costs 30% less than comparable MongoDB Atlas configurations: $2.74/hr versus $3.95/hr
2. Autonomous JSON database gives you 2x throughput consistently across different workload types and collection sizes
3. Autonomous JSON Database comes with more capabilities than MongoDB Atlas

A good starting point is the documentation of Autonomous JSON Database for Experienced Oracle Database Users. It is mostly about restrictions for SODA and JSON, SQL and other database features.

The SODA and JSON Tutorials are a good starting point to getting used to working with AJD.

Finally, here are 5 Oracle Autonomous JSON Database use cases:

– Mobile applications
– Applications with dynamic personalized experiences
– Content and catalog management
– Integrated IoT applications
– Digital payment applications

And here is a good article by Maria Colgan on How does Autonomous Transaction Processing differ from the Autonomous Data Warehouse.

Detecting Data Tampering and Measuring Asymmetry and Tailedness of Data in Oracle Database 20c

In DBA, New features, SQL on September 11, 2020 at 11:27

There are seven types of SQL functions in the Oracle database:

1. Single-Row Functions
2. Aggregate Functions
3. Analytic Functions
4. Object Reference Functions
5. Model Functions
6. OLAP Functions
7. Data Cartridge Functions

The single-row functions are the most popular and used ones but the aggregate and analytical function are also extremely popular among Oracle developers (and by DBAs too).

Three new analytical and statistical aggregate functions are now available in Oracle Database 20c. Let us use the RDBMS_BRANDS table for the 3 examples below. For the sake of clarity the last column shows if the database is only available from the cloud.

1. CHECKSUM computes the checksum of the input values or expression and can be applied on a column, a constant, a bind variable, or an expression involving them. All datatypes except ADT and JSON are supported.

In earlier releases you can still use DBMS_SQLHASH.GET_HASH in order to check the integrity of result sets or some other other PL/SQL packages: STANDARD_HASH or DBMS_CRYPTO.

Here is an example of how to use CHECKSUM:

SQL> select DB_ENGINES_RANK,DB_ENGINES_SCORE
from RDBMS_BRANDS
where DB_ENGINES_RANK < 50; 

DB_ENGINES_RANK DB_ENGINES_SCORE
--------------- ----------------
              1          1345.44
              2          1282.64
              3          1078.31
              4           514.81
              5           162.64
              8            90.09
             10            73.89
             13            50.54
             15            27.59
             18            20.27
             42             3.73
             48             2.36

12 rows selected.

SQL> select CHECKSUM(DB_ENGINES_RANK), CHECKSUM(DB_ENGINES_SCORE)
from RDBMS_BRANDS where DB_ENGINES_RANK < 50;  

CHECKSUM(DB_ENGINES_RANK) CHECKSUM(DB_ENGINES_SCORE)
------------------------- --------------------------
                   288250                     209742

Let us now update the rankings of Oracle which in September went up to 1369.36 points and observe how the checksum value for DB_ENGINES_SCORE changed from 209742 to 180002!


SQL> update RDBMS_BRANDS set DB_ENGINES_SCORE=1369.36 
where RDBMS_NAME='ORACLE';

1 row updated.

SQL> commit;

Commit complete.

SQL> select 
CHECKSUM(DB_ENGINES_RANK), CHECKSUM(DB_ENGINES_SCORE)
from RDBMS_BRANDS where DB_ENGINES_RANK < 50;  

CHECKSUM(DB_ENGINES_RANK) CHECKSUM(DB_ENGINES_SCORE)
------------------------- --------------------------
                   288250                     180002


Note: NULL values in CHECKSUM column are ignored. Also, if you rollback the transaction, then the checksum value does not change.

There are 2 other analytical functions which are new to Oracle 20c. Skewness and Kurtosis describes the shape of a probability distribution and there are different ways of quantifying it for a theoretical distribution and corresponding ways of estimating it from a sample from a population. Oracle is now offering a very easy way of calculating their values by providing the in-built analytical functions in Oracle 20c.

2. SKEWNESS functions SKEWNESS_POP and SKEWNESS_SAMP are measures of asymmetry in data. A positive skewness is means the data skews to the right of the center point. A negative skewness means the data skews to the left.

Here is an example of how to use SKEWNESS:

SQL> select CLOUD_ONLY, count(*) from RDBMS_BRANDS group by CLOUD_ONLY;

C   COUNT(*)
- ----------
N          9
Y          3

SQL> select CLOUD_ONLY,
SKEWNESS_POP(DB_ENGINES_RANK), SKEWNESS_POP(DB_ENGINES_SCORE)
from RDBMS_BRANDS
group by CLOUD_ONLY; 

C SKEWNESS_POP(DB_ENGINES_RANK) SKEWNESS_POP(DB_ENGINES_SCORE)
- ----------------------------- ------------------------------
N                     2.0503487                     .584150452
Y                    .685667537                      -.4626607
   

Skewness makes sense in the situation where DB_ENGINES_RANK and DB_ENGINES_SCORE represent the database brand rank and score in the list and you want to determine whether the outliers in data are biased towards the top end or the bottom end of the distribution, that is, if there are more values to the top of the mean when compared to the number of values to the bottom of the mean.

3. KURTOSIS functions KURTOSIS_POP and KURTOSIS_SAMP measure the tailedness of a data set where a higher value means more of the variance within the data set is the result of infrequent extreme deviations as opposed to frequent modestly sized deviations.

Here is an example of how to use KURTOSIS:

SQL> select CLOUD_ONLY, count(*) from RDBMS_BRANDS group by CLOUD_ONLY;

C   COUNT(*)
- ----------
N          9
Y          3

SQL> select CLOUD_ONLY,
KURTOSIS_POP(DB_ENGINES_RANK), KURTOSIS_POP(DB_ENGINES_SCORE)
from RDBMS_BRANDS
group by CLOUD_ONLY;  

C KURTOSIS_POP(DB_ENGINES_RANK) KURTOSIS_POP(DB_ENGINES_SCORE)
- ----------------------------- ------------------------------
N                    2.88880521                     -1.4382391
Y                             0                              0
  

Note that a normal distribution has a kurtosis of zero. Have a look at the KURTOSIS_POP for the cloud-only databases Google BigQuery, Amazon Redshift and Snowflake.