Dontcheff

Archive for the ‘PostgreSQL’ Category

Do not just Simplify with Databases: Automate, Innovate and Innovate!

In Autonomous, Database tuning, Databases, DB2 database, DBA, PostgreSQL, SQL Server on June 15, 2020 at 09:46

“Automation applied to an inefficient operation will magnify the inefficiency” – Bill Gates
“Innovation distinguishes between a leader and a follower!” – Steve Jobs

In the database industry, simplification, automation and innovation have more or less become buzzwords – we hear them in meetings and see them in PowerPoint presentations more often than in real-world implementations. Databases are being patched, upgraded and migrated, but how often are automation and innovation part of the process?

A recent article entitled “After The Pandemic, Don’t Simply Automate. Innovate” quoted an Accenture report that surveyed 1500 C-suite executives in 16 industries: 76% of respondents said they were struggling to scale the technology across their businesses. The numbers tell the story: a full 84% of C-suite executives believe they must leverage artificial intelligence (AI) to achieve their growth objectives.

A survey of database managers and administrators shows the benefits they expect from automation, plus what they think about DBAs’ current workloads. Here are the highlights:

– 63% expect faster innovation from database automation
– 62% of the data pros expect that data will grow 25% or more annually at their enterprise over the next three years
– 66% said the DBA/data team’s overwork or scheduling issues are the biggest challenge in database deployments
– The majority of companies have not made significant automation gains in any key data management processes
– 69% of the DBAs think automation will make their job more business-centric.

So, why aren’t we seeing those innovative automations being implemented in full? Is it the DBA mindset, the business lagging behind the technology and the innovation, a lack of time and resources, or a lack of knowledge and trust in the new?

We can try to look into the issue from the DBA perspective. Recently, Jeff Erickson listed the 3 Can’t Miss Ways to Turn Your DBA Skills into Gold. Here is my paraphrased version of these 3 ways:

1. Expand DBA skills towards app development – Python, PL/SQL, etc.
2. DBAs should get into the data science game – look more into Oracle Autonomous Database which comes with Oracle ML, a rich library of machine learning algorithms
3. Convert the “A” in DBA from Administrator to Architect – look into big data and data architecture – big data is the driver for innovation in databases

At the end of the article he quoted Kerry Osborne about dealing with vast and growing data volumes: “That added complexity and scale should sound like a huge opportunity knocking“.

An interesting article by Duncan Harvey entitled What is the innovation, automation dynamic? pointed out how autonomous systems are finally giving people the headspace they need to innovate at speed in the digital era.

So, is there a Database Automation Guide? Yes indeed!

Get a cloud boost with Oracle Autonomous!

Have a look at all the Automatic features that came after Oracle 9i. Oracle 7 and 8 were probably the first releases that added automatic functionality to the database, but the boost started with 9i. 20c does not skip that part either.

Of all database brands, the Oracle Database is arguably the most advanced in terms of tools, automation features and innovation capabilities. From 11.2 to Oracle 20c alone, there are 133 new features purely focused on automation. But I still see databases where even fairly old automation features (such as Automatic SQL Tuning) have not yet been implemented.

Let us look into a few database brands:

SQL Server:

There are dozens of Automatic features and properties embedded into SQL Server:

Automatic Tuning can do a lot of things, such as Automated performance tuning of databases, Automated verification of performance gains, and Automated rollback and self-correction. More importantly, it is the only database besides Oracle to have Automatic Indexing. It is worth checking how Automatic index management works in the Azure SQL database. Azure SQL Database analyzes your workload, identifies the queries that could be executed faster if you create an index, identifies indexes that have not been used for a long period of time, and identifies duplicated indexes in the database.
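To make the last two checks concrete, here is a minimal Python sketch of how duplicated and long-unused indexes could be flagged from index metadata. It is only an illustration and not how Azure SQL Database actually implements it: the `IndexInfo` structure, its fields, the sample index names and the 90-day idle threshold are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexInfo:
    """Hypothetical index metadata (not an Azure SQL structure)."""
    name: str
    table: str
    key_columns: tuple          # ordered key columns of the index
    days_since_last_read: int   # assumed to come from usage statistics


def find_duplicates(indexes):
    """Return groups of index names sharing the same table and ordered key columns."""
    groups = defaultdict(list)
    for ix in indexes:
        groups[(ix.table, ix.key_columns)].append(ix.name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def find_unused(indexes, max_idle_days=90):
    """Return names of indexes not read for more than max_idle_days (assumed threshold)."""
    return [ix.name for ix in indexes if ix.days_since_last_read > max_idle_days]


indexes = [
    IndexInfo("ix_orders_cust", "orders", ("customer_id",), 1),
    IndexInfo("ix_orders_cust_copy", "orders", ("customer_id",), 200),
    IndexInfo("ix_orders_date", "orders", ("order_date",), 3),
]

print(find_duplicates(indexes))
print(find_unused(indexes))
```

A real advisor would also have to weigh the write overhead of each index and verify the performance impact after a change, which is exactly the part the automatic tuning service takes over.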

Db2:

IBM Db2 also has a relatively good list of Automatic features including self-tuning memory (single-partition databases only), Automatic storage, Automatic database backups, Automatic reorganization and Automatic statistics collection. The Db2 documentation claims that the Db2® autonomic computing environment is self-configuring, self-healing, self-optimizing, and self-protecting, but this is far behind what Oracle ADB has to offer. Tuning SQL is more difficult in Db2 than in Oracle, and the single reason for that is the difference in the tools and features offered by the two systems.

Redshift:

Amazon’s Redshift has a good set of features. Looking at what automation is like, we find enough for a relatively new database brand (Redshift, not PostgreSQL).

In terms of storage, it is mostly about how Redshift automatically takes care of data formatting and data movement into S3 and how with managed storage, capacity is added automatically to support workloads up to 8PB of compressed data.

There is of course Automated provisioning and Automated backups. Plus, Automatic workload management (WLM) uses machine learning to dynamically manage memory and concurrency, helping maximize query throughput.

Also, Amazon Redshift continuously monitors the health of the cluster, and automatically re-replicates data from failed drives and replaces nodes as necessary for fault tolerance.

Very much like Oracle RAC and Oracle Exadata, as of May 2020, Amazon Redshift now leverages Bloom filters to enable early and effective data filtering. Redshift automatically determines what queries are suitable for leveraging Bloom filters at query runtime.
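For readers who have not met Bloom filters before, the following Python sketch shows the general idea of early filtering during a join. It is a textbook-style illustration, not Redshift's or Oracle's implementation: the class, the function `filtered_join_candidates`, the filter size and the number of hash functions are all chosen for this example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, value):
        # Derive num_hashes positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = 1

    def might_contain(self, value):
        return all(self.bits[pos] for pos in self._positions(value))


def filtered_join_candidates(small_side_keys, large_side_rows, key):
    """Drop large-side rows whose join key is definitely absent from the small side."""
    bloom = BloomFilter()
    for k in small_side_keys:
        bloom.add(k)
    return [row for row in large_side_rows if bloom.might_contain(row[key])]


dim_keys = [1, 2]
fact_rows = [{"customer_id": i} for i in range(10)]
candidates = filtered_join_candidates(dim_keys, fact_rows, "customer_id")
print(len(candidates), "of", len(fact_rows), "rows survive the early filter")
```

Rows that pass the filter still have to go through the real join, because a Bloom filter can return false positives. The gain is that most non-matching rows are discarded before they are moved or joined at all.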

Recently, Amazon Redshift also introduced ATS (Automatic Table Sort), an automated alternative to Vacuum Sort. Automatic table sort complements Automatic Vacuum Delete and Automatic Analyze, and together these capabilities fully automate table maintenance. Automatic table sort is now enabled by default on Redshift tables where a sort key is specified.

Amazon Redshift automatically takes incremental snapshots (backups) of your data every 8 hours or 5 GB per node of data change. You now get more information and control over a snapshot including the ability to control the automatic snapshot’s schedule.
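The snapshot rule above boils down to a simple either/or condition. Here is a small Python sketch of it, assuming that the "or" means "whichever comes first"; the function name and its parameters are made up for this example and are not part of any AWS API.

```python
def automatic_snapshot_due(hours_since_last, gb_changed_per_node,
                           max_hours=8.0, max_gb_per_node=5.0):
    """Return True when either default trigger from the text is reached.

    Assumption: a snapshot is due at whichever limit is hit first,
    8 hours since the last snapshot or 5 GB of changed data per node.
    """
    return hours_since_last >= max_hours or gb_changed_per_node >= max_gb_per_node


print(automatic_snapshot_due(hours_since_last=2, gb_changed_per_node=6.5))  # busy cluster
print(automatic_snapshot_due(hours_since_last=8, gb_changed_per_node=0.1))  # quiet cluster
print(automatic_snapshot_due(hours_since_last=3, gb_changed_per_node=1.0))  # neither limit hit
```

With a custom snapshot schedule, the two defaults would simply be replaced by whatever the schedule defines.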

In terms of SQL tuning, Amazon Redshift now automatically and elastically scales query processing power to provide consistently fast performance for hundreds of concurrent queries. The database automatically shuts down Concurrency Scaling resources to save you cost. Also, Amazon Redshift now updates table statistics by running ANALYZE automatically but that is no news for Oracle database users. Amazon Redshift improves query performance by automatically moving read and write queries to the next matching queue without restarting the moved queries.

Snowflake:

Looking into the Top 10 cool things about Snowflake, we see the fact that the JSON documents are stored in a table and optimized automatically in the background for MPP and columnar access. There is Automatic Encryption of Data and Automatic Query Optimization. No Tuning! Really cool as “It is all handled “auto-magically” via a dynamic query optimization engine in our cloud services layer. So, no indexes, no need to figure out partitions and partition keys, no need to pre-shard any data for even distribution, and no need to remember to update statistics.”

I like the “auto-magical” part 🙂

My interest was caught by the deep dive of the Revolutionary features of Snowflake that sets it apart — A Deep dive: all I found on automation was “Automatic scale down”. No comment here.

PostgreSQL:

If we look into the PostgreSQL Feature Matrix, we see only one feature about automation: Automatic plan invalidation. There is a mention of WAL Buffer auto-tuning and Autovacuum.

For databases such as MySQL, PostgreSQL and MongoDB, check this article: Why is Database Automation Important?

A recent discussion Oracle vs. PostgreSQL basically tells it all – in case you have the patience to read it – here are both sides:

Pro-Oracle: “Comparing Postgres with Oracle is a bit like comparing a rubber duck you might buy your three year old, with a 300000 ton super tanker. Do they both float? Yeah, but that’s about the only similarity.”
Pro-PostgreSQL: “So bottom line, PostgreSQL beats Oracle by far in my opinion, at least as far as installing it and sizes are concerned.”

Looking into the Features and Benefits of EDB Postgres Cloud Database Service, we see Automated Notifications, Automated Monitoring, Automated Backups and Automated Replica Failover.

Oracle

Of course, Oracle ADB (Autonomous Database) has all automation features embedded into the service, but not all applications are certified for or fit ADB. Not yet at least. Still, I think that most automation features can be easily implemented (in a non-ADB database) with some prior testing, and the benefits are enormous. The list of automation features is long – here are the major ones:

Features in 19c and 20c which are worth implementing (at least looking into and testing) are:

– Automatic Indexing
– Automatic SQL Plan Management
– Automatic Index Optimization
– Automatic Zone Maps

Automation and innovation go hand in hand. Database innovation in an enterprise means using database technology in new ways to create a more efficient organization and to improve the alignment between technology and business. In practice, smaller teams complete the same DBA work by implementing the automation features of the database, which reduces storage and labor costs and increases system uptime and performance.

Bottom line for DBAs: regardless of the database, there is room for innovation in every database brand!

Amazon’s Aurora and Oracle’s Autonomous ATP

In Autonomous, Cloud, DBA, PostgreSQL on August 29, 2018 at 09:26

Databases are very much like wine, cheese and trees: they get better as they age.

Amazon Aurora has existed since 2015. The word aurora comes from Latin and means dawn. The name was borne by the Roman mythological goddess of dawn and by the princess in the fairy tale Sleeping Beauty.

Both Amazon’s “dawn” Aurora and Oracle’s ATP are typical cloud OLTP systems.

The question is: what are their differences, which one is better and meant exactly for my needs?

Oracle ATP is based on Oracle’s database and Exadata. Here are all the innovations adopted from both systems:

Amazon’s Aurora has 2 flavors: Amazon Aurora MySQL and Amazon Aurora PostgreSQL.

Amazon Aurora MySQL is compatible with MySQL 5.6 using the InnoDB storage engine. Certain MySQL features like the MyISAM storage engine are not available with Amazon Aurora. Amazon Aurora PostgreSQL is compatible with PostgreSQL 9.6. The storage layer is virtualized and sits on a proprietary virtualized storage system backed by SSD. And you pay $0.20 per 1 million IO requests.
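To get a feeling for what the IO charge means in practice, here is a small Python sketch that turns a request count into a cost. It uses only the $0.20 per 1 million IO requests figure quoted above, which may well have changed since; the function name and the sample workload of 1000 IO requests per second are assumptions for illustration.

```python
from decimal import Decimal

# Price quoted in the text; treat it as a point-in-time assumption.
PRICE_PER_MILLION_IO = Decimal("0.20")


def aurora_io_cost(io_requests):
    """Return the IO charge in USD for a given number of IO requests."""
    return Decimal(io_requests) / Decimal(1_000_000) * PRICE_PER_MILLION_IO


# Hypothetical workload: a steady 1000 IO requests per second for a 30-day month.
monthly_requests = 1000 * 60 * 60 * 24 * 30
print(monthly_requests, "requests cost", aurora_io_cost(monthly_requests), "USD")
```

For this made-up workload the month adds up to 2,592,000,000 requests, or 518.40 USD in IO charges alone, on top of instance and storage costs.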

Oracle’s Autonomous database also comes in 2 flavors: Oracle ADW and Oracle ATP. Check Franck Pachot’s article ATP vs ADW – the Autonomous Database lockdown profiles to see the differences between the two cloud databases.

In general, one can compare Oracle ADW with Amazon Redshift and Oracle ATP with Amazon Aurora.

One way to compare is to look at the ranking provided by DB-Engines: Amazon Aurora vs. Oracle. It is a no-brainer who the leader is: a score of 1300 vs. a score of 5 in favor of Oracle.

Another interesting comparison comes from Amalgam Insights. Check how Oracle Autonomous Transaction Processing lowers barriers to entry for data-driven business. Check out the DBA labor cost involved: 5 times less in favor of Oracle ATP compared to Amazon! All the routine DBA tasks have been totally eliminated.

The message from them is very clear: “Oracle ATP could reduce the cost of cloud-based transactional database hosting by 65%. Companies seeking to build net-new transactional databases to support Internet of Things, messaging, and other new data-driven businesses should consider Oracle ATP and should do due diligence on Oracle Autonomous Database Cloud for reducing long-term Total Cost of Ownership.”

This month (August 2018), there was an interesting article by Den Howlett entitled Oracle introduces autonomous transaction processing database – pounds on AWS. Here are 2 interesting and probably correct statements/quotes from there:

1. It really is hard to get off an established database, even one that can be as expensive as Oracle can turn out to be.
2. Some of the very largest workloads will not go to the public cloud anytime soon. Maybe never which in internet years is after 2030.

As a kind of proof of how reliable and fast Oracle’s Autonomous Transaction Processing database is, consider the following OLTP workload, running non-stop in a balanced way without any major spikes and without a single queued statement!

No human labor, no human error, and no manual performance tuning!

Migrating Amazon Redshift to Autonomous Data Warehouse Cloud

In Autonomous, Data Warehouse, DBA, Exadata, PostgreSQL on July 4, 2018 at 18:34

“Big Data wins games but Data Warehousing wins championships” says Michael Jordan. Data Scientists create the algorithm, but as Todd Goldman says, if there is no data engineer to put it into production for use by the business, does it have any value?

If you google for Amazon Redshift vs Oracle, you will find lots of articles on how to migrate Oracle to Redshift. Is it worth it? Perhaps in some cases before Oracle Autonomous Data Warehouse Cloud existed.

Now, things look quite different. “Oracle Autonomous Data Warehouse processes data 8-14 times faster than AWS Redshift. In addition, Autonomous Data Warehouse Cloud costs 5 to 8x less than AWS Redshift. Oracle performs in an hour what Redshift does in 10 hours.” At least according to the Oracle Autonomous Data Warehouse Cloud white paper. And I have had nothing but great experiences with ADWC for the past half a year or so.

But, what are the major issues and problems reported by Redshift users?

One of the most common complaints involves how Amazon Redshift handles large updates. In particular, the process of moving massive data sets across the internet requires substantial bandwidth. While Redshift is set up for high performance with large data sets, “there have been some reports of less than optimal performance,” for the largest data sets. An article by Alan R. Earls entitled Amazon Redshift review reveals quirks, frustrations claims that reviewers want more from the big data service. So:

Why migrate from Amazon Redshift to Autonomous Data Warehouse Cloud?

1. Amazon Redshift is ranked 2nd in Cloud Data Warehouse with 14 reviews vs Oracle Exadata which is ranked 1st in Data Warehouse with 55 reviews.

The top reviewer of Amazon Redshift writes “It processes petabytes of data and supports many file formats. Restoring huge snapshots takes too long”. The top reviewer of Oracle Exadata writes “Thanks to smart scans, the amount of data transferred from storage to database nodes significantly decreases”.

2. Oracle Autonomous dominates in features and capabilities:

DB-engines shows an excellent system properties comparison of Amazon Redshift vs. Oracle.

In addition, reading through these thoughts on using Amazon Redshift as a replacement for an Oracle Data Warehouse can be worthwhile. It shows how Amazon Redshift compares with a more traditional DW approach. But Enterprises have some Redshift concerns, including:

– The difference between versions of PostgreSQL and the version Amazon uses with Redshift
– The scalability of very large data volume is limited and performance suffers
– The query interface is not modern, interface is a bit behind
– Redshift needs more flexibility to create user-defined functions
– Access to the underlying operating system and certain database functions and capabilities aren’t available
– Starting sizes may be too large for some use cases
– Redshift also resides in a single AWS availability zone

3. Amazon Redshift has several limitations: Limits in Amazon Redshift. On the other hand, you can hardly find a database feature not yet implemented by Oracle.

4. But the most important reason to migrate to ADWC is that the Oracle Autonomous Database Cloud offers total automation based on machine learning and eliminates human labor, human error, and manual tuning.

How to migrate from Amazon Redshift to Autonomous Data Warehouse Cloud?

Use the SQL Developer Amazon Redshift Migration Assistant which is available with SQL Developer 17.4. It provides easy migration of Amazon Redshift environments on a per-schema basis.

Here are the 5 steps to migrate from Amazon Redshift to Autonomous Data Warehouse Cloud:

1. Connect to Amazon Redshift
2. Start the Cloud Migration Wizard
3. Review and Finish the Amazon Redshift Migration
4. Use the Generated Amazon Redshift Migration Scripts
5. Perform the Post Migration Tasks

Check out what Paul Way says about why Oracle thinks Autonomous IT can ultimately win the Cloud War.

Finally, here is what Amazon CTO Werner Vogels is saying: Our cloud offers any database you need. And I agree with him that a one size fits all database doesn’t fit anyone. But mission and business critical enterprise systems with huge requirements and resource needs deserve only the best.

What is the best relational database?

In DB2 database, DBA, MySQL, Oracle database, PostgreSQL, SQL Server, Sybase on May 11, 2013 at 11:45

“I have the simplest tastes. I am always satisfied with the best.” Oscar Wilde

If we look at market figures, then “Gartner 2012 Worldwide RDBMS market share” reports 48.3 percent revenue share for Oracle.

RDBMS_Market_Share

Three major factors make the Oracle database a clear leader:

1. Remains #1 in worldwide RDBMS software revenue share
2. Holds a larger revenue share than four closest competitors combined
3. Leads next closest competitor revenue share by 29%

But why is that so? Let us have a look at the latest DB-Engines Ranking where Oracle is not only #1 in the DB-Engines Ranking of Relational DBMS but also in the Complete Ranking.

The DB-Engines Ranking measures the popularity of a system by using the following parameters:

– Number of mentions of the system on websites, measured as number of results in search engines queries.
– General interest in the system.
– Frequency of technical discussions about the system.
– Number of job offers, in which the system is mentioned.
– Number of profiles in professional networks, in which the system is mentioned.

What is the best relational database? The best answer given in answers.yahoo.com is the following: “Define “best”. Oracle is like a BMW. Expensive but has all the fixings. But not everyone needs a BMW. MySQL is like a VW Beetle (the old model). Its cheap, and gets you where you need to go. But you have to tweak it to suit your needs.” Nice explanation!

Next, let us look at Gartner’s Magic Quadrant for Data Warehouse Database Management Systems from January 31st, 2013:

Gartner_Magic_DW

The reason Oracle is top on the “Ability to Execute” scale is simple. It can be described by me with just one word: Exadata. Of all the vendors in this analysis, Oracle reports the highest incidence of nontraditional analytics customers: sectors such as hospitality, energy trading, life sciences and food distribution show up in its reference base. According to Gartner, many of the vertical markets where Oracle has the greatest success contain traditional implementers or late adopters of data warehousing. Oracle’s customers range in annual revenue from $100 million to over $10 billion.

The Business Technology Forum raised the same question: Which is the best enterprise RDBMS database? The article is very much to the point, coming to the conclusion that Oracle has the upper hand.

Mirror, Mirror on the Wall, which is the best RDBMS of all? You do not expect the Mirror to answer “It depends”, because anyone trying to be diplomatic will answer that way. Well, at least they say that Oracle is most scalable, most feature rich, and just a wonderful RDBMS.

More to read:

Why would you use Oracle database?
Why Oracle is the Preferred Database Platform?
Is Microsoft’s SQL Server really cheaper than Oracle?

The last one favors SQL Server over Oracle, but a word of caution about its credibility on the Oracle side: the author calls PL/SQL “P-SQL” and refers to Active Data Guard as “Advanced Data Guard”.

It will take DB2, SQL Server, Sybase, PostgreSQL and all other key RDBMS players at least several more releases before they can approach the current self-management and tuning capabilities of Oracle Database 11g. But when that day comes, Oracle will no longer be at version 11.2.0.

12c is knocking on the door.

Oracle12c