On DBA innovation: who is afraid to fail will keep falling forever

In Cloud, Database tuning, DBA, Exadata, Grid Control, Oracle database on April 24, 2012 at 19:06

Managers always ask software engineers to deliver something sooner rather than waiting to deliver everything later.

How come it is fine to deliver an incomplete, low-quality IT product just because it is delivered on time? On several occasions, most of us have had our creativity, innovation and proactiveness limited by a deadline and a time schedule.

Innovation in database management and database administration has nothing to do with DBAs from whom you often hear phrases like “If it ain’t broke, don’t fix it” or DBAs who blindly follow the KISS principle.

In DBA terms, innovation is the process of introducing and implementing new features in the database and using new options and database products.

A good example is the adoption of Oracle Enterprise Manager Grid/Cloud Control. I have seen so many excuses for not implementing it or for delaying its implementation. Yet the benefits and savings of an OEM implementation can be measured in multiples!

Do you wonder how quickly and efficiently one can find out all the details of a certain SQL statement without Cloud Control? No other database brand has such a sophisticated tool for performance analysis as Oracle has. See all the details offered with one mouse click: Exadata Cell Offload Efficiency (96% in this case), Wait Activity in detail, use of the Result Cache, etc. All that from a single screen!

Accepting new innovative database properties, tools and appliances is hard for many IT architects, IT managers and most of all for DBAs who have the primary responsibility to test, verify and promote these features.

Let us have a look at one other innovation from Oracle. Implementing Exadata can be strongly considered if any of the following 5 points are in the IT roadmap:

1. Implementing a new Business Intelligence solution
2. Data warehouse licenses are up for renewal
3. Database platform consolidation exercise
4. Storage requirements are increasing and the performance is decreasing
5. Performance of transactional systems requires major improvements

Look at the list below and think how many of these are used by your company or client:

– Exadata
– Enterprise Manager Cloud Control
– Edition-Based Redefinition
– Advanced Compression
– Oracle Database Appliance
– Automatic SQL Tuning
– Total Recall
– Real Application Testing
– One of these init.ora parameters: db_ultra_safe, result_cache_mode, optimizer_capture_sql_plan_baselines, awr_snapshot_time_offset

Although I did not put Enterprise Manager Cloud Control on top of the list, it is still a must for every enterprise using Oracle products. On the light side, I was recently asked if you can see the temperature of a given computer from OEM 🙂 Here we go:

Jon Taplin said it very well in his article on Steve Jobs and Innovation: “At the Innovation Lab we try to inculcate the notion that you can’t be afraid. You can’t be afraid to fail. You can’t be afraid to “be different”. You can’t be afraid to celebrate the weird mix of art and science that is true innovation. Steve Jobs embodied all of those qualities. I wrote a bit about him in my new book and there is a cool video in the book of his graduation speech at Stanford that you will see replayed too often in the next few days.”

As a DBA, one should try to step out of one's comfort zone of everyday administrative tasks and reactive performance tuning work, and try to make a difference in the enterprise by acting more like a Database Architect than a Database Operator. Prove the complexity and importance of the DBA role!

P.S. Often in IT, the way from Insight to Action is longer than the Way of St. James.

▶ 2 Responses

Such a pity it’s dependent on finding managers who are:

a) understanding of “the complexity and importance of the DBA role”, and…
b) willing to allow innovation from “below” and accepting of the possibility of failure

Still, we live in hope 🙂


Steve 1 May 2012 at 2pm
OEM and other enterprise performance monitoring tools fail big time in delivering consistent infrastructure performance monitoring metrics. There are a lot of reasons:

1. Missing raw data. Vendors don’t want you to have access to raw data, since they like making a sale. Keeping long historical data means you could easily move to whatever front end you like to analyze it. Everything is aggregated and consolidated, and OEM is no exception to this.

2. Server infrastructure monitoring: overall CPU, memory, disk I/O and network I/O utilization, coupled with run-queue length and other metrics that indicate starvation or some sort of queueing.

Can you obtain the following from OEM, say for a Linux-based operating system:

cpupct: CPU utilization averaged across all CPUs, always <= 100%, percentage, gauge
sumpct: sum of all CPUs utilization always <= N*100%, percentage, gauge
headpct: headroom CPU available, all CPUs, always = N*100% – sumpct, percentage, gauge
userpct: CPU Utilisation User space in percent, averaged across all CPUs
nicepct: CPU utilization User space with nice priority, averaged across all CPUs
syspct: CPU utilization System space, averaged across all CPUs
idlepct: CPU Utilisation idle state, averaged across all CPUs
iowaitpct: CPU Percentage in idle state because an I/O operation is waiting to complete
irqpct: CPU Percentage servicing interrupts
softirqpct: CPU Percentage servicing softirqs
stealpct: CPU Percentage of time spent in other operating systems when running in a virtualized environment
runqsz: run queue length, number of tasks waiting for run time
plistsz: number of tasks in the task list
memusedpct: size of used memory in percent, gauge
memused: size of used memory in kilobytes, gauge, extended
memfree: size of free memory in kilobytes, gauge, extended
memtotal: size of memory in kilobytes, gauge, extended
buffers: size of buffers used from memory in kilobytes, gauge, extended
cache: size of cached memory in kilobytes, gauge, extended
realfree: size of memory that is really free, memfree+buffers+cached, gauge
realfreepct: size of memory that is really free in percent of total memory, gauge
swapusedpct: size of used swap space in percent, gauge
swapused: size of swap space that is used in kilobytes, gauge, extended
swapfree: size of swap space that is free in kilobytes, gauge, extended
swaptotal: size of swap space in kilobytes, gauge, extended
swapcached: memory that once was swapped out, is swapped back in but still also is in the swapfile, gauge, extended
readReq: total disk read requests, counter
writeReq: total disk write requests, counter
totReq: total disk read+write requests, counter
readByt: total read bytes / sec, in KB
writeByt: total write bytes / sec, in KB
totByt: total read+write bytes / sec, in KB
rxByt: total received bytes /sec, in KB
txByt: total transmitted bytes /sec, in KB
ntByt: total received + transmitted bytes /sec, in KB
rxerr: number of errors that happened while receiving packets, per second
txerr: number of errors that happened while transmitting packets, per second
rxdrp: number of rx packets that were dropped per second
txdrp: number of tx packets that were dropped per second
avg_1: load average over the last minute
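As a rough illustration of how the first three CPU figures in the list relate to each other, here is a minimal Python sketch. It assumes the raw numbers come from two samples of the per-CPU `cpuN` lines in `/proc/stat` on Linux, and that iowait is counted as idle time (in line with the iowaitpct description above). The function names `parse_proc_stat` and `cpu_metrics` are made up for this example and are not part of OEM or any other tool.

```python
# Sketch only: derive cpupct, sumpct and headpct from two samples of the
# per-CPU jiffy counters found in the "cpuN" lines of /proc/stat.
# Assumed field order: user nice system idle iowait irq softirq steal.

FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_proc_stat(text):
    """Return {cpu_name: {field: jiffies}} for the per-CPU lines only."""
    cpus = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        name = parts[0]
        # Keep "cpu0", "cpu1", ... and skip the aggregate "cpu" line
        # as well as unrelated lines such as "intr" or "ctxt".
        if name.startswith("cpu") and name[3:].isdigit() and len(parts) > len(FIELDS):
            counters = [int(value) for value in parts[1:1 + len(FIELDS)]]
            cpus[name] = dict(zip(FIELDS, counters))
    return cpus


def cpu_metrics(before, after):
    """Compute utilisation percentages between two parsed samples."""
    busy_pcts = []
    for name, new in after.items():
        old = before[name]
        delta = {field: new[field] - old[field] for field in FIELDS}
        total = sum(delta.values())
        if total == 0:
            continue  # no ticks elapsed for this CPU between the samples
        # Assumption: iowait is treated as idle time.
        idle = delta["idle"] + delta["iowait"]
        busy_pcts.append(100.0 * (total - idle) / total)
    n = len(busy_pcts)
    sumpct = sum(busy_pcts)
    return {
        "cpupct": sumpct / n if n else 0.0,  # averaged across CPUs, <= 100
        "sumpct": sumpct,                    # summed across CPUs, <= N*100
        "headpct": n * 100.0 - sumpct,       # remaining headroom
    }
```

To use it, read `/proc/stat` twice a few seconds apart, parse both reads with `parse_proc_stat`, and pass the two results to `cpu_metrics`.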

These are all operating system metrics, not database metrics!
OEM cannot report such figures unless it is customized and probes are deployed to collect them.
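To give an idea of what such a probe might look like for the memory figures, here is a minimal Python sketch. It assumes the input is the text of `/proc/meminfo` on Linux and that "used" simply means total minus free, which is an assumption of this example. The names `parse_meminfo` and `memory_metrics` are hypothetical.

```python
# Sketch only: compute a few of the memory gauges listed above from
# /proc/meminfo-style text, where each line looks like "MemFree:  2048 kB".


def parse_meminfo(text):
    """Return {key: kilobytes} for every line that carries a number."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_metrics(info):
    """Derive used, really free and percentage figures, all in kilobytes."""
    total = info["MemTotal"]
    free = info["MemFree"]
    buffers = info.get("Buffers", 0)
    cached = info.get("Cached", 0)
    used = total - free  # assumption: used = total - free
    realfree = free + buffers + cached  # memfree+buffers+cached, as in the list
    return {
        "memtotal": total,
        "memfree": free,
        "memused": used,
        "memusedpct": 100.0 * used / total,
        "buffers": buffers,
        "cache": cached,
        "realfree": realfree,
        "realfreepct": 100.0 * realfree / total,
    }
```

Called as `memory_metrics(parse_meminfo(open("/proc/meminfo").read()))`, it returns one sample that a collector could append to a raw data file at a fixed interval.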

3. How many months of such data from OEM can you store cheaply on commodity disks?

Three simple questions where OEM fails to deliver, since it was built from the start for a different purpose.

As I have said many times, the computer industry is not serious when it comes to performance monitoring and data analysis. Compare our computer industry to:

Aerospace industry: FDR. Airplanes, for example, use a recorder called a flight data recorder (FDR) to store aircraft data parameters. Such a unit is found by default on many airplanes nowadays and is regulated by clear rules and standards enforced by governments and federal institutions, for example the FAA in the United States. This device is sometimes referred to as the black box.

Shipbuilding industry: VDR. Ships, boats and other types of vessels use a recorder called a voyage data recorder (VDR) to store vessel data parameters. As in the aerospace industry, such devices are required when a vessel must comply with international standards, for example the International Convention for the Safety of Life at Sea (SOLAS). Used mainly for accident investigation, the VDR can also serve preventive maintenance, performance efficiency monitoring, heavy weather damage analysis, accident avoidance and training purposes, to improve safety and reduce running costs. This device is sometimes referred to as the black box.

Auto industry: EDR. Automobiles use a device called an event data recorder (EDR) to store vehicle parameters. Again, such devices can serve as the main source for accident investigations. EDRs are not enforced by any standards organization and are not really required by law, so their usage varies from vendor to vendor. The National Highway Traffic Safety Administration (NHTSA) proposed a series of changes to standardize and enforce mandatory EDR installation and usage by vendors. By around 2010, over 85% of all vehicles in the US would already have some sort of EDR installed.

Computer industry: None. Computers, mainframes, servers and workstations have no such recording devices installed. Manufacturers are not interested in standardizing this effort, since they prefer selling additional software packages that perform such recording for an extra cost. The lack of standardization and agreement between vendors has resulted in a completely different picture from other industries. Currently, there are hundreds of performance monitoring solutions for computer systems.


sparvu 11 June 2012 at 3pm


Julian Dontcheff's Database Blog