Dontcheff

Archive for February, 2014|Monthly archive page

DBA skills are among the Top 10 IT skills for 2014

In Data, DBA on February 24, 2014 at 15:27

Recently, IT Business Knowledge and Global Knowledge published the Top 10 IT skills for 2014. In survey order, they are: Programming and Application Development, Help Desk and Technical Support, Networking, Mobile Applications and Device Management, Project Management, Database Administration, Security, Business Intelligence/Analytics, Cloud and Interpersonal.

I_LOVE_MY_DBA

“Though database administration missed the 2013 list, it will be quite hot in 2014. That stems from the growing interest in big data. Organizations have been gathering huge amounts of information from websites, social media, and third-party arrangements. Now they want to be able to use that data to make better decisions on products, services, and their customers.”

Although the above quote is somehow correct, there are definitely other factors that make the DBA profession being on top of the lists of hiring and salary surveys.

According to an estimate, 90% of the data in the world today has been created in the last two years alone. I think this is based on the estimate that every day we create 2.5 quintillion bytes of data. It is more like that (on top of all the social media and unstructured data) we have created IT systems with logging and auditing processes that generate a significant amount of that data.

Several colleagues of mine, including me, are fascinated by the fact how much unnecessary data is being logged into corporate databases. I would say more than 80% is usually either audited/logged data or data that is not being accessed/used. However, according to Red Hat CEO Jim Whitehurst the advice is to capture everything you can. It is a very interested view on big data’s chicken-and-egg question, I do recommend it.

Of course, we cannot capture it all. For example, there is no computer/storage in the world that can record the processes that take place in the human body within 1 second. Estimates of the human brain’s memory capacity vary wildly from 1 to 1,000 terabytes. For comparison, all the 19 million volumes in the US Library of Congress represents about 10 terabytes of data.

BDA

Data!

The growing amount of data is just a consequence of the complex data and database administration that comes with new technologies and software options that rather increase the complexity of IT systems. Increasing business and legal demands require companies to implement them. Keeping unstructured data outside the database (as the easier option) also increase the complexity. It takes extra skills and persuasion by data architects and DBAs to convince IT organization to seriously take the concept of keeping unstructured data with (relational) databases. I do recommend Marcelle Kratochvil’s book Managing Multimedia and Unstructured Data in the Oracle Database.

Only taking into use Audit Vault and Total Recall in an Oracle databases, might often cause exponential data growth. When such options are implemented in conjunction with other data and database mechanisms such as compression, partition, deduplication, redaction and encryption, one really needs a very experienced DBA (rather a team of DBAs) to support and maintain this growing complexity. Not to mention when such databases are being replicated, have standby copies, patched or part of an EXA environment. On top of that the constantly growing requirement of infinite availability makes the DBA skills very, very hot.

Database Administrator Average Starting Salaries are growing year by year, the 2 year increase is often more than 10%:

DBA_startig_salary

Add 5% for IBM DB2 database skills, 9% for Oracle database skills and 10% for Microsfot SQL Server database skills.

In addition, 2 from the Top 10 highest paying IT certifications for 2014 are database related.

Recently Forbes published the Top Jobs for 2014. Database Administrators are among the Top 10 with:

Total employment in 2013: 119,676 jobs
Jobs added between 2010 and 2013: 11,241 (up 10%)
Median hourly earnings: $37.39

Last but not least, according to Sweet Top 10, the Database Administrators are among the top 10 highest paying jobs in 2014. They also note that the rise of big data and the collection of massive amounts of data greatly increases the demand for database administrators.

DBA

Oracle Exadata myths and misconceptions

In DBA, Exadata, Oracle database on February 9, 2014 at 04:47

Exadata experience, what does that actually mean? by Martin Bach and Friday Philosophy – I Am An Exadata Expert? by Martin Widlake raise few questions on how is an Exadata DBA different than an Oracle DBA.

This chart provides the 3-month moving total of permanent IT jobs citing Oracle Exadata within the UK as a proportion of the total demand within the Database and Business Intelligence category.

Exadata_demand

Clearly, the demand is high and growing. Several Oracle DB experts are now also Exadata experts. There are various internet blogs and discussion forums dedicated to Exadata. Event conferences: E4.

The aim of this blog is to point out few Oracle Exadata myths and misconceptions I have encountered and discussed with Exadata experts during the past year or so.

1. “When logged into the database (SQL*Plus, SQL Worksheet, etc.) there is no way to see if the database is an Exadata one or not.” Actually there is. If select count(*) from (select distinct cell_name from gv$cell_state) returns a number > 0, then the database runs on Exadata storage. This SQL is also used by the SYS.DBMS_FEATURE_EXADATA procedure:

gv_cell_state

Number 7 tells us also that we are using a half Exadata rack. Full X4 Exadata rack has 14 cells. Full T5-8 SuperCluster rack has 8 cells.

2. “Flash Cache compression is enabled by default.” No, it is not. The reason is that clients must have licensed the advanced compression option on all databases that access the compressed flash cache.

You can enable using the Flash Cache compression by the cell command “alter cell flashCacheCompress=TRUE”. On an X3 Exadata you should also run “alter cell flashCacheCompX3Support= TRUE”.

Note that the Flash Cache compression is supported on X3 or X4 storage servers and it requires F40 or F80 cards. X4 Flash Cache compression is expanded to 88TB (raw) per rack. Monitor the Flash Cache compression by using the cell metric FC_BY_USED!

3. “Exadata is always RAC.” No, it is not. In fact, consider carefully if you want to use the RAC option on your Exadata. Ask for second opinion too! Ask for third opinion as well. You can have more databases on a quarter or 1/8th rack if you do not use the RAC option. And if someone comes with the HA excuse to use RAC, ask if (s)he can walk and run on one foot when the other one is injured. With well implemented instance caging one can run even close to 100 database instances on one database node (yes there are such cases). Often simplicity brings better results.

4. “DBAs should let Oracle decide on what is placed in the Flash Cache.” In general yes but I have seen significant improvement when frequently used tables are manually kept in the Flash Cache. Although the Flash Cache is fully automated for reads and writes, DBAs can decide on object caching by setting cell_flash_cache_keep. The granularity is all the way to the object level, where you can define higher priority for caching (KEEP) or not to cache at all (NONE). The flash cache can differentiate between full table scans and single I/O reads.

Cached:

– Frequently accessed data and index blocks
– Control file reads and writes
– File header reads and writes

Not cached:

– I/Os to mirror copies
– Backup-related I/O
– Data Pump I/O
– Data file formatting
– Table scans do not monopolize the cache

Check Uwe Hesse’s article about the Exadata The Flash Cache.

5. “The bigger SGA the better.” Although I have often seen how increasing the SGA can really boost the application performance, this is not always the case with Exadata. Why? In order to offload the work to the sells, the table should be enough big. But what is meant by a big table?

“_small_table_threshold” is sized in units of database blocks, and is supposed to be set by default to 2% the size of the database buffer cache (1.5% in 12c according to my tests) and _very_large_object_threshold = 500 (value for 12c)

Tables with size > 5 x “_small_table_threshold”, are being treated as ‘large tables’ for their reads result in direct reads. It is rather unclear how these things work for tables with size between 2–5 times the small table threshold. Note that 12c has 108 underscore parameters for different threshold values!

Check out these 2 article:

Smart Scan: Why is “_small_table_threshold” Important?
Exadata Optimization Tips

cell_smart_table_scan

6. “Storage indexes can be managed and maintained.” I wish that could be possible. Kellyn says in the above mentioned paper Exadata optimization tips: “Do not count on storage indexes. The feature, when implemented by Oracle is one requiring the choice of a complicated set of algorithms. They can be your savior or your devil.” Well, Oracle tries to maintain storage index summaries during loads but it is just a best effort – not the most optimal given the memory constraint and the characteristics of writes during loads.

Plus storage indexes are not persistent: after cell reboot the very first query will not benefit. In simple words, whoever comes first to the office on Monday morning (after weekend’s cell maintenance) might expect better performance from his/her reports.

You can disable/enable the storage indexes at cell level:

alter cell events = “immediate cellsrv.cellsrv_storidx(‘disable’, ‘ALL’, 0, 0, 0)”;
alter cell events = “immediate cellsrv.cellsrv_storidx(‘enable’, ‘ALL’, 0, 0, 0)”;
alter cell events = “immediate cellsrv.cellsrv_storidx(‘purge’, ‘ALL’, 0, 0, 0)”;

storage_indexes

I would really like to have a dynamic view showing the existing storage indexes!

7. “Oracle Database 12c running on Exadata Database Machine requires Exadata version 12.1 or later.” This is the recommended approach but you can still use Oracle Database 12.1.0 with Exadata version 11.2.3.3 or Exadata version 11.2.3.2.1.

Oracle Database 12c running on Exadata 11.2 is subject to the following restrictions:

– The 12.1 offload libraries are not present on Exadata version 11.2. Therefore, smart scan offloaded filtering and storage indexes are disabled for 12.1 databases. Smart scans are still issued, and the initiation of IOs is offloaded, but complete blocks are returned instead of just the selected rows and columns. 11.2 databases on the same system have full offload capabilities.
– I/O Resource Management (IORM) plans for 12.1 database are not enforced.
– Database resource plans (intradatabase resource plans) set on 12.1 databases are not enabled on cells.
– An interdatabase plan cannot manage 12.1 databases, therefore an interdatabase plan cannot be used to manage flash log or flash cache for a 12.1 database.
– Other Exadata features like Smart Flash Cache, Smart Flash Log, Hybrid Columnar Compression, and Low Latency and High Throughput InfiniBand messaging continue to work with 12.1 databases. Note that these restrictions only effect performance of 12.1 databases.
– All 12c database functionality, such as Pluggable Databases, is fully available.

A final comment: I see often the power of Exachk tool is somehow underestimated. Use also the ExaWatcher utility which is located in each Exadata database server and storage cell under /opt/oracle.ExaWatcher/. ExaWatcher replaces OSWatcher in Exadata software versions 11.2.3.3 and up.