DBA 2.0 What’s in a name…
I have the utmost respect for Graham Wood and the team behind all the technology and expertise that goes into DB Console and Oracle Enterprise Manager (OEM). They have done a great job over all those years and have given us great tools to work with, like ASH, AWR and ADDM.
…but…this presentation by Graham, with his “DBA 1.0” and “DBA 2.0” talking about (and demonstrating) all the great features of OEM, really, really irritated me. The world isn’t black and white, and even if it were that extreme, the whole world would have an opinion about it. I don’t know, maybe that is also the goal of this presentation: getting people to talk about it.
The presentation starts with Graham pointing out that OEM (AWR, ASH, ADDM, etc.) is a great tool and, IMHO, he is absolutely right. Instead of doing the “normal” Oracle presentation thing, he presented two DBAs.
DBA 1.0 is an old-timer who does his thing via shell scripting, looking at operating system stats, using his own SQL scripts, etc. The DBA 2.0 guy is busy handling mail problems and, in between, deals with the problems at hand in the Oracle database via the OEM console.
They were both given 2 problems and had to solve each of them within 5 minutes (an alarm on the top right of the presentation screen was there to keep track of time). So far so good.
The first problem was a CPU-bound Oracle problem, which was clearly visible via TOP on the Linux system. So the 5 min. alarm was set and the DBA 1.0 guy tried to solve the problem: what is causing this and what could be done about it? He explained what he saw in TOP: the 2 top sessions were Oracle processes. While explaining why he was doing what he was doing, he went into the database and cross-referenced their process IDs with data he gathered from views based on the Wait Interface. Then he identified the SQL statements he thought were the root cause and ran an explain plan, which was huge…and then time was up.
The DBA 2.0 guy entered the stage and was given the same task.
The alarm was set for 5 min. again and the DBA 2.0 guy, apparently very confident, logged into OEM, had a quick look at the real-time performance overview and was (clearly) presented with diagrams that showed a CPU issue. He drilled down and went straight to the SQL statements that were the root cause (the same two that were pointed out by the DBA 1.0 guy). Not only that, he also ran the query advisor to see what could be done. OEM then presented him with a solution, implementing a profile, which he immediately did. He checked afterward that the CPU issue had been solved and went on with his mail issues…
The second problem, also with a 5 min. limit, was a production problem where an application change had taken place the night before. The change had to be reversed due to the huge performance issues that followed, and the database was bounced.
So when the DBA 1.0 guy was asked to investigate the problem, he pulled the Statspack data for the period just before the database was bounced and started looking at it. While doing that he explained in great detail what he was reading and came to the conclusion that some SQL statements were probably not OK.
The DBA 2.0 guy, back again from his mail issues, pulled up the historical overview, looked into the ADDM data from just before the database bounce and found some SQL statements that were using string literals. Solutions for those statements were reported, and the general advice stated on that OEM web page was to set the database parameter “cursor_sharing” to the value “force”.
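To make the string literal finding a bit more concrete, here is a minimal Python sketch of why it matters. It is a toy model, not Oracle code: the "cursor cache" is just a set of distinct SQL texts, the table and column names are made up, and the `:b0`, `:b1` placeholder names and the regular expression are my own simplification of what literal replacement roughly does. The idea it illustrates is that statements differing only in their literals each count as a new statement, while the same statements with the literals swapped for placeholders collapse into one shareable statement.

```python
import re

# Matches a single-quoted string literal or a standalone number.
# Deliberately simplistic: an assumption for illustration, not Oracle's parser.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def replace_literals(sql):
    """Swap every literal for a numbered placeholder (:b0, :b1, ...).

    The placeholder names are hypothetical; they only show the principle.
    """
    counter = 0

    def _placeholder(_match):
        nonlocal counter
        name = f":b{counter}"
        counter += 1
        return name

    return _LITERAL.sub(_placeholder, sql)


def count_distinct_statements(statements):
    """Toy stand-in for a cursor cache: one entry per distinct SQL text."""
    return len(set(statements))


# Hypothetical application SQL that embeds values as literals.
literal_sql = [
    f"SELECT * FROM orders WHERE customer_id = {cid} AND status = 'OPEN'"
    for cid in (101, 102, 103)
]

# The same statements after literal replacement.
shared_sql = [replace_literals(stmt) for stmt in literal_sql]

print(count_distinct_statements(literal_sql))  # one entry per literal value
print(count_distinct_statements(shared_sql))   # one shared entry
print(shared_sql[0])
```

Whether forcing this behavior database-wide is the right fix, instead of changing the application to use bind variables, is exactly the kind of judgment the tool cannot make for you.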
All Bar One
I had the honor and pleasure of talking about the presentation with Mr John Beresniewicz (he was the “DBA 1.0”) afterwards in the “All Bar One”, and about why it was irritating me. The tone of the “DBA 2.0” presentation very much suggested that a DBA (the DBA 2.0) with no knowledge of the system, or even any proper in-depth knowledge of how an Oracle database works, could use OEM and implement any advice it presented him with. A tool in the wrong hands can do a lot of damage.
As John explained, this was definitely not the message they wanted to convey. The message should have been that an experienced DBA 1.0 guy could use the OEM tool to his advantage and be more effective and productive while using it. That was the part the DBA 2.0 guy should have played, but it didn’t come across properly because they used a stand-in, and that guy played his part with so much enthusiasm that the proper message got lost.
We agreed that OEM is a great tool and that it got even better with AWR, ASH, ADDM, etc. The interface could have been better, and it would be really, really cool if the data from all the participating systems, like the OS and, for example, (Oracle) Application Servers, could be taken into the equation to make the whole system faster or to detect a root cause more easily; for example, a middle-tier application server that is hammering the database. These are things you can’t do with OEM currently. For that you need an experienced DBA.
It also will not solve the human factor: is a report allowed to run for 2 minutes, or should it normally finish in less than 5 seconds…although you could probably also feed OEM with that kind of data. Software is still built by humans, so it will contain bugs…
It was pointed out to me in the bar, by JB and Doug, that I should put my money where my mouth is, so… and not drop the post… The dangers of using humor while doing a presentation: sometimes it will go wrong. It would probably be better to have a backup plan in place for the moment you see it happening…otherwise your message gets lost…and that would be a shame in this case.
Oracle Enterprise Manager can be a great tool if used wisely and under the proper circumstances.
And just so you know: I am a “DBA 1.0” who evolved into a “DBA 2.0” 10 years ago… A database never stands alone.