Category: Performance

July 20

While setting up a baseline for my XMLDB performance tests, I noticed that a “count(*)” on a Binary XML table (using Securefile LOB storage) called “WIKI_STAGE” took an awfully long time. So long, in fact, that I had to kill the SQL*Plus session that was executing the “count(*)”. I started wondering: why did it take so long to come up with the result?

In the end, thanks to good advice from Jonathan Lewis, I came up with a solution (although probably an unsupported one) and a better understanding of the mechanics involved. As a side effect, it also triggered a really good discussion about “counting” on the “Oracle-L” freelist.

But let’s start from the beginning…

July 15

I just read Thomas Kyte’s blog post “Read This”, which discusses a blog post by Cary Millsap. As Tom phrased it:

I liked what Cary Millsap just said:

I don’t mean “show and tell,” where someone claims he has improved performance at hundreds of customer sites by hundreds of percentage points [sic], so therefore he’s an expert.

I mean show your work, which means documenting a relevant baseline measurement, conducting a controlled experiment, documenting a second relevant measurement, and then showing your results openly and transparently so that your reader can follow along and even reproduce your test if he wants to.

This is more or less funny, because I read Cary’s post, but apparently I didn’t really read it… I can really relate to it now.

I am in the middle of setting up an XMLDB test environment to test, among other things, load times for different kinds of XMLType storage: CLOB, Object Relational and Binary XML (using the Basicfile / Securefile options). And although I am working in a VMware environment, I noticed that it isn’t that easy to set up a “controlled experiment”. What makes it harder is that I am using the Mediawiki XML English dumpfile, which contains roughly 7 million records (17 GB of ASCII data). This makes it more interesting, and the effects clearer, but it also takes much more time to do stuff.

July 11

Sometimes you will want to load data from huge XML files into the database. So how do you achieve this?

There is more than one way to achieve this, but most of the time a “SAX parser” is used. Wikipedia describes SAX as follows:

A Simple API for XML (SAX) is a serial access parser API for XML. SAX provides a mechanism for reading data from an XML document. It is a popular alternative to the Document Object Model (DOM).

The disadvantage of DOM is that it can use too much CPU and memory, and for really huge files this method simply will not perform. Parsing an XML document with DOM requires the whole document to be loaded into memory before processing can start. SAX, by contrast, reads the document as a stream of events, so only a small memory footprint is needed.
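To make the difference concrete, here is a minimal sketch in Python using only the standard library. It is not the loader used for these tests. The tiny sample document and the names `PageCounter`, `count_pages_sax` and `count_pages_dom` are made up for illustration, and the `<page>` / `<title>` element names are an assumption about the dump layout. The SAX version reacts to events as they stream past and keeps only a counter and the titles. The DOM version has to build the whole tree first.

```python
import io
import xml.sax
from xml.dom import minidom

# Hypothetical miniature stand-in for a huge dump file (assumed element names).
SAMPLE = b"""<mediawiki>
  <page><title>Alpha</title></page>
  <page><title>Beta</title></page>
  <page><title>Gamma</title></page>
</mediawiki>"""


class PageCounter(xml.sax.ContentHandler):
    """Counts <page> elements and collects <title> text as events arrive."""

    def __init__(self):
        super().__init__()
        self.pages = 0
        self.titles = []
        self._in_title = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "page":
            self.pages += 1
        elif name == "title":
            self._in_title = True
            self._buf = []

    def characters(self, content):
        # Text may arrive in several chunks, so buffer it until the end tag.
        if self._in_title:
            self._buf.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buf))
            self._in_title = False


def count_pages_sax(stream):
    """Streaming approach: the document is never held in memory as a whole."""
    handler = PageCounter()
    xml.sax.parse(stream, handler)
    return handler.pages, handler.titles


def count_pages_dom(data):
    """DOM approach: the full tree is built before anything can be counted."""
    doc = minidom.parseString(data)
    return len(doc.getElementsByTagName("page"))


sax_pages, sax_titles = count_pages_sax(io.BytesIO(SAMPLE))
dom_pages = count_pages_dom(SAMPLE)
print(sax_pages, dom_pages, sax_titles)
```

Both approaches give the same answer on a small document. With SAX, though, memory use depends on what the handler chooses to keep rather than on the size of the input. In a real load, a handler would typically pass each record on (for example, to a database insert) rather than collecting titles in a list.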