Hmmm, still busy setting up a baseline environment to do some serious XMLDB performance / architecture testing. I am almost at the point where I can really start, but until now I have run into a lot of issues… The process also taught me that the MediaWiki dump file is a good set of XML data to work with: a lot of variance is built in. I hope I won’t encounter too many more issues though…
Sometimes you will want to load data from huge XML files into the database. So how do you achieve this?
There is more than one way to achieve this, but most of the time a “SAX parser” is used. Wikipedia describes the term SAX as follows:
The occasional disadvantage of using DOM is that it consumes too many resources in terms of CPU and memory; for really huge files this method simply will not work performance-wise. Parsing an XML document with DOM requires the whole document to be loaded into memory before processing can start. With SAX, only a small memory footprint is needed.
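To make the difference concrete, here is a minimal sketch in Java using the JDK’s built-in SAX support (javax.xml.parsers and org.xml.sax). It streams through a MediaWiki dump and reports page titles without ever holding the whole document in memory. The file name pages-articles.xml is just an assumption for illustration, and the println call stands in for where a real loader would batch INSERTs into the database.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;
import java.io.File;

public class DumpStreamer {
    public static void main(String[] args) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();

        // The handler receives events as the parser streams through the
        // file; only the current element's text is buffered in memory.
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private long pageCount = 0;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                text.setLength(0); // reset buffer for each new element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName,
                                   String qName) {
                // A MediaWiki dump wraps each article in a <page> element,
                // with the article name in a <title> child element.
                if ("title".equals(qName)) {
                    // Placeholder: a real loader would flush batched
                    // INSERTs to the database here instead of printing.
                    System.out.println("Title: " + text);
                }
                if ("page".equals(qName)) {
                    pageCount++;
                }
            }

            @Override
            public void endDocument() {
                System.out.println("Pages seen: " + pageCount);
            }
        };

        // Hypothetical file name; point this at your own dump file.
        parser.parse(new File("pages-articles.xml"), handler);
    }
}
```

Because SAX is event-driven, the memory footprint stays flat no matter how large the dump grows; the trade-off is that you have to track any state (like the current element’s text) yourself.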