OOW 2011 – Oracle XML DB and Big Data

It is the last day of Oracle Open World and I am attending the final presentations. The first one, “Oracle XMLDB: A noSQL Approach to Managing all your Unstructured Data”, deals with the NoSQL approach and with using Oracle XML DB for “Big Data”, that is, unstructured data. The title of the presentation is “a bit” misleading because of its reference to NoSQL data handling. XML is used for structured (data-centric), semi-structured and unstructured (document-centric) data. Thanks to its flexibility, XML can bridge those forms of content. Via the XDB repository, XMLType storage and XMLIndex, that content can be moved into the XML DB part of the Oracle database, where it is mapped and categorized. You can use repository events to shred and filter the content while it is coming in via FTP or WebDAV. All in all, the presentation covered a lot of already known XMLDB functionality and not really how to use it with huge amounts of unstructured data.

My last presentation of the day, and of Oracle Open World, was the one on the FedEx XMLDB implementation on Exadata. They use XML to streamline multiple data streams with different kinds of content, and for its ability to express hierarchy, dependencies and rules on the data. Overall, XML is much easier to parse and validate than a flat file, standard SQL operations are available, multiple programming languages are supported and XSLT is available to handle the data. Storage in XMLDB was chosen for the advantage of storing the XML natively, without writing a complex application to parse it into many relational tables. In principle, the idea is to apply standards instead of creating your own solution, and therefore to spend more development time on solving the business problem instead of building the XML data processing methods yourself.
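To illustrate the "easier to parse and validate than a flat file" argument, here is a minimal sketch in Python using only the standard library. The shipment document, its element names and the "every package needs a weight" rule are my own invention for illustration; they are not taken from the FedEx presentation, and this is client-side parsing, not what XMLDB does inside the database.

```python
import xml.etree.ElementTree as ET

# Hypothetical shipment document: element and attribute names are invented
# for illustration and do not come from the presentation.
SHIPMENT_XML = """
<shipment id="S1">
  <package id="P1"><weight unit="kg">2.5</weight></package>
  <package id="P2"><weight unit="kg">4.0</weight></package>
</shipment>
"""


def package_weights(xml_text):
    """Return {package id: weight}; raise ValueError if a package has no weight."""
    root = ET.fromstring(xml_text.strip())
    weights = {}
    # The hierarchy (shipment -> package -> weight) is explicit in the markup,
    # so no column offsets or delimiters have to be agreed upon up front.
    for package in root.findall("package"):
        weight = package.find("weight")
        if weight is None:
            # Assumed business rule: every package must carry a weight.
            raise ValueError(f"package {package.get('id')} has no weight")
        weights[package.get("id")] = float(weight.text)
    return weights


weights = package_weights(SHIPMENT_XML)
total_weight = sum(weights.values())
```

The point is only that the structure and the rule live in a standard format with a standard parser, which is the same reasoning the presenters gave for not writing their own parsing layer.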

Currently the system uses XMLType object-relational storage with sub-partitioning and hash-partitioned nested table indices on Oracle 11.2.0.3.0, storing its data in an 11g four-node RAC Linux environment.

All in all, now that they are on Oracle 11g, they are also better able to move to the XQuery standards, instead of relying on proprietary solutions, as their system moves forward.

The second part of the presentation dealt with Exadata testing in comparison with a FedEx-like setup, and with its findings.

A second use-case test, based on XBRL data, was set up on Exadata and compared with a non-Exadata machine. A curious point was that this setup was done on a “single-node” Exadata machine.

I liked this last presentation, if only because it had some useful and, at least for me, interesting information on how far you can push some of the use-case requirements in the XMLDB realm of things.