Oracle 11g – XMLType Storage Options

Though also a recap on presentations of Oracle Open World 2006, where I pinpointed on XML and XMLDB sessions, now it is time to go a little bit further into these great new features that will be available in Oracle 11g regarding XMLDB functionality. One of the big steps forward to full exploitation of these features is the new Binary XML storage.

Until XMLDB one could make use of (C)LOB or varchar2 datatypes to store XML data, both have their limitations regarding storing XML. The introduction of the XMLType datatype / XMLDB features, formed the base of what has become full support of almost every W3C XML standard out there. This datatype allowed storing XML data via “Schema based” or “Schema Less” principles; with or without references to a XML Schema.

XML data can be stored / presented via the standard relational tables. This can be achieved for XMLType Views. If XML data is based on the XMLType data type then storage of XML data could result in Object Relational storage or CLOB storage. A mixture of the two is sometimes also possible and is called “Hybrid” storage. XML data can be stored in a column or as an XMLType table.

The following great overview (taken from the XMLDB Developers Guide for Oracle 11gR1 beta), shows the possible XML data storage options.

XMLDB Storage Options for XML data

Click on the picture to enlarge

Now in 11g, besides CLOB (Unstructured), Object Relational (Structured) and Hybrid storage, a fourth storage option has been added: Binary storage.

CLOB XML Storage

XMLType based CLOB storage is ideal for situations were you have to process document centric XML data, for instance, you don’t want the XML document altered in anyway and you store or retrieve always the whole document. The document keeps intact and no “shredding” takes place.

If no storage option is given, when using the XMLType datatype, then it defaults to CLOB storage.

Until Oracle 11g, a disadvantage was the decrease in performance, when you, for example, needed to select only a fragment of the XML content. Indexes can be created to support XPath selections via function based or Oracle text indexes, but not always with the wanted result: fast retrieval of fragments of the XML data.

The problem outlined before, has now been solved via the new XMLIndex index that hugely improved performance retrieving XML data fragments based on XMLType CLOB storage data. The XMLIndex though, deserves its own post, so I will have to come back on this later.

Below an example of a simple SQL statement creating a CLOB based XMLType table:

SQL> CREATE TABLE XMLCLOB OF XMLType
     XMLTYPE store AS CLOB;
 
TABLE created.

A more detailed view, of what has been created, shows the following:

SQL> CONNECT system/oracle
 
Connected.
 
SQL> CREATE TABLE XMLCLOB OF XMLType
  2  XMLTYPE store AS CLOB;
 
TABLE created.
 
SQL> SET long 10000
SQL> SET pages 5000
 
SQL> SELECT dbms_metadata.get_ddl('TABLE','XMLCLOB')
  2  FROM   dual;
 
DBMS_METADATA.GET_DDL('TABLE','XMLCLOB')
---------------------------------------------
 
  CREATE TABLE "SYSTEM"."XMLCLOB" OF "SYS"."XMLTYPE"
 OIDINDEX  ( PCTFREE 10 INITRANS 2 MAXTRANS 255
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1
  MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "SYSTEM" )
 PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1
  MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "SYSTEM"
 XMLTYPE COLUMN "SYS_NC_ROWINFO$" STORE AS BASICFILE CLOB (
  TABLESPACE "SYSTEM" ENABLE STORAGE IN ROW CHUNK 8192 PCTVERSION 10
  NOCACHE LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1
  MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT))

Object Relational XML Storage

Object Relational storage is very useful where you have to process content centric XML data, for instance, you will have to select or update only a part of the XML document. The XML data, when using Object Relational storage, will be “shredded” into object-relational rows and columns. This shredding has the advantage that the database is more XML aware and gives the Optimizer the chance to re-write SQL statements, doing so; selection of XML fragments is greatly improved.

One other advantage of Object Relational XML data storage is that this storage method will get rid of often unneeded whitespace. The abundance of whitespace will greatly improve performance because less data has to be parsed. Most of the time, the percent of whitespace set against the amount of actual data can be vast.

A disadvantage of this storage method is that it doesn’t leave the document intact (it removes the whitespace) and overhead has to be taken into account for shredding and rebuilding the XML structure.

An example how to create a simple XMLType table based on Object Relational storage is shown below:

SQL> CREATE TABLE XMLOR OF XMLType
     XMLTYPE store AS OBJECT RELATIONAL
     XMLSCHEMA "http://localhost/xsd/myXMLSchema.xsd"
               ELEMENT "Root";
TABLE created.

The more detailed view on what has been created is shown in the following example:

SQL> CONNECT system/oracle
 
Connected.
 
SQL> CREATE TABLE XMLOR OF XMLType
  2  XMLTYPE store AS OBJECT RELATIONAL;
XMLTYPE store AS OBJECT RELATIONAL
                                 *
ERROR at line 2:
ORA-19002: Missing XMLSchema URL
 
-- Register a XML Schema
 
/*
  begin
    dbms_xmlschema.deleteSchema
    ('http://localhost/myXMLSchema.xsd',4);
  end;
  /
*/
 
SQL> BEGIN
  2    dbms_xmlschema.registerSchema
  3    ('http://localhost/myXMLSchema.xsd'
  4    , xdbURIType('/public/myXMLSchema.xsd').getClob()
  5    , FALSE, TRUE, FALSE, TRUE
  6    );
  7  END;
  8  /
 
PL/SQL PROCEDURE successfully completed.
 
SQL> SELECT schema_url FROM user_xml_schemas;
 
SCHEMA_URL
---------------------------------------
http://localhost/myXMLSchema.xsd
 
SQL> SET long 10000
SQL> SET pages 5000
 
SQL> CREATE TABLE XMLOR OF XMLType
  2  XMLTYPE store AS OBJECT RELATIONAL
  3  XMLSCHEMA "http://localhost/myXMLSchema.xsd" ELEMENT "Root";
 
TABLE created.
 
SQL> SELECT dbms_metadata.get_ddl('TABLE','XMLOR')
  2  FROM dual;
 
DBMS_METADATA.GET_DDL('TABLE','XMLOR')
--------------------------------------
 
  CREATE TABLE "SYSTEM"."XMLOR" OF "SYS"."XMLTYPE"
  XMLSCHEMA "http://localhost/myXMLSchema.xsd"
  ELEMENT "Root" ID 4760 PCTFREE 10 PCTUSED 40
  INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1
  MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1
  FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "SYSTEM"

Binary XML Storage

The new binary XML storage model for the datatype XMLType is XML Schema aware, but can also be used without the use of an XML Schema. It is very space efficient. As said in the XMLDB Developers Guide: �XMLType data is stored in a post-parse, binary format, specifically designed for XML data. Binary XML is compact, post-parse, XML Schema-aware XML.� It has the advantage over Object Relational storage that it misses the XML parsing bit. It is also much more efficient then CLOB storage, regarding updating, indexing or selecting XML fragments.

Binary XML storage has the advantage that it has not to now the XML schema in advance and you can store multiple XML schemas in the same table and query across common elements.

A SQL statement to create an XMLType table based on Binary Storage is shown here:

SQL> CREATE TABLE XMLBIN OF XMLType
     XMLTYPE store AS BINARY XML;
 
TABLE created.

A more detailed view on what has been created is shown in the following example:

SQL> CONNECT system/oracle
 
Connected.
 
SQL> CREATE TABLE XMLBIN OF XMLType
  2  XMLTYPE store AS BINARY XML;
 
TABLE created.
 
SQL> SET long 10000
SQL> SET pages 5000
 
SQL> SELECT dbms_metadata.get_ddl('TABLE','XMLBIN')
  2  FROM dual;
 
DBMS_METADATA.GET_DDL('TABLE','XMLBIN')
---------------------------------------
 
 CREATE TABLE "SYSTEM"."XMLBIN" OF "SYS"."XMLTYPE"
 OIDINDEX  ( PCTFREE 10 INITRANS 2 MAXTRANS 255
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1
  MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1
  FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "SYSTEM" )
 PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
  NOCOMPRESS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1
  MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1
  FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "SYSTEM"
 XMLTYPE COLUMN "SYS_NC_ROWINFO$" STORE AS BASICFILE
 BINARY XML (TABLESPACE "SYSTEM" ENABLE STORAGE IN ROW
  CHUNK 8192 PCTVERSION 10 NOCACHE LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1
  MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1
  FREELIST GROUPS 1 BUFFER_POOL DEFAULT
 )) ALLOW NONSCHEMA DISALLOW ANYSCHEMA

Small test and Use Case

The next overview (taken from the XMLDB Developers Guide for Oracle 11gR1 beta), shows some use cases versus design decisions:

XML Use Cases and XMLType Storage Models

Click on the picture to enlarge

In a test case, based on customer production data, a table that was inserted with 438 Mb (size on Windows) of XML data (ranging from < 100 Kb up to 3 Mb documents), it showed that contrary to the original XMLType CLOB based storage, the total volume claimed on tablespace level decreased from 454 Mb (CLOB) to 116 Mb (Binary XML) on this locally managed ASSM tablespace (so a difference of 338 Mb of reclaimed storage on tablespace level)

XMLType column based on CLOB storage

 
CREATE TABLE "HGO"."HGO010_DETAM"
   (    "HGO_ID" NUMBER(12,0) NOT NULL ENABLE,
        "GEGEVENS" "SYS"."XMLTYPE"  NOT NULL ENABLE)
 TABLESPACE "DAT_TS01"
 XMLTYPE COLUMN "GEGEVENS" STORE AS BASICFILE CLOB;
 
BEGIN STATUS (BEFORE loading the DATA)
 
Tablespace Name Specific Free SPACE(Mb) Used SPACE(Mb) MAX.SPACE(Mb)
--------------- -------- -------------- -------------- -------------
DAT_TS01        DATAFILE          1,000             24         1,024
IDX_TS01        DATAFILE             67              3            70
SYSAUX          DATAFILE             35            496           531
SYSTEM          DATAFILE              7            663           670
TEMP            TEMPFILE              7             31            38
UNDOTBS1        DATAFILE            170             30           200
USERS           DATAFILE              5              0             5
                         -------------- -------------- -------------
SUM                               1,291          1,247         2,538
 
END STATUS (after loading the DATA)
 
Tablespace Name Specific Free SPACE(Mb) Used SPACE(Mb) MAX.SPACE(Mb)
--------------- -------- -------------- -------------- -------------
DAT_TS01        DATAFILE            546            478         1,024
IDX_TS01        DATAFILE             67              3            70
SYSAUX          DATAFILE             35            496           531
SYSTEM          DATAFILE              7            663           670
TEMP            TEMPFILE              7             31            38
UNDOTBS1        DATAFILE            166             34           200
USERS           DATAFILE              5              0             5
                         -------------- -------------- -------------
SUM                                 833          1,705         2,538

XMLType column based on Binary XML storage

CREATE TABLE "HGO"."HGO010_DETAM"
   (    "HGO_ID" NUMBER(12,0) NOT NULL ENABLE,
        "GEGEVENS" "SYS"."XMLTYPE"  NOT NULL ENABLE
   )
TABLESPACE "DAT_TS01"
XMLTYPE COLUMN "GEGEVENS" STORE AS BINARY XML;
 
BEGIN STATUS (BEFORE loading the DATA)
 
Tablespace Name Specific Free SPACE(Mb) Used SPACE(Mb) MAX.SPACE(Mb)
--------------- -------- -------------- -------------- -------------
DAT_TS01        DATAFILE          1,014             10         1,024
IDX_TS01        DATAFILE             67              3            70
SYSAUX          DATAFILE             35            496           531
SYSTEM          DATAFILE              7            663           670
TEMP            TEMPFILE              7             31            38
UNDOTBS1        DATAFILE            164             36           200
USERS           DATAFILE              5              0             5
                         -------------- -------------- -------------
SUM                               1,299          1,239         2,538
 
 
END STATUS (after loading the DATA)
 
Tablespace Name Specific Free SPACE(Mb) Used SPACE(Mb) MAX.SPACE(Mb)
--------------- -------- -------------- -------------- -------------
DAT_TS01        DATAFILE            898            126         1,024
IDX_TS01        DATAFILE             67              3            70
SYSAUX          DATAFILE             35            496           531
SYSTEM          DATAFILE              7            663           670
TEMP            TEMPFILE              7             31            38
UNDOTBS1        DATAFILE            164             36           200
USERS           DATAFILE              5              0             5
                         -------------- -------------- -------------
SUM                               1,183          1,355         2,538

As shown in this post, Oracle 11g XMLDB functionality has become a very flexible and mature environment. As always, proper design is a key factor to success, so it is vital you choose with care and with possible future needs in mind. XML is a new player on the market regarding data storage, so also take into account methods like relational storage that has the advantage that it already had the chance to evolve for more then 30 years.

Despite that I don’t want to start a relational contra XML discussion here; The Oracle database environment has the big advantage, while you are using XMLDB functionality, that you also can apply Oracle Search, Text, Java, PL/SQL, etc. or other great features at your disposal when in need for a solution.

[Oracle 11g - XMLType Storage Options.pdf]

Disclaimer

The information demonstrated and shared here is based on Oracle beta software. The following is intended to outline Oracle’s general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.