Oracle 11g – XMLType Storage Options

Though also a recap on presentations of Oracle Open World 2006, where I pinpointed on XML and XMLDB sessions, now it is time to go a little bit further into these great new features that will be available in Oracle 11g regarding XMLDB functionality. One of the big steps forward to full exploitation of these features is the new Binary XML storage.

Until XMLDB one could make use of (C)LOB or varchar2 datatypes to store XML data, both have their limitations regarding storing XML. The introduction of the XMLType datatype / XMLDB features, formed the base of what has become full support of almost every W3C XML standard out there. This datatype allowed storing XML data via “Schema based” or “Schema Less” principles; with or without references to a XML Schema.

XML data can be stored / presented via the standard relational tables. This can be achieved for XMLType Views. If XML data is based on the XMLType data type then storage of XML data could result in Object Relational storage or CLOB storage. A mixture of the two is sometimes also possible and is called “Hybrid” storage. XML data can be stored in a column or as an XMLType table.

The following great overview (taken from the XMLDB Developers Guide for Oracle 11gR1 beta), shows the possible XML data storage options.

XMLDB Storage Options for XML data

Click on the picture to enlarge

Now in 11g, besides CLOB (Unstructured), Object Relational (Structured) and Hybrid storage, a fourth storage option has been added: Binary storage.

CLOB XML Storage

XMLType based CLOB storage is ideal for situations were you have to process document centric XML data, for instance, you don’t want the XML document altered in anyway and you store or retrieve always the whole document. The document keeps intact and no “shredding” takes place.

If no storage option is given, when using the XMLType datatype, then it defaults to CLOB storage.

Until Oracle 11g, a disadvantage was the decrease in performance, when you, for example, needed to select only a fragment of the XML content. Indexes can be created to support XPath selections via function based or Oracle text indexes, but not always with the wanted result: fast retrieval of fragments of the XML data.

The problem outlined before, has now been solved via the new XMLIndex index that hugely improved performance retrieving XML data fragments based on XMLType CLOB storage data. The XMLIndex though, deserves its own post, so I will have to come back on this later.

Below an example of a simple SQL statement creating a CLOB based XMLType table:

SQL> create table XMLCLOB of XMLType
     XMLTYPE store as CLOB;

Table created.

A more detailed view, of what has been created, shows the following:

SQL> connect system/oracle

Connected.

SQL> create table XMLCLOB of XMLType
  2  XMLTYPE store as CLOB;

Table created.

SQL> set long 10000
SQL> set pages 5000

SQL> select dbms_metadata.get_ddl('TABLE','XMLCLOB')
  2  from   dual;

DBMS_METADATA.GET_DDL('TABLE','XMLCLOB')
---------------------------------------------

  CREATE TABLE "SYSTEM"."XMLCLOB" OF "SYS"."XMLTYPE"
 OIDINDEX  ( PCTFREE 10 INITRANS 2 MAXTRANS 255
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1
  MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "SYSTEM" )
 PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1
  MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "SYSTEM"
 XMLTYPE COLUMN "SYS_NC_ROWINFO$" STORE AS BASICFILE CLOB (
  TABLESPACE "SYSTEM" ENABLE STORAGE IN ROW CHUNK 8192 PCTVERSION 10
  NOCACHE LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1
  MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT))

Object Relational XML Storage

Object Relational storage is very useful where you have to process content centric XML data, for instance, you will have to select or update only a part of the XML document. The XML data, when using Object Relational storage, will be “shredded” into object-relational rows and columns. This shredding has the advantage that the database is more XML aware and gives the Optimizer the chance to re-write SQL statements, doing so; selection of XML fragments is greatly improved.

One other advantage of Object Relational XML data storage is that this storage method will get rid of often unneeded whitespace. The abundance of whitespace will greatly improve performance because less data has to be parsed. Most of the time, the percent of whitespace set against the amount of actual data can be vast.

A disadvantage of this storage method is that it doesn’t leave the document intact (it removes the whitespace) and overhead has to be taken into account for shredding and rebuilding the XML structure.

An example how to create a simple XMLType table based on Object Relational storage is shown below:

SQL> create table XMLOR of XMLType
     XMLTYPE store as OBJECT RELATIONAL
     XMLSCHEMA "http://localhost/xsd/myXMLSchema.xsd"
               ELEMENT "Root";
Table created.

The more detailed view on what has been created is shown in the following example:

SQL> connect system/oracle

Connected.

SQL> create table XMLOR of XMLType
  2  XMLTYPE store as OBJECT RELATIONAL;
XMLTYPE store as OBJECT RELATIONAL
                                 *
ERROR at line 2:
ORA-19002: Missing XMLSchema URL

-- Register a XML Schema

/*
  begin
    dbms_xmlschema.deleteSchema
    ('http://localhost/myXMLSchema.xsd',4);
  end;
  /
*/

SQL> begin
  2    dbms_xmlschema.registerSchema
  3    ('http://localhost/myXMLSchema.xsd'
  4    , xdbURIType('/public/myXMLSchema.xsd').getClob()
  5    , False, True, False, True
  6    );
  7  end;
  8  /

PL/SQL procedure successfully completed.

SQL> select schema_url from user_xml_schemas;

SCHEMA_URL
---------------------------------------
http://localhost/myXMLSchema.xsd

SQL> set long 10000
SQL> set pages 5000

SQL> create table XMLOR of XMLType
  2  XMLTYPE store as OBJECT RELATIONAL
  3  XMLSCHEMA "http://localhost/myXMLSchema.xsd" ELEMENT "Root";

Table created.

SQL> select dbms_metadata.get_ddl('TABLE','XMLOR')
  2  from dual;

DBMS_METADATA.GET_DDL('TABLE','XMLOR')
--------------------------------------

  CREATE TABLE "SYSTEM"."XMLOR" OF "SYS"."XMLTYPE"
  XMLSCHEMA "http://localhost/myXMLSchema.xsd"
  ELEMENT "Root" ID 4760 PCTFREE 10 PCTUSED 40
  INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1
  MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1
  FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "SYSTEM"

Binary XML Storage

The new binary XML storage model for the datatype XMLType is XML Schema aware, but can also be used without the use of an XML Schema. It is very space efficient. As said in the XMLDB Developers Guide: “XMLType data is stored in a post-parse, binary format, specifically designed for XML data. Binary XML is compact, post-parse, XML Schema-aware XML.” It has the advantage over Object Relational storage that it misses the XML parsing bit. It is also much more efficient then CLOB storage, regarding updating, indexing or selecting XML fragments.

Binary XML storage has the advantage that it has not to now the XML schema in advance and you can store multiple XML schemas in the same table and query across common elements.

A SQL statement to create an XMLType table based on Binary Storage is shown here:

SQL> create table XMLBIN of XMLType
     XMLTYPE store as BINARY XML;

Table created.

A more detailed view on what has been created is shown in the following example:

SQL> connect system/oracle

Connected.

SQL> create table XMLBIN of XMLType
  2  XMLTYPE store as BINARY XML;

Table created.

SQL> set long 10000
SQL> set pages 5000

SQL> select dbms_metadata.get_ddl('TABLE','XMLBIN')
  2  from dual;

DBMS_METADATA.GET_DDL('TABLE','XMLBIN')
---------------------------------------

 CREATE TABLE "SYSTEM"."XMLBIN" OF "SYS"."XMLTYPE"
 OIDINDEX  ( PCTFREE 10 INITRANS 2 MAXTRANS 255
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1
  MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1
  FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "SYSTEM" )
 PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
  NOCOMPRESS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1
  MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1
  FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "SYSTEM"
 XMLTYPE COLUMN "SYS_NC_ROWINFO$" STORE AS BASICFILE
 BINARY XML (TABLESPACE "SYSTEM" ENABLE STORAGE IN ROW
  CHUNK 8192 PCTVERSION 10 NOCACHE LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1
  MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1
  FREELIST GROUPS 1 BUFFER_POOL DEFAULT
 )) ALLOW NONSCHEMA DISALLOW ANYSCHEMA

Small test and Use Case

The next overview (taken from the XMLDB Developers Guide for Oracle 11gR1 beta), shows some use cases versus design decisions:

XML Use Cases and XMLType Storage Models

Click on the picture to enlarge

In a test case, based on customer production data, a table that was inserted with 438 Mb (size on Windows) of XML data (ranging from < 100 Kb up to 3 Mb documents), it showed that contrary to the original XMLType CLOB based storage, the total volume claimed on tablespace level decreased from 454 Mb (CLOB) to 116 Mb (Binary XML) on this locally managed ASSM tablespace (so a difference of 338 Mb of reclaimed storage on tablespace level)

XMLType column based on CLOB storage


CREATE TABLE "HGO"."HGO010_DETAM"
   (    "HGO_ID" NUMBER(12,0) NOT NULL ENABLE,
        "GEGEVENS" "SYS"."XMLTYPE"  NOT NULL ENABLE)
 TABLESPACE "DAT_TS01"
 XMLTYPE COLUMN "GEGEVENS" STORE AS BASICFILE CLOB;

Begin status (before loading the data)

Tablespace Name Specific Free Space(Mb) Used Space(Mb) Max.Space(Mb)
--------------- -------- -------------- -------------- -------------
DAT_TS01        DATAFILE          1,000             24         1,024
IDX_TS01        DATAFILE             67              3            70
SYSAUX          DATAFILE             35            496           531
SYSTEM          DATAFILE              7            663           670
TEMP            TEMPFILE              7             31            38
UNDOTBS1        DATAFILE            170             30           200
USERS           DATAFILE              5              0             5
                         -------------- -------------- -------------
sum                               1,291          1,247         2,538

End status (after loading the data)

Tablespace Name Specific Free Space(Mb) Used Space(Mb) Max.Space(Mb)
--------------- -------- -------------- -------------- -------------
DAT_TS01        DATAFILE            546            478         1,024
IDX_TS01        DATAFILE             67              3            70
SYSAUX          DATAFILE             35            496           531
SYSTEM          DATAFILE              7            663           670
TEMP            TEMPFILE              7             31            38
UNDOTBS1        DATAFILE            166             34           200
USERS           DATAFILE              5              0             5
                         -------------- -------------- -------------
sum                                 833          1,705         2,538

XMLType column based on Binary XML storage

CREATE TABLE "HGO"."HGO010_DETAM"
   (    "HGO_ID" NUMBER(12,0) NOT NULL ENABLE,
        "GEGEVENS" "SYS"."XMLTYPE"  NOT NULL ENABLE
   )
TABLESPACE "DAT_TS01"
XMLTYPE COLUMN "GEGEVENS" STORE AS BINARY XML;

Begin status (before loading the data)

Tablespace Name Specific Free Space(Mb) Used Space(Mb) Max.Space(Mb)
--------------- -------- -------------- -------------- -------------
DAT_TS01        DATAFILE          1,014             10         1,024
IDX_TS01        DATAFILE             67              3            70
SYSAUX          DATAFILE             35            496           531
SYSTEM          DATAFILE              7            663           670
TEMP            TEMPFILE              7             31            38
UNDOTBS1        DATAFILE            164             36           200
USERS           DATAFILE              5              0             5
                         -------------- -------------- -------------
sum                               1,299          1,239         2,538


End status (after loading the data)

Tablespace Name Specific Free Space(Mb) Used Space(Mb) Max.Space(Mb)
--------------- -------- -------------- -------------- -------------
DAT_TS01        DATAFILE            898            126         1,024
IDX_TS01        DATAFILE             67              3            70
SYSAUX          DATAFILE             35            496           531
SYSTEM          DATAFILE              7            663           670
TEMP            TEMPFILE              7             31            38
UNDOTBS1        DATAFILE            164             36           200
USERS           DATAFILE              5              0             5
                         -------------- -------------- -------------
sum                               1,183          1,355         2,538

As shown in this post, Oracle 11g XMLDB functionality has become a very flexible and mature environment. As always, proper design is a key factor to success, so it is vital you choose with care and with possible future needs in mind. XML is a new player on the market regarding data storage, so also take into account methods like relational storage that has the advantage that it already had the chance to evolve for more then 30 years.

Despite that I don’t want to start a relational contra XML discussion here; The Oracle database environment has the big advantage, while you are using XMLDB functionality, that you also can apply Oracle Search, Text, Java, PL/SQL, etc. or other great features at your disposal when in need for a solution.

[Oracle 11g – XMLType Storage Options.pdf]

Disclaimer

The information demonstrated and shared here is based on Oracle beta software. The following is intended to outline Oracle’s general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Marco Gralike Written by:

19 Comments

  1. Regen
    June 30

    So there are improvements in inner handling of XMLType elements and their storage but no improvements on schema based XML-data handling ie. no schema based XML generating (No empty xmltags), no table generating according to schema etc? We just have to rely on extractvalue and XMLElement (with nvl and other gizmos) type handling with XMLType?

  2. July 1

    Hmmm.

    First of all “empty” xmltags will result in default W3C behavior (Oracle XMLDB is following the W3C rules here), in short, there is only a end tag presented for that element if it contains no value. If you looking for an improvement on that; W3C has to alter their rules on that issue.

    Tables can be generated via XML Schemas, look it up in the XMLDB Development Guide, and there is a lot of control (via xdb annotations) on how storage will be applied in the database. Binary XML is even smarter regarding how you can control it and you are able to handle more that one XML Schema for the same XML Instance.

    If you see the toolset available; SQL, PL/SQL, SQL/XML, Xquery and XPath (v2) then I don’t think there is a reason to complain. Performance is improved because almost all functionality is not build in and doesn’t have to be handled via java wrappers.

  3. shukor
    January 15

    hello, I wnant 2 ask u about how 2 storagimg data…

  4. January 15

    What you ask (if I decipher correctly), is how to store image data. If you need a demonstration how this can be done with help off XMLDB functionality, the I would suggest that you install the XFILES demo which you should be able to find on OTN.

    For a quick intro, have a look at the XFILES post on this site.

  5. ahmad
    April 22

    first of all thank u for this info.

    could u plz send me the references for this article in oracle website or document.

  6. April 22

    Reference can be found in the XMLDB Developers Guide, so for the latest information regarding the Binary XML storage model: the Oracle XMLDB Developers Guide v 11gR1.

    Also have a look here, especially the whitepaper references of database version 10g regarding storage methods…”Mastering XML DB Storage in Oracle Database 10g Release 2″

    BTW “ahmad” your email address is not valid, so I can’t follow up on this.

  7. ahmad
    April 23

    when we have query like

    CREATE TABLE CustomerTypeTab of XMLTYPE 
    XMLSCHEMA "http://homepage/sample1" ELEMENT "Customer"; 
    

    is it CLOB type or relational type.

    THANK U AGAIN

  8. April 23

    You will know for sure / you can check for yourself via:

    set long 1000000
    
    select dbms_metadata.get_ddl('TABLE','CUSTOMERTYPETAB') 
    from   dual; 
    
    
  9. Nancy
    December 15

    Hi,

    Is there a recommended way to migrate data in a XML CLOB column in Oracle 10g to the binary XML column in Oracle 11g?
    thanks
    Nancy

  10. December 15

    No not really, at least, not afaik. An “insert into” should do the trick.

    Be aware though that you will have far more options to chose from while using binary XML storage: Encoding, support for 1 or multiple XML Schematas, securefile (=extra license) or basicfile clob, compression, deduplication, etc.

    Really recommend Oracle 11.1.0.7.0 instead of 11.1.0.6.0 in this case

  11. akhil
    July 17

    hi,… how we can store data in semi structured storage

  12. July 20

    It depends a bit on the XML document structure and/or if you are using an XML Schema.

    If you are using an XML Schema then an XMLType table based on Object Relational storage would be an option, in conjunction with storing the unstructured data bit in a CLOB. This combination can be found in the manual, searching for XMLType “hybrid” storage.

    Or if in 11g, you could also use Binary XML (with or without using an XML Schema)…

    The best choice depends on the XML structure. In general the “hybrid” form would be the best way to go if in doubt.

  13. akhil
    July 22

    For instance such as complex data such as MARC or library data which has no rigid structure ..which is a character data…how we can store the character data in the semi structured data format?

  14. July 24

    Your requirements sound like a solution based on XMLType binary XML storage and indexed with an unstructured XMLIndex and with a possible second index on the XMLIndex Path Table, being a text index.

  15. Amit
    April 1

    Hi,

    Does dbms_xmlstore supports bulk binding for inserts?

  16. shashi
    November 11

    I would like to know XML features provided by Oracle11g.Can anybody help me .

  17. Fernando
    November 28

    Good post. One simple question

    When you create a folder with dbms_xdb.createFolder , where is it stored in the database? If I upload files to the folders created with dbms_xdb.createFolder, in which schema and table are they stored?

    Thank you.

    • December 12

      DBMS_XDB.createFolder, creates a folder structure in the XDB Repository. In principal file and folders are “created”, the information is stored in a XMLType table called XDB$Resource in the XDB user environment.

      “In principal”, because this is the default for non enabled hierarchical content. All metadata of files and folders in the XDB Repository is stored in this table and by default also file content like pictures. The content of those files end up in the LOB column of this table, and the metadata concerning those files will be stored in different sections (XML, Object Relational storage) of that table, info like: create time, creator, mime type, size, etc.

      Hierarchical enabled content, mostly XML content, can be stored seperately in a different table. The metadata info is still stored in the XDB$Resource table but also linked to this extra created table. Hierarchical enabled, in this sense, means also that the content of this extra XML content related table is secured via WebDAV driven ACL structures (based on Oracle VPD methods). This is done, because no irregular/unauthorized database access method, can alter etc via this way content in the database, that is, via those XDB Repository protocols access methods: WebDAV, HTTP(s) or FTP.

      Also have a look at the following for more information about those kinds of structures and example code:

Comments are closed.