This post will show you some of the first numbers I collected regarding “Loading XML data”, while making use of different XMLType “physical storage containers”.
I also have done some initial testing with Object Relational XMLType storage, but because this method of storage has many options and extra features, I won’t describe them yet here. This topic is interesting enough to earn its on post.
If you need some background on Oracle XMLType Storage option than have a read through the “Binary-, CLOB, Object Relational Storage” Category option in the menu, the Oracle XMLDB Developers Manual or a short intro via Oracle 11g – XMLType Storage Options.
After having created an environment as described in “XMLDB Performance: Environment, Set-up, Procedure“, the following results were gathered by me while keeping values constant, for example the values for “connection.xml“, as described in the “XMLDB Performance: Environment, Set-up, Procedure” . Only the WIKI_STAGE create statements are different.
My Goal
My goal here is to see how these XMLType storage structures perform, in this case, what happens while loading XML data. Another is issue is finding out how to create them and can I deduct some extra “concepts” or rules of numb out of these testing procedures. While setting the test environment up and loading the data I noticed that there are some issues, for example, regarding altering structures or counting the data (XMLDB Performance: The Side Effects of a Simple “count(*)”).
During this learning process, beside insight in, I also had some new questions that have to be answered to understand the mechanics involved. One of this questions I don’t have an answered yet for is “What are the internal differences in bookkeeping between a Smallfile and Bigfile tablespace, besides ROWID structures” (and what is the effect on XMLType storage structures).
Create Table Statements
In this first report, regarding my loading XML data test results, I compared the following structures:
- Binary XMLType Table using Basicfile storage
- Binary XMLType Table using Securefile, not compressed, storage
- Binary XMLType Table using Securefile, compressed high, storage
The results shown were measured while using the following SQL settings for create table WIKI_STAGE
Binary XML Basicfile (XMLType Table)
The following create WIKI_STAGE create table statement was used:
select dbms_metadata.get_ddl('TABLE','WIKI_STAGE','WIKI') from dual;
DBMS_METADATA.GET_DDL('TABLE','WIKI_STAGE','WIKI')
--------------------------------------------------------------------------------
CREATE TABLE "WIKI"."WIKI_STAGE" OF "SYS"."XMLTYPE"
OIDINDEX ( PCTFREE 10 INITRANS 2 MAXTRANS 255
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "MEDIAWIKI_STAGE" )
PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS1 BUFFER_POOL DEFAULT)
TABLESPACE "MEDIAWIKI_STAGE"
XMLTYPE COLUMN "SYS_NC_ROWINFO$" STORE AS BASICFILE BINARY XML
(TABLESPACE "MEDIAWIKI_STAGE" ENABLE STORAGE IN ROW CHUNK 8192
NOCACHE LOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT))
ALLOW NONSCHEMA DISALLOW ANYSCHEMA
.
Binary XML Securefile (XMLType Table) – NO COMPRESS
The following create WIKI_STAGE create table statement was used:
select dbms_metadata.get_ddl('TABLE','WIKI_STAGE','WIKI') from dual;
DBMS_METADATA.GET_DDL('TABLE','WIKI_STAGE','WIKI')
--------------------------------------------------------------------------------
CREATE TABLE "WIKI"."WIKI_STAGE" OF "SYS"."XMLTYPE"
OIDINDEX ( PCTFREE 10 INITRANS 2 MAXTRANS 255
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "MEDIAWIKI_STAGE" )
PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "MEDIAWIKI_STAGE"
XMLTYPE COLUMN "SYS_NC_ROWINFO$" STORE AS SECUREFILE BINARY XML
(TABLESPACE "MEDIAWIKI_STAGE" ENABLE STORAGE IN ROW CHUNK 8192
NOCACHE LOGGING DECRYPT NOCOMPRESS KEEP_DUPLICATES
STORAGE(INITIAL 106496 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 BUFFER_POOL DEFAULT))
ALLOW NONSCHEMA DISALLOW ANYSCHEMA
.
Binary XML Securefile (XMLType Table) – COMPRESS HIGH
The following create WIKI_STAGE create table statement was used:
select dbms_metadata.get_ddl('TABLE','WIKI_STAGE','WIKI') from dual;
DBMS_METADATA.GET_DDL('TABLE','WIKI_STAGE','WIKI')
--------------------------------------------------------------------------------
CREATE TABLE "WIKI"."WIKI_STAGE" OF "SYS"."XMLTYPE"
OIDINDEX ( PCTFREE 10 INITRANS 2 MAXTRANS 255
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "MEDIAWIKI_STAGE" )
PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "MEDIAWIKI_STAGE"
XMLTYPE COLUMN "SYS_NC_ROWINFO$" STORE AS SECUREFILE BINARY XML (
TABLESPACE "MEDIAWIKI_STAGE" ENABLE STORAGE IN ROW CHUNK 8192
NOCACHE LOGGING DECRYPT COMPRESS HIGH KEEP_DUPLICATES
STORAGE(INITIAL 106496 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 BUFFER_POOL DEFAULT))
ALLOW NONSCHEMA DISALLOW ANYSCHEMA
.
Segment Findings
All these storage structures, these binary XMLType tables, contain:
- A table segment
- An index segment
- A LOB segment
- A LOB index segment
While using the create table statements above, this resulted in the following used amount of bytes per segment type after 6.996.742 Mediawiki XML records where loaded.
Table Segments

Click on the picture to enlarge
Index Segments

Click on the picture to enlarge
LOB Segments

Click on the picture to enlarge
LOB Index Segments

Click on the picture to enlarge
Size Totals
Would you compare the original size on disk of these 6.996.742 Mediawiki XML files on Windows NTFS against the amount of bytes used in the database, then you would could see the following differences.
Totals Overview – Segments

Click on the picture to enlarge
and while looking at file size alone…
Totals Overview – Size on Disk, Size in the Database

Click on the picture to enlarge
A simple conclusion, while looking at the graphs, is that “standard” XML storage using Basicfile or Securefile CLOB storages are bigger that the XML size on Windows disk (while using NTFS). The other conclusion in this case is that you will regain 4.2 Gigabyte while compressing it in Securefile Binary XML format, compared to the Mediawiki XML Windows flat file.
To my surprise, I am now aware of the fact that a “Securefile” structure in Oracle 11.1.0.6.0 can be bigger that its alternative the “Basicfile” structure.
I also encountered a bug. A bug I had completely forgotten. One that I had encountered and reported during the end of beta testing of Oracle version 11.1.0.6.0 in 2007. My bug number 6364855, relates to “LOB SIZE USING SECUREFILE IS VERY LARGE”.
See the following alteration statement:
SQL> alter table wiki_stage modify lob (xmldata) (compress high)
Table altered
.
The symptoms are that if you compress or de-compress Securefile storage, the internal storage size increases, despite using “compress”. The bug was very quickly solved and will be packaged in patch version 11.1.0.7.0. So if needed, ask for a back port for Oracle database version 11.1.0.6.0.
Duration
An interesting comparison is, of course, also: “How much time was needed the load the XML data?”. Are there consequences while using one or the other storage model? And there are, as expected…

Click on the picture to enlarge
Conclusions?
What is new for me is that there seems to be a small load time trade off between the old “Basicfile” structure (the structure also used before 11g) and the new Securefile ( no compress) structure in the 11.1.0.6.0 (Oracle Enterprise Edition 32 bit for Linux), which I can’t explain. I did not yet trace / count, for instance, database server CPU time. It could be that storing data, while making us off Securefile structures in general, would consume more CPU cycles, but because I didn’t trace on this, also can’t prove this… As said, just a gut feeling.
The other differences can be explained because the can be explained via the information given in the XMLDB Developers Guide and Storage Guide
XMLType storage will be checked on XML Well-Formedness (this is default behavior) . This takes time (Memory and CPU cycle consumption)
Binary XML is a post parse storage model, this means that XML tags and information, in labor terms, will be cut in to pieces, sorted, ordered and categorized regarding “same elements”, “the same information”, usage, etc. Also the XML data will be stripped from unnecessary whitespace and the tree structure, meta data info will be optimized regarding storage, its accessibility and maintenance. All this information will used when the information is requested again and has to be reconstructed into its original form (with the exception regarding whitespace). The internal mechanism regarding the “How” is only known to Oracle.
Compressing (data) takes extra time, most of it needed for CPU processing. This explains the difference between the Securefile storage via “NOCOMPRESS” and the Securefile storage via “COMPRESS HIGH”.
Other conclusions…?
Securefile storage is smaller, so as advertised, while using compression, in the Oracle Enterprise Edition version 11.1.0.6.0. I hope that patch 11.1.0.7.0 will be released this year so I can re-test some of the features for Securefile. One of my questions, still unanswered, is “How much time is needed to “compress high” a Securefile LOB segment, using a default Securefile LOB segment (NOCOMPRESS) as a starting point”. Some internal information about Basicfile and Securefile Binary XML would be useful and could be used to explain the differences between seen in table, lob segment and lob index segments.
There are still some questions I didn’t answer yet that are related to the described storage models here, such as “What happens if the Binary XML storage model depends on XML Schemata” or “What happens if the Binary XML storage uses features like ‘de-duplication’ or different ‘schema controlled encoding’ “…
What I have described here wasn’t “exact” science. Uncontrolled functionality like AWR mechanisms were still enabled and in place, but I think some comparisons can be made and give a “statement of direction”. On the other hand, just doing it (setup, testing, loading data, thinking about “what does this figure mean…”), made me a lot wiser…
Although I believe that I only scratched the surface of some of the internals, dealing with XML storage etc., I hope that people can use this information to their advantage.
😎
Related Posts
Hi Marco!
Very interesting comparison! Looks like high compression brings quite some space savings!
Could you add the queries you used to get the segments and database sizes? Thanks!
Thomas
There is a dedicated presentation dealing with securefile during Oracle Open World (if you are going). Search in the scheduler on keyword “securefile”.
Scripting, segments:
SQL> get segment 1 prompt 2 PROMPT Segments: 3 prompt 4 column segment_name format a30 heading 'Segment' 5 column segment_type format a15 heading 'Type' 6 column bytes format 99999999999 heading 'Bytes' 7 column extents format 9999999 heading 'Tot.Ext' 8 column ini format 99999999999 heading 'IniSize' 9 column next format 99999999999 heading 'NextSize' 10 column min_extents format 9999 heading 'MinE' 11 column max_extents format 999999 heading 'MaxE' 12 column owner format a15 heading 'Owner' 13 column pct format 9999 heading '%Inc' 14 column tbsname format a20 heading 'Tablespace' 15 break on report 16 compute sum of bytes on report 17 undefine segment_type 18 undefine segment_name 19 undefine tablespace_name 20 undefine owner 21 undefine extents 22 SET LINESIZE 200 23 SET PAGESIZE 5000 24 SET FEEDBACK OFF 25 select SEGMENT_NAME 26 , SEGMENT_TYPE 27 , BYTES 28 , EXTENTS 29 , INITIAL_EXTENT ini 30 , NEXT_EXTENT next 31 , PCT_INCREASE pct 32 , MIN_EXTENTS 33 , MAX_EXTENTS 34 , OWNER 35 , tablespace_name 36 from sys.dba_segments 37 where segment_name like upper('%&Segment_name%') 38 and segment_type like upper('%&Segment_type%') 39 and owner like upper('%&Owner%') 40 and extents >= to_number(nvl('&Extents','0')) 41 and tablespace_name like upper('%&Tablespace_name%') 42 order by owner 43 , extents desc 44 , segment_name 45 , segment_type 46 , bytes 47 / 48 set linesize 132 49 set pagesize 36 50 set feedback on 51 undefine segment_type 52 undefine segment_name 53 undefine tablespace_name 54 undefine owner 55* undefine extents 56If appropriate for detailed LOB info, I used:
SQL> get bytes 1 SET SERVEROUTPUT ON 2 PROMPT 3 PROMPT 4 UNDEFINE USERNAME 5 UNDEFINE SEGMENTNAME 6 UNDEFINE SEGMENTTYPE 7 declare 8 l_segment_name varchar2(30); 9 l_segment_size_blocks number; 10 l_segment_size_bytes number; 11 l_used_blocks number; 12 l_used_bytes number; 13 l_expired_blocks number; 14 l_expired_bytes number; 15 l_unexpired_blocks number; 16 l_unexpired_bytes number; 17 begin 18 dbms_output.put_line(chr(10)); 19 dbms_space.space_usage (segment_owner => upper('&&USERNAME'), 20 segment_name => upper('&&SEGMENTNAME'), 21 segment_type => upper('&&SEGMENTTYPE'), 22 partition_name => NULL, 23 segment_size_blocks => l_segment_size_blocks, 24 segment_size_bytes => l_segment_size_bytes, 25 used_blocks => l_used_blocks, 26 used_bytes => l_used_bytes, 27 expired_blocks => l_expired_blocks, 28 expired_bytes => l_expired_bytes, 29 unexpired_blocks => l_unexpired_blocks, 30 unexpired_bytes => l_unexpired_bytes 31 ); 32 dbms_output.put_line('OBJECT_NAME = &&SEGMENTNAME'); 33 dbms_output.put_line('-----------------------------------------'); 34 dbms_output.put_line('segment_size_blocks => '|| l_segment_size_blocks); 35 dbms_output.put_line('segment_size_bytes => '|| l_segment_size_bytes); 36 dbms_output.put_line('used_blocks => '|| l_used_blocks); 37 dbms_output.put_line('used_bytes => '|| l_used_bytes); 38 dbms_output.put_line('expired_blocks => '|| l_expired_blocks); 39 dbms_output.put_line('expired_bytes => '|| l_expired_bytes); 40 dbms_output.put_line('unexpired_blocks => '|| l_unexpired_blocks); 41 dbms_output.put_line('unexpired_bytes => '|| l_unexpired_bytes); 42 end; 43 / 44 UNDEFINE USERNAME 45 UNDEFINE SEGMENTNAME 46* UNDEFINE SEGMENTTYPE 47