In Memory XML Performance (XVM)

I wouldn’t believe the bad XMLType performance statement given stated in Martin Preiss’ blog post, and/or the referenced one from the Oracle Developer Advocates for SQL, so I started testing…

I tried to answer the following question:

“Is the XMLType / XML XQuery parser indeed so bad that it would take more than an hour to return the results…?”

Both posts don’t reference any database version although the “Oracle Developer Advocates for SQL” hints the following error, which only happens on database versions higher than 12.1, when the PGA_AGGREGATE_LIMIT parameter was introduced.

1 hour later…

ERROR at line 2:
ORA-00028: your session has been killed
ORA-00028: your session has been killed
ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT
ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT

Martin mentioned Adrian Billington‘s post regarding similar tests. I used Martin’s mentioned wall clock TIMER package downloadable via his site, because I had to use SQL*Developer. The scripts shown were run multiple times before taking down the results noted here.

Database versions used to test the scripts:

11.2.0.3 Enterprise Edition database
12.1.0.2 Enterprise Edition database

The two different version, to see the impact/improvements made by the XMLDB developer team regarding introducing a more optimized XQuery Virtual Machine (XVM) implementation. Officially performance improvements were announced for Oracle 12.1, like XMLTYPE “parallel” support and the additional XMLTABLE “returning sequence by ref” feature, but probably already implemented in 11.2.0.4 (at least Marc noticed behavior changes – I still couldn’t test on 11.2.0.4).

Be aware while looking at the results below, they represent XMLTYPE handling in memory (XVM section), so without any reference to storage. If storage would come into play, (huge) differences regarding additional performance optimizations would become also possible for XMLTYPE content.
XVM_decision_tree

Also I tested with the NO_XML_QUERY_REWRITE SQL hint, to see if and/or how it would effect performance. As mentioned, NO_XML_QUERY_REWRITE forces a different (functional evaluation) optimization method.

I created the following code snippets to run my tests in

--
SET timing ON
--
BEGIN
  -- -------------------------------------------------
  -- test with 1.000.000
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
  ( {test sql goes here} )
  LOOP
    NULL;
  END LOOP;
  timer.SHOW('DUAL');
  -- -------------------------------------------------
END;
/
--

Oracle 12.1.0.2

I went for the 1.000.000 as mentioned in Oracle Developer Advocates for SQL. As you can see I also added Adrian’s “CONNECT BY ROWNUM” initial example, to have a little bit extra cross reference output.

The script I used on database version 12.1.0.2 (PGA=500MB/SGA=1GB):

--
SET timing ON
--
BEGIN
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
  ( SELECT ROWNUM as rno
      FROM dual 
   CONNECT BY ROWNUM < 1000000 )
  LOOP
    NULL;
  END LOOP;
  timer.SHOW('DUAL');
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
  ( SELECT level 
      FROM dual 
   CONNECT BY level <= 1000000 )
  LOOP
    NULL;
  END LOOP;
  timer.SHOW('LEVEL');
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
  (  WITH NATURAL (n) AS
     ( SELECT 1 AS n FROM dual
       UNION ALL
       SELECT n+1 FROM NATURAL WHERE n < 1000000
     )
     SELECT * FROM NATURAL
   )
  LOOP
    NULL;
  END LOOP;
  timer.SHOW('NATURAL');
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
   (SELECT column_value as rno
      FROM xmltable ('1 to 1000000') )  
  LOOP 
    NULL; 
  END LOOP;
  timer.SHOW('XMLTABLE');  
  -- -------------------------------------------------
  /*  

  -- bug in 12.1.0.2 - socket connection lost ?

  timer.SNAP;
  FOR timecheck IN 
   ( SELECT rownum FROM
      ( SELECT 1 FROM dual GROUP BY cube( 1, 1, 1, 1, 1, 
                                          1, 1, 1, 1, 1, 
                                          1, 1, 1, 1, 1, 
                                          1, 1, 1, 1, 1 ) 
      ) WHERE rownum < 100000 
    )  
        LOOP 
    NULL; 
  END LOOP;
  timer.SHOW('CUBE');
  
  */
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
   (SELECT /*+NO_XML_QUERY_REWRITE*/ 
           column_value as rno
      FROM xmltable ('1 to 1000000') )  
  LOOP 
    NULL; 
  END LOOP;
  timer.SHOW('XMLTABLE (NO-REWRITE-HINT)');
  -- -------------------------------------------------
END;
/

Results

In short, the output from the script above is the following:

-- -------------------------------------------------
anonymous block completed
-- -------------------------------------------------
[DUAL]      5.25 seconds
[LEVEL]     5.05 seconds
[NATURAL]  16.99 seconds
[XMLTABLE] 54.05 seconds
-- -------------------------------------------------
[XMLTABLE (NO-REWRITE-HINT)]          63.31 seconds
-- -------------------------------------------------

After multiple run’s of the same script, I would conclude, based on this simple setup:

It looks like a CONNECT BY LEVEL is a bit more optimized than CONNECT BY ROWNUM
Using XMLTABLE to generate 1000000 rows DOES NOT take more than 1 hour…
The CUBE solution “crashed” with a lost connection socket error in SQL*Developer…

So 11.2.0.3 must be horrible then…?

Oracle 11.2.0.3

Again I went for the 1.000.000 as mentioned in Oracle Developer Advocates for SQL.
The script I used on database version 11.2.0.3 (PGA=500MB/SGA=1GB) was the same, but the CUBE solution, out of the comparison with the 12.1.0.2 test due to its socket error (bug?), wasn’t used in this setup.

--
SET timing ON 
--
BEGIN
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
  ( SELECT ROWNUM as rno
      FROM dual 
   CONNECT BY ROWNUM < 1000000 )
  LOOP
    NULL;
  END LOOP;
  timer.SHOW('DUAL');
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
  ( SELECT level 
      FROM dual 
   CONNECT BY level <= 1000000 )
  LOOP
    NULL;
  END LOOP;
  timer.SHOW('LEVEL');
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
  (  WITH NATURAL (n) AS
     ( SELECT 1 AS n FROM dual
       UNION ALL
       SELECT n+1 FROM NATURAL WHERE n < 1000000
     )
     SELECT * FROM NATURAL
   )
  LOOP
    NULL;
  END LOOP;
  timer.SHOW('NATURAL');
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
   (SELECT column_value as rno
      FROM xmltable ('1 to 1000000') )  
  LOOP 
    NULL; 
  END LOOP;
  timer.SHOW('XMLTABLE');  
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
   (SELECT /*+NO_XML_QUERY_REWRITE*/ 
           column_value as rno
      FROM xmltable ('1 to 1000000') )  
  LOOP 
    NULL; 
  END LOOP;
  timer.SHOW('XMLTABLE (NO-REWRITE-HINT)');
  -- -------------------------------------------------
END;
/

Results

In short, the output from the script above is the following:

-- -------------------------------------------------
anonymous block completed
-- -------------------------------------------------
[DUAL]      1.08 seconds
[LEVEL]     0.84 seconds
[NATURAL]  10.54 seconds
[XMLTABLE] 83.78 seconds
-- -------------------------------------------------
[XMLTABLE (NO-REWRITE-HINT)] 113.67 seconds
-- -------------------------------------------------

After multiple run’s of the same script, I would conclude, based on this simple setup:

It looks like a CONNECT BY LEVEL is a bit more optimized than CONNECT BY ROWNUM
Using XMLTABLE to generate 1000000 rows STILL DOES NOT take more than 1 hour…
Using the NO XML REWRITE hint doesn’t add, in this example, to produce a faster result

Behavior differences…

I couldn’t find a difference (yet, still might have to dive deeper) why the CONNECT BY LEVEL or CONNECT BY ROWNUM where definitively faster on 11.2.0.3 than on 12.1.0.2. I had no PGA issues (like described by Tanel) on my environments, database parameters as far as I can see, are behavior identical.

CONNECT BY

Both database versions show exactly the same explain plans for the two, like the following.

--
explain plan for
SELECT level 
  FROM dual 
CONNECT BY ROWNUM < = 1000000;
--
SELECT * FROM table(DBMS_XPLAN.DISPLAY);
--
------------------------------------------------------------------------------
| Id  | Operation                     | Name | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |     1 |     2   (0)| 00:00:01 |
|   1 |  COUNT                        |      |       |            |          |
|*  2 |   CONNECT BY WITHOUT FILTERING|      |       |            |          |
|   3 |    FAST DUAL                  |      |     1 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(ROWNUM<=1000000)
--

and...

--
explain plan for
SELECT level 
  FROM dual 
CONNECT BY level < = 1000000;
--
SELECT * FROM table(DBMS_XPLAN.DISPLAY);
--
-----------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |     1 |     2   (0)| 00:00:01 |
|*  1 |  CONNECT BY WITHOUT FILTERING|      |       |            |          |
|   2 |   FAST DUAL                  |      |     1 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(LEVEL<=1000000)
--

A database event setting like 10053 might shed some light on the differences between the two database versions.

XMLTABLE / XVM

Both database versions show the expected XMLTABLE behavior:

COLLECTION ITERATOR PICKLER FETCH (functional evaluation)
XMLTABLE EVALUATION

The COLLECTION ITERATOR PICKLER FETCH (XQSEQUENCEFROMXMLTYPE), an in memory operation, is typically a sign that the operation is solved by general XML Parser means, that is, no Oracle side (mostly storage) optimization is in effect.

The optimized XVM or XML parser method is faster in 12.1, than the one used in 11.2. In database version 12.1.0.2, this results in 54.05 seconds. In database version 11.2.0.3, the result was 83.78 seconds. So (FOR THIS EXAMPLE!) the difference is significant while generating 1.000.000 numbers, between the two database versions, almost 30 seconds...

--
explain plan for
SELECT column_value as rno
FROM xmltable ('1 to 1000000')
--
SELECT * FROM table(DBMS_XPLAN.DISPLAY);
--
-----------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                       |  8168 | 16336 |    29   (0)| 00:00:01 |
|   1 |  COLLECTION ITERATOR PICKLER FETCH| XQSEQUENCEFROMXMLTYPE |  8168 | 16336 |    29   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------
--
OR...(via /*+NO_XML_QUERY_REWRITE*/ in 12.1)
--
explain plan for
SELECT /*+NO_XML_QUERY_REWRITE*/
     column_value as rno
FROM xmltable ('1 to 1000000')
--
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  8168 | 16336 |    29   (0)| 00:00:01 |
|   1 |  XMLTABLE EVALUATION |      |       |       |            |          |
-----------------------------------------------------------------------------

There are multiple XML related hints you could use in 12.1.

SELECT NAME, 
       INVERSE, 
       SQL_FEATURE, 
       VERSION 
  FROM V$SQL_HINT 
 WHERE NAME LIKE '%XM%' 
ORDER BY VERSION, NAME;

Via the statement above you get the following output in 12.1.0.2:
XML Hints 12.1

Using the query transformation hint NO_XML_QUERY_REWRITE in Oracle 12.1 decreases the performance.

NO_XML_QUERY_REWRITE

The NO_XML_QUERY_REWRITE hint instructs the optimizer to prohibit the rewriting of XPath expressions in SQL statements. By prohibiting the rewriting of XPath expressions, this hint also prohibits the use of any XMLIndexes for the current query.

So from the XVM / XML DB standpoint, what is the effect from using this HINT in our current used statement?

Tracing via XMLOptimizationCheck

Using “XMLOptimizationCheck” or the alternative event 19027 (“19027 trace name context forever, level 0x2000”), the following trace content is generated:

Without the hint:

	XMLTABLE RWT QUERY 

select  sys_xqseq2con(value(QMXTABF$)) 
        as column_value 
from table(xqsequence( xmlquery('1 to 1000000' RETURNING SEQUENCE BY VALUE)  )
          ) QMXTABF$

	XMLTABLE RWT QUERY END
NO REWRITE
	Reason ==> xseq:not lazydef
NO REWRITE
	Reason ==> xseq:not lazydef
NO REWRITE
	Reason ==> xseq:not lazydef
NO REWRITE
	Reason ==> xseq:not lazydef
NO REWRITE
	Reason ==> xseq:not lazydef
NO REWRITE
	Reason ==> xseq:not lazydef

With the hint:

	XMLTABLE RWT QUERY 

select /*+ NO_XML_QUERY_REWRITE */  
       sys_xqseq2con(value(QMXTABF$)) 
       as column_value 
from table(xqsequence( xmlquery('1 to 1000000' RETURNING SEQUENCE BY VALUE)  )
          ) QMXTABF$

	XMLTABLE RWT QUERY END
NO REWRITE
	Reason ==> xseq:not optuop
NO REWRITE
	Reason ==> xseq:not optuop
NO REWRITE
	Reason ==> xseq:not optuop

So both statements will be “rewritten” into the following, but no optimization or rewrite will be done (with or without the HINT)

select  sys_xqseq2con(value(QMXTABF$)) 
        as column_value 
from table(xqsequence( xmlquery('1 to 1000000' RETURNING SEQUENCE BY VALUE)  )
          ) QMXTABF$

Apparently this will we passed on the SQL Optimizer.

Tracing via 10053 event

The 10053 SQL Cost based optimizer event, shows now the following snippets in its trace file content (still Oracle 12.1.0.2 of course)

Without the hint:

******************************************
----- Current SQL Statement for this session (sql_id=7u0zqnpxzwbj7) -----
SELECT column_value as rno
      FROM xmltable ('1 to 1000000')
*******************************************


=====================================
SPD: BEGIN context at statement level
=====================================
Stmt: ******* UNPARSED QUERY IS *******
SELECT SYS_XQSEQ2CON(VALUE(KOKBF$0)) "RNO" FROM TABLE("SYS"."XQSEQUENCE"(SYS_XQ_NRNG(1,1000000))) "KOKBF$0"
Objects referenced in the statement
Objects in the hash table
Return code in qosdInitDirCtx: ENBLD
===================================
SPD: END context at statement level
===================================
Final query after transformations:******* UNPARSED QUERY IS *******
SELECT SYS_XQSEQ2CON(VALUE(KOKBF$0)) "RNO" FROM TABLE("SYS"."XQSEQUENCE"(SYS_XQ_NRNG(1,1000000))) "KOKBF$0"
kkoqbc: optimizing query block SEL$3463D746 (#0)
        
        :
    call(in-use=19584, alloc=353744), compile(in-use=105568, alloc=108576), execution(in-use=28760, alloc=32424)

kkoqbc-subheap (create addr=0x000000001A5BEF10)
****************
QUERY BLOCK TEXT
****************
SELECT column_value as rno
      FROM xmltable ('1 to 1000000')

...
..
...
	  
============
Plan Table
============
------------------------------------------------------------------+-----------------------------------+
| Id  | Operation                          | Name                 | Rows  | Bytes | Cost  | Time      |
------------------------------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT                   |                      |       |       |    29 |           |
| 1   |  COLLECTION ITERATOR PICKLER FETCH | XQSEQUENCEFROMXMLTYPE|  8168 |   16K |    29 |  00:00:01 |
------------------------------------------------------------------+-----------------------------------+
Predicate Information:
----------------------
 
Content of other_xml column
===========================
  db_version     : 12.1.0.2
  parse_schema   : IM
  plan_hash_full : 1791238968
  plan_hash      : 4083502163
  plan_hash_2    : 1791238968
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('12.1.0.2')
      DB_VERSION('12.1.0.2')
      ALL_ROWS
-- --------------------------------	  
      FORCE_XML_QUERY_REWRITE
      XML_DML_RWT_STMT
      XMLINDEX_REWRITE
      XMLINDEX_REWRITE_IN_SELECT
      NO_COST_XML_QUERY_REWRITE
-- --------------------------------	  
      OUTLINE_LEAF(@"SEL$3463D746")
      MERGE(@"SEL$2")
      OUTLINE(@"SEL$C5A29E22")
      MERGE(@"SEL$4F380743")
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$4F380743")
      OUTLINE(@"SEL$AC912020")
      FULL(@"SEL$3463D746" "KOKBF$0"@"SEL$2")
    END_OUTLINE_DATA
  */	  
	  
...
..
...

Query Block Registry:
SEL$2 0x1a5b1a00 (PARSER)
  SEL$3463D746 0x2349c98 (VIEW MERGE SEL$C5A29E22; SEL$2) [FINAL]
SEL$1 0x2349c98 (PARSER)
  SEL$C5A29E22 0x2349c98 (VIEW MERGE SEL$1; SEL$4F380743)
    SEL$3463D746 0x2349c98 (VIEW MERGE SEL$C5A29E22; SEL$2) [FINAL]
SEL$AC912020 0x2340dc0 (PARSER)
  SEL$4F380743 0x2340dc0 (QUERY BLOCK TABLES CHANGED SEL$AC912020)
    SEL$C5A29E22 0x2349c98 (VIEW MERGE SEL$1; SEL$4F380743)
      ...

:
    call(in-use=29816, alloc=188960), compile(in-use=145608, alloc=212992), execution(in-use=38936, alloc=40536)

End of Optimizer State Dump
Dumping Hints
=============
====================== END SQL Statement Dump ======================

With the hint:

******************************************
----- Current SQL Statement for this session (sql_id=889shsakhcccb) -----
SELECT /*+NO_XML_QUERY_REWRITE*/ 
           column_value as rno
      FROM xmltable ('1 to 1000000')
	  
=====================================
SPD: BEGIN context at statement level
=====================================
Stmt: ******* UNPARSED QUERY IS *******
SELECT SYS_XQSEQ2CON("P"."C_00$") "RNO" FROM XMLTABLE(/*XMLTAB_RWS*/ '1 to 1000000' PASSING '1 to 1000000' COLUMNS "C_00$" XMLTYPE PATH '.')  "P"
Objects referenced in the statement
Objects in the hash table
Return code in qosdInitDirCtx: ENBLD
===================================
SPD: END context at statement level
===================================
Final query after transformations:******* UNPARSED QUERY IS *******
SELECT SYS_XQSEQ2CON("P"."C_00$") "RNO" FROM XMLTABLE(/*XMLTAB_RWS*/ '1 to 1000000' PASSING '1 to 1000000' COLUMNS "C_00$" XMLTYPE PATH '.')  "P"
kkoqbc: optimizing query block SEL$70B7277A (#0)
        
        :
    call(in-use=13952, alloc=343608), compile(in-use=91544, alloc=93208), execution(in-use=22384, alloc=24312)

...
..
...

============
Plan Table
============
----------------------------------------+-----------------------------------+
| Id  | Operation             | Name    | Rows  | Bytes | Cost  | Time      |
----------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT      |         |       |       |    29 |           |
| 1   |  XMLTABLE EVALUATION  |         |       |       |       |           |
----------------------------------------+-----------------------------------+
Predicate Information:
----------------------
 
Content of other_xml column
===========================
  db_version     : 12.1.0.2
  parse_schema   : IM
  plan_hash_full : 1897395438
  plan_hash      : 3781821901
  plan_hash_2    : 1897395438
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('12.1.0.2')
      DB_VERSION('12.1.0.2')
      ALL_ROWS
-- --------------------------------	  
      XML_DML_RWT_STMT
      XMLINDEX_REWRITE
      XMLINDEX_REWRITE_IN_SELECT
      NO_COST_XML_QUERY_REWRITE
-- --------------------------------	  
      OUTLINE_LEAF(@"SEL$70B7277A")
      MERGE(@"SEL$5D74F85A")
      OUTLINE(@"SEL$81403AD4")
      MERGE(@"SEL$2")
      OUTLINE(@"SEL$5D74F85A")
      OUTLINE(@"SEL$C8284B8B")
      MERGE(@"SEL$F335B45F")
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$C1DFDF14")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$F335B45F")
      OUTLINE(@"SEL$1CFB9CC0")
      FULL(@"SEL$70B7277A" "P"@"SEL$C1DFDF14")
    END_OUTLINE_DATA
  */

...
..
...
 
Query Block Registry:
SEL$C1DFDF14 0x1a563110 (PARSER)
  SEL$5D74F85A 0x1a563110 (QUERY BLOCK TABLES CHANGED SEL$C1DFDF14)
    SEL$70B7277A 0x23498b0 (VIEW MERGE SEL$81403AD4; SEL$5D74F85A) [FINAL]
SEL$2 0x18873c40 (PARSER)
  SEL$81403AD4 0x23498b0 (VIEW MERGE SEL$C8284B8B; SEL$2)
    SEL$70B7277A 0x23498b0 (VIEW MERGE SEL$81403AD4; SEL$5D74F85A) [FINAL]
SEL$1 0x23498b0 (PARSER)
  SEL$C8284B8B 0x23498b0 (VIEW MERGE SEL$1; SEL$F335B45F)
    SEL$81403AD4 0x23498b0 (VIEW MERGE SEL$C8284B8B; SEL$2)
      ...
SEL$1CFB9CC0 0x23410a8 (PARSER)
  SEL$F335B45F 0x23410a8 (QUERY BLOCK TABLES CHANGED SEL$1CFB9CC0)
    SEL$C8284B8B 0x23498b0 (VIEW MERGE SEL$1; SEL$F335B45F)
      ...

:
    call(in-use=21544, alloc=343608), compile(in-use=122312, alloc=193384), execution(in-use=26408, alloc=28368)

End of Optimizer State Dump

Dumping Hints
=============
  atom_hint=(@=000007FF65179570 err=0 resol=0 used=1 token=967 org=1 lvl=1 txt=NO_XML_QUERY_REWRITE ())
  atom_hint=(@=000007FF651796E0 err=7 resol=0 used=1 token=967 org=1 lvl=1 txt=NO_XML_QUERY_REWRITE ())
====================== END SQL Statement Dump ======================

Some things in those trace files can be noticed:

The FORCE_XML_QUERY_REWRITE is used when the statement is parsed “as is”
The hints XMLINDEX_REWRITE and XMLINDEX_REWRITE_IN_SELECT are still in effect, despite the use of hint NO_XML_QUERY_REWRITE
The “as is” SQL statement has way less MERGE operations going on during parsing than the SQL statement with the NO_XML_QUERY_REWRITE hint
The CBO explain plan has no COST or BYTES values in the trace file
The NO_COST_XML_QUERY_REWRITE hint is used twice, so the CBO uses its default COST table values

Finally the rewritten statements are different.

Without the hint:

Final query after transformations:
******* UNPARSED QUERY IS *******
SELECT SYS_XQSEQ2CON(VALUE(KOKBF$0)) "RNO" 
FROM TABLE("SYS"."XQSEQUENCE"(SYS_XQ_NRNG(1,1000000))
) "KOKBF$0"

With the hint:

Final query after transformations:
******* UNPARSED QUERY IS *******
SELECT SYS_XQSEQ2CON("P"."C_00$") "RNO" 
FROM XMLTABLE(/*XMLTAB_RWS*/ '1 to 1000000' 
             PASSING '1 to 1000000' 
             COLUMNS "C_00$" 
             XMLTYPE PATH '.')  "P"

Conclusions?

I think one can safely state that a “1…1.000.000” number generation via a XMLTABLE statement does not take more than 1 hour to complete. I am guessing that people waited the statement to complete in, for example, Windows command window (CMD), and yes that will take a while before all content has scrolled through your command window…mine indeed, continued show scrolling figures for more than an hour.

Windows command window output

Depending on the database version, the method based on XMLTABLE should take 1 up to 2 minutes time.

I wouldn’t have expected the database performance difference behavior for the “CONNECT BY” methods. Maybe if I have more time, I deep dive into that phenomena as well.

Crossing the bridges and aligning between the SQL/Relational CBO realm and the XML/DB XVM realm in the database is more complex than one would expect. Also much is documented “internal only” and not easy to deduct in respects of the “what is going on here”. A lot of type and conversion functions and methods in the XML XQuery Virtual Machine framework is not documented and makes it therefore very difficult to understand.

One “rule of thumb”, at least, goes up in this context. Do not use HINTS if you are not absolutely sure it is necessary for that particular situation and for that particular database version.

The used NO_XML_QUERY_REWRITE hint had a performance negative effect for this example and apparently is regarding is documentation not on par anymore (“By prohibiting the rewriting of XPath expressions, this hint also prohibits the use of any XMLIndexes for the current query“). Apparently the “XMLINDEX_REWRITE” CBO hint is set by default and I can imagine that there are situations where NO_XML_QUERY_REWRITE alone is just not adequate enough avoiding also XMLIndexes.

…If not only due to all the different optimization methods in place for XML handling in-memory, XPath, XQuery, XMLType Object Relational re-writes, XMLType Binary XML optimizations, behavior differences while using Securefile or Basicfile storage and/or the In-Memory Column Store XML handling optimizations possible…

Addendum – Regarding Marc’s comment

Based on Marc’s comment, I also added the figures here regarding what happens if you project the XMLType content into NUMBER format. As Marc said, but now here also added with some real numbers based on the initial setup and environments used above…

Oracle 12.1.0.2

With hint:

SET timing ON
--
BEGIN
  -- -------------------------------------------------
  -- test with 1.000.000
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
  ( SELECT /*+ no_xml_query_rewrite */ i 
     FROM xmltable('1 to 1000000' 
                    COLUMNS i NUMBER path '.'
                  ) 
   )
  LOOP
    NULL;
  END LOOP;
  timer.SHOW('NUMBER PROJECTION');
  -- -------------------------------------------------
END;
/

[NUMBER PROJECTION] 7.24 seconds

anonymous block completed
Elapsed: 00:00:07.293

Without hint:

SET timing ON
--
BEGIN
  -- -------------------------------------------------
  -- test with 1.000.000
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
  ( SELECT i 
     FROM xmltable('1 to 1000000' 
                    COLUMNS i NUMBER path '.'
                  ) 
   )
  LOOP
    NULL;
  END LOOP;
  timer.SHOW('NUMBER PROJECTION');
  -- -------------------------------------------------
END;
/

[NUMBER PROJECTION] 8.26 seconds

anonymous block completed
Elapsed: 00:00:08.399

Oracle 11.2.0.3

With hint:

SET timing ON
--
BEGIN
  -- -------------------------------------------------
  -- test with 1.000.000
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
  ( SELECT /*+ no_xml_query_rewrite */ i 
     FROM xmltable('1 to 1000000' 
                    COLUMNS i NUMBER path '.'
                  ) 
   )
  LOOP
    NULL;
  END LOOP;
  timer.SHOW('NUMBER PROJECTION');
  -- -------------------------------------------------
END;
/

[NUMBER PROJECTION] 9.35 seconds

anonymous block completed
Elapsed: 00:00:09.395

Without hint:

SET timing ON
--
BEGIN
  -- -------------------------------------------------
  -- test with 1.000.000
  -- -------------------------------------------------
  timer.SNAP;
  FOR timecheck IN 
  ( SELECT i 
     FROM xmltable('1 to 1000000' 
                    COLUMNS i NUMBER path '.'
                  ) 
   )
  LOOP
    NULL;
  END LOOP;
  timer.SHOW('NUMBER PROJECTION');
  -- -------------------------------------------------
END;
/

[NUMBER PROJECTION] 3.70 seconds

anonymous block completed
Elapsed: 00:00:03.768

3 Comments

Marc

March 10

Interesting.

As you said, the test case posted on https://blogs.oracle.com/sql/entry/row_generators_part_2 is not relevant at all (no db version, no settings whatsoever).

Besides, XMLTABLE is not used correctly for this task since, by omitting the COLUMNS clause, we force returning each row as an XMLType, which causes much overhead.

We get much better results by defining the projection as a NUMBER.

Run on 12.1.0.2, on my laptop :

SQL> select /*+ no_xml_query_rewrite */ i 
     from xmltable('1 to 1000000' 
                    columns i number path '.'
                  );
1000000 rows selected.

Elapsed: 00:00:10.87

Execution Plan
----------------------------------------------------------
Plan hash value: 3781821901
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  8168 | 16336 |    29   (0)| 00:00:01 |
|   1 |  XMLTABLE EVALUATION |      |       |       |            |          |
-----------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
   18913643  bytes sent via SQL*Net to client
     733877  bytes received via SQL*Net from client
      66668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1000000  rows processed

Marco Gralike

March 10

The more interesting part, for me, was the “deep-dive” into the internal mechanics…

🙂
Martin Preiss

March 28

Hi Marco,
I almost missed your article (having reached a remarkable delay in my blog reading) – but I am very glad that my notice provoked your thorough analysis of the behaviour: I think you answer some of the questions from the initial post in the Oracle Developer Advocates for SQL blog (and raise some others).

At this point I have to admit that I did not even try to verify the timing information from Natalka Roshak’s investigation: I guess I stopped the execution of the the XML operation after less than a minute – and labeled it “too slow for this purpose”. The windows cmd explanation of the extreme results sounds plausible (Jonathan Lewis already mentioned the arraysize (and roundtrips) as one major factor in the timing); for quick and dirty test I use autotrace with the traceonly option to avoid this kind of client representation problems.

Regards
Martin

P.S.: by the way: whenever I am confronted with questions containing the words Oracle and XML at first I take a look at your blog.

Comments are closed.