XMLDB Performance: Collection Iterator Pickler Fetch

Nowadays, this “Collection Iterator Pickler Fetch” output in an explain plan is seen less and less, as the XML database functionality becomes stronger and stronger at rewriting statements. Alas, I got one again.

SQL> xquery
  2   let $auction := ora:view("XM_TAB") return
  3   count(
  4     for $i in $auction/site/closed_auctions/closed_auction
  5     where $i/price/text() >= 40
  6     return $i/price
  7*  )

Result Sequence
---------------
6539

Execution Plan
----------------------------------------------------------
Plan hash value: 4083502163

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                       |  8168 | 16336 |    29   (0)| 00:00:01 |
|   1 |  COLLECTION ITERATOR PICKLER FETCH| XQSEQUENCEFROMXMLTYPE |       |       |            |          |
-----------------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
        143  recursive calls
          0  db block gets
     385537  consistent gets
        119  physical reads
          0  redo size
       1439  bytes sent via SQL*Net to client
        942  bytes received via SQL*Net from client
          6  SQL*Net roundtrips to/from client
      19500  sorts (memory)
          0  sorts (disk)
          1  rows processed
SQL>

I couldn’t find much on the internet about the “collection iterator pickler fetch” (let’s call it CIPF for now). These posts from Steve Adams and Tom Kyte shed some light on what is happening.

In the XMLDB realm, I have now found two examples of CIPF.

One emerges when using XQuery (so it can also appear when XMLTABLE is combined with an XQuery statement), the other with the older syntax from 10.1 and before, when using TABLE(XMLSEQUENCE()) or an XMLTable function that is fed an XPath expression. They manifest themselves in an explain plan via:

  • COLLECTION ITERATOR PICKLER FETCH| XQSEQUENCEFROMXMLTYPE (XQuery)
  • COLLECTION ITERATOR PICKLER FETCH| XMLSEQUENCEFROMXMLTYPE (XPath)
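
For reference, the second (older) variant typically comes from something like the following. This is only a sketch, assuming XM_TAB is an XMLType table holding the same XMark-style documents as the XQuery example above:

-- older 10.1-and-before style that tends to show up as
-- COLLECTION ITERATOR PICKLER FETCH| XMLSEQUENCEFROMXMLTYPE
SELECT extractvalue(value(t), '/closed_auction/price')
FROM   xm_tab x,
       TABLE(XMLSEQUENCE(
               extract(value(x), '/site/closed_auctions/closed_auction'))) t
WHERE  extractvalue(value(t), '/closed_auction/price') >= 40;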

In the XML Database this means that the intermediate result is an XML fragment that has to be serialized in memory to get the wanted end result.

Translating the XMLDB issue into “relational world” terms, this would be getting a full table scan when it isn’t appropriate to get one. Most of the time it happens if you use “fuzzy searches”, for instance via a “//element_needed” construct. The simple solution, most of the time, is rewriting the XPath search to the full path needed, for example something like “/root/next_element/element_needed”.
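
To make that concrete, here is a minimal sketch of the difference, again assuming the hypothetical XM_TAB XMLType table from the example above; the first query uses the fuzzy descendant search, the second the explicit full path that gives the optimizer a chance to rewrite:

-- "fuzzy" search: descendant axis, likely to end up in a CIPF
SELECT count(*)
FROM   xm_tab x,
       XMLTABLE('//closed_auction[price >= 40]'
                PASSING value(x)) t;

-- explicit full path: candidate for rewrite without serialization
SELECT count(*)
FROM   xm_tab x,
       XMLTABLE('/site/closed_auctions/closed_auction[price >= 40]'
                PASSING value(x)) t;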

If the intermediate XML fragment that has to be serialized is too big, then you not only use a lot of CPU time, but will probably also run out of (PGA) memory. Combined with some DOM validation, this will kill your performance.

So my general rule of thumb for this is: if I encounter one of these CIPF issues, then I will have to find a different XML database approach, because in the long run this will probably not be a performant solution.

If you encounter this CIPF issue while making use of XMLTypes based on CLOB or Binary XML storage, an XMLType index supporting your query could help, if the XML database functionality can rewrite your statement based on the information in the indexed PATH_TABLE.
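
For example, something along these lines (a sketch only; the index and path table names are made up for illustration, and object_value assumes XM_TAB is an XMLType table):

-- structureless XMLIndex with its PATH_TABLE, which the optimizer can use
-- to rewrite XPath/XQuery against CLOB or Binary XML storage
CREATE INDEX xm_tab_xmlidx ON xm_tab (object_value)
  INDEXTYPE IS XDB.XMLINDEX
  PARAMETERS ('PATH TABLE xm_tab_path_table');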

HTH

M.

Written by: Marco Gralike

3 Comments

  1. Ants
    October 26

    > Nowadays, this “Collection Iterator Pickler Fetch” output in an explain plan is less and less seen..

    If you process xml on the fly you always end up with CIPF,
    for example if you pass an xml into a procedure and you extract data inside a cursor.

  2. October 26

    I totally agree, but if you are handling XML data “on disk” then this shouldn’t be the case, because you have the opportunity to support the process via indexes, extra info for the optimizer such as constraints, etc., which should (if realistic) result in avoiding a CIPF.
