Home > HowTo, SQL, XQuery > How To : Parse sibling repeating groups without wrapper element

How To : Parse sibling repeating groups without wrapper element

November 9, 2014 Leave a comment Go to comments

Long time no post, so here’s a new (short) article.
Behind the cryptic title lies a situation like this :

<root>
  <id>1</id>
  <name>SMITH</name>
  <id>2</id>
  <name>WILSON</name>
  <id>3</id>
  <name>GRANT</name>
</root>

where each logical record {id, name} is not enclosed by a row tag.
This is the result we want out of that data :

ID    NAME
----- ---------
    1 SMITH
    2 WILSON
    3 GRANT

 

0. Set up

This script creates an XMLType table (binary XML storage) and a sample XML document :

create table tmp_xml of xmltype;

insert into tmp_xml
select xmlelement("ROOT"
       , xmlagg(
           xmlforest(
             owner, object_name, object_id, object_type, cast(created as timestamp(0)) created, status
           )
         )
       )
from all_objects
where rownum <= 10000;

The resulting XML document has the following structure :

<ROOT>
  <OWNER>SYS</OWNER>
  <OBJECT_NAME>ICOL$</OBJECT_NAME>
  <OBJECT_ID>20</OBJECT_ID>
  <OBJECT_TYPE>TABLE</OBJECT_TYPE>
  <CREATED>2014-09-11T08:40:51</CREATED>
  <STATUS>VALID</STATUS>
  <OWNER>SYS</OWNER>
  <OBJECT_NAME>TAB$</OBJECT_NAME>
  <OBJECT_ID>4</OBJECT_ID>
  <OBJECT_TYPE>TABLE</OBJECT_TYPE>
  <CREATED>2014-09-11T08:40:51</CREATED>
  <STATUS>VALID</STATUS>
  <OWNER>SYS</OWNER>
  <OBJECT_NAME>I_USER2</OBJECT_NAME>
  <OBJECT_ID>47</OBJECT_ID>
  <OBJECT_TYPE>INDEX</OBJECT_TYPE>
  <CREATED>2014-09-11T08:40:51</CREATED>
  <STATUS>VALID</STATUS>
  
  <!-- 59982 more elements here -->
  
</ROOT>

Expected output :

OWNER         OBJECT_ID   OBJECT_NAME   OBJECT_TYPE   CREATED     STATUS
------------- ----------- ------------- ------------- ----------- ---------
SYS           20          ICOL$         TABLE         11/09/2014  VALID
SYS           4           TAB$          TABLE         11/09/2014  VALID
SYS           47          I_USER2       INDEX         11/09/2014  VALID
...

Following are three methods to achieve that.

 

1. XMLTABLE and following-sibling axis

select x.*
from tmp_xml t
   , xmltable(
       'for $i in /ROOT/OWNER
        return element r {
          $i/following-sibling::OBJECT_NAME[1]
        , $i/following-sibling::OBJECT_ID[1]
        , $i/following-sibling::OBJECT_TYPE[1]
        , $i/following-sibling::CREATED[1]
        , $i/following-sibling::STATUS[1]
        , $i/.
        }'
       passing t.object_value
       columns owner        varchar2(30) path 'OWNER'
             , object_name  varchar2(30) path 'OBJECT_NAME'
             , object_id    number       path 'OBJECT_ID'
             , object_type  varchar2(19) path 'OBJECT_TYPE'
             , created      timestamp    path 'CREATED'
             , status       varchar2(7)  path 'STATUS'
     ) x
;

 

2. XMLTABLE + PIVOT operation

Here, XMLTABLE is only used to produce a dataset containing the column name and its value, then the pivot operator transposes each subset of six consecutive column into a single row, building back the original data in relational format.
Restriction : each logical record must be complete, i.e. all six elements must exist for each row.

with all_nodes as (
  select x.node_name
       , x.node_value
       , trunc((rn-1)/6) as rn
  from tmp_xml t
     , xmltable(
         '/ROOT/*'
         passing t.object_value
         columns node_name    varchar2(30)   path 'local-name(.)'
               , node_value   varchar2(4000) path '.'
               , rn           for ordinality
       ) x
)
select owner
     , object_id
     , object_name
     , object_type
     , to_date(created, 'YYYY-MM-DD"T"HH24:MI:SS') as created
     , status
from all_nodes
pivot (
  min(node_value)
  for node_name in (
    'OWNER'       as owner
  , 'OBJECT_ID'   as object_id
  , 'OBJECT_NAME' as object_name
  , 'OBJECT_TYPE' as object_type
  , 'CREATED'     as created
  , 'STATUS'      as status
  )
) ;

 

3. XMLTABLE + RETURNING SEQUENCE BY REF

Available starting with 12c, the RETURNING SEQUENCE BY REF clause (see this post) allows us to reference nodes (typically ancestors or siblings) relative to the context node passed from the main XQuery expression, thus providing a simple alternative to the first method described above.

select x.*
from tmp_xml t
   , xmltable(
       '/ROOT/OWNER'
       passing t.object_value
       returning sequence by ref
       columns owner        varchar2(30) path '.'
             , object_name  varchar2(30) path 'following-sibling::OBJECT_NAME[1]'
             , object_id    number       path 'following-sibling::OBJECT_ID[1]'
             , object_type  varchar2(19) path 'following-sibling::OBJECT_TYPE[1]'
             , created      timestamp    path 'following-sibling::CREATED[1]'
             , status       varchar2(7)  path 'following-sibling::STATUS[1]'
     ) x
;

Advertisements
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s