JSON Patch, Merge Patch, and Redact features in Oracle 18c

August 27, 2018

These days, I am looking into the latest additions to SODA in Oracle Database 18c, in particular the REST implementation available via Oracle REST Data Services (ORDS) : SODA for REST.

I wrote about SODA once in the past (see here), to give a preview of JSON filter expressions before their official documentation in release 12.2.

Before looking more closely at the features mentioned in the title, I’ll set up a little SODA collection.

A collection of documents created and managed via SODA is backed by a database table containing at least an “id” column and a “content” column to store JSON documents.
Though Oracle 18c introduced SODA for PL/SQL as well, I’ll use the HTTP protocol to set up the following example.
 

Creating a new SODA collection

Without custom metadata, documents of a SODA collection are stored in a BLOB column by default.
For subsequent ease-of-use, I’ll pass the following metadata to create the table with the minimum structure : a numeric “ID” column based on a sequence, and a “DOC” column of CLOB data type.

collMetadata.json

{
  "tableName": "JSON_DOCUMENTS",
  "keyColumn" : {
    "name": "ID",
    "sqlType": "NUMBER",
    "assignmentMethod": "SEQUENCE",
    "sequenceName": "JSON_DOC_SEQ"
  },
  "contentColumn": {
    "name": "DOC",
    "sqlType": "CLOB",
    "cache": true
  }
}
Creating the sequence :
create sequence json_doc_seq;

 

Creating the collection, using a curl command :
curl -i -X PUT --data-binary @collMetadata.json -H "Content-Type: application/json" \
http://localhost:8083/ords/dev/soda/latest/MyCollection

SQL> desc json_documents

 Name    Null?    Type
 ------- -------- -------------
 ID      NOT NULL NUMBER
 DOC              CLOB

 

Inserting documents :

I’ll load the ready-to-use sample file ‘POList.json’ available in the ./examples/soda/getting-started directory of the ORDS installation.
This file holds a JSON array containing 70 PO documents.

curl -X POST --data-binary @POList.json -H "Content-Type: application/json" \
http://localhost:8083/ords/dev/soda/latest/custom-actions/insert/MyCollection

Here’s the content of the table after successful execution :

SQL> select * from json_documents order by id;

        ID DOC
---------- ----------------------------------------------------------------------------
         1 {"PONumber":1,"Reference":"MSULLIVA-20141102","Requestor":"Martha Sullivan",
         2 {"PONumber":2,"Reference":"MSULLIVA-20141113","Requestor":"Martha Sullivan"
         3 {"PONumber":3,"Reference":"TRAJS-20140518","Requestor":"Trenna Rajs","User":
         4 {"PONumber":4,"Reference":"TRAJS-20140520","Requestor":"Trenna Rajs","User":
         5 {"PONumber":5,"Reference":"MSULLIVA-20141121","Requestor":"Martha Sullivan",
         6 {"PONumber":6,"Reference":"TRAJS-20140530","Requestor":"Trenna Rajs","User":
         7 {"PONumber":7,"Reference":"VJONES-20140503","Requestor":"Vance Jones","User"
...
        66 {"PONumber":66,"Reference":"SMCCAIN-20141007","Requestor":"Samuel McCain","U
        67 {"PONumber":67,"Reference":"SMCCAIN-20141010","Requestor":"Samuel McCain","U
        68 {"PONumber":68,"Reference":"SHIGGINS-20141028","Requestor":"Shelley Higgins"
        69 {"PONumber":69,"Reference":"KPARTNER-20140905","Requestor":"Karen Partners",
        70 {"PONumber":70,"Reference":"SSTILES-20141011","Requestor":"Stephen Stiles","

70 rows selected.

 
The setup is complete.
Let’s now try out the new PATCH operation.
 

JSON PATCH operation

PATCH is a standard request method of the HTTP protocol, whose purpose is to make piecewise changes to a resource (a JSON document in this case).
The SODA HTTP PATCH operation implements the JSON Patch standard as per RFC 6902.
A JSON Patch specification is itself a JSON document representing an array of Patch steps. For example :

poPatchSpec.json

[
  { "op"    : "test",
    "path"  : "/ShippingInstructions/Address/street",
    "value" : "200 Sporting Green" },
  { "op"    : "replace",
    "path"  : "/ShippingInstructions/Address/street",
    "value" : "Winchester House, Heatley Rd" },
  { "op"    : "copy",
    "from"  : "/ShippingInstructions/Phone/0",
    "path"  : "/ShippingInstructions/Phone/1" },
  { "op"    : "replace",
    "path"  : "/ShippingInstructions/Phone/1/number",
    "value" : "861-555-8765" }
]

– First step is a “test” operation, which acts as a filter predicate.
It simply checks whether the value at location '/ShippingInstructions/Address/street' is '200 Sporting Green'. If not, processing stops; otherwise it continues with the next steps.

– Second step is a “replace” operation.
It replaces the value at '/ShippingInstructions/Address/street' with value 'Winchester House, Heatley Rd'.

– Third step is a “copy” operation.
It copies the value at location '/ShippingInstructions/Phone/0' (an array location) to the target location '/ShippingInstructions/Phone/1'.

– Fourth step is another “replace” operation, this time targeting the ‘number’ field of the new Phone item added in the previous step.

Path expressions used in JSON Patch are defined in RFC 6901 (JSON Pointer). Note that array indices are 0-based, so '/ShippingInstructions/Phone/0' designates the first Phone entry and '/ShippingInstructions/Phone/1' the second.

Here’s the relevant data for doc ID = 1 before applying the patch spec :

SQL> select po.*
  2  from json_documents t
  3     , json_table(t.doc, '$.ShippingInstructions'
  4         columns street varchar2(30)  path '$.Address.street'
  5               , phone  varchar2(256) format json with array wrapper path '$.Phone[*].number'
  6       ) po
  7  where t.id = 1;

STREET                         PHONE
------------------------------ --------------------
200 Sporting Green             ["979-555-6598"]
 

Applying patch :

curl -i -X PATCH --data-binary @poPatchSpec.json -H "Content-Type: application/json-patch+json" \
http://localhost:8083/ords/dev/soda/latest/MyCollection/1

Data after patch operation :

SQL> select po.*
  2  from json_documents t
  3     , json_table(t.doc, '$.ShippingInstructions'
  4         columns street varchar2(30)  path '$.Address.street'
  5               , phone  varchar2(256) format json with array wrapper path '$.Phone[*].number'
  6       ) po
  7  where t.id = 1;

STREET                         PHONE
------------------------------ -----------------------------------
Winchester House, Heatley Rd   ["979-555-6598","861-555-8765"]
 

 
So what happened behind the scenes?
We can check that using SQL_Trace, or by just querying V$SQL (before cursors age out of the shared pool).
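For instance, with a query along these lines (just a quick sketch — the LIKE filter is my own guess at what to look for) :

select sql_text
from v$sql
where lower(sql_text) like '%json_documents%'
order by last_active_time desc;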
Here are my findings :

select DBMS_SODA_DOM.JSON_PATCH_C("DOC",:1 ),to_char("ID") from "DEV"."JSON_DOCUMENTS" where ("ID" = to_number(:2 ) )

update "DEV"."JSON_DOCUMENTS" set "DOC" = :1  where ("ID" = to_number(:2 ) )

The first statement selects the document and applies the patch using the DBMS_SODA_DOM.JSON_PATCH_C function.
The second statement saves the document back into the collection table.

DBMS_SODA_DOM is not documented. It contains various functions supporting – among other interesting things – the PATCH method against different data types (VARCHAR2, NVARCHAR2, RAW, CLOB, NCLOB, BLOB).

Pushing the analysis a bit further, we can discover that those PATCH routines are actually based on the PL/SQL JSON_ELEMENT_T.patch() method :

MEMBER PROCEDURE patch(self IN OUT NOCOPY JSON_ELEMENT_T, spec VARCHAR2)

 
Let’s now experiment with this method directly from PL/SQL.
 

Disclaimer :
The patch method is not yet documented, hence “unsupported” for direct usage.
Everything that follows is therefore presented for informational purposes only.
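
As a quick teaser, here’s a minimal sketch of what a direct call may look like (the patch spec is an ad hoc example, and since the method is undocumented, the exact behaviour may differ) :

declare
  elem  JSON_ELEMENT_T;
begin
  -- parse a small document, then apply an RFC 6902 patch spec in place
  elem := JSON_ELEMENT_T.parse('{"a":1,"b":2}');
  elem.patch('[{"op":"replace","path":"/a","value":42}]');
  dbms_output.put_line(elem.to_string);  -- expected : {"a":42,"b":2}
end;
/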

 
Read more…

Categories: JSON, PL/SQL, SQL

ExcelTable 2.3 : new API for DML operations

August 23, 2018

We have seen in previous posts that PL/SQL does not support static references to pipelined table functions returning an AnyDataset instance.
Somehow, Oracle is not able to call the ODCITableDescribe routine at compile time in order to create the necessary object and collection types backing the dataset :

SQL> begin
  2
  3    insert into tmp_ooxdata (id, name)
  4    select t.id, t.name
  5    from table(
  6           ExcelTable.getRows(
  7             p_file  => ExcelTable.getFile('TMP_DIR','ooxdata.xlsx')
  8           , p_sheet => 'DataSource'
  9           , p_cols  => '"ID" number, "NAME" varchar2(10)'
 10           )
 11         ) t
 12    ;
 13
 14  end;
 15  /
  from table(
       *
ERROR at line 5:
ORA-06550: line 5, column 8:
PL/SQL: ORA-22905: cannot access rows from a non-nested table item
 

 
The first (obvious) workaround is to use dynamic SQL :

SQL> declare
  2
  3    stmt  varchar2(2000) :=
  4    q'{
  5    insert into tmp_ooxdata (id, name)
  6    select t.id, t.name
  7    from table(
  8           ExcelTable.getRows(
  9             p_file  => :1
 10           , p_sheet => :2
 11           , p_cols  => '"ID" number, "NAME" varchar2(10)'
 12           )
 13         ) t
 14    }';
 15
 16  begin
 17
 18    execute immediate stmt
 19    using ExcelTable.getFile('TMP_DIR','ooxdata.xlsx')
 20        , 'DataSource'
 21    ;
 22
 23    dbms_output.put_line(sql%rowcount || ' rows inserted.');
 24
 25  end;
 26  /

50000 rows inserted.

PL/SQL procedure successfully completed.
 

 

When the purpose is to use the getRows() function in a cursor, I’ve added a convenience function getCursor(), taking the same set of parameters as getRows() and returning a REF cursor (SYS_REFCURSOR) that the user may consume from PL/SQL or another client language :

SQL> declare
  2
  3    type rec_t is record (id number, name varchar2(10));
  4
  5    my_rec  rec_t;
  6    rc      sys_refcursor;
  7
  8  begin
  9
 10    rc := ExcelTable.getCursor(
 11            p_file  => ExcelTable.getFile('TMP_DIR','ooxdata.xlsx')
 12          , p_sheet => 'DataSource'
 13          , p_cols  => '"ID" number, "NAME" varchar2(10)'
 14          , p_range => '1:10'
 15          );
 16
 17    loop
 18      fetch rc into my_rec;
 19      exit when rc%notfound;
 20      dbms_output.put_line(to_char(my_rec.id,'fm0999')||';'||my_rec.name);
 21    end loop;
 22
 23    close rc;
 24
 25  end;
 26  /

0001;LINE-00001
0002;LINE-00002
0003;LINE-00003
0004;LINE-00004
0005;LINE-00005
0006;LINE-00006
0007;LINE-00007
0008;LINE-00008
0009;LINE-00009
0010;LINE-00010

PL/SQL procedure successfully completed.
 

 

ExcelTable 2.3 introduces another way to load data from a spreadsheet file, directly into a given target table (or view), using one of these DML operations :

  • INSERT
  • UPDATE
  • MERGE

A DELETE mode is also available to delete rows in a table based on data (keys) from the external spreadsheet file.

The DML API comprises three routines :
 

1) createDMLContext

function createDMLContext (
  p_table_name in varchar2    
)
return DMLContext;

This creates a new DML context for the specified target table or view.
The table name may be passed as a simple or qualified SQL name. For instance, these are valid examples of table specification :

MY_TABLE
MY_SCHEMA.MY_TABLE
"myTable"
MY_SCHEMA."myTable"

Database link spec is not supported yet.
 

2) mapColumn

procedure mapColumn (
  p_ctx       in DMLContext
, p_col_name  in varchar2
, p_col_ref   in varchar2
, p_format    in varchar2     default null
, p_meta      in pls_integer  default null
, p_key       in boolean      default false
);

For a given DML context, maps a spreadsheet column reference (A, B, C …) to a column of the target table.
Optional arguments p_format and p_meta are the equivalents of the FORMAT and FOR METADATA clauses, respectively, from the column specs of the getRows() function.
Optional argument p_key, when set to true, indicates that the column is (part of) the key of the input dataset. At least one key column must be specified for an UPDATE, MERGE or DELETE context; otherwise an exception is raised.
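
For instance (a hypothetical snippet — ctx is a previously created DML context, and the column references and format mask are arbitrary) :

ExcelTable.mapColumn(ctx, p_col_name => 'ID', p_col_ref => 'A', p_key => true);
ExcelTable.mapColumn(ctx, p_col_name => 'HIRE_DATE', p_col_ref => 'D', p_format => 'DD/MM/YYYY');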
 

3) loadData

function loadData (
  p_ctx          in DMLContext
, p_file         in blob
, p_sheet        in varchar2
, p_range        in varchar2       default null
, p_method       in binary_integer default DOM_READ
, p_password     in varchar2       default null
, p_dml_type     in pls_integer    default DML_INSERT
, p_err_log      in varchar2       default null
)
return integer;

Function loadData() performs the data loading operation according to the DML mode specified via the p_dml_type argument.
Argument p_err_log may be used to pass a DML error logging clause.
Note that the syntax of this clause is not checked before runtime.

The rest of the arguments are identical to those of the getRows() function.
The function returns the number of rows affected by the operation.
 

Download

/mbleron/ExcelTable
 

Examples

Inserting the first 1,000 rows from file sample_2.xlsx into table TMP_SAMPLE2 :

SQL> create table tmp_sample2 (
  2    id   number       primary key
  3  , name varchar2(10)
  4  , val  varchar2(30)
  5  );

Table created.

SQL> declare
  2
  3    ctx    ExcelTable.DMLContext;
  4    nrows  integer;
  5
  6  begin
  7
  8    ctx := ExcelTable.createDMLContext('TMP_SAMPLE2');
  9
 10    ExcelTable.mapColumn(ctx, p_col_name => 'ID',   p_col_ref => 'A');
 11    ExcelTable.mapColumn(ctx, p_col_name => 'NAME', p_col_ref => 'B');
 12    ExcelTable.mapColumn(ctx, p_col_name => 'VAL',  p_col_ref => 'C');
 13
 14    nrows :=
 15    ExcelTable.loadData(
 16      p_ctx      => ctx
 17    , p_file     => ExcelTable.getFile('XL_DATA_DIR','sample_2.xlsx')
 18    , p_sheet    => 'DataSource'
 19    , p_range    => '1:1000'
 20    , p_method   => ExcelTable.STREAM_READ
 21    , p_dml_type => ExcelTable.DML_INSERT
 22    );
 23
 24    dbms_output.put_line(nrows || ' rows inserted.');
 25
 26  end;
 27  /

1000 rows inserted.

PL/SQL procedure successfully completed.

SQL> select count(*) from tmp_sample2;

  COUNT(*)
----------
      1000

SQL> commit;

Commit complete.
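
Since the example above only shows the INSERT mode, here’s a sketch of what an UPDATE of TMP_SAMPLE2 keyed on ID might look like. I’m assuming here that the update-mode constant is named ExcelTable.DML_UPDATE, by analogy with DML_INSERT :

declare

  ctx    ExcelTable.DMLContext;
  nrows  integer;

begin

  ctx := ExcelTable.createDMLContext('TMP_SAMPLE2');

  -- ID is the key used to match existing rows
  ExcelTable.mapColumn(ctx, p_col_name => 'ID',  p_col_ref => 'A', p_key => true);
  ExcelTable.mapColumn(ctx, p_col_name => 'VAL', p_col_ref => 'C');

  nrows :=
  ExcelTable.loadData(
    p_ctx      => ctx
  , p_file     => ExcelTable.getFile('XL_DATA_DIR','sample_2.xlsx')
  , p_sheet    => 'DataSource'
  , p_dml_type => ExcelTable.DML_UPDATE  -- assumed constant name
  );

  dbms_output.put_line(nrows || ' rows updated.');

end;
/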
 

 
Read more…

Interconversion between XML and JSON : Part 1

August 12, 2018

This publication inaugurates a new series focusing on XML/JSON interconversion.

In this first post, I’ll present a PL/SQL implementation of the two XPath 3.1 functions performing lossless conversion between XML and JSON formats : fn:json-to-xml and fn:xml-to-json.

XPath 3.1 bridges the gap between JSON data and the XML Data Model (XDM) by providing facilities to handle JSON via map and array types (introduced as part of XDM 3.1 as well).

The conversion is based on a standardized XML representation of JSON.
Here’s a small sample JSON document and its XML equivalent according to that mapping :

JSON:

{
  "item":[
     {"id":1, "v":"ABC"}
   , {"id":2, "v":"XYZ"}
  ]
}

XML:

<map xmlns="http://www.w3.org/2005/xpath-functions">
  <array key="item">
    <map>
      <number key="id">1</number>
      <string key="v">ABC</string>
    </map>
    <map>
      <number key="id">2</number>
      <string key="v">XYZ</string>
    </map>
  </array>
</map>

A larger yet concise example is available in the specifications.

The transformation from one format to the other is very straightforward : a JSON object translates to an XML “map” element, a JSON array to an “array” element, and so on for scalar JSON types : “string”, “number”, “boolean”, “null”, each complemented with a “key” attribute when the element represents an object member.

Because my implementation makes use of PL/SQL JSON object types, it is only available in Oracle release 12.2 and onwards.
NB : the second (optional) argument of fn:json-to-xml and fn:xml-to-json is not supported (for now).
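
Since the conversion is lossless, feeding the output of one function to the other should give back an equivalent document. A quick sanity check (just a sketch, using the xutl_json package linked below against the sample table of the examples) :

select xutl_json.xml_to_json(xutl_json.json_to_xml(t.doc)) as roundtrip
from json_documents t
where t.id = 1;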

Source code available on GitHub :

/mbleron/JSONUtilities

 

Examples

SQL> select doc from json_documents where id = 1;
 
DOC
--------------------------------------------------
{
  "_id":"53e3c6ed-9bfc-2730-e053-0100007f6afb",
  "content":{
    "name":"obj1",
    "type":1,
    "isNew":true,
    "clientId":null,
    "values":[
      {"name":"x", "v":1},
      {"name":"y", "v":2}
    ]
  }
}
 

converted to XML :

SQL> select xutl_json.json_to_xml(doc) as xml_doc
  2  from json_documents
  3  where id = 1;
 
XML_DOC
--------------------------------------------------------------------
<map xmlns="http://www.w3.org/2005/xpath-functions">
  <string key="_id">53e3c6ed-9bfc-2730-e053-0100007f6afb</string>
  <map key="content">
    <string key="name">obj1</string>
    <number key="type">1</number>
    <boolean key="isNew">true</boolean>
    <null key="clientId"/>
    <array key="values">
      <map>
        <string key="name">x</string>
        <number key="v">1</number>
      </map>
      <map>
        <string key="name">y</string>
        <number key="v">2</number>
      </map>
    </array>
  </map>
</map>
 

 
Another example, using the hr.json file :

SQL> insert into json_documents
  2  values (2, to_clob(bfilename('DATA_DIR','hr.json'), nls_charset_id('AL32UTF8')));

1 row created.

SQL> commit;

Commit complete.

SQL> select xutl_json.json_to_xml(doc) as xml_doc
  2  from json_documents
  3  where id = 2;
 
XML_DOC
----------------------------------------------------------------------
<array xmlns="http://www.w3.org/2005/xpath-functions">
  <map>
    <number key="id">1000</number>
    <string key="address">1297 Via Cola di Rie</string>
    <string key="city">Roma</string>
    <string key="country">IT</string>
  </map>

...

  <map>
    <number key="id">2700</number>
    <string key="address">Schwanthalerstr. 7031</string>
    <string key="city">Munich</string>
    <string key="country">DE</string>
    <array key="departments">
      <map>
        <number key="id">70</number>
        <string key="name">Public Relations</string>
        <number key="manager">204</number>
        <array key="employees">
          <map>
            <number key="id">204</number>
            <string key="lastname">Baer</string>
            <string key="hiredate">1994-06-07T00:00:00</string>
            <number key="salary">10000</number>
          </map>
        </array>
      </map>
    </array>
  </map>
  <map>
    <number key="id">2800</number>
    <string key="address">Rua Frei Caneca 1360 </string>
    <string key="city">Sao Paulo</string>
    <string key="country">BR</string>
  </map>

...

</array>
 

Full output available here : hr.xml

 

What’s next?

We can apply XQuery or XSLT transformation to JSON documents by using an intermediate XML representation of JSON.
Given the hr.json file, the following will output a JSON array containing, for each city having employees, the number of employees and the minimum and maximum salaries. The array will be sorted by country and city :

select xutl_json.xml_to_json(
         xmlquery(
          'declare default element namespace "http://www.w3.org/2005/xpath-functions"; (::)
           element array {
             for $loc in /array/map
             let $emps := $loc//array[@key="employees"]/map
             where exists($emps)
             order by $loc/string[@key="country"], $loc/string[@key="city"]
             return element map {
               $loc/string[@key="city"]
             , element number { attribute key {"count"}, count($emps) }
             , element number { attribute key {"min-salary"}, min($emps/number[@key="salary"]) }
             , element number { attribute key {"max-salary"}, max($emps/number[@key="salary"]) }
             }
           }'
           passing xutl_json.json_to_xml(t.doc)
           returning content
         )
       )
from json_documents t
where t.id = 2;
 

Output (pretty-printed) :

[
  {"city":"Toronto", "count":2, "min-salary":6000, "max-salary":13000},
  {"city":"Munich", "count":1, "min-salary":10000, "max-salary":10000},
  {"city":"London", "count":1, "min-salary":6500, "max-salary":6500},
  {"city":"Oxford", "count":34, "min-salary":6100, "max-salary":14000},
  {"city":"Seattle", "count":18,"min-salary":2500, "max-salary":24000},
  {"city":"South San Francisco", "count":45, "min-salary":2100, "max-salary":8200},
  {"city":"Southlake", "count":5, "min-salary":4200, "max-salary":9000}
]

A similar transformation may be performed using XSLT.
XSLT 1.0, the only version implemented natively inside the database, does not support the fn:min and fn:max functions (part of XPath 2.0). Though they could be implemented using a named template, for simplicity I will just stick to “count” in the example below :

select xutl_json.xml_to_json(
         xmltransform(
           xutl_json.json_to_xml(t.doc)
         , q'~<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns="http://www.w3.org/2005/xpath-functions" xmlns:fn="http://www.w3.org/2005/xpath-functions"
                exclude-result-prefixes="fn">
                <xsl:output method="xml"/>
                <xsl:template match="/"> 
                  <array>
                    <xsl:apply-templates select="fn:array/fn:map">
                      <xsl:sort select="fn:string[@key='country']"/>
                      <xsl:sort select="fn:string[@key='city']"/>
                    </xsl:apply-templates>
                  </array>
                </xsl:template>
                <xsl:template match="fn:map">
                  <xsl:variable name="emps" select=".//fn:array[@key='employees']/fn:map"/>
                  <xsl:if test="$emps">
                    <map>
                      <xsl:copy-of select="fn:string[@key='city']"/>
                      <number key="count"><xsl:value-of select="count($emps)"/></number>
                    </map>
                  </xsl:if>
                </xsl:template>
              </xsl:stylesheet>~'
         )
       )
from json_documents t
where t.id = 2;
 

Output (pretty-printed) :

[
  {"city":"Toronto", "count":2},
  {"city":"Munich", "count":1},
  {"city":"London", "count":1},
  {"city":"Oxford", "count":34},
  {"city":"Seattle", "count":18},
  {"city":"South San Francisco", "count":45},
  {"city":"Southlake", "count":5}
]

 

ExcelTable 2.2 : support for OpenDocument (ODF) files

July 7, 2018

Here’s the latest addition to ExcelTable : support for OpenDocument spreadsheet files (ODF).

The OpenDocument standard has been widely adopted by a great number of organizations.
In France, for instance, ODF is now the recommended standard for office documents in public administrations, and some sectors have already completed their migration from Microsoft Office products (using the Office Open XML standard) to the LibreOffice suite.

ODF, like OOXML (.xlsx, .xlsb, .docx etc.), is a container-based format (ZIP archive) composed of different parts, mainly XML documents, describing the document structure and data.
The comparison stops there though, as ODF is much simpler to understand and manipulate from a programmatic point of view.
Here’s an overview of the container structure :

Everything we need to read the document resides in the Manifest part (META-INF/manifest.xml), which is used to determine whether the document is encrypted (I’ll get to that later), and the content.xml part, which contains all the document structure and data.
The mimetype part may also be read to check which type of document we are dealing with (spreadsheet, text, presentation, etc.).
For instance, a spreadsheet file (.ods) has this MIME type :

application/vnd.oasis.opendocument.spreadsheet
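
As a side note, if APEX is installed in the database, the mimetype part can be peeked at directly with the APEX_ZIP package (just a sketch; ExcelTable.getFile is used here only to load the file into a BLOB) :

declare
  l_file  blob := ExcelTable.getFile('TMP_DIR','test01.ods');
  l_mime  blob;
begin
  -- extract the 'mimetype' stream from the ZIP archive
  l_mime := apex_zip.get_file_content(l_file, 'mimetype');
  dbms_output.put_line(utl_raw.cast_to_varchar2(dbms_lob.substr(l_mime, 100, 1)));
end;
/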

The apparent simplicity also comes with some disadvantages :

  • The content part is globally more verbose than OOXML, with longer tag names and namespace prefixes.
  • String values are not shared across the spreadsheet.
  • Rows and cells are not indexed by their respective row numbers and column references, which makes range-based access of the content more difficult.

As an example, here’s the content part for a three-row spreadsheet table :

<?xml version="1.0" encoding="UTF-8"?>
<office:document-content xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0" xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0" xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0" xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0" xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0" xmlns:number="urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0" xmlns:presentation="urn:oasis:names:tc:opendocument:xmlns:presentation:1.0" xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0" xmlns:chart="urn:oasis:names:tc:opendocument:xmlns:chart:1.0" xmlns:dr3d="urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0" xmlns:math="http://www.w3.org/1998/Math/MathML" xmlns:form="urn:oasis:names:tc:opendocument:xmlns:form:1.0" xmlns:script="urn:oasis:names:tc:opendocument:xmlns:script:1.0" xmlns:ooo="http://openoffice.org/2004/office" xmlns:ooow="http://openoffice.org/2004/writer" xmlns:oooc="http://openoffice.org/2004/calc" xmlns:dom="http://www.w3.org/2001/xml-events" xmlns:xforms="http://www.w3.org/2002/xforms" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:rpt="http://openoffice.org/2005/report" xmlns:of="urn:oasis:names:tc:opendocument:xmlns:of:1.2" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:grddl="http://www.w3.org/2003/g/data-view#" xmlns:tableooo="http://openoffice.org/2009/table" xmlns:drawooo="http://openoffice.org/2010/draw" xmlns:calcext="urn:org:documentfoundation:names:experimental:calc:xmlns:calcext:1.0" xmlns:loext="urn:org:documentfoundation:names:experimental:office:xmlns:loext:1.0" xmlns:field="urn:openoffice:names:experimental:ooo-ms-interop:xmlns:field:1.0" xmlns:formx="urn:openoffice:names:experimental:ooxml-odf-interop:xmlns:form:1.0" xmlns:css3t="http://www.w3.org/TR/css3-text/" office:version="1.2">
  <office:scripts/>
  <office:font-face-decls>
    <style:font-face style:name="Arial1" svg:font-family="Arial"/>
    <style:font-face style:name="Arial" svg:font-family="Arial" style:font-family-generic="swiss"/>
    <style:font-face style:name="Calibri" svg:font-family="Calibri" style:font-family-generic="swiss"/>
    <style:font-face style:name="Cambria" svg:font-family="Cambria" style:font-family-generic="swiss"/>
    <style:font-face style:name="Microsoft Sans Serif" svg:font-family="'Microsoft Sans Serif'" style:font-family-generic="swiss"/>
    <style:font-face style:name="Liberation Sans" svg:font-family="'Liberation Sans'" style:font-family-generic="swiss" style:font-pitch="variable"/>
    <style:font-face style:name="Microsoft YaHei" svg:font-family="'Microsoft YaHei'" style:font-family-generic="system" style:font-pitch="variable"/>
    <style:font-face style:name="Segoe UI" svg:font-family="'Segoe UI'" style:font-family-generic="system" style:font-pitch="variable"/>
    <style:font-face style:name="Tahoma" svg:font-family="Tahoma" style:font-family-generic="system" style:font-pitch="variable"/>
  </office:font-face-decls>
  <office:automatic-styles>
    <style:style style:name="co1" style:family="table-column">
      <style:table-column-properties fo:break-before="auto" style:column-width="16.55mm"/>
    </style:style>
    <style:style style:name="co2" style:family="table-column">
      <style:table-column-properties fo:break-before="auto" style:column-width="56.18mm"/>
    </style:style>
    <style:style style:name="co3" style:family="table-column">
      <style:table-column-properties fo:break-before="auto" style:column-width="21.77mm"/>
    </style:style>
    <style:style style:name="ro1" style:family="table-row">
      <style:table-row-properties style:row-height="4.52mm" fo:break-before="auto" style:use-optimal-row-height="true"/>
    </style:style>
    <style:style style:name="ro2" style:family="table-row">
      <style:table-row-properties style:row-height="8.04mm" fo:break-before="auto" style:use-optimal-row-height="false"/>
    </style:style>
    <style:style style:name="ta1" style:family="table" style:master-page-name="PageStyle_5f_DataSource">
      <style:table-properties table:display="true" style:writing-mode="lr-tb"/>
    </style:style>
    <number:text-style style:name="N100">
      <number:text-content/>
    </number:text-style>
    <style:style style:name="ce1" style:family="table-cell" style:parent-style-name="Default" style:data-style-name="N0">
      <style:table-cell-properties style:rotation-align="none"/>
      <style:text-properties fo:color="#000000" style:text-outline="false" style:text-line-through-style="none" style:text-line-through-type="none" style:font-name="Microsoft Sans Serif" fo:font-size="10pt" fo:font-style="normal" fo:text-shadow="none" style:text-underline-style="none" fo:font-weight="normal" style:font-size-asian="10pt" style:font-style-asian="normal" style:font-weight-asian="normal" style:font-name-complex="Microsoft Sans Serif" style:font-size-complex="10pt" style:font-style-complex="normal" style:font-weight-complex="normal"/>
    </style:style>
    <style:style style:name="ce2" style:family="table-cell" style:parent-style-name="Default" style:data-style-name="N0">
      <style:table-cell-properties style:rotation-align="none"/>
    </style:style>
    <style:style style:name="ce3" style:family="table-cell" style:parent-style-name="Default" style:data-style-name="N100">
      <style:table-cell-properties style:rotation-align="none"/>
      <style:text-properties fo:color="#000000" style:text-outline="false" style:text-line-through-style="none" style:text-line-through-type="none" style:font-name="Microsoft Sans Serif" fo:font-size="10pt" fo:font-style="normal" fo:text-shadow="none" style:text-underline-style="none" fo:font-weight="normal" style:font-size-asian="10pt" style:font-style-asian="normal" style:font-weight-asian="normal" style:font-name-complex="Microsoft Sans Serif" style:font-size-complex="10pt" style:font-style-complex="normal" style:font-weight-complex="normal"/>
    </style:style>
  </office:automatic-styles>
  <office:body>
    <office:spreadsheet>
      <table:calculation-settings table:case-sensitive="false" table:automatic-find-labels="false" table:use-regular-expressions="false" table:use-wildcards="true">
        <table:iteration table:maximum-difference="0.0001"/>
      </table:calculation-settings>
      <table:table table:name="DataSource" table:style-name="ta1">
        <office:forms form:automatic-focus="false" form:apply-design-mode="false"/>
        <table:table-column table:style-name="co1" table:default-cell-style-name="ce2"/>
        <table:table-column table:style-name="co2" table:default-cell-style-name="ce2"/>
        <table:table-row table:style-name="ro1">
          <table:table-cell table:style-name="ce1" office:value-type="float" office:value="1" calcext:value-type="float">
            <text:p>1</text:p>
          </table:table-cell>
          <table:table-cell table:style-name="ce3" office:value-type="string" calcext:value-type="string">
            <text:p>LINE-00001</text:p>
          </table:table-cell>
        </table:table-row>
        <table:table-row table:style-name="ro2">
          <table:table-cell table:style-name="ce1" office:value-type="float" office:value="2" calcext:value-type="float">
            <text:p>2</text:p>
          </table:table-cell>
          <table:table-cell table:style-name="ce3" office:value-type="string" calcext:value-type="string">
            <text:p>LINE-00002</text:p>
          </table:table-cell>
        </table:table-row>
        <table:table-row table:style-name="ro1">
          <table:table-cell table:style-name="ce1" office:value-type="float" office:value="3" calcext:value-type="float">
            <text:p>3</text:p>
          </table:table-cell>
          <table:table-cell table:style-name="ce3" office:value-type="string" calcext:value-type="string">
            <text:p>LINE-00003</text:p>
          </table:table-cell>
        </table:table-row>
      </table:table>
      <table:named-expressions/>
    </office:spreadsheet>
  </office:body>
</office:document-content>

 

For now, the ODF reader in ExcelTable is implemented using the DOM interface only, so the p_method argument of the getRows() function is ignored.
All other existing features are available, for instance :

  • Range-based access of data
  • Cell comments extraction
  • CLOB projection for large cell content
  • Support for password-protected documents

Here’s a sample query based on document test01.ods :

SQL> select t.*
  2  from table(
  3         ExcelTable.getRows(
  4           ExcelTable.getFile('TMP_DIR','test01.ods')
  5         , 'Feuille1'
  6         , q'{
  7             "SRNO"  number         column 'A'
  8           , "VAL"   number         column 'B'
  9           , "DT"    timestamp      column 'C'
 10           , "TXT1"  clob           column 'D'
 11           , "TXT2"  varchar2(1000) column 'E'
 12           , "RN"    for ordinality
 13           }'
 14         , '7:8'
 15         )
 16       ) t
 17  ;
 
      SRNO        VAL DT                             TXT1                            TXT2               RN
---------- ---------- ------------------------------ ------------------------------- ---------- ----------
         4 9999999999 10-JUN-18 03.20.37.000000 PM   €€€€€€€€€€€€€€€€€€€€€€€€€€ABC                       1
         5       -123 11-JUN-18 03.20.37.120000 PM                                   SPARE1              2
 

 

As usual, the source code, documentation and samples are available on GitHub :

/mbleron/ExcelTable
/mbleron/MSUtilities/OfficeCrypto
 

In the remainder of the post, I’ll focus on the encryption methods used in password-protected ODF files.

Read more…

How To : Load JSON Data into Multiple Tables

June 20, 2018

Lately, I’ve been revisiting old posts about miscellaneous XML techniques to see if they could be adapted to similar situations involving JSON data.

One of them is an approach to load a multi-level document into multiple tables :
How To : Load XML data into multiple tables

Here’s the same approach applied to JSON data.
 

Set up

For this test case, I’ll build a JSON document containing the famous HR hierarchy, starting from locations down to employees :

insert into json_documents (id, doc)
select 2
     , json_arrayagg(
         json_object(
           'id' value l.location_id
         , 'address' value l.street_address
         , 'city' value l.city
         , 'country' value l.country_id
         , 'departments' value (
             select json_arrayagg(
                      json_object(
                        'id' value d.department_id
                      , 'name' value d.department_name
                      , 'manager' value d.manager_id
                      , 'employees' value (
                          select json_arrayagg(
                                   json_object(
                                     'id' value e.employee_id
                                   , 'lastname' value e.last_name
                                   , 'hiredate' value e.hire_date
                                   , 'salary' value e.salary
                                   )
                                   order by e.last_name
                                 )
                          from hr.employees e
                          where e.department_id = d.department_id
                        )
                        absent on null
                      )
                      order by d.department_id
                    )
             from hr.departments d
             where d.location_id = l.location_id
           )
           absent on null         
         )
         returning clob
       ) 
       as result
from hr.locations l
;

Below is an excerpt of the generated content (manually formatted for clarity) :

[
  {"id":1000,"address":"1297 Via Cola di Rie","city":"Roma","country":"IT"},
  {"id":1100,"address":"93091 Calle della Testa","city":"Venice","country":"IT"},
  {"id":1200,"address":"2017 Shinjuku-ku","city":"Tokyo","country":"JP"},
  {"id":1300,"address":"9450 Kamiya-cho","city":"Hiroshima","country":"JP"},
  {
    "id":1400,
    "address":"2014 Jabberwocky Rd",
    "city":"Southlake",
    "country":"US",
    "departments":[
      {
        "id":60,
        "name":"IT",
        "manager":103,
        "employees":[
          {"id":105,"lastname":"Austin","hiredate":"1997-06-25T00:00:00","salary":4800},
          {"id":104,"lastname":"Ernst","hiredate":"1991-05-21T00:00:00","salary":6000},
          {"id":103,"lastname":"Hunold","hiredate":"1990-01-03T00:00:00","salary":9000},
          {"id":107,"lastname":"Lorentz","hiredate":"1999-02-07T00:00:00","salary":4200},
          {"id":106,"lastname":"Pataballa","hiredate":"1998-02-05T00:00:00","salary":4800}
        ]
      }
    ]
  },
  {
    "id":1500,
    "address":"2011 Interiors Blvd",
    "city":"South San Francisco",
    "country":"US",
    "departments":[
      ...
    ]
  },
  ...
]

The JSON document possesses three logical levels of entities organized in a master-details structure : locations > departments > employees.

The goal is now to load each entity back into its own table :

create table my_locations as 
select location_id, street_address, city, country_id from hr.locations where 1 = 0;

create table my_departments as
select department_id, department_name, manager_id from hr.departments where 1 = 0;

create table my_employees as 
select employee_id, last_name, hire_date, salary from hr.employees where 1 = 0;

 

Example

Just like the technique referenced in the introduction using chained XMLTABLEs, we can follow the same logic here, using JSON_TABLE and NESTED PATH clauses to flatten out all levels :

select x.location_id
     , x.street_address
     , x.city
     , x.country_id
     , nvl(x.dept_rnum, 1) as dept_rnum
     , x.department_id
     , x.department_name
     , x.manager_id
     , nvl(x.emp_rnum, 1) as emp_rnum
     , x.employee_id
     , x.last_name
     , x.hire_date
     , x.salary
from json_documents t
   , json_table(t.doc,
       '$[*]'
       columns location_id     number(4)    path '$.id'
             , street_address  varchar2(40) path '$.address'
             , city            varchar2(30) path '$.city'
             , country_id      varchar2(2)  path '$.country'
             , nested path '$.departments[*]'
               columns (
                 dept_rnum        for ordinality
               , department_id    number(4)    path '$.id'
               , department_name  varchar2(30) path '$.name'
               , manager_id       number(6)    path '$.manager'
               , nested path '$.employees[*]'
                 columns (
                   emp_rnum     for ordinality
                 , employee_id  number(6)    path '$.id'
                 , last_name    varchar2(25) path '$.lastname'
                 , hire_date    date         path '$.hiredate'
                 , salary       number(8,2)  path '$.salary'
                 )
               )
     ) x
where t.id = 2
;
 
LOCATION_ID STREET_ADDRESS           CITY                  COUNTRY_ID  DEPT_RNUM DEPARTMENT_ID DEPARTMENT_NAME  MANAGER_ID   EMP_RNUM EMPLOYEE_ID LAST_NAME   HIRE_DATE    SALARY
----------- ------------------------ --------------------- ---------- ---------- ------------- ---------------- ---------- ---------- ----------- ----------- ----------- -------
       1000 1297 Via Cola di Rie     Roma                  IT                  1                                                    1                                     
       1100 93091 Calle della Testa  Venice                IT                  1                                                    1                                     
       1200 2017 Shinjuku-ku         Tokyo                 JP                  1                                                    1                                     
       1300 9450 Kamiya-cho          Hiroshima             JP                  1                                                    1                                     
       1400 2014 Jabberwocky Rd      Southlake             US                  1            60 IT                      103          1         105 Austin      25/06/1997     4800
       1400 2014 Jabberwocky Rd      Southlake             US                  1            60 IT                      103          2         104 Ernst       21/05/1991     6000
       1400 2014 Jabberwocky Rd      Southlake             US                  1            60 IT                      103          3         103 Hunold      03/01/1990     9000
       1400 2014 Jabberwocky Rd      Southlake             US                  1            60 IT                      103          4         107 Lorentz     07/02/1999     4200
       1400 2014 Jabberwocky Rd      Southlake             US                  1            60 IT                      103          5         106 Pataballa   05/02/1998     4800
       1500 2011 Interiors Blvd      South San Francisco   US                  1            50 Shipping                121          1         130 Atkinson    30/10/1997     2800
       1500 2011 Interiors Blvd      South San Francisco   US                  1            50 Shipping                121          2         192 Bell        04/02/1996     4000
       1500 2011 Interiors Blvd      South San Francisco   US                  1            50 Shipping                121          3         129 Bissot      20/08/1997     3300
 ...
 

Now we just have to insert this data set into MY_LOCATIONS, MY_DEPARTMENTS and MY_EMPLOYEES using a multitable INSERT statement.
In order to load each location only once, we’ll only take those rows having DEPT_RNUM = 1 and EMP_RNUM = 1.
Likewise for departments, with EMP_RNUM = 1 and excluding rows with no actual department data (DEPARTMENT_ID IS NOT NULL).
Finally, employee data will be loaded from rows where EMPLOYEE_ID is not NULL :

insert all 
when dept_rnum = 1 and emp_rnum = 1 
     then into my_locations (location_id, street_address, city, country_id)
                     values (location_id, street_address, city, country_id)
when department_id is not null and emp_rnum = 1
     then into my_departments (department_id, department_name, manager_id)
                       values (department_id, department_name, manager_id)
when employee_id is not null
     then into my_employees (employee_id, last_name, hire_date, salary)
                     values (employee_id, last_name, hire_date, salary)
select x.location_id
     , x.street_address
     , x.city
     , x.country_id
     , nvl(x.dept_rnum, 1) as dept_rnum
     , x.department_id
     , x.department_name
     , x.manager_id
     , nvl(x.emp_rnum, 1) as emp_rnum
     , x.employee_id
     , x.last_name
     , x.hire_date
     , x.salary
from json_documents t
   , json_table(t.doc,
       '$[*]'
       columns location_id     number(4)    path '$.id'
             , street_address  varchar2(40) path '$.address'
             , city            varchar2(30) path '$.city'
             , country_id      varchar2(2)  path '$.country'
             , nested path '$.departments[*]'
               columns (
                 dept_rnum        for ordinality
               , department_id    number(4)    path '$.id'
               , department_name  varchar2(30) path '$.name'
               , manager_id       number(6)    path '$.manager'
               , nested path '$.employees[*]'
                 columns (
                   emp_rnum     for ordinality
                 , employee_id  number(6)    path '$.id'
                 , last_name    varchar2(25) path '$.lastname'
                 , hire_date    date         path '$.hiredate'
                 , salary       number(8,2)  path '$.salary'
                 )
               )         
     ) x
where t.id = 2
;

156 rows inserted.
 

SQL> select * from my_locations;
 
LOCATION_ID STREET_ADDRESS                           CITY                           COUNTRY_ID
----------- ---------------------------------------- ------------------------------ ----------
       1000 1297 Via Cola di Rie                     Roma                           IT
       1100 93091 Calle della Testa                  Venice                         IT
       1200 2017 Shinjuku-ku                         Tokyo                          JP
       1300 9450 Kamiya-cho                          Hiroshima                      JP
       1400 2014 Jabberwocky Rd                      Southlake                      US
       1500 2011 Interiors Blvd                      South San Francisco            US
...
       2800 Rua Frei Caneca 1360                     Sao Paulo                      BR
       2900 20 Rue des Corps-Saints                  Geneva                         CH
       3000 Murtenstrasse 921                        Bern                           CH
       3100 Pieter Breughelstraat 837                Utrecht                        NL
       3200 Mariano Escobedo 9991                    Mexico City                    MX
 
23 rows selected.
 
SQL> select * from my_departments;
 
DEPARTMENT_ID DEPARTMENT_NAME                MANAGER_ID
------------- ------------------------------ ----------
           60 IT                                    103
           50 Shipping                              121
           10 Administration                        200
           30 Purchasing                            114
           90 Executive                             100
          100 Finance                               108
          110 Accounting                            205
          120 Treasury                       
...                  
          270 Payroll                        
           20 Marketing                             201
           40 Human Resources                       203
           80 Sales                                 145
           70 Public Relations                      204
 
27 rows selected.
 
SQL> select * from my_employees;
 
EMPLOYEE_ID LAST_NAME                 HIRE_DATE       SALARY
----------- ------------------------- ----------- ----------
        105 Austin                    25/06/1997     4800.00
        104 Ernst                     21/05/1991     6000.00
        103 Hunold                    03/01/1990     9000.00
        107 Lorentz                   07/02/1999     4200.00
        106 Pataballa                 05/02/1998     4800.00
        130 Atkinson                  30/10/1997     2800.00
        192 Bell                      04/02/1996     4000.00
...
        159 Smith                     10/03/1997     8000.00
        171 Smith                     23/02/1999     7400.00
        157 Sully                     04/03/1996     9500.00
        176 Taylor                    24/03/1998     8600.00
        150 Tucker                    30/01/1997    10000.00
        155 Tuvault                   23/11/1999     7000.00
        162 Vishney                   11/11/1997    10500.00
        149 Zlotkey                   29/01/2000    10500.00
        204 Baer                      07/06/1994    10000.00
 
106 rows selected.
 

23 + 27 + 106 = 156
All rows accounted for :)

 

Hierarchical JSON Aggregation : the JSONNest function

June 17, 2018

This post echoes some of my previous work on the topic, this time applied to JSON data.

The idea is to build a user-defined aggregate function to automatically nest a hierarchically-ordered set of JSON objects, and return a single JSON document containing the whole hierarchy.

For example, from this adjacency list :

        ID  PARENT_ID NAME
---------- ---------- ------
         1            Item_1
         2          1 Item_2
         3          1 Item_3
         4          3 Item_4
         5          4 Item_5

to :

{
  "id": 1,
  "name": "Item_1",
  "children": [
    {
      "id": 2,
      "name": "Item_2"
    },
    {
      "id": 3,
      "name": "Item_3",
      "children": [
        {
          "id": 4,
          "name": "Item_4",
          "children": [
            {
              "id": 5,
              "name": "Item_5"
            }
          ]
        }
      ]
    }
  ]
}

 

The implementation I’ll present here requires the following :

  • An object type to hold the definition of each node (JSONNestItem)
  • A PL/SQL package for the core routines and context maintenance (JSONNestUtil)
  • An object type for ODCI routines (JSONNestImpl)
  • The aggregate function (JSONNest)

Source code available on GitHub :

/mbleron/oracle/JSONNest

Installation :

@JSONNestItem.tps
@JSONNestUtil.pks
@JSONNestUtil.pkb
@JSONNestImpl.tps
@JSONNestImpl.tpb
@JSONNest.fnc

 

Implementation

JSONNest requires at least Oracle 12.2.
It uses PL/SQL JSON object types introduced in that release, allowing fine-grained manipulation of JSON data.
The implementation is a little more complicated than expected because of a bug (which I reported) in 12.2, related to accessing JSON objects by reference :
ORA-7445 Or Corrupted Result When Nesting PL/SQL JSON Objects (Doc ID 2361514.1)

The bug is said to be fixed in 18c (18.1), but I haven’t had the opportunity to check yet.
The workaround requires maintaining a stack of JSON objects. When I get to work on an 18c instance, I’ll rewrite the code to use object references instead.

The function :

function JSONNest (input in JSONNestItem) 
return clob
aggregate using JSONNestImpl;

where JSONNestItem is an object type describing a node of the hierarchy :

create or replace type JSONNestItem as object (
  item_level    integer
, json_content  varchar2(32767)
, wrapper       varchar2(4000)
)

  • item_level : node level in the hierarchy, typically derived from the LEVEL pseudocolumn in a CONNECT BY query, or computed in a recursive CTE.
  • json_content : actual JSON content of this node. Must be a JSON object.
  • wrapper : name of the field containing the array of child nodes.

 

Example

I’ll use the usual HR.EMPLOYEES table to generate the hierarchy of employees :

select JSONNest( 
         JSONNestItem(
           level
         , json_object(
             'id' value employee_id
           , 'name' value first_name || ' ' ||last_name
           )
        , 'employees'
        )
      ) as result
from hr.employees
connect by prior employee_id = manager_id
start with manager_id is null
order siblings by last_name
;

Output (formatted) :

{
  "id": 100,
  "name": "Steven King",
  "employees": [
    {
      "id": 101,
      "name": "Neena Kochhar",
      "employees": [
        {
          "id": 108,
          "name": "Nancy Greenberg",
          "employees": [
            {
              "id": 109,
              "name": "Daniel Faviet"
            },
            {
              "id": 110,
              "name": "John Chen"
            },
            {
              "id": 111,
              "name": "Ismael Sciarra"
            },
            {
              "id": 112,
              "name": "Jose Manuel Urman"
            },
            {
              "id": 113,
              "name": "Luis Popp"
            }
          ]
        },
        {
          "id": 200,
          "name": "Jennifer Whalen"
        },
        {
          "id": 203,
          "name": "Susan Mavris"
        },
        {
          "id": 204,
          "name": "Hermann Baer"
        },
        {
          "id": 205,
          "name": "Shelley Higgins",
          "employees": [
            {
              "id": 206,
              "name": "William Gietz"
            }
          ]
        }
      ]
    },
    {
      "id": 102,
      "name": "Lex De Haan",
      "employees": [
        {
          "id": 103,
          "name": "Alexander Hunold",
          "employees": [
            {
              "id": 104,
              "name": "Bruce Ernst"
            },
            {
              "id": 105,
              "name": "David Austin"
            },
            {
              "id": 106,
              "name": "Valli Pataballa"
            },
            {
              "id": 107,
              "name": "Diana Lorentz"
            }
          ]
        }
      ]
    },
    {
      "id": 114,
      "name": "Den Raphaely",
      "employees": [
        {
          "id": 115,
          "name": "Alexander Khoo"
        },
        {
          "id": 116,
          "name": "Shelli Baida"
        },
        {
          "id": 117,
          "name": "Sigal Tobias"
        },
        {
          "id": 118,
          "name": "Guy Himuro"
        },
        {
          "id": 119,
          "name": "Karen Colmenares"
        }
      ]
    },
    {
      "id": 120,
      "name": "Matthew Weiss",
      "employees": [
        {
          "id": 125,
          "name": "Julia Nayer"
        },
        {
          "id": 126,
          "name": "Irene Mikkilineni"
        },
        {
          "id": 127,
          "name": "James Landry"
        },
        {
          "id": 128,
          "name": "Steven Markle"
        },
        {
          "id": 180,
          "name": "Winston Taylor"
        },
        {
          "id": 181,
          "name": "Jean Fleaur"
        },
        {
          "id": 182,
          "name": "Martha Sullivan"
        },
        {
          "id": 183,
          "name": "Girard Geoni"
        }
      ]
    },
    {
      "id": 121,
      "name": "Adam Fripp",
      "employees": [
        {
          "id": 129,
          "name": "Laura Bissot"
        },
        {
          "id": 130,
          "name": "Mozhe Atkinson"
        },
        {
          "id": 131,
          "name": "James Marlow"
        },
        {
          "id": 132,
          "name": "TJ Olson"
        },
        {
          "id": 184,
          "name": "Nandita Sarchand"
        },
        {
          "id": 185,
          "name": "Alexis Bull"
        },
        {
          "id": 186,
          "name": "Julia Dellinger"
        },
        {
          "id": 187,
          "name": "Anthony Cabrio"
        }
      ]
    },
    {
      "id": 122,
      "name": "Payam Kaufling",
      "employees": [
        {
          "id": 133,
          "name": "Jason Mallin"
        },
        {
          "id": 134,
          "name": "Michael Rogers"
        },
        {
          "id": 135,
          "name": "Ki Gee"
        },
        {
          "id": 136,
          "name": "Hazel Philtanker"
        },
        {
          "id": 188,
          "name": "Kelly Chung"
        },
        {
          "id": 189,
          "name": "Jennifer Dilly"
        },
        {
          "id": 190,
          "name": "Timothy Gates"
        },
        {
          "id": 191,
          "name": "Randall Perkins"
        }
      ]
    },
    {
      "id": 123,
      "name": "Shanta Vollman",
      "employees": [
        {
          "id": 137,
          "name": "Renske Ladwig"
        },
        {
          "id": 138,
          "name": "Stephen Stiles"
        },
        {
          "id": 139,
          "name": "John Seo"
        },
        {
          "id": 140,
          "name": "Joshua Patel"
        },
        {
          "id": 192,
          "name": "Sarah Bell"
        },
        {
          "id": 193,
          "name": "Britney Everett"
        },
        {
          "id": 194,
          "name": "Samuel McCain"
        },
        {
          "id": 195,
          "name": "Vance Jones"
        }
      ]
    },
    {
      "id": 124,
      "name": "Kevin Mourgos",
      "employees": [
        {
          "id": 141,
          "name": "Trenna Rajs"
        },
        {
          "id": 142,
          "name": "Curtis Davies"
        },
        {
          "id": 143,
          "name": "Randall Matos"
        },
        {
          "id": 144,
          "name": "Peter Vargas"
        },
        {
          "id": 196,
          "name": "Alana Walsh"
        },
        {
          "id": 197,
          "name": "Kevin Feeney"
        },
        {
          "id": 198,
          "name": "Donald OConnell"
        },
        {
          "id": 199,
          "name": "Douglas Grant"
        }
      ]
    },
    {
      "id": 145,
      "name": "John Russell",
      "employees": [
        {
          "id": 150,
          "name": "Peter Tucker"
        },
        {
          "id": 151,
          "name": "David Bernstein"
        },
        {
          "id": 152,
          "name": "Peter Hall"
        },
        {
          "id": 153,
          "name": "Christopher Olsen"
        },
        {
          "id": 154,
          "name": "Nanette Cambrault"
        },
        {
          "id": 155,
          "name": "Oliver Tuvault"
        }
      ]
    },
    {
      "id": 146,
      "name": "Karen Partners",
      "employees": [
        {
          "id": 156,
          "name": "Janette King"
        },
        {
          "id": 157,
          "name": "Patrick Sully"
        },
        {
          "id": 158,
          "name": "Allan McEwen"
        },
        {
          "id": 159,
          "name": "Lindsey Smith"
        },
        {
          "id": 160,
          "name": "Louise Doran"
        },
        {
          "id": 161,
          "name": "Sarath Sewall"
        }
      ]
    },
    {
      "id": 147,
      "name": "Alberto Errazuriz",
      "employees": [
        {
          "id": 162,
          "name": "Clara Vishney"
        },
        {
          "id": 163,
          "name": "Danielle Greene"
        },
        {
          "id": 164,
          "name": "Mattea Marvins"
        },
        {
          "id": 165,
          "name": "David Lee"
        },
        {
          "id": 166,
          "name": "Sundar Ande"
        },
        {
          "id": 167,
          "name": "Amit Banda"
        }
      ]
    },
    {
      "id": 148,
      "name": "Gerald Cambrault",
      "employees": [
        {
          "id": 168,
          "name": "Lisa Ozer"
        },
        {
          "id": 169,
          "name": "Harrison Bloom"
        },
        {
          "id": 170,
          "name": "Tayler Fox"
        },
        {
          "id": 171,
          "name": "William Smith"
        },
        {
          "id": 172,
          "name": "Elizabeth Bates"
        },
        {
          "id": 173,
          "name": "Sundita Kumar"
        }
      ]
    },
    {
      "id": 149,
      "name": "Eleni Zlotkey",
      "employees": [
        {
          "id": 174,
          "name": "Ellen Abel"
        },
        {
          "id": 175,
          "name": "Alyssa Hutton"
        },
        {
          "id": 176,
          "name": "Jonathon Taylor"
        },
        {
          "id": 177,
          "name": "Jack Livingston"
        },
        {
          "id": 178,
          "name": "Kimberely Grant"
        },
        {
          "id": 179,
          "name": "Charles Johnson"
        }
      ]
    },
    {
      "id": 201,
      "name": "Michael Hartstein",
      "employees": [
        {
          "id": 202,
          "name": "Pat Fay"
        }
      ]
    }
  ]
}

In addition, I’ve published a function taking a weakly-typed ref cursor and returning the same JSON output.
The cursor must be based on a query returning exactly three columns matching the attributes of the JSONNestItem object type :

DECLARE

  rc      sys_refcursor;
  output  clob;
  
BEGIN
 
  open rc for
  select level
       , json_object(
           'id' value employee_id
         , 'name' value first_name || ' ' ||last_name
         )
       , 'employees'
  from hr.employees
  connect by prior employee_id = manager_id
  start with manager_id is null
  order siblings by last_name;
  
  output := JSONNestUtil.getDocument(rc);
   
END;
/

 

Parsing sibling repeating XML groups : Part 2

June 3, 2018

This post is an addendum to a previously published How-To presenting various techniques to break sibling repeating groups into relational data :

<root>
  <id>1</id>
  <name>SMITH</name>
  <id>2</id>
  <name>WILSON</name>
  <id>3</id>
  <name>GRANT</name>
</root>

to :

ID    NAME
----- ---------
    1 SMITH
    2 WILSON
    3 GRANT

The topic was raised again recently on the Oracle Community forum, but with a particularity I didn’t cover originally : what about optional nodes?
For instance, if id 2 has no name :

<root>
  <id>1</id>
  <name>SMITH</name>
  <id>2</id>
  <id>3</id>
  <name>GRANT</name>
</root>

The “following-sibling” trick will not return the required output :

SQL> select x.*
  2  from xmltable(
  3         'for $i in /root/id
  4          return element r {
  5            $i/following-sibling::name[1]
  6          , $i
  7          }'
  8         passing xmlparse(document
  9         '<root>
 10            <id>1</id><name>SMITH</name>
 11            <id>2</id>
 12            <id>3</id><name>GRANT</name>
 13          </root>')
 14         columns id    number       path 'id'
 15               , name  varchar2(15) path 'name'
 16       ) x
 17  ;
 
        ID NAME
---------- ---------------
         1 SMITH
         2 GRANT          
         3 GRANT

One way to work around this is to add another predicate to ensure that the node is actually the one coming directly after the current key element, and not one belonging to another logical sibling group further down in the document.

In the following example, I’ll use the node identity comparison operator ‘is’ :

SQL> select x.*
  2  from xmltable(
  3         'for $i in /root/id
  4          return element r {
  5            $i/following-sibling::name[1][preceding-sibling::id[1] is $i]
  6          , $i
  7          }'
  8         passing xmlparse(document
  9         '<root>
 10            <id>1</id><name>SMITH</name>
 11            <id>2</id>
 12            <id>3</id><name>GRANT</name>
 13          </root>')
 14         columns id    number       path 'id'
 15               , name  varchar2(15) path 'name'
 16       ) x
 17  ;
 
        ID NAME
---------- ---------------
         1 SMITH
         2                
         3 GRANT
 

 

Additional Approaches and Performance Considerations

The full XQuery approach (with or without the RETURNING SEQUENCE BY REF clause) is rather compact and maintainable, but it is also very slow when applied to large XML documents.
Mixed approaches using XQuery and SQL post-processing provide far better performance.
I’ll present below three other ways using the PIVOT, MODEL and MATCH_RECOGNIZE SQL features; a quick sketch of the MATCH_RECOGNIZE variant follows.
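
To give a rough idea before the jump, here’s a possible shape of the MATCH_RECOGNIZE-based solution (my own shorthand version, not necessarily the exact query presented in the full post) : nodes are extracted in document order, then each ‘id’ row is matched together with its optional following ‘name’ row :

select m.id, m.name
from (
       select x.*
       from xmltable(
              '/root/*'
              passing xmlparse(document
              '<root>
                 <id>1</id><name>SMITH</name>
                 <id>2</id>
                 <id>3</id><name>GRANT</name>
               </root>')
              columns seq       for ordinality
                    , nodename  varchar2(30) path 'local-name(.)'
                    , nodevalue varchar2(15) path '.'
            ) x
     )
match_recognize (
  order by seq
  measures to_number(i.nodevalue) as id
         , n.nodevalue            as name
  pattern ( i n? )
  define i as nodename = 'id'
       , n as nodename = 'name'
) m;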

Read more…

Categories: SQL, XQuery