Home > HowTo, JSON, PL/SQL, SQL > JSON Flattening Part 1 – The JSONFlatten function

JSON Flattening Part 1 – The JSONFlatten function

This article presents an approach to flatten out a JSON structure into an “Edge” table.
That’s something I’ve been exploring multiple times in the past for XML data, see for example : Yet Another XML Flattening Technique.

I’ll apply the same technique used for XMLFlattenDOM function to JSON data, leveraging the new JSON PL/SQL types introduced in the 12.2 release : JSON_ELEMENT_T, JSON_OBJECT_T, JSON_ARRAY_T.
As those object types implement a DOM-like API (based on the internal JDOM_T object), it is quite easy to adapt my previous approach to JSON.

Given the following (very) simple JSON object :

{
  "key":"ABC",
  "arr":[1,2,3,null]
}

This is the output we’re looking for :

 ID TYPE    PARENT_ID NAME     POS      VALUE
--- ------ ---------- ------- ---- -----------
  1 object          0
  2 string          1 key               ABC
  3 array           1 arr
  4 number          3            1      1
  5 number          3            2      2
  6 number          3            3      3
  7 null            3            4

 

1- Edge object and collection types

The output of the table function will be based on the following object and collection types :

create or replace type JSONEdge as object (
  target_id   integer
, flag        integer
, source_id   integer
, label       varchar2(4000)
, ordinal     integer
, value       varchar2(4000)
, path        varchar2(4000)
);
/

create or replace type JSONEdgeTable as table of JSONEdge;

  • target_id is the unique identifier of the edge record.
  • source_id is the identifier of the parent edge : properties (keys) are considered children of the enclosing JSON object, and likewise for items of a JSON array.
  • flag is the type of element this edge represents. Values are taken from the DBMS_JSON package :
    TYPE_NULL    CONSTANT NUMBER(2)  := 1;
    TYPE_BOOLEAN CONSTANT NUMBER(2)  := 2;
    TYPE_NUMBER  CONSTANT NUMBER(2)  := 3;
    TYPE_STRING  CONSTANT NUMBER(2)  := 4;
    TYPE_OBJECT  CONSTANT NUMBER(2)  := 5;
    TYPE_ARRAY   CONSTANT NUMBER(2)  := 6;
    
  • label is the name of the object property (key), when applicable.
  • ordinal is the position (1-based) of the item in its parent JSON array, when applicable.
  • value is the scalar value (as a string) of the JSON element this edge represents, when its type is one of : STRING, NUMBER or BOOLEAN.
  • path is the SQL/JSON Path expression addressing this edge.

 

2- The JSONFlatten function

JSONFlatten is a pipelined function returning a JSONEdgeTable instance. Therefore it must be used in conjunction with the TABLE operator.

create or replace function JSONFlatten(
  p_json in clob
) 
return JSONEdgeTable pipelined 
is

  type stack_item is record (id integer, elt json_element_t, len integer, idx integer, path varchar2(4000));
  type stack      is table of stack_item;

  st          stack := stack();
  i           pls_integer := 0;
  id          integer := 0;
  idx         integer := 0;
  len         integer := 0;
  elt         json_element_t;
  edge        JSONEdge := JSONEdge(0,null,null,null,null,null,'$');
  key         varchar2(4000);
  
  source_elt  json_element_t;
  source_obj  json_object_t;
  source_arr  json_array_t;

  function escape_json (p_str in varchar2) return varchar2 is
    str  varchar2(4000) := p_str;
  begin
    str := replace(str, '\', '\\');  
    str := replace(str, '"', '\"');  
    str := replace(str, '/', '\/');  
    str := replace(str, chr(8), '\b');
    str := replace(str, chr(9), '\t');
    str := replace(str, chr(10), '\n');  
    str := replace(str, chr(12), '\f');  
    str := replace(str, chr(13), '\r');  
    if instr(str, '\') != 0 then
      str := '"' || str || '"';
    end if;
    return str; 
  end;
  
  procedure push_elt (p_id in integer, p_elt in json_element_t, p_offset in pls_integer, p_path in varchar2) is
  begin
    st.extend;
    i := i + 1;
    st(i).id := p_id;
    st(i).elt := p_elt;
    st(i).idx := 1 - p_offset;
    st(i).len := elt.get_size() - p_offset;
    st(i).path := p_path;
  end; 

begin
  
  elt := json_element_t.parse(p_json);
  
  loop
    
    if i > 0 then
      edge.source_id := st(i).id;
      idx := st(i).idx;
      len := st(i).len;
      edge.path := st(i).path;
    end if;
    
    if idx <= len then
    
      if i > 0 then
        
        source_elt := st(i).elt;
        
        if source_elt.is_object() then
          source_obj := treat(source_elt as json_object_t);
          key := source_obj.get_keys()(idx);
          elt := source_obj.get(key);
          edge.ordinal := null;
          edge.path := edge.path || '.' || escape_json(key);
        else
          source_arr := treat(source_elt as json_array_t);
          elt := source_arr.get(idx);
          key := null;
          edge.ordinal := idx + 1;
          edge.path := edge.path || '[' || to_char(idx) || ']';
        end if;
        
        st(i).idx := idx + 1;
      
      end if;
      
      id := id + 1;
      edge.label := key;
      edge.target_id := id;
      edge.value := null;
          
      case 
      when elt.is_object then       
        edge.flag := dbms_json.TYPE_OBJECT;
        push_elt(id, elt, 0, edge.path);

      when elt.is_array then     
        edge.flag := dbms_json.TYPE_ARRAY;
        push_elt(id, elt, 1, edge.path);
        
      when elt.is_string then 
        edge.flag := dbms_json.TYPE_STRING;
        edge.value := case when source_elt.is_object then source_obj.get_string(key) else source_arr.get_string(idx) end;
      
      when elt.is_number then
        edge.flag := dbms_json.TYPE_NUMBER;
        edge.value := case when source_elt.is_object then source_obj.get_number(key) else source_arr.get_number(idx) end;
      
      when elt.is_boolean then
        edge.flag := dbms_json.TYPE_BOOLEAN;
        edge.value := case when elt.is_true then 'true' else 'false' end;
      
      else
        edge.flag := dbms_json.TYPE_NULL;    
        
      end case;
      
      pipe row (edge);
    
    else
      
      st.trim;
      i := i - 1;
    
    end if;
    
    exit when i = 0;
  
  end loop;

  return;
  
end JSONFlatten;

The function walks down the JSON structure starting from the top-level element (object or array).
A pseudo-recursive approach is implemented by using a local stack to store non-scalar JSON elements being processed, as well as the current position of their respective children : object properties or array items.

 

3- Examples

create table json_documents (
  id  integer
, doc clob check (doc is json)
);

insert into json_documents values (1,
'{
 "str":"ABC",
 "arr":[{"flag":true},{"flag":false}],
 "obj":{
   "item1":{"s1":1,"s2":2},
   "item2":{"s1":1,"s2":2},
   "item3":[0,1,2,3,4,null,true,"false",{"x\ny":"ABC"}]
 }
}');

Query #1 :

SQL> select jf.*
  2  from json_documents t
  3     , table(jsonflatten(t.doc)) jf
  4  where t.id = 1;

 TARGET_ID  FLAG  SOURCE_ID LABEL    ORDINAL VALUE     PATH
---------- ----- ---------- ------- -------- --------- ------------------------
         1     5                                       $
         2     4          1 str              ABC       $.str
         3     6          1 arr                        $.arr
         4     5          3                1           $.arr[0]
         5     2          4 flag             true      $.arr[0].flag
         6     5          3                2           $.arr[1]
         7     2          6 flag             false     $.arr[1].flag
         8     5          1 obj                        $.obj
         9     5          8 item1                      $.obj.item1
        10     3          9 s1               1         $.obj.item1.s1
        11     3          9 s2               2         $.obj.item1.s2
        12     5          8 item2                      $.obj.item2
        13     3         12 s1               1         $.obj.item2.s1
        14     3         12 s2               2         $.obj.item2.s2
        15     6          8 item3                      $.obj.item3
        16     3         15                1 0         $.obj.item3[0]
        17     3         15                2 1         $.obj.item3[1]
        18     3         15                3 2         $.obj.item3[2]
        19     3         15                4 3         $.obj.item3[3]
        20     3         15                5 4         $.obj.item3[4]
        21     1         15                6           $.obj.item3[5]
        22     2         15                7 true      $.obj.item3[6]
        23     4         15                8 false     $.obj.item3[7]
        24     5         15                9           $.obj.item3[8]
        25     4         24 x                ABC       $.obj.item3[8]."x\ny"
                            y

Query #2 :

insert into json_documents values (2,
  httpuritype('https://api.github.com/users/oracle/repos').getclob()
);
SQL> set pages 3000
SQL> select jf.*
  2  from json_documents t
  3     , table(jsonflatten(t.doc)) jf
  4  where t.id = 2;

 TARGET_ID  FLAG  SOURCE_ID LABEL                  ORDINAL VALUE                                                                            PATH
---------- ----- ---------- --------------------- -------- -------------------------------------------------------------------------------- --------------------------------
         1     6                                                                                                                            $
         2     5          1                              1                                                                                  $[0]
         3     3          2 id                             33195015                                                                         $[0].id
         4     4          2 name                           accelerators                                                                     $[0].name
         5     4          2 full_name                      oracle/accelerators                                                              $[0].full_name
         6     5          2 owner                                                                                                           $[0].owner
         7     4          6 login                          oracle                                                                           $[0].owner.login
         8     3          6 id                             4430336                                                                          $[0].owner.id
         9     4          6 avatar_url                     https://avatars2.githubusercontent.com/u/4430336?v=3                             $[0].owner.avatar_url
        10     4          6 gravatar_id                                                                                                     $[0].owner.gravatar_id
        11     4          6 url                            https://api.github.com/users/oracle                                              $[0].owner.url
        12     4          6 html_url                       https://github.com/oracle                                                        $[0].owner.html_url
        13     4          6 followers_url                  https://api.github.com/users/oracle/followers                                    $[0].owner.followers_url
        14     4          6 following_url                  https://api.github.com/users/oracle/following{/other_user}                       $[0].owner.following_url
        15     4          6 gists_url                      https://api.github.com/users/oracle/gists{/gist_id}                              $[0].owner.gists_url
        16     4          6 starred_url                    https://api.github.com/users/oracle/starred{/owner}{/repo}                       $[0].owner.starred_url
        17     4          6 subscriptions_url              https://api.github.com/users/oracle/subscriptions                                $[0].owner.subscriptions_url
        18     4          6 organizations_url              https://api.github.com/users/oracle/orgs                                         $[0].owner.organizations_url
        19     4          6 repos_url                      https://api.github.com/users/oracle/repos                                        $[0].owner.repos_url
        20     4          6 events_url                     https://api.github.com/users/oracle/events{/privacy}                             $[0].owner.events_url
        21     4          6 received_events_url            https://api.github.com/users/oracle/received_events                              $[0].owner.received_events_url
        22     4          6 type                           Organization                                                                     $[0].owner.type
        23     2          6 site_admin                     false                                                                            $[0].owner.site_admin
        24     2          2 private                        false                                                                            $[0].private
        25     4          2 html_url                       https://github.com/oracle/accelerators                                           $[0].html_url
        26     4          2 description                    Open Source Accelerator samples and tools for the Oracle Service Cloud (OSvC) de $[0].description
        27     2          2 fork                           false                                                                            $[0].fork
        28     4          2 url                            https://api.github.com/repos/oracle/accelerators                                 $[0].url

...

      2597     3       2525 stargazers_count               0                                                                                $[29].stargazers_count
      2598     3       2525 watchers_count                 0                                                                                $[29].watchers_count
      2599     4       2525 language                       HTML                                                                             $[29].language
      2600     2       2525 has_issues                     true                                                                             $[29].has_issues
      2601     2       2525 has_projects                   true                                                                             $[29].has_projects
      2602     2       2525 has_downloads                  true                                                                             $[29].has_downloads
      2603     2       2525 has_wiki                       false                                                                            $[29].has_wiki
      2604     2       2525 has_pages                      true                                                                             $[29].has_pages
      2605     3       2525 forks_count                    0                                                                                $[29].forks_count
      2606     1       2525 mirror_url                                                                                                      $[29].mirror_url
      2607     3       2525 open_issues_count              0                                                                                $[29].open_issues_count
      2608     3       2525 forks                          0                                                                                $[29].forks
      2609     3       2525 open_issues                    0                                                                                $[29].open_issues
      2610     3       2525 watchers                       0                                                                                $[29].watchers
      2611     4       2525 default_branch                 master                                                                           $[29].default_branch

2611 rows selected.

 

Advertisements
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: