Home > HowTo, PL/SQL, SQL > How To : Convert Path Enumeration to Adjency List

How To : Convert Path Enumeration to Adjency List

An Adjency List is a way to describe a finite graph structure by associating each vertex of the graph to each of its neighbours.
For a simple oriented tree, a possible implemention of the Adjency List model is to store a node along with its parent node reference.
In a RDBMS, this is indeed a very common way of storing hierarchical data, such as the Employee-Manager relationship :

SQL> select empno, ename, mgr
  2  from scott.emp;
 
EMPNO ENAME        MGR
----- ---------- -----
 7369 SMITH       7902
 7499 ALLEN       7698
 7521 WARD        7698
 7566 JONES       7839
 7654 MARTIN      7698
 7698 BLAKE       7839
 7782 CLARK       7839
 7788 SCOTT       7566
 7839 KING       
 7844 TURNER      7698
 7876 ADAMS       7788
 7900 JAMES       7698
 7902 FORD        7566
 7934 MILLER      7782

Such a structure can be easily traversed using a CONNECT-BY query, or Recursive Subquery Factoring :

SQL> select ename, ltrim(sys_connect_by_path(ename, '/'), '/') as path
  2  from scott.emp
  3  connect by prior empno = mgr
  4  start with mgr is null
  5  order siblings by empno;
 
ENAME      PATH
---------- -----------------------------------------------------------
KING       KING
JONES      KING/JONES
SCOTT      KING/JONES/SCOTT
ADAMS      KING/JONES/SCOTT/ADAMS
FORD       KING/JONES/FORD
SMITH      KING/JONES/FORD/SMITH
BLAKE      KING/BLAKE
ALLEN      KING/BLAKE/ALLEN
WARD       KING/BLAKE/WARD
MARTIN     KING/BLAKE/MARTIN
TURNER     KING/BLAKE/TURNER
JAMES      KING/BLAKE/JAMES
CLARK      KING/CLARK
MILLER     KING/CLARK/MILLER
 
SQL> with tree_walker (empno, ename, path) as (
  2    select empno, ename, ename
  3    from scott.emp
  4    where mgr is null
  5    union all
  6    select e.empno, e.ename, w.path || '/' || e.ename
  7    from tree_walker w
  8         join scott.emp e on e.mgr = w.empno
  9  )
 10  search depth first by empno set emp_order
 11  select ename, path
 12  from tree_walker;
 
ENAME      PATH
---------- ---------------------------------------------
KING       KING
JONES      KING/JONES
SCOTT      KING/JONES/SCOTT
ADAMS      KING/JONES/SCOTT/ADAMS
FORD       KING/JONES/FORD
SMITH      KING/JONES/FORD/SMITH
BLAKE      KING/BLAKE
ALLEN      KING/BLAKE/ALLEN
WARD       KING/BLAKE/WARD
MARTIN     KING/BLAKE/MARTIN
TURNER     KING/BLAKE/TURNER
JAMES      KING/BLAKE/JAMES
CLARK      KING/CLARK
MILLER     KING/CLARK/MILLER
 

Now, what if we want to – given only a list of paths – generate back an adjency data set?

For example, from this list of leaf paths :

KING/BLAKE/ALLEN
KING/BLAKE/JAMES
KING/BLAKE/MARTIN
KING/BLAKE/TURNER
KING/BLAKE/WARD
KING/CLARK/MILLER
KING/JONES/FORD/SMITH
KING/JONES/SCOTT/ADAMS

to :

  ID NAME     PARENT_ID
---- -------- ---------
   1 KING     
   2 JONES            1
   3 SCOTT            2
   4 ADAMS            3
   5 FORD             2
   6 SMITH            5
   7 BLAKE            1
   8 ALLEN            7
   9 WARD             7
  10 MARTIN           7
  11 TURNER           7
  12 JAMES            7
  13 CLARK            1
  14 MILLER          13

This post presents a way to achieve that result, using a pipelined table function.

 

The HModel API

/mbleron/oracle/HModel

create or replace package HModel is

  type Node is record (id integer, pid integer, name varchar2(4000));
  type Tree is table of Node;

  -- Convert Path Enumeration to Adjency List
  function getAdjencyList (
    p_rc  in sys_refcursor
  , p_sep in varchar2
  ) 
  return Tree pipelined;

end HModel;

Example :

SQL> select *
  2  from table(
  3         HModel.getAdjencyList(
  4           cursor(
  5             select 'KING/JONES/SCOTT/ADAMS' from dual union all
  6             select 'KING/JONES/FORD/SMITH'  from dual union all
  7             select 'KING/BLAKE/ALLEN'       from dual union all
  8             select 'KING/BLAKE/WARD'        from dual
  9           )
 10         , '/'
 11         )
 12       )
 13  ;
 
  ID  PID NAME
---- ---- -------
   1      KING
   2    1 JONES
   3    2 SCOTT
   4    3 ADAMS
   5    2 FORD
   6    5 SMITH
   7    1 BLAKE
   8    7 ALLEN
   9    7 WARD

 

Real-life example

Converting all file paths from my C: drive to a hierarchical structure :

1- Target table

create table all_paths ( path varchar2(256) );

2- Data generation and load (using SQL*Loader)

I’m testing this on my Win7 laptop, so using dir command to list the content of my C: drive recursively :

c:\dev\oracle\tmp>dir c:\ /b /s > dir_tree.txt

c:\dev\oracle\tmp>sqlldr userid=dev@pdb1 control=dirload.ctl
Password:

SQL*Loader: Release 12.1.0.2.0 - Production on Mon Oct 3 21:57:16 2016

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Path used:      Direct

Load completed - logical record count 286432.

Table "ALL_PATHS":
  286432 Rows successfully loaded.

Check the log file:
  test.log
for more information about the load.

and this is the control file used (dirload.ctl) :

options (direct=true)
load data 
characterset WE8PC850 
infile 'dir_tree.txt'   
into table "ALL_PATHS"  
truncate  
(  
    path char(256)  
)

Now let’s check the data with a few queries :

SQL> select path
  2  from all_paths
  3  where rownum < 10;
 
PATH
-----------------------------------
c:\.rnd
c:\0edb31f63f0be03c5ebc02a918933d
c:\2aa3728daf04642614b1ca
c:\DAAA
c:\DELL
c:\dev
c:\eula.1028.txt
c:\eula.1031.txt
c:\eula.1033.txt
 
SQL> select *
  2  from table(
  3         HModel.getAdjencyList(
  4           cursor(
  5             select path
  6             from all_paths
  7           )
  8         , '\'
  9         )
 10       )
 11  where rownum < 10;
 
  ID  PID NAME
---- ---- --------------------------------
   1      c:
   2    1 .rnd
   3    1 0edb31f63f0be03c5ebc02a918933d
   4    1 2aa3728daf04642614b1ca
   5    1 DAAA
   6    1 DELL
   7    1 dev
   8    1 eula.1028.txt
   9    1 eula.1031.txt
 

Counting all rows :

SQL> set timing on
SQL>
SQL> select count(*)
  2  from table(
  3         HModel.getAdjencyList(
  4           cursor(
  5             select path
  6             from all_paths
  7           )
  8         , '\'
  9         )
 10       )
 11  ;

  COUNT(*)
----------
    284895

Elapsed: 00:00:05.62

 

Advertisements
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: