PL/SQL RPN calculator

Some non-XML stuff for a change :)
This is something I’ve developed recently to evaluate stored expressions (formulas) using variables.
The “calculator” is written in PL/SQL and implements an RPN evaluation technique, as well as a method to convert infix expressions to RPN using the shunting-yard algorithm.
The evaluation function does not involve any dynamic SQL.

Program and additional objects available here : rpn_util.zip

Update 2014-11-24 – upgraded version available here :
PL/SQL RPN calculator ā€“ version 2

 

1. Package specification

Here’s the package spec exposing both parse and eval methods.

create or replace package rpn_util is

  function parse (p_expr in varchar2)
  return token_list deterministic ;

  function eval (tlist in token_list, vars in kv_table default kv_table()) 
  return binary_double deterministic ;

end rpn_util;

and the supporting object types :

create or replace type token_list is table of varchar2(30) ;

create or replace type kv_obj is object (
  token  varchar2(30)
, val    binary_double
) ;

create or replace type kv_table is table of kv_obj ;

 

2. Examples

  • The parse function
  •  
    The function takes a VARCHAR2 string representing an infix-based expression and returns a collection of RPN “tokens” in a TOKEN_LIST instance :

    SQL> select rpn_util.parse('(1+2)*3-5') as tokens
      2  from dual;
    
    TOKENS
    ----------------------------------------------------------------
    TOKEN_LIST('1', '2', '+', '3', '*', '5', '-')
    
    
    SQL> select rpn_util.parse('(1+2)*3-)5') as tokens
      2  from dual;
    select rpn_util.parse('(1+2)*3-)5') as tokens
           *
    ERROR at line 1:
    ORA-20002: Parentheses mismatched: at position 9
    ORA-06512: at "DEV.RPN_UTIL", line 134
    
    

     

  • The eval function
  •  
    The function takes a TOKEN_LIST collection of RPN tokens and returns the result of the expression :

    SQL> select rpn_util.eval(
      2           rpn_util.parse('(1+2)*3-5')
      3         ) as result
      4  from dual;
    
    RESULT
    ------
         4
    
    

    Besides standard arithmetic operators (+, -, *, /, % and ^), it is also possible to parse and evaluate functions.
    Functions we want to use this way have to be declared in the package : identifier, argument count and actual PL/SQL implementation.
    For example, MIN is internally mapped to SQL function LEAST and ABS (absolute value) to… ABS :

    SQL> select rpn_util.eval(
      2           rpn_util.parse('MIN(3,2) * ABS(4-5)')
      3         ) as result
      4  from dual;
    
    RESULT
    ------
         2
    
    

     
    Comparison (=, <, <=, > and >=) and boolean (AND, OR, NOT) operators are also supported via the IF function :

    SQL> select rpn_util.eval(
      2           rpn_util.parse('(MIN(3,2) * ABS(4-5))^IF(V1>2 OR NOT(V2=1 AND V3>=2),3,2)')
      3         , kv_table(
      4             kv_obj('V1',1)
      5           , kv_obj('V2',1)
      6           , kv_obj('V3',1)
      7           )
      8         ) as result
      9  from dual ;
     
        RESULT
    ----------
             8
     
    

     

  • Using parse and eval with variable values
  •  
    The function takes a second optional parameter of KV_TABLE datatype which is a collection of key-value pairs. Each item of the collection is a KV_OBJ object instance holding the variable name (token) referenced by the formula and its actual value.
    The values are then “bound” to the formula during evaluation.

    SQL> select rpn_util.eval(
      2           rpn_util.parse('(A + B) * C - D')
      3         , kv_table(
      4             kv_obj('A', 1)
      5           , kv_obj('B', 2)
      6           , kv_obj('C', 3)
      7           , kv_obj('D', 5)
      8           )
      9         ) as result
     10  from dual;
    
    RESULT
    ------
         4
    
    

My initial requirement was to evaluate a lot of formulas in one go, with multiple and different input values. Given this specific scenario, I was very reluctent to use dynamic (PL/)SQL – a classic method for the job nonetheless.

A typical (yet simple) use case is the calculation of a thermal efficiency η = output/input, where input and output would come from metering devices and would be stored in a dedicated table along with the timestamp of the measures.
The next section shows an example of this situation, with a more complex formula.

 

3. Benchmarking

To run some comparison tests, here’s a simplified version of my measure table. It stores hourly data of seven variables (A to G) for year 2012 :

SQL> desc measure_fact
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 DT                                                 DATE
 TOKEN                                              VARCHAR2(30)
 VAL                                                NUMBER

SQL> select * from measure_fact order by dt, token ;

DT                  TOKEN             VAL                                       
------------------- ---------- ----------                                       
01/01/2012 00:00:00 A          ,581904481                                       
01/01/2012 00:00:00 B          ,161674702                                       
01/01/2012 00:00:00 C           ,75959276                                       
01/01/2012 00:00:00 D          ,524613253                                       
01/01/2012 00:00:00 E           ,25470387                                       
01/01/2012 00:00:00 F          ,723589015                                       
01/01/2012 00:00:00 G           ,06515539                                       
01/01/2012 01:00:00 A          ,418608775                                       
01/01/2012 01:00:00 B          ,367579034                                       
01/01/2012 01:00:00 C          ,579998936                                       
01/01/2012 01:00:00 D          ,982256601                                       
01/01/2012 01:00:00 E          ,549623049                                       
01/01/2012 01:00:00 F          ,074974691                                       
01/01/2012 01:00:00 G           ,57354791                                       

<snip>

31/12/2012 23:00:00 A          ,874009005                                       
31/12/2012 23:00:00 B          ,825065349                                       
31/12/2012 23:00:00 C          ,493866511                                       
31/12/2012 23:00:00 D           ,36120651                                       
31/12/2012 23:00:00 E          ,308018922                                       
31/12/2012 23:00:00 F          ,672652837                                       
31/12/2012 23:00:00 G          ,241310825                                       

61488 rows selected.

For each hour of the year, I’ll calculate the formula ABS( ( ( A + B ) * C^D ) – E )^( F / G ) with three different methods and compare them : dynamic SQL with literal replacement, dynamic SQL with bind variables and the RPN evaluation technique.

 

a. Using a dynamic PL/SQL block with literal replacement

The “dynamic_eval” function is defined as follows :

create or replace function dynamic_eval (expr in varchar2, vars in kv_table)
return binary_double
is
  tmp varchar2(4000) := expr;
  res binary_double := 0;
begin

  for i in 1..vars.count loop
    tmp := replace(tmp, ':'||vars(i).token, vars(i).val);
  end loop;

  execute immediate 'begin :1 := '||tmp||'; end;' using out res;

  return res;

end;

The function is very simple, it just replaces each occurrence of the token in the input string with its value and use native dynamic sql to evaluate the resulting numeric expression.

Use case :

SQL> set autotrace traceonly statistics
SQL> set timing on
SQL> select dt
  2       , dynamic_eval('power(ABS(((:A+:B)*power(:C,:D))-:E),(:F/:G))'
  3                    , cast(collect(kv_obj(token, val)) as kv_table)) as res
  4  from measure_fact m
  5  group by m.dt;

8784 rows selected.

Elapsed: 00:01:07.18

Statistics
----------------------------------------------------------
      35136  recursive calls
          0  db block gets
        316  consistent gets
          0  physical reads
          0  redo size
     234721  bytes sent via SQL*Net to client
       6854  bytes received via SQL*Net from client
        587  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
       8784  rows processed

The query is quite slow, for an obvious reason : it hard-parses 8764 different cursors!

 

b. Using a dynamic PL/SQL block with bind variables

The “dynamic_eval2” function is defined as follows :

create or replace function dynamic_eval2 (expr in varchar2, vars in kv_table)
return binary_double
is
  tmp   varchar2(4000) := expr;
  res   binary_double := 0;
  c     number := dbms_sql.open_cursor;
  dummy number;
begin

  dbms_sql.parse(c, 'begin :0 := '||tmp||'; end;', dbms_sql.native);
  
  for i in 1..vars.count loop
    dbms_sql.bind_variable(c, vars(i).token, vars(i).val);
  end loop;
  dbms_sql.bind_variable(c, '0', res);

  dummy := dbms_sql.execute(c);

  dbms_sql.variable_value(c, '0', res);
  dbms_sql.close_cursor(c);

  return res;

end;

This version is a little bit more complex than the previous one. It uses DBMS_SQL to parse the PL/SQL block and bind the values from the input collection to their respective tokens.

Use case :

SQL> select dt
  2       , dynamic_eval2('power(ABS(((:A+:B)*power(:C,:D))-:E),(:F/:G))'
  3                    , cast(collect(kv_obj(token, val)) as kv_table)) as res
  4  from measure_fact m
  5  group by m.dt;

8784 rows selected.

Elapsed: 00:00:12.79

Statistics
----------------------------------------------------------
      43920  recursive calls
          0  db block gets
        316  consistent gets
          0  physical reads
          0  redo size
     234721  bytes sent via SQL*Net to client
       6854  bytes received via SQL*Net from client
        587  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
       8784  rows processed

We can immediately measure the improvement due to the bind variables : the cursor is parsed once and executed multiple times. However, the large number of recursive calls is still problematic.

 

c. Using the RPN technique
SQL> select m.dt
  2       , rpn_util.eval(
  3           rpn_util.parse('ABS(((A+B)*C^D)-E)^(F/G)')
  4         , cast(
  5             collect(kv_obj(m.token, m.val))
  6             as kv_table
  7           )
  8         ) as res
  9  from measure_fact m
 10  group by m.dt ;

8784 rows selected.

Elapsed: 00:00:04.68

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        316  consistent gets
          0  physical reads
          0  redo size
     189000  bytes sent via SQL*Net to client
       6854  bytes received via SQL*Net from client
        587  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
       8784  rows processed

No recursive call, and execution time reduced by almost 2/3.
Note : the parse function is called only once because it’s been declared DETERMINISTIC.

 

d. Conclusion
Method Rows fetched Overall time (second)
Dynamic SQL with literal replacement 8784 67
Dynamic SQL with bind variables 8784 12.8
PL/SQL RPN evaluation 8784 4.7

 

4 thoughts on “PL/SQL RPN calculator

  1. Hi,
    I am new XML, and wondering if you could guide me in thr right direction.
    I have XML file, which has mutiple complex types. I have the XSD for the same. I need to read the XML file (Extract set of nodes based on the data from a oracle table / cursor) and create a XML file using the same XSD struture.

    Example i have a bank data in xml format with an XSD

    I have to extract data of certain bank accounts and create an xml file with the same structure as the original source.

    What approach should i take in Oracle 11g database ?

    1. Of course it does contain the body. Check file rpn_util.pck, it has both specs and body.
      BTW, this version has a few bugs left. I’ll post a revamped version in the near future, including a true Recursive Descent Parser that reports errors at compile-time.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.