Archive for November, 2016

ExcelTable 1.2 : introducing streaming support for large files

November 15, 2016 1 comment

ExcelTable is my attempt at building a SQL query interface to read MS Excel files in xlsx (or xlsm) format.
Version 1.2 is now available with the following new features :

  • Streaming read support for large Excel files
  • setFetchSize() routine to limit the number of rows processed per request



Streaming read method

The getRows() function has been extended with an additional optional argument p_method.
Allowed values are as follows :

-- Read methods  
DOM_READ     constant binary_integer := 0;
STREAM_READ  constant binary_integer := 1;

The default is 0 (DOM_READ).

The streaming method requires Java (StAX API) and is much more scalable than the DOM_READ method when accessing large files.
Please see the README for more details about what to install depending on the target database version.

Example on a 500,000-row file (bigfile.xlsx) :

select * 
from table(
       , 'data'      
       , q'{ 
            "ID"           number
          , "FIRST_NAME"   varchar2(15)
          , "LAST_NAME"    varchar2(20)
          , "EMAIL"        varchar2(30)
          , "GENDER"       varchar2(10)
          , "IP_ADDRESS"   varchar2(16)
          , "COMMENT"      varchar2(4000)
          , "IMAGE"        varchar2(4000)
          , "DT"           date  format 'DD/MM/YYYY'

       , 'A2'
       , 1
     ) t ;


Read more…