Home > Miscellaneous, PL/SQL, SQL > ExcelTable 1.3 : support for password-encrypted files

ExcelTable 1.3 : support for password-encrypted files

Here’s the new version of ExcelTable, which can now read password-encrypted files.
It supports Standard and Agile encryption methods, as specified in [MS-OFFCRYPTO].

By default, Office 2007 will encrypt using the Standard method whereas Office 2010 and onwards use Agile encryption.
AES (128 or 256) is usually the default algorithm on standard Office installations.
Because latest Office versions (2013+) make use of SHA-2 hashing algorithms, Oracle 12c is required to read Excel documents encrypted in those versions.

Basically, the only change from ExcelTable 1.2 is the addition of an optional argument p_password in getRows() function :

function getRows (
  p_file     in blob
, p_sheet    in varchar2
, p_cols     in varchar2
, p_range    in varchar2 default null
, p_method   in binary_integer default DOM_READ
, p_password in varchar2 default null
) 
return anydataset pipelined
using ExcelTableImpl;

The following dependencies are also required :
XUTL_CDF
XUTL_OFFCRYPTO

 

Source code available on GitHub :

/mbleron/oracle/ExcelTable

 

A few words about the internals are following…

 

Office Crypto internals

As hinted in my previous post, an encrypted Office document is actually a CFBF (also referred to as “OLE2”, or “CDF”) container, instead of an OPC package for regular Office 2007+ files.

For example :

SQL> select t.path
  2  from table(
  3         xutl_cdf.get_streams(
  4           file2blob('TMP_DIR', 'crypto2016.xlsx')
  5         )
  6       ) t
  7  ;
 
PATH
--------------------------------------------------------------------------------
/DataSpaces/DataSpaceInfo/StrongEncryptionDataSpace
/DataSpaces/DataSpaceMap
/DataSpaces/TransformInfo/StrongEncryptionTransform/Primary
/DataSpaces/Version
/EncryptedPackage
/EncryptionInfo
 

The container holds multiple streams but only the last two, EncryptedPackage and EncryptionInfo are actually needed to decrypt the content :

  • EncryptedPackage is the encrypted OPC package itself (minus the first 8 bytes representing the size of the decrypted package)
  • EncryptionInfo describes how the original package has been encrypted : encryption and hash algorithms, key size, as well as additional data for password verification and content integrity check (Agile method).

The EncryptionInfo stream structure (Standard or Agile method) depends on the Office version that created the document.
The following shows the default methods vs. Office versions :

Office version Method Encryption Hash algorithm Block chaining
2007 Standard AES-128 SHA-1 ECB
2010 Agile AES-128 SHA-1 CBC
2013 Agile AES-256 SHA512 CBC
2016 Agile AES-256 SHA512 CBC

In the Standard method, EncryptionInfo is a binary stream containing various integer and string fields.

In the Agile method, the stream embeds an XML document instead :

select xmltype(
         xutl_cdf.get_stream(
           file2blob('TMP_DIR', 'crypto2016.xlsx')
         , '/EncryptionInfo'
         , p_offset => 8
         )
       , nls_charset_id('AL32UTF8')
       ) as "XmlDescriptor"
from dual;

Output :

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<encryption xmlns="http://schemas.microsoft.com/office/2006/encryption" xmlns:p="http://schemas.microsoft.com/office/2006/keyEncryptor/password" xmlns:c="http://schemas.microsoft.com/office/2006/keyEncryptor/certificate">
  <keyData saltSize="16" blockSize="16" keyBits="256" hashSize="64" cipherAlgorithm="AES" cipherChaining="ChainingModeCBC" hashAlgorithm="SHA512" saltValue="rWF28d3hAudpvSdHinMWzQ=="/>
  <dataIntegrity encryptedHmacKey="bnfxgRjY1NOwc7c297o6MZREFoSC8LaFbsnf98UNm+QB3xDC3xJ6FnvBsnRfFbiumzAAz7NaT8nyPZ5Efio+Sw==" encryptedHmacValue="5qAiVCv2ofGVkyGiDxmn3i3OZO0xC6aPyKYBoqabbVNfQpnCRisvikxGr04HvO2ic0wMXgHmBnzio4VQpZbouQ=="/>
  <keyEncryptors>
    <keyEncryptor uri="http://schemas.microsoft.com/office/2006/keyEncryptor/password">
      <p:encryptedKey spinCount="100000" saltSize="16" blockSize="16" keyBits="256" hashSize="64" cipherAlgorithm="AES" cipherChaining="ChainingModeCBC" hashAlgorithm="SHA512" saltValue="L5ciDWMGt7x9eHtBpRVqZw==" encryptedVerifierHashInput="Hiw93cH8gTZEFaQiTfbtrw==" encryptedVerifierHashValue="ZUS+YW8yD267k0gMPzPbGJa75Zo9l3KYgGYf9BG2CI8JT5mUEyppfdysbdeJUnd1byDnRVHRIz0X2Lj2qCNeBA==" encryptedKeyValue="rQkFTBL7E8geZ2FtUVVGq1Q7kZ/GnzJ2ZKu3DFOwUoQ="/>
    </keyEncryptor>
  </keyEncryptors>
</encryption>

 

Example

Using sample file crypto2016.xlsx :

SQL> select *
  2  from table(
  3         ExcelTable.getRows(
  4           ExcelTable.getFile('TMP_DIR','crypto2016.xlsx')
  5         , 'Feuil1'
  6         , '"COL1" number'
  7         , 'C5'
  8         , 0
  9         , p_password => 'AZE'
 10         )
 11       )
 12  ;
 
      COL1
----------
         1
         2
         3

 

Advertisements
  1. Esteban
    June 13, 2017 at 21:11

    Thanks..

  2. July 1, 2017 at 13:44

    Personally, I would remove the dependency on having to grant select on sys.v_$parameter to determine max_string_size. Just use a simple function.

    function MaxStringSize return integer is
    vResult integer;
    begin
    select lengthb(lpad(‘x’, 32767, ‘x’))
    into vResult
    from dual;
    return vResult;
    end;

    • July 1, 2017 at 19:24

      Thanks Paul.
      I knew that SQL silently truncated strings exceeding the VARCHAR2 limit, but never actually thought about using that “feature” in this case.
      Will certainly follow your advice.

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: