重庆思庄Oracle、、PostgreSQL、Redhat认证学习论坛

标题: 11g 新功能:外部表中使用preprocessor不用解压即可直接访问压缩文件 [打印本页]

作者: 郑全    时间: 2017-1-4 15:41
标题: 11g 新功能:外部表中使用preprocessor不用解压即可直接访问压缩文件
preprocessor的意思是,数据是以某种格式存放的,需要你去预先去处理一下,而不要去改变原来的文件,比如:有个gz格式的压缩文件,你处理它的时候不要去试图解压缩他,而是把它的内容显示出来,符合他的预处理程序有:zcat 、gzip -cd ,前面做实验的时候想当然的认为预处理就是把它事先解压缩,写了个shell,用了gzip -d $1 ,做了很长时间都没有成功,每次建完external table的时候都把原始文件给解压缩了。

Oracle? Database Utilities  -> 14 The ORACLE_LOADER Access Driver

If the file you want to load contains data records that are not in a format supported by the ORACLE_LOADER access driver, then use the PREPROCESSOR clause to specify a user-supplied preprocessor program that will execute for every data file. Note that the program specification must be enclosed in a shell script. if it uses arguments (see the description of "file_spec").
The preprocessor program converts the data to a record format supported by the access driver and then writes the converted record data to standard output (stdout), which the access driver reads as input. The syntax of the PREPROCESSOR clause is as follows:


下面做个简单的例子:
1.生成数据文件 d.dat 内容为:

13,987,1998-01-10 00:00:00,3,999,1,1232.16,
13,1660,1998-01-10 00:00:00,3,999,1,1232.16,
13,1762,1998-01-10 00:00:00,3,999,1,1232.16,
13,1843,1998-01-10 00:00:00,3,999,1,1232.16,
13,1948,1998-01-10 00:00:00,3,999,1,1232.16,
13,2273,1998-01-10 00:00:00,3,999,1,1232.16,
13,2380,1998-01-10 00:00:00,3,999,1,1232.16,


2.生成压缩数据文件:
gzip d.dat
3.写个shell  :uncompress  ,内容为,shell的权限对oracle用户有x权限
/bin/gzip -cd $1
4.建一个directory :DUMP
create directory dump as '/home/oracle/dump';
5.把压缩数据文件、shell文件都放dump下
6.

CREATE TABLE ext2
(
  "PROD_ID" NUMBER,
  "CUST_ID" NUMBER,
  "TIME_ID" DATE,
  "CHANNEL_ID" NUMBER,
  "PROMO_ID" NUMBER,
  "QUANTITY_SOLD" NUMBER(10,2),
  "AMOUNT_SOLD" NUMBER(10,2)
)
ORGANIZATION external
(
  TYPE oracle_loader
  DEFAULT DIRECTORY DUMP
  ACCESS PARAMETERS
  (
    RECORDS DELIMITED BY '
' CHARACTERSET US7ASCII
    PREPROCESSOR DUMP:'uncompress'
    BADFILE 'DUMP':'c.bad'
    LOGFILE 'c.log_xt'
    READSIZE 1048576
    FIELDS TERMINATED BY "," LDRTRIM
    MISSING FIELD VALUES ARE NULL
    REJECT ROWS WITH ALL NULL FIELDS
    (
      "PROD_ID" CHAR(255)
        TERMINATED BY ",",
      "CUST_ID" CHAR(255)
        TERMINATED BY ",",
      "TIME_ID" CHAR(255)
        TERMINATED BY ","
        DATE_FORMAT DATE MASK "yyyy-mm-dd hh24:mi:ss",
      "CHANNEL_ID" CHAR(255)
        TERMINATED BY ",",
      "PROMO_ID" CHAR(255)
        TERMINATED BY ",",
      "QUANTITY_SOLD" CHAR(255)
        TERMINATED BY ",",
      "AMOUNT_SOLD" CHAR(255)
        TERMINATED BY ","
    )
  )
  location
  (
    'd.dat.gz'
  )
)REJECT LIMIT UNLIMITED parallel 6
;






欢迎光临 重庆思庄Oracle、、PostgreSQL、Redhat认证学习论坛 (http://bbs.cqsztech.com/) Powered by Discuz! X3.2