Tag Archives: read CSV file with external table

External Tables in Oracle

External Tables in Oracle

External tables are the table that does not reside in the Oracle Database.
They can be in any format and read by access driver. When you create external table its metadata is exists in Oracle database.

You can select, join on external table but no DML operation such as insert, update & delete is possible.No index can be created on External Table.

External tables is used for performing basic extraction, transformation, and loading (ETL) tasks that are common for data warehousing. It is a way to use select statement on outside data without load into the database.

An Access driver is method used to read external data. Default access driver is ORACLE_LOADER,its allows the reading of data from external files(like SQL loader). Another is ORACLE_DATAPUMP, lets you unload data—that is, read data from the database and insert it into an external table.

In Oracle Database 12c Release 2 (12.2), new access drivers ORACLE_HIVE and ORACLE_HDFS are available.
ORACLE_HIVE access driver can extract data stored in Apache Hive.
ORACLE_HDFS access driver can extract data stored in a Hadoop Distributed File System (HDFS).

In Oracle 18c, Inline external tables enable the run-time definition of an external table as part of a SQL statement, without creating the external table as persistent object in the data dictionary. It is like temporary.

Creating an External Table

1. For external table you need comma separated CSV files

emp_detail.dat contains the following data:

1000,Ram,Kapoor,MRG,1000,15-MAY-2008
1001,Mark,Marshal,ITO,1000,12-MAY-2009
1002,Pooja,Semi,ITO,1000,17-MAY-2010
1003,Sham,Marsha,Developer,1000,17-MAY-2011

2. Create directories and assign permission to them in Oracle.
Note: Make sure the following directories exists in Operating system and has permission

CREATE OR REPLACE DIRECTORY ext_dat_dir AS 'D:\employeefiles\data';
CREATE OR REPLACE DIRECTORY ext_log_dir AS 'D:\employeefiles\log';
CREATE OR REPLACE DIRECTORY ext_bad_dir AS 'D:\employeesfiles\bad';
GRANT READ,WRITE ON DIRECTORY ext_dat_dir TO hr;
GRANT READ,WRITE ON DIRECTORY ext_log_dir TO hr;
GRANT READ,WRITE ON DIRECTORY ext_bad_dir TO hr;

3. Create an external table in hr.schema and load data

-- create the external table
CREATE TABLE ext_employees
(employee_id NUMBER(4),
first_name VARCHAR2(20),
last_name VARCHAR2(25),
job_id VARCHAR2(10),
manager_id NUMBER(4),
hire_date DATE,
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY ext_dat_dir
ACCESS PARAMETERS
(
records delimited by newline
badfile ext_bad_dir:'empxt%a_%p.bad'
logfile ext_log_dir:'empxt%a_%p.log'
fields terminated by ','
missing field values are null
( employee_id, first_name, last_name, job_id, manager_id,
hire_date char date_format date mask "dd-mon-yyyy"
)
)
LOCATION ('emp_detail.dat')
)
PARALLEL
REJECT LIMIT UNLIMITED;

4. Check the external table with select commands:

SELECT * FROM ext_employees;

Parameters:
TYPE : ORACLE_LOADER is default or ORACLE_DATAPUMP
REJECT LIMIT clause specifies no limit on the number of errors. if it specified 10 then allow up to 10 rejections.
PARALLEL clause enables parallel query on the data sources.

Modified the parameter used to create external table
Alter Reject limit
ALTER TABLE ext_employees REJECT LIMIT 100;
Alter default directory in external table
ALTER TABLE ext_employees DEFAULT DIRECTORY dat2_dir;
Alter access parameter
ALTER TABLE ext_employees ACCESS PARAMETERS (FIELDS TERMINATED BY ';');
Alter location of files
ALTER TABLE admin_ext_employees LOCATION ('empxt3.txt', 'empxt4.txt');

Override the parameter during Select statement on External Table

SELECT * FROM sales_external EXTERNAL MODIFY (LOCATION ('sales_9.csv')
REJECT LIMIT UNLIMITED);

Inline External Tables
Inline external tables is using the runtime definition of an external table as part of a SQL
statement, without creating the external table as in the data dictionary.
Without creating extrenal table, use select statement with define all other parameters.

SELECT * FROM EXTERNAL (
( id number not null,
time_id DATE NOT NULL,
prod_id INTEGER NOT NULL,
quantity_sold NUMBER(10,2),
amount_sold NUMBER(10,2))
TYPE ORACLE_LOADER
DEFAULT DIRECTORY data_dir1
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
FIELDS TERMINATED BY '|')
LOCATION ('sales_2008.csv')
REJECT LIMIT UNLIMITED) sales_data;

Note: need to create data_dir1 directory.

Advertisements