[Non-API] Flat file data source provider - an ODA run-time extension.
Note: The implementation classes are not public APIs.
Backward compatibility support in future releases is not guaranteed.
Package Specification
The ODA flat file driver serves as an exemplary implementation
of the ODA run-time interfaces.
It performs basic data source provider functionalities including:
- Executes a query on a specific flat file (CSV, SSV, TSV, PSV) using SQL-like query syntax
- Provides the query's result set metadata
- Retrieves the query's result set data from the flat file
Consuming the ODA flat file driver
- Start by creating an IConnection instance using the FlatFileDriver.getConnection method.
- Open the connection using the IConnection.open( Properties prop ) method.
The driver-specific connection property names are:
- "HOME" : The directory of the flat file (required property)
- "CHARSET" : The character set for decoding the data file; default value = "UTF-8"
- "INCLCOLUMNNAME" : Indicates whether the flat file contains column name meta-data; valid values = "YES" (default), "NO"
- "INCLTYPELINE" : Indicates whether the flat file contains data type meta-data; valid values = "YES" (default), "NO"
- "DELIMTYPE" : Indicates the delimiter type the flat file uses; valid values = "COMMA" (default), "SEMICOLON", "TAB", "PIPE"
- Create an IQuery instance using the IConnection.newQuery( String dataSetType ) method.
- Execute the query using the IQuery.executeQuery method, which returns an IResultSet for data retrieval.
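The connection properties listed above can be collected in a java.util.Properties object before the connection is opened. The following is a minimal sketch that uses only the JDK; the class name FlatFileConnectionProps, the build method and the directory "/data/flatfiles" are hypothetical and not part of the driver.

```java
import java.util.Properties;

// Hypothetical helper, not part of the driver: gathers the documented
// connection properties, using the documented default values.
final class FlatFileConnectionProps {

    static Properties build(String homeDir) {
        // "HOME" is the only required property.
        if (homeDir == null || homeDir.isEmpty()) {
            throw new IllegalArgumentException("HOME is required");
        }
        Properties props = new Properties();
        props.setProperty("HOME", homeDir);
        props.setProperty("CHARSET", "UTF-8");
        props.setProperty("INCLCOLUMNNAME", "YES");
        props.setProperty("INCLTYPELINE", "YES");
        props.setProperty("DELIMTYPE", "COMMA");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder directory; replace with the folder that holds the flat files.
        Properties props = build("/data/flatfiles");
        props.list(System.out);
        // The object would then be passed to IConnection.open( Properties prop ).
    }
}
```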
Data store format
The flat file ODA driver expects that both meta-data (including column names
and data types) and data are kept in a single flat file.
The first line of a flat file specifies data column names.
The second line may optionally specify the column data types.
The remaining portion of the file contains data.
Redundant Spaces
Redundant spaces are allowed in a flat file, but will be trimmed once processed by the flat file driver.
Double Quotes
Double quotes can be used in a flat file for clarity.
The quotes, however, are stripped when the driver processes the file.
That is, a line in flat file like
is processed to be the same as the following line:
A comma within a pair of double quotes is not treated as a separator.
For example,
"I'm, however, a really normal String"
contains a single column value.
It is not considered the same as the following line:
I'm, however, a really normal String
because the second case is processed to contain three column values,
i.e. "I'm", "however", and "a really normal String".
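The quoting rule above can be illustrated with a small splitter. This is only a sketch of the described behavior, not the driver's actual parser; the class name QuotedLineSplitter is hypothetical, a comma delimiter is assumed, and spaces inside the quotes at either end of a value are trimmed as well, which the driver may or may not do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: splits a line on commas that lie outside double quotes,
// drops the quote characters, and trims redundant spaces around each value.
final class QuotedLineSplitter {

    static List<String> split(String line) {
        List<String> values = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;          // toggle state; the quote itself is dropped
            } else if (c == ',' && !inQuotes) {
                values.add(current.toString().trim());
                current.setLength(0);          // start the next column value
            } else {
                current.append(c);
            }
        }
        values.add(current.toString().trim()); // last (or only) column value
        return values;
    }

    public static void main(String[] args) {
        // Quoted: the commas are part of a single column value.
        System.out.println(split("\"I'm, however, a really normal String\"").size()); // prints 1
        // Unquoted: each comma separates two column values.
        System.out.println(split("I'm, however, a really normal String").size());     // prints 3
    }
}
```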
Null Values
Null values are allowed. They are represented as blanks and are separated from other data by the delimiter.
A flat file that contains only one column in which every value is null is treated as an empty table.
Data Types
The flat file driver currently supports the following data types:
INT, DOUBLE, STRING, DATE, TIME, TIMESTAMP and BIGDECIMAL.
Support for the BLOB and CLOB data types will be added in a future release.
The driver's data type codes are defined as follows:
INT = java.sql.Types.INTEGER;
DOUBLE = java.sql.Types.DOUBLE;
STRING = java.sql.Types.VARCHAR;
DATE = java.sql.Types.DATE;
TIME = java.sql.Types.TIME;
TIMESTAMP = java.sql.Types.TIMESTAMP;
BIGDECIMAL = java.sql.Types.NUMERIC;
BLOB = java.sql.Types.BLOB;
CLOB = java.sql.Types.CLOB;
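The mapping above can be expressed as a lookup table keyed by the type name. This is a minimal sketch; the class FlatFileTypeCodes and its codeFor method are hypothetical, and the case-insensitive lookup is an assumption rather than documented driver behavior.

```java
import java.sql.Types;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical lookup table mirroring the type codes listed above.
final class FlatFileTypeCodes {

    static final Map<String, Integer> CODES = new LinkedHashMap<>();
    static {
        CODES.put("INT", Types.INTEGER);
        CODES.put("DOUBLE", Types.DOUBLE);
        CODES.put("STRING", Types.VARCHAR);
        CODES.put("DATE", Types.DATE);
        CODES.put("TIME", Types.TIME);
        CODES.put("TIMESTAMP", Types.TIMESTAMP);
        CODES.put("BIGDECIMAL", Types.NUMERIC);
        CODES.put("BLOB", Types.BLOB);
        CODES.put("CLOB", Types.CLOB);
    }

    // Assumption: type names are matched case-insensitively after trimming.
    static int codeFor(String typeName) {
        Integer code = CODES.get(typeName.trim().toUpperCase(Locale.ROOT));
        if (code == null) {
            throw new IllegalArgumentException("Unknown data type: " + typeName);
        }
        return code;
    }

    public static void main(String[] args) {
        // Print every type name together with its java.sql.Types code.
        CODES.forEach((name, code) -> System.out.println(name + " = " + code));
    }
}
```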
SQL-like Query Syntax
The flat file driver supports limited SQL-like query syntax.
The supported syntax is as follows (items in square brackets are optional):
- SELECT column1 [AS alias1] [,column2 [AS alias2]]... [,columnN [AS aliasN]] FROM tableName
The query text is case-insensitive and allows redundant spaces.
The flat file driver does not support multiple tables in the FROM clause.
For example, given a table named "employee.csv" with columns "Id", "Name" and "HireDate",
the following queries are valid:
SELECT Id FROM employee.csv
SELECT Id,Name FROM employee.csv
SELECT Id,Name,HireDate FROM employee.csv
select Id AS PersonnelId, Name AS EmployeeName FROM employee.csv
SELECT Name AS EmployeeName , Id , HireDate FROM employee.csv
SELECT name, name from employee.csv
// This query is valid, but it simply returns a result set of two columns with the same data values.
However, the following queries are invalid:
Id FROM EMPLOYEE.csv
// missing keyword "SELECT"
Select id
// missing keyword "FROM"
SELECT I FROM employee.csv
// invalid column name
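The valid and invalid cases above can be reproduced with a small regular-expression check. This is a sketch of the documented grammar only, not the driver's query parser; the class SimpleQueryChecker and its isValid method are hypothetical, and the list of known columns is supplied by the caller.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical checker for: SELECT col [AS alias] [, ...] FROM tableName
final class SimpleQueryChecker {

    // Whole query: SELECT <column list> FROM <single table name>.
    private static final Pattern QUERY = Pattern.compile(
            "^\\s*SELECT\\s+(.+?)\\s+FROM\\s+(\\S+)\\s*$", Pattern.CASE_INSENSITIVE);

    // One column entry: name, optionally followed by AS alias.
    private static final Pattern COLUMN = Pattern.compile(
            "^\\s*(\\w+)(?:\\s+AS\\s+(\\w+))?\\s*$", Pattern.CASE_INSENSITIVE);

    static boolean isValid(String query, List<String> knownColumns) {
        Matcher q = QUERY.matcher(query);
        if (!q.matches()) {
            return false;                       // missing SELECT or FROM
        }
        for (String spec : q.group(1).split(",", -1)) {
            Matcher c = COLUMN.matcher(spec);
            if (!c.matches()) {
                return false;                   // malformed column entry
            }
            String name = c.group(1);
            boolean known = knownColumns.stream().anyMatch(k -> k.equalsIgnoreCase(name));
            if (!known) {
                return false;                   // invalid column name
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("Id", "Name", "HireDate");
        System.out.println(isValid("SELECT Id,Name FROM employee.csv", columns)); // prints true
        System.out.println(isValid("Select id", columns));                        // prints false
        System.out.println(isValid("SELECT I FROM employee.csv", columns));       // prints false
    }
}
```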
Advanced SQL-like query text is also defined and can be used in the data set design.
For detailed information, please refer to "org.eclipse.datatools.connectivity.oda.flatfile.util.querytextutil.package.html"
@since 3.0