 Tabular data handling toolkit |
Manual page for xlcleaner(1)(TDH)
xlcleaner(1) - clean up Excel/spreadsheet tab-delimited export files
SYNOPSIS
cat tabdata | xlcleaner options
DESCRIPTION
xlcleaner cleans up
the irregularities and annoyances often present in MS Excel tab-delimited
export files, including rounding of numeric values to reasonable precisions.
The tab-delimited export text file is read from standard input and the cleaned-up
result is written to standard output.
Default behavior is to convert tab-delimited files to files that use the
null field symbol (which is = by default or may be changed in your
project config file).
Incoming lines that hold nothing but white space are omitted.
Blanks that are embedded in datafields are converted to underscores,
unless -b is used.
EXAMPLE
cat tabdata | xlcleaner -1 > out
META OPTIONS
-1
-
-
Equivalent to -s -r 0 -t -f . Useful in converting tab delimited data
exported from spreadsheets.
OPTIONS
-r D
-
-
Rounding option. Round numeric values to a reasonable # of decimal places.
D may be used to alter the normal behavior. If D = 0 nothing happens. If D = 1
rounding will go to 1 additional decimal place. D = -1 rounding will go to one
less decimal place than normal.
-t
-
-
Create tab-delimited output, overriding any previously specified field delimitation character.
Also, if -c is being used, comment lines will
be copied to output verbatim. Unless -b is used, embedded blanks
will still be converted to underscores.
-f
-
-
The first non-comment, non-empty line is examined; the number of fields (non-blank or blank)
is counted and this value is used to set the number of fields per record.
All subsequent records will be truncated to that number of fields.
A warning will be issued for any discarded fields having content other than blank.
If your data set has blank fields at the end of the first record it may be best
to remove these manually with a text editor before running xlcleaner -f.
If -s is in effect, blank fields (spacer columns) will be included in the initial count
and in the truncation, but then will be screened out before writing the result.
-n N
-
-
Same as -f but N (the number of fields to process) may be specified directly.
-s
-
-
skip spacer columns. Spacer columns are empty "columns" or fields.
They are often created by spreadsheet users for spacing purposes.
This option works as follows: The first non-empty non-comment line is examined.
Any empty fields in this line are considered "spacers". Those fields will
then be screened out when the result output is written.
A warning will be issued if any content is encountered in a spacer field
that is being omitted. For example, if fields 5 and 9 of the first line are empty,
and -s is in effect, then fields 5 and 9 of all records will be screened out.
-c SYM
-
-
Define SYM as a comment symbol present in the input. Any line whose first
non-whitespace content is this symbol will be copied to output with a DMS comment
symbol (//) added at the beginning.
-p N
-
-
Add padding of N blanks to end of each result record.
-b
-
-
don't convert blanks that are embedded within data fields, to underscores.
-m SYM
-
-
Set null data code to SYM, overriding any previously defined null data code.
|
 Tabular data handling toolkit
Copyright Steve Grubb
|