
A function to create the import specifications for a particular data file. The resulting specification can be passed to the catalog or fetch functions to assign the correct data types to the columns of imported data. Import specifications are defined as name/value pairs, where the name is the column name and the value is the data type indicator. Available data type indicators are 'guess', 'logical', 'character', 'integer', 'numeric', 'date', 'datetime', and 'time'.

Also note that multiple import specifications can be combined into a collection, and assigned to an entire catalog. See the specs function for an example of using a specs collection.

Usage

import_spec(..., na = NULL, trim_ws = NULL)

Arguments

...

Named pairs of column names and column data types, separated by commas. Available types are: 'guess', 'logical', 'character', 'integer', 'numeric', 'date', 'datetime', and 'time'. The date/time data types accept an optional input format. To supply the input format, append it to the data type after an equals sign, for example 'date=%d%b%Y' or 'datetime=%d-%m-%Y %H:%M:%S'. Default is NULL, meaning no column types are specified, and the function will make its best guess for each column.

na

A vector of values to be treated as NA. For example, the vector c('', ' ') will cause empty strings and single blanks to be converted to NA values. Default is NULL, meaning the value of the na parameter will be taken from the specs function. Any value supplied to the import_spec function overrides the value from the specs function.

trim_ws

Whether or not to trim white space from the input data values. The default is NULL, meaning the value of the trim_ws parameter will be taken from the specs function. Any value supplied to the import_spec function overrides the value from the specs function.
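
As an illustration of the arguments described above, the following is a minimal sketch of an import spec that sets column types together with na and trim_ws. The column names SUBJID, AGE, and TRTSDT are hypothetical and stand in for the columns of your own file; only the import_spec signature shown under Usage is assumed.

```r
library(fetch)

# Hypothetical column names; replace with the columns in your own data file.
spc <- import_spec(SUBJID = "character",      # keep identifiers as text
                   AGE    = "integer",
                   TRTSDT = "date=%d%b%Y",    # date type with an input format
                   na = c("", " "),           # empty strings and single blanks become NA
                   trim_ws = TRUE)            # trim white space from input values

# The returned object carries the "import_spec" class
class(spc)
```

Because na and trim_ws are set here, they take precedence over any values defined on a specs collection that contains this spec.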

Value

The import specification object. The class of the object will be "import_spec".

Date/Time Format Codes

Below are some common date formatting codes. For a complete list, see the documentation for the strptime function:

  • %d = day of the month as a number (01-31)

  • %a = abbreviated weekday

  • %A = unabbreviated weekday

  • %m = month number

  • %b = abbreviated month name

  • %B = unabbreviated month name

  • %y = 2-digit year

  • %Y = 4-digit year

  • %H = hour on the 24-hour clock (00-23)

  • %M = minute

  • %S = second

  • %p = AM/PM indicator (used with the 12-hour code %I)
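
To check a format string before placing it in an import spec, the codes above can be tried directly in base R. The sketch below uses only as.Date, as.POSIXct, and format, with made-up sample strings; it does not call any fetch functions, and it assumes the package interprets the codes the same way strptime does, as the reference above suggests.

```r
# Base R only. Sample strings are invented for illustration.
# A numeric month (%m) keeps the example independent of the session locale;
# %b and %B match month names in the current locale.
start <- as.Date("18-09-2020", format = "%d-%m-%Y")
stamp <- as.POSIXct("18-09-2020 14:30:22",
                    format = "%d-%m-%Y %H:%M:%S", tz = "UTC")

format(start, "%Y-%m-%d")   # parsed date in ISO form
format(stamp, "%H:%M:%S")   # time portion of the parsed datetime

# A string that does not match the format parses to NA instead of raising an error
bad <- as.Date("2020/09/18", format = "%d-%m-%Y")
is.na(bad)
```

An unexpected NA from a check like this usually means the format string and the data disagree on field order or separators.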

See also

fetch to retrieve data, and specs for creating a collection of import specs.

Other specs: print.specs(), read.specs(), specs(), write.specs()

Examples

# Get sample data directory
pkg <- system.file("extdata", package = "fetch")

# Create import spec
spc <- import_spec(TRTSDT = "date=%d%b%Y",
                   TRTEDT = "date=%d%b%Y")

# Create catalog without filter
ct <- catalog(pkg, engines$csv, import_specs = spc)

# Get dictionary for ADVS with Import Spec
d <- ct$ADVS

# Observe data types for TRTSDT and TRTEDT are now Dates
d[d$Column %in% c("TRTSDT", "TRTEDT"), ]
# data item 'ADVS': 37 cols 3617 rows
#- Engine: csv
#- Size: 1.1 Mb
#- Last Modified: 2020-09-18 14:30:22
#   Name Column Class Label Format NAs MaxChar
#16 ADVS TRTSDT  Date  <NA>     NA  54      10
#17 ADVS TRTEDT  Date  <NA>     NA 119      10