Creates a collection of import specifications for a data source. Pass the collection to the catalog function to assign data types separately for each imported data file. The spec collection is a set of import_spec objects identified by name/value pairs. The name corresponds to the name of the input dataset, without the file extension. The value is the import_spec object to use for that dataset. In this way, you may define different specs for each dataset in your catalog.
The import engines will guess at the data types for any columns that are not explicitly defined in the import specifications. The import spec syntax is the same for all data engines.
Note that the na and trim_ws parameters on the specs function are applied globally to all files in the catalog. These global settings can be overridden on the import_spec for any particular data file.
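As a minimal sketch of the override described above, the following defines collection-level settings and then overrides them for one dataset. It reuses the ADAE and ADVS dataset names from the Examples section and assumes that import_spec accepts na and trim_ws arguments, as the Arguments section states. The particular NA values are illustrative only.

```r
library(fetch)

# Collection-level na and trim_ws apply to every file by default.
# The import_spec for ADVS overrides both settings for that file only.
spc <- specs(ADAE = import_spec(TRTSDT = "date=%d%b%Y"),
             ADVS = import_spec(TRTSDT = "character",
                                na = c("", "NA", "."),  # illustrative override
                                trim_ws = FALSE),
             na = c("", "NA"),
             trim_ws = TRUE)
```

With this collection, ADAE would be read with the global settings, while ADVS would use its own na vector and keep surrounding white space.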
Also note that the specs collection is defined as an object so it can be stored and reused. See the write.specs and read.specs functions for additional information on saving and restoring specs.
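A hedged sketch of saving and restoring a collection follows. It assumes that write.specs takes the specs object plus dir_path and file_name arguments and returns the path of the file it wrote, and that read.specs accepts that path. Neither signature is shown on this page, so confirm both on their own help pages before relying on this. The file name "adam_specs" is hypothetical.

```r
library(fetch)

spc <- specs(ADAE = import_spec(TRTSDT = "date=%d%b%Y"))

# Assumed signature: write.specs(x, dir_path, file_name), returning the file path
pth <- write.specs(spc, dir_path = tempdir(), file_name = "adam_specs")

# Assumed: read.specs accepts the path returned above
spc2 <- read.specs(pth)
```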
Usage
specs(..., na = c("", "NA"), trim_ws = TRUE)
Arguments
- ...
Named input specs. The name should correspond to the file name, without the file extension. The spec is defined as an import_spec object. See the import_spec function for additional information on parameters for that object.
- na
A vector of values to be treated as NA. For example, the vector c('', ' ') will cause empty strings and single blanks to be converted to NA values. For most file types, empty strings and the string 'NA', that is c('', 'NA'), are considered NA. For SAS® datasets and transport files, a single blank and a single dot, that is c(" ", "."), are considered NA. The value of the na parameter on the specs function can be overridden by the na parameter on the import_spec function.
- trim_ws
Whether or not to trim white space from the input data values. Valid values are TRUE and FALSE. Default is TRUE. The value of the trim_ws parameter on the specs function can be overridden by the trim_ws parameter on the import_spec function.
See also
catalog to create a data catalog, fetch for retrieving data, and import_spec for additional information on defining an import spec.
Other specs: import_spec(), print.specs(), read.specs(), write.specs()
Examples
# Get sample data directory
pkg <- system.file("extdata", package = "fetch")
# Create import spec
spc <- specs(ADAE = import_spec(TRTSDT = "date=%d%b%Y",
                                TRTEDT = "date=%d%b%Y"),
             ADVS = import_spec(TRTSDT = "character",
                                TRTEDT = "character"))
# Create catalog with specs collection
ct <- catalog(pkg, engines$csv, import_specs = spc)
# Get dictionary for ADAE with Import Spec
d1 <- ct$ADAE
# Observe data types for TRTSDT and TRTEDT are Dates
d1[d1$Column %in% c("TRTSDT", "TRTEDT"), ]
# data item 'ADAE': 56 cols 150 rows
#- Engine: csv
#- Size: 155 Kb
#- Last Modified: 2020-09-18 14:30:22
#    Name Column Class Label Format NAs MaxChar
#13  ADAE TRTSDT  Date  <NA>     NA   1      10
#14  ADAE TRTEDT  Date  <NA>     NA   4      10
# Get dictionary for ADVS with Import Spec
d2 <- ct$ADVS
# Observe data types for TRTSDT and TRTEDT are character
d2[d2$Column %in% c("TRTSDT", "TRTEDT"), ]
# data item 'ADVS': 37 cols 3617 rows
#- Engine: csv
#- Size: 1.1 Mb
#- Last Modified: 2020-09-18 14:30:22
#    Name Column     Class Label Format NAs MaxChar
#16  ADVS TRTSDT character  <NA>     NA  54       9
#17  ADVS TRTEDT character  <NA>     NA 119       9