| Title: | 'SAS'-Style 'PROC FORMAT' for R |
|---|---|
| Description: | Provides 'SAS' 'PROC FORMAT'-like functionality for creating and applying value formats in R. Supports discrete and range-based mapping of values to labels, reverse formatting (invalue), date/time/datetime formatting with built-in 'SAS' format names, multi-label formats, expression labels evaluated at apply-time, case-insensitive matching, import/export of format definitions, and proper handling of missing values (NA, NULL, NaN). |
| Authors: | Vladimir Larchenko [aut, cre], Igor Aleschenkov [aut] |
| Maintainer: | Vladimir Larchenko <[email protected]> |
| License: | GPL-3 |
| Version: | 0.8.1 |
| Built: | 2026-06-04 19:25:38 UTC |
| Source: | https://github.com/crow16384/ksformat |
Provides 'SAS' 'PROC FORMAT'-like functionality for creating and applying value formats in R. The package supports mapping values to labels, range-based formatting, reverse formatting (invalue), date/time/datetime formatting, and proper handling of missing values (NA, NULL, NaN).
Format creation:
fnew — create value-to-label mappings (formats)
finput — create reverse mappings (label-to-value invalues)
fnew_bid — create both format and invalue simultaneously
fnew_date — create date/time/datetime formats ('SAS'-style or
custom strftime patterns)
fparse — parse 'SAS'-like format definitions from text or file
fimport — import formats from a 'SAS' CNTLOUT CSV file
e — mark a label for expression evaluation at apply-time
Format application:
fput — apply a format to a vector (value to label)
fputn — apply a numeric format by name (like 'SAS' PUTN)
fputc — apply a character format by name (like 'SAS' PUTC)
fput_all — apply a multilabel format returning all matching labels
fput_df — apply formats to data frame columns
Reverse formatting:
finputn — apply a numeric invalue by name (like 'SAS' INPUTN)
finputc — apply a character invalue by name (like 'SAS' INPUTC)
Format library:
format_get — retrieve a format from the global library
fprint — list or display registered formats
fclear — remove one or all formats from the library
format_library_app — open interactive library browser (Shiny)
fexport — export formats to 'SAS'-like text
Utilities:
is_missing — check for NA, NaN, and empty strings
range_spec — create a range specification object
Key features:
Discrete and range-based numeric formatting with configurable inclusive/exclusive bounds
Multilabel formats — a value can match multiple labels
(multilabel = TRUE in fnew, retrieved with
fput_all)
Case-insensitive matching (ignore_case = TRUE in
fnew)
Expression labels — labels containing .x1, .x2,
etc. are evaluated at apply-time; see also e
Date/time/datetime formatting with built-in 'SAS' format
names (auto-resolved) or custom strftime patterns
Global format library with auto-registration and case-insensitive name lookup
CNTLOUT import — read format catalogues exported from 'SAS'
Cheat sheet: run ksformat_cheatsheet() to open the HTML
version in your browser, or see the files in
system.file("doc", package = "ksformat").
Source repository and issue tracker: https://github.com/crow16384/ksformat
# Discrete format
fnew("M" = "Male", "F" = "Female", .missing = "Unknown", name = "sex")
fput(c("M", "F", NA), "sex")

# Numeric range format (parsed from text)
fparse(text = '
VALUE age (numeric)
  [0, 18) = "Child"
  [18, 65) = "Adult"
;
')
fputn(c(5, 25), "age")

# Bidirectional format + invalue
fnew_bid("A" = "Active", "I" = "Inactive", name = "status")
fputc("A", "status")
finputc("Active", "status_inv")

# Multilabel format
ml <- fnew(
  "0,17,TRUE,TRUE" = "Pediatric",
  "18,Inf,TRUE,TRUE" = "Adult",
  "0,Inf,TRUE,TRUE" = "Any Age",
  name = "agegrp", type = "numeric", multilabel = TRUE
)
fput_all(c(10, 30), ml)

# Date format (SAS-style, auto-resolved)
fputn(Sys.Date(), "DATE9.")

# Export and library management
cat(fexport(sex = format_get("sex")))
flist()  # character vector of registered names
fprint()
fclear()
Marks a format label string so it will be evaluated as an R expression
at apply-time (fput), even when it does not contain
.x1, .x2, etc. placeholders.
e(expr)
expr |
Character string. The R expression to evaluate. |
This is useful when a label should call a function that does not need
positional .xN arguments.
The expression is evaluated in the caller's environment of
fput, so user-defined functions are accessible.
Labels containing .x1, .x2, etc. are still evaluated
automatically without needing e().
The same character string with an "eval" attribute set to
TRUE.
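The marker mechanism described above can be pictured with a few lines of base R. This is only a sketch under stated assumptions: mark_eval() and resolve_label() are hypothetical stand-ins, not functions exported by ksformat, and the package's actual evaluation logic may differ.

```r
# Hypothetical stand-ins; not the package's actual implementation.
mark_eval <- function(expr) {
  stopifnot(is.character(expr), length(expr) == 1L)
  structure(expr, eval = TRUE)  # same string, plus an "eval" attribute
}

resolve_label <- function(label, envir = parent.frame()) {
  if (isTRUE(attr(label, "eval"))) {
    # Parse the stored text and evaluate it where the caller's functions live
    as.character(eval(parse(text = label), envir = envir))
  } else {
    as.character(label)  # unmarked labels are returned unchanged
  }
}

shout <- function() toupper("done")  # a user-defined function

marked <- mark_eval("shout()")
plain  <- "shout()"

resolve_label(marked)  # evaluated
resolve_label(plain)   # returned as text
```

Only the marked string is evaluated; the unmarked one stays literal text, which mirrors the distinction that e() is documented to make.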
# Mark an expression for evaluation at apply-time
fmt <- fnew(
  "timestamp" = e("format(Sys.time(), '%Y-%m-%d')"),
  "static" = "Hello",
  name = "demo_eval"
)
fput(c("timestamp", "static"), fmt)
fclear()
Removes one or all formats from the global format library. When called without arguments, clears all formats. When called with a name, removes only that format.
fclear(name = NULL)
name |
Character. Optional name of a specific format to remove.
If |
Invisible NULL
fnew("M" = "Male", "F" = "Female", name = "sex")
fclear("sex")  # remove one format
fclear()       # remove all formats
Converts ks_format and/or ks_invalue objects to
human-readable 'SAS'-like text representation.
fexport(..., formats = NULL, file = NULL)
... |
Named |
formats |
A named list of format objects. Alternative to |
file |
Optional file path to write the output to. If |
If file is NULL, returns a character string with the
'SAS'-like text. If file is specified, writes to the file and returns
the path invisibly.
# Export a character format
sex_fmt <- fnew("M" = "Male", "F" = "Female", .missing = "Unknown", name = "sex")
cat(fexport(sex = sex_fmt))

# Export a numeric range format
fparse(text = '
VALUE bmi (numeric)
  [0, 18.5) = "Underweight"
  [18.5, 25) = "Normal"
  [25, 30) = "Overweight"
  [30, HIGH] = "Obese"
  .missing = "No data"
;
')
bmi_fmt <- format_get("bmi")
cat(fexport(bmi = bmi_fmt))

# Export a multilabel format
risk_fmt <- fnew(
  "0,3,TRUE,TRUE" = "Low Risk",
  "0,7,TRUE,TRUE" = "Monitored",
  "3,7,FALSE,TRUE" = "Medium Risk",
  "7,10,FALSE,TRUE" = "High Risk",
  name = "risk", type = "numeric", multilabel = TRUE
)
cat(fexport(risk = risk_fmt))

# Export a date format
enrl_fmt <- fnew_date("DATE9.", name = "enrldt", .missing = "Not Enrolled")
cat(fexport(enrldt = enrl_fmt))
fclear()
Reads a CSV file produced by 'SAS' PROC FORMAT with the
CNTLOUT= option (typically exported via PROC EXPORT)
and converts compatible format definitions into ks_format and
ks_invalue objects.
fimport(file, register = TRUE, overwrite = TRUE)
file |
Path to the CSV file exported from a SAS format catalogue. |
register |
Logical; if |
overwrite |
Logical; if |
The 'SAS' format catalogue CSV is expected to contain the standard CNTLOUT
columns: FMTNAME, START, END, LABEL,
TYPE, HLO, SEXCL, EEXCL.
Supported SAS format types:
N: Numeric VALUE format, converted to ks_format with type = "numeric"
C: Character VALUE format, converted to ks_format with type = "character"
I: Numeric INVALUE (informat), converted to ks_invalue with target_type = "numeric"
J: Character INVALUE (informat), converted to ks_invalue with target_type = "character"
Incompatible types (logged with a warning):
P: PICTURE formats, no equivalent in ksformat
Rows with SAS special missing values (.A to .Z, ._) in the HLO field are
logged as incompatible entries and skipped because R has no equivalent concept.
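To make the expected file layout concrete, the following base R sketch reads a tiny, made-up CNTLOUT-style CSV and classifies its rows by the TYPE codes listed above. The rows are hypothetical, real exports carry more columns, and the handling shown here is an illustration rather than what fimport() does internally. It assumes the usual SAS convention that SEXCL and EEXCL hold "Y" or "N".

```r
# Hypothetical CNTLOUT rows; real exports contain more columns.
csv_text <- 'FMTNAME,START,END,LABEL,TYPE,HLO,SEXCL,EEXCL
SEX,M,M,Male,C,,N,N
SEX,F,F,Female,C,,N,N
AGE,0,18,Child,N,,N,Y
AGE,18,65,Adult,N,,N,Y
PHONE,0,9999999,000-0000,P,,N,N'

# Read everything as text so codes such as "F" and "N" are not reinterpreted
cntl <- read.csv(text = csv_text, colClasses = "character")

# TYPE codes described as convertible
supported <- c(N = "numeric format", C = "character format",
               I = "numeric invalue", J = "character invalue")

cntl$maps_to <- unname(supported[cntl$TYPE])  # NA for unsupported types
skipped <- unique(cntl$FMTNAME[is.na(cntl$maps_to)])

# For numeric rows, SEXCL / EEXCL = "Y" mark exclusive bounds
num <- cntl[cntl$TYPE == "N", ]
num$interval <- paste0(
  ifelse(num$SEXCL == "Y", "(", "["), num$START, ", ", num$END,
  ifelse(num$EEXCL == "Y", ")", "]")
)

skipped
num$interval
```

Here the PICTURE format is the only one flagged as unsupported, and the two AGE rows come out as half-open intervals.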
A named list of ks_format and ks_invalue objects that
were successfully imported. Returned invisibly.
# In SAS:
#   proc format library=work cntlout=fmts; run;
#   proc export data=fmts outfile="formats.csv" dbms=csv replace; run;
csv_file <- system.file("extdata", "test_cntlout.csv", package = "ksformat")
imported <- fimport(csv_file)
flist()
fprint()
fclear()
Creates an invalue format that converts formatted labels back to values.
This is similar to 'SAS' PROC FORMAT with an INVALUE statement.
The invalue is automatically stored in the global format library if name
is provided.
finput(
  ...,
  name = NULL,
  target_type = "numeric",
  missing_value = NA,
  ignore_case = FALSE
)
... |
Named arguments defining label-value mappings (reverse of |
name |
Character. Optional name for the invalue format. If provided, the invalue is automatically registered in the global format library. |
target_type |
Character. Type to convert to: |
missing_value |
Value to use for missing inputs (default: |
ignore_case |
Logical. If |
An object of class "ks_invalue" containing the invalue definition.
The object is also stored in the format library if name is given.
# Convert text labels to numeric codes
finput(
  "Male" = 1,
  "Female" = 2,
  name = "sex_inv"
)

# Apply using finputn (numeric invalue by name)
finputn(c("Male", "Female", "Unknown"), "sex_inv")
# [1]  1  2 NA
fclear()

# From a named vector
finput(c(Male = 1, Female = 2), name = "sex_inv2")
finputn(c("Male", "Female"), "sex_inv2")
# [1] 1 2
fclear()
Looks up an INVALUE format by name from the global format library and applies it to convert labels to character values.
finputc(x, invalue_name)
x |
Character vector of labels to convert |
invalue_name |
Character. Name of a registered INVALUE format. |
Character vector
# Bidirectional: use finputc for reverse direction
fnew_bid(
  "A" = "Active",
  "I" = "Inactive",
  "P" = "Pending",
  name = "status"
)

# Forward: code -> label
fputc(c("A", "I", "P"), "status")
# [1] "Active"   "Inactive" "Pending"

# Reverse: label -> code
finputc(c("Active", "Pending", "Inactive"), "status_inv")
# [1] "A" "P" "I"
fclear()
Convenience wrapper around an INVALUE lookup that pastes multiple vectors together into a composite label before reverse lookup. Mirrors fputk() on the invalue side, for INVALUE formats built with composite labels such as fmap(paste(col1, col2, sep = "|"), codes).
finputk(..., invalue_name, sep = "|", na_as_string = FALSE)
... |
Vectors to paste together into a composite label. All vectors are recycled to a common length by paste(). |
invalue_name |
Character. Name of a registered INVALUE format. |
sep |
Separator inserted between the pasted components (default "|"). |
na_as_string |
If FALSE (default), an NA in any component propagates to the composite label (restored to NA_character_ after the paste() step) so the invalue's missing_value applies. If TRUE, the literal string "NA" produced by paste() is kept, which is useful when the invalue was built with composite labels via fmap(paste(..., sep = "|"), values). |
The output type is determined by the stored invalue's target_type (numeric / integer → numeric, character → character, logical → logical).
A vector whose type depends on the invalue's target_type.
finput(), finputn(), finputc(), fputk()
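The effect of na_as_string comes down to how base R's paste() treats NA. The sketch below uses only base R, with a plain named vector standing in for a registered invalue (an assumption made for illustration), to show both behaviours side by side.

```r
a <- c("A", NA, "B")
b <- c(1, 2, NA)

# paste() turns NA into the text "NA"
key <- paste(a, b, sep = "|")
key

# Default-style handling: restore NA wherever any component was NA
key_na <- key
key_na[is.na(a) | is.na(b)] <- NA_character_

# Hypothetical lookup table standing in for a registered invalue
lookup <- c("A|1" = 10, "NA|2" = 20)

unname(lookup[key_na])  # NA components stay missing
unname(lookup[key])     # the literal "NA|2" key can match
```

With the NA restored, the second element falls through to the missing-value path; with the pasted string kept, it matches the "NA|2" entry.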
# Build an INVALUE keyed on two columns via paste()
finput(
  fmap(paste(c("A", "A", "B"), c(1, 2, 1), sep = "|"), c(10, 20, 30)),
  name = "ab_inv"
)
finputk(c("A", "A", "B"), c(1, 2, 1), invalue_name = "ab_inv")
# -> 10 20 30
fclear()
Looks up a numeric INVALUE format by name from the global format library and applies it to convert labels to numeric values.
finputn(x, invalue_name)
x |
Character vector of labels to convert |
invalue_name |
Character. Name of a registered INVALUE format. |
Numeric vector
# Create numeric invalue and apply
finput(
  "Male" = 1,
  "Female" = 2,
  name = "sex_inv"
)
finputn(c("Male", "Female", "Male", "Unknown", "Female"), "sex_inv")
# [1]  1  2  1 NA  2
fclear()

# Parse invalue from text and apply
fparse(text = '
INVALUE race_inv
  "White" = 1
  "Black" = 2
  "Asian" = 3
;
')
finputn(c("White", "Black"), "race_inv")
# [1] 1 2
fclear()
Returns a character vector of all format and invalue names currently registered in the global format library.
flist()
A character vector of registered format names, sorted alphabetically.
Returns character(0) if the library is empty.
fnew("M" = "Male", "F" = "Female", name = "sex")
flist()
fclear()
Convenience helper for building data-driven formats with fnew.
Returns a named vector (or list) with class "ks_fmap" that signals
fnew() to use the natural direction: names are input keys,
values are output labels/objects — regardless of the format type.
fmap(keys, values)
keys |
Character vector of input keys (lookup values). |
values |
Vector of output labels or objects (character, numeric, Date, POSIXct, logical, etc.). |
Without fmap(), fnew() reverses named vectors for character
and numeric types (the factor() convention c(Label = "Code")).
Wrapping your data in fmap() suppresses this reversal, so
fmap(keys, values) works identically for character, numeric, Date,
POSIXct, and logical formats.
A named vector (or list, for non-atomic values) with class
c("ks_fmap", <original class>). Names are keys, values are
values.
fnew for format creation.
# Character lookup: keys -> labels
fmap(c("M", "F"), c("Male", "Female")) |> fnew(name = "sex")
fput(c("M", "F"), "sex")
fclear()

# Date lookup from a data frame
ids <- c("SUBJ-001", "SUBJ-002")
dates <- as.Date(c("2023-03-09", "2024-08-13"))
fmap(ids, dates) |> fnew(type = "Date", name = "icdtn")
fput("SUBJ-001", "icdtn")
fclear()
Construct a ks_fmap-classed named character vector whose names
encode numeric / Date / POSIXct range bounds and whose values are the
corresponding labels. The result is intended to be passed to
fnew as a single positional argument (it suppresses the
default name reversal).
fmap_ranges(
  low,
  high,
  label,
  inc_low = TRUE,
  inc_high = FALSE,
  date_format = NULL
)
low, high
|
Numeric, |
label |
Character vector of labels (same length as |
inc_low, inc_high
|
Logical, length 1 or |
date_format |
Optional strptime format string used when formatting
|
Bounds are formatted as ISO 8601: "%Y-%m-%d" for Date,
"%Y-%m-%d %H:%M:%S" (UTC) for POSIXct. Override with
date_format if needed.
A ks_fmap object (named character vector) suitable for
passing to fnew().
rng <- fmap_ranges(
  low = c(0, 18, 65),
  high = c(18, 65, Inf),
  label = c("Child", "Adult", "Senior"),
  inc_high = c(FALSE, FALSE, TRUE)
)
fnew(rng, type = "numeric", name = "age_groups")
fput(c(5, 25, 90), "age_groups")
fclear()
Companion to fmap_ranges for the stratified_range
format type. Each row pairs a stratum (e.g. study arm, subject id, or a
composite key produced by fputk()) with a numeric / Date /
POSIXct range and a label. The returned ks_fmap vector carries
the chosen sep as an attribute so that
fnew(type = "stratified_range") picks it up automatically.
fmap_strata(
  stratum,
  low,
  high,
  label,
  inc_low = TRUE,
  inc_high = FALSE,
  sep = "|",
  date_format = NULL
)
stratum |
Character vector of stratum identifiers. |
low, high
|
Range bounds. See |
label |
Character vector of labels. |
inc_low, inc_high
|
Logical, length 1 or |
sep |
Separator inserted between stratum and range key. Must match
the |
date_format |
Optional strptime format string. |
A ks_fmap object with an attached "strata_sep"
attribute.
visits <- fmap_strata(
  stratum = c("ARM_A", "ARM_A", "ARM_B"),
  low = c(0, 7, 0),
  high = c(7, 14, 10),
  label = c("Baseline", "Week 1", "Baseline")
)
fnew(visits, type = "stratified_range", range_subtype = "numeric", name = "visit_window")
fputk(c("ARM_A", "ARM_B"), c(3, 5), format = "visit_window")
fclear()
Given a vector of values that match the labels of a range-based
format, returns the corresponding low / high bounds (and
inclusivity flags) for each input. Useful for reconstructing the
underlying range from a coded value.
fmap_to_ranges(x, fmt)
x |
A vector of values to look up against the format's labels. Coerced to character before matching. |
fmt |
A |
For multilabel formats where the same label maps to several
ranges, only the first matching range is returned. For full
multi-match behaviour, call franges() directly and join
on label.
A data.frame with one row per element of x and
columns low, high, inc_low, inc_high.
Rows where the input does not match any range label contain NA.
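The difference between first-match lookup and a full join on label can be illustrated in base R. The range table below is hypothetical and merely stands in for whatever franges() returns; its real column layout is not shown in this document.

```r
# Hypothetical range table in which the label "2" appears twice
ranges <- data.frame(
  low      = c(0, 8, 30),
  high     = c(1, 22, 40),
  inc_low  = TRUE,
  inc_high = TRUE,
  label    = c("0", "2", "2"),
  stringsAsFactors = FALSE
)

x <- c(0, 2, 99)

# match() returns the position of the first matching label only;
# unmatched inputs index with NA and yield a row of NA values
first <- ranges[match(as.character(x), ranges$label),
                c("low", "high", "inc_low", "inc_high")]
rownames(first) <- NULL

# A left join keeps every range that shares the label
all_matches <- merge(data.frame(label = as.character(x)), ranges,
                     by = "label", all.x = TRUE)

first
all_matches
```

The first result has exactly one row per input, while the join returns an extra row for the repeated label.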
fparse(text = '
VALUE visit_ther (numeric)
  [LOW, 1] = 0
  [ 8, 22] = 2
  [22, 36] = 4
  [37, 50] = 6
;
')
fmap_to_ranges(c(0, 2, 4, 6), "visit_ther")
fclear()
Creates a format object that maps values to labels, similar to 'SAS' PROC FORMAT.
Supports discrete value mapping, ranges, and special handling of missing values.
The format is automatically stored in the global format library if name
is provided.
fnew(
  ...,
  name = NULL,
  type = "auto",
  default = NULL,
  multilabel = FALSE,
  ignore_case = FALSE,
  date_format = NULL,
  range_subtype = c("numeric", "date", "datetime"),
  strata_sep = "|",
  verbose = FALSE
)
... |
Named arguments defining value-label mappings, or one or more
named vectors/lists using the R convention c(Label = "Code").
Named-vector reversal: Named vectors use the R idiom where names are labels and values are codes,
which is the reverse of the explicit key = "label" argument form. For character and numeric formats, named
vectors are automatically reversed so that the format maps code -> label.
Data-driven formats: For formats built programmatically from
data, wrap your data in fmap(keys, values). |
name |
Character. Optional name for the format. If provided, the format is automatically registered in the global format library. |
type |
Character. Type of format: |
default |
Character. Default label for unmatched values (overrides .other) |
multilabel |
Logical. If |
ignore_case |
Logical. If |
date_format |
Character. Optional strptime-style format string used
when parsing date/datetime range keys (e.g. |
range_subtype |
Character. For |
strata_sep |
Character. For |
verbose |
Logical. If |
Special directives:
.missing: Label for NA, NULL, NaN values
.other: Label for values not matching any rule
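As a rough mental model of the two directives, the base R sketch below applies a discrete mapping and then fills in the fallback labels. apply_map() is a hypothetical helper written for this illustration and says nothing about how the package implements lookup.

```r
# Hypothetical helper illustrating the two directives; not the package's code.
apply_map <- function(x, map, missing = NA_character_, other = NA_character_) {
  out <- unname(map[as.character(x)])  # NA where no rule matches
  out[is.na(out)] <- other             # fallback for unmatched values
  out[is.na(x)] <- missing             # missing inputs take precedence
  out
}

sex_map <- c(M = "Male", F = "Female")
res <- apply_map(c("M", "F", NA, "X"), sex_map,
                 missing = "Unknown", other = "Other Gender")
res
# [1] "Male"         "Female"       "Unknown"      "Other Gender"
```

The missing-value label is assigned last so that an NA input is never reported as merely unmatched.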
Named-vector direction (reverse convention):
When a named vector or list is passed as an unnamed argument (e.g.,
fnew(c(Male = "M"))), the direction of the name-to-value mapping
depends on the output type:
For character / numeric types, names are labels and
values are codes. The pairs are reversed internally so that the
format maps code -> label. This follows the standard R idiom used
by factor(), where c(Label = "Code").
For value types (Date, POSIXct,
logical), names are input keys and values are the native R
objects returned by the format. No reversal is applied, because
non-character objects cannot be used as vector names.
This means the same data may need to be arranged differently
depending on the target type. To avoid this inconsistency for data-driven
formats, use fmap(keys, values) which works identically
for all types:
fnew(fmap(ids, dates), type = "Date")
fnew(fmap(ids, date_strings), type = "character")
When in doubt, use explicit key = "label" arguments — these are
never reversed regardless of type.
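The reversal itself is easy to picture in base R. The snippet below is a sketch of the idea only, not the package's code: it swaps names and values of a label-named vector to obtain a code-to-label lookup, and shows why the same trick is unavailable when the values are Date objects.

```r
labelled <- c(Male = "M", Female = "F")  # names are labels, values are codes

# Swap names and values to get a code -> label lookup
code_to_label <- setNames(names(labelled), unname(labelled))
code_to_label[c("F", "M")]

# Dates cannot serve as names, so for value types the keys must
# already be the names and the Date objects the values
visit_dates <- setNames(as.Date("2023-03-09"), "SUBJ-001")
names(visit_dates)
class(visit_dates)
```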
Expression labels: If a label contains .x1, .x2, etc.,
it is treated as an R expression that is evaluated at apply-time. Extra arguments
are passed positionally via ... in fput:
stat_fmt <- fnew("n" = "sprintf('%s', .x1)",
"pct" = "sprintf('%.1f%%', .x1 * 100)")
fput(c("n", "pct"), stat_fmt, c(42, 0.15))
# Returns: "42" "15.0%"
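One way to picture how positional placeholders could be bound is shown below. eval_label() is a hypothetical helper built on base R's parse() and eval(); it evaluates a single label for a single set of arguments and is not claimed to match the package's evaluator.

```r
# Hypothetical helper; the package's real evaluator may differ.
eval_label <- function(label, ...) {
  args <- list(...)
  if (length(args) > 0) names(args) <- paste0(".x", seq_along(args))
  # Evaluate the label text with .x1, .x2, ... bound to the extra arguments;
  # other names (functions, variables) are looked up in the caller
  eval(parse(text = label), envir = args, enclos = parent.frame())
}

pct   <- eval_label("sprintf('%.1f%%', .x1 * 100)", 0.15)
combo <- eval_label("paste0(.x1, ' (', .x2, ')')", 42, "n")
pct
combo
```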
An object of class "ks_format" containing the format definition.
The object is also stored in the format library if name is given.
# Discrete value format (auto-stored as "sex")
fnew(
  "M" = "Male",
  "F" = "Female",
  .missing = "Unknown",
  .other = "Other Gender",
  name = "sex"
)

# Apply immediately
fput(c("M", "F", NA, "X"), "sex")
# [1] "Male" "Female" "Unknown" "Other Gender"
fclear()

# Multilabel format: a value can match multiple labels
fnew(
  "0,5,TRUE,TRUE" = "Infant",
  "6,11,TRUE,TRUE" = "Child",
  "12,17,TRUE,TRUE" = "Adolescent",
  "0,17,TRUE,TRUE" = "Pediatric",
  "18,64,TRUE,TRUE" = "Adult",
  "65,Inf,TRUE,TRUE" = "Elderly",
  "18,Inf,TRUE,TRUE" = "Non-Pediatric",
  name = "age_categories", type = "numeric", multilabel = TRUE
)

# fput returns first match; fput_all returns all matches
fput(c(3, 14, 25, 70), "age_categories")
fput_all(c(3, 14, 25, 70), "age_categories")
fclear()

# From a named vector (Label = Code convention)
sex_vec <- fnew(c(Male = "M", Female = "F"), .missing = "Unknown", name = "sex_vec")
fput(c("M", "F", NA), sex_vec)
# [1] "Male" "Female" "Unknown"
fclear()
Creates both a format and its corresponding invalue for bidirectional conversion.
Both are automatically stored in the global format library if name
is provided.
fnew_bid(..., name = NULL, type = "auto")
... |
Named arguments for format mappings |
name |
Character. Base name for both formats. The invalue will be
named |
type |
Character. Format type |
List with format (ks_format) and invalue (ks_invalue)
components.
# Bidirectional status format
status_bi <- fnew_bid(
  "A" = "Active",
  "I" = "Inactive",
  "P" = "Pending",
  name = "status"
)

# Forward: code -> label
fputc(c("A", "I", "P", "A"), "status")
# [1] "Active" "Inactive" "Pending" "Active"

# Reverse: label -> code
finputc(c("Active", "Pending", "Inactive"), "status_inv")
# [1] "A" "P" "I"
fclear()

# From a named vector (Label = Code convention, same as fnew)
fnew_bid(c(Male = "M", Female = "F"), name = "sex_bid")
fputc(c("M", "F"), "sex_bid")
finputc(c("Male", "Female"), "sex_bid_inv")
fclear()
Creates a format object for date, time, or datetime values using SAS format
names or custom R strftime patterns. The format is automatically
registered in the global format library.
fnew_date(pattern, name = NULL, type = "auto", .missing = NULL)
pattern |
Character. Either a SAS format name (e.g., |
name |
Character. Name to register the format under. Defaults to the SAS format name (with period) or the pattern itself. |
type |
Character. Type of format: |
.missing |
Character. Label for missing values (NA). Default |
SAS format names are resolved automatically:
Date: DATE9., DDMMYY10., MMDDYY10., YYMMDD10., MONYY7., YEAR4., WEEKDATE., WORDDATE., etc.
Time: TIME8., TIME5., HHMM., HOUR., MMSS.
Datetime: DATETIME20., DATETIME13., etc.
Numeric input is converted using the R epoch ("1970-01-01"):
Dates: numeric values are interpreted as days since 1970-01-01
Datetimes: numeric values are interpreted as seconds since 1970-01-01
Times: always treated as seconds since midnight
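The three conventions above can be checked with base R alone. The snippet is a sketch independent of ksformat; the last step relies on the fact that SAS itself counts dates from 1960-01-01, so raw SAS day counts would presumably need shifting before being treated as R-epoch values (whether and where to do that in a given workflow is left as an assumption).

```r
origin <- as.Date("1970-01-01")

# Dates: days since 1970-01-01
r_days <- as.numeric(as.Date("2025-01-01"))
back   <- as.Date(r_days, origin = origin)

# Datetimes: seconds since 1970-01-01 (UTC used here for reproducibility)
one_day <- as.POSIXct(86400, origin = "1970-01-01", tz = "UTC")

# Times: seconds since midnight, formatted by hand as HH:MM:SS
secs  <- c(0, 3600, 45000, 86399)
times <- sprintf("%02d:%02d:%02d",
                 secs %/% 3600, (secs %% 3600) %/% 60, secs %% 60)

# Offset in days between the SAS origin (1960-01-01) and the R origin
sas_offset <- as.numeric(origin - as.Date("1960-01-01"))

r_days
times
sas_offset
```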
A ks_format object with date/time type, registered in the library.
# Use a SAS format name
fnew_date("DATE9.", name = "mydate")
fput(as.Date("2020-01-01"), "mydate")
# [1] "01JAN2020"

# Use directly without pre-creating
fputn(as.Date("2020-06-15"), "MMDDYY10.")
# [1] "06/15/2020"

# Custom strftime pattern (e.g., Russian style: DD.MM.YYYY)
fnew_date("%d.%m.%Y", name = "ru_date", type = "date")
fput(as.Date(c("1990-03-25", "1985-11-03", "2000-07-14")), "ru_date")

# Custom format with missing value label
fnew_date("MMDDYY10.", name = "us_date", .missing = "NO DATE")
fput(c(as.Date("2025-01-01"), NA, as.Date("2025-12-31")), "us_date")
# [1] "01/01/2025" "NO DATE" "12/31/2025"

# Numeric dates (days since 1970-01-01, R epoch)
r_days <- as.numeric(as.Date("2025-01-01"))
fputn(r_days, "DATE9.")

# Multiple SAS date formats applied directly
today <- Sys.Date()
fputn(today, "DATE9.")
fputn(today, "MMDDYY10.")
fputn(today, "YYMMDD10.")
fputn(today, "MONYY7.")
fputn(today, "WORDDATE.")
fputn(today, "QTR.")

# Time formatting (seconds since midnight)
fputn(c(0, 3600, 45000, 86399), "TIME8.")
fputn(c(0, 3600, 45000), "HHMM.")

# Datetime formatting
now <- Sys.time()
fputn(now, "DATETIME20.")
fputn(now, "DTDATE.")
fputn(now, "DTYYMMDD.")
fclear()
Returns a format or invalue object by name. Used when you need the object
(e.g. for fput_df or fexport) rather than
applying by name with fput, fputn, or
fputc.
format_get(name)
name |
Character. Name of a registered format or invalue. |
A ks_format or ks_invalue object.
fnew("M" = "Male", "F" = "Female", name = "sex")
sex_fmt <- format_get("sex")
fput_df(data.frame(sex = c("M", "F")), sex = sex_fmt)
fclear()
Opens an interactive Shiny app for browsing and managing objects currently registered in the global ksformat format library.
format_library_app(port = getOption("shiny.port"), launch.browser = TRUE)
port |
Integer or NULL. Port passed to |
launch.browser |
Logical. Passed to |
The app displays both ks_format (VALUE) and ks_invalue
(INVALUE) objects, supports filtering and name search, shows object details
with a formatted mapping table, and provides management actions to remove one
object, clear the full library, or quit the app.
Invisibly returns NULL.
## Not run:
if (interactive() && requireNamespace("shiny", quietly = TRUE)) {
  fnew("M" = "Male", "F" = "Female", name = "sex")
  finput("Male" = 1, "Female" = 2, name = "sex_inv")
  format_library_app()
}
## End(Not run)
Reads format definitions written in a human-friendly 'SAS'-like syntax
and returns a list of ks_format and/or ks_invalue objects.
All parsed formats are automatically stored in the global format library.
fparse(text = NULL, file = NULL, verbose = FALSE)
text |
Character string or character vector containing format definitions. If a character vector, lines are concatenated with newlines. |
file |
Path to a text file containing format definitions.
Exactly one of |
verbose |
Logical. If |
The syntax supports two block types:
VALUE blocks define formats (value -> label):
VALUE name (type)
  "value1" = "Label 1"
  "value2" = "Label 2"
  [low, high) = "Range Label (half-open)"
  (low, high] = "Range Label (open-low, closed-high)"
  .missing = "Missing Label"
  .other = "Other Label"
;
INVALUE blocks define reverse formats (label -> numeric value):
INVALUE name
  "Label 1" = 1
  "Label 2" = 2
;
Syntax rules:
Blocks start with VALUE or INVALUE keyword and end with ;
The type in parentheses is optional; defaults to "auto" for VALUE,
"numeric" for INVALUE
Values can be quoted or unquoted
Ranges use interval notation with explicit bounds
Legacy range syntax low - high is also supported
Special range keywords: LOW (-Inf) and HIGH (Inf)
.missing and .other are special directives
Lines starting with /*, *, //, or # are comments
Block options:
Comma-separated options can be placed inside the parentheses after the type:
nocase — enables case-insensitive key matching (equivalent to
ignore_case = TRUE in fnew).
multilabel — allows overlapping ranges where a single value
matches multiple labels (used with fput_all).
Options can be combined: VALUE name (character, nocase, multilabel).
A named list of ks_format and/or ks_invalue objects.
Names correspond to the format names defined in the text.
All formats are automatically registered in the global format library.
# Parse multiple format definitions from text
fparse(text = '
VALUE sex (character)
  "M" = "Male"
  "F" = "Female"
  .missing = "Unknown"
;

VALUE age (numeric)
  [0, 18) = "Child"
  [18, 65) = "Adult"
  [65, HIGH] = "Senior"
  .missing = "Age Unknown"
;

// Invalue block
INVALUE race_inv
  "White" = 1
  "Black" = 2
  "Asian" = 3
;
')
fput(c("M", "F", NA), "sex")
fputn(c(5, 25, 70, NA), "age")
finputn(c("White", "Black"), "race_inv")
flist()
fprint()
fclear()

# Parse date/time/datetime format definitions
fparse(text = '
VALUE enrldt (date)
  pattern = "DATE9."
  .missing = "Not Enrolled"
;

VALUE visit_time (time)
  pattern = "TIME8."
;

VALUE stamp (datetime)
  pattern = "DATETIME20."
;
')
fput(as.Date("2025-03-01"), "enrldt")
fput(36000, "visit_time")
fput(as.POSIXct("2025-03-01 10:00:00", tz = "UTC"), "stamp")
fclear()

# Case-insensitive format (nocase option)
fparse(text = '
VALUE yesno (character, nocase)
  "Y" = "Yes"
  "N" = "No"
  .other = "Unknown"
;
')
fput(c("y", "N", "YES"), "yesno")
# [1] "Yes" "No" "Unknown"
fclear()

# Parse multilabel format
fparse(text = '
VALUE risk (numeric, multilabel)
  [0, 3] = "Low Risk"
  [0, 7] = "Monitored"
  (3, 7] = "Medium Risk"
  (7, 10] = "High Risk"
;
')
fput_all(c(2, 5, 9), "risk")
fclear()
Displays format information from the global format library. When called without arguments, lists all registered format names. When called with a name, displays the full definition of that format.
fprint(name = NULL)
name |
Character. Optional name of a specific format to display.
If |
Invisible NULL. This function is for display only.
See also flist for a programmatic alternative that returns
a character vector of registered names.
fnew("M" = "Male", "F" = "Female", name = "sex")
flist()        # character vector of names
fprint()       # list all formats
fprint("sex")  # show specific format
fclear()
Applies a format definition to a vector of values, returning formatted labels. Properly handles NA, NULL, NaN, and other missing values.
fput(x, format, ..., keep_na = FALSE)
x |
Vector of values to format |
format |
A |
... |
Additional arguments for expression labels. Positional arguments
are mapped to |
keep_na |
Logical. If TRUE, preserve NA in output instead of applying missing label. |
The function resolves each input value in the following order, starting with missing values:
NA, NULL, NaN -> Uses format's missing_label if defined
Exact matches -> Uses defined value-label mapping
Range matches (for numeric) -> Uses range label
No match -> Uses format's other_label or returns original value
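The resolution order above can be sketched in base R. This is an illustrative re-implementation under stated assumptions, not the package's code: `lookup_label`, its arguments, and the `score_ranges` data frame are hypothetical names introduced for the example.

```r
# Hypothetical helper illustrating the documented resolution order.
# `discrete` is a named character vector (value -> label); `ranges` is a
# data frame with columns low, high, inc_low, inc_high, label.
lookup_label <- function(x, discrete = character(0), ranges = NULL,
                         missing_label = NULL, other_label = NULL) {
  vapply(seq_along(x), function(i) {
    v <- x[i]
    # 1. Missing values (NA, NaN) use the missing label when one is defined
    if (is.na(v)) {
      return(if (is.null(missing_label)) NA_character_ else missing_label)
    }
    # 2. Exact matches on the discrete mapping
    key <- as.character(v)
    if (key %in% names(discrete)) return(unname(discrete[key]))
    # 3. Range matches (numeric input only); the first matching range wins
    if (is.numeric(v) && !is.null(ranges)) {
      above <- ifelse(ranges$inc_low, v >= ranges$low, v > ranges$low)
      below <- ifelse(ranges$inc_high, v <= ranges$high, v < ranges$high)
      hit <- which(above & below)
      if (length(hit) > 0) return(ranges$label[hit[1]])
    }
    # 4. No match: fall back to the other label, else the original value
    if (is.null(other_label)) key else other_label
  }, character(1))
}

# Ranges (0, 50] and (50, 100], as in the "score" example
score_ranges <- data.frame(
  low = c(0, 50), high = c(50, 100),
  inc_low = c(FALSE, FALSE), inc_high = c(TRUE, TRUE),
  label = c("Low", "High"), stringsAsFactors = FALSE
)
res <- lookup_label(c(0, 1, 50, 51, 100, 101, NA), ranges = score_ranges,
                    missing_label = "Missing", other_label = "Out of range")
res
# "Out of range" "Low" "Low" "High" "High" "Out of range" "Missing"
```

Checking missing values first means an NA never reaches the exact or range comparisons, where it would otherwise propagate as NA.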
Expression labels: If a label string contains .x1, .x2,
etc., it is evaluated as an R expression at apply-time. Extra data is passed
as positional arguments:
stat_fmt <- fnew("n" = "sprintf('%s', .x1)",
"pct" = "sprintf('%.1f%%', .x1 * 100)")
fput(c("n", "pct"), stat_fmt, c(42, 0.15))
# Returns: "42" "15.0%"
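One way such apply-time evaluation could work is sketched below in base R. How the package actually evaluates expression labels is not shown in this reference, so `eval_label` is a hypothetical helper and the use of `parse()`/`eval()` is an assumption made for illustration.

```r
# Hypothetical sketch: evaluate each label string as an R expression,
# binding the matching element of the extra argument to .x1.
eval_label <- function(label, x1) {
  vapply(seq_along(label), function(i) {
    # A fresh environment holding only .x1; base functions remain visible
    env <- list2env(list(.x1 = x1[i]), parent = baseenv())
    as.character(eval(parse(text = label[i]), envir = env))
  }, character(1))
}

labels <- c("sprintf('%s', .x1)", "sprintf('%.1f%%', .x1 * 100)")
out <- eval_label(labels, c(42, 0.15))
out
# "42" "15.0%"
```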
Case-insensitive matching: When a format has ignore_case = TRUE,
key matching is case-insensitive for character formats.
Character vector with formatted labels
# Basic discrete formatting
fnew("M" = "Male", "F" = "Female", .missing = "Unknown", name = "sex")
fput(c("M", "F", NA, "X"), "sex")
# [1] "Male" "Female" "Unknown" "X"

# Preserve NA instead of applying missing label
sex_f <- fnew("M" = "Male", "F" = "Female", .missing = "Unknown")
fput(c("M", "F", NA), sex_f, keep_na = TRUE)
# [1] "Male" "Female" NA

# Numeric range formatting
fparse(text = '
VALUE score (numeric)
  (0, 50] = "Low"
  (50, 100] = "High"
  .other = "Out of range"
;
')
fput(c(0, 1, 50, 51, 100, 101), "score")
# [1] "Out of range" "Low" "Low" "High" "High" "Out of range"
fclear()
For multilabel formats, returns all matching labels for each input value.
Regular fput returns only the first match; this function
returns all matches as a list of character vectors.
fput_all(x, format, ..., keep_na = FALSE)
x |
Vector of values to format |
format |
A |
... |
Additional arguments for expression labels (mapped to |
keep_na |
Logical. If TRUE, preserve NA in output. |
A list of character vectors. Each element contains all matching labels for the corresponding input value.
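The difference between a first-match and an all-matches lookup can be sketched in base R. `all_labels` and `age_ranges` are hypothetical names for this illustration. The sketch uses closed ranges only and skips missing-value handling, so it is not a substitute for fput_all.

```r
# Hypothetical helper: for each value, return every label whose closed
# range [low, high] contains it, as a list of character vectors.
all_labels <- function(x, ranges) {
  lapply(x, function(v) {
    ranges$label[v >= ranges$low & v <= ranges$high]
  })
}

# Overlapping ranges: "Minor" spans the three younger groups
age_ranges <- data.frame(
  low   = c(0, 6, 12, 0, 18, 65),
  high  = c(5, 11, 17, 17, 64, Inf),
  label = c("Infant", "Child", "Teen", "Minor", "Adult", "Senior"),
  stringsAsFactors = FALSE
)
matches <- all_labels(c(3, 15, 25), age_ranges)
matches
# [[1]] "Infant" "Minor"
# [[2]] "Teen" "Minor"
# [[3]] "Adult"
```

Taking only the first element of each list entry reproduces the first-match behavior described for fput.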
# Basic multilabel: a value can match multiple labels
age_ml <- fnew(
  "0,5,TRUE,TRUE" = "Infant",
  "6,11,TRUE,TRUE" = "Child",
  "12,17,TRUE,TRUE" = "Teen",
  "0,17,TRUE,TRUE" = "Minor",
  "18,64,TRUE,TRUE" = "Adult",
  "65,Inf,TRUE,TRUE" = "Senior",
  name = "age_ml", type = "numeric", multilabel = TRUE
)
fput_all(c(3, 15, 25), age_ml)
# [[1]] "Infant" "Minor"
# [[2]] "Teen" "Minor"
# [[3]] "Adult"

# Multilabel with .missing and .other
fnew(
  "0,100,TRUE,TRUE" = "Valid Score",
  "0,49,TRUE,TRUE" = "Below Average",
  "50,100,TRUE,TRUE" = "Above Average",
  "90,100,TRUE,TRUE" = "Excellent",
  .missing = "No Score",
  .other = "Out of Range",
  name = "score_ml", type = "numeric", multilabel = TRUE
)
fput_all(c(95, 45, NA, 150), "score_ml")
# [[1]] "Valid Score" "Above Average" "Excellent"
# [[2]] "Valid Score" "Below Average"
# [[3]] "No Score"
# [[4]] "Out of Range"

# Parse multilabel from text
fparse(text = '
VALUE risk (numeric, multilabel)
  [0, 3] = "Low Risk"
  [0, 7] = "Monitored"
  (3, 7] = "Medium Risk"
  (7, 10] = "High Risk"
;
')
fput_all(c(2, 5, 9), "risk")
# [[1]] "Low Risk" "Monitored"
# [[2]] "Monitored" "Medium Risk"
# [[3]] "High Risk"
fclear()
Applies formats to one or more columns in a data frame.
fput_df(data, ..., suffix = "_fmt", replace = FALSE)
data |
Data frame |
... |
Named format specifications: |
suffix |
Character. Suffix to add to formatted column names (default: "_fmt") |
replace |
Logical. If TRUE, replace original columns; if FALSE, create new columns |
Data frame with formatted columns
# Apply formats to multiple columns
df <- data.frame(
  id = 1:6,
  sex = c("M", "F", "M", "F", NA, "X"),
  age = c(15, 25, 45, 70, 35, NA),
  stringsAsFactors = FALSE
)
sex_f <- fnew("M" = "Male", "F" = "Female", .missing = "Unknown")
fparse(text = '
VALUE age (numeric)
  [0, 18) = "Child"
  [18, 65) = "Adult"
  [65, HIGH] = "Senior"
  .missing = "Age Unknown"
;
')
age_f <- format_get("age")
fput_df(df, sex = sex_f, age = age_f, suffix = "_label")

# Date formatting in data frames
patients <- data.frame(
  id = 1:4,
  visit_date = as.Date(c("2025-01-10", "2025-02-15", "2025-03-20", NA)),
  stringsAsFactors = FALSE
)
visit_fmt <- fnew_date("DATE9.", name = "visit_fmt", .missing = "NOT RECORDED")
fput_df(patients, visit_date = visit_fmt)
fclear()
Looks up a character VALUE format by name from the global format library and applies it to the input vector.
fputc(x, format_name, ...)
x |
Character vector of values to format |
format_name |
Character. Name of a registered character format,
or a character vector of format names (same length as |
... |
Additional arguments passed to |
Character vector with formatted labels
# Apply character format by name
fnew("M" = "Male", "F" = "Female", name = "sex")
fputc(c("M", "F"), "sex")
# [1] "Male" "Female"

# Bidirectional: forward direction
fnew_bid(
  "A" = "Active",
  "I" = "Inactive",
  "P" = "Pending",
  name = "status"
)
fputc(c("A", "I", "P", "A"), "status")
# [1] "Active" "Inactive" "Pending" "Active"
fclear()
Convenience wrapper around fput() that pastes multiple vectors together into a composite key before lookup. Useful when a format is keyed on the combination of several columns (e.g., USUBJID|VISITNUM).
fputk(..., format, sep = "|", keep_na = FALSE, na_as_string = FALSE)
... |
Vectors to paste together into a composite key. All vectors are recycled to a common length by paste(). |
format |
A ks_format object or a registered format name (character string). |
sep |
Separator inserted between the pasted components (default "|"). |
keep_na |
If TRUE, NA inputs remain NA in the output instead of being mapped via .missing. Passed through to fput(). |
na_as_string |
If FALSE (default), an NA in any component propagates to the composite key (restored to NA_character_ after the paste() step) so that fput() can apply .missing handling. If TRUE, the literal string "NA" produced by paste() is kept. This is useful when the format was built with composite keys via fmap(paste(..., sep = "|"), values): paste() converts NA to "NA" on both sides, so the round-trip lookup matches. |
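The paste() behavior behind na_as_string can be seen in base R alone. The sketch below uses hypothetical variable names (`subj`, `param`, `unit`, `key`, `key_default`) and only mimics the two documented key-building modes; it does not call the package.

```r
subj  <- c("CHEM", "COAG")
param <- c("ALB", "INR")
unit  <- c("g/L", NA)

# paste() turns NA into the literal string "NA", so the composite key is
# never NA. This mirrors the documented na_as_string = TRUE mode.
key <- paste(subj, param, unit, sep = "|")
key
# "CHEM|ALB|g/L" "COAG|INR|NA"

# Restoring NA whenever any component is missing mirrors the documented
# default (na_as_string = FALSE).
any_na <- is.na(subj) | is.na(param) | is.na(unit)
key_default <- ifelse(any_na, NA_character_, key)
key_default
# "CHEM|ALB|g/L" NA
```

A format whose keys were built with the same paste() call contains "COAG|INR|NA" as an ordinary key, which is why the lookup only matches when the literal string is kept.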
A character vector of formatted labels, the same length as the (recycled) input vectors.
See also: fput(), fputn(), fputc(), finputk()
# Build a lookup keyed on two columns
fnew(
  "A|1" = "2025-01-15",
  "A|2" = "2025-02-20",
  "B|1" = "2025-03-10",
  .other = "NOT FOUND",
  name = "visit_date", type = "character"
)
subj <- c("A", "A", "B", "B")
visit <- c(1, 2, 1, 3)
fputk(subj, visit, format = "visit_date")
# -> "2025-01-15" "2025-02-20" "2025-03-10" "NOT FOUND"
fclear()

# Composite key with NA components matching a paste()-built format
fnew(
  fmap(
    paste(c("CHEM", "COAG"), c("ALB", "INR"), c("g/L", NA), sep = "|"),
    c("ALB", "INR")
  ),
  name = "lb_param", type = "character"
)
fputk(c("CHEM", "COAG"), c("ALB", "INR"), c("g/L", NA),
      format = "lb_param", na_as_string = TRUE)
# -> "ALB" "INR"
fclear()
Looks up a numeric VALUE format by name from the global format library and applies it to the input vector.
fputn(x, format_name, ...)
x |
Numeric vector of values to format |
format_name |
Character. Name of a registered numeric format,
or a character vector of format names (same length as |
... |
Additional arguments passed to |
Character vector with formatted labels
# Numeric range formatting
fparse(text = '
VALUE age (numeric)
  [0, 18) = "Child"
  [18, 65) = "Adult"
  [65, HIGH] = "Senior"
  .missing = "Age Unknown"
;
')
fputn(c(5, 25, 70, NA), "age")
# [1] "Child" "Adult" "Senior" "Age Unknown"

# SAS date format (auto-resolved, no pre-creation needed)
fputn(as.Date("2025-01-15"), "DATE9.")
# [1] "15JAN2025"

# Time format (seconds since midnight)
fputn(c(0, 3600, 45000), "TIME8.")
# [1] "00:00:00" "01:00:00" "12:30:00"
fclear()
Returns the range-based mappings of a ks_format object as a tidy
data frame. Discrete entries (plain values, .missing, .other)
are excluded.
franges(fmt)
fmt |
A |
Range keys are stored internally as strings such as
"0,18,TRUE,FALSE". franges() parses these keys back into
their numeric bounds and inclusivity flags, making it easy to inspect,
filter, or programmatically reuse range definitions.
Bounds parsed as HIGH or LOW appear as Inf and
-Inf respectively.
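Parsing keys of this shape can be sketched in base R. `parse_range_keys` is a hypothetical helper, not the package's internal parser. It assumes every key has exactly four comma-separated fields and that infinite bounds are stored as "Inf" or "-Inf".

```r
# Hypothetical parser for keys of the form "low,high,inc_low,inc_high".
parse_range_keys <- function(keys, labels) {
  parts <- strsplit(keys, ",", fixed = TRUE)
  # Pull the n-th field out of every split key
  field <- function(n) vapply(parts, `[`, character(1), n)
  data.frame(
    low      = as.numeric(field(1)),
    high     = as.numeric(field(2)),
    inc_low  = as.logical(field(3)),
    inc_high = as.logical(field(4)),
    label    = labels,
    stringsAsFactors = FALSE
  )
}

ranges_df <- parse_range_keys(
  c("0,18,TRUE,FALSE", "18,65,TRUE,FALSE", "65,Inf,TRUE,TRUE"),
  c("Child", "Adult", "Senior")
)
ranges_df

# Once parsed, ranges can be filtered like any data frame,
# e.g. keeping only those with an unbounded upper end
ranges_df[is.infinite(ranges_df$high), "label"]
# "Senior"
```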
A data.frame with columns low, high,
inc_low, inc_high, and label. Rows are returned in
the order ranges appear in the format. If the format has no range
entries, an empty data frame with the same columns is returned.
fparse(text = '
VALUE age (numeric)
  [0, 18) = "Child"
  [18, 65) = "Adult"
  [65, HIGH] = "Senior"
  .missing = "Unknown"
;
')
franges("age")
fclear()
Element-wise check for missing values including NA and NaN. Optionally treats empty strings as missing.
is_missing(x)
x |
Value to check |
Logical vector. NULL input returns logical(0).
is_missing(NA)             # TRUE
is_missing(NaN)            # TRUE
is_missing("")             # TRUE
is_missing("text")         # FALSE
is_missing(c(1, NA, NaN))  # FALSE TRUE TRUE
Opens the package cheat sheet in the default browser (HTML) or viewer (PDF).
The files are installed under system.file("doc", ..., package = "ksformat").
ksformat_cheatsheet(format = c("html", "pdf"))
format |
Character: |
Invisibly, the path to the opened file. If the file is not found, an error is thrown.
## Not run:
ksformat_cheatsheet()       # open HTML in browser
ksformat_cheatsheet("pdf")  # open PDF
## End(Not run)
Print Format Object
## S3 method for class 'ks_format'
print(x, ...)
x |
A ks_format object |
... |
Additional arguments (unused) |
The input x, returned invisibly.
Print Invalue Object
## S3 method for class 'ks_invalue'
print(x, ...)
x |
A ks_invalue object |
... |
Additional arguments (unused) |
The input x, returned invisibly.
Helper function to create range specifications for numeric formats.
range_spec(low, high, label, inc_low = TRUE, inc_high = FALSE)
low |
Numeric. Lower bound of the range. |
high |
Numeric. Upper bound of the range. |
label |
Character. Label for values in this range. |
inc_low |
Logical. If |
inc_high |
Logical. If |
By default, ranges are half-open: [low, high) — the lower bound is
included and the upper bound is excluded. This matches 'SAS' PROC FORMAT
range semantics and prevents overlap between adjacent ranges.
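A short base R sketch shows why half-open bounds keep adjacent ranges from overlapping. `in_range` is a hypothetical helper that reuses the inc_low/inc_high defaults documented above. It does not create range_spec objects.

```r
# Hypothetical membership test with the same defaults as range_spec():
# lower bound included, upper bound excluded.
in_range <- function(x, low, high, inc_low = TRUE, inc_high = FALSE) {
  lower_ok <- if (inc_low) x >= low else x > low
  upper_ok <- if (inc_high) x <= high else x < high
  lower_ok & upper_ok
}

ages <- c(17.9, 18, 64.9, 65)
in_range(ages, 0, 18)                     # TRUE FALSE FALSE FALSE
in_range(ages, 18, 65)                    # FALSE TRUE TRUE FALSE
in_range(ages, 65, Inf, inc_high = TRUE)  # FALSE FALSE FALSE TRUE
```

Each boundary value (18 and 65) belongs to exactly one range, so no value matches two adjacent labels and none falls into a gap.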
A range_spec object (list with low, high, label, inc_low, inc_high).
range_spec(0, 18, "Child")                      # [0, 18)
range_spec(18, 65, "Adult")                     # [18, 65)
range_spec(65, Inf, "Senior", inc_high = TRUE)  # [65, Inf]