---
title: "FAQ and Practical Gotchas in ksTFL"
author: "ksTFL Team"
output:
  rmarkdown::html_vignette:
    dev: pdf
    css: ksTFL-vignette.css
vignette: >
  %\VignetteIndexEntry{FAQ and Practical Gotchas in ksTFL}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options:
  markdown:
    wrap: 80
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(
  comment = "", collapse = TRUE, eval = FALSE,
  dev = "png",
  dev.args = if (capabilities("cairo")) list(type = "cairo") else NULL
)
options(pkgdown.internet = FALSE)
```
## Overview
This vignette collects short answers to the ksTFL questions that usually appear
after the first successful output: hidden helper columns, width recalculation,
span header levels, replay metadata, template precedence, Table of
Contents behavior, and practical `define_cols()` / `compute_cols()` recipes.
It is intentionally practical: it is written for readers who have already built
at least one spec and now need to debug or sharpen a workflow.
Related reading:
- [Getting Started](Getting_Started_with_ksTFL.html) for the pipeline and
object model.
- [Advanced StyleRows](Advanced_StyleRows.html),
[Column Width Management](Column_Width_Management.html), and
[Real Examples](Real_Examples_with_ksTFL.html) when a question turns into a
deeper workflow problem.
- [Font Management](Font_Management.html) for font discovery and fallback.
- [Rendering Pipeline](Rendering_Pipeline.html) for renderer internals.
------------------------------------------------------------------------
## Data and spec behavior
### 1. Why does `cols` not drop the other columns from my data?
Because `cols` is a presentation lens, not a data-mutation step. ksTFL keeps
the full input data inside the spec's shadow data so later `compute_cols()`
calls can still reference helper fields that never appear in the document.
### 2. Can I hide a column and still use it in `compute_cols()`?
Yes. This is the standard helper-column pattern: set `isVisible = FALSE` and
keep using that column in conditions or `value_from` arguments. The package
examples do this with fields such as `SECTION`, `SECTION_ID`, `MODELVAL`, and
`SOC_GROUP`.
```r
spec <- create_table(df) |>
  define_cols(FLAG, isVisible = FALSE) |>
  compute_cols(FLAG == "Y", c_style(c(PARAM, VALUE), styleRef = "flagged"))
```
### 3. Why can I not set `colWidth` on an invisible column?
Invisible columns are forced to width `"0.0cm"` and removed from width
recalculation entirely. If a column must reserve visual space, it is not truly
invisible and should stay visible.
### 4. Why did the other column widths change after I locked one column?
Setting `colWidth` on a column locks its width. With `autoColWidth = TRUE` (the
default), ksTFL re-normalizes the remaining visible unlocked columns so they
fill the leftover width.
```r
spec <- create_table(df) |>
  define_cols(ID, colWidth = "20%")
```
After that call, `ID` stays fixed at `20%` and the remaining visible unlocked
columns are recalculated to fill the rest.
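
The re-normalization can be pictured as simple proportional arithmetic. The
sketch below is base R only and is not ksTFL's implementation: the starting
widths, the `renormalize()` helper, and the proportional rule itself are
assumptions made for illustration.

```r
# Hypothetical starting widths, in percent of the usable page width
widths <- c(ID = 25, PARAM = 25, VALUE = 25, UNIT = 25)
# Columns fixed through colWidth (here only ID, at 20%)
locked <- c(ID = 20)

# Illustrative helper (not a ksTFL function): rescale the unlocked columns
# proportionally so that locked + unlocked widths add up to 100
renormalize <- function(widths, locked) {
  unlocked <- setdiff(names(widths), names(locked))
  leftover <- 100 - sum(locked)
  scaled <- widths[unlocked] / sum(widths[unlocked]) * leftover
  c(locked, scaled)[names(widths)]
}

result <- renormalize(widths, locked)
print(round(result, 2))
```

Under these assumptions `ID` keeps its 20% and the three unlocked columns share
the remaining 80% equally.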
### 5. Why did `c_glue()` not modify a repeated value?
If a cell was already suppressed by `dedupe = TRUE`, `c_glue()` skips it on
purpose. The same skip happens for non-leader cells inside a merge, so glue the
leader column or turn deduplication off for that field.
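
To see why a suppressed cell is left alone, it can help to model the behavior
outside the package. The base R sketch below is an illustration only: it
assumes that deduplication blanks a value when it repeats the value directly
above it, and the variable names are hypothetical.

```r
region <- c("North", "North", "South", "South", "South")

# Assumed dedupe rule: a cell is suppressed when it repeats the previous row
suppressed <- c(FALSE, region[-1] == region[-length(region)])
display <- ifelse(suppressed, "", region)

# A glue-like step appends text only where the cell is still displayed
glued <- ifelse(suppressed, display, paste0(display, " (region)"))

print(data.frame(region, suppressed, glued))
```

Only the first row of each run receives the appended text; the blanked repeats
stay blank.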
------------------------------------------------------------------------
## Row actions and layout rules
### 6. Why does `compute_cols()` not like aggregate logic such as `mean(x)`?
`compute_cols()` conditions are captured lazily and evaluated row-wise. If you
need section-level or whole-table aggregates, calculate them upstream or write
them into a helper column before creating the spec.
```r
# Precompute the group-level aggregate so the condition itself stays row-wise
df$group_mean <- ave(df$AVAL, df$GROUP, FUN = mean)
spec <- create_table(df) |>
  compute_cols(group_mean > 10, c_style(AVAL, styleRef = "flagged"))
```
### 7. Can I nest `c_*()` actions inside each other?
No. Row actions are siblings, not nested verbs. Either pass multiple actions to
one `compute_cols()` call or use several `compute_cols()` calls with the same
condition.
### 8. Why does every `add_span_header()` call create a new row of headers?
Because `stubOrder` auto-increments when you omit it. Reuse the same
`stubOrder` for sibling span headers that belong on one header row, and only
increase it when you really want a new level.
```r
spec <- create_table(df) |>
  add_span_header(c(TRT_A_N, TRT_A_PCT), label = "Treatment A", stubOrder = 1) |>
  add_span_header(c(TRT_B_N, TRT_B_PCT), label = "Treatment B", stubOrder = 1)
```
### 9. Can span headers overlap?
Yes across different levels, no within the same level. Headers at the same
`stubOrder` must not share columns, but parent and child levels can overlap
freely.
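
The same-level rule can be checked before building a spec. The following base
R sketch is not part of ksTFL: the `headers` list layout and the
`conflicting_levels()` helper are hypothetical, and it only tests whether any
column is claimed twice at one `stubOrder`.

```r
headers <- list(
  list(label = "Treatment A", stubOrder = 1, cols = c("TRT_A_N", "TRT_A_PCT")),
  list(label = "Treatment B", stubOrder = 1, cols = c("TRT_B_N", "TRT_B_PCT")),
  list(label = "All treatments", stubOrder = 2,
       cols = c("TRT_A_N", "TRT_A_PCT", "TRT_B_N", "TRT_B_PCT"))
)

# Illustrative helper: return the stubOrder levels at which at least one
# column belongs to more than one header
conflicting_levels <- function(headers) {
  levels <- vapply(headers, function(h) h$stubOrder, numeric(1))
  bad <- vapply(unique(levels), function(lv) {
    cols <- unlist(lapply(headers[levels == lv], function(h) h$cols))
    anyDuplicated(cols) > 0
  }, logical(1))
  unique(levels)[bad]
}

# Overlap across levels 1 and 2 is fine, so nothing is reported
print(conflicting_levels(headers))

# Adding a level-1 header that reuses TRT_A_PCT and TRT_B_N creates a conflict
with_overlap <- c(headers, list(
  list(label = "Overlap", stubOrder = 1, cols = c("TRT_A_PCT", "TRT_B_N"))
))
print(conflicting_levels(with_overlap))
```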
### 10. How do I keep a small table under a figure on the same page?
Set `continuousSection = TRUE` on the following spec, not the first one. Keep
page size and margins compatible across both sections. The pattern suits short
follow-on content; if the content does not fit, Word still flows the overflow
onto the next page.
```r
report <- create_report(
  # First spec: no continuousSection, it opens the section as usual
  create_figure(plot_obj),
  # Following spec: continues on the same page as the figure
  create_table(summary_tbl) |>
    set_document(continuousSection = TRUE)
)
```
### 11. When should I use `isGrouping`, `isPaging`, and `isColBreak`?
Use `isGrouping` when a value change defines a logical section, `isPaging` when
that value change should start a new vertical page group, and `isColBreak` when
a wide listing should split horizontally into segments while repeating ID
columns.
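
As a rough mental model of horizontal splitting, the base R sketch below
partitions a wide column list into segments and repeats the ID columns in each
one. It does not call ksTFL; the column names, the `break_before` marker, and
the choice that a flagged column starts a new segment are all assumptions made
for illustration.

```r
cols <- c("USUBJID", "VISIT", "ALT", "AST", "BILI", "CREAT")
id_cols <- c("USUBJID", "VISIT")
# Hypothetical marker: the column at which a new horizontal segment starts
break_before <- "BILI"

# Split the non-ID columns at each marker, then prepend the ID columns
body <- setdiff(cols, id_cols)
segment_id <- cumsum(body %in% break_before) + 1
segments <- lapply(split(body, segment_id), function(seg) c(id_cols, seg))

print(segments)
```

Each segment carries `USUBJID` and `VISIT`, so every horizontal page can be
read on its own.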
### 12. Why do my footnotes repeat on every page?
That is the default: `footnotePlace = "repeated"`. Switch to `"last_page"`
when you want a final note block only, or `"doc_footer"` when the note belongs
in the Word footer area.
```r
tfl_set_options(footnotePlace = "last_page")
```
------------------------------------------------------------------------
## Rendering, replay, and reproducibility
### 13. What is the practical difference between `write_doc()`, `save_report()`, and `replay_report()`?
`write_doc()` is the one-step path for everyday use. `save_report()` writes the
spec JSON plus table/figure payloads without rendering, while `replay_report()`
renders later from those saved artifacts and can also combine previously saved
outputs into one document.
```r
write_doc(report, name = "tables")
saved <- save_report(report, docFileName = "tables.docx", metaPath = "meta")
replay_report(saved$spec_file, meta_dir = "meta")
```
### 14. When do I need a persistent `metaPath` instead of `tempdir()`?
Use `tempdir()` when you only need the final DOCX right now. Use a persistent
`metaPath` when you want exact replays, QC comparison, report inventories, or a
later combined replay workflow.
If you replay by DOCX name, ksTFL resolves the latest saved spec in that meta
folder; if you need an exact historical version, replay by the saved JSON file
name instead.
```r
replay_report("tables.docx", meta_dir = "meta")
replay_report("abc123def456.json", meta_dir = "meta")
```
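
The difference between the two lookups can be modeled with a small index
table. This base R sketch is an illustration, not the package's resolver: the
column names follow the `doc_file`, `spec_file`, and `datetime` fields used
elsewhere in this vignette, while the data and the `resolve_latest()` helper
are hypothetical.

```r
index <- data.frame(
  doc_file  = c("tables.docx", "tables.docx", "listings.docx"),
  spec_file = c("aaa111.json", "bbb222.json", "ccc333.json"),
  datetime  = as.POSIXct(
    c("2024-01-10 09:00:00", "2024-02-01 14:30:00", "2024-01-15 11:00:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Illustrative helper: a DOCX name maps to the newest spec saved for it
resolve_latest <- function(index, doc) {
  hits <- index[index$doc_file == doc, ]
  hits$spec_file[which.max(as.numeric(hits$datetime))]
}

print(resolve_latest(index, "tables.docx"))
```

Replaying by DOCX name would pick the February save in this model; reaching the
January version requires naming `aaa111.json` explicitly.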
### 15. Can I delete the original figure file after saving a report?
For replay-based workflows, yes, once the save has succeeded, because ksTFL
copies the figure into `metaPath` under its `dataRef`. The saved meta folder
becomes the durable rendering input.
### 16. Why did different sections of one report use different templates?
That is the default behavior for multi-spec reports. Each spec resolves its own
`docTemplate`, so a table can use one bundled template while a text or figure
section uses another.
```r
report <- create_report(
  create_table(adsl) |> set_page_style(docTemplate = "Navy_Pro"),
  create_text() |> set_page_style(docTemplate = "Carbon_Dark")
)
```
### 17. How do I force one template across every section?
Use `overrideTemplate` in `write_doc()` or `replay_report()`. That global
override wins over per-spec `docTemplate` values and is the cleanest way to
re-skin a finished bundle.
```r
write_doc(report, name = "tables", overrideTemplate = "Navy_Pro")
replay_report("tables.docx", meta_dir = "meta", overrideTemplate = "Navy_Pro")
```
------------------------------------------------------------------------
## TOC and report assembly
### 18. Why does a Table of Contents not appear even though I asked for one?
You need both parts of the contract: request a TOC (`toc = TRUE`,
`insertTOC = TRUE`, or the package option) and mark at least one title or
subtitle with `toclevel`. A TOC request with no `toclevel` entries has nothing
to index.
```r
spec <- create_table(df) |>
  add_title("Table 1", toclevel = 1)
write_doc(create_report(spec), name = "tables", toc = TRUE)
```
### 19. Why is the TOC still just a placeholder when I open the DOCX?
ksTFL writes a Word TOC field, not a pre-expanded static table. Open the file
in Word, click inside the TOC, and update fields with `F9` to populate it.
### 20. Can `create_report()` accept a named list of specs built in a loop?
Yes. `create_report()` accepts named lists of `TFL_spec` objects, which is
useful when specs are created dynamically or in separate program files. The
list names become the key prefixes inside the final `TFL_report`.
```r
specs <- list(
  demog = create_table(adsl),
  labs = create_table(adlb)
)
report <- create_report(specs)
```
------------------------------------------------------------------------
## Practical column and action recipes
These are short copy-paste patterns for the `define_cols()` and
`compute_cols()` cases that usually come up after the first working table.
### 21. How do I define several display columns in one place?
Use one `define_cols()` call when several columns need their labels, widths,
or base value styles set together.
```r
spec <- create_table(adsl) |>
  define_cols(
    c(AGE, WEIGHT, HEIGHT),
    label = c("Age", "Weight\n(kg)", "Height\n(cm)"),
    colWidth = c("12%", "14%", "14%"),
    valueStyleRef = c("ar", "ar", "ar")
  )
```
This keeps aligned numeric columns easy to maintain.
### 22. How do I use `NA` to skip one column inside a batch `define_cols()` call?
Use `NA` at the position you want to leave unchanged. This is handy when most
columns share one update but one column should keep its existing definition.
```r
spec <- create_table(adsl) |>
  define_cols(
    c(USUBJID, AGE, TRT01A),
    label = c("Subject ID", NA, "Treatment"),
    colWidth = c("18%", NA, "20%"),
    valueStyleRef = c("mono", "ar", NA)
  )
```
Here `AGE` keeps its current label and width, and `TRT01A` keeps its current
value style. This also works well with hidden helper columns when you want to
skip `colWidth` because invisible columns are forced to `"0.0cm"`.
### 23. How do I hide a helper column but still use it to drive formatting?
Hide the helper with `isVisible = FALSE`, then refer to it in
`compute_cols()` as usual.
```r
spec <- create_table(df) |>
  define_cols(FLAG, isVisible = FALSE) |>
  define_cols(c(PARAM, VALUE), label = c("Parameter", "Value")) |>
  add_style("flagged", s_font(color = "#8B0000", bold = TRUE)) |>
  compute_cols(
    FLAG == "Y",
    c_style(c(PARAM, VALUE), styleRef = "flagged")
  )
```
This is the standard pattern for QC flags, section ids, and hidden totals.
### 24. How do I turn a hidden grouping column into a stub header?
Use `c_addrow()` on the first row of each group and pull the display text from
the hidden column.
```r
spec <- create_table(df) |>
  add_style(
    "section_header",
    s_font(bold = TRUE, color = "#FFFFFF"),
    s_table_style(background_color = "#4682B4")
  ) |>
  define_cols(REGION, isVisible = FALSE) |>
  define_cols(c(PRODUCT, REVENUE), label = c("Product", "Revenue")) |>
  compute_cols(
    firstOf(REGION),
    c_addrow(
      pos = "above",
      value_from = REGION,
      styleRef = "section_header"
    )
  )
```
This is usually cleaner than repeating the region on every detail row.
### 25. How do I insert subtotals from a hidden total column?
Precompute the subtotal upstream, hide that helper column, and insert it on
the last row of each group.
```r
spec <- create_table(df) |>
  add_style(
    "subtotal_row",
    s_font(bold = TRUE),
    s_table_style(background_color = "#D9D9D9")
  ) |>
  define_cols(TOTAL, isVisible = FALSE) |>
  compute_cols(
    lastOf(REGION),
    c_addrow(
      pos = "below",
      value_from = TOTAL,
      styleRef = f_combine("subtotal_row", "ar")
    )
  )
```
This works well when the display row is just a formatted version of stored
summary text.
### 26. How do I apply one condition to several visible columns at once?
Pass a column vector to `c_style()` instead of repeating the same condition in
separate calls.
```r
spec <- create_table(labs) |>
  add_style("out_of_range", s_font(color = "#FF4500", bold = TRUE)) |>
  compute_cols(
    VISIT == "Week 8" & AVAL > AVAL_ULN,
    c_style(c(PARAM, AVAL, UNIT), styleRef = "out_of_range")
  )
```
Use this when the flag belongs to the row but only a few columns should show
it.
### 27. How do I combine font and background styles for one rule?
Compose styles with `f_combine()` instead of defining a new style for every
font-plus-fill pairing.
```r
spec <- create_table(df) |>
  add_style(
    "warn_bg",
    s_table_style(background_color = "#FFF4E5")
  ) |>
  compute_cols(
    CRITFL == "Y",
    c_style(c(PARAM, VALUE), styleRef = f_combine("b", "warn_bg"))
  )
```
This is a good fit for one-off emphasis rules.
### 28. How do I give columns a base style and still add row-level highlighting later?
Put default alignment or indentation in `define_cols()`, then add the
conditional layer in `compute_cols()`.
```r
spec <- create_table(df) |>
  add_style(
    "warn_row",
    s_table_style(background_color = "#FFF4E5")
  ) |>
  define_cols(PARAM, valueStyleRef = "indent_1") |>
  define_cols(VALUE, valueStyleRef = "ar") |>
  compute_cols(
    FLAG == "Y",
    c_style(everything(), styleRef = "warn_row")
  )
```
The base column styles stay in place; the row style adds on top.
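
One way to picture this layering is as two attribute lists merged in order.
The base R sketch below is only a model of the idea, not ksTFL's style
resolution: the `align` attribute name is hypothetical, and how the package
handles attributes set by both layers is not covered here.

```r
# Column-level base style (hypothetical attribute name)
base_style <- list(align = "right")
# Row-level conditional style applied later
row_style <- list(background_color = "#FFF4E5")

# The row layer adds its attributes; untouched base attributes survive
effective <- modifyList(base_style, row_style)
str(effective)
```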
### 29. How do I build a total line by combining `c_merge()`, `c_clear()`, and `c_glue()`?
Use one `compute_cols()` call when the same rows need several sibling actions.
```r
spec <- create_table(df) |>
  compute_cols(
    PRODUCT == "TOTAL",
    c_merge(c(PRODUCT, REVENUE), styleRef = f_combine("b", "ar")),
    c_clear(PRODUCT),
    c_glue(PRODUCT, "after", REGION),
    c_glue(PRODUCT, "after", text = " total: "),
    c_glue(PRODUCT, "after", REVENUE)
  )
```
This is useful when the display string does not exist as one input column.
### 30. How do I apply more than one action to the same condition without nested `c_*()` calls?
Keep the actions as separate arguments inside one `compute_cols()` call.
```r
spec <- create_table(df) |>
  add_style("boundary", s_font(bold = TRUE)) |>
  compute_cols(
    firstOf(GROUP),
    c_addrow(pos = "above", value_from = GROUP, styleRef = "boundary"),
    c_style(c(PARAM, VALUE), styleRef = "boundary")
  )
```
Row actions are siblings, not nested verbs.
### 31. How do I build a two-level stub with one hidden column and two style rules?
Insert the group header from the hidden column, then use separate style rules
for summary rows and detail rows.
```r
spec <- create_table(df) |>
  define_cols(REGION, isVisible = FALSE) |>
  define_cols(c(PRODUCT, REVENUE), label = c("Product", "Revenue")) |>
  compute_cols(
    firstOf(REGION),
    c_addrow(pos = "above", value_from = REGION, styleRef = "b")
  ) |>
  compute_cols(
    PRODUCT == "TOTAL",
    c_style(PRODUCT, styleRef = f_combine("i", "indent_1")),
    c_style(REVENUE, styleRef = "i")
  ) |>
  compute_cols(
    PRODUCT != "TOTAL",
    c_style(PRODUCT, styleRef = "indent_2")
  )
```
That pattern is handy when the output stub needs visible hierarchy even though
the source data is still flat.
### 32. Can I combine multiple actions of the same or different types, and how do they work together?
Yes, but as sibling actions, not nested calls. You can pass any mix of
`c_style()`, `c_addrow()`, `c_merge()`, `c_clear()`, `c_glue()`, and
`c_pageBreak()` in one `compute_cols()` call.
```r
spec <- create_table(df) |>
  compute_cols(
    firstOf(GROUP),
    c_addrow("above", value_from = GROUP, styleRef = "b"),
    c_style(c(PARAM, VALUE), styleRef = f_combine("b", "fc_navy"))
  ) |>
  compute_cols(
    PARAM == "TOTAL",
    c_merge(c(PARAM, VALUE), styleRef = "ar"),
    c_clear(PARAM),
    c_glue(PARAM, "after", text = "Total: "),
    c_glue(PARAM, "after", VALUE)
  )
```
Practical rule: when one action depends on the visual result of another,
prefer separate `compute_cols()` calls (as above) to keep intent explicit.
### 33. How can I create three- or four-level nested text in one column (for example Parameter/Visit/Statistic indentation)?
It depends on the input data shape. Two common patterns are shown below.
#### Pattern A: detail rows only, hierarchy injected with `c_addrow()`
```r
dt <- tibble::tribble(
  ~PARAM, ~VISIT,    ~STATISTICS, ~VALUE,
  "ALT",  "Visit 1", "Mean",      1L,
  "ALT",  "Visit 1", "Median",    2L,
  "ALT",  "Visit 2", "Mean",      1L,
  "ALT",  "Visit 2", "Median",    2L,
  "AST",  "Visit 1", "Mean",      1L,
  "AST",  "Visit 1", "Median",    2L,
  "AST",  "Visit 2", "Mean",      1L,
  "AST",  "Visit 2", "Median",    2L
)
spec <- create_table(dt) |>
  define_cols(c(PARAM, VISIT), isVisible = FALSE) |>
  define_cols(
    c(STATISTICS, VALUE),
    label = c("Parameter\nVisit\nStatistics", "Value"),
    valueStyleRef = c("indent_2", NA),
    labelStyleRef = c("al", NA)
  ) |>
  compute_cols(
    firstOf(PARAM),
    c_addrow("above", value_from = PARAM)
  ) |>
  compute_cols(
    firstOf(VISIT),
    c_addrow("above", value_from = VISIT, styleRef = "indent_1")
  )
```
What this does and why:
- Input contains only detail rows (`STATISTICS` + `VALUE`).
- `PARAM` and `VISIT` are hidden helper columns that drive layout.
- `c_addrow()` inserts visible hierarchy rows above first parameter/visit
boundaries.
- This keeps source data tidy while producing a nested visual stub.
#### Pattern B: placeholder hierarchy rows in data, collapsed with `c_merge()`
```r
dt <- tibble::tribble(
  ~PARAM, ~VISIT,    ~STATISTICS, ~VALUE,
  "ALT",  "Visit 1", NA,          NA,
  "ALT",  "Visit 1", NA,          NA,
  "ALT",  "Visit 1", "Mean",      1L,
  "ALT",  "Visit 1", "Median",    2L,
  "ALT",  "Visit 2", NA,          NA,
  "ALT",  "Visit 2", NA,          NA,
  "ALT",  "Visit 2", "Mean",      1L,
  "ALT",  "Visit 2", "Median",    2L,
  "AST",  "Visit 1", NA,          NA,
  "AST",  "Visit 1", NA,          NA,
  "AST",  "Visit 1", "Mean",      1L,
  "AST",  "Visit 1", "Median",    2L,
  "AST",  "Visit 2", NA,          NA,
  "AST",  "Visit 2", NA,          NA,
  "AST",  "Visit 2", "Mean",      1L,
  "AST",  "Visit 2", "Median",    2L
)
spec <- create_table(dt) |>
  define_cols(c(PARAM, VISIT), isVisible = FALSE) |>
  define_cols(
    c(STATISTICS, VALUE),
    label = c("Parameter\nVisit\nStatistics", "Value"),
    valueStyleRef = c("indent_2", NA),
    labelStyleRef = c("al", NA)
  ) |>
  compute_cols(
    firstOf(PARAM, VISIT),
    c_merge(c(PARAM, VISIT, STATISTICS), styleRef = "indent_0")
  ) |>
  compute_cols(
    !firstOf(PARAM, VISIT) & is.na(STATISTICS),
    c_merge(c(VISIT, STATISTICS), styleRef = "indent_1")
  )
```
What this does and why:
- Input already contains placeholder hierarchy rows (`STATISTICS = NA`).
- `c_merge()` turns those rows into spanning hierarchy lines.
- First merge call builds the top level (`PARAM` + `VISIT` context).
- Second merge call handles lower placeholder rows (visit-level line).
- This pattern is useful when source extracts already contain structural rows
and you want to preserve that model.
Both patterns are valid. Choose by source shape:
- Use Pattern A when hierarchy should be derived from boundaries.
- Use Pattern B when hierarchy rows already exist in incoming data.
### 34. How do I switch between continuous sections, repeating/not repeating headers, and row-break behavior across pages?
These controls come from different layers:
- Continuous sections between specs: use `set_document(continuousSection = TRUE)`
on the following spec.
- Repeating title/subtitle groups across pages: controlled by `isContinues`
(`FALSE` repeats, `TRUE` suppresses repeated title/subtitle output).
- Table header repetition and row splitting across pages are template layout
settings (`repeat_header_on_each_page`, `allow_row_break_across_pages`).
```r
report <- create_report(
  create_table(tbl_a) |>
    set_document(isContinues = FALSE),
  create_table(tbl_b) |>
    set_document(continuousSection = TRUE, isContinues = TRUE)
)
write_doc(report, name = "layout_switch")
```
Important caveat: when `isColBreak` is active, ksTFL enforces
`repeat_header_on_each_page = TRUE` and
`allow_row_break_across_pages = FALSE` for correct horizontal pagination.
------------------------------------------------------------------------
## Metadata workflows: replay, combine, and validation
### 35. How do I replay a document from stored metadata without re-running R code?
Use `replay_report()` with either the DOCX filename (uses the latest saved
spec) or the exact spec JSON hash for a specific historical version. This
replays from the saved JSON and data files, not from R objects, so the original
data frames or ggplot objects are not needed.
```r
# Replay the latest version by DOCX name
replay_report("tables_01.docx", meta_dir = "meta")
# Replay an exact historical version by spec hash
replay_report("abc123def456.json", meta_dir = "meta")
# Override output location
replay_report(
  "tables_01.docx",
  meta_dir = "meta",
  output_path = "qc/tables_01_replay.docx"
)
```
Practical workflow: run production specs with `save_report()` instead of
`write_doc()` to preserve the metadata, then use `replay_report()` for QC
re-runs, template switches, or regulatory re-submissions without touching the
original R scripts.
### 36. How do I combine multiple documents into a single DOCX with a Table of Contents?
Pass a vector of spec references (DOCX names or JSON hashes) to
`replay_report()` along with a combined `output_path`. The function merges all
specs into one document and optionally inserts a TOC page at the front.
```r
# Combine two documents from the same meta folder
replay_report(
  spec_json = c("tables_01.docx", "listings_01.docx"),
  meta_dir = "meta",
  output_path = "output/combined_tables_listings.docx",
  insertTOC = TRUE,
  tocTitle = "Table of Contents"
)
# Combine documents from different meta folders
replay_report(
  spec_json = c(
    "meta_tables/abc123.json",
    "meta_figures/def456.json",
    "meta_listings/ghi789.json"
  ),
  output_path = "output/full_clinical_report.docx",
  insertTOC = TRUE,
  tocTitle = "Clinical Study Report - Contents"
)
```
This is the standard pattern for assembling final submission packages from
individually validated outputs.
### 37. How do I filter and combine only the latest versions of documents?
Use `list_reports()` to scan the meta folder, filter for `is_latest == TRUE`,
then pass the matched `spec_file` entries to `replay_report()`. This is useful
when you have many historical versions but only want to combine the current set.
```r
library(dplyr)
meta_index <- list_reports("meta", sort_by = "doc_file")
# Keep only latest entries
latest <- meta_index %>% filter(is_latest)
# Optional: filter by document name patterns
tables_and_figures <- latest %>%
  filter(grepl("table|figure", doc_file, ignore.case = TRUE))
# Combine into one document
replay_report(
  spec_json = tables_and_figures$spec_file,
  meta_dir = "meta",
  output_path = "output/final_report.docx",
  insertTOC = TRUE
)
```
This pattern is particularly useful for batch production workflows where
hundreds of outputs are generated separately and then assembled into themed
bundles (tables-only, figures-only, or full report).
### 38. How do I match saved metadata with actual DOCX files for QC validation?
Use `list_reports()` to get the metadata index, then cross-check with the actual
files on disk using an inner join. This ensures both the metadata and the
rendered output exist before attempting validation or replay.
```r
library(dplyr)
library(tibble)
# Read metadata index
meta_index <- list_reports("meta", sort_by = "doc_file")
latest <- meta_index %>% filter(is_latest)
# Scan output folder for actual DOCX files
docx_on_disk <- list.files(
  "output",
  pattern = "\\.docx$",
  full.names = FALSE
)
# Skip Word lock files, whose names start with "~$"
docx_on_disk <- docx_on_disk[!startsWith(docx_on_disk, "~$")]
# Inner join: keep only entries with both metadata and file
matched <- latest %>%
  inner_join(
    tibble(doc_file = docx_on_disk),
    by = "doc_file"
  ) %>%
  arrange(doc_file, datetime)
cat(sprintf(
  "Matched: %d of %d latest entries have corresponding DOCX files\n",
  nrow(matched),
  nrow(latest)
))
# Use matched entries for validation workflow
for (i in seq_len(nrow(matched))) {
  cat(sprintf(
    "%2d. %s [%s] -> %s\n",
    i,
    matched$doc_file[i],
    matched$datetime[i],
    matched$spec_file[i]
  ))
}
```
This cross-reference pattern is the foundation of validation workflows:
programmers save metadata during production runs, QC reviewers scan the output
folder and metadata folder, then match and replay only the entries that exist in
both places.
### 39. How do I store metadata persistently for regulatory validation?
Use `save_report()` with a persistent `metaPath` (not `tempdir()`) to create a
durable metadata archive. This archive contains:
- Spec JSON files (hash-named, one per save)
- Data JSON files (referenced by `dataRef` in specs)
- Figure image files (copied with original extensions preserved)
- `_index.json` (automatically maintained index of all specs)
```r
# Set persistent directories in options
tfl_set_options(
  output_directory = "output",
  meta_directory = "meta"
)
# Save report with metadata
spec1 <- create_table(adsl) |>
  add_title("Table 1: Demographics", toclevel = 1) |>
  set_document(hasData = TRUE)
spec2 <- create_table(advs) |>
  add_title("Table 2: Vital Signs", toclevel = 1) |>
  set_document(hasData = TRUE)
report <- create_report(spec1, spec2)
result <- save_report(
  report,
  docFileName = "tables_demographics_vitals.docx",
  outDir = "output",
  metaPath = "meta",
  insertTOC = TRUE
)
# Metadata now available for:
# - QC replay: replay_report(result$spec_file, meta_dir = "meta")
# - Template switch: replay_report(..., overrideTemplate = "Navy_Pro")
# - Historical audit: list_reports("meta") shows all versions with timestamps
```
Validation workflow benefits:
- **Reproducibility**: Exact replay without re-running upstream data processing
- **Auditability**: Every save creates a timestamped entry in `_index.json`
- **Template flexibility**: Re-render with different templates without changing
specs
- **QC independence**: Reviewers replay from metadata, not from live R sessions
### 40. How do I clean up obsolete metadata files while keeping the latest versions?
Use `clean_reports()` to remove old spec JSONs and orphaned data files while
preserving the most recent N versions per document. This keeps the metadata
folder manageable in long-running projects.
```r
# Keep only the 2 most recent versions of each document
clean_reports(meta_dir = "meta", keep_versions = 2)
# Keep only the latest version (most aggressive cleanup)
clean_reports(meta_dir = "meta", keep_versions = 1)
```
The function:
- Identifies obsolete spec JSONs (older than `keep_versions`)
- Deletes obsolete specs
- Scans surviving specs for referenced data/figure files
- Deletes orphaned data JSONs and images not referenced by any surviving spec
- Updates `_index.json` to reflect the cleaned state
Run this periodically in development to avoid accumulating hundreds of obsolete
metadata files, or use it before archiving a project to keep only the final
validated versions.
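
The steps listed above can be sketched in base R to show which files would
survive. This is a model of the described logic, not the code behind
`clean_reports()`: the index rows, the `data_refs` mapping from spec to
payload files, and all file names are hypothetical.

```r
index <- data.frame(
  doc_file  = c("t1.docx", "t1.docx", "t1.docx", "l1.docx"),
  spec_file = c("s1.json", "s2.json", "s3.json", "s4.json"),
  datetime  = as.POSIXct(
    c("2024-01-01", "2024-02-01", "2024-03-01", "2024-01-15"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)
# Hypothetical mapping: payload files referenced by each spec
data_refs <- list(
  "s1.json" = c("d_old.json", "d_shared.json"),
  "s2.json" = "d_mid.json",
  "s3.json" = c("d_new.json", "d_shared.json"),
  "s4.json" = "d_list.json"
)
keep_versions <- 2

# Rank saves newest-first within each document and keep the top N
rank_in_doc <- ave(-as.numeric(index$datetime), index$doc_file, FUN = rank)
surviving <- index$spec_file[rank_in_doc <= keep_versions]
obsolete <- setdiff(index$spec_file, surviving)

# A payload is orphaned only if no surviving spec references it
referenced <- unique(unlist(data_refs[surviving], use.names = FALSE))
orphans <- setdiff(unname(unlist(data_refs)), referenced)

print(obsolete)
print(orphans)
```

In this model `d_shared.json` is kept even though the obsolete `s1.json`
referenced it, because the surviving `s3.json` still does.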