Deborah.DeborahCore.TOMLConfigDeborah

Deborah.DeborahCore.TOMLConfigDeborah.BootstrapConfig - Type
struct BootstrapConfig

Defines bootstrap resampling parameters (block bootstrap).

Fields

  • ranseed::Int : Random seed for reproducibility.
  • N_bs::Int : Number of bootstrap replicates to generate (N_bs > 0).
  • blk_size::Int : Block length (blk_size ≥ 1). Use 1 for i.i.d. bootstrap.
  • method::String : Block-bootstrap scheme to use. Accepted values:
    • "nonoverlapping" : Nonoverlapping Block Bootstrap (NBB). Partition the series into disjoint blocks of length blk_size, then resample those blocks with replacement to reconstruct a series of approximately the original length (last block may be truncated).
    • "moving" : Moving Block Bootstrap (MBB). Candidate blocks are all contiguous length-blk_size windows; resample these with replacement.
    • "circular" : Circular Block Bootstrap (CBB). Like MBB, but windows wrap around the end of the series (circular indexing).

Notes

  • Only the three literal strings above are recognized; other values should raise an error.
  • Resampled series length should match the original; if it overshoots, truncate the final block.
  • Choose blk_size based on dependence strength (larger for stronger autocorrelation).
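
The three schemes differ only in which block start positions are eligible for resampling. The following is a minimal sketch of that idea, not the package's implementation: block_bootstrap_indices is a hypothetical helper, and the choice to drop an incomplete trailing block in the nonoverlapping case is an assumption.

```julia
using Random

# Hypothetical helper (not part of Deborah): returns n resampled indices
# in 1:n for one bootstrap replicate under the given scheme.
function block_bootstrap_indices(rng::AbstractRNG, n::Int, blk_size::Int, method::String)
    n > 0 || throw(ArgumentError("n must be positive"))
    1 <= blk_size <= n || throw(ArgumentError("blk_size must satisfy 1 <= blk_size <= n"))

    # Candidate block start positions for each scheme.
    starts = if method == "nonoverlapping"
        collect(1:blk_size:(n - blk_size + 1))   # disjoint blocks (assumption: incomplete tail dropped)
    elseif method == "moving"
        collect(1:(n - blk_size + 1))            # every contiguous window
    elseif method == "circular"
        collect(1:n)                             # windows may wrap past the end
    else
        throw(ArgumentError("unknown method: $method"))
    end

    idx = Int[]
    while length(idx) < n
        s = rand(rng, starts)
        # mod1 wraps indices beyond n back to 1; it only takes effect for "circular".
        append!(idx, [mod1(s + k, n) for k in 0:(blk_size - 1)])
    end
    return idx[1:n]   # truncate the final block if the series overshoots
end

rng = MersenneTwister(850528)
idx = block_bootstrap_indices(rng, 10, 3, "moving")
```

Indexing a series with the returned vector (for example, series[idx]) gives one bootstrap replicate; with blk_size = 1 all three schemes reduce to the i.i.d. bootstrap.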

Example (TOML)

[bootstrap]
ranseed  = 850528
N_bs     = 1000
blk_size = 500
method   = "nonoverlapping"  # one of: "nonoverlapping", "moving", "circular"
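
As an illustration of how such a table could be read and checked against the constraints listed above, here is a minimal sketch using Julia's TOML standard library. BootstrapSketch and parse_bootstrap are hypothetical names introduced for this example; the package's own struct, constructor, and error types may differ.

```julia
using TOML

# Hypothetical mirror of the documented fields (not the package's struct).
struct BootstrapSketch
    ranseed::Int
    N_bs::Int
    blk_size::Int
    method::String
end

# Hypothetical parser: reads the [bootstrap] table and enforces the documented constraints.
function parse_bootstrap(toml_text::AbstractString)
    allowed = ("nonoverlapping", "moving", "circular")
    tbl = TOML.parse(toml_text)["bootstrap"]
    cfg = BootstrapSketch(tbl["ranseed"], tbl["N_bs"], tbl["blk_size"], tbl["method"])
    cfg.N_bs > 0 || throw(ArgumentError("N_bs must be > 0"))
    cfg.blk_size >= 1 || throw(ArgumentError("blk_size must be >= 1"))
    cfg.method in allowed || throw(ArgumentError("unrecognized method: $(cfg.method)"))
    return cfg
end

cfg = parse_bootstrap("""
[bootstrap]
ranseed  = 850528
N_bs     = 1000
blk_size = 500
method   = "nonoverlapping"
""")
```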
Deborah.DeborahCore.TOMLConfigDeborah.TraceDataConfig - Type
struct TraceDataConfig

Holds the metadata and parameters that configure trace-data input and output in LightGBM machine-learning workflows.

Fields

  • location::String Base directory containing raw trace data files.

  • ensemble::String Ensemble identifier (e.g., "L8T4k13580").

  • analysis_header::String Prefix for analysis directories (e.g., "analysis_...").

  • X::Vector{String} List of input file names (e.g., ["plaq.dat", "rect.dat"]).

  • Y::String Output file name (e.g., "pbp.dat").

  • model::String Machine learning model identifier (e.g., "Ridge", "LightGBM").

  • read_column_X::Vector{Int} List of 1-based column indices to read from each file in X.

  • read_column_Y::Int 1-based column index to read from output file Y.

  • index_column::Int 1-based column index for reading configuration indices (usually 1).

  • LBP::Int Percentage ($0 < \texttt{LBP} < 100$) of the total configurations to assign as the labeled set.

  • TRP::Int Percentage ($0 \le \texttt{TRP} \le 100$) of the labeled set used as the training set, i.e., $\texttt{(training set)} = \texttt{(total set)} \times \dfrac{\texttt{LBP}}{100} \times \dfrac{\texttt{TRP}}{100}$.

  • IDX_shift::Int Offset applied to align the indices of input X and output Y (e.g., 0 or 1).

  • dump_X::Bool Whether to dump preprocessed input matrix X to disk.

  • use_abbreviation::Bool Whether to abbreviate variable names for output directory or file naming.
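
To make the LBP and TRP percentages concrete, the sketch below computes the resulting set sizes for a given number of configurations. split_sizes is a hypothetical helper, and rounding down with integer division is an assumption; the package may round differently.

```julia
# Hypothetical helper: sizes of the labeled, training, and unlabeled sets
# for N configurations. Integer division (rounding down) is an assumption.
function split_sizes(N::Int, LBP::Int, TRP::Int)
    0 < LBP < 100 || throw(ArgumentError("LBP must satisfy 0 < LBP < 100"))
    0 <= TRP <= 100 || throw(ArgumentError("TRP must satisfy 0 <= TRP <= 100"))
    n_labeled  = div(N * LBP, 100)           # (total set) * LBP / 100
    n_training = div(n_labeled * TRP, 100)   # (labeled set) * TRP / 100
    return (labeled = n_labeled, training = n_training, unlabeled = N - n_labeled)
end

sizes = split_sizes(1000, 20, 50)
# → (labeled = 200, training = 100, unlabeled = 800)
```

With 1000 configurations, LBP = 20 and TRP = 50 give a training set of 1000 × 20/100 × 50/100 = 100 configurations, matching the formula in the TRP field description.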
