Deborah.DeborahCore.TOMLConfigDeborah

struct BootstrapConfig

Defines bootstrap resampling parameters (block bootstrap).

Fields

ranseed::Int : Random seed for reproducibility.
N_bs::Int : Number of bootstrap replicates to generate (N_bs > 0).
blk_size::Int : Block length (blk_size ≥ 1). Use 1 for i.i.d. bootstrap.
method::String : Block-bootstrap scheme to use. Accepted values:
- "nonoverlapping" : Nonoverlapping Block Bootstrap (NBB). Partition the series into disjoint blocks of length blk_size, then resample those blocks with replacement to reconstruct a series of approximately the original length (last block may be truncated).
- "moving" : Moving Block Bootstrap (MBB). Candidate blocks are all contiguous length-blk_size windows; resample these with replacement.
- "circular" : Circular Block Bootstrap (CBB). Like MBB, but windows wrap around the end of the series (circular indexing).

Notes

Only the three literal strings above are recognized; other values should raise an error.
Resampled series length should match the original; if it overshoots, truncate the final block.
Choose blk_size based on dependence strength (larger for stronger autocorrelation).
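
The three schemes differ only in which block start positions are eligible for resampling. The following is a minimal sketch of that idea, not the package's implementation: the helper name `block_bootstrap_indices` is hypothetical, and for "nonoverlapping" it assumes that a trailing partial block is left out of the candidate set.

```julia
using Random

# Hypothetical helper (not part of Deborah.DeborahCore): returns resampled
# indices into a series of length n for a single bootstrap replicate.
function block_bootstrap_indices(rng::AbstractRNG, n::Int, blk_size::Int, method::String)
    1 <= blk_size <= n || throw(ArgumentError("blk_size must satisfy 1 <= blk_size <= n"))
    # Candidate block start positions for each scheme.
    starts = if method == "nonoverlapping"
        collect(1:blk_size:(n - blk_size + 1))   # disjoint full-length blocks (assumption)
    elseif method == "moving"
        collect(1:(n - blk_size + 1))            # every contiguous window
    elseif method == "circular"
        collect(1:n)                             # windows may wrap past the end
    else
        throw(ArgumentError("unrecognized method: $method"))
    end
    idx = Int[]
    while length(idx) < n
        s = rand(rng, starts)                    # draw a block with replacement
        for k in 0:(blk_size - 1)
            push!(idx, mod1(s + k, n))           # wraps only in the circular case
        end
    end
    return idx[1:n]                              # truncate the final block on overshoot
end

rng = MersenneTwister(850528)
idx = block_bootstrap_indices(rng, 10, 3, "circular")
```

With `blk_size = 1` all three schemes reduce to drawing single indices with replacement, which matches the i.i.d. case mentioned for `blk_size`.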

Example (TOML)

[bootstrap]
ranseed  = 850528
N_bs     = 1000
blk_size = 500
method   = "nonoverlapping"  # one of: "nonoverlapping", "moving", "circular"
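
As a rough illustration of the constraints listed above, the [bootstrap] table can be read with Julia's standard TOML library and checked by hand. The function `check_bootstrap` below is a hypothetical stand-in, not the parser that Deborah.DeborahCore actually uses.

```julia
using TOML

toml_text = """
[bootstrap]
ranseed  = 850528
N_bs     = 1000
blk_size = 500
method   = "nonoverlapping"
"""

# Hypothetical validator; it mirrors the documented constraints only.
function check_bootstrap(section::AbstractDict)
    N_bs, blk_size, method = section["N_bs"], section["blk_size"], section["method"]
    N_bs > 0 || throw(ArgumentError("N_bs must be > 0"))
    blk_size >= 1 || throw(ArgumentError("blk_size must be >= 1"))
    method in ("nonoverlapping", "moving", "circular") ||
        throw(ArgumentError("unrecognized method: $method"))
    return (ranseed = section["ranseed"], N_bs = N_bs, blk_size = blk_size, method = method)
end

bs = check_bootstrap(TOML.parse(toml_text)["bootstrap"])
```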

struct FullConfigDeborah

Aggregate configuration struct used in the Deborah.DeborahCore pipeline.

Fields

data::TraceDataConfig : Data input and model setup config.
bs::BootstrapConfig : Bootstrap-specific parameters.
jk::JackknifeConfig : Jackknife-specific parameters.
abbrev::StringTranscoder.AbbreviationConfig : Abbreviation dictionary or struct.

struct JackknifeConfig

Defines jackknife resampling parameters.

Fields

struct TraceDataConfig

Holds metadata and parameters for trace data input/output configuration used in LightGBM machine-learning workflows.

Fields

location::String : Base directory containing raw trace data files.
ensemble::String : Ensemble identifier (e.g., "L8T4k13580").
analysis_header::String : Prefix for analysis directories (e.g., "analysis_...").
X::Vector{String} : List of input file names (e.g., ["plaq.dat", "rect.dat"]).
Y::String : Output file name (e.g., "pbp.dat").
model::String : Machine learning model identifier (e.g., "Ridge", "LightGBM").
read_column_X::Vector{Int} : List of 1-based column indices to read from each file in X.
read_column_Y::Int : 1-based column index to read from output file Y.
index_column::Int : 1-based column index for reading configuration indices (usually 1).
LBP::Int : Percentage ($0 < x < 100$) of the total configurations to assign as the labeled set.
TRP::Int : Percentage ($0 \le x \le 100$) of the labeled set to use as the training set, i.e., $\texttt{(training set)} = \texttt{(total set)} \times \dfrac{\texttt{LBP}}{100} \times \dfrac{\texttt{TRP}}{100}$.
IDX_shift::Int : Offset applied to align the index of input X and output Y (e.g., 0 or 1).
dump_X::Bool : Whether to dump preprocessed input matrix X to disk.
use_abbreviation::Bool : Whether to abbreviate variable names for output directory or file naming.
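
The LBP and TRP percentages compose multiplicatively, which a short calculation makes concrete. The helper `split_sizes` below is hypothetical, and rounding to the nearest integer is an assumption; the package may round differently.

```julia
# Hypothetical helper: sizes implied by LBP and TRP for n_total configurations.
function split_sizes(n_total::Int, LBP::Int, TRP::Int)
    0 < LBP < 100 || throw(ArgumentError("LBP must satisfy 0 < LBP < 100"))
    0 <= TRP <= 100 || throw(ArgumentError("TRP must satisfy 0 <= TRP <= 100"))
    n_labeled = round(Int, n_total * LBP / 100)          # labeled set
    n_train = round(Int, n_total * LBP * TRP / 10000)    # training part of the labeled set
    return (labeled = n_labeled, training = n_train, unlabeled = n_total - n_labeled)
end

split_sizes(1000, 20, 50)   # → (labeled = 200, training = 100, unlabeled = 800)
```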

parse_full_config_Deborah(
    toml_path::String, 
    jobid::Union{Nothing, String}=nothing
) -> FullConfigDeborah

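Arguments

toml_path::String : Path to the TOML configuration file to parse.
jobid::Union{Nothing, String} : Optional job identifier; defaults to nothing.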

Returns

TOMLConfigDeborah.FullConfigDeborah : Struct containing all parsed configuration sections.