Deborah.DeborahEsther.EstherDependencyManager

Deborah.DeborahEsther.EstherDependencyManager.ensure_TrM_exists — Function

ensure_TrM_exists(
    toml_path::String, 
    jobid::Union{Nothing, String}=nothing
) -> Nothing

Ensures that all required $\text{Tr} \, M^{-n} \; (n=1,2,3,4)$ data files exist for each input group. If any expected file is missing, the function automatically invokes EstherDependencyManager.run_Deborah_from_Esther to regenerate the missing outputs.

Arguments

toml_path::String : Path to the TOML configuration file specifying input features, models, and output options.
jobid::Union{Nothing, String} : Optional job ID string used for logging.

Behavior

Parses the configuration and checks for the presence of the output files associated with each TrMi group.
Each TrMi group, one per $\text{Tr} \, M^{-n} \; (n=1,2,3,4)$, consists of (X, Y, model) triplets.
For each group, verifies that the expected files (for example Y_info, Y_bc, and YP_bc) exist.
If any required file is missing, calls EstherDependencyManager.run_Deborah_from_Esther to regenerate the outputs.
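
The check-then-regenerate pattern described above can be sketched as follows. This is a minimal illustration, not the package's implementation: `missing_files`, `ensure_outputs`, and the `regenerate` callback (standing in for run_Deborah_from_Esther) are hypothetical names, and the group-to-file mapping is assumed to be already known.

```julia
# Hypothetical helper: the expected files that are not on disk.
missing_files(paths::Vector{String}) = filter(!isfile, paths)

# Hypothetical driver. `groups` maps a group name to its expected files;
# `regenerate` stands in for run_Deborah_from_Esther and is called once
# for every group that has at least one missing file.
function ensure_outputs(groups::Dict{String,Vector{String}}, regenerate::Function)
    regenerated = String[]
    for name in sort!(collect(keys(groups)))
        absent = missing_files(groups[name])
        if !isempty(absent)
            regenerate(name)        # side effect: recreate this group's outputs
            push!(regenerated, name)
        end
    end
    return regenerated              # names of the groups that were regenerated
end
```

A second call after a successful regeneration finds nothing missing and triggers no further work.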

Returns

Nothing : This is a side-effect function that ensures required files exist or are created.

Deborah.DeborahEsther.EstherDependencyManager.generate_toml_dict — Method

generate_toml_dict(
    location::String,
    ensemble::String,
    analysis_header::String,
    X::Vector{String},
    Y::String,
    model::String,
    read_column_X::Vector{Int},
    read_column_Y::Int,
    index_column::Int,
    LBP::Int,
    TRP::Int,
    IDX_shift::Int,
    dump_X::Bool,
    ranseed::Int,
    N_bs::Int,
    blk_size::Int,
    method::String,
    bin_size::Int,
    abbreviation::Dict{String,String},
    use_abbreviation::Bool
) -> Dict

Construct a TOML-compatible configuration dictionary for the Deborah.DeborahCore workflow.

Arguments

location::String : Base directory for outputs.
ensemble::String : Ensemble identifier (e.g., "L8T4b1.60k13570").
analysis_header::String : Analysis folder prefix (e.g., "analysis").
X::Vector{String} : Input feature keys.
Y::String : Target observable key.
model::String : Model type (e.g., "LightGBM", "Lasso").
read_column_X::Vector{Int} : $1$-based value-column indices for each X file.
read_column_Y::Int : $1$-based value-column index for the Y file.
index_column::Int : $1$-based column index of configuration IDs in files.
LBP::Int : Label group ID (label partition parameter).
TRP::Int : Training group ID (training partition parameter).
IDX_shift::Int : Index offset/shift used by Deborah.DeborahCore.
dump_X::Bool : Whether to dump input matrices.
ranseed::Int : Random seed for bootstrap.
N_bs::Int : Number of bootstrap replicates.
blk_size::Int : Block length for block bootstrap ($\ge 1$).
method::String : Block-bootstrap scheme to encode in TOML:
- "nonoverlapping" — Nonoverlapping Block Bootstrap (NBB)
- "moving" — Moving Block Bootstrap (MBB)
- "circular" — Circular Block Bootstrap (CBB)
bin_size::Int : Jackknife bin size.
abbreviation::Dict{String,String} : Abbreviation map for input encoding.
use_abbreviation::Bool : Use abbreviations in paths/filenames if true.

Returns

Dict : A nested, TOML-ready dictionary including (at minimum) sections/keys for:
- data/paths (location, ensemble, analysis_header, abbreviations),
- IO columns (read_column_X, read_column_Y, index_column, IDX_shift, dump_X),
- bootstrap (ranseed, N_bs, blk_size, method),
- jackknife (bin_size),
- partitions (LBP, TRP),
- model (model, X, Y).
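
As a rough picture of what "nested, TOML-ready" means, the sketch below builds a small nested `Dict` and round-trips it through Julia's TOML standard library. The section names ("data", "bootstrap", "jackknife"), the file names, and the numeric values are assumptions chosen for illustration; the actual layout is whatever generate_toml_dict produces.

```julia
using TOML

# Hypothetical layout: section names and values are illustrative only.
cfg = Dict{String,Any}(
    "data" => Dict{String,Any}(
        "location"        => "./data",
        "ensemble"        => "L8T4b1.60k13570",
        "analysis_header" => "analysis",
        "X"               => ["plaq.dat", "rect.dat"],
        "Y"               => "pbp.dat",
        "model"           => "LightGBM",
        "read_column_X"   => [1, 1],      # 1-based, one entry per X file
        "read_column_Y"   => 1,
        "index_column"    => 1,
    ),
    "bootstrap" => Dict{String,Any}(
        "ranseed"  => 42,
        "N_bs"     => 1000,
        "blk_size" => 1,
        "method"   => "nonoverlapping",
    ),
    "jackknife" => Dict{String,Any}("bin_size" => 1),
)

# Serialize to TOML text and parse it back to confirm the dict is TOML-compatible.
toml_text = sprint(TOML.print, cfg)
roundtrip = TOML.parse(toml_text)
```

Only strings, integers, booleans, arrays, and nested string-keyed dictionaries appear here, which is what keeps the structure serializable by `TOML.print`.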

Notes

All column indices are $1$-based.
method must be one of the three literals above; invalid values should be rejected by the caller.
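
Because the notes leave validation to the caller, a caller-side guard might look like the sketch below. `check_bootstrap_args` is a hypothetical helper, not part of the package; it only encodes the two constraints stated above (the three method literals and a block length of at least 1).

```julia
# Hypothetical caller-side guard; not part of EstherDependencyManager.
function check_bootstrap_args(method::String, blk_size::Int)
    valid = ("nonoverlapping", "moving", "circular")
    if !(method in valid)
        allowed = join(valid, ", ")
        throw(ArgumentError("method must be one of: $allowed (got $method)"))
    end
    blk_size >= 1 || throw(ArgumentError("blk_size must be >= 1 (got $blk_size)"))
    return nothing
end
```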

Deborah.DeborahEsther.EstherDependencyManager.run_Deborah_from_Esther — Function

run_Deborah_from_Esther(
    location::String,
    ensemble::String,
    analysis_header::String,
    X::Vector{String},
    Y::String,
    model::String,
    read_column_X::Vector{Int},
    read_column_Y::Int,
    index_column::Int,
    LBP::Int,
    TRP::Int,
    IDX_shift::Int,
    dump_X::Bool,
    ranseed::Int,
    N_bs::Int,
    blk_size::Int,
    method::String,
    bin_size::Int,
    overall_name::String,
    abbreviation::Dict{String,String},
    use_abbreviation::Bool,
    jobid::Union{Nothing, String}=nothing
) -> Nothing

Invoke Deborah.DeborahCore from within Deborah.Esther if required $\text{Tr} \, M^{-n} \; (n=1,2,3,4)$ files are missing. This function generates a temporary TOML configuration from the provided arguments, writes it to disk, and launches the Deborah.DeborahCore workflow.

Arguments

location::String : Base path for output directory.
ensemble::String : Ensemble identifier (e.g., "L8T4b1.60k13570").
analysis_header::String : Analysis name prefix (e.g., "analysis").
X::Vector{String} : Input feature list.
Y::String : Target observable key.
model::String : Model name (e.g., "LightGBM").
read_column_X::Vector{Int} : $1$-based column indices for values in each X file.
read_column_Y::Int : $1$-based column index for values in the Y file.
index_column::Int : $1$-based column index for configuration index.
LBP::Int : Label group ID.
TRP::Int : Training group ID.
IDX_shift::Int : Index offset/shift used by Deborah.DeborahCore.
dump_X::Bool : Whether to dump input matrices.
ranseed::Int : Random seed for bootstrap.
N_bs::Int : Number of bootstrap replicates.
blk_size::Int : Block length used for block bootstrap ($\ge 1$).
method::String : Block-bootstrap scheme to use:
- "nonoverlapping" — Nonoverlapping Block Bootstrap (NBB).
- "moving" — Moving Block Bootstrap (MBB).
- "circular" — Circular Block Bootstrap (CBB; wrap-around windows).
bin_size::Int : Jackknife bin size.
overall_name::String : Unified name tag for output files.
abbreviation::Dict{String,String} : Abbreviation map for input encoding.
use_abbreviation::Bool : Whether to use abbreviations in paths/filenames.
jobid::Union{Nothing, String} : Optional job ID for logging.

Behavior

Resolves the output directory according to use_abbreviation.
Saves the generated TOML configuration as config_Deborah_*.toml in that output directory.
Launches Deborah.DeborahCore with the generated configuration.
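
The write-then-launch sequence can be sketched with the TOML standard library alone. Everything named here is an assumption for illustration: `write_config_and_launch` and `launch_stub` are hypothetical, the `tag` argument merely fills the `*` in the file name with a placeholder, and the stub reads the file back instead of starting Deborah.DeborahCore.

```julia
using TOML

# Hypothetical stand-in for launching Deborah.DeborahCore:
# it only parses the configuration file it is given.
launch_stub(config_path::String) = TOML.parsefile(config_path)

# Hypothetical helper: write `cfg` under `output_dir`, then hand the path on.
function write_config_and_launch(cfg::Dict{String,Any}, output_dir::String, tag::String)
    mkpath(output_dir)                                   # create the directory if needed
    config_path = joinpath(output_dir, "config_Deborah_$(tag).toml")
    open(config_path, "w") do io
        TOML.print(io, cfg)                              # serialize the dict as TOML
    end
    return config_path, launch_stub(config_path)
end
```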

Returns

Nothing — side-effecting helper.

Notes

method must be one of "nonoverlapping", "moving", "circular"; invalid values should raise an error before launching.
If valid Deborah outputs already exist, this function should have nothing to do; whether it is called at all in that case depends on the caller.
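
For readers unfamiliar with the three method literals, the sketch below lists the candidate blocks each scheme can draw from, using the textbook definitions of NBB, MBB, and CBB. It is an illustration under that assumption; `candidate_blocks` is a hypothetical function, and how Deborah.DeborahCore treats details such as leftover samples is not specified here.

```julia
# Hypothetical illustration of the textbook block definitions.
# Returns the index blocks of length `blk_size` available for resampling
# from a series of length N under each scheme.
function candidate_blocks(N::Int, blk_size::Int, method::String)
    1 <= blk_size <= N || throw(ArgumentError("need 1 <= blk_size <= N"))
    if method == "nonoverlapping"
        starts = 1:blk_size:(N - blk_size + 1)   # disjoint blocks; any remainder is dropped
    elseif method == "moving"
        starts = 1:(N - blk_size + 1)            # every contiguous window
    elseif method == "circular"
        starts = 1:N                             # windows may wrap past the end
    else
        throw(ArgumentError("unknown method: $method"))
    end
    # mod1 maps indices beyond N back to 1, 2, ... (only matters for "circular").
    return [[mod1(s + k, N) for k in 0:(blk_size - 1)] for s in starts]
end
```

With N = 5 and blk_size = 2, the nonoverlapping scheme offers two blocks, the moving scheme four, and the circular scheme five, the last of which wraps around as [5, 1].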
