Deborah.DeborahEsther.EstherDependencyManager

Deborah.DeborahEsther.EstherDependencyManager.ensure_TrM_exists - Function
ensure_TrM_exists(
    toml_path::String, 
    jobid::Union{Nothing, String}=nothing
) -> Nothing

Ensures that all required $\text{Tr} \, M^{-n} \; (n=1,2,3,4)$ data files exist for each input group. If any expected file is missing, the function automatically invokes EstherDependencyManager.run_Deborah_from_Esther to regenerate the missing outputs.

Arguments

  • toml_path::String : Path to the TOML configuration file specifying input features, models, and output options.
  • jobid::Union{Nothing, String} : Optional job ID string used for logging.

Behavior

  • Parses the configuration and checks for the presence of output files associated with each TrMi group.
  • Each TrMi group ($\text{Tr} \, M^{-n} \; (n=1,2,3,4)$) consists of (X, Y, model) triplets.
  • For each group, it verifies the existence of the expected output files, such as Y_info, Y_bc, and YP_bc.
  • If any required file is missing, EstherDependencyManager.run_Deborah_from_Esther is called to regenerate the outputs.

Returns

  • Nothing : This is a side-effect function that ensures required files exist or are created.
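
The check-then-regenerate pattern described under Behavior can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper name `ensure_files_exist`, the `.dat` extensions, and the use of `touch` as a stand-in for the regeneration step are all assumptions made for the example.

```julia
# Hypothetical helper (not part of Deborah.jl): call `regenerate` only when
# at least one of the expected files is missing.
function ensure_files_exist(regenerate::Function, paths::Vector{String})
    missing_paths = filter(!isfile, paths)
    isempty(missing_paths) && return nothing   # everything present: nothing to do
    regenerate()                               # stand-in for run_Deborah_from_Esther
    return nothing
end

# Usage in a temporary directory with made-up file names.
mktempdir() do dir
    expected = [joinpath(dir, name) for name in ("Y_info.dat", "Y_bc.dat", "YP_bc.dat")]
    ensure_files_exist(expected) do
        foreach(touch, expected)   # pretend the workflow wrote the outputs
    end
    @assert all(isfile, expected)
end
```

Passing the regeneration step as a function keeps the existence check independent of how the outputs are produced, which also makes the check easy to test in isolation.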
Deborah.DeborahEsther.EstherDependencyManager.generate_toml_dict - Method
generate_toml_dict(
    location::String,
    ensemble::String,
    analysis_header::String,
    X::Vector{String},
    Y::String,
    model::String,
    read_column_X::Vector{Int},
    read_column_Y::Int,
    index_column::Int,
    LBP::Int,
    TRP::Int,
    IDX_shift::Int,
    dump_X::Bool,
    ranseed::Int,
    N_bs::Int,
    blk_size::Int,
    method::String,
    bin_size::Int,
    abbreviation::Dict{String,String},
    use_abbreviation::Bool
) -> Dict

Construct a TOML-compatible configuration dictionary for the Deborah.DeborahCore workflow.

Arguments

  • location::String : Base directory for outputs.
  • ensemble::String : Ensemble identifier (e.g., "L8T4b1.60k13570").
  • analysis_header::String : Analysis folder prefix (e.g., "analysis").
  • X::Vector{String} : Input feature keys.
  • Y::String : Target observable key.
  • model::String : Model type (e.g., "LightGBM", "Lasso").
  • read_column_X::Vector{Int} : $1$-based value-column indices for each X file.
  • read_column_Y::Int : $1$-based value-column index for the Y file.
  • index_column::Int : $1$-based column index of configuration IDs in files.
  • LBP::Int : Label group ID (label partition parameter).
  • TRP::Int : Training group ID (training partition parameter).
  • IDX_shift::Int : Index offset used by Deborah.DeborahCore.
  • dump_X::Bool : Whether to dump input matrices.
  • ranseed::Int : Random seed for bootstrap.
  • N_bs::Int : Number of bootstrap replicates.
  • blk_size::Int : Block length for block bootstrap ($\ge 1$).
  • method::String : Block-bootstrap scheme to encode in TOML:
    • "nonoverlapping" — Nonoverlapping Block Bootstrap (NBB)
    • "moving" — Moving Block Bootstrap (MBB)
    • "circular" — Circular Block Bootstrap (CBB)
  • bin_size::Int : Jackknife bin size.
  • abbreviation::Dict{String,String} : Abbreviation map for input encoding.
  • use_abbreviation::Bool : Use abbreviations in paths/filenames if true.

Returns

  • Dict : A nested, TOML-ready dictionary including (at minimum) sections/keys for:
    • data/paths (location, ensemble, analysis_header, abbreviations),
    • IO columns (read_column_X, read_column_Y, index_column, IDX_shift, dump_X),
    • bootstrap (ranseed, N_bs, blk_size, method),
    • jackknife (bin_size),
    • partitions (LBP, TRP),
    • model (model, X, Y).

Notes

  • All column indices are $1$-based.
  • method must be one of the three literals above; invalid values should be rejected by the caller.
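
As a rough illustration of the kind of nested dictionary described under Returns, the sketch below builds a reduced configuration and validates `method` before encoding it. The function name `sketch_toml_dict`, the table names ("model", "bootstrap", "jackknife"), the subset of arguments, and the example values are assumptions for this example; the actual layout produced by generate_toml_dict may differ.

```julia
using TOML  # standard library since Julia 1.6

const BLOCK_METHODS = ("nonoverlapping", "moving", "circular")

# Hypothetical, reduced builder: table names and key grouping are illustrative only.
function sketch_toml_dict(; X::Vector{String}, Y::String, model::String,
                          ranseed::Int, N_bs::Int, blk_size::Int,
                          method::String, bin_size::Int)
    # Reject invalid settings before anything is encoded.
    method in BLOCK_METHODS ||
        throw(ArgumentError("method must be one of $(BLOCK_METHODS), got \"$method\""))
    blk_size >= 1 || throw(ArgumentError("blk_size must be >= 1"))
    return Dict{String,Any}(
        "model"     => Dict{String,Any}("model" => model, "X" => X, "Y" => Y),
        "bootstrap" => Dict{String,Any}("ranseed" => ranseed, "N_bs" => N_bs,
                                        "blk_size" => blk_size, "method" => method),
        "jackknife" => Dict{String,Any}("bin_size" => bin_size),
    )
end

# Example values are made up for the sketch.
cfg = sketch_toml_dict(X = ["plaq.dat", "rect.dat"], Y = "pbp.dat", model = "LightGBM",
                       ranseed = 850528, N_bs = 1000, blk_size = 4,
                       method = "moving", bin_size = 10)
TOML.print(cfg)   # writes the nested tables to stdout in TOML syntax
```

Validating at construction time means an invalid `method` never reaches the file on disk, which matches the note that such values should be rejected by the caller.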
Deborah.DeborahEsther.EstherDependencyManager.run_Deborah_from_Esther - Function
run_Deborah_from_Esther(
    location::String,
    ensemble::String,
    analysis_header::String,
    X::Vector{String},
    Y::String,
    model::String,
    read_column_X::Vector{Int},
    read_column_Y::Int,
    index_column::Int,
    LBP::Int,
    TRP::Int,
    IDX_shift::Int,
    dump_X::Bool,
    ranseed::Int,
    N_bs::Int,
    blk_size::Int,
    method::String,
    bin_size::Int,
    overall_name::String,
    abbreviation::Dict{String,String},
    use_abbreviation::Bool,
    jobid::Union{Nothing, String}=nothing
) -> Nothing

Invoke Deborah.DeborahCore from within Deborah.Esther if required $\text{Tr} \, M^{-n} \; (n=1,2,3,4)$ files are missing. This function generates a temporary TOML configuration from the provided arguments, writes it to disk, and launches the Deborah.DeborahCore workflow.

Arguments

  • location::String : Base path for output directory.
  • ensemble::String : Ensemble identifier (e.g., "L8T4b1.60k13570").
  • analysis_header::String : Analysis name prefix (e.g., "analysis").
  • X::Vector{String} : Input feature list.
  • Y::String : Target observable key.
  • model::String : Model name (e.g., "LightGBM").
  • read_column_X::Vector{Int} : $1$-based column indices for values in each X file.
  • read_column_Y::Int : $1$-based column index for values in the Y file.
  • index_column::Int : $1$-based column index for configuration index.
  • LBP::Int : Label group ID.
  • TRP::Int : Training group ID.
  • IDX_shift::Int : Index offset used by Deborah.DeborahCore.
  • dump_X::Bool : Whether to dump input matrices.
  • ranseed::Int : Random seed for bootstrap.
  • N_bs::Int : Number of bootstrap replicates.
  • blk_size::Int : Block length used for block bootstrap ($\ge 1$).
  • method::String : Block-bootstrap scheme to use:
    • "nonoverlapping" — Nonoverlapping Block Bootstrap (NBB).
    • "moving" — Moving Block Bootstrap (MBB).
    • "circular" — Circular Block Bootstrap (CBB; wrap-around windows).
  • bin_size::Int : Jackknife bin size.
  • overall_name::String : Unified name tag for output files.
  • abbreviation::Dict{String,String} : Abbreviation map for input encoding.
  • use_abbreviation::Bool : Whether to use abbreviations in paths/filenames.
  • jobid::Union{Nothing,String} : Optional job ID for logging.
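
To make the three `method` options concrete, here is a generic sketch of how the candidate blocks differ between the schemes. It follows the textbook definitions and is not taken from Deborah.DeborahCore; the function names, and the convention of drawing blocks until `N` indices are collected and then truncating, are assumptions of this example.

```julia
using Random

# Start indices of the candidate blocks (series length N, block length blk).
function block_starts(method::String, N::Int, blk::Int)
    1 <= blk <= N || throw(ArgumentError("need 1 <= blk_size <= N"))
    if method == "nonoverlapping"
        return collect(1:blk:(N - blk + 1))   # disjoint blocks
    elseif method == "moving"
        return collect(1:(N - blk + 1))       # every overlapping window
    elseif method == "circular"
        return collect(1:N)                   # windows may wrap past the end
    else
        throw(ArgumentError("unknown method \"$method\""))
    end
end

# One bootstrap replicate, expressed as indices into 1:N.
function resample_indices(rng::AbstractRNG, method::String, N::Int, blk::Int)
    starts = block_starts(method, N, blk)
    idx = Int[]
    while length(idx) < N
        s = rand(rng, starts)                                # pick a block at random
        append!(idx, (mod1(s + k, N) for k in 0:(blk - 1)))  # mod1 wraps only for "circular"
    end
    return idx[1:N]   # truncate the last block so the replicate has length N
end

rng = MersenneTwister(1234)
for m in ("nonoverlapping", "moving", "circular")
    println(m, " => ", resample_indices(rng, m, 10, 3))
end
```

For `N = 10` and `blk = 3`, the nonoverlapping scheme has three candidate blocks (starting at 1, 4, 7), the moving scheme has eight, and the circular scheme has ten because windows starting near the end wrap around to the beginning.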

Behavior

  • Resolves the output directory according to use_abbreviation.
  • Saves the TOML as config_Deborah_*.toml under the output_dir.
  • Launches Deborah.DeborahCore with the generated configuration.

Returns

  • Nothing : The function is called for its side effects (writing the TOML file and launching Deborah.DeborahCore).

Notes

  • method must be one of "nonoverlapping", "moving", "circular"; invalid values should raise an error before launching.
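  • If valid Deborah outputs already exist, this function should be a no-op. Whether the run is skipped depends on the caller; for example, ensure_TrM_exists invokes it only when a required file is missing.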
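
The write-then-launch sequence listed under Behavior might look roughly like the sketch below. The helper `write_and_launch`, the `tag` that stands in for the `*` in config_Deborah_*.toml, and the `launch` callback replacing the real Deborah.DeborahCore entry point are all hypothetical; only the TOML standard-library calls are real.

```julia
using TOML

# Hypothetical sketch: `launch` stands in for the Deborah.DeborahCore entry point,
# and `tag` is a placeholder for whatever replaces `*` in the file name.
function write_and_launch(launch::Function, cfg::AbstractDict,
                          output_dir::String, tag::String)
    mkpath(output_dir)                                        # create the directory if needed
    toml_path = joinpath(output_dir, "config_Deborah_$(tag).toml")
    open(toml_path, "w") do io
        TOML.print(io, cfg)                                   # persist the configuration
    end
    launch(toml_path)                                         # hand the file to the workflow
    return nothing
end

# Usage with a temporary directory and a dummy launcher.
mktempdir() do dir
    cfg = Dict("bootstrap" => Dict("method" => "circular", "blk_size" => 4))
    write_and_launch(cfg, dir, "example") do path
        println("would run Deborah.DeborahCore with ", path)
    end
end
```

Writing the configuration to disk before launching leaves a record of the exact settings used for each regenerated output.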