Deborah.DeborahCore
Deborah.DeborahCore (Module)
module DeborahCore
Deborah.DeborahCore: bias-corrected ML pipeline.
Deborah.DeborahCore provides the end-to-end workflow to learn regression models for trace-like observables, apply bias correction, and materialize study-ready artifacts (vectorized `X`/`Y` bundles and summary tables). It orchestrates configuration parsing, path/name construction, dataset partitioning, feature/target preparation, multiple ML backends (Ridge/Lasso/LightGBM variants), and result writing/printing.
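As a rough illustration of what "bias correction" can mean in this kind of workflow, the sketch below averages the model predictions on the unlabeled set and adds the mean residual measured on a held-out labeled set. This is a generic estimator written for illustration; the function name `bias_corrected_mean` and its arguments are hypothetical and are not taken from Deborah.jl, whose actual estimator may differ.

```julia
using Statistics

# Hypothetical helper (not part of Deborah.jl): a generic bias-corrected mean.
# yp_ul : model predictions on the unlabeled (UL) set
# y_bc  : true targets on the bias-correction (BC) set
# yp_bc : model predictions on the same BC set
function bias_corrected_mean(yp_ul::AbstractVector, y_bc::AbstractVector, yp_bc::AbstractVector)
    length(y_bc) == length(yp_bc) ||
        throw(DimensionMismatch("y_bc and yp_bc must have the same length"))
    bias = mean(y_bc .- yp_bc)      # average residual of the model on BC
    return mean(yp_ul) + bias       # shift the UL prediction mean by that residual
end

yp_ul = [1.0, 1.2, 0.8, 1.0]
y_bc  = [1.1, 0.9]
yp_bc = [1.0, 0.7]
est = bias_corrected_mean(yp_ul, y_bc, yp_bc)
```

Here the model underestimates the BC targets on average, so the corrected estimate is shifted upward relative to the plain mean of `yp_ul`.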
Scope & Responsibilities
- Configuration & paths: parse a single `TOML` into strongly-typed structs and construct reproducible analysis paths and filenames.
- Data preparation: split labeled data into training/bias-correction sets; vectorize `X`/`Y` bundles for ML; emit `LB`/`TR`/`BC`/`UL` artifacts for inspection and reuse.
- Model training: run baseline and machine-learning training sequences. `Ridge` and `Lasso` are provided via `JuliaAI/MLJ.jl`. `LightGBM` is available either through `JuliaAI/MLJ.jl` or via `PyCall.jl`. The `JuliaAI/MLJ.jl`-based `LightGBM` branch (internally referred to as `MiddleGBM`) additionally supports optional hyperparameter scanning/tuning and per-split evaluation (`TR`/`BC`/`UL`).
- Outputs & reporting: write machine predictions (flattened/matrix forms), residual plots (`MiddleGBM` option), and jackknife/bootstrap summaries for downstream tools.
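For readers unfamiliar with the jackknife summaries mentioned in the list above, here is a minimal sketch of the standard delete-one jackknife estimate of a mean and its error. The function name `jackknife_mean` is hypothetical; it only illustrates the textbook technique and is not the routine Deborah.jl uses internally.

```julia
using Statistics

# Hypothetical helper (not part of Deborah.jl): delete-one jackknife for the mean.
function jackknife_mean(x::AbstractVector{<:Real})
    n = length(x)
    n > 1 || throw(ArgumentError("need at least two samples"))
    total = sum(x)
    # i-th jackknife sample: mean of all entries except x[i]
    theta = [(total - xi) / (n - 1) for xi in x]
    est = mean(theta)
    # standard jackknife error formula
    err = sqrt((n - 1) / n * sum(abs2, theta .- est))
    return est, err
end

x = [1.0, 2.0, 3.0, 4.0]
est, err = jackknife_mean(x)
```

For a plain mean, the jackknife error reduces to the usual standard error `std(x) / sqrt(length(x))`; the resampling becomes useful for nonlinear functions of the data.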
Key Components
- `Deborah.DeborahCore.TOMLConfigDeborah`: parse/run config structs: `TraceDataConfig`, `BootstrapConfig`, `JackknifeConfig`, `FullConfigDeborah`; `parse_full_config_Deborah(...)`.
- `Deborah.DeborahCore.PathConfigBuilderDeborah`: build stable output layout: `DeborahPathConfig`; `build_path_config_Deborah(...)`.
- `Deborah.DeborahCore.DatasetPartitionerDeborah`: compute `LB`/`TR`/`BC`/`UL` indices and counts.
- `Deborah.DeborahCore.XYMLInfoGenerator` / `Deborah.DeborahCore.XYMLVectorizer`: split & dump `LB`/`TR`/`BC`/`UL` blocks; flatten and reshape `X`/`Y` for ML I/O (vector $\Leftrightarrow$ $N_\text{cnf} \times N_\text{src}$ matrix).
- `Deborah.DeborahCore.FeaturePipeline` / `Deborah.DeborahCore.MLInputPreparer`: assemble feature tables (`NamedTuple` form) and target vectors per split.
- `Deborah.DeborahCore.BaselineSequence`: non-ML baselines and scaffolding.
- `Deborah.DeborahCore.MLSequence`: model runners:
  - `Deborah.DeborahCore.MLSequenceRidge` (`JuliaAI/MLJ.jl` `Ridge`),
  - `Deborah.DeborahCore.MLSequenceLasso` (`JuliaAI/MLJ.jl` `Lasso`),
  - `Deborah.DeborahCore.MLSequenceLightGBM` (`JuliaAI/MLJ.jl` `LightGBM`),
  - `Deborah.DeborahCore.MLSequenceMiddleGBM` (`JuliaAI/MLJ.jl` `LightGBM` $+$ learning curves/tuning; residual plots),
  - `Deborah.DeborahCore.MLSequencePyCallLightGBM` (`Python` `LightGBM` via `PyCall.jl`),
- `Deborah.DeborahCore.SummaryWriterDeborah` / `Deborah.DeborahCore.ResultPrinterDeborah`: persist and print `Y`/`YP`, bias, and `P1`/`P2` summaries (jackknife & bootstrap).
- `Deborah.DeborahCore.DeborahRunner`: glue code to execute the full pipeline end-to-end.
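The vector $\Leftrightarrow$ $N_\text{cnf} \times N_\text{src}$ conversion mentioned for the vectorizer can be sketched with Julia's built-in `vec` and `reshape`. The helper names `flatten_xy` and `unflatten_xy` are hypothetical, and the column-major layout with configurations along rows is an assumption; the ordering actually used by `XYMLVectorizer` may differ.

```julia
# Hypothetical helpers (not part of Deborah.jl). Assumption: rows index
# configurations, columns index sources, and flattening is column-major.
flatten_xy(M::AbstractMatrix) = vec(M)   # shares memory with M for a plain Matrix

function unflatten_xy(v::AbstractVector, n_cnf::Integer, n_src::Integer)
    length(v) == n_cnf * n_src ||
        throw(DimensionMismatch("vector length must equal n_cnf * n_src"))
    return reshape(v, n_cnf, n_src)
end

n_cnf, n_src = 3, 2
M = [1.0 4.0;
     2.0 5.0;
     3.0 6.0]

v  = flatten_xy(M)                  # column-major: first column, then second
M2 = unflatten_xy(v, n_cnf, n_src)  # round trip back to the matrix form
```

The length check guards against silently reshaping a vector that was produced with different `n_cnf` or `n_src` values.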
Public API (typical entry points)
Minimal Usage
julia> using Deborah
julia> run_DeborahWizard()
julia> run_Deborah("config_Deborah.toml")
Notes
- Splits: labeled set (`LB`) is partitioned into `TR` (training) and `BC` (bias correction) according to `LBP` and `TRP`; `UL` is the remaining unlabeled set.
- Shapes: ML expects flattened vectors; helper utils convert between flattened and $N_\text{cnf} \times N_\text{src}$ matrices for analysis and file dumps.
- `MiddleGBM` (`LightGBM`): can auto-produce learning curves and residual plots; ensure plotting dependencies are available when `jobid === nothing`.
- `PyCall.jl` backend: requires `Python` `LightGBM` available to the `PyCall.jl` environment.
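The split described in the first note can be sketched as follows. This is a minimal illustration under stated assumptions: `LBP` is read as the percentage of all configurations that are labeled, `TRP` as the percentage of the labeled set used for training, and the sets are taken as contiguous index blocks. The function `partition_indices` is hypothetical; the selection rule actually used by `DatasetPartitionerDeborah` may differ.

```julia
# Hypothetical helper (not part of Deborah.jl).
# Assumptions: lbp = percent of configurations that are labeled,
#              trp = percent of the labeled set used for training,
#              contiguous blocks in the order TR, BC, UL.
function partition_indices(n_cnf::Integer, lbp::Real, trp::Real)
    (0 <= lbp <= 100 && 0 <= trp <= 100) ||
        throw(ArgumentError("lbp and trp must lie in [0, 100]"))
    n_lb = round(Int, n_cnf * lbp / 100)   # size of the labeled set
    n_tr = round(Int, n_lb * trp / 100)    # size of the training subset
    lb = collect(1:n_lb)
    tr = lb[1:n_tr]                        # training indices
    bc = lb[n_tr+1:end]                    # bias-correction indices
    ul = collect(n_lb+1:n_cnf)             # remaining unlabeled indices
    return (LB = lb, TR = tr, BC = bc, UL = ul)
end

parts = partition_indices(100, 20, 50)
```

By construction `TR` and `BC` are disjoint and together make up `LB`, while `LB` and `UL` together cover every configuration exactly once.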
See Also
- `Deborah.Sarah`: shared logging, naming, formatting
- `Deborah.Esther`: cumulant estimation for a single ensemble
- `Deborah.Miriam`: cumulant estimation with multi-ensemble reweighting