Deborah.RebekahMiriam.ComparisonRebekahMiriam

Deborah.RebekahMiriam.ComparisonRebekahMiriam.build_bhattacharyya_dicts — Method

build_bhattacharyya_dicts(
    ext_dict::Dict{Tuple{Symbol,Symbol,Symbol,String}, Array{Float64,2}},
    keys::Vector{Symbol},
    keywords::Vector{String},
    pred_tags::Vector{Symbol},
    orig_tag::Symbol,
    labels::Vector,
    trains_ext::Vector;
    σ_floor::Float64 = 1e-12,
    also_hellinger::Bool = false
) -> Tuple{
    Dict{Tuple{Symbol,Symbol,String}, Array{Float64,2}},
    Union{Nothing, Dict{Tuple{Symbol,Symbol,String}, Array{Float64,2}}}
}

Construct Bhattacharyya-coefficient ($\mathrm{BC}$) matrices — and optionally Hellinger-distance matrices — for all (key, keyword, pred_tag) triples against a fixed orig_tag. Keying and grid traversal mirror build_overlap_and_error_dicts.

What it builds

For each (key, keyword, pred_tag):

bc_dict[(key, pred_tag, keyword)] :: Array{Float64,2} — Bhattacharyya coefficient matrix over the (labels $\times$ trains_ext) grid.
H_dict[(key, pred_tag, keyword)] :: Array{Float64,2} (optional) — Hellinger distance matrix over the same grid.

Inputs

ext_dict::Dict{(Symbol,Symbol,Symbol,String) => Array{Float64,2}} holding 2D arrays for keys of the form (key, kind, tag, keyword), where kind $\in$ {:avg, :err}, tag $\in$ pred_tags $\cup$ {orig_tag}.
- (key, :avg, pred_tag, keyword) → $\mu_{\text{pred}}$
- (key, :err, pred_tag, keyword) → $\sigma_{\text{pred}}$
- (key, :avg, orig_tag, keyword) → $\mu_{\text{orig}}$
- (key, :err, orig_tag, keyword) → $\sigma_{\text{orig}}$
keys, keywords, pred_tags, orig_tag, labels, trains_ext: same meaning as in build_overlap_error_and_ovl_dicts.
σ_floor::Float64=1e-12 — floor for standard deviations to ensure numerical stability.
also_hellinger::Bool=false — if true, also compute Hellinger matrices.

Output structure

bc_dict :: Dict{(key::Symbol, pred_tag::Symbol, keyword::String) => Array{Float64,2}}
H_dict :: Union{Nothing, Dict{(key,pred_tag,keyword)=>Array{Float64,2}}}

Rows correspond to labels indices; columns correspond to trains_ext indices.
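The per-cell computation can be illustrated with the standard closed form for two normal distributions, $\mathrm{BC} = \sqrt{\dfrac{2\sigma_{\text{pred}}\sigma_{\text{orig}}}{\sigma_{\text{pred}}^2 + \sigma_{\text{orig}}^2}}\,\exp\!\left(-\dfrac{(\mu_{\text{pred}} - \mu_{\text{orig}})^2}{4(\sigma_{\text{pred}}^2 + \sigma_{\text{orig}}^2)}\right)$ and $H = \sqrt{1 - \mathrm{BC}}$. The sketch below is a minimal illustration, not the package's implementation: it assumes each cell is treated as a normal distribution and that σ_floor is applied as a simple lower clip. The helper names bc_normals_sketch and hellinger_sketch and the toy data are hypothetical.

```julia
# Hypothetical helper (not part of Deborah): closed-form Bhattacharyya
# coefficient between N(μ1, σ1^2) and N(μ2, σ2^2), with σ clipped from below.
function bc_normals_sketch(μ1, σ1, μ2, σ2; σ_floor = 1e-12)
    s1 = max(σ1, σ_floor)          # assumed clipping rule
    s2 = max(σ2, σ_floor)
    v = s1^2 + s2^2
    return sqrt(2 * s1 * s2 / v) * exp(-(μ1 - μ2)^2 / (4 * v))
end

# Hellinger distance from BC; max() guards against tiny negative round-off.
hellinger_sketch(bc) = sqrt(max(0.0, 1.0 - bc))

# Toy input on a 2×3 (labels × trains_ext) grid, using the 4-tuple key scheme.
ext = Dict{Tuple{Symbol,Symbol,Symbol,String}, Array{Float64,2}}(
    (:TrM1, :avg, :RWP1, "kurt") => [1.0 1.1 1.2; 0.9 1.0 1.3],
    (:TrM1, :err, :RWP1, "kurt") => fill(0.1, 2, 3),
    (:TrM1, :avg, :RWBS, "kurt") => fill(1.0, 2, 3),
    (:TrM1, :err, :RWBS, "kurt") => fill(0.1, 2, 3),
)

μp = ext[(:TrM1, :avg, :RWP1, "kurt")]
σp = ext[(:TrM1, :err, :RWP1, "kurt")]
μo = ext[(:TrM1, :avg, :RWBS, "kurt")]
σo = ext[(:TrM1, :err, :RWBS, "kurt")]

# Broadcast over the grid: rows follow labels, columns follow trains_ext.
bc = bc_normals_sketch.(μp, σp, μo, σo)
H = hellinger_sketch.(bc)
```

Cells where prediction and reference agree give a coefficient close to 1 and a Hellinger distance close to 0; the coefficient shrinks as the means separate relative to the combined variance.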


Deborah.RebekahMiriam.ComparisonRebekahMiriam.build_bhattacharyya_dicts_for_measurements — Method

build_bhattacharyya_dicts_for_measurements(
    ext_dict::Dict{Tuple{Symbol,Symbol,Symbol,String}, Array{Float64,2}},
    keys::Vector{Symbol},
    kappa_list::Vector{String},
    pred_tags::Vector{Symbol},
    orig_tag::Symbol,
    labels::Vector,
    trains_ext::Vector;
    σ_floor::Float64 = 1e-12,
    also_hellinger::Bool = false
) -> Tuple{
    Dict{Tuple{Symbol,Symbol,String}, Array{Float64,2}},
    Union{Nothing, Dict{Tuple{Symbol,Symbol,String}, Array{Float64,2}}}
}

Build $\mathrm{BC}$ (and optionally Hellinger) matrices from measurement summaries for a single ensemble, keyed by (key, kind, tag, kappa_str). The keying and loop structure are analogous to build_bhattacharyya_dicts, with kappa_str in place of keyword.

What it builds

bc_dict[(key, pred_tag, kappa_str)] :: Array{Float64,2}
H_dict[(key, pred_tag, kappa_str)] :: Array{Float64,2} (optional)

Inputs / Outputs / Formulas

Same as build_bhattacharyya_dicts, with internal keys:

(key, :avg, pred_tag, kappa_str), (key, :err, pred_tag, kappa_str),
(key, :avg, orig_tag, kappa_str), (key, :err, orig_tag, kappa_str).

σ_floor clipping and the BC/H formulas are identical.


Deborah.RebekahMiriam.ComparisonRebekahMiriam.build_jsd_dicts — Method

build_jsd_dicts(
    ext_dict::Dict{Tuple{Symbol,Symbol,Symbol,String}, Array{Float64,2}},
    keys::Vector{Symbol},
    keywords::Vector{String},
    pred_tags::Vector{Symbol},
    orig_tag::Symbol,
    labels::Vector,
    trains_ext::Vector;
    σ_floor::Float64 = 1e-12,
    k::Float64 = 8.0,
    n::Int = 1201
) -> Dict{Tuple{Symbol,Symbol,String}, Array{Float64,2}}

Construct Jensen-Shannon divergence ($\mathrm{JSD}$, base-2; range $[0,1]$) matrices for all (key, keyword, pred_tag) triples against a fixed orig_tag. Keying/layout mirror build_bhattacharyya_dicts.

What it builds

jsd_dict[(key, pred_tag, keyword)] :: Array{Float64,2} — $\mathrm{JSD}$ (base-2) over the labels $\times$ trains_ext grid.

Inputs

ext_dict with the same 4-tuple key scheme (key, kind, tag, keyword), kind $\in$ {:avg, :err}, for both pred_tag and orig_tag.
σ_floor::Float64=1e-12 — floor for standard deviations.
k::Float64=8.0, n::Int=1201 — numerical integration window/resolution parameters.
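For orientation, the base-2 Jensen-Shannon divergence of two densities $p$ and $q$ is $\mathrm{JSD} = \tfrac12 \mathrm{KL}(p \,\|\, m) + \tfrac12 \mathrm{KL}(q \,\|\, m)$ with $m = \tfrac12 (p + q)$. It has no closed form for normals, hence the numerical parameters. The sketch below is one plausible reading, not the package's implementation: it assumes k is the half-width of the integration window in units of σ and n is the number of trapezoid nodes. The helper names are hypothetical; the actual computation is in Deborah.Rebekah.Comparison.jsd_normals.

```julia
# Density of N(μ, σ^2) at x.
normal_pdf(x, μ, σ) = exp(-0.5 * ((x - μ) / σ)^2) / (σ * sqrt(2π))

# One KL integrand term p*log2(p/m), with the convention 0*log(0) = 0.
kl_term(p, m) = p > 0 ? p * log2(p / m) : 0.0

# Hypothetical helper: base-2 JSD between two normals by the trapezoid rule.
function jsd_normals_sketch(μ1, σ1, μ2, σ2; σ_floor = 1e-12, k = 8.0, n = 1201)
    s1 = max(σ1, σ_floor)
    s2 = max(σ2, σ_floor)
    # Assumed window: cover both means by k standard deviations on each side.
    lo = min(μ1 - k * s1, μ2 - k * s2)
    hi = max(μ1 + k * s1, μ2 + k * s2)
    xs = range(lo, hi; length = n)
    total = 0.0
    for (i, x) in enumerate(xs)
        p = normal_pdf(x, μ1, s1)
        q = normal_pdf(x, μ2, s2)
        m = 0.5 * (p + q)
        f = 0.5 * kl_term(p, m) + 0.5 * kl_term(q, m)
        w = (i == 1 || i == n) ? 0.5 : 1.0   # trapezoid end-point weights
        total += w * f
    end
    return clamp(total * step(xs), 0.0, 1.0)  # base-2 JSD lies in [0, 1]
end

jsd_same = jsd_normals_sketch(0.0, 1.0, 0.0, 1.0)
jsd_far = jsd_normals_sketch(0.0, 1.0, 100.0, 1.0)
jsd_ab = jsd_normals_sketch(0.0, 1.0, 1.0, 2.0)
jsd_ba = jsd_normals_sketch(1.0, 2.0, 0.0, 1.0)
```

Identical distributions give a value at the lower end of the range and well-separated ones approach the upper end. Unlike the type-B distance, the measure is symmetric in its two arguments.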

See also

Deborah.Rebekah.Comparison.jsd_normals


Deborah.RebekahMiriam.ComparisonRebekahMiriam.build_jsd_dicts_for_measurements — Method

build_jsd_dicts_for_measurements(
    ext_dict::Dict{Tuple{Symbol,Symbol,Symbol,String}, Array{Float64,2}},
    keys::Vector{Symbol},
    kappa_list::Vector{String},
    pred_tags::Vector{Symbol},
    orig_tag::Symbol,
    labels::Vector,
    trains_ext::Vector;
    σ_floor::Float64 = 1e-12,
    k::Float64 = 8.0,
    n::Int = 1201
) -> Dict{Tuple{Symbol,Symbol,String}, Array{Float64,2}}

Build base-2 $\mathrm{JSD}$ matrices from measurement summaries for a single ensemble, keyed by (key, kind, tag, kappa_str). Keying and traversal mirror build_bhattacharyya_dicts_for_measurements.

What it builds

jsd_dict[(key, pred_tag, kappa_str)] :: Array{Float64,2}

Inputs / Definition / Grid traversal

Same as build_jsd_dicts, replacing keyword with kappa_str and using internal keys:

(key, kind, pred_tag, kappa_str), (key, kind, orig_tag, kappa_str), with kind $\in$ {:avg, :err}.

The $\mathrm{JSD}$ definition, σ_floor handling, and numerical parameters k, n are identical.

See also

Deborah.Rebekah.Comparison.jsd_normals


Deborah.RebekahMiriam.ComparisonRebekahMiriam.build_overlap_and_error_dicts — Method

build_overlap_and_error_dicts(
    ext_dict::Dict{Tuple{Symbol, Symbol, Symbol, String}, Array{Float64,2}},
    keys::Vector{Symbol},
    keywords::Vector{String},
    pred_tags::Vector{Symbol},
    orig_tag::Symbol,
    labels::Vector,
    trains_ext::Vector
) -> Tuple{
    Dict{Tuple{Symbol, Symbol, String}, Array{Int,2}},
    Dict{Tuple{Symbol, Symbol, String}, Array{Float64,2}}
}

Construct overlap and error-ratio dictionaries for all observables and keywords in a multi-criterion setup.

This is a backward-compatible wrapper that returns only the original two outputs (overlap codes and error ratios). Internally it delegates to build_overlap_error_and_ovl_dicts and discards the additional ovl_dict. Use the 3-return variant if you also need the $\sigma$-normalized type-B distances.

Arguments

(Identical to build_overlap_error_and_ovl_dicts, except that there is no σ_floor keyword.)

Returns

chk_dict[(key, pred_tag, keyword)] :: Array{Int,2} → overlap quality codes for each (label, train).
err_dict[(key, pred_tag, keyword)] :: Array{Float64,2} → error ratios for each (label, train).

Notes

Existing call sites like chk_dict, err_dict = build_overlap_and_error_dicts(...) remain valid and unchanged.
Prefer build_overlap_error_and_ovl_dicts for new code when you want the third output (ovl_dict) without altering older call sites.


Deborah.RebekahMiriam.ComparisonRebekahMiriam.build_overlap_and_error_dicts_for_measurements — Method

build_overlap_and_error_dicts_for_measurements(
    ext_dict::Dict{Tuple{Symbol, Symbol, Symbol, String}, Array{Float64,2}},
    keys::Vector{Symbol},
    kappa_list::Vector{String},
    pred_tags::Vector{Symbol},
    orig_tag::Symbol,
    labels::Vector,
    trains_ext::Vector
) -> Tuple{
    Dict{Tuple{Symbol, Symbol, String}, Array{Int,2}},
    Dict{Tuple{Symbol, Symbol, String}, Array{Float64,2}}
}

Build overlap-check and error-ratio dictionaries from measurement summaries for a single ensemble.

This is a backward-compatible wrapper that returns only the original two outputs. Internally it calls build_overlap_error_and_ovl_dicts_for_measurements and discards the additional ovl_dict. Use the 3-return variant if you also need the $\sigma$-normalized type-B distances.

Arguments

(Identical to build_overlap_error_and_ovl_dicts_for_measurements except no σ_floor keyword.)

Returns

chk_dict[(key, pred_tag, kappa_str)] :: Array{Int,2} → overlap quality codes for each (label, train).
err_dict[(key, pred_tag, kappa_str)] :: Array{Float64,2} → error ratios for each (label, train).

Notes

Existing call sites like chk_dict, err_dict = build_overlap_and_error_dicts_for_measurements(...) continue to work unchanged.
Prefer the 3-return variant build_overlap_error_and_ovl_dicts_for_measurements for new code that consumes ovl_dict.


Deborah.RebekahMiriam.ComparisonRebekahMiriam.build_overlap_error_and_ovl_dicts — Method

build_overlap_error_and_ovl_dicts(
    ext_dict::Dict{Tuple{Symbol, Symbol, Symbol, String}, Array{Float64,2}},
    keys::Vector{Symbol},
    keywords::Vector{String},
    pred_tags::Vector{Symbol},
    orig_tag::Symbol,
    labels::Vector,
    trains_ext::Vector;
    σ_floor::Float64=1e-12
) -> Tuple{
    Dict{Tuple{Symbol, Symbol, String}, Array{Int,2}},
    Dict{Tuple{Symbol, Symbol, String}, Array{Float64,2}},
    Dict{Tuple{Symbol, Symbol, String}, Array{Float64,2}}
}

Construct overlap, error-ratio, and type-B distance dictionaries for all observables and keywords in a multi-criterion setup.

This function is specific to the Deborah.Miriam analysis framework. It traverses combinations of observable keys (keys), keyword criteria (keywords), and prediction methods (pred_tags) to compare against a reference (orig_tag). Each data array is extracted from ext_dict, which stores 2D matrices indexed by 4-tuples of the form

(observable_key, kind, tag, keyword)

where kind is either :avg or :err. Only entries that exist in ext_dict are processed.
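The traversal described above can be sketched as a nested loop that skips any combination with missing data. This is an illustration only: the helper name complete_triples_sketch is hypothetical, and requiring all four entries (:avg and :err for both the prediction and the reference) before processing a triple is an assumption about how "entries that exist" is decided.

```julia
# Toy input: :RWP1 has data, :RWP2 is deliberately absent.
ext = Dict{Tuple{Symbol,Symbol,Symbol,String}, Array{Float64,2}}(
    (:TrM1, :avg, :RWP1, "kurt") => [1.0 1.2; 0.9 1.1],
    (:TrM1, :err, :RWP1, "kurt") => [0.1 0.1; 0.2 0.2],
    (:TrM1, :avg, :RWBS, "kurt") => [1.0 1.0; 1.0 1.0],
    (:TrM1, :err, :RWBS, "kurt") => [0.1 0.1; 0.1 0.1],
)

# Hypothetical helper: collect the (key, pred_tag, keyword) output keys for
# which every required 4-tuple entry is present in ext.
function complete_triples_sketch(ext, obs_keys, keywords, pred_tags, orig_tag)
    out = Tuple{Symbol,Symbol,String}[]
    for key in obs_keys, kw in keywords, tag in pred_tags
        needed = ((key, :avg, tag, kw), (key, :err, tag, kw),
                  (key, :avg, orig_tag, kw), (key, :err, orig_tag, kw))
        all(k -> haskey(ext, k), needed) || continue   # skip incomplete combos
        push!(out, (key, tag, kw))
    end
    return out
end

triples = complete_triples_sketch(ext, [:TrM1], ["kurt"], [:RWP1, :RWP2], :RWBS)
```

Note the reordering between input and output keys: inputs are indexed as (key, kind, tag, keyword), while the returned dictionaries use (key, pred_tag, keyword).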

Arguments

ext_dict: Dictionary of 2D matrices for all observable combinations, keyed by 4-tuples.
keys: Observable types (e.g., :TrM1, :TrM2, ...).
keywords: Interpolation/selection criteria (e.g., "kurt", "skew", ...).
pred_tags: Prediction method tags to evaluate (e.g., :RWP1, :RWP2).
orig_tag: Tag used for reference data (typically :RWBS).
labels: Vector indexing the LBP axis.
trains_ext: Vector indexing the TRP axis.
σ_floor: Small positive floor for uncertainty to avoid divide-by-zero in type-B distance.

Returns

chk_dict[(key, pred_tag, keyword)] :: Array{Int,2} → overlap codes (0/1/2) via Deborah.Rebekah.Comparison.check_overlap.
err_dict[(key, pred_tag, keyword)] :: Array{Float64,2} → error ratios pred_err / orig_err via Deborah.Rebekah.Comparison.err_ratio.
ovl_dict[(key, pred_tag, keyword)] :: Array{Float64,2} → type-B distances $d \equiv \dfrac{|\mu_{\text{orig}} - \mu_{\text{pred}}|}{\max(\sigma_{\text{orig}}, \sigma_{\text{floor}})}$ computed by Deborah.Rebekah.Comparison.check_overlap_type_b.

Notes

ovl_dict is intentionally asymmetric (measured in units of the reference/original $\sigma$).
The overlap code (chk_dict) is a coarse classifier; ovl_dict provides a graded “how far in σ_orig” measure that complements it.
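The type-B distance formula can be tried on a small grid as follows. This is a minimal sketch of the formula as stated in Returns, not the code of Deborah.Rebekah.Comparison.check_overlap_type_b; the helper name typeb_distance_sketch and the toy values are hypothetical.

```julia
# Hypothetical helper: separation of the means in units of the reference σ,
# with the reference σ clipped from below by σ_floor.
typeb_distance_sketch(μ_orig, σ_orig, μ_pred; σ_floor = 1e-12) =
    abs(μ_orig - μ_pred) / max(σ_orig, σ_floor)

# Toy 2×2 (labels × trains_ext) grids; σ_orig[2, 1] is zero on purpose.
μ_orig = [1.0 1.0; 1.0 1.0]
σ_orig = [0.1 0.1; 0.0 0.2]
μ_pred = [1.0 1.2; 1.5 0.8]
σ_pred = [0.1 0.4; 0.1 0.1]

d = typeb_distance_sketch.(μ_orig, σ_orig, μ_pred)

# Swapping the roles of prediction and reference changes the scale, which is
# the asymmetry the Notes refer to.
d_swapped = typeb_distance_sketch.(μ_pred, σ_pred, μ_orig)
```

The floor keeps the cell with a zero reference uncertainty finite (though very large) rather than producing Inf, and the swapped call shows that the result depends on which side supplies σ.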

See also

Deborah.Rebekah.Comparison.check_overlap — interval overlap classifier (0/1/2).
Deborah.Rebekah.Comparison.check_overlap_type_b — asymmetric σ-normalized separation.
Deborah.Rebekah.Comparison.bhattacharyya_coeff_normals — symmetric overlap proxy using both variances.


Deborah.RebekahMiriam.ComparisonRebekahMiriam.build_overlap_error_and_ovl_dicts_for_measurements — Method

build_overlap_error_and_ovl_dicts_for_measurements(
    ext_dict::Dict{Tuple{Symbol, Symbol, Symbol, String}, Array{Float64,2}},
    keys::Vector{Symbol},
    kappa_list::Vector{String},
    pred_tags::Vector{Symbol},
    orig_tag::Symbol,
    labels::Vector,
    trains_ext::Vector;
    σ_floor::Float64=1e-12
) -> Tuple{
    Dict{Tuple{Symbol, Symbol, String}, Array{Int,2}},
    Dict{Tuple{Symbol, Symbol, String}, Array{Float64,2}},
    Dict{Tuple{Symbol, Symbol, String}, Array{Float64,2}}
}

Build overlap, error-ratio, and type-B distance dictionaries from measurement summaries for a single ensemble.

This variant uses kappa_list as the 4th-key dimension of ext_dict, i.e., data are indexed by 4-tuples

(observable_key, kind, tag, kappa_str)

with kind $\in$ {:avg, :err}.

Arguments

ext_dict: Summary dictionary returned by Deborah.RebekahMiriam.SummaryLoaderRebekahMiriam.load_miriam_summary_for_measurement.
keys: List of observable keys (e.g., [:trM1, :trM2, :trM3, :trM4] or capitalized variants).
kappa_list: $\kappa$ values as strings (dictionary dimension).
pred_tags: Prediction tags (e.g., [:T_P1, :T_P2]).
orig_tag: Original/reference tag (e.g., :T_BS).
labels: LBP ratios index axis.
trains_ext: TRP ratios index axis.
σ_floor: Small positive floor for uncertainty to avoid divide-by-zero in type-B distance.

Returns

chk_dict[(key, pred_tag, kappa_str)] :: Array{Int,2} → overlap codes (0/1/2) per (label, train).
err_dict[(key, pred_tag, kappa_str)] :: Array{Float64,2} → error ratios pred_err / orig_err per (label, train).
ovl_dict[(key, pred_tag, kappa_str)] :: Array{Float64,2} → type-B distances $d \equiv \dfrac{|\mu_{\text{orig}} - \mu_{\text{pred}}|}{\max(\sigma_{\text{orig}}, \sigma_{\text{floor}})}$ per (label, train).

Notes

ovl_dict uses asymmetric $\sigma$ scaling by design: distances are measured in units of the reference (original) $\sigma$.
It complements the coarse overlap code in chk_dict with a continuous $\sigma$-distance measure.

See also

build_overlap_error_and_ovl_dicts — keyword-based 4th dimension.
build_overlap_and_error_dicts_for_measurements — 2-return wrapper for backward compatibility.
