Deborah.Sarah.Bootstrap
Deborah.Sarah.Bootstrap.Plan4 — Type
const Plan4{T}
Type alias for a 4-tuple of vectors holding per-sample quantities (e.g., cumulants or their components):
Plan4{T} == NTuple{4, AbstractVector{T}}
Typical usage:
- Q_tuple::Plan4{Float64}: four length-N vectors containing raw values (e.g., $Q_n \; (n=1,2,3,4)$), indexed from 1 to N.
- ps_tuple::Plan4{Float64}: four length-(N+1) prefix-sum vectors corresponding to the above (e.g., ps[k] = sum(Q[1:k-1]) with ps[1] = 0.0).
Notes
- This alias is used to express the interface of batching functions that consume four related series of equal length (for Q_tuple) or their prefix sums (for ps_tuple).
Deborah.Sarah.Bootstrap.Plan5 — Type
Plan5{T} = NTuple{5, AbstractVector{T}}
Type alias for a 5-tuple of vectors holding per-sample series, typically $(Q_1, Q_2, Q_3, Q_4, w)$ where $w$ is a reweighting factor.
- Q_tuple::Plan5{Float64}: five length-N vectors of raw values.
- ps_tuple::Plan5{Float64}: five length-(N+1) prefix-sum vectors where ps[k+1] - ps[k] == Q[k] and ps[1] == 0.0.
This alias makes signatures concise for functions that operate on four related series plus a weight track.
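A minimal sketch of how such aliases behave, assuming they are defined exactly as stated above. The names Plan4Sketch and Plan5Sketch are hypothetical stand-ins, chosen so the snippet does not shadow the package's own definitions. Because Julia tuple types are covariant, a tuple of concrete Vector{Float64} values satisfies the alias.

```julia
# Hypothetical stand-ins for the aliases described above.
const Plan4Sketch{T} = NTuple{4, AbstractVector{T}}
const Plan5Sketch{T} = NTuple{5, AbstractVector{T}}

N = 6
Q_tuple = (rand(N), rand(N), rand(N), rand(N))        # four raw series of length N
ps_tuple = map(q -> vcat(0.0, cumsum(q)), Q_tuple)     # four prefix-sum vectors of length N + 1

# Tuple types are covariant, so concrete vectors match the abstract alias.
is_plan4 = Q_tuple isa Plan4Sketch{Float64}
is_plan5 = (Q_tuple..., ones(N)) isa Plan5Sketch{Float64}
```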
Deborah.Sarah.Bootstrap.block_sum — Method
block_sum(
ps::AbstractVector{<:Real},
i::Int,
len::Int,
N::Int
) -> Float64
Return the sum of a block of length len starting at index i, using the prefix-sum array ps. Supports circular wrap-around when the block exceeds N.
Arguments
- ps::AbstractVector{<:Real}: Prefix-sum array of length N+1.
- i::Int: $1$-based start index of the block.
- len::Int: Block length ($\ge 0$).
- N::Int: Population size (length of the original array).
Behavior
- If len == 0, returns 0.0.
- If i + len - 1 ≤ N, returns ps[i+len] - ps[i].
- Otherwise, wraps around and returns (ps[N+1] - ps[i]) + ps[(i+len-1) - N + 1], i.e., the tail i..N plus the first (i+len-1) - N elements.
Returns
Float64: Sum over the requested block.
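The behavior above can be sketched as follows. This is a hypothetical re-implementation for illustration (block_sum_sketch is not the package source). The wrapped part is computed as ps[r+1] - ps[1] for the r elements taken from the front, which is what the convention ps[k] = sum(arr[1:k-1]) implies.

```julia
# Hypothetical re-implementation for illustration; not the package source.
function block_sum_sketch(ps::AbstractVector{<:Real}, i::Int, len::Int, N::Int)
    len == 0 && return 0.0
    e = i + len - 1                          # index of the final element if no wrap were needed
    if e <= N
        return float(ps[e + 1] - ps[i])      # sum(arr[i:e])
    else
        r = e - N                            # number of elements taken from the front after wrapping
        return float((ps[N + 1] - ps[i]) + (ps[r + 1] - ps[1]))
    end
end

arr = [1.0, 2.0, 3.0, 4.0, 5.0]
ps = vcat(0.0, cumsum(arr))                  # ps[k] == sum(arr[1:k-1])

inside  = block_sum_sketch(ps, 2, 3, 5)      # 2 + 3 + 4 = 9.0
wrapped = block_sum_sketch(ps, 4, 3, 5)      # 4 + 5 + 1 = 10.0
```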
Deborah.Sarah.Bootstrap.bootstrap_average_error — Method
bootstrap_average_error(
arr::AbstractArray
) -> (Real, Real)
Compute the mean and standard deviation (used as the bootstrap error estimate) from an array of bootstrap samples.
Arguments
arr::AbstractArray: Vector of bootstrap samples
Returns
A tuple:
- m: Mean of the samples.
- s: Standard deviation with Bessel's correction.
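As a rough illustration of the returned pair, the sketch below uses the Statistics standard library, whose std applies Bessel's correction by default. The function name bootstrap_average_error_sketch is hypothetical, and whether the package delegates to Statistics internally is an assumption.

```julia
using Statistics

# Hypothetical stand-in: mean and Bessel-corrected standard deviation of the replicates.
bootstrap_average_error_sketch(arr::AbstractArray) = (mean(arr), std(arr))

samples = [1.0, 2.0, 3.0, 4.0]
m, s = bootstrap_average_error_sketch(samples)   # m = 2.5, s = sqrt(5/3)
```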
Deborah.Sarah.Bootstrap.bootstrap_average_error_from_raw — Method
bootstrap_average_error_from_raw(
arr::AbstractVector,
N_bs::Int,
block::Int,
rng::Random.AbstractRNG;
method::String = "nonoverlapping"
) -> (Real, Real)
Estimate the mean and its bootstrap error from a single raw series arr using block bootstrap with a given block size and method.
This routine builds (or reuses) a block-start plan, computes the resampled mean for each bootstrap replicate via mean_from_plan, and finally aggregates the bootstrap estimate and its standard error.
Arguments
- arr::AbstractVector: Raw data of length N_all. Must be non-empty.
- N_bs::Int: Number of bootstrap replicates.
- block::Int: Block size (must satisfy 1 ≤ block ≤ length(arr)).
- rng::Random.AbstractRNG: RNG used to generate the block-start plan.
- method::String (keyword; default "nonoverlapping"): Blocking scheme. Canonical names:
  - "moving": moving blocks (uses prefix sums)
  - "nonoverlapping": non-overlapping blocks (uses prefix sums)
  - "circular": circular blocks with wrap-around (no prefix sums)
  Aliases are resolved by normalize_method: "nbb" → "nonoverlapping", "cbb" → "circular".
Behavior
- Validates inputs: non-empty arr; block within [1, length(arr)].
- Normalizes method via normalize_method.
- Builds ps = prefix_sums(arr) for prefix-sum methods; uses nothing for "circular".
- Obtains (starts, nblk, lastlen) from ensure_plan(...).
- For each ibs $\in$ 1:N_bs, computes the replicate mean by mean_from_plan(ps, arr, starts, N_all, block, nblk, lastlen, ibs; method).
- Aggregates (m, s) = bootstrap_average_error(mean_arr) and returns them.
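The steps above can be sketched end to end for the "moving" scheme only. Everything here is a simplified, hypothetical stand-in (moving_block_mean_error_sketch is not a package function): the plan is drawn inline instead of going through ensure_plan, and the replicate mean is computed inline instead of through mean_from_plan.

```julia
using Random, Statistics

# Hypothetical, simplified stand-in restricted to the "moving" scheme.
function moving_block_mean_error_sketch(arr::AbstractVector{<:Real}, N_bs::Int,
                                        block::Int, rng::AbstractRNG)
    N = length(arr)
    N > 0 || throw(ArgumentError("arr must be non-empty"))
    1 <= block <= N || throw(ArgumentError("block must lie in [1, length(arr)]"))

    ps = vcat(0.0, cumsum(arr))                          # prefix sums, length N + 1
    nblk = cld(N, block)                                 # blocks per replicate
    lastlen = N - block * (nblk - 1)                     # length of the final block
    starts = rand(rng, 1:(N - block + 1), N_bs, nblk)    # one row of block starts per replicate

    means = Vector{Float64}(undef, N_bs)
    for ibs in 1:N_bs
        acc = 0.0
        for k in 1:nblk
            len = k == nblk ? lastlen : block            # the final block may be shorter
            s = starts[ibs, k]
            acc += ps[s + len] - ps[s]                   # O(1) block sum
        end
        means[ibs] = acc / N
    end
    return mean(means), std(means)
end

rng = MersenneTwister(1234)
data = randn(rng, 2_000) .+ 3.0
m, s = moving_block_mean_error_sketch(data, 500, 20, rng)
```

For uncorrelated data such as this, s should come out close to std(data) / sqrt(length(data)); larger blocks matter when the series is autocorrelated.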
Returns
- (m, s) :: (Real, Real):
  - m: Bootstrap estimate of the mean of arr.
  - s: Bootstrap standard error of the mean.
Notes
- Contracts:
  - length(arr) == N_all.
  - ensure_plan must produce starts::Matrix{Int} of size N_bs $\times$ nblk with valid $1$-based start indices for the selected method.
  - For prefix-sum methods, ps must satisfy length(ps) == N_all + 1.
- Complexity:
  - Plan generation: depends on ensure_plan.
  - Per replicate: O(nblk) for prefix-sum methods; O(nblk * block) for circular.
- Performance:
  - Uses Base.@inbounds in the replicate loop.
  - The heavy lifting is delegated to mean_from_plan, keeping logic de-duplicated.
Examples
using Random  # provides MersenneTwister

m, s = bootstrap_average_error_from_raw(
    randn(10_000),
    1000,
    50,
    MersenneTwister();
    method="nbb"  # alias normalized to "nonoverlapping"
)
Deborah.Sarah.Bootstrap.ensure_plan — Method
ensure_plan(
provided::Union{Nothing, AbstractMatrix{<:Integer}},
rng::Random.AbstractRNG,
N::Int,
block::Int,
nbs::Int;
method::String
) -> (Union{Nothing, Matrix{Int}}, Int, Int)
Ensure a valid block-start plan matrix for the given configuration. If provided is nothing or has an incompatible shape, allocate and (re)generate a plan.
Arguments
- provided::Union{Nothing, AbstractMatrix{<:Integer}}: Optional existing plan of shape nbs $\times$ nblk.
- rng::Random.AbstractRNG: RNG instance.
- N::Int: Population size.
- block::Int: Block size.
- nbs::Int: Number of bootstrap replicates (rows).
- method::String: Plan generation method (as in gen_block_plan!).
Behavior
- Computes (nblk, lastlen) = nblk_lastlen(N, block).
- If N == 0, returns (nothing, nblk, lastlen).
- If provided is nothing or not of size nbs $\times$ nblk, a new starts is allocated and filled via gen_block_plan!; otherwise returns provided.
Returns
- (starts, nblk, lastlen) where:
  - starts::Union{Nothing, Matrix{Int}}: Valid plan (or nothing if N == 0).
  - nblk::Int: Number of blocks per replicate.
  - lastlen::Int: Length of the final (possibly short) block.
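The reuse-or-regenerate rule can be illustrated with a small sketch. ensure_plan_sketch is a hypothetical stand-in: it omits the method keyword and always draws moving-block starts, so only the shape check and the N == 0 shortcut mirror the description above.

```julia
using Random

# Hypothetical stand-in; plan generation here is a plain moving-block draw.
function ensure_plan_sketch(provided::Union{Nothing, AbstractMatrix{<:Integer}},
                            rng::AbstractRNG, N::Int, block::Int, nbs::Int)
    nblk = N == 0 ? 0 : cld(N, block)
    lastlen = N == 0 ? 0 : N - block * (nblk - 1)
    N == 0 && return (nothing, nblk, lastlen)
    if provided === nothing || size(provided) != (nbs, nblk)
        starts = rand(rng, 1:(N - block + 1), nbs, nblk)   # incompatible or absent: regenerate
        return (starts, nblk, lastlen)
    end
    return (provided, nblk, lastlen)                        # compatible shape: reuse as-is
end

rng = MersenneTwister(42)
plan1, nblk, lastlen = ensure_plan_sketch(nothing, rng, 10, 3, 5)
plan2, _, _ = ensure_plan_sketch(plan1, rng, 10, 3, 5)      # same shape, so plan1 is reused
plan3, _, _ = ensure_plan_sketch(plan1, rng, 10, 4, 5)      # nblk changes, so a new plan is drawn
```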
Deborah.Sarah.Bootstrap.gen_block_plan! — Method
gen_block_plan!(
starts::Matrix{Int},
rng::Random.AbstractRNG,
N::Int,
blk::Int,
nblk::Int;
method::String="moving"
) -> Matrix{Int}
Generate a plan of block start indices for block bootstrap. Each row of starts holds nblk start positions for one bootstrap replicate.
Arguments
- starts::Matrix{Int}: Preallocated N_bs $\times$ nblk matrix to fill.
- rng::Random.AbstractRNG: RNG instance.
- N::Int: Population size ($\ge 0$).
- blk::Int: Block size ($\ge 1$).
- nblk::Int: Number of blocks per replicate.
- method::String="moving": One of "moving", "nonoverlapping" (or "nbb"), "circular" (or "cbb"). blk == 1 implies i.i.d. sampling.
Behavior
- blk == 1 (i.i.d.): each start is sampled i.i.d. from 1:N (with replacement).
- "moving": starts sampled i.i.d. from 1:(N-blk+1) (no wrap).
- "nonoverlapping": use only N_eff = div(N, blk)*blk; starts are drawn with replacement from {1, 1+blk, …, 1+(div(N,blk)-1)*blk} (no wrap).
- "circular": starts sampled i.i.d. from 1:N (wrap applied when summing).
- If N == 0, returns starts unchanged.
Returns
- Matrix{Int}: The filled starts matrix (same object).
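A minimal sketch of the sampling rules listed above, using rand! from the Random standard library. gen_block_plan_sketch! is a hypothetical stand-in that accepts canonical method names only (no "nbb"/"cbb" aliases) and raises a plain ArgumentError instead of the package's own error helper.

```julia
using Random

# Hypothetical stand-in following the sampling rules listed above.
function gen_block_plan_sketch!(starts::Matrix{Int}, rng::AbstractRNG,
                                N::Int, blk::Int, nblk::Int; method::String = "moving")
    N == 0 && return starts
    if blk == 1 || method == "circular"
        rand!(rng, starts, 1:N)                                  # any index may start a block
    elseif method == "moving"
        rand!(rng, starts, 1:(N - blk + 1))                      # block must fit without wrapping
    elseif method == "nonoverlapping"
        rand!(rng, starts, 1:blk:(1 + (div(N, blk) - 1) * blk))  # block-aligned starts only
    else
        throw(ArgumentError("Unknown method = $method"))
    end
    return starts
end

rng = MersenneTwister(7)
buf = zeros(Int, 4, 4)
plan_nbb = gen_block_plan_sketch!(buf, rng, 10, 3, 4; method = "nonoverlapping")
plan_mv  = gen_block_plan_sketch!(zeros(Int, 4, 4), rng, 10, 3, 4)   # default "moving"
```

With N = 10 and blk = 3, the non-overlapping starts are restricted to 1, 4 and 7, while moving-block starts range over 1:8.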
Deborah.Sarah.Bootstrap.mean_from_plan — Method
mean_from_plan(
ps::Union{Nothing,AbstractVector{<:Real}},
arr::AbstractVector{<:Real},
starts::Union{Nothing,AbstractMatrix{<:Integer}},
N::Int,
blk::Int,
nblk::Int,
lastlen::Int,
ibs::Int;
method::String
) -> Real
Compute the block-resampled mean for a single bootstrap replicate ibs using a pre-generated start-index plan.
Arguments
- ps::Union{Nothing,AbstractVector{<:Real}}: Optional prefix-sum array of arr with length N+1, where ps[k] = sum(arr[1:k-1]). Required for the "moving" and "nonoverlapping" methods. Ignored for "circular" and for the trivial case blk == 1.
- arr::AbstractVector{<:Real}: Source data of length N to be block-resampled.
- starts::Union{Nothing,AbstractMatrix{<:Integer}}: Block start indices of shape nbs $\times$ nblk ($1$-based). Row ibs encodes the sequence of block starts for the replicate. May be nothing when N == 0.
- N::Int: Population size, i.e. length(arr).
- blk::Int: Nominal block size.
- nblk::Int: Number of blocks per replicate (including the final possibly short block).
- lastlen::Int: Length of the final block (may be 0 if the last block is absent; otherwise 1 $\le$ lastlen $\le$ blk).
- ibs::Int: $1$-based index of the bootstrap replicate row to use from starts.
- method::String: Block scheme. One of "moving", "nonoverlapping" (uses ps), or "circular" (wrap-around modulo indexing).
Behavior
- Early exit: if N == 0 or starts === nothing or nblk == 0, returns 0.0.
- If blk == 1, gathers nblk elements directly at starts[ibs, k] and returns their sum divided by N.
- For "moving" | "nonoverlapping": uses prefix sums to accumulate (nblk-1) full blocks of length blk and, if lastlen > 0, one final block of length lastlen. Returns the total divided by N. Requires ps !== nothing.
- For "circular": sums each block by explicit modulo indexing i = (s + j - 1) % N + 1, first for the (nblk-1) full blocks of length blk, then the optional final block of length lastlen. Returns the total divided by N.
- Throws JobLoggerTools.error_benji("Unknown method = $method") for unrecognized methods.
Returns
- mean::Real: The resampled mean for replicate ibs, normalized by N.
Notes
- Contracts/assumptions:
  - length(arr) == N.
  - starts has shape nbs $\times$ nblk, and ibs $\in$ 1:nbs.
  - Start indices are $1$-based and valid for the chosen method.
  - For prefix-sum methods, ps must satisfy length(ps) == N+1.
- Complexity: O(nblk) for prefix-sum methods (moving/nonoverlapping); O(nblk * blk) for circular methods.
- Performance: Uses Base.@view for srow = starts[ibs, :] and Base.@inbounds in inner loops.
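To make the two accumulation paths concrete, here is a small sketch that takes one row of starts directly. Both helper names are hypothetical, and the early-exit and error branches are omitted. When no block needs to wrap, the prefix-sum path and the modulo-indexing path give the same value.

```julia
# Hypothetical stand-ins for the two accumulation paths described above.
function mean_prefix_sketch(ps, srow, N, blk, nblk, lastlen)
    acc = 0.0
    for k in 1:nblk
        len = k == nblk ? lastlen : blk       # final block uses lastlen
        s = srow[k]
        acc += ps[s + len] - ps[s]            # O(1) per block
    end
    return acc / N
end

function mean_circular_sketch(arr, srow, N, blk, nblk, lastlen)
    acc = 0.0
    for k in 1:nblk
        len = k == nblk ? lastlen : blk
        s = srow[k]
        for j in 0:(len - 1)
            acc += arr[(s + j - 1) % N + 1]   # wraps past N back to 1
        end
    end
    return acc / N
end

arr = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
N, blk = length(arr), 3
nblk = cld(N, blk)                   # 3 blocks
lastlen = N - blk * (nblk - 1)       # final block has 1 element
ps = vcat(0.0, cumsum(arr))

srow = [2, 5, 4]                     # starts that need no wrap
a = mean_prefix_sketch(ps, srow, N, blk, nblk, lastlen)       # (9 + 18 + 4) / 7
b = mean_circular_sketch(arr, srow, N, blk, nblk, lastlen)    # same blocks, direct indexing

wrapped = mean_circular_sketch(arr, [6, 1, 7], N, blk, nblk, lastlen)  # (14 + 6 + 7) / 7
```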
Deborah.Sarah.Bootstrap.nblk_lastlen — Method
nblk_lastlen(
N::Int,
block::Int
) -> (Int, Int)
Compute the number of blocks nblk and the trailing block length lastlen needed to cover N items with blocks of size block.
Arguments
- N::Int: Population size ($\ge 0$).
- block::Int: Block size ($\ge 1$).
Behavior
- Uses nblk = cld(N, block).
- The final block may be shorter: lastlen = N - block*(nblk - 1).
- If N == 0, returns (0, 0).
Returns
- (Int, Int): (nblk, lastlen).
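A short worked sketch of the arithmetic above; nblk_lastlen_sketch is a hypothetical name used only for illustration.

```julia
# Hypothetical stand-in implementing the two formulas above.
function nblk_lastlen_sketch(N::Int, block::Int)
    N == 0 && return (0, 0)
    nblk = cld(N, block)                      # ceiling division
    return (nblk, N - block * (nblk - 1))     # last block holds the remainder
end

r1 = nblk_lastlen_sketch(10, 3)   # (4, 1): blocks of 3, 3, 3, 1
r2 = nblk_lastlen_sketch(9, 3)    # (3, 3): the last block is full
r3 = nblk_lastlen_sketch(2, 5)    # (1, 2): a single short block
```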
Deborah.Sarah.Bootstrap.normalize_method — Method
normalize_method(
m::AbstractString
) -> String
Normalize bootstrap/blocking method aliases to canonical names.
Arguments
- m::AbstractString: Input method name or alias. Recognized aliases:
  - "nbb" → "nonoverlapping"
  - "cbb" → "circular"
Behavior
- Returns the canonical method string if a known alias is provided.
- Otherwise returns String(m) as-is.
Returns
method::String: Canonical method name.
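The mapping is small enough to sketch directly. normalize_method_sketch is a hypothetical stand-in; the exact-match, case-sensitive comparison is an assumption, since the description above does not say how case is handled.

```julia
# Hypothetical stand-in; assumes exact, case-sensitive alias matching.
function normalize_method_sketch(m::AbstractString)
    m == "nbb" && return "nonoverlapping"
    m == "cbb" && return "circular"
    return String(m)                 # unknown names pass through unchanged
end

canon = normalize_method_sketch("cbb")
passthrough = normalize_method_sketch(SubString("moving", 1, 6))   # any AbstractString is accepted
```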
Examples
normalize_method("nbb") == "nonoverlapping"  # true
normalize_method("moving") == "moving"  # true
Deborah.Sarah.Bootstrap.prefix_sums — Method
prefix_sums(
arr::AbstractVector{<:Real}
) -> Vector{Float64}
Compute a one-based prefix-sum array for arr to enable $O(1)$ block-sum queries.
Arguments
arr::AbstractVector{<:Real}: Input data.
Behavior
- Returns a vector ps of length length(arr)+1 with ps[1] = 0.0 and ps[i+1] = ps[i] + arr[i] for i = 1:length(arr).
Returns
- Vector{Float64}: Prefix-sum array ps such that the sum of arr[i:j] equals ps[j+1] - ps[i].
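The recurrence and the block-sum identity can be checked with a few lines. prefix_sums_sketch is a hypothetical re-implementation that assumes a standard 1-based vector.

```julia
# Hypothetical re-implementation of the recurrence described above.
function prefix_sums_sketch(arr::AbstractVector{<:Real})
    ps = Vector{Float64}(undef, length(arr) + 1)
    ps[1] = 0.0
    for i in 1:length(arr)
        ps[i + 1] = ps[i] + arr[i]
    end
    return ps
end

arr = [2, 4, 6, 8]
ps = prefix_sums_sketch(arr)      # [0.0, 2.0, 6.0, 12.0, 20.0]
block = ps[4] - ps[2]             # sum(arr[2:3]) = 10.0
```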
Deborah.Sarah.Bootstrap.update_mean_from_plan! — Method
update_mean_from_plan!(
Q_tuple::Plan5{Float64},
ps_tuple::Plan5{Float64},
starts::Matrix{Int},
N::Int,
blk::Int,
nblk::Int,
lastlen::Int,
outmat::Matrix{Float64},
ibs::Int;
method::String = "nonoverlapping",
) -> nothing
Accumulate block-averaged means for five related series $(Q_1, Q_2, Q_3, Q_4, w)$ into outmat[1:5, ibs], using either direct indexing (blk == 1 or "circular") or prefix sums ("nonoverlapping" / "moving").
Arguments
- Q_tuple: Plan5{Float64} of raw series; each vector must have length N.
- ps_tuple: Plan5{Float64} of prefix sums corresponding to Q_tuple; each vector must have length N+1 with ps[k+1] - ps[k] == Q[k].
- starts: Matrix of $1$-based start indices for each block; row ibs is used.
- N: Total number of samples per series.
- blk: Nominal block length (except possibly the last).
- nblk: Number of blocks (including the possibly shorter last block).
- lastlen: Length of the last block when shorter than blk.
- outmat: Output matrix with at least 5 rows; column ibs is overwritten.
- ibs: Column index into outmat to write results.
- method: "nonoverlapping" / "moving" (prefix-sum path) or "circular" (modular direct indexing). Any other string throws an error.
Behavior
- For blk == 1: sums the point values directly at starts[ibs, k].
- For "nonoverlapping" / "moving": uses prefix sums for $O(1)$ block sums, i.e., sum(Q[s:e]) == ps[e+1] - ps[s].
- For "circular": uses modular indexing i = (s + j - 1) % N + 1 for block spans.
- Final accumulators are divided by N (assumed total length) and stored in outmat[1:5, ibs].
Assumptions
- length.(Q_tuple) == (N, N, N, N, N)
- length.(ps_tuple) == (N+1, N+1, N+1, N+1, N+1)
- starts[ibs, 1:nblk] are valid start indices under the chosen method.
- The denominator is N (total series length), not the total block length.
Returns
- Nothing (mutates outmat in place).
Throws
- JobLoggerTools.error_benji("Unknown method = $method") for unsupported method values.
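The following sketch illustrates the prefix-sum path for five series written into one column of an output matrix. update_means_sketch! is a hypothetical, reduced stand-in: it takes only the prefix sums, skips the blk == 1 and "circular" branches, and does no validation.

```julia
# Hypothetical stand-in covering only the prefix-sum path.
function update_means_sketch!(outmat::Matrix{Float64},
                              ps_tuple::NTuple{5, AbstractVector{Float64}},
                              starts::Matrix{Int}, N::Int, blk::Int, nblk::Int,
                              lastlen::Int, ibs::Int)
    for (row, ps) in enumerate(ps_tuple)
        acc = 0.0
        for k in 1:nblk
            len = k == nblk ? lastlen : blk
            s = starts[ibs, k]
            acc += ps[s + len] - ps[s]       # O(1) block sum
        end
        outmat[row, ibs] = acc / N           # denominator is N, as stated above
    end
    return nothing
end

N, blk = 6, 2
nblk, lastlen = 3, 2
Q = (collect(1.0:6.0), fill(2.0, N), zeros(N), collect(6.0:-1.0:1.0), ones(N))
ps = map(q -> vcat(0.0, cumsum(q)), Q)       # five prefix-sum vectors of length N + 1
starts = [1 3 5; 5 5 5]                      # row 1 reproduces the original series order
out = zeros(5, 2)
update_means_sketch!(out, ps, starts, N, blk, nblk, lastlen, 1)
update_means_sketch!(out, ps, starts, N, blk, nblk, lastlen, 2)
```

Column 1 recovers the plain means of the five series, because its starts tile the data exactly once; column 2 repeats the block starting at index 5 three times.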
Deborah.Sarah.Bootstrap.update_mean_from_plan4! — Method
update_mean_from_plan4!(
Q_tuple::Plan4{Float64},
ps_tuple::Plan4{Float64},
starts::Union{Nothing, AbstractMatrix{<:Integer}},
N::Int, blk::Int, nblk::Int, lastlen::Int,
outmat::Matrix{Float64}, ibs::Int;
method::String = "nonoverlapping"
) -> nothing
Accumulate block-averaged means for four related series $(Q_1, Q_2, Q_3, Q_4)$ into outmat[:, ibs], using either direct indexing or prefix sums depending on the block-sampling method.
Arguments
- Q_tuple: Plan4{Float64} of raw series. Each vector must have length N.
- ps_tuple: Plan4{Float64} of prefix sums for the corresponding raw series. Each vector must have length N + 1 and satisfy ps[k+1] - ps[k] == Q[k].
- starts: Either nothing (degenerate case) or an integer matrix where row ibs contains $1$-based start indices of each block (size ≥ nblk).
- N: Total number of samples per series.
- blk: Nominal block length (for all but possibly the last block).
- nblk: Number of blocks (including the possibly shorter last block).
- lastlen: Length of the last block; used when the last block is shorter than blk.
- outmat: Output matrix of size at least 4 × ?; column ibs is overwritten with the four accumulated means (one per row).
- ibs: Column index in outmat to write results into.
- method: One of "nonoverlapping", "moving", "nbb" (prefix-sum paths), or "circular", "cbb" (circular modular indexing). Any other string throws.
Behavior
- If N == 0, starts === nothing, or nblk == 0, the function writes zeros to outmat[1:4, ibs] and returns.
- For "nonoverlapping", "moving", "nbb":
  - Accumulation uses prefix sums for $O(1)$ block-sum queries: sum(Q[s:e]) == ps[e+1] - ps[s].
- For "circular", "cbb":
  - Accumulation uses modular indexing over Q_tuple: i = (s + j - 1) % N + 1 for j = 0:(len-1).
- The final four accumulators are divided by N and stored in outmat[1:4, ibs].
Requirements / Assumptions
- length.(Q_tuple) == (N, N, N, N)
- length.(ps_tuple) == (N+1, N+1, N+1, N+1)
- starts[ibs, k] is valid for k = 1:nblk, and each block range is within 1:N after applying the chosen method’s indexing rule.
- outmat has at least 4 rows and a valid column ibs.
Returns
- Nothing (mutates outmat in place).
Complexity
- "nonoverlapping", "moving", "nbb": O(nblk) due to prefix-sum use.
- "circular", "cbb": O(nblk * blk) (with lastlen in place of blk for the final block).
Throws
- JobLoggerTools.error_benji("Unknown method = $method") if method is not one of the recognized options.
Example
Q1, Q2, Q3, Q4 = CumulantsBundleUtils.flatten_Q4_columns(Q_Y_ORG) # each length N
ps = (Bootstrap.prefix_sums(Q1), Bootstrap.prefix_sums(Q2),
Bootstrap.prefix_sums(Q3), Bootstrap.prefix_sums(Q4)) # each length N+1
out = zeros(4, nboots)
update_mean_from_plan4!((Q1, Q2, Q3, Q4), ps, starts_all, N, blk, nblk, lastlen,
out, ibs; method="nonoverlapping")
Notes
- For robustness, you may add:
JobLoggerTools.assert_benji(
all(length.(Q_tuple) .== N),
"length.(Q_tuple) must all equal N"
)
JobLoggerTools.assert_benji(
all(length.(ps_tuple) .== N .+ 1),
"length.(ps_tuple) must all equal N+1"
)
near the top of the function (disabled in production if needed).