Deborah.Sarah.DataLoader

Deborah.Sarah.DataLoader.load_data_file - Function
load_data_file(
    path::String, 
    key::String, 
    jobid::Union{Nothing, String}=nothing
) -> Matrix{Float64}

Load a tab-delimited data file and return its contents as a matrix of Float64.

Arguments

  • path::String: Directory path where the data file is located.
  • key::String: Filename of the data file to be loaded.
  • jobid::Union{Nothing, String}: Optional job identifier for structured logging.

Returns

  • Matrix{Float64}: A 2D array containing the parsed data from the file.
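
As a rough illustration of the interface described above, the following minimal sketch reads a tab-delimited file into a Matrix{Float64}. The name load_data_file_sketch is hypothetical, and the sketch assumes the file lives at joinpath(path, key); it omits the jobid logging argument and is not the package's actual implementation.

```julia
using DelimitedFiles

# Hypothetical stand-in for illustration only; not the package's implementation.
# Assumption: the file to read is located at joinpath(path, key).
function load_data_file_sketch(path::String, key::String)::Matrix{Float64}
    full_path = joinpath(path, key)
    isfile(full_path) || error("data file not found: $full_path")
    # Parse every tab-separated field as Float64; throws on non-numeric fields.
    return readdlm(full_path, '\t', Float64)
end
```

Under these assumptions, calling load_data_file_sketch("some_dir", "some_file.dat") would return one matrix row per line of the file.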

Notes

Deborah.Sarah.DataLoader.try_multi_readdlm - Function
try_multi_readdlm(
    path::String, 
    jobid::Union{Nothing, String}=nothing
) -> Matrix{Float64}

Efficiently attempt to read a delimited numeric data file using a fast primary strategy with a lightweight fallback parser.

This function first tries to read the file as a tab-delimited matrix using DelimitedFiles.readdlm. If that fails, it falls back to a manual parser that splits lines using common delimiters and extracts only numeric values, skipping any non-numeric tokens (e.g., labels or tags).

Compared to the original robust version, this version prioritizes speed and assumes the file is mostly well-formed, with numeric values and consistent row lengths. It still supports fallback parsing for mixed-format dumps (such as Y_info), but uses map and filter for better performance.

Parsing Strategy

  1. Attempt DelimitedFiles.readdlm(path, '\t', Float64) (tab-delimited).
  2. If it fails:
    • Read lines and split by regex: [, ; ]+
    • Skip non-numeric tokens using tryparse(Float64, token)
    • Parse numeric tokens and collect into rows
    • Assert all rows have the same number of numeric columns
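
The two-stage strategy above can be sketched roughly as follows. The names read_numeric_sketch and fallback_parse_sketch are hypothetical, the jobid logging is left out, and skipping lines that contain no numeric tokens is an assumption of this sketch rather than documented behavior.

```julia
using DelimitedFiles

# Hypothetical fallback parser: keep only the tokens that parse as Float64.
function fallback_parse_sketch(path::AbstractString)::Matrix{Float64}
    rows = Vector{Vector{Float64}}()
    for line in eachline(path)
        # Split on commas, semicolons, and spaces (same character set as [, ; ]+).
        tokens = split(line, r"[,; ]+"; keepempty=false)
        parsed = map(t -> tryparse(Float64, t), tokens)
        values = Float64[v for v in parsed if v !== nothing]
        # Assumption: lines without any numeric token are skipped.
        isempty(values) || push!(rows, values)
    end
    isempty(rows) && error("no numeric data found in $path")
    ncols = length(first(rows))
    all(r -> length(r) == ncols, rows) ||
        error("inconsistent number of numeric columns in $path")
    # Copy the collected rows into an (N_rows, N_columns) matrix.
    mat = Matrix{Float64}(undef, length(rows), ncols)
    for (i, r) in enumerate(rows)
        mat[i, :] = r
    end
    return mat
end

# Fast path first; fall back to the manual parser only if readdlm throws.
function read_numeric_sketch(path::AbstractString)::Matrix{Float64}
    try
        return readdlm(path, '\t', Float64)
    catch
        return fallback_parse_sketch(path)
    end
end
```

In this sketch a well-formed tab-delimited file never reaches the manual parser, so the cost of regex splitting and tryparse is paid only for irregular files.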

Arguments

  • path::String : Path to the target .dat or text file to read.
  • jobid::Union{Nothing, String} : Optional job identifier for structured logging.

Returns

  • Matrix{Float64} : A matrix of parsed numeric values with shape (N_rows, N_columns).

Errors

  • Throws an error if:
    • The tab-delimited read fails and the fallback parser also fails.
    • The number of numeric values per row is inconsistent during fallback.