localuf.data_processors

Functions for processing numerical data, mainly output from sim.

Functions

add_ignored_timesteps(data[, ...])

Add ignored timesteps to data.

get_failure_data(data[, p_slice, alpha, method])

Get failure stats from output of sim.accuracy.monte_carlo.

get_failure_data_from_subset_sample(data, ...)

Get failure stats from output of sim.accuracy.subset_sample.

get_log_runtime_data(data)

Get log runtime data from runtime data.

get_stats(log_data[, missing])

Get WLS (weighted least squares) stats from output of either get_log_runtime_data OR get_failure_data.

localuf.data_processors.get_failure_data(data, p_slice=slice(None, None, None), alpha=np.float64(0.31731050786291415), method='wilson')[source]

Get failure stats from output of sim.accuracy.monte_carlo.

Return dT:

A DataFrame where each row is a (distance, probability) pair; columns are:

  • f logical failure rate

  • lo lower confidence bound of f

  • hi upper confidence bound of f

  • x log10(p)

  • y log10(f)

  • yerr half the width of the confidence interval of y.

Return type:

DataFrame
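The columns listed above can be illustrated with a minimal standard-library sketch. It is not localuf's implementation: the helper names `wilson_interval` and `failure_row` are hypothetical, the Wilson score interval is computed by hand, and the formula used for yerr (half the width of the interval after taking log10) is an assumption based on the column description. The default alpha in the signature is the two-sided tail probability of a normal distribution at one standard deviation, so the default bounds behave like one-sigma error bars.

```python
from math import log10, sqrt
from statistics import NormalDist

# Default alpha from the signature above: 1 - 0.6827, i.e. a one-sigma interval.
ALPHA = 0.31731050786291415


def wilson_interval(k, n, alpha=ALPHA):
    """Return (f, lo, hi) for k failures in n shots using the Wilson score interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.0 for the default alpha
    f = k / n
    centre = (k + z**2 / 2) / (n + z**2)
    half = z / (n + z**2) * sqrt(k * (n - k) / n + z**2 / 4)
    return f, centre - half, centre + half


def failure_row(p, k, n, alpha=ALPHA):
    """Build one row of stats for noise level p (hypothetical helper; needs k > 0)."""
    f, lo, hi = wilson_interval(k, n, alpha)
    return {
        "f": f,
        "lo": lo,
        "hi": hi,
        "x": log10(p),
        "y": log10(f),
        # ASSUMPTION: yerr is half the width of the interval on the log10 scale.
        "yerr": (log10(hi) - log10(lo)) / 2,
    }


row = failure_row(p=1e-2, k=50, n=10_000)
print(row)
```

The sketch requires at least one observed failure: with k = 0 both f and the Wilson lower bound are zero, so the log10 values are undefined.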

localuf.data_processors.get_failure_data_from_subset_sample(data, code_class, noise, noise_levels, alpha=np.float64(0.31731050786291415), method='normal')[source]

Get failure stats from output of sim.accuracy.subset_sample.

Returns:

A DataFrame indexed by (distance, noise level), with columns:

  • f logical error probability.

  • lo lower bound of f.

  • hi upper bound of f.

Parameters:
  • data (DataFrame)

  • code_class (Type[Code])

  • noise (Literal['code capacity', 'phenomenological', 'circuit-level'])

  • noise_levels (ndarray[tuple[Any, ...], dtype[float64]])

  • alpha (float)

Side effects: adds columns f, SE_lo, SE_hi to data.

localuf.data_processors.get_log_runtime_data(data)[source]

Get log runtime data from runtime data.

data is the output of sim.runtime.batch.

Returns log_data:

A DataFrame where each column is a stat (x, y, or yerr) and each row is a (probability, distance) pair.

Parameters:

data (DataFrame)

localuf.data_processors.get_stats(log_data, missing='drop', **kwargs_for_WLS)[source]

Get WLS (weighted least squares) stats from output of either get_log_runtime_data OR get_failure_data.

Parameters:
  • log_data (DataFrame) – output from get_log_runtime_data or get_failure_data.

  • missing – passed to statsmodels.regression.linear_model.WLS.

  • kwargs_for_WLS – passed to statsmodels.regression.linear_model.WLS.

Return stats:

A DataFrame where each row is an x-value (probability OR distance); columns are:

  • intercept

  • se_intercept

  • gradient

  • se_gradient

  • r_squared
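To show what the five output columns mean, here is a minimal pure-Python sketch of a weighted straight-line fit of y against x. It is not the function's implementation, which passes its arguments to statsmodels' WLS. The helper name `wls_line`, the choice of weights 1/yerr², and the convention of scaling standard errors by the weighted residual variance are all assumptions made for illustration.

```python
from math import sqrt


def wls_line(x, y, yerr):
    """Fit y = intercept + gradient * x by weighted least squares (needs len(x) > 2)."""
    w = [1 / e**2 for e in yerr]  # ASSUMPTION: weights are inverse variances
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx**2

    gradient = (sw * sxy - sx * sy) / det
    intercept = (sxx * sy - sx * sxy) / det

    # Weighted residual sum of squares and the residual variance estimate.
    ssr = sum(wi * (yi - intercept - gradient * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    scale = ssr / (len(x) - 2)

    # Weighted total sum of squares about the weighted mean of y.
    y_mean = sy / sw
    tss = sum(wi * (yi - y_mean) ** 2 for wi, yi in zip(w, y))

    return {
        "intercept": intercept,
        "se_intercept": sqrt(scale * sxx / det),
        "gradient": gradient,
        "se_gradient": sqrt(scale * sw / det),
        "r_squared": 1 - ssr / tss,
    }


# Illustrative numbers only, not real simulation output.
stats = wls_line(x=[0.0, 1.0, 2.0, 3.0], y=[1.0, 3.1, 4.9, 7.0], yerr=[0.1, 0.1, 0.2, 0.2])
print(stats)
```

Points with a smaller yerr pull the fitted line harder, which is why the yerr column produced by get_log_runtime_data or get_failure_data matters for the fit.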

localuf.data_processors.add_ignored_timesteps(data, extra_steps_per_layer=2, layers_per_sample=<function <lambda>>)[source]

Add ignored timesteps to data.

Parameters:
  • data (DataFrame) – a DataFrame where each column is a (distance, noise level) pair and each row is a runtime sample.

  • extra_steps_per_layer – number of ignored timesteps per layer in the decoding graph.

  • layers_per_sample (Callable[[int], int]) – a function with input d that outputs the measurement round count per sample.

E.g. if data is the output of sim.runtime.frugal with time_only='merging', using Snowflake with the 1:1 schedule, the default kwargs of this function convert it to the analogous output of sim.runtime.frugal with time_only='all'. This is because the extra timesteps per layer are one drop and one grow. If using Snowflake with the 2:1 schedule, set extra_steps_per_layer=3, as there is one additional grow timestep.
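The arithmetic this describes can be sketched as follows. This is an illustration under stated assumptions, not the library's code: a plain dict keyed by (distance, noise level) stands in for the DataFrame, the function name `add_ignored_timesteps_sketch` is hypothetical, and the amount added to each sample is assumed to be extra_steps_per_layer multiplied by layers_per_sample(d). The default for layers_per_sample is not shown in the signature above, so the `lambda d: d` used here is only a placeholder.

```python
def add_ignored_timesteps_sketch(data, extra_steps_per_layer=2, layers_per_sample=lambda d: d):
    """Return a copy of data with the ignored timesteps added to every sample.

    data maps (distance, noise_level) to a list of runtime samples.
    ASSUMPTION: each sample gains extra_steps_per_layer * layers_per_sample(distance).
    """
    return {
        (d, p): [t + extra_steps_per_layer * layers_per_sample(d) for t in samples]
        for (d, p), samples in data.items()
    }


# Illustrative numbers only, not real simulation output.
merging_only = {(3, 0.01): [10, 12], (5, 0.01): [30, 34]}

one_to_one = add_ignored_timesteps_sketch(merging_only)  # a drop and a grow per layer
two_to_one = add_ignored_timesteps_sketch(merging_only, extra_steps_per_layer=3)

print(one_to_one)
print(two_to_one)
```

Under these assumptions the correction is a constant shift per column, so it changes the mean runtime at each distance but not the spread of the samples.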