Trial Design

class sctrial.design.TrialDesign(participant_col: str = 'participant_id', visit_col: str = 'visit', arm_col: str | None = 'arm', arm_treated: str = 'Treated', arm_control: str = 'Control', celltype_col: str | None = 'celltype', crossover_col: str | None = None, baseline_visit: str | None = None, followup_visit: str | None = None)

Bases: object

Describe the trial-design columns and metadata labels in adata.obs.

The TrialDesign object centralizes the mapping of your study design to the AnnData object. It is used by almost all statistical and plotting functions in sctrial.

arm_bin(obs: DataFrame) → Series

Return 0/1 treated indicator aligned to obs.index.

Parameters:

obs – DataFrame containing the participant-visit data.

Returns:

A Series with 0/1 indicator of treated status.

Return type:

pd.Series

Raises:
  • ValueError – If arm_treated and arm_control are the same.

  • KeyError – If arm_col is not in obs.columns.

arm_col: str | None = 'arm'

Name of the column containing treatment arm assignments.

Set to None for single-arm studies that lack an arm column.

arm_control: str = 'Control'

The label in arm_col representing the control/placebo group.

arm_treated: str = 'Treated'

The label in arm_col representing the treatment/experimental group.

baseline_visit: str | None = None

Optional default baseline visit label (e.g., 'Baseline', 'V1').

celltype_col: str | None = 'celltype'

Optional name of the column containing cell-type annotations.

crossover_col: str | None = None

Optional name of the column containing boolean-like indicators for crossover cells.

followup_visit: str | None = None

Optional default follow-up visit label (e.g., 'Follow-up', 'V2').

participant_col: str = 'participant_id'

Name of the column containing unique participant identifiers.

primary_visits(baseline: str | None = None, followup: str | None = None) → tuple[str, str]

Return (baseline, followup) visit labels.

Parameters:
  • baseline – Optional explicit baseline visit label. If None, uses self.baseline_visit.

  • followup – Optional explicit follow-up visit label. If None, uses self.followup_visit.

Returns:

Tuple of (baseline, followup) visit labels.

Return type:

tuple[str, str]
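The fallback from explicit arguments to stored defaults can be sketched as follows. This is a minimal stand-in and not sctrial's implementation: VisitDefaults is a hypothetical class that mirrors only the two visit fields, and raising ValueError when a label is still unresolved is an assumption, since the reference above does not say what happens in that case.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class VisitDefaults:
    """Hypothetical stand-in mirroring only the two visit fields of TrialDesign."""

    baseline_visit: str | None = None
    followup_visit: str | None = None

    def primary_visits(
        self, baseline: str | None = None, followup: str | None = None
    ) -> tuple[str, str]:
        # Explicit arguments win; otherwise fall back to the stored defaults.
        resolved_baseline = baseline if baseline is not None else self.baseline_visit
        resolved_followup = followup if followup is not None else self.followup_visit
        # Assumption: an unresolved label is an error (not documented above).
        if resolved_baseline is None or resolved_followup is None:
            raise ValueError("baseline and follow-up visit labels must both be set")
        return resolved_baseline, resolved_followup


defaults = VisitDefaults(baseline_visit="Baseline", followup_visit="Follow-up")
pair_from_defaults = defaults.primary_visits()
pair_with_override = defaults.primary_visits(followup="V2")
```

Here pair_from_defaults uses both stored labels, while pair_with_override keeps the stored baseline and replaces only the follow-up label.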

required_cols(*, include_celltype: bool = False, include_crossover: bool = False) → Sequence[str]

Return required obs columns for this design.

Parameters:
  • include_celltype – If True, include celltype_col when it is defined.

  • include_crossover – If True, include crossover_col when it is defined.

Returns:

List of required columns.

Return type:

list[str]
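One plausible way the column list is assembled is sketched below. The function required_cols_sketch is hypothetical and takes the design fields as plain arguments. Treating the participant and visit columns as always required, and the arm column as required only when it is not None, are assumptions not stated in this reference.

```python
from __future__ import annotations


def required_cols_sketch(
    participant_col: str = "participant_id",
    visit_col: str = "visit",
    arm_col: str | None = "arm",
    celltype_col: str | None = "celltype",
    crossover_col: str | None = None,
    *,
    include_celltype: bool = False,
    include_crossover: bool = False,
) -> list[str]:
    """Hypothetical stand-in; not the library's implementation."""
    # Assumption: participant and visit columns are always required.
    cols = [participant_col, visit_col]
    # Assumption: the arm column is required only when one is configured.
    if arm_col is not None:
        cols.append(arm_col)
    # Optional columns are added only when requested AND defined.
    if include_celltype and celltype_col is not None:
        cols.append(celltype_col)
    if include_crossover and crossover_col is not None:
        cols.append(crossover_col)
    return cols


base_cols = required_cols_sketch()
with_celltype = required_cols_sketch(include_celltype=True)
single_arm = required_cols_sketch(arm_col=None, include_crossover=True)
```

In the last call, include_crossover=True has no effect because crossover_col is None, which matches the "when it is defined" wording of the parameter descriptions.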

validate(adata: AnnData, *, include_celltype: bool = False, include_crossover: bool = False, check_arm_labels: bool = True) → None

Validate that adata.obs contains required columns and labels.

Parameters:
  • adata – AnnData object to validate.

  • include_celltype – If True, require celltype_col in adata.obs.

  • include_crossover – If True, require crossover_col in adata.obs.

  • check_arm_labels – If True, verify that treated/control labels are present in arm_col.

Return type:

None

Raises:
  • KeyError – If required columns are missing.

  • ValueError – If arm labels are not found in adata.obs[self.arm_col].
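The two documented failure modes can be illustrated with a small sketch. It is not the library's code: validate_sketch is a hypothetical function, a plain dict of column name to values stands in for adata.obs, and skipping the label check when arm_col is None is an assumption.

```python
from __future__ import annotations

from typing import Mapping, Sequence


def validate_sketch(
    obs: Mapping[str, Sequence[str]],
    required: Sequence[str],
    arm_col: str | None = "arm",
    arm_treated: str = "Treated",
    arm_control: str = "Control",
    check_arm_labels: bool = True,
) -> None:
    """Hypothetical stand-in: `obs` maps column names to their values."""
    # Column check: report every missing column at once.
    missing = [col for col in required if col not in obs]
    if missing:
        raise KeyError(f"missing required columns: {missing}")
    # Label check: both arm labels must occur in the arm column.
    # Assumption: single-arm designs (arm_col=None) skip this check.
    if check_arm_labels and arm_col is not None:
        present = set(obs[arm_col])
        absent = [lab for lab in (arm_treated, arm_control) if lab not in present]
        if absent:
            raise ValueError(f"arm labels not found in {arm_col!r}: {absent}")


obs = {
    "participant_id": ["P1", "P1", "P2", "P2"],
    "visit": ["Baseline", "Follow-up", "Baseline", "Follow-up"],
    "arm": ["Treated", "Treated", "Placebo", "Placebo"],
}
required = ["participant_id", "visit", "arm"]

try:
    validate_sketch(obs, required)  # default control label "Control" is absent
    outcome = "ok"
except ValueError:
    outcome = "bad labels"

# Passing the label that is actually present makes the check succeed.
validate_sketch(obs, required, arm_control="Placebo")
```

The first call fails because the data uses "Placebo" rather than the default "Control"; the second call returns None without raising.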

visit_col: str = 'visit'

Name of the column containing visit or timepoint labels.

Trial design specification: TrialDesign dataclass and design detection.

Design Detection

sctrial.convenience.auto_detect_design(adata: AnnData, arm_treated: str | None = None, arm_control: str | None = None) → TrialDesign

Auto-detect trial design from common column naming patterns.

Looks for common patterns in column names:

  • participant: participant_id, patient_id, donor_id, subject_id, sample_id

  • visit: visit, timepoint, time, day, week, time_point

  • arm: arm, treatment_arm, arm_id, treatment, group, condition

  • celltype: celltype, cell_type, cluster, annotation, cell_annotation

Column matching uses exact (case-insensitive) match first, then word-boundary partial match (pattern must appear at a word boundary in the column name, so "arm" matches "arm_id" but not "farm_id").
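The two-pass matching rule can be sketched with the standard re module. The helper match_column is hypothetical and the library may implement the rule differently. The sketch treats any non-alphanumeric character (including an underscore) or the string edge as a word boundary, and it assumes that patterns are tried in list order; neither detail is confirmed by this reference.

```python
from __future__ import annotations

import re
from typing import Iterable, Sequence


def match_column(columns: Iterable[str], patterns: Sequence[str]) -> str | None:
    """Hypothetical helper: return the first column matching any pattern."""
    columns = list(columns)
    lowered = {col.lower(): col for col in columns}
    # Pass 1: exact, case-insensitive match, in pattern priority order.
    for pattern in patterns:
        if pattern.lower() in lowered:
            return lowered[pattern.lower()]
    # Pass 2: partial match where the pattern is delimited by non-alphanumerics.
    # Underscores count as boundaries here, unlike with the regex \b anchor.
    for pattern in patterns:
        regex = re.compile(
            rf"(?<![a-z0-9]){re.escape(pattern.lower())}(?![a-z0-9])"
        )
        for col in columns:
            if regex.search(col.lower()):
                return col
    return None


arm_patterns = ["arm", "treatment_arm", "arm_id", "treatment", "group", "condition"]
exact = match_column(["Arm", "farm_id"], arm_patterns)
partial = match_column(["farm_id", "study_arm_id"], arm_patterns)
none_found = match_column(["farm_id"], arm_patterns)
```

With these inputs, "Arm" is found by the exact pass, "study_arm_id" by the partial pass, and "farm_id" alone matches nothing because "arm" is preceded by a letter.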

Parameters:
  • adata – AnnData object to analyze.

  • arm_treated – Optional label for the treated arm. If not provided, the function tries keyword-based detection (e.g. "Treated", "Drug", "Active") and raises ValueError if detection fails.

  • arm_control – Optional label for the control arm. If not provided, the function tries keyword-based detection (e.g. "Control", "Placebo") and raises ValueError if detection fails.

Returns:

Detected design (may need manual adjustment).

Return type:

TrialDesign

Examples

>>> from dataclasses import replace
>>> from sctrial.convenience import auto_detect_design
>>> design = auto_detect_design(adata)
>>> print(f"Detected design: {design}")
>>> # Verify the detected design. If the arm labels need adjustment,
>>> # create a new TrialDesign with corrected values:
>>> design = replace(design, arm_treated="Drug_A", arm_control="Placebo")

Raises:

ValueError – If required columns cannot be detected, or if arm labels cannot be determined (more than 2 arms, only 1 arm, or no keyword match for ambiguous labels).
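The arm-label failure cases listed above can be illustrated with a rough sketch. The helper detect_arm_labels is hypothetical, the keyword tuples contain only the examples quoted in the parameter descriptions (the real lists may be longer), and case-insensitive substring matching is an assumption about how keywords are compared.

```python
from __future__ import annotations

from typing import Iterable

# Keywords taken from the examples in the parameter descriptions; the real
# keyword lists may differ.
TREATED_KEYWORDS = ("treated", "drug", "active")
CONTROL_KEYWORDS = ("control", "placebo")


def detect_arm_labels(labels: Iterable[str]) -> tuple[str, str]:
    """Hypothetical helper: return (arm_treated, arm_control) or raise."""
    unique = sorted(set(labels))
    # Covers both the "only 1 arm" and the "more than 2 arms" cases.
    if len(unique) != 2:
        raise ValueError(f"expected exactly 2 arms, found {len(unique)}: {unique}")
    treated = [lab for lab in unique if any(k in lab.lower() for k in TREATED_KEYWORDS)]
    control = [lab for lab in unique if any(k in lab.lower() for k in CONTROL_KEYWORDS)]
    # Assumption: each role must match exactly one, distinct label.
    if len(treated) != 1 or len(control) != 1 or treated == control:
        raise ValueError(f"cannot infer treated/control from labels {unique}")
    return treated[0], control[0]


detected = detect_arm_labels(["Placebo", "Drug_A", "Placebo", "Drug_A"])

try:
    detect_arm_labels(["A", "B"])  # two arms, but no keyword matches
    ambiguous_error = None
except ValueError as exc:
    ambiguous_error = str(exc)
```

Plain substring matching is deliberately naive: a label such as "Untreated" would match the "treated" keyword, which is one reason to pass arm_treated and arm_control explicitly when labels are unusual.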