---
title: "Reference: Analytical Method Validation"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Reference: Analytical Method Validation}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5
)
```

```{r setup, message = FALSE}
library(measure)
library(dplyr)
library(ggplot2)
```

## Overview

The `measure` package provides a comprehensive suite of functions for analytical method validation. These functions are designed to be compatible with regulatory frameworks including:

- **ICH Q2(R2)**: Validation of Analytical Procedures
- **ISO/IEC 17025**: General requirements for the competence of testing and calibration laboratories
- **USP <1225>**: Validation of Compendial Procedures
- **ICH M10**: Bioanalytical Method Validation and Study Sample Analysis (for applicable workflows)

This vignette demonstrates key validation workflows including calibration, precision, accuracy, uncertainty, and quality control.

## Calibration Curves

### Fitting Calibration Curves

The `measure_calibration_fit()` function fits weighted or unweighted calibration curves with comprehensive diagnostics.

```{r calibration-fit}
# Create calibration data
set.seed(42)
cal_data <- data.frame(
  nominal_conc = c(1, 5, 10, 25, 50, 100, 250, 500),
  response = c(1, 5, 10, 25, 50, 100, 250, 500) * 1.05 +
             rnorm(8, sd = c(0.1, 0.3, 0.5, 1, 2, 4, 10, 20))
)

# Fit with 1/x^2 weighting (common for bioanalytical methods)
cal <- measure_calibration_fit(
  cal_data,
  response ~ nominal_conc,
  weights = "1/x2"
)

print(cal)
```
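
To make the weighting option concrete, here is a minimal base R sketch. It assumes that `"1/x2"` denotes weights proportional to 1/x² (the usual bioanalytical convention), uses `stats::lm()` with made-up responses, and is not the package's own implementation.

```r
# Illustrative standards (hypothetical values, not the vignette's simulated data)
conc <- c(1, 5, 10, 25, 50, 100, 250, 500)
resp <- c(1.1, 5.2, 10.6, 26.0, 52.9, 104.2, 263.5, 521.0)

# Unweighted fit: every squared residual counts equally, so the
# high-concentration standards largely determine the line
fit_ols <- lm(resp ~ conc)

# 1/x^2 weighted fit: each squared residual is scaled by 1 / conc^2
fit_wls <- lm(resp ~ conc, weights = 1 / conc^2)

# Back-calculate each standard from its response: x = (y - intercept) / slope
back_calc <- function(fit) (resp - coef(fit)[[1]]) / coef(fit)[[2]]

# Express the back-calculation error as a percentage of the nominal value
pct_error <- function(fit) 100 * (back_calc(fit) - conc) / conc

round(data.frame(conc, ols = pct_error(fit_ols), wls = pct_error(fit_wls)), 1)
```

Comparing the two percentage-error columns at the lowest standards is a common way to judge whether weighting is needed.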

### Visualizing the Calibration

```{r calibration-plot}
autoplot(cal, type = "curve")
```

### Checking Residuals

```{r calibration-residuals}
autoplot(cal, type = "residuals")
```

### Predicting Unknown Concentrations

```{r calibration-predict}
unknowns <- data.frame(
  sample_id = c("Sample_1", "Sample_2", "Sample_3"),
  response = c(52.3, 125.8, 280.5)
)

predictions <- measure_calibration_predict(
  cal,
  newdata = unknowns,
  interval = "confidence"
)

cbind(unknowns, predictions)
```

### Calibration Verification

Verify that the calibration remains valid using QC samples:

```{r calibration-verify}
qc_data <- data.frame(
  sample_id = c("QC_Low", "QC_Mid", "QC_High"),
  nominal_conc = c(3, 75, 400),
  response = c(3.1, 77.5, 395.2)
)

verification <- measure_calibration_verify(cal, qc_data)
print(verification)
```

## Limits of Detection and Quantitation

### Multiple Methods

`measure` supports multiple approaches for calculating LOD/LOQ. The example below uses the blank-based approach:

```{r lod-loq}
# Blank-based approach (3σ/10σ)
blank_data <- data.frame(
  sample_type = rep("blank", 10),
  response = rnorm(10, mean = 0.5, sd = 0.08)
)

lod_result <- measure_lod(
  blank_data,
  "response",
  method = "blank_sd",
  calibration = cal
)
print(lod_result)

# Or calculate both together
lod_loq <- measure_lod_loq(
  blank_data,
  "response",
  method = "blank_sd",
  calibration = cal
)
tidy(lod_loq)
```
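
The blank-based calculation can be sketched by hand as follows. This assumes the 3σ/10σ convention named in the chunk above, with the blank standard deviation converted to concentration units through the calibration slope; the blank values and slope are hypothetical, and the package may apply additional terms (for example the mean blank response). Note that ICH Q2 quotes a factor of 3.3 rather than 3 for the LOD.

```r
# Hypothetical blank responses and calibration slope (response per unit concentration)
blank_resp <- c(0.48, 0.55, 0.41, 0.52, 0.60, 0.47, 0.39, 0.58, 0.51, 0.49)
slope <- 1.05

# Standard deviation of the blank responses
sd_blank <- sd(blank_resp)

# k * sigma / slope, with k = 3 for the LOD and k = 10 for the LOQ
lod <- 3 * sd_blank / slope
loq <- 10 * sd_blank / slope

c(sd_blank = sd_blank, lod = lod, loq = loq)
```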

## Precision Studies

### Repeatability (Within-Run Precision)

```{r repeatability}
# Data from replicate measurements
repeat_data <- data.frame(
  sample_id = rep(c("Low", "Mid", "High"), each = 6),
  concentration = c(
    rnorm(6, 10, 0.5),
    rnorm(6, 100, 4),
    rnorm(6, 500, 18)
  )
)

repeatability <- measure_repeatability(
  repeat_data,
  "concentration",
  group_col = "sample_id"
)
print(repeatability)
```
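
Repeatability is usually summarized as the percent coefficient of variation (%CV) of replicates at each level. The base R sketch below shows that calculation on hypothetical replicate values; it illustrates the statistic only and makes no claim about the exact columns `measure_repeatability()` returns.

```r
# Hypothetical replicate results at three concentration levels
reps <- data.frame(
  level = rep(c("Low", "Mid", "High"), each = 6),
  conc = c(
    9.8, 10.3, 10.1, 9.6, 10.4, 9.9,
    98.2, 101.5, 103.0, 97.4, 100.8, 99.6,
    488, 512, 495, 507, 521, 490
  )
)

# Per-level n, mean, SD, and %CV (100 * SD / mean)
cv_table <- do.call(rbind, lapply(split(reps, reps$level), function(d) {
  data.frame(
    level = d$level[1],
    n = nrow(d),
    mean = mean(d$conc),
    sd = sd(d$conc),
    cv_pct = 100 * sd(d$conc) / mean(d$conc)
  )
}))

cv_table
```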

### Intermediate Precision

```{r intermediate-precision}
# Data from multiple days
ip_data <- data.frame(
  day = rep(1:3, each = 6),
  analyst = rep(c("A", "A", "A", "B", "B", "B"), 3),
  concentration = 100 +
    rep(c(-2, 0, 2), each = 6) +        # Day effect
    rep(rep(c(-1, 1), each = 3), 3) +   # Analyst effect (A = -1, B = +1)
    rnorm(18, sd = 3)                   # Residual
)

ip_result <- measure_intermediate_precision(
  ip_data,
  "concentration",
  factors = c("day", "analyst")
)
print(ip_result)
```

### Gage R&R Analysis

For measurement system analysis:

```{r gage-rr}
# Gage R&R data
grr_data <- data.frame(
  part = rep(1:5, each = 6),
  operator = rep(rep(c("Op1", "Op2"), each = 3), 5),
  measurement = c(
    # Part 1
    10.1, 10.2, 10.0, 10.3, 10.1, 10.2,
    # Part 2
    20.2, 20.1, 20.3, 20.0, 20.2, 20.1,
    # Part 3
    15.1, 15.0, 15.2, 15.3, 15.1, 15.0,
    # Part 4
    25.0, 25.1, 24.9, 25.2, 25.0, 25.1,
    # Part 5
    30.1, 30.2, 30.0, 30.1, 30.0, 30.2
  )
)

grr_result <- measure_gage_rr(
  grr_data,
  "measurement",
  part_col = "part",
  operator_col = "operator"
)
print(grr_result)
```

## Accuracy Assessment

### Bias and Recovery

```{r accuracy}
accuracy_data <- data.frame(
  level = rep(c("Low", "Mid", "High"), each = 5),
  measured = c(
    rnorm(5, 10.2, 0.3),   # Low level, slight positive bias
    rnorm(5, 100, 2.5),    # Mid level, no bias
    rnorm(5, 498, 8)       # High level, slight negative bias
  ),
  reference = rep(c(10, 100, 500), each = 5)
)

accuracy <- measure_accuracy(
  accuracy_data,
  "measured",
  "reference",
  group_col = "level"
)
print(accuracy)
```
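
For reference, bias and recovery at a single level reduce to simple ratios against the reference value. The sketch below uses hypothetical measurements and base R only; the confidence interval comes from `stats::t.test()` and is one common choice, not necessarily the one used by `measure_accuracy()`.

```r
# Hypothetical replicate measurements of a reference material with value 10
measured <- c(10.3, 10.1, 10.4, 9.9, 10.2)
reference <- 10

# Recovery: mean measured value as a percentage of the reference value
recovery_pct <- 100 * mean(measured) / reference

# Relative bias: signed deviation of the mean from the reference, in percent
bias_pct <- 100 * (mean(measured) - reference) / reference

# 95% t-based confidence interval for the mean measured value
ci_mean <- t.test(measured)$conf.int

c(recovery_pct = recovery_pct, bias_pct = bias_pct,
  ci_lower = ci_mean[1], ci_upper = ci_mean[2])
```

If the interval for the mean contains the reference value, the observed bias is not statistically distinguishable from zero at that level.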

### Linearity Assessment

```{r linearity}
linearity_data <- data.frame(
  concentration = rep(c(10, 25, 50, 75, 100), each = 3),
  response = rep(c(10, 25, 50, 75, 100), each = 3) * 1.02 +
             rnorm(15, sd = 1.5)
)

linearity <- measure_linearity(
  linearity_data,
  "concentration",
  "response"
)
print(linearity)

# Plot with fit line
autoplot(linearity, type = "fit")
```

## Uncertainty Budgets

### ISO GUM Uncertainty

Create uncertainty budgets following the GUM (Guide to the Expression of Uncertainty in Measurement):

```{r uncertainty}
# Define uncertainty components
components <- list(
  uncertainty_component(
    name = "Repeatability",
    type = "A",
    value = 0.5,
    df = 9
  ),
  uncertainty_component(
    name = "Calibration",
    type = "B",
    value = 0.3,
    distribution = "normal"
  ),
  uncertainty_component(
    name = "Reference Standard",
    type = "B",
    value = 0.1,
    distribution = "rectangular"
  ),
  uncertainty_component(
    name = "Temperature",
    type = "B",
    value = 0.2,
    sensitivity = 0.5  # Sensitivity coefficient
  )
)

budget <- measure_uncertainty_budget(.list = components)
print(budget)
```
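
The GUM combination rule behind such a budget is a root sum of squares of the sensitivity-weighted standard uncertainties. The sketch below applies it to the four components above under two loud assumptions: every `value` is already a standard uncertainty, and the sensitivity coefficient is 1 wherever none is given. How the package treats the `distribution` argument is not shown here; under the GUM, a rectangular half-width would first be divided by `sqrt(3)`.

```r
# Components from the budget above, under the stated assumptions
components <- data.frame(
  name = c("Repeatability", "Calibration", "Reference Standard", "Temperature"),
  u = c(0.5, 0.3, 0.1, 0.2),      # assumed to be standard uncertainties
  sensitivity = c(1, 1, 1, 0.5)   # assumed 1 where no coefficient is given
)

# Variance contribution of each component: (c_i * u_i)^2
components$contribution <- (components$sensitivity * components$u)^2

# Combined standard uncertainty: square root of the summed contributions
u_combined <- sqrt(sum(components$contribution))

# Expanded uncertainty with coverage factor k = 2 (roughly 95% coverage)
U_expanded <- 2 * u_combined

# Share of the total variance attributable to each component
components$pct <- 100 * components$contribution / sum(components$contribution)

components
c(u_combined = u_combined, U_expanded = U_expanded)
```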

### Visualizing Uncertainty Contributions

```{r uncertainty-plot}
autoplot(budget)
```

## Control Charts

### Setting Up Control Limits

```{r control-limits}
# Historical QC data
qc_history <- data.frame(
  run_order = 1:30,
  qc_value = rnorm(30, mean = 100, sd = 2)
)

limits <- measure_control_limits(qc_history, "qc_value")
print(limits)
```

### Monitoring with Westgard Rules

```{r control-chart}
# New run data including potential out-of-control point
new_run <- data.frame(
  run_order = 1:20,
  qc_value = c(rnorm(19, 100, 2), 108)  # Last point is high
)

chart <- measure_control_chart(
  new_run,
  "qc_value",
  "run_order",
  limits = limits,
  rules = c("1_3s", "2_2s", "R_4s", "10x")
)
print(chart)
```

```{r control-chart-plot}
autoplot(chart)
```
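
Two of the rules named above can be checked by hand, which helps when interpreting a flagged chart. This base R sketch uses hypothetical QC values, takes the historical mean and SD as the control limits, and implements only the textbook definitions of `1_3s` and `2_2s`; it is not the package's rule engine.

```r
# Hypothetical historical QC values used to set the center line and SD
history <- c(99.1, 101.8, 100.4, 98.2, 102.5, 99.7, 100.9, 97.8, 101.2, 100.3,
             98.9, 101.1, 99.5, 100.6, 102.0, 98.4, 100.0, 99.2, 101.5, 100.8)
center <- mean(history)
s <- sd(history)

# New run with the last three points deliberately high
new_run <- c(100.2, 99.1, 101.4, 103.2, 103.5, 105.0)

# Distance of each point from the center line, in SD units
z <- (new_run - center) / s

# 1_3s: a single point more than 3 SD from the center line
rule_1_3s <- abs(z) > 3

# 2_2s: two consecutive points more than 2 SD out on the same side;
# the second point of each such pair is flagged
beyond_2s <- sign(z) * (abs(z) > 2)
rule_2_2s <- c(
  FALSE,
  beyond_2s[-1] != 0 & beyond_2s[-1] == beyond_2s[-length(beyond_2s)]
)

data.frame(value = new_run, z = round(z, 2), rule_1_3s, rule_2_2s)
```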

## Acceptance Criteria

### Defining Criteria

```{r criteria}
# Create custom criteria
my_criteria <- measure_criteria(
  criterion("cv", "<=", 15, description = "Precision CV"),
  criterion("bias_pct", "between", c(-10, 10), description = "Bias"),
  criterion("recovery", "between", c(85, 115), description = "Recovery %")
)
print(my_criteria)
```

### Using Preset Criteria

```{r preset-criteria}
# ICH Q2 presets
ich_criteria <- criteria_ich_q2()
print(ich_criteria)

# Bioanalytical presets
bio_criteria <- criteria_bioanalytical()
print(bio_criteria)
```

### Assessing Results

```{r assess}
# Sample results to assess (single summary values per criterion)
# For example, from a method validation summary
results <- list(
  cv = 5.2,          # Overall precision CV
  bias_pct = 1.3,    # Overall bias
  recovery = 101.3   # Mean recovery
)

assessment <- measure_assess(results, my_criteria)
print(assessment)

# Check if all criteria passed
all_pass(assessment)
```

## Method Comparison

When validating a new method, you often need to compare it against a reference or existing method. The `measure` package provides several approaches for method comparison studies.

### Bland-Altman Analysis

Bland-Altman plots show the agreement between two methods by plotting differences against means:

```{r bland-altman}
# Paired measurements: both methods measure the same 30 samples
true_conc <- rnorm(30, mean = 100, sd = 15)
comparison_data <- data.frame(
  sample_id = 1:30,
  method_A = true_conc + rnorm(30, sd = 3),
  method_B = true_conc + 2 + rnorm(30, sd = 3)  # Constant bias of +2
)

ba <- measure_bland_altman(
  comparison_data,
  method1_col = "method_A",
  method2_col = "method_B",
  regression = "linear"  # Test for proportional bias
)
print(ba)
```

```{r bland-altman-plot}
autoplot(ba)
```
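
The quantities on a Bland-Altman plot are straightforward to compute directly. The sketch below uses hypothetical paired results and the conventional 95% limits of agreement (mean difference ± 1.96 SD); the sign convention (method B minus method A) is an assumption and may differ from the one `measure_bland_altman()` uses.

```r
# Hypothetical paired results for the same eight samples
method_a <- c(98.2, 105.1, 87.4, 112.9, 101.3, 94.6, 120.4, 99.8)
method_b <- c(100.1, 106.8, 90.2, 113.5, 104.0, 96.1, 123.3, 101.2)

# Per-sample difference (y-axis) and mean of the two methods (x-axis)
d <- method_b - method_a
avg <- (method_a + method_b) / 2

# Mean difference (bias) and 95% limits of agreement
bias <- mean(d)
loa <- bias + c(-1.96, 1.96) * sd(d)

c(bias = bias, loa_lower = loa[1], loa_upper = loa[2])
```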

### Regression Methods

For method comparison regression, use Deming or Passing-Bablok regression, both of which account for measurement error in both methods:

```{r deming-regression}
# Paired results from a reference method and a test method
deming_data <- data.frame(
  reference = c(5, 10, 25, 50, 100, 200, 400),
  test_method = c(5.2, 10.3, 25.8, 51.2, 101.5, 203.1, 408.2)
)

deming <- measure_deming_regression(
  deming_data,
  method1_col = "reference",
  method2_col = "test_method",
  bootstrap = TRUE,
  bootstrap_n = 500
)
print(deming)

# Check if methods are equivalent
glance(deming)
```
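
For intuition, the Deming slope has a closed form once the ratio of error variances is fixed. The sketch below applies it to the same seven pairs, assuming a ratio `delta = 1` (equal error variance in both methods, i.e. orthogonal regression); that value is an assumption for illustration, and the package's defaults and bootstrap intervals are not reproduced here.

```r
# Same paired results as in the chunk above
x <- c(5, 10, 25, 50, 100, 200, 400)                  # reference method
y <- c(5.2, 10.3, 25.8, 51.2, 101.5, 203.1, 408.2)    # test method

# Assumed ratio of error variances (y errors / x errors)
delta <- 1

# Sample variances and covariance
sxx <- var(x)
syy <- var(y)
sxy <- cov(x, y)

# Closed-form Deming slope and the matching intercept
slope <- (syy - delta * sxx +
  sqrt((syy - delta * sxx)^2 + 4 * delta * sxy^2)) / (2 * sxy)
intercept <- mean(y) - slope * mean(x)

c(intercept = intercept, slope = slope)
```

A slope near 1 and an intercept near 0 indicate agreement; whether the deviations matter is judged from their confidence intervals.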

For Passing-Bablok regression (non-parametric), install the `mcr` package:

```{r passing-bablok, eval = FALSE}
# Requires: install.packages("mcr")
pb <- measure_passing_bablok(
  deming_data,
  method1_col = "reference",
  method2_col = "test_method"
)
print(pb)
```

### Proficiency Testing

Evaluate laboratory performance in proficiency testing programs:

```{r proficiency-score}
# PT results from multiple labs
pt_data <- data.frame(
  lab_id = paste0("Lab_", 1:10),
  measured = c(99.2, 100.5, 98.8, 101.2, 97.5, 100.1, 99.8, 102.3, 100.6, 94.0),
  assigned = rep(100, 10),
  uncertainty = c(1.5, 2.0, 1.8, 1.6, 2.2, 1.9, 1.7, 2.1, 1.5, 2.0)
)

# z-scores with known sigma
z_scores <- measure_proficiency_score(
  pt_data,
  measured_col = "measured",
  reference_col = "assigned",
  score_type = "z_score",
  sigma = 2.5
)
print(z_scores)
```

```{r proficiency-plot}
autoplot(z_scores)
```
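
The z-score itself is a one-line calculation. The sketch below recomputes it for the same ten results with `sigma = 2.5` and classifies each lab using the thresholds commonly applied in proficiency testing (|z| ≤ 2 satisfactory, 2 < |z| < 3 questionable, |z| ≥ 3 unsatisfactory); the labels are illustrative and may not match the package's output.

```r
# Same results as above: assigned value 100, standard deviation for PT 2.5
measured <- c(99.2, 100.5, 98.8, 101.2, 97.5, 100.1, 99.8, 102.3, 100.6, 94.0)
assigned <- 100
sigma <- 2.5

# z-score: deviation from the assigned value in units of sigma
z <- (measured - assigned) / sigma

# Conventional interpretation bands
status <- ifelse(abs(z) <= 2, "satisfactory",
  ifelse(abs(z) < 3, "questionable", "unsatisfactory")
)

data.frame(lab_id = paste0("Lab_", 1:10), z = round(z, 2), status)
```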

## Matrix Effects

Matrix effects (ion suppression/enhancement) must be evaluated in LC-MS/MS and similar methods.

### Evaluating Matrix Effects

```{r matrix-effect}
# Post-extraction spike experiment
me_data <- data.frame(
  sample_type = rep(c("matrix", "neat"), each = 6),
  matrix_lot = rep(c("Lot1", "Lot2", "Lot3"), 4),
  concentration = rep(c("low", "high"), each = 3, times = 2),
  response = c(
    # Matrix samples (some suppression)
    9200, 9500, 8900, 47500, 48200, 46800,
    # Neat samples
    10000, 10000, 10000, 50000, 50000, 50000
  )
)

me <- measure_matrix_effect(
  me_data,
  response_col = "response",
  sample_type_col = "sample_type",
  matrix_level = "matrix",
  neat_level = "neat",
  concentration_col = "concentration"
)
print(me)
```

```{r matrix-effect-plot}
autoplot(me, type = "bar")
```
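
As a cross-check, the matrix effect at each concentration can be computed as the mean matrix response relative to the mean neat response. The sketch below does this in base R for the same responses; the percentage definition (100% meaning no effect) is the common convention and is assumed here rather than taken from the package.

```r
# Same post-extraction spike responses as above
me_df <- data.frame(
  sample_type = rep(c("matrix", "neat"), each = 6),
  concentration = rep(c("low", "high"), each = 3, times = 2),
  response = c(
    9200, 9500, 8900, 47500, 48200, 46800,      # matrix
    10000, 10000, 10000, 50000, 50000, 50000    # neat
  )
)

# Mean response for each concentration (rows) and sample type (columns)
means <- tapply(
  me_df$response,
  list(me_df$concentration, me_df$sample_type),
  mean
)

# Matrix effect in percent: below 100 is suppression, above 100 is enhancement
me_pct <- 100 * means[, "matrix"] / means[, "neat"]
me_pct
```

For these data the result is 92% at the low level and 95% at the high level, i.e. 8% and 5% suppression.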

### Standard Addition Correction

When matrix effects vary between samples, standard addition provides sample-specific correction:

```{r standard-addition, eval = FALSE}
library(recipes)

# Standard addition data
sa_data <- data.frame(
  sample_id = rep(c("Sample1", "Sample2"), each = 4),
  addition = rep(c(0, 10, 20, 30), 2),
  response = c(
    150, 250, 350, 450,  # Sample 1
    250, 350, 450, 550   # Sample 2
  )
)

rec <- recipe(~ ., data = sa_data) |>
  step_measure_standard_addition(
    response,
    addition_col = "addition",
    sample_id_col = "sample_id"
  ) |>
  prep()

# Original concentrations calculated via extrapolation
bake(rec, new_data = NULL)
```
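
The extrapolation behind standard addition can be written out in a few lines of base R. The sketch below fits response against added concentration for each sample and takes intercept divided by slope as the original concentration (the magnitude of the x-intercept); it uses the same data as above and is a plain illustration, not the recipe step's implementation.

```r
# Same standard addition data as above
sa <- data.frame(
  sample_id = rep(c("Sample1", "Sample2"), each = 4),
  addition = rep(c(0, 10, 20, 30), 2),
  response = c(
    150, 250, 350, 450,  # Sample 1
    250, 350, 450, 550   # Sample 2
  )
)

# Per sample: fit a line and extrapolate to zero response.
# The original concentration is intercept / slope.
estimate <- sapply(split(sa, sa$sample_id), function(d) {
  fit <- lm(response ~ addition, data = d)
  unname(coef(fit)[1] / coef(fit)[2])
})

estimate
```

Both samples share a slope of 10 response units per unit added, so the estimates are 150 / 10 = 15 and 250 / 10 = 25.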

## Sample Preparation QC

Recipe steps for quality control during sample preparation.

### Dilution Factor Correction

Back-calculate concentrations for diluted samples:

```{r dilution-correction}
library(recipes)

dilution_data <- data.frame(
  sample_id = paste0("S", 1:5),
  dilution_factor = c(1, 2, 5, 10, 1),
  analyte = c(50, 45, 42, 48, 51)  # Measured after dilution
)

rec <- recipe(~ ., data = dilution_data) |>
  update_role(sample_id, new_role = "id") |>
  step_measure_dilution_correct(
    analyte,
    dilution_col = "dilution_factor",
    operation = "multiply"
  ) |>
  prep()

# Back-calculated original concentrations
bake(rec, new_data = NULL)
```

### Surrogate Recovery

Monitor extraction efficiency with surrogate standards:

```{r surrogate-recovery}
qc_data <- data.frame(
  sample_id = paste0("QC", 1:6),
  surrogate = c(95, 105, 88, 112, 75, 132)  # Expected = 100
)

rec <- recipe(~ ., data = qc_data) |>
  update_role(sample_id, new_role = "id") |>
  step_measure_surrogate_recovery(
    surrogate,
    expected_value = 100,
    action = "flag",
    min_recovery = 80,
    max_recovery = 120
  ) |>
  prep()

# Flag samples outside recovery limits
bake(rec, new_data = NULL)
```

## Drift Correction

### Detecting Drift

```{r detect-drift}
# Data with drift
drift_data <- data.frame(
  sample_type = rep("qc", 20),
  run_order = 1:20,
  feature1 = 100 + (1:20) * 0.8 + rnorm(20, sd = 2),  # Has drift
  feature2 = 100 + rnorm(20, sd = 2)                   # No drift
)

drift_result <- measure_detect_drift(
  drift_data,
  features = c("feature1", "feature2"),
  qc_type = "qc"
)
print(drift_result)
```
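
One simple way to screen for drift is to regress each QC feature on run order and test whether the slope differs from zero. The sketch below does that with `stats::lm()` on deterministic, hypothetical data; it is one possible test, offered as an assumption, and `measure_detect_drift()` may use a different statistic.

```r
# Hypothetical QC data with a fixed noise pattern (no random numbers)
run_order <- 1:20
noise <- rep(c(1.5, -2.0, 0.5, -1.2, 1.2), times = 4)
qc <- data.frame(
  run_order = run_order,
  feature1 = 100 + 0.8 * run_order + noise,  # upward trend
  feature2 = 100 + noise                     # no trend
)

# Slope of signal against run order and the p-value for slope = 0
slope_test <- function(y, x) {
  coefs <- summary(lm(y ~ x))$coefficients
  c(slope = coefs["x", "Estimate"], p_value = coefs["x", "Pr(>|t|)"])
}

# One row per feature
drift_check <- t(sapply(qc[c("feature1", "feature2")], slope_test,
                        x = qc$run_order))
drift_check
```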

### Correcting Drift

```{r drift-correction, eval = FALSE}
library(recipes)

# Using QC-LOESS correction in a recipe
rec <- recipe(~ ., data = drift_data) |>
  step_measure_drift_qc_loess(
    feature1, feature2,
    qc_type = "qc"
  ) |>
  prep()

corrected <- bake(rec, new_data = NULL)
```

## Validation Reports

Once you've completed your validation studies, you can compile all results into a reproducible validation report using `measure_validation_report()`. The package provides templates following regulatory frameworks like ICH Q2(R2) and USP <1225>.

### Creating a Validation Report

```{r validation-report}
# Gather validation results (using objects from above)
report <- measure_validation_report(
  # Metadata
  title = "HPLC-UV Method Validation Report",
  method_name = "Compound X Assay",
  method_description = "Reversed-phase HPLC with UV detection at 254 nm",
  analyst = "J. Smith",
  reviewer = "A. Jones",
  lab = "Analytical Development",
  instrument = "Agilent 1260 Infinity II",

  # Validation sections (results from earlier in this vignette)
  calibration = cal,
  lod_loq = lod_loq,
  accuracy = accuracy,
  precision = list(repeatability = repeatability, intermediate = ip_result),
  linearity = linearity,
  range = list(lower = 1, upper = 500, units = "ng/mL"),
  uncertainty = budget,

  # Text sections
  specificity = "No interfering peaks observed at the analyte retention time when analyzing blank matrix samples.",
  robustness = list(
    factors = c("Flow rate (±0.1 mL/min)", "Column temperature (±5°C)", "Mobile phase pH (±0.2)"),
    conclusion = "Method showed acceptable robustness within tested parameter ranges."
  ),

  # Conclusions
  conclusions = list(
    summary = "The analytical method meets all acceptance criteria for precision, accuracy, and linearity.",
    recommendations = c(
      "Method is suitable for intended use",
      "Revalidate if significant changes are made to instrumentation or reagents"
    )
  ),

  # References
  references = c(
    "ICH Q2(R2) Validation of Analytical Procedures (2023)",
    "USP <1225> Validation of Compendial Procedures"
  )
)

print(report)
```

### Inspecting Report Sections

```{r report-sections}
# Check which sections are included
summary(report)

# Access specific sections
has_validation_section(report, "calibration")
has_validation_section(report, "stability")  # Not included

# Get section data
get_validation_section(report, "range")
```

### Adding Custom Sections

You can add additional sections after report creation:

```{r add-section}
# Add a stability section later
report <- add_validation_section(
  report,
  "stability",
  list(
    description = "Short-term stability at room temperature",
    results = data.frame(
      timepoint = c("0h", "4h", "8h", "24h"),
      recovery_pct = c(100, 99.5, 98.8, 97.2)
    ),
    conclusion = "Sample is stable for 24 hours at room temperature."
  )
)

has_validation_section(report, "stability")
```

### Tidy Output

Extract all results as a tidy tibble for further analysis:

```{r tidy-report}
tidy(report)
```

### Rendering Reports

To render a validation report to HTML or PDF, use `render_validation_report()`. Two templates are provided:

- **ICH Q2(R2)**: Organized by validation characteristics (specificity, linearity, range, accuracy, precision, etc.)
- **USP <1225>**: Organized by procedure category (I, II, III, IV)

```{r render-report, eval = FALSE}
# Render to HTML using ICH Q2 template (default)
render_validation_report(
  report,
  output_file = "validation_report.html",
  template = "ich_q2"
)

# Render to PDF using USP <1225> template
render_validation_report(
  report,
  output_file = "validation_report.pdf",
  output_format = "pdf",
  template = "usp_1225"
)

# Use a custom Quarto template
render_validation_report(
  report,
  output_file = "custom_report.html",
  template_path = "path/to/custom_template.qmd"
)
```

The rendered report includes:

- **Header**: Method name, date, analyst, reviewer, lab, instrument
- **Table of Contents**: Auto-generated from sections
- **Validation Sections**: Each with formatted tables and plots
- **Provenance**: R version, package versions, timestamp for reproducibility

## Summary

The `measure` package provides a complete toolkit for analytical method validation:

| Category | Key Functions |
|----------|---------------|
| **Calibration** | `measure_calibration_fit()`, `measure_calibration_predict()`, `measure_calibration_verify()` |
| **LOD/LOQ** | `measure_lod()`, `measure_loq()`, `measure_lod_loq()` |
| **Precision** | `measure_repeatability()`, `measure_intermediate_precision()`, `measure_gage_rr()` |
| **Accuracy** | `measure_accuracy()`, `measure_linearity()`, `measure_carryover()` |
| **Method Comparison** | `measure_bland_altman()`, `measure_deming_regression()`, `measure_passing_bablok()`, `measure_proficiency_score()` |
| **Matrix Effects** | `measure_matrix_effect()`, `step_measure_standard_addition()` |
| **Sample Prep QC** | `step_measure_dilution_correct()`, `step_measure_surrogate_recovery()` |
| **Uncertainty** | `measure_uncertainty_budget()`, `measure_uncertainty()` |
| **Control Charts** | `measure_control_limits()`, `measure_control_chart()` |
| **Criteria** | `measure_criteria()`, `measure_assess()`, `criteria_ich_q2()`, `criteria_bland_altman()`, `criteria_matrix_effects()` |
| **Drift** | `measure_detect_drift()`, `step_measure_drift_qc_loess()` |
| **Validation Reports** | `measure_validation_report()`, `render_validation_report()` |

All functions follow a consistent design philosophy:

- **Tidy outputs**: Results are tibbles with `tidy()`, `glance()`, and `autoplot()` methods
- **Transparent diagnostics**: No hidden decisions; all parameters and flags are visible
- **Regulatory compatibility**: Designed with ICH, ISO, and FDA guidelines in mind
- **Provenance tracking**: Audit trails for outlier handling and data modifications

For more details on any function, see the package documentation with `?function_name`.

## See Also

- [Getting Started](../articles/tutorial-getting-started.html) - Introduction to measure workflows
- [Preprocessing Reference](../articles/reference-preprocessing.html) - Guide to preprocessing techniques