Monash University
hiersim.ado (17.44 kB)

hiersim - Stata program to simulate hierarchical clinical registry data and apply benchmarking and outlier classification methods

Posted on 2023-11-03, 04:31; authored by Jessy Hansen

A user-written Stata program that simulates hierarchical (three-level) clinical registry data with known outlying sites from user-specified parameters. It then runs regression models, applies outlier classification methods, flags outliers and compares the flags against the true outlier status, and stores and saves performance measures and dataset features.

The program allows for the specification of ten data parameters: outcome prevalence (prev), outlier definition (using a risk quotient, rq), proportion of outliers (op), site dispersion (using site SD, ssdint), risk-adjustment fit (using risk factor SD, rfsd), clinician variance (using clinician:site SD ratio, csdr), number of sites (siten), clinicians (clinn) and patients (totn), and case volume minimum (cvmin). Parameters can be varied iteratively one at a time or in a factorial framework.
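To make the role of these parameters concrete, the following is a minimal sketch of how three-level data of this kind could be generated in Stata. It is not the data-generating code from hiersim.ado. Several things here are assumptions: that the clinician SD equals csdr times the site SD, that the risk factor is normal with SD rfsd and enters the logit with coefficient ln(rfor), and that clinicians and patients are spread evenly (the locals clin_per_site and pat_per_clin are hypothetical stand-ins for clinn and totn). Outlying sites (rq, op) and the case volume minimum (cvmin) are left out.

```stata
* Illustrative only: NOT the code used inside hiersim.ado
clear
set seed 202302

* Parameter values are arbitrary examples
local prev   0.10     // baseline outcome prevalence (assumed to be a proportion)
local ssdint 0.30     // site-level SD on the logit scale
local csdr   0.50     // clinician:site SD ratio
local rfsd   1        // SD of the risk factor
local rfor   3        // risk factor odds ratio
local siten  20       // number of sites
local clin_per_site 5     // hypothetical: clinicians per site
local pat_per_clin  50    // hypothetical: patients per clinician

* Level 3: sites, each with a random intercept
set obs `siten'
gen site = _n
gen u_site = rnormal(0, `ssdint')

* Level 2: clinicians nested in sites
expand `clin_per_site'
bysort site: gen clin = _n
gen u_clin = rnormal(0, `csdr' * `ssdint')

* Level 1: patients nested in clinicians
expand `pat_per_clin'
gen rf = rnormal(0, `rfsd')
gen y  = runiform() < invlogit(logit(`prev') + u_site + u_clin + ln(`rfor') * rf)

tabulate y
```

Because the intercept logit(prev) is conditional on the random effects and the risk factor, the observed outcome rate in such a sketch will generally differ somewhat from prev.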

Estimates for benchmarking can be obtained from four different methods: unadjusted rates (raw), and rates obtained from ordinary (ord), conditional fixed effects (fe) or random-intercept effects (re) logistic regression.
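The four estimation approaches correspond to standard Stata tools. The sketch below fits each one to a small simulated two-level dataset using built-in commands. It assumes the program's ord, fe and re options resemble logit, clogit and melogit respectively; the actual calls, and how site-level rates are derived from the fitted models, are defined in hiersim.ado and may differ. All variable names and values are hypothetical.

```stata
* Illustrative only: toy data with 30 sites and 200 patients per site
clear
set seed 12345
set obs 30
gen site = _n
gen u = rnormal(0, 0.4)                 // site random intercept
expand 200
gen rf = rnormal()                      // single patient-level risk factor
gen y = runiform() < invlogit(logit(0.2) + u + 0.5*rf)

* raw: unadjusted site-level outcome rates
bysort site: egen raw_rate = mean(y)

* ord: ordinary logistic regression, ignoring clustering
logit y rf

* fe: conditional fixed-effects logistic regression, conditioning on site
clogit y rf, group(site)

* re: random-intercept logistic regression with a site-level intercept
melogit y rf || site:
```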

Outliers can be classified using confidence interval or control limit techniques. Four confidence interval methods and three control limit methods are available: Byar approximation (cib), exact Poisson (cid), Rothman & Greenland (cir), Vandenbroucke (civ), exact binomial (cle), Wald (clw) and false discovery rate (fdr), with the confidence/control level specified as desired.
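As an example of one of these methods, Byar's approximation gives a closed-form confidence interval for a Poisson count, which can be divided by the expected count to give an interval for an observed/expected ratio. The sketch below uses the standard textbook formula with made-up counts. Whether hiersim applies it to exactly this quantity is an assumption, and the scalar names are hypothetical.

```stata
* Illustrative only: Byar approximation for a 95% CI around an O/E ratio
scalar o_cnt = 12                  // observed events at a site (made-up)
scalar e_cnt = 8.5                 // expected events at that site (made-up)
scalar z     = invnormal(0.975)    // two-sided 95% level

* Byar limits for the Poisson count
scalar lo_cnt = o_cnt * (1 - 1/(9*o_cnt) - z/(3*sqrt(o_cnt)))^3
scalar hi_cnt = (o_cnt + 1) * (1 - 1/(9*(o_cnt + 1)) + z/(3*sqrt(o_cnt + 1)))^3

display "O/E = " %5.3f o_cnt/e_cnt ///
    "  95% CI: " %5.3f lo_cnt/e_cnt " to " %5.3f hi_cnt/e_cnt
```

Under a confidence interval approach, a site would typically be flagged as an outlier when its interval excludes the benchmark value (1 for an O/E ratio).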

The number of simulations (nsim), risk factor odds ratio (rfor) and starting seed (seed) can be optionally specified.

Stata syntax:

prev(numlist) rq(numlist) op(numlist) ssdint(numlist) rfsd(numlist) csdr(numlist) siten(numlist) clinn(numlist) totn(numlist) cvmin(numlist) - lists of parameter values; at least one value is required for each parameter, and the first value is taken as the default for that parameter
saving(string) - specify location to save results from the simulation

ff(string) - specify parameter to be varied in a fully factorial design (default base)
model(string) - specify list of estimate methods, options: raw, ord, fe, re (default raw)
limit(string) - specify list of outlier classification methods (+level), method options: civ, cib, cid, cir, cle, clw, fdr (default cle95)
nsim(integer) - specify number of simulations to be run for each unique combination of parameters (default 100)
seed(integer) - specify starting seed for random number generation (default 202302)
rfor(real) - specify the risk factor odds ratio for the logit model (default 3)
REPLACE - specify that the simulation dataset can replace another with the same name
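A call using the options listed above might look like the following sketch. It assumes hiersim.ado is installed on the ado-path, that all arguments are passed as options after a comma, that limit() takes a method code followed by its level (as the default cle95 suggests), and that the replace option is typed in lowercase. The parameter values and the results file name are purely illustrative; the expected units and scales (for example, whether prev is a proportion) should be checked against the .ado file.

```stata
* Illustrative call: values and file name are hypothetical
hiersim, prev(0.10 0.05 0.20) rq(2) op(0.10) ssdint(0.3) rfsd(1) csdr(0.5) ///
    siten(50) clinn(200) totn(20000) cvmin(10)                             ///
    saving("hiersim_results") model(raw re) limit(cle95 fdr95)             ///
    nsim(100) seed(202302) rfor(3) replace
```

Here prev() is given three values, so 0.10 would serve as its default while the others are varied; every other parameter has a single value.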

