| Title: | Rank-Based Test to Evaluate a Surrogate Marker |
|---|---|
| Description: | Uses a novel rank-based nonparametric approach to evaluate a surrogate marker in a small sample size setting. Details are described in Parast et al (2024) <doi:10.1093/biomtc/ujad035>, in Hughes A et al (2025) <doi:10.1002/sim.70241>, and in Hughes A et al (2026) <doi:10.48550/arXiv.2605.03819>. A tutorial for this package can be found at <https://www.laylaparast.com/surrogaterank> and a Shiny App implementing the package can be found at <https://parastlab.shinyapps.io/SurrogateRankApp/>. |
| Authors: | Layla Parast [aut, cre], Arthur Hughes [aut] |
| Maintainer: | Layla Parast <[email protected]> |
| License: | GPL |
| Version: | 3.0 |
| Built: | 2026-05-20 19:36:53 UTC |
| Source: | https://github.com/laylaparast/surrogaterank |
Calculates the rank-based test statistic for Y and the rank-based test statistic for S and the difference, delta, along with corresponding standard error estimates
delta.calculate(full.data = NULL, yone = NULL, yzero = NULL, sone = NULL, szero = NULL)
full.data |
either full.data or yone, yzero, sone, szero must be supplied; if full.data is supplied it must be in the following format: one observation per row, Y is in the first column, S is in the second column, treatment group (0 or 1) is in the third column. |
yone |
primary outcome, Y, in group 1 |
yzero |
primary outcome, Y, in group 0 |
sone |
surrogate marker, S, in group 1 |
szero |
surrogate marker, S, in group 0 |
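The `full.data` layout described above can be assembled from the four group-wise vectors. The following is a minimal base-R sketch using simulated values; the numbers are illustrative and not taken from the package.

```r
set.seed(1)
# Simulated group-wise vectors (illustrative values only)
yone  <- rnorm(25, mean = 3)   # primary outcome, group 1
yzero <- rnorm(25, mean = 0)   # primary outcome, group 0
sone  <- yone  + rnorm(25)     # surrogate marker, group 1
szero <- yzero + rnorm(25)     # surrogate marker, group 0

# One observation per row: Y in column 1, S in column 2, treatment (0/1) in column 3
full.data <- cbind(
  c(yone, yzero),
  c(sone, szero),
  c(rep(1, length(yone)), rep(0, length(yzero)))
)

dim(full.data)         # 50 rows, 3 columns
table(full.data[, 3])  # 25 observations per group
```

Passing an object of this shape as `full.data` should then be an alternative to supplying the four vectors separately.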
u.y |
rank-based test statistic for Y |
u.s |
rank-based test statistic for S |
delta |
difference, u.y-u.s |
sd.u.y |
standard error estimate of u.y |
sd.u.s |
standard error estimate of u.s |
sd.delta |
standard error estimate of delta |
Layla Parast
data(example.data)
delta.calculate(yone = example.data$y1, yzero = example.data$y0, sone = example.data$s1, szero = example.data$s0)
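To make the returned quantities concrete, the sketch below computes a Mann-Whitney-type probability estimate for Y and for S and takes their difference. It assumes that `u.y` and `u.s` are of this form (the proportion of treated/untreated pairs in which the treated value is larger, counting ties as one half). The helper `rank.stat` is hypothetical, the data are simulated, and the package's own estimator and its standard errors may differ.

```r
# Hypothetical helper: estimates P(X1 > X0) + 0.5 * P(X1 == X0)
rank.stat <- function(x1, x0) {
  cmp <- outer(x1, x0, FUN = "-")   # all treated-minus-untreated differences
  mean((cmp > 0) + 0.5 * (cmp == 0))
}

set.seed(2)
y0 <- rnorm(25)
y1 <- rnorm(25, mean = 1)
s0 <- y0 + rnorm(25, sd = 0.5)
s1 <- y1 + rnorm(25, sd = 0.5)

u.y   <- rank.stat(y1, y0)   # rank-based statistic for Y
u.s   <- rank.stat(s1, s0)   # rank-based statistic for S
delta <- u.y - u.s           # difference between the two
c(u.y = u.y, u.s = u.s, delta = delta)
```

Under this assumption, a delta close to zero indicates that the treatment effect on S resembles the treatment effect on Y on the probability scale.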
This function calculates the difference in treatment effects on a univariate marker
and on a continuous primary response. This extends the delta.calculate() function
from the SurrogateRank package to the case where samples may be paired instead of
independent, and where a two sided test is desired.
delta.calculate.extension(yone, yzero, sone, szero, alpha = 0.05, paired = FALSE)
yone |
numeric vector of primary response values in the treated group. |
yzero |
numeric vector of primary response values in the untreated group. |
sone |
matrix or dataframe of surrogate candidates in the treated group
with dimension |
szero |
matrix or dataframe of surrogate candidates in the untreated group
with dimension |
alpha |
significance level of test, default is |
paired |
logical flag giving if the data is independent or paired. If
|
This function estimates the difference (delta) between two rank-based statistics
(e.g., Wilcoxon statistics or paired ranks) for a primary outcome and a surrogate,
under either an independent or paired design.
A list with the following elements:
u.y: Rank-based test statistic for the primary outcome
u.s: Rank-based test statistic for the surrogate
delta.estimate: Estimated difference between outcome and surrogate statistics
sd.u.y: Standard deviation of the outcome statistic
sd.u.s: Standard deviation of the surrogate statistic
sd.delta: Standard error of the delta estimate
Arthur Hughes, Layla Parast
# Load data
data("example.data")
yone <- example.data$y1
yzero <- example.data$y0
sone <- example.data$s1
szero <- example.data$s0
delta.calculate.extension.result <- delta.calculate.extension(
  yone, yzero, sone, szero,
  paired = TRUE
)
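The `paired` flag changes how observations are compared. As a rough base-R analogy (not the package's internal computation), `stats::wilcox.test()` makes the same distinction: with `paired = FALSE` every treated value is compared with every untreated value, whereas with `paired = TRUE` the test works on within-pair differences, so both vectors must have equal length and matching order. The data below are simulated for illustration.

```r
set.seed(3)
n <- 20
pre  <- rnorm(n)                    # e.g. measurement before treatment
post <- pre + rnorm(n, mean = 0.8)  # the same individuals after treatment

# Independent design: rank-sum (Mann-Whitney) statistic, reported as W
indep <- wilcox.test(post, pre, paired = FALSE)

# Paired design: signed-rank statistic on post - pre, reported as V
pair <- wilcox.test(post, pre, paired = TRUE)

names(indep$statistic)
names(pair$statistic)
```

When the pairing is real, ignoring it typically discards information, which is one reason to set `paired = TRUE` for pre/post designs.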
Function to perform meta-analysis of summary statistics and hypothesis testing for a single marker
delta.reml.meta(
  delta = NULL, sd.delta = NULL, epsilon = NULL, alpha = 0.05,
  alternative = "two.sided", tol = 1e-10, verbose = FALSE,
  test = "knha", meta.analysis.method = "RE"
)
delta |
numeric vector of delta values per study |
sd.delta |
numeric vector of standard error of delta values per study |
epsilon |
numeric non-inferiority margin for testing cross-study validity |
alpha |
numeric significance level of test. Note : using the two-one-sided test ( |
alternative |
character giving the alternative hypothesis type for testing the summary effect.
One of |
tol |
numeric convergence tolerance for finding a root of the score equation |
verbose |
logical flag indicating whether messages should be printed, defaults to |
test |
character giving the type of test to be performed. The default is |
meta.analysis.method |
character giving the meta-analysis method to be used. The default is |
a list with elements
n.studies : numeric, number of studies considered
tau2 : numeric, estimated tau-squared (between-study heterogeneity)
mu.delta : numeric, estimated mean of distribution of delta
se.delta : numeric, standard error of delta summary estimate
ci.delta.upper : numeric, upper confidence interval for mean of delta.
Note : if using the non-inferiority test (i.e. alternative = "less"),
these bounds correspond to a (1-alpha)*100% confidence interval,
whereas the two-one-sided test (i.e. alternative = "two.sided")
corresponds to a (1-2alpha)*100% interval.
ci.delta.lower : numeric, lower confidence interval for mean of delta
p.lower : numeric, if alternative is "two.sided", gives the p-value corresponding to
testing the null hypothesis that delta is less than -epsilon.
Value is NULL if alternative is "less".
p.upper : numeric, if alternative is "two.sided", gives the p-value corresponding to
testing the null hypothesis that delta is greater than epsilon.
Value is NULL if alternative is "less".
p : numeric, consensus p-value for hypothesis test for either the two-one-sided test or
the non-inferiority test.
Q : numeric, Cochran's Q-statistic for heterogeneity between studies
I2 : numeric, Higgins-Thompson I-squared statistic representing the total percentage of variation
attributable to between-study heterogeneity
weights.tau : numeric vector of raw study weights for the summary measure
weights.tau.relative : numeric vector of relative study weights for the summary measure,
such that each weight is a percentage adding to 100%
weights.tau.sum : numeric, sum of weights.tau
Arthur Hughes
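As no example accompanies this function, the following base-R sketch illustrates the kind of summary it returns, under deliberately simplified assumptions. It swaps in the closed-form DerSimonian-Laird estimator of tau-squared in place of the iterative REML fit, and uses a normal approximation rather than the `"knha"` test, so its numbers would generally not match `delta.reml.meta()`. The input values are made up.

```r
# Made-up per-study estimates and standard errors
delta    <- c(0.05, 0.12, -0.02, 0.08, 0.10)
sd.delta <- c(0.06, 0.05, 0.08, 0.04, 0.07)
epsilon  <- 0.2
k        <- length(delta)

# Cochran's Q from fixed-effect (inverse-variance) weights
w.fixed  <- 1 / sd.delta^2
mu.fixed <- sum(w.fixed * delta) / sum(w.fixed)
Q        <- sum(w.fixed * (delta - mu.fixed)^2)

# DerSimonian-Laird tau^2 (stand-in for REML), truncated at zero
c.dl <- sum(w.fixed) - sum(w.fixed^2) / sum(w.fixed)
tau2 <- max(0, (Q - (k - 1)) / c.dl)

# Higgins-Thompson I^2 as a percentage
I2 <- max(0, (Q - (k - 1)) / Q) * 100

# Random-effects weights, pooled mean and its standard error
weights.tau          <- 1 / (sd.delta^2 + tau2)
weights.tau.relative <- 100 * weights.tau / sum(weights.tau)
mu.delta             <- sum(weights.tau * delta) / sum(weights.tau)
se.delta             <- sqrt(1 / sum(weights.tau))

# Two one-sided tests of -epsilon < mu < epsilon (normal approximation)
p.lower <- pnorm((mu.delta + epsilon) / se.delta, lower.tail = FALSE)  # H0: mu <= -epsilon
p.upper <- pnorm((mu.delta - epsilon) / se.delta)                      # H0: mu >= epsilon
p       <- max(p.lower, p.upper)
```

Taking the larger of the two one-sided p-values means that equivalence is claimed only when both one-sided null hypotheses are rejected.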
Calculates the estimated power to detect a valid surrogate given a total sample size and specified alternative
est.power(n.total, rho = 0.8, u.y.alt, delta.alt, power.want.s = 0.7, alpha = 0.05)
n.total |
total sample size in study |
rho |
rank correlation between Y and S in group 0, default is 0.8 |
u.y.alt |
specified alternative for u.y |
delta.alt |
specified alternative for delta |
power.want.s |
desired power for u.s, default is 0.7 |
alpha |
significance level, default is 0.05 |
estimated power
Layla Parast
est.power(n.total = 50, rho = 0.8, u.y.alt = 0.9, delta.alt = 0.1)
Example data used to illustrate the functions
data("example.data")
A list with 4 elements representing 25 observations from a treatment group (group 1) and 25 observations from a control group (group 0):
y1: the primary outcome, Y, in group 1
y0: the primary outcome, Y, in group 0
s1: the surrogate marker, S, in group 1
s0: the surrogate marker, S, in group 0
data(example.data)
A simulated high‑dimensional dataset for demonstrating the RISE methodology implemented in SurrogateRank. The data contains primary response and 1000 surrogate candidates from 25 treated individuals and 25 untreated individuals, where 10% of the surrogate candidates are "valid".
data("example.data.highdim", package = "SurrogateRank")
A list containing :
primary response in treated
primary response in untreated
1000 surrogate candidates in treated
1000 surrogate candidates in untreated
for each surrogate, null false if the surrogate is valid
Simulated for package examples.
data("example.data.highdim", package = "SurrogateRank")
head(example.data.highdim)
A simulated high-dimensional, multi-study dataset for demonstrating the RISE-meta methodology implemented in SurrogateRank, generated with the generate.example.data.highdim.multistudy() function. The data contains treatment effect measures on the primary endpoint and on 500 surrogate candidates, where the first 50 of these candidates are "valid" surrogates.
data("example.data.highdim.multistudy", package = "SurrogateRank")
A list with the following components:
Numeric vector of length M containing treatment effects on the primary endpoint across trials.
Numeric matrix of dimension M times J containing treatment effects on each of the J candidate markers.
Vector of length J containing the truth of surrogate validity. null false corresponds to valid surrogates, whereas null true corresponds to invalid surrogates.
Value of epsilon used to define surrogate validity.
Simulated for package examples.
data("example.data.highdim.multistudy", package = "SurrogateRank")
head(example.data.highdim.multistudy)
A simulated high‑dimensional dataset for demonstrating the RISE-Meta methodology implemented in SurrogateRank. The data contains primary response and 100 surrogate candidates from 25 treated individuals and 25 untreated individuals across 5 different studies, where 10% of the surrogate candidates are "valid".
data("example.data.highdim.multistudy.ipd", package = "SurrogateRank")
A list containing :
primary response in treated
primary response in untreated
1000 surrogate candidates in treated
1000 surrogate candidates in untreated
study names for treated
study names for untreated
for each surrogate, null false if the surrogate is valid
Simulated for package examples.
data("example.data.highdim.multistudy.ipd", package = "SurrogateRank")
head(example.data.highdim.multistudy.ipd)
Generates individual participant data for high-dimensional surrogate candidates using one of two data generating processes, as described in Hughes A et al (2025) https://doi.org/10.1002/sim.70241.
generate.example.data.highdim(
  n1, n0, p, prop_valid, valid_sigma = 1, corr = 0, mode = "simple",
  y0_mean = 0, y0_sd = 1, y1_mean = 3, y1_sd = 1,
  s0_mean = 0, s0_sd = 1, s1_mean = 0, s1_sd = 1, seed = 12345
)
n1 |
positive numeric giving the sample size in the treated group |
n0 |
positive numeric giving the sample size in the untreated group |
p |
positive numeric giving the number of markers to generate |
prop_valid |
numeric between 0 and 1 (inclusive) giving the proportion of surrogate candidates to generate as valid. |
valid_sigma |
non-negative numeric giving the standard deviation for valid candidates |
corr |
non-negative numeric giving the correlation between the surrogate candidates |
mode |
character taking values in c("simple", "complex"). If "simple", generates all variables with (multivariate) normal distributions. Else, uses a more complex exponential distribution. |
y0_mean |
numeric giving the mean of the primary endpoint in the untreated group |
y0_sd |
non-negative numeric giving the standard deviation of the primary endpoint in the untreated group |
y1_mean |
numeric giving the mean of the primary endpoint in the treated group |
y1_sd |
non-negative numeric giving the standard deviation of the primary endpoint in the treated group |
s0_mean |
numeric giving the mean of the surrogate candidates in the untreated group |
s0_sd |
non-negative numeric giving the standard deviation of the surrogate candidates in the untreated group |
s1_mean |
numeric giving the mean of the surrogate candidates in the treated group |
s1_sd |
non-negative numeric giving the standard deviation of the surrogate candidates in the treated group |
seed |
numeric giving a seed for reproducibility |
A list with the following components:
vector containing primary endpoint values in treated group
vector containing primary endpoint values in untreated group
n1 times p matrix containing surrogate candidate values in treated group
n0 times p matrix containing surrogate candidate values in untreated group
character vector giving the truth behind the null hypothesis for each surrogate candidate
res <- generate.example.data.highdim(n1 = 25, n0 = 25, p = 500, prop_valid = 1)
dim(res$s1) # 25 x 500
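For orientation, a simplified base-R sketch of what a "simple"-mode generator might look like is given below. The construction of valid candidates as the primary response plus normal noise with standard deviation `valid_sigma`, the treatment of invalid candidates as pure noise, the component name `hyp`, and the function name `sketch.highdim` are all assumptions made for illustration. The actual data generating processes are those described in Hughes A et al (2025).

```r
sketch.highdim <- function(n1, n0, p, prop_valid, valid_sigma = 1, seed = 12345) {
  set.seed(seed)
  n.valid <- round(p * prop_valid)

  y1 <- rnorm(n1, mean = 3, sd = 1)  # treated primary response
  y0 <- rnorm(n0, mean = 0, sd = 1)  # untreated primary response

  # Invalid candidates: pure noise with no treatment effect (assumed)
  s1 <- matrix(rnorm(n1 * p), nrow = n1, ncol = p)
  s0 <- matrix(rnorm(n0 * p), nrow = n0, ncol = p)

  # Valid candidates: track the primary response up to noise (assumed)
  if (n.valid > 0) {
    idx <- seq_len(n.valid)
    s1[, idx] <- y1 + matrix(rnorm(n1 * n.valid, sd = valid_sigma), nrow = n1)
    s0[, idx] <- y0 + matrix(rnorm(n0 * n.valid, sd = valid_sigma), nrow = n0)
  }

  # Truth label per candidate: the null hypothesis is false for valid surrogates
  hyp <- rep(c("null false", "null true"), times = c(n.valid, p - n.valid))
  list(y1 = y1, y0 = y0, s1 = s1, s0 = s0, hyp = hyp)
}

res.sketch <- sketch.highdim(n1 = 25, n0 = 25, p = 100, prop_valid = 0.1)
dim(res.sketch$s1)
table(res.sketch$hyp)
```

The sketch ignores the `corr` and `mode` arguments entirely; it is only meant to show the shape of the returned list.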
Generates simulated trial-level treatment effects for multiple surrogate markers across multiple studies, including both valid and invalid surrogates. This function implements a hierarchical random-effects model: true trial-level effects are drawn from marker-specific means with between-trial heterogeneity, and observed trial-level effects include additional within-study sampling error.
generate.example.data.highdim.multistudy(
  epsilon = 0.2, M = 5, sample_sizes = c(25, 50, 100, 150, 250), J = 500,
  prop_valid = 0.1, u_tau_min = 0.01, u_tau_max = 0.1,
  u_nu_min = 0.01, u_nu_max = 0.1, prop_invalid_under = 0.5,
  invalid_at_boundary = FALSE, invalid_mean_discrete = NULL,
  valid_mean_discrete = NULL, seed = 12345
)
epsilon |
Numeric in (0,1). Defines the region of validity for the
surrogate marker means. Markers with mean discrepancy within
|
M |
Integer. Number of trials (studies) to simulate. Must be > 1. |
sample_sizes |
Numeric vector of length |
J |
Integer. Total number of markers to simulate (valid + invalid). |
prop_valid |
Numeric, between 0 and 1. Proportion of markers that are valid. |
u_tau_min |
Numeric >= 0. Lower bound of marker-specific between-trial
heterogeneity variance ( |
u_tau_max |
Numeric >= u_tau_min. Upper bound of marker-specific
between-trial heterogeneity variance ( |
u_nu_min |
Numeric > 0. Lower bound of marker-specific variance
component ( |
u_nu_max |
Numeric >= u_nu_min. Upper bound of marker-specific variance
component ( |
prop_invalid_under |
Numeric, between 0 and 1. Probability that an invalid marker underestimates the treatment effect on Y. |
invalid_at_boundary |
default |
invalid_mean_discrete |
vector of discrete numeric values from which to sample the true means of invalid surrogates. These values must be greater than or equal to epsilon in absolute value. |
valid_mean_discrete |
vector of discrete numeric values from which to sample the true means of valid surrogates. These values must be smaller than epsilon in absolute value. |
seed |
numeric giving a seed for reproducibility |
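The function first draws marker-level parameters: the true mean discrepancy from the validity or invalidity region, the between-trial heterogeneity variance from a uniform distribution (bounded by u_tau_min and u_tau_max), and the marker-specific variance component from a uniform distribution (bounded by u_nu_min and u_nu_max).
Then, for each trial, true trial-level effects are drawn around the marker-specific mean with between-trial heterogeneity, and observed effects include independent within-study sampling error.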
A list with the following components:
M x J matrix of observed trial-level discrepancies, including sampling error.
M x J matrix of within-study standard deviations.
Numeric vector of sample sizes for each trial.
Character vector of length J, "null false" for valid markers and "null true" for invalid markers.
Numeric vector of true marker-level mean discrepancies.
Numeric vector of marker-specific between-trial heterogeneity variances.
res <- generate.example.data.highdim.multistudy(
  epsilon = 0.2, M = 5, sample_sizes = c(25, 50, 100, 150, 250),
  J = 500, prop_valid = 0.1
)
dim(res$delta) # 5 x 500
head(res$mu.true)
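The two-level structure described in the details can be sketched in a few lines of base R. Everything distributional here is an assumption made for illustration: normal draws at both levels, a within-study standard deviation of the form sqrt(nu / n), valid means drawn uniformly inside (-epsilon, epsilon), and invalid means placed outside that interval with a random sign. The object names are hypothetical as well.

```r
set.seed(12345)
epsilon <- 0.2; M <- 5; J <- 50; prop_valid <- 0.1
sample_sizes <- c(25, 50, 100, 150, 250)
n.valid <- round(J * prop_valid)

# Marker-level means: valid inside the margin, invalid outside it
mu.valid   <- runif(n.valid, -epsilon, epsilon)
mu.invalid <- sample(c(-1, 1), J - n.valid, replace = TRUE) *
  runif(J - n.valid, epsilon, 2 * epsilon)
mu <- c(mu.valid, mu.invalid)

tau2 <- runif(J, 0.01, 0.1)  # between-trial heterogeneity variance
nu   <- runif(J, 0.01, 0.1)  # marker-specific variance component

# True trial-level effects: one row per trial, one column per marker
true.effect <- matrix(
  rnorm(M * J, mean = rep(mu, each = M), sd = rep(sqrt(tau2), each = M)),
  nrow = M
)

# Within-study standard deviations shrink with trial sample size (assumed form)
sd.within <- sqrt(outer(1 / sample_sizes, nu))

# Observed effects add independent within-study sampling error
observed <- true.effect + matrix(rnorm(M * J, sd = as.vector(sd.within)), nrow = M)
dim(observed)
```

Because matrices are filled column by column, repeating each marker-level parameter `M` times keeps every column tied to a single marker.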
Generates individual participant data for high-dimensional surrogate candidates using one of two data generating processes, as described in Hughes A et al (2025) https://doi.org/10.1002/sim.70241.
generate.example.data.highdim.multistudy.ipd(
  M, n1, n0, p, prop_valid, valid_sigma = 1, corr = 0, mode = "simple",
  y0_mean = 0, y0_sd = 1, y1_mean = 3, y1_sd = 1,
  s0_mean = 0, s0_sd = 1, s1_mean = 0, s1_sd = 1, seed = 12345
)
M |
number of studies |
n1 |
positive numeric giving the sample size in the treated groups |
n0 |
positive numeric giving the sample size in the untreated groups |
p |
positive numeric giving the number of markers to generate |
prop_valid |
numeric between 0 and 1 (inclusive) giving the proportion of surrogate candidates to generate as valid. |
valid_sigma |
non-negative numeric giving the standard deviation for valid candidates |
corr |
non-negative numeric giving the correlation between the surrogate candidates |
mode |
character taking values in c("simple", "complex"). If "simple", generates all variables with (multivariate) normal distributions. Else, uses a more complex exponential distribution. |
y0_mean |
numeric giving the mean of the primary endpoint in the untreated group |
y0_sd |
non-negative numeric giving the standard deviation of the primary endpoint in the untreated group |
y1_mean |
numeric giving the mean of the primary endpoint in the treated group |
y1_sd |
non-negative numeric giving the standard deviation of the primary endpoint in the treated group |
s0_mean |
numeric giving the mean of the surrogate candidates in the untreated group |
s0_sd |
non-negative numeric giving the standard deviation of the surrogate candidates in the untreated group |
s1_mean |
numeric giving the mean of the surrogate candidates in the treated group |
s1_sd |
non-negative numeric giving the standard deviation of the surrogate candidates in the treated group |
seed |
numeric giving a seed for reproducibility |
A list with the following components:
vector containing primary endpoint values in treated group
vector containing primary endpoint values in untreated group
n1 times p matrix containing surrogate candidate values in treated group
n0 times p matrix containing surrogate candidate values in untreated group
study names for treated samples
study names for untreated samples
character vector giving the truth behind the null hypothesis for each surrogate candidate
res <- generate.example.data.highdim.multistudy.ipd(
  M = 5, n1 = 25, n0 = 25, p = 500, prop_valid = 1
)
dim(res$s1) # (5 studies x 25 individuals = 125) x 500
A set of high-dimensional surrogate candidates are evaluated jointly. Strength of surrogacy is assessed through a rank-based measure of the similarity in treatment effects on a candidate surrogate and the primary response.
rise.evaluate(
  yone, yzero, sone, szero, alpha = 0.05, power.want.s = NULL,
  epsilon = NULL, u.y.hyp = NULL, p.correction = "BH", n.cores = 1,
  alternative = "two.sided", paired = FALSE, return.all.evaluate = TRUE,
  return.plot.evaluate = TRUE, evaluate.weights = TRUE,
  screening.weights = NULL, markers = NULL
)
yone |
numeric vector of primary response values in the treated group. |
yzero |
numeric vector of primary response values in the untreated group. |
sone |
matrix or dataframe of surrogate candidates in the treated group
with dimension |
szero |
matrix or dataframe of surrogate candidates in the untreated group
with dimension |
alpha |
significance level for determining surrogate candidates. Default is
|
power.want.s |
numeric in (0,1) - power desired for a test of treatment effect based
on the surrogate candidate. Either this or |
epsilon |
numeric in (0,1) - non-inferiority margin for determining surrogate
validity. Either this or |
u.y.hyp |
hypothesised value of the treatment effect on the primary response on the probability scale. If not given, it will be estimated based on the observations. |
p.correction |
character. Method for p-value adjustment (see |
n.cores |
numeric giving the number of cores to commit to parallel computation
in order to improve computational time through the |
alternative |
character giving the alternative hypothesis type. One of
|
paired |
logical flag giving if the data is independent or paired. If
|
return.all.evaluate |
logical flag. If |
return.plot.evaluate |
logical flag. If |
evaluate.weights |
logical flag. If |
screening.weights |
dataframe with columns |
markers |
a vector of marker names (column names of szero and sone) to evaluate. If not given, will default to evaluating all markers in the dataframes. |
a list with
individual.metrics if return.all.evaluate=TRUE, a dataframe of
evaluation results for each significant marker.
gamma.s a list with elements gamma.s.one and gamma.s.zero, giving
the combined surrogate marker in the treated and untreated groups, respectively.
gamma.s.evaluate : a dataframe giving the evaluation of gamma.s
gamma.s.plot : a ggplot2 plot showing gamma.s against the primary response
on the rank-scale.
Arthur Hughes
# Load high-dimensional example data
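The notion of a combined marker can be illustrated with a weighted sum of standardised candidates. This is a sketch under assumptions: standardisation uses the pooled mean and standard deviation across both groups, the weights and the column names `marker` and `weight` are made up, and the package's actual construction of `gamma.s` may differ.

```r
set.seed(4)
n1 <- 25; n0 <- 25; p <- 3
marker.names <- paste0("m", seq_len(p))
sone  <- matrix(rnorm(n1 * p, mean = 1), n1, p, dimnames = list(NULL, marker.names))
szero <- matrix(rnorm(n0 * p, mean = 0), n0, p, dimnames = list(NULL, marker.names))

# Illustrative screening weights, normalised to sum to one
screening.weights <- data.frame(marker = marker.names, weight = c(5, 3, 2))
w <- screening.weights$weight / sum(screening.weights$weight)

# Standardise each marker using both groups together
pooled <- rbind(sone, szero)
z      <- scale(pooled)  # column-wise centring and scaling

# Weighted sum, then split back into treated and untreated
gamma   <- as.vector(z %*% w)
gamma.s <- list(
  gamma.s.one  = gamma[seq_len(n1)],
  gamma.s.zero = gamma[n1 + seq_len(n0)]
)
str(gamma.s)
```

Collapsing the selected candidates into one score lets the combined marker be evaluated like a single surrogate.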
Function to perform the evaluation stage of RISE-meta : Meta-Analysis of High-Dimensional Surrogate Markers
rise.evaluate.meta(
  yone, yzero, sone, szero, studyone, studyzero, alpha = 0.05,
  power.want.s.study = NULL, epsilon.study = NULL,
  epsilon.meta.mode = "user", epsilon.meta = NULL, u.y.hyp = NULL,
  p.correction = "BH", n.cores = 1, alternative = "two.sided",
  test = "knha", paired.all = FALSE, paired.studies = NULL,
  evaluate.weights = TRUE, screening.weights = NULL,
  weight.mode = "diff.epsilon", markers = NULL,
  return.all.evaluate = FALSE, return.forest.plot = TRUE,
  return.fit.plot = TRUE, show.pooled.effect = TRUE,
  meta.analysis.method = "RE"
)
yone |
numeric vector of primary response values in the treated participants |
yzero |
numeric vector of primary response values in the untreated participants |
sone |
matrix or dataframe of surrogate candidates in the treated group with dimension
|
szero |
matrix or dataframe of surrogate candidates in the untreated group with dimension
|
studyone |
character vector of length |
studyzero |
character vector of length |
alpha |
significance level for determining valid surrogates. Default is |
power.want.s.study |
numeric in (0,1) - power desired for a test of treatment effect based on the
surrogate candidate. If |
epsilon.study |
numeric in (0,1) - non-inferiority margin for determining surrogate validity in the
within-study screening phase. If |
epsilon.meta.mode |
character string specifying the mode to choose the value of the acceptable margin defined
by epsilon. By default, this is set to "user", where the value of epsilon is fixed by the user, defined by the
value of the argument |
epsilon.meta |
numeric in (0,1) - non-inferiority margin for determining surrogate validity in the meta-analysis stage. Must be specified. |
u.y.hyp |
hypothesised value of the treatment effect on the primary response on the probability scale. If not given, it will be estimated based on the observations. |
p.correction |
character. Method for p-value adjustment (see |
n.cores |
numeric giving the number of cores to commit to parallel computation in order to
improve computational time through the |
alternative |
character giving the alternative hypothesis type. One of
|
test |
character giving the type of test to be performed. The default is |
paired.all |
logical flag giving if the data is independent or paired. If |
paired.studies |
character vector specifying the names of the studies in |
evaluate.weights |
logical flag. If |
screening.weights |
dataframe with columns |
weight.mode |
character giving the type of weighting to return to be used in case |
markers |
a vector of marker names (column names of szero and sone) to evaluate. If not given, will default to evaluating all markers in the dataframes. |
return.all.evaluate |
logical flag. If |
return.forest.plot |
logical flag. If |
return.fit.plot |
logical flag. If |
show.pooled.effect |
logical flag. If |
meta.analysis.method |
character giving the meta-analysis method to be used. The default is |
a list with elements
individual.metrics : if return.all.evaluate=TRUE, a list containing
dataframes individual.metrics.study (per-study results for individual markers) and
individual.metrics.meta (meta-analysis results for individual markers).
evaluation.metrics.study : study-level results for the combined marker, gamma.
evaluation.metrics.meta : meta-analysis results for the combined marker, gamma.
gamma.s : a list with elements gamma.s.one and gamma.s.zero, giving
the values of the combined surrogate marker gamma in the treated and untreated groups, respectively.
gamma.s.plot : if return.forest.plot and/or return.fit.plot are TRUE,
returns evaluation plots as a list
Arthur Hughes
data("example.data.highdim.multistudy.ipd")
yone <- example.data.highdim.multistudy.ipd$y1
yzero <- example.data.highdim.multistudy.ipd$y0
sone <- example.data.highdim.multistudy.ipd$s1
szero <- example.data.highdim.multistudy.ipd$s0
studyone <- example.data.highdim.multistudy.ipd$study1
studyzero <- example.data.highdim.multistudy.ipd$study0
rise.meta.screen.result <- rise.screen.meta(
  yone, yzero, sone, szero, studyone, studyzero,
  epsilon.study = 0.2, epsilon.meta = 0.2
)
markers <- rise.meta.screen.result[["significant.markers"]]
screening.weights <- rise.meta.screen.result[["screening.weights"]]
rise.meta.evaluate.result <- rise.evaluate.meta(
  yone, yzero, sone, szero, studyone, studyzero,
  epsilon.meta = 0.2, markers = markers,
  screening.weights = screening.weights, epsilon.study = 0.2
)
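The role of `studyone` and `studyzero` is to label each row of the stacked data with its study. A minimal sketch of how such labels can drive a per-study computation is given below, using simulated data and a Mann-Whitney-type probability as an assumed stand-in for the package's statistic. The helper `rank.stat` is hypothetical.

```r
# Hypothetical helper: estimates P(X1 > X0) + 0.5 * P(X1 == X0)
rank.stat <- function(x1, x0) {
  cmp <- outer(x1, x0, FUN = "-")
  mean((cmp > 0) + 0.5 * (cmp == 0))
}

set.seed(6)
M <- 3; n <- 25
studyone  <- rep(paste0("study", seq_len(M)), each = n)  # one label per treated row
studyzero <- rep(paste0("study", seq_len(M)), each = n)  # one label per untreated row
yone  <- rnorm(M * n, mean = 1)
yzero <- rnorm(M * n, mean = 0)

# Split each group by study, then compute the statistic study by study
y1.by.study <- split(yone, studyone)
y0.by.study <- split(yzero, studyzero)
u.y.by.study <- mapply(rank.stat, y1.by.study, y0.by.study[names(y1.by.study)])
u.y.by.study
```

Indexing the untreated list by the treated list's names guards against the two groups listing their studies in different orders.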
A set of high-dimensional surrogate candidates are screened one-by-one to identify strong candidates. Strength of surrogacy is assessed through a rank-based measure of the similarity in treatment effects on a candidate surrogate and the primary response. P-values corresponding to hypothesis testing on this measure are corrected for the high number of statistical tests performed.
rise.screen(
  yone, yzero, sone, szero, alpha = 0.05, power.want.s = NULL,
  epsilon = NULL, u.y.hyp = NULL, p.correction = "BH", n.cores = 1,
  alternative = "two.sided", paired = FALSE, return.all.screen = TRUE,
  return.all.weights = FALSE, weight.mode = "inverse.delta",
  normalise.weights = TRUE, verbose = T
)
yone |
numeric vector of primary response values in the treated group. |
yzero |
numeric vector of primary response values in the untreated group. |
sone |
matrix or dataframe of surrogate candidates in the treated group with dimension
|
szero |
matrix or dataframe of surrogate candidates in the untreated group with dimension
|
alpha |
significance level for determining surrogate candidates. Default is |
power.want.s |
numeric in (0,1) - power desired for a test of treatment effect based on the
surrogate candidate. Either this or |
epsilon |
numeric in (0,1) - non-inferiority margin for determining surrogate validity. Either
this or |
u.y.hyp |
hypothesised value of the treatment effect on the primary response on the probability scale. If not given, it will be estimated based on the observations. |
p.correction |
character. Method for p-value adjustment (see |
n.cores |
numeric giving the number of cores to commit to parallel computation in order to
improve computational time through the |
alternative |
character giving the alternative hypothesis type. One of
|
paired |
logical flag giving if the data is independent or paired. If |
return.all.screen |
logical flag. If |
return.all.weights |
logical flag. If |
weight.mode |
character giving the type of weighting to return. One of
|
normalise.weights |
logical flag. If |
verbose |
logical flag. If |
a list with elements
screening.metrics : dataframe of screening results (for each candidate marker - number of observations n,
u.y, u.s, delta, CI, sd, epsilon, p-values).
significant.markers: character vector of markers with p_adjusted < alpha
screening.weights: dataframe giving marker names and the inverse absolute value of the
associated deltas.
Arthur Hughes
# Load high-dimensional example data
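The screening logic (test each candidate, adjust for multiplicity, keep the significant ones, weight by inverse absolute delta) can be sketched from per-marker summaries. The sketch assumes a one-sided normal-approximation test of the null hypothesis delta >= epsilon, made-up inputs, and hypothetical column names `marker` and `weight`; only `stats::p.adjust()` with the `"BH"` method is standard R.

```r
set.seed(5)
p <- 40; alpha <- 0.05; epsilon <- 0.2
markers <- paste0("marker", seq_len(p))

# Made-up per-marker estimates: the first 8 have small delta (promising candidates)
delta    <- c(rnorm(8, mean = 0, sd = 0.02), rnorm(p - 8, mean = 0.4, sd = 0.05))
sd.delta <- rep(0.05, p)

# One-sided p-value for H0: delta >= epsilon (non-inferiority)
p.unadjusted <- pnorm((delta - epsilon) / sd.delta)

# Multiplicity correction and selection
p.adjusted          <- p.adjust(p.unadjusted, method = "BH")
significant.markers <- markers[p.adjusted < alpha]

# Weights: inverse absolute delta, normalised to sum to one
keep   <- markers %in% significant.markers
weight <- 1 / abs(delta[keep])
screening.weights <- data.frame(
  marker = markers[keep],
  weight = weight / sum(weight)
)
screening.weights
```

Markers whose estimated delta is closest to zero receive the largest weights, which is the intent behind inverse-delta weighting.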
The RISE screening algorithm is applied to each study using a rank-based measure of treatment effect similarity. In the second stage, these effect estimates are combined using a random-effects meta-analysis and the retained markers are those for which there is strong evidence of surrogacy across many studies.
rise.screen.meta(
  yone, yzero, sone, szero, studyone, studyzero, alpha = 0.05,
  power.want.s.study = NULL, epsilon.study = NULL,
  epsilon.meta.mode = "user", epsilon.meta = NULL, u.y.hyp = NULL,
  p.correction = "BH", n.cores = 1, alternative = "two.sided",
  test = "knha", paired.all = FALSE, paired.studies = NULL,
  return.all.screen = TRUE, return.all.weights = FALSE,
  weight.mode = "diff.epsilon", return.screen.plot = TRUE,
  screen.plot.topN = 15, screen.plot.point.estimate = FALSE,
  normalise.weights = TRUE, return.forest.plot = TRUE,
  return.fit.plot = TRUE, show.pooled.effect = TRUE,
  return.study.similarity.plot = TRUE, return.evaluate.results = TRUE,
  meta.analysis.method = "RE"
)
yone |
numeric vector of primary response values in the treated participants |
yzero |
numeric vector of primary response values in the untreated participants |
sone |
matrix or dataframe of surrogate candidates in the treated group with dimension
|
szero |
matrix or dataframe of surrogate candidates in the untreated group with dimension
|
studyone |
character vector of length |
studyzero |
character vector of length |
alpha |
significance level for determining surrogate candidates in both stages. Default is 0.05. |
power.want.s.study |
numeric in (0,1) - power desired for a test of treatment effect based on the
surrogate candidate. Either this or epsilon.study must be specified. |
epsilon.study |
numeric in (0,1) - non-inferiority margin for determining surrogate validity in the
within-study screening phase. Either this or power.want.s.study must be specified. |
epsilon.meta.mode |
character string specifying how the acceptable margin epsilon is chosen for the
meta-analysis stage. By default, this is set to "user", where the value of epsilon is fixed by the user through the
argument epsilon.meta. |
epsilon.meta |
numeric in (0,1) - fixed non-inferiority margin for determining surrogate validity in the meta-analysis stage. |
u.y.hyp |
hypothesised value of the treatment effect on the primary response on the probability scale. If not given, it will be estimated based on the observations. |
p.correction |
character. Method for p-value adjustment (see p.adjust). Default is "BH". |
n.cores |
numeric giving the number of cores to commit to parallel computation in order to
improve computational time through the |
alternative |
character giving the alternative hypothesis type. One of "less" (non-inferiority test) or
"two.sided" (two one-sided tests). Default is "two.sided". |
test |
character giving the type of test to be performed. The default is "knha". |
paired.all |
logical flag indicating whether the data are independent or paired. Default is FALSE. |
paired.studies |
character vector specifying the names of the studies in |
return.all.screen |
logical flag. Default is TRUE. |
return.all.weights |
logical flag. Default is FALSE. |
weight.mode |
character giving the type of weighting to return. Default is "diff.epsilon". |
return.screen.plot |
logical flag. Default is TRUE. |
screen.plot.topN |
number of predictors to display in the screening results figure. Default is 15. |
screen.plot.point.estimate |
logical flag. Default is FALSE. |
normalise.weights |
logical flag. Default is TRUE. |
return.forest.plot |
logical flag. Default is TRUE. |
return.fit.plot |
logical flag. Default is TRUE. |
show.pooled.effect |
logical flag. Default is TRUE. |
return.study.similarity.plot |
logical flag. Default is TRUE. |
return.evaluate.results |
logical flag. Default is TRUE. |
meta.analysis.method |
character giving the meta-analysis method to be used. The default is "RE". |
a list with elements
screening.metrics.study : dataframe of per-study results from RISE screening.
For each candidate marker - study name, study sample size, estimate of delta, standard error of delta.
screening.metrics.meta : dataframe of meta-analysis screening results.
For each candidate marker - number of studies n.studies,
estimate of mean delta value mu.delta,
its standard error se.delta, confidence interval and prediction interval,
estimate of tau-squared tau2, Cochran's Q-statistic and Higgins-Thompson I-Squared,
unadjusted and adjusted meta-analysis p-values, and standardised weights.
Note: with the non-inferiority test (i.e. alternative = "less"),
the intervals have level (1-alpha)*100%,
whereas with the two one-sided tests procedure (i.e. alternative = "two.sided")
they have level (1-2*alpha)*100%.
significant.markers: character vector of markers with meta-analysis p-values < alpha
screening.weights: dataframe giving marker names and the standardised meta-analysis weights
evaluation.metrics.study : dataframe of per-study results for the combined marker gamma, evaluated on the same data
evaluation.metrics.meta : dataframe of meta-analysis results for the combined marker gamma, evaluated on the same data
gamma.s.plot: if return.forest.plot, return.fit.plot, and/or return.study.similarity.plot
are TRUE, returns fitted evaluation plots on training data as a list.
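The meta-analysis quantities listed above (mu.delta, se.delta, tau2, Cochran's Q, I-squared) can be illustrated with a minimal base R sketch. It assumes a DerSimonian-Laird random-effects model with a normal-based interval, which is a stand-in chosen for brevity: the adjustment applied by the package default test = "knha" is not reproduced, and the per-study values are made up. The interval level follows one reading of the note on alternative.

```r
# Hypothetical per-study estimates of delta and their standard errors
delta.study <- c(0.02, 0.15, -0.05, 0.10)
se.study    <- c(0.04, 0.05, 0.06, 0.03)

v       <- se.study^2
w.fixed <- 1 / v
k       <- length(delta.study)

# Cochran's Q and the DerSimonian-Laird estimate of tau-squared
mu.fixed <- sum(w.fixed * delta.study) / sum(w.fixed)
Q        <- sum(w.fixed * (delta.study - mu.fixed)^2)
C        <- sum(w.fixed) - sum(w.fixed^2) / sum(w.fixed)
tau2     <- max(0, (Q - (k - 1)) / C)

# Higgins-Thompson I-squared, as a percentage
I2 <- max(0, (Q - (k - 1)) / Q) * 100

# Random-effects pooled estimate and its standard error
w.re     <- 1 / (v + tau2)
mu.delta <- sum(w.re * delta.study) / sum(w.re)
se.delta <- sqrt(1 / sum(w.re))

# Interval: one-sided with level 1 - alpha for "less",
# two-sided with level 1 - 2 * alpha for "two.sided" (assumed reading)
alpha       <- 0.05
alternative <- "two.sided"
z           <- qnorm(1 - alpha)
ci <- if (alternative == "less") {
  c(-Inf, mu.delta + z * se.delta)
} else {
  mu.delta + c(-1, 1) * z * se.delta
}

c(mu.delta = mu.delta, se.delta = se.delta, tau2 = tau2, Q = Q, I2 = I2)
```

Both branches use the same critical value qnorm(1 - alpha); only the number of reported bounds, and hence the stated level, differs.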
Arthur Hughes
data("example.data.highdim.multistudy.ipd")
yone <- example.data.highdim.multistudy.ipd$y1
yzero <- example.data.highdim.multistudy.ipd$y0
sone <- example.data.highdim.multistudy.ipd$s1
szero <- example.data.highdim.multistudy.ipd$s0
studyone <- example.data.highdim.multistudy.ipd$study1
studyzero <- example.data.highdim.multistudy.ipd$study0
rise.meta.screen.result <- rise.screen.meta(
  yone, yzero, sone, szero, studyone, studyzero,
  epsilon.study = 0.2, epsilon.meta = 0.2
)
Calculates the rank-based test statistics for Y and for S, their difference, delta, and the corresponding standard error estimates, then tests whether the surrogate is valid.
test.surrogate(full.data = NULL, yone = NULL, yzero = NULL, sone = NULL, szero = NULL, epsilon = NULL, power.want.s = 0.7, u.y.hyp = NULL, alpha = 0.05)
full.data |
either full.data or yone, yzero, sone, szero must be supplied; if full data is supplied it must be in the following format: one observation per row, Y is in the first column, S is in the second column, treatment group (0 or 1) is in the third column. |
yone |
primary outcome, Y, in group 1 |
yzero |
primary outcome, Y, in group 0 |
sone |
surrogate marker, S, in group 1 |
szero |
surrogate marker, S, in group 0 |
epsilon |
threshold to use for delta, default calculates epsilon as a function of desired power for S |
power.want.s |
desired power for S, default is 0.7 |
u.y.hyp |
hypothesized value of u.y used in the calculation of epsilon, default uses estimated value of u.y |
alpha |
significance level, default is 0.05 |
u.y |
rank-based test statistic for Y |
u.s |
rank-based test statistic for S |
delta |
difference, u.y-u.s |
sd.u.y |
standard error estimate of u.y |
sd.u.s |
standard error estimate of u.s |
sd.delta |
standard error estimate of delta |
ci.delta |
1-sided confidence interval for delta |
epsilon.used |
the epsilon value used for the test |
is.surrogate |
logical, TRUE if test indicates S is a good surrogate, FALSE otherwise |
Layla Parast
data(example.data)
test.surrogate(yone = example.data$y1, yzero = example.data$y0, sone = example.data$s1, szero = example.data$s0)
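As a rough illustration of u.y, u.s and delta, the sketch below computes a Mann-Whitney-type statistic on the probability scale in base R. It assumes u.y estimates P(Y1 > Y0) + 0.5 P(Y1 = Y0), and likewise for u.s. The helper name rank_stat and the simulated data are hypothetical, and the package's standard error estimates are not reproduced here.

```r
# Hypothetical helper (not part of SurrogateRank): proportion of
# (group 1, group 0) pairs in which the group 1 value is larger,
# counting ties as one half.
rank_stat <- function(one, zero) {
  mean(outer(one, zero, ">")) + 0.5 * mean(outer(one, zero, "=="))
}

# Simulated data, for illustration only
set.seed(1)
yone  <- rnorm(25, mean = 1)
yzero <- rnorm(25, mean = 0)
sone  <- yone  + rnorm(25, sd = 0.5)
szero <- yzero + rnorm(25, sd = 0.5)

u.y   <- rank_stat(yone, yzero)
u.s   <- rank_stat(sone, szero)
delta <- u.y - u.s
c(u.y = u.y, u.s = u.s, delta = delta)
```

Under this definition, u.y equals the Wilcoxon rank-sum statistic from wilcox.test(yone, yzero) divided by the number of pairs, and a value of 0.5 corresponds to no treatment effect.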
This function tests for surrogacy of a univariate marker with respect to a continuous primary
response. It extends the test.surrogate() function from the SurrogateRank
package to the case where samples may be paired instead of independent, and where a two-sided
test is desired.
test.surrogate.extension(
  yone, yzero, sone, szero,
  alpha = 0.05, power.want.s = NULL, epsilon = NULL, u.y.hyp = NULL,
  alternative = "two.sided", paired = FALSE
)
yone |
numeric vector of primary response values in the treated group. |
yzero |
numeric vector of primary response values in the untreated group. |
sone |
matrix or dataframe of surrogate candidates in the treated group
with dimension |
szero |
matrix or dataframe of surrogate candidates in the untreated group
with dimension |
alpha |
significance level for determining surrogate candidates. Default is 0.05. |
power.want.s |
numeric in (0,1) - power desired for a test of treatment effect based
on the surrogate candidate. Either this or epsilon must be specified. |
epsilon |
numeric in (0,1) - non-inferiority margin for determining surrogate
validity. Either this or power.want.s must be specified. |
u.y.hyp |
hypothesised value of the treatment effect on the primary response on the probability scale. If not given, it will be estimated based on the observations. |
alternative |
character giving the alternative hypothesis type. One of "less" (non-inferiority test) or
"two.sided" (two one-sided tests). Default is "two.sided". |
paired |
logical flag indicating whether the data are paired (TRUE) or independent (FALSE).
Default is FALSE. |
A list containing:
u.y: Estimated rank-based treatment effect on the outcome.
u.s: Estimated rank-based treatment effect on the surrogate.
delta.estimate: Estimated difference in treatment effects: u.y - u.s.
sd.u.y: Estimated standard error of u.y.
sd.u.s: Estimated standard error of u.s.
sd.delta: Estimated standard error of delta.estimate.
ci.delta: One-sided confidence interval upper bound for delta.estimate.
p.delta: p-value for validity of trial-level surrogacy.
epsilon.used: Non-inferiority threshold used in the test.
is.surrogate: TRUE if the surrogate passes the test, else FALSE.
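The relationship between ci.delta, p.delta, epsilon.used and is.surrogate can be sketched as below, assuming a normal approximation for delta.estimate. The function name decide_surrogate and the input values are hypothetical, and the package's internal computation may differ. The "two.sided" branch here is a standard two one-sided tests procedure with margins -epsilon and epsilon.

```r
# Hypothetical decision rule, for illustration only
decide_surrogate <- function(delta, sd.delta, epsilon, alpha = 0.05,
                             alternative = c("two.sided", "less")) {
  alternative <- match.arg(alternative)
  z <- qnorm(1 - alpha)

  # H0: delta >= epsilon, rejected when delta is well below epsilon
  p.upper <- pnorm((delta - epsilon) / sd.delta)

  if (alternative == "less") {
    ci <- c(-Inf, delta + z * sd.delta)
    p  <- p.upper
  } else {
    # Second one-sided test, H0: delta <= -epsilon
    p.lower <- pnorm((delta + epsilon) / sd.delta, lower.tail = FALSE)
    ci <- delta + c(-1, 1) * z * sd.delta
    p  <- max(p.upper, p.lower)
  }

  list(ci.delta = ci, p.delta = p, epsilon.used = epsilon,
       is.surrogate = p < alpha)
}

# Small delta relative to the margin: passes the non-inferiority test
res.less <- decide_surrogate(0.05, 0.04, epsilon = 0.2, alternative = "less")

# Strongly negative delta: passes "less" but fails the two-sided procedure
res.neg.less <- decide_surrogate(-0.30, 0.04, epsilon = 0.2, alternative = "less")
res.neg.two  <- decide_surrogate(-0.30, 0.04, epsilon = 0.2, alternative = "two.sided")
```

In the "less" case, p.delta < alpha is equivalent to the upper confidence bound lying below epsilon, so the interval and the p-value lead to the same decision.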
Arthur Hughes, Layla Parast
# Load data
data("example.data")
yone <- example.data$y1
yzero <- example.data$y0
sone <- example.data$s1
szero <- example.data$s0
test.surrogate.extension.result <- test.surrogate.extension(
  yone, yzero, sone, szero,
  power.want.s = 0.8, paired = TRUE, alternative = "two.sided"
)
RISE (Rank-Based Identification of High-Dimensional Surrogate Markers) is a two-stage method to identify and evaluate high-dimensional surrogate candidates of a continuous response.
In the first stage (called screening), the high-dimensional candidates are screened one-by-one to identify strong candidates. Strength of surrogacy is assessed through a rank-based measure of the similarity in treatment effects on a candidate surrogate and the primary response. P-values corresponding to hypothesis testing on this measure are corrected for the high number of statistical tests performed.
In the second stage (called evaluation), candidates with an adjusted p-value below a given significance level are evaluated by combining them into a single synthetic marker. The surrogacy of this marker is then assessed with the univariate test as described before.
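To avoid overfitting, the two stages are performed on separate data: the observations are split, and the argument screen.proportion sets the proportion used for screening.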
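The screening stage described above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's implementation: the helpers rank_stat, delta_hat and boot_se are hypothetical, the standard error of delta is approximated by a bootstrap instead of the package's own estimator, the test shown is the one-sided non-inferiority version, and the data are simulated.

```r
# Hypothetical helpers, for illustration only
rank_stat <- function(one, zero) {
  mean(outer(one, zero, ">")) + 0.5 * mean(outer(one, zero, "=="))
}
delta_hat <- function(y1, y0, s1, s0) rank_stat(y1, y0) - rank_stat(s1, s0)

# Bootstrap stand-in for the standard error of delta (an assumption)
boot_se <- function(y1, y0, s1, s0, B = 200) {
  reps <- replicate(B, {
    i1 <- sample(length(y1), replace = TRUE)
    i0 <- sample(length(y0), replace = TRUE)
    delta_hat(y1[i1], y0[i0], s1[i1], s0[i0])
  })
  sd(reps)
}

# Simulated data: markers 1-2 track the response, markers 3-6 are noise
set.seed(42)
n1 <- 40; n0 <- 40; p <- 6
yone  <- rnorm(n1, mean = 1)
yzero <- rnorm(n0, mean = 0)
sone  <- cbind(yone  + matrix(rnorm(n1 * 2, sd = 0.3), n1), matrix(rnorm(n1 * 4), n1))
szero <- cbind(yzero + matrix(rnorm(n0 * 2, sd = 0.3), n0), matrix(rnorm(n0 * 4), n0))
colnames(sone) <- colnames(szero) <- paste0("marker", seq_len(p))

# Split: a proportion screen.proportion of each group is used for screening
screen.proportion <- 0.66
idx1 <- sample(n1, size = floor(screen.proportion * n1))
idx0 <- sample(n0, size = floor(screen.proportion * n0))

# Screen each marker, then correct the p-values for multiple testing
epsilon <- 0.2
alpha   <- 0.05
p.raw <- sapply(seq_len(p), function(j) {
  d  <- delta_hat(yone[idx1], yzero[idx0], sone[idx1, j], szero[idx0, j])
  se <- boot_se(yone[idx1], yzero[idx0], sone[idx1, j], szero[idx0, j])
  pnorm((d - epsilon) / se)  # one-sided non-inferiority p-value
})
names(p.raw) <- colnames(sone)
p.adjusted <- p.adjust(p.raw, method = "BH")
significant.markers <- names(p.adjusted)[p.adjusted < alpha]
```

The observations not indexed by idx1 and idx0 would then be reserved for the evaluation stage, so that the combined marker is assessed on data that played no part in selecting it.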
test.surrogate.rise(
  yone, yzero, sone, szero,
  alpha = 0.05, power.want.s = NULL, epsilon = NULL, u.y.hyp = NULL,
  p.correction = "BH", n.cores = 1, alternative = "two.sided", paired = FALSE,
  screen.proportion = 0.66, return.all.screen = TRUE, return.all.evaluate = TRUE,
  return.plot.evaluate = TRUE, evaluate.weights = TRUE, return.all.weights = FALSE,
  weight.mode = "inverse.delta", normalise.weights = TRUE
)
yone |
numeric vector of primary response values in the treated group. |
yzero |
numeric vector of primary response values in the untreated group. |
sone |
matrix or dataframe of surrogate candidates in the treated group with
dimension |
szero |
matrix or dataframe of surrogate candidates in the untreated group with
dimension |
alpha |
significance level for determining surrogate candidates. Default is 0.05. |
power.want.s |
numeric in (0,1) - power desired for a test of treatment effect based on
the surrogate candidate. Either this or epsilon must be specified. |
epsilon |
numeric in (0,1) - non-inferiority margin for determining surrogate
validity. Either this or power.want.s must be specified. |
u.y.hyp |
hypothesised value of the treatment effect on the primary response on the probability scale. If not given, it will be estimated based on the observations. |
p.correction |
character. Method for p-value adjustment (see p.adjust). Default is "BH". |
n.cores |
numeric giving the number of cores to commit to parallel computation in
order to improve computational time through the |
alternative |
character giving the alternative hypothesis type. One of "less" (non-inferiority test) or
"two.sided" (two one-sided tests). Default is "two.sided". |
paired |
logical flag indicating whether the data are paired (TRUE) or independent (FALSE).
Default is FALSE. |
screen.proportion |
numeric in (0,1) - proportion of data to be used for the screening stage.
The default is 0.66. |
return.all.screen |
logical flag. Default is TRUE. |
return.all.evaluate |
logical flag. Default is TRUE. |
return.plot.evaluate |
logical flag. Default is TRUE. |
evaluate.weights |
logical flag. Default is TRUE. |
return.all.weights |
logical flag. Default is FALSE. |
weight.mode |
character giving the type of weighting to return. Default is "inverse.delta". |
normalise.weights |
logical flag. Default is TRUE. |
a list with
screening.results: a list with
screening.metrics: dataframe of screening results (for each candidate marker - number of observations n,
u.y, u.s, delta, CI, sd, epsilon, p-values).
significant_markers: character vector of markers with p_adjusted < alpha.
evaluate.results: a list with
individual.metrics: if return.all.evaluate = TRUE, a dataframe of
evaluation results for each significant marker.
gamma.s: a list with elements gamma.s.one and gamma.s.zero, giving the
combined surrogate marker in the treated and untreated groups, respectively.
gamma.s.evaluate: a dataframe giving the evaluation of gamma.s.
gamma.s.plot: a ggplot2 plot showing gamma.s against the primary response
on the rank scale.
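One way the combined marker gamma.s might be formed from the significant markers is sketched below. It assumes weights equal to the inverse absolute value of each marker's delta, rescaled to sum to one as one reading of normalise.weights = TRUE, and applied to markers standardised with pooled means and standard deviations. The standardisation step, the delta values and the simulated data are all assumptions for illustration, not the package's documented procedure.

```r
# Hypothetical screening output: significant markers and their deltas
delta.sig <- c(marker1 = 0.02, marker2 = -0.05, marker3 = 0.10)

# Inverse absolute delta weights, rescaled to sum to one (assumed reading)
w <- 1 / abs(delta.sig)
w <- w / sum(w)

# Simulated evaluation-stage values of the three markers
set.seed(7)
sone.eval  <- matrix(rnorm(10 * 3, mean = 1), nrow = 10,
                     dimnames = list(NULL, names(delta.sig)))
szero.eval <- matrix(rnorm(12 * 3, mean = 0), nrow = 12,
                     dimnames = list(NULL, names(delta.sig)))

# Standardise each marker with pooled mean and sd (an assumption),
# then take the weighted sum within each treatment group
pooled <- rbind(sone.eval, szero.eval)
ctr    <- colMeans(pooled)
scl    <- apply(pooled, 2, sd)
gamma.s.one  <- drop(scale(sone.eval,  center = ctr, scale = scl) %*% w)
gamma.s.zero <- drop(scale(szero.eval, center = ctr, scale = scl) %*% w)
```

Markers whose treatment effect most closely matches that on the primary response (smallest absolute delta) receive the largest weight; here marker1 contributes 0.625 of the total.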
Arthur Hughes
# Load high-dimensional example data