install.packages("tidyverse")
install.packages("meta")
install.packages("metafor")
devtools::install_github("MathiasHarrer/dmetar")
# If this fails, download the repository zip from GitHub, unzip it, and install from the local folder:
devtools::install("C:/dmetar-master/dmetar-master")
B. Error
Error: (converted from warning) cannot remove prior installation of package ‘digest’
C. Workaround
Get the library location: Sys.getenv("R_LIBS_USER")
Close the R program completely.
Go to the R library folder: C:\Program Files\R\R-3.6.1\library
Delete the "digest" folder manually.
Rerun the code above in R; the error message will not appear again and the dmetar package will be installed successfully.
4. r = .27 between a binary variable and a continuous variable
d = 2r / sqrt(1 - r^2) = 2(0.27) / sqrt(1 - 0.27^2) = 0.5608 (formula from D. B. Wilson's slides)
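This conversion is easy to sanity-check in Python (an illustrative sketch; the formula itself is from Wilson's slides):

```python
import math

# Convert a point-biserial correlation r (binary vs. continuous variable)
# into a standardized mean difference d: d = 2r / sqrt(1 - r^2).
def d_from_r(r):
    return 2 * r / math.sqrt(1 - r ** 2)

print(round(d_from_r(0.27), 4))  # 0.5608
```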
5. F(3, 86) = 7.05; SDs not reported; for the meta-analysis only groups 1 and 2 are of interest.
Group / Mean test score / n
1 / 55.38 / 13
2 / 59.40 / 18
3 / 75.14 / 37
4 / 88 / 22
Standardized Mean Difference (d), F-test with 3 or more groups: d = -0.1658
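A sketch of this calculation in Python, assuming the standard approach of recovering MS-within from the omnibus F (MS-between / F) and using its square root as the pooled SD:

```python
import math

# Group summary data from the table above (SDs not reported).
means = [55.38, 59.40, 75.14, 88.0]
ns = [13, 18, 37, 22]
F, df_between = 7.05, 3

# Recover the pooled within-group SD from the omnibus F-test.
grand_mean = sum(m * n for m, n in zip(means, ns)) / sum(ns)
ss_between = sum(n * (m - grand_mean) ** 2 for m, n in zip(means, ns))
ms_between = ss_between / df_between
ms_within = ms_between / F              # since F = MS_between / MS_within
sd_pooled = math.sqrt(ms_within)

# d for the two groups of interest (groups 1 and 2).
d = (means[0] - means[1]) / sd_pooled
print(round(d, 4))  # -0.1658
```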
6. 2 x 2 table
n / group / % not improved / % improved
42 / T / 32% / 68%
29 / C / 37% / 63%
Standardized Mean Difference (d), frequency distribution (proportions): d = 0.1057
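One way to reproduce this value in Python is to treat the binary outcome as a 0/1 variable, so each group mean is its improvement proportion and each group SD is sqrt(p(1 - p)); pooling those SDs with n - 1 weights gives the reported d (a sketch, assuming that is the method behind the value above):

```python
import math

n_t, p_t = 42, 0.68   # treatment group: n and proportion improved
n_c, p_c = 29, 0.63   # control group: n and proportion improved

# Variance of a 0/1 variable with mean p is p(1 - p).
var_t = p_t * (1 - p_t)
var_c = p_c * (1 - p_c)
sd_pooled = math.sqrt(((n_t - 1) * var_t + (n_c - 1) * var_c)
                      / (n_t + n_c - 2))

d = (p_t - p_c) / sd_pooled
print(round(d, 4))  # 0.1057
```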
7. Frequency table for the T and C groups, 60 cases in each group (means and SDs not reported).
Degrees of Condition / T (n) / C (n)
0 / 15 / 20
1 / 15 / 20
2 / 15 / 10
3 / 15 / 10
Standardized Mean Difference (d), frequency distribution: d = 0.305
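The same value can be reproduced by computing each group's mean and sum of squares directly from the frequency table. Note that the reported d = 0.305 corresponds to pooling with the full-sample denominator (n rather than n - 1); with n - 1 the result is about 0.302:

```python
import math

scores = [0, 1, 2, 3]          # degrees of condition
freq_t = [15, 15, 15, 15]      # treatment group counts
freq_c = [20, 20, 10, 10]      # control group counts

def mean_and_ss(scores, freqs):
    """Mean and sum of squared deviations from a frequency table."""
    n = sum(freqs)
    mean = sum(s * f for s, f in zip(scores, freqs)) / n
    ss = sum(f * (s - mean) ** 2 for s, f in zip(scores, freqs))
    return n, mean, ss

n_t, m_t, ss_t = mean_and_ss(scores, freq_t)
n_c, m_c, ss_c = mean_and_ss(scores, freq_c)

sd_pooled = math.sqrt((ss_t + ss_c) / (n_t + n_c))   # n, not n - 1
d = (m_t - m_c) / sd_pooled
print(round(d, 3))  # 0.305
```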
8. Regression analysis with a nonequivalent comparison group design
Covariates: employment status, marital status, age, etc.
Treatment: intervention vs. probation only
Unstandardized regression coefficient: -.523
SD of the DV (severity of physical abuse): s = 9.23
Sample sizes (intervention / probation only): n1 = 125 / n2 = 254
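With an unstandardized coefficient for a dummy-coded treatment variable, d is the coefficient divided by the (unadjusted) SD of the dependent variable; a standard error can be attached with the usual approximate variance formula for d (a sketch, assuming this is the intended method):

```python
import math

B = -0.523          # unstandardized regression coefficient for treatment
s_dv = 9.23         # SD of the DV (severity of physical abuse)
n1, n2 = 125, 254   # intervention / probation-only sample sizes

d = B / s_dv
# Usual approximate variance of d for two independent groups.
var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
se_d = math.sqrt(var_d)

print(round(d, 4), round(se_d, 4))  # -0.0567 0.1093
```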
Wilson, D. B. (n.d.). Meta-analysis macros for SAS, SPSS, and Stata. Retrieved November 3, 2019, from http://mason.gmu.edu/~dwilsonb/ma.html
Group means, SD, N, estimated effect size, appropriate SD, F, t test, significance, inverse variance weight
D. Statistics
Cohen’s kappa
Cohen’s d
effect size
aggregate/weighted mean effect size
95% confidence interval: upper and lower
homogeneity of effect sizes (Q statistic): tests whether the effect sizes of the studies are significantly heterogeneous (p < .05), which means that there is more variability in the effect sizes than would be expected from sampling error and that the effect sizes do not estimate a common population mean (Lipsey & Wilson, 2001)
df: degrees of freedom
I² (%): the percentage of variability in the effect sizes that is attributable to true heterogeneity, that is, over and above sampling error.
Outlier detection
mixed-effects model (treats studies as random effects): moderator analysis for heterogeneity (allows population parameters to vary across studies, reducing the probability of committing a Type I error)
Proc GLM/ANOVA (treats studies as fixed effects): moderator analysis for heterogeneity
Region
Socioeconomic status
Geographical location
Education level
Setting
Language
sampling method
Statistical difference in the mean effect size by methodological features of the study
confidence in effect size derivation (medium, high)
reliability (not reported, reported)
validity (not reported vs. reported)
classic fail-safe N / Orwin's fail-safe N: the number of missing null studies needed to bring the current mean effect size of the meta-analysis down to .04. The threshold is 5k + 10, where k is the number of studies in the meta-analysis. If N is greater than the 5k + 10 limit, then it is unlikely that publication bias poses a significant threat to the validity of the findings of the meta-analysis.
Used to assess publication bias, e.g., control for bias in studies (tightly controlled, loosely controlled, not controlled)
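Several of the statistics listed in this section can be sketched together in Python with made-up numbers (five hypothetical (d, variance) pairs; none of these values come from a real study): inverse-variance weights, the weighted mean effect size with its 95% CI, Cochran's Q, I², and Orwin's fail-safe N against the 5k + 10 threshold.

```python
import math

# Hypothetical (effect size, variance) pairs for five studies.
studies = [(0.60, 0.020), (0.10, 0.015), (0.45, 0.030),
           (0.25, 0.010), (0.05, 0.025)]
k = len(studies)

# Inverse-variance weighted mean effect size and 95% CI.
w = [1 / v for _, v in studies]
d_bar = sum(wi * di for wi, (di, _) in zip(w, studies)) / sum(w)
se = math.sqrt(1 / sum(w))
ci_lower, ci_upper = d_bar - 1.96 * se, d_bar + 1.96 * se

# Cochran's Q: weighted squared deviations from the mean effect size.
Q = sum(wi * (di - d_bar) ** 2 for wi, (di, _) in zip(w, studies))
df = k - 1

# I^2: percent of variability beyond what sampling error would produce.
I2 = max(0.0, (Q - df) / Q) * 100

# Orwin's fail-safe N: missing null studies needed to drag the mean
# effect size down to the .04 criterion, vs. the 5k + 10 threshold.
d_criterion = 0.04
n_fs = k * (d_bar - d_criterion) / d_criterion
threshold = 5 * k + 10

print(round(d_bar, 3), (round(ci_lower, 3), round(ci_upper, 3)))
print(round(Q, 2), df, round(I2, 1))
print(round(n_fs, 1), threshold)
```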
E. Purpose/Research Questions
Is the treatment associated with a single effect or with multiple effects?
Understand the variability across studies in the association of the treatment with single or multiple effects, and explain the variable effects through study features (moderators). How do the effects of the treatment vary across different study features?
F. Reference
Adesope et al., 2010: A Systematic Review and Meta-Analysis of the Cognitive Correlates of Bilingualism
SUGI 27 paper (Hamer and Simpson, 2002): SAS Tools for Meta-Analysis* *The methodology in this paper is fine, but the example was not interpreted correctly. I have corrected the example in this post.
A meta-analysis is a statistical analysis that combines the results of multiple scientific studies. Meta-analysis can be performed when there are multiple scientific studies addressing the same question, with each individual study reporting measurements that are expected to have some degree of error. The aim then is to use approaches from statistics to derive a pooled estimate closest to the unknown common truth based on how this error is perceived.
Researchers collect data for a meta-analysis by systematically reviewing the literature in the field and compiling data directly from the summary statistics in the publications.
C. Problem with simply lumping the data from different studies together
Does not consider the treatment-by-study interaction.
Assumes the response rates are the same in all studies.
D. SAS Solution (following Hamer and Simpson's paper, with the paper's output corrected)
Create data set with the results of 2 studies. B: Remitted; N:Not remitted; P: Placebo; D: Drug.
I have used B (Better) to indicate remitted cases because the Proc Freq tests are based on column 1 and row 1 of the 2 x 2 table; if we coded remitted cases as R, they would fall in column 2, since the table is ordered alphabetically and R comes after N.
The Hamer and Simpson paper therefore actually tested the null hypothesis for the non-effective cases rather than the effective cases.
data chm;
input study $ response $ trt $ cellfreq @@;
datalines;
study1 B P 24 study1 N P 3
study1 B D 58 study1 N D 30
study2 B P 16 study2 N P 57
study2 B D 2 study2 N D 10
;
run;
Run the Cochran-Mantel-Haenszel statistics using the Proc Freq procedure with the cmh option.
The Mantel-Haenszel estimator of the common odds ratio assumes that the odds ratio is homogeneous across the studies.
The Mantel-Haenszel statistic tests the null hypothesis that the response rate is the same for the two treatments, after adjusting for possible differences in study response rates.
For the Proc Freq testing options, make sure the group you want to test is in row 1 and column 1. It is also important to crosstab treatment as the row and response as the column, so that the interpretation of the relative risk of improvement makes sense. In Hamer and Simpson's paper the crosstab has been transposed, so the relative risk output does not make sense.
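As a cross-check of the Proc Freq output, the CMH statistic for a series of 2 x 2 tables (1 df, no continuity correction) can be computed by hand; this is an illustrative Python re-implementation, not part of the SAS workflow:

```python
import math

# Each stratum is [[a, b], [c, d]] with rows Drug, Placebo and
# columns Better (remitted), Not remitted.
strata = [
    [[58, 30], [24, 3]],    # study1
    [[2, 10], [16, 57]],    # study2
]

diff_sum, var_sum = 0.0, 0.0
for (a, b), (c, d) in strata:
    n = a + b + c + d
    r1, r2 = a + b, c + d            # row totals
    c1, c2 = a + c, b + d            # column totals
    diff_sum += a - r1 * c1 / n      # observed minus expected for cell a
    var_sum += r1 * r2 * c1 * c2 / (n ** 2 * (n - 1))

cmh = diff_sum ** 2 / var_sum        # chi-square with 1 df
p = math.erfc(math.sqrt(cmh / 2))    # 1-df chi-square p-value
print(round(cmh, 2), round(p, 3))  # 4.66 0.031
```

This agrees with the 4.65 / p = 0.03 reported in the Proc Freq output up to rounding.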
G. Interpretation
The CMH test statistic is 4.65 with a p-value of 0.03; therefore, we can reject the null hypothesis that there is no association between treatment and response. A p-value below 0.05 indicates that the association between treatment and response remains after adjusting for study.
The Relative Risk (Column 1) equals 0.74, which means the probability of improvement with the drug is 0.74 times the probability of improvement with the placebo.
The Relative Risk (Column 2) equals 1.51, which means the probability of no improvement in the symptoms with the drug is 1.51 times the probability of no improvement with the placebo.
The Breslow-Day test has a large p-value of 0.295, which indicates that there is no significant difference in the odds ratios across the studies.
* I will show the odds ratio and relative risk calculation in Excel in another post.
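In the meantime, the two relative-risk figures above can be verified with the Mantel-Haenszel common relative risk across the strata; a Python sketch (not the Excel version promised above):

```python
# Each stratum is [[a, b], [c, d]] with rows Drug, Placebo and
# columns Better (remitted), Not remitted.
strata = [
    [[58, 30], [24, 3]],    # study1
    [[2, 10], [16, 57]],    # study2
]

def mh_rr(strata, col):
    """Mantel-Haenszel common relative risk, Drug vs. Placebo,
    for the outcome in the given column (0 = Better, 1 = Not)."""
    num = den = 0.0
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        drug_count = (a, b)[col]
        placebo_count = (c, d)[col]
        num += drug_count * (c + d) / n     # drug events * placebo row total
        den += placebo_count * (a + b) / n  # placebo events * drug row total
    return num / den

print(round(mh_rr(strata, 0), 2))  # 0.74 (Column 1: remission)
print(round(mh_rr(strata, 1), 2))  # 1.51 (Column 2: no remission)
```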