tallieR provides two measures of internal consistency: Cronbach’s alpha and McDonald’s omega. This vignette explains what each measures, when to prefer one over the other, and how to use both functions.

Why internal consistency matters

When questionnaire items are summed or averaged into a scale score, internal consistency tells you how well those items measure the same underlying construct. Low consistency suggests the items may not belong together; high consistency is a prerequisite for treating the total score as meaningful.

Cronbach’s alpha

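Cronbach’s alpha assumes that all items load equally on the underlying factor (tau-equivalence) and that their measurement errors are uncorrelated. Under those assumptions, alpha equals the reliability of the scale score, and it can be read as the expected correlation between the current scale and any other scale of the same length drawn from the same item pool.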
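To make the quantity concrete, here is a minimal base-R sketch of the textbook alpha formula: the number of items k, the sum of the item variances, and the variance of the sum score. The helper name `alpha_from_items()` and the toy data are illustrative assumptions; this is not necessarily how `cronbach_alpha()` is implemented internally.

```r
# Cronbach's alpha from a numeric matrix (rows = respondents, columns = items).
# Hypothetical helper for illustration only.
alpha_from_items <- function(x) {
  x <- as.matrix(x)
  x <- x[stats::complete.cases(x), , drop = FALSE]  # keep complete rows only
  k <- ncol(x)
  item_vars <- apply(x, 2, stats::var)   # variance of each item
  total_var <- stats::var(rowSums(x))    # variance of the sum score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Three identical items are perfectly consistent, so alpha is 1.
x <- c(1, 2, 3, 4, 5)
alpha_identical <- alpha_from_items(cbind(x, x, x))

# Made-up responses: four respondents, three items.
toy <- cbind(a = c(1, 2, 3, 4), b = c(2, 2, 4, 4), c = c(1, 3, 3, 5))
alpha_toy <- alpha_from_items(toy)  # 14/15, about 0.933
```

The same calculation is what the package functions below report per questionnaire.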

library(tallieR)

study <- read_scoreme_dir("exports/")

# All questionnaires in the study
cronbach_alpha(study)

# Specific subset
cronbach_alpha(study, questionnaires = c("ess", "isi", "phq9"))

The output is a data frame with one row per questionnaire:

| Column | Description |
|---|---|
| questionnaire_id | Questionnaire identifier |
| alpha | Cronbach’s alpha |
| ci_lower / ci_upper | Exact 95% CI (Feldt et al., 1987) |
| n_items | Number of numeric items used |
| n_obs | Number of complete observations |
| note | NA on success, or reason for failure |

The confidence interval uses the exact F-distribution method of Feldt et al. (1987) rather than a bootstrap approximation. For a given alpha and number of items, a wider interval reflects fewer participants, not a worse instrument.
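As a rough illustration of an F-based interval, the sketch below uses the classical result that (1 - alpha) / (1 - alpha_hat) follows an F distribution with n - 1 and (n - 1)(k - 1) degrees of freedom. The function name `feldt_ci()` is hypothetical, and whether tallieR computes `ci_lower` and `ci_upper` in exactly this way is an assumption.

```r
# F-based confidence interval for coefficient alpha (illustrative sketch).
# alpha:   sample estimate of alpha
# n_obs:   number of complete observations
# n_items: number of items
feldt_ci <- function(alpha, n_obs, n_items, level = 0.95) {
  a   <- 1 - level
  df1 <- n_obs - 1
  df2 <- (n_obs - 1) * (n_items - 1)
  c(
    lower = 1 - (1 - alpha) * stats::qf(1 - a / 2, df1, df2),
    upper = 1 - (1 - alpha) * stats::qf(a / 2, df1, df2)
  )
}

# Same alpha and item count, different sample sizes.
ci_small <- feldt_ci(alpha = 0.80, n_obs = 30,  n_items = 8)
ci_large <- feldt_ci(alpha = 0.80, n_obs = 300, n_items = 8)
```

With the estimate and the number of items held fixed, the interval for 300 participants is narrower than the one for 30.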

Interpreting alpha

Commonly cited thresholds (the 0.70 benchmark is usually attributed to Nunnally, 1978):

| Alpha | Interpretation |
|---|---|
| < 0.60 | Poor |
| 0.60 to < 0.70 | Questionable |
| 0.70 to < 0.80 | Acceptable |
| 0.80 to < 0.90 | Good |
| >= 0.90 | Excellent (may indicate item redundancy) |

These are rules of thumb, not hard cutoffs. Context matters: a screener with 3 items and alpha = 0.72 may be perfectly adequate for its purpose.
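If you want these labels next to the estimates, a small helper built on `cut()` is enough. The function name `label_alpha()` is hypothetical, and treating each lower bound as inclusive is an assumption suggested by the `>= 0.90` row.

```r
# Map alpha values to the rule-of-thumb labels from the table above.
label_alpha <- function(alpha) {
  cut(
    alpha,
    breaks = c(-Inf, 0.60, 0.70, 0.80, 0.90, Inf),
    labels = c("Poor", "Questionable", "Acceptable", "Good", "Excellent"),
    right  = FALSE  # intervals are [lower, upper), so 0.70 counts as Acceptable
  )
}

labels <- label_alpha(c(0.55, 0.60, 0.72, 0.90))
as.character(labels)
# "Poor" "Questionable" "Acceptable" "Excellent"
```

Missing alpha values (failed estimates) stay `NA` in the result.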

McDonald’s omega

Omega relaxes the tau-equivalence assumption. It uses the factor loadings from a single-factor exploratory factor analysis (EFA) to estimate the proportion of scale variance attributable to the common factor:

$$\omega_t = \frac{\left(\sum \lambda_i\right)^2}{\left(\sum \lambda_i\right)^2 + \sum\left(1 - \lambda_i^2\right)}$$

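Here $\lambda_i$ is the standardized loading of item $i$, so $1 - \lambda_i^2$ is that item’s unique variance. When items have unequal loadings (the norm in psychological questionnaires), omega is a less biased estimate of reliability than alpha. Alpha systematically underestimates reliability for congeneric scales, and it can overestimate reliability when item errors are correlated, that is, when items correlate for reasons unrelated to the construct (for example, near-identical wording).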
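The formula can be checked numerically. The sketch below computes omega from a vector of standardized loadings and, for comparison, the standardized alpha implied by the same single-factor model. Both helper names and the loading values are made up for illustration.

```r
# Omega from standardized single-factor loadings (formula above).
omega_from_loadings <- function(lambda) {
  common_var <- sum(lambda)^2        # variance due to the common factor
  unique_var <- sum(1 - lambda^2)    # summed unique variances
  common_var / (common_var + unique_var)
}

# Standardized alpha implied by the same model.
alpha_from_loadings <- function(lambda) {
  k <- length(lambda)
  r <- outer(lambda, lambda)  # model-implied correlations between items
  diag(r) <- 1
  (k / (k - 1)) * (1 - k / sum(r))
}

congeneric <- c(0.8, 0.7, 0.6, 0.5)      # unequal loadings
omega_cong <- omega_from_loadings(congeneric)  # about 0.749
alpha_cong <- alpha_from_loadings(congeneric)  # about 0.742

tau_equiv <- rep(0.65, 4)                # equal loadings
omega_tau <- omega_from_loadings(tau_equiv)
alpha_tau <- alpha_from_loadings(tau_equiv)    # equals omega_tau
```

With unequal loadings alpha falls slightly below omega; with equal loadings the two coincide.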

omega_reliability(study)
omega_reliability(study, questionnaires = c("ess", "isi"))

Output columns: questionnaire_id, omega, n_items, n_obs, note.
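For intuition about the estimation step, the following sketch simulates congeneric items, fits a single-factor model with base R’s `stats::factanal()`, and plugs the estimated loadings into the omega formula. The simulated loadings, sample size, and variable names are assumptions for illustration; `omega_reliability()` may use a different estimator internally.

```r
set.seed(1)

# Simulate 2000 respondents on four standardized items driven by one factor.
n            <- 2000
true_lambda  <- c(0.8, 0.7, 0.6, 0.5)
factor_score <- rnorm(n)
sim_items <- sapply(
  true_lambda,
  function(l) l * factor_score + sqrt(1 - l^2) * rnorm(n)
)
colnames(sim_items) <- paste0("item", seq_along(true_lambda))

# Single-factor maximum-likelihood factor analysis.
fit <- stats::factanal(sim_items, factors = 1)
est_lambda <- unclass(fit$loadings)[, 1]

# Omega from the estimated loadings; the squared sum makes the result
# insensitive to a global sign flip of the factor.
omega_hat <- sum(est_lambda)^2 /
  (sum(est_lambda)^2 + sum(1 - est_lambda^2))
omega_hat
```

The estimate should land close to the population value implied by the simulated loadings (about 0.749).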

When to use which

| Situation | Recommendation |
|---|---|
| Tau-equivalent items (equal loadings assumed) | Either; alpha is conventional |
| Congeneric items (unequal loadings, typical) | Prefer omega |
| Comparing against published norms that report alpha | Report both; flag the difference |
| Small sample (< 30) | Alpha with exact CI; omega may not converge |
| Reporting for publication | Report both with sample size and n items |

Comparing alpha and omega side by side

alpha_res <- cronbach_alpha(study, questionnaires = c("ess", "isi", "phq9"))
omega_res  <- omega_reliability(study, questionnaires = c("ess", "isi", "phq9"))

merge(
  alpha_res[, c("questionnaire_id", "alpha", "ci_lower", "ci_upper", "n_obs")],
  omega_res[, c("questionnaire_id", "omega")],
  by = "questionnaire_id"
)
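To flag the difference between the two coefficients, you can add a column to the merged table. The sketch below uses stand-in data frames with made-up values (not real results) and only the column names documented above. Note that `merge()` keeps only matching rows by default, so `all = TRUE` is used here in case the two calls covered different questionnaires.

```r
# Stand-in results with invented numbers, for illustration only.
alpha_toy_res <- data.frame(
  questionnaire_id = c("ess", "isi", "phq9"),
  alpha = c(0.78, 0.85, 0.88),
  stringsAsFactors = FALSE
)
omega_toy_res <- data.frame(
  questionnaire_id = c("ess", "isi"),
  omega = c(0.81, 0.86),
  stringsAsFactors = FALSE
)

# all = TRUE keeps questionnaires that appear in only one of the two tables.
both <- merge(alpha_toy_res, omega_toy_res, by = "questionnaire_id", all = TRUE)

# Positive values mean omega exceeds alpha.
both$omega_minus_alpha <- both$omega - both$alpha
both
```

Rows with a missing coefficient get `NA` in the difference column, which makes gaps easy to spot.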

Using an items_long() data frame directly

Both functions accept either a study object or a data frame produced by items_long(). This is useful when you want to filter to a specific group or time point before computing reliability:

items <- items_long(study)

# Only control group
control_items <- items[items$group == "control", ]
cronbach_alpha(control_items)

# Only baseline session
baseline_items <- items[items$session == "baseline", ]
omega_reliability(baseline_items)
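One base-R detail is worth keeping in mind when filtering this way: a logical index containing `NA` produces all-`NA` rows. The toy data frame below is hypothetical (only the `group` column name comes from the examples above) and shows how wrapping the condition in `which()` avoids that.

```r
# Hypothetical long-format rows; participant p2 has no group recorded.
toy_items <- data.frame(
  participant = c("p1", "p2", "p3", "p4"),
  group       = c("control", NA, "treatment", "control"),
  value       = c(3, 2, 4, 1),
  stringsAsFactors = FALSE
)

# Plain logical indexing keeps an extra all-NA row for the missing group.
naive <- toy_items[toy_items$group == "control", ]
nrow(naive)  # 3

# which() drops NA comparisons, leaving only true matches.
safe <- toy_items[which(toy_items$group == "control"), ]
nrow(safe)   # 2
```

If your grouping columns can contain missing values, the `which()` form is the safer filter before passing the data on.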

Handling non-numeric items

Some instruments include items that cannot be coerced to numeric, such as MCTQ clock times or STOP-BANG yes/no responses. These items are dropped silently before estimation. The n_items column in the output reports how many numeric items were actually used, so compare it with the number of items you expected in order to detect unintended drops.
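To see in advance which items would fail numeric coercion, a check along these lines can help. The helper `can_be_numeric()` and the column names are hypothetical, and the exact rule tallieR applies when dropping items is an assumption.

```r
# TRUE if every non-missing response converts cleanly to a number.
can_be_numeric <- function(x) {
  converted <- suppressWarnings(as.numeric(as.character(x)))
  all(is.na(converted) == is.na(x))  # coercion introduced no new NAs
}

# Made-up responses: one numeric item, one clock time, one yes/no item.
responses <- data.frame(
  phq9_1         = c("0", "2", "1"),
  mctq_bedtime   = c("23:30", "22:45", "00:15"),
  stopbang_snore = c("yes", "no", "yes"),
  stringsAsFactors = FALSE
)

numeric_ok <- vapply(responses, can_be_numeric, logical(1))
numeric_ok
```

Only `phq9_1` passes here; the clock time and yes/no columns would need recoding before they could contribute to a reliability estimate.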

Failure modes

Questionnaires that cannot be estimated return NA with an explanatory note:

| Situation | Note |
|---|---|
| Fewer than 2 numeric items | “Need at least 2 numeric items.” |
| Fewer than 2 complete observations | “Need at least 2 complete observations.” |
| Zero variance in row totals (alpha) | “Zero variance in row totals.” |
| More items than observations (omega) | “More items than observations; covariance matrix is singular.” |
| EFA non-convergence (omega) | “Factor analysis did not converge.” |
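Because failures are reported in the note column rather than raised as errors, it is worth listing them explicitly. The sketch below uses a stand-in result with made-up values and the documented column names.

```r
# Stand-in result: one successful estimate and one failure (invented values).
toy_res <- data.frame(
  questionnaire_id = c("ess", "mctq"),
  alpha            = c(0.80, NA),
  note             = c(NA, "Need at least 2 numeric items."),
  stringsAsFactors = FALSE
)

# Questionnaires that could not be estimated, with the reason.
failed <- toy_res[!is.na(toy_res$note), c("questionnaire_id", "note")]
failed
```

An empty `failed` data frame means every requested questionnaire was estimated.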

References

Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.

Feldt, L. S., Woodruff, D. J., & Salih, F. A. (1987). Statistical inference for coefficient alpha. Applied Psychological Measurement, 11(1), 93–103.

McDonald, R. P. (1999). Test theory: A unified treatment. Lawrence Erlbaum Associates.

Nunnally, J. C. (1978). Psychometric theory (2nd ed.). McGraw-Hill.

Revelle, W., & Zinbarg, R. E. (2009). Coefficients alpha, beta, omega, and the glb: Comments on Sijtsma. Psychometrika, 74(1), 145–154.