tallieR provides two measures of internal consistency: Cronbach’s alpha and McDonald’s omega. This vignette explains what each measures, when to prefer one over the other, and how to use both functions.
Why internal consistency matters
When questionnaire items are summed or averaged into a scale score, internal consistency tells you how well those items measure the same underlying construct. Low consistency suggests the items may not belong together; high consistency is a prerequisite for treating the total score as meaningful.
Cronbach’s alpha
Cronbach’s alpha assumes that all items have equal factor loadings (tau-equivalence). Under that assumption, alpha is the expected correlation between the current scale and any other scale of the same length drawn from the same item pool.
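For reference, the standard definition of coefficient alpha for a scale of $k$ items can be written as below. The notation is introduced here for illustration and is not taken from tallieR's own documentation: $\sigma^{2}_{i}$ is the variance of item $i$ and $\sigma^{2}_{X}$ is the variance of the total score.

```latex
\alpha = \frac{k}{k - 1}\left(1 - \frac{\sum_{i=1}^{k} \sigma^{2}_{i}}{\sigma^{2}_{X}}\right)
```

Alpha grows as the items share more variance, because the total-score variance then exceeds the sum of the item variances by a wider margin.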
library(tallieR)
study <- read_scoreme_dir("exports/")
# All questionnaires in the study
cronbach_alpha(study)
# Specific subset
cronbach_alpha(study, questionnaires = c("ess", "isi", "phq9"))

The output is a data frame with one row per questionnaire:
| Column | Description |
|---|---|
| questionnaire_id | Questionnaire identifier |
| alpha | Cronbach’s alpha |
| ci_lower / ci_upper | Exact 95% CI (Feldt et al., 1987) |
| n_items | Number of numeric items used |
| n_obs | Number of complete observations |
| note | NA on success, or reason for failure |
The confidence interval uses the exact F-distribution method of Feldt et al. (1987) rather than a bootstrap approximation. For a given alpha and number of items, a wider interval reflects fewer participants, not a worse instrument.
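To make the F-distribution interval concrete, here is a minimal base R sketch. It is not tallieR's implementation: the helper name `alpha_feldt_ci()` and the toy data are hypothetical, and the sketch assumes the usual result that the ratio of one minus the true alpha to one minus the estimated alpha follows an F distribution with n - 1 and (n - 1)(k - 1) degrees of freedom.

```r
# Hypothetical helper, not part of tallieR: coefficient alpha with an
# F-distribution interval in the style of Feldt et al. (1987).
alpha_feldt_ci <- function(items, conf_level = 0.95) {
  items <- as.matrix(items[stats::complete.cases(items), , drop = FALSE])
  n <- nrow(items)  # complete observations
  k <- ncol(items)  # items

  item_vars <- apply(items, 2, stats::var)
  total_var <- stats::var(rowSums(items))
  alpha <- (k / (k - 1)) * (1 - sum(item_vars) / total_var)

  # Assumed sampling result: (1 - alpha_true) / (1 - alpha_hat)
  # follows F(n - 1, (n - 1) * (k - 1)).
  tail_prob <- (1 - conf_level) / 2
  df1 <- n - 1
  df2 <- (n - 1) * (k - 1)
  c(
    alpha    = alpha,
    ci_lower = 1 - (1 - alpha) * stats::qf(1 - tail_prob, df1, df2),
    ci_upper = 1 - (1 - alpha) * stats::qf(tail_prob, df1, df2)
  )
}

# Toy data (made up): 5 respondents, 3 items
toy <- cbind(
  q1 = c(1, 2, 3, 4, 5),
  q2 = c(2, 2, 3, 5, 5),
  q3 = c(1, 3, 3, 4, 4)
)
res <- alpha_feldt_ci(toy)
round(res, 3)
```

With only five respondents the interval is very wide. Rerunning the sketch on a larger sample with similar item correlations narrows it.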
Interpreting alpha
Conventional thresholds (Nunnally, 1978):
| Alpha | Interpretation |
|---|---|
| < 0.60 | Poor |
| 0.60 – 0.70 | Questionable |
| 0.70 – 0.80 | Acceptable |
| 0.80 – 0.90 | Good |
| >= 0.90 | Excellent (may indicate item redundancy) |
These are rules of thumb, not hard cutoffs. Context matters: a screener with 3 items and alpha = 0.72 may be perfectly adequate for its purpose.
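If you want to attach these labels to a vector of estimates, one way is `cut()`. The helper below is a hypothetical sketch, not a tallieR function, and it assumes each band includes its lower bound (so 0.70 counts as "Acceptable"), which the table above leaves open.

```r
# Hypothetical helper, not part of tallieR. Lower bounds are treated as
# inclusive (e.g. 0.70 counts as "Acceptable"); that choice is an assumption.
alpha_label <- function(alpha) {
  cut(
    alpha,
    breaks = c(-Inf, 0.60, 0.70, 0.80, 0.90, Inf),
    labels = c("Poor", "Questionable", "Acceptable", "Good", "Excellent"),
    right = FALSE  # intervals are closed on the left: [0.70, 0.80)
  )
}

lab <- alpha_label(c(0.55, 0.72, 0.90))
as.character(lab)
```

The labels are a reporting convenience only; they should not replace a judgment about the instrument's purpose and length.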
McDonald’s omega
Omega relaxes the tau-equivalence assumption. It uses the factor loadings from a single-factor exploratory factor analysis (EFA) to estimate the proportion of scale variance attributable to the common factor.
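For a single-factor model with standardized loadings $\lambda_{i}$ on $k$ items, this proportion is commonly written as follows. The notation is introduced here for illustration, and tallieR's exact parameterization may differ.

```latex
\omega = \frac{\left(\sum_{i=1}^{k} \lambda_{i}\right)^{2}}
              {\left(\sum_{i=1}^{k} \lambda_{i}\right)^{2} + \sum_{i=1}^{k} \left(1 - \lambda_{i}^{2}\right)}
```

The numerator is the variance the common factor contributes to the sum score, and the second term in the denominator is the summed unique variance of the items.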
When items have unequal loadings (the norm in psychological questionnaires), omega is a less biased estimate of reliability than alpha. With uncorrelated errors, alpha underestimates reliability for congeneric scales; it can overestimate reliability when item errors are correlated, that is, when items covary for reasons unrelated to the construct.
omega_reliability(study)
omega_reliability(study, questionnaires = c("ess", "isi"))

Output columns: questionnaire_id, omega, n_items, n_obs, note.
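As a rough illustration of the computation behind a single-factor omega, the sketch below fits a one-factor model with `stats::factanal()` on simulated data. The helper `omega_one_factor()` is hypothetical, and it is an assumption that tallieR does something comparable internally; its estimator, rotation, and handling of non-convergence may differ.

```r
# Hypothetical helper, not part of tallieR: omega from a one-factor
# maximum-likelihood factor analysis fitted with stats::factanal().
omega_one_factor <- function(items) {
  items <- items[stats::complete.cases(items), , drop = FALSE]
  fit <- stats::factanal(items, factors = 1)
  lambda <- unclass(fit$loadings)[, 1]  # standardized loadings
  common <- sum(lambda)^2               # variance due to the common factor
  common / (common + sum(fit$uniquenesses))
}

# Simulated congeneric items: one latent factor, unequal loadings
set.seed(42)
n <- 300
factor_score <- rnorm(n)
true_loadings <- c(0.8, 0.7, 0.6, 0.5)
sim <- sapply(true_loadings, function(l) {
  l * factor_score + sqrt(1 - l^2) * rnorm(n)
})
colnames(sim) <- paste0("q", seq_along(true_loadings))

omega_hat <- omega_one_factor(sim)
omega_hat
```

`factanal()` needs at least three items to fit one factor, so a sketch like this cannot be applied to two-item scales.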
When to use which
| Situation | Recommendation |
|---|---|
| Tau-equivalent items (equal loadings assumed) | Either; alpha is conventional |
| Congeneric items (unequal loadings, typical) | Prefer omega |
| Comparing against published norms that report alpha | Report both; flag the difference |
| Small sample (< 30) | Alpha with exact CI; omega may not converge |
| Reporting for publication | Report both, with sample size and number of items |
Comparing alpha and omega side by side
alpha_res <- cronbach_alpha(study, questionnaires = c("ess", "isi", "phq9"))
omega_res <- omega_reliability(study, questionnaires = c("ess", "isi", "phq9"))
merge(
  alpha_res[, c("questionnaire_id", "alpha", "ci_lower", "ci_upper", "n_obs")],
  omega_res[, c("questionnaire_id", "omega")],
  by = "questionnaire_id"
)

Using an items_long() data frame directly
Both functions accept either a study object or a data frame produced
by items_long(). This is useful when you want to filter to
a specific group or time point before computing reliability:
items <- items_long(study)
# Only control group
control_items <- items[items$group == "control", ]
cronbach_alpha(control_items)
# Only baseline session
baseline_items <- items[items$session == "baseline", ]
omega_reliability(baseline_items)

Handling non-numeric items
Some instruments include items that cannot be coerced to numeric, such as
MCTQ clock times or STOP-BANG yes/no responses. These are silently dropped
before estimation. The n_items column in the output tells
you how many numeric items were actually used, so you can detect whether
unexpected items were dropped.
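One way to act on n_items is to compare it with the number of items you expected to be scored. The sketch below uses a mock data frame with the documented column names rather than real tallieR output, and the `expected_items` lookup is a hypothetical table you would maintain yourself.

```r
# Mock result with the documented columns; the values are made up.
alpha_res <- data.frame(
  questionnaire_id = c("ess", "isi", "stopbang"),
  n_items = c(8L, 7L, 4L),
  stringsAsFactors = FALSE
)

# Item counts you expect to be scored (an assumption for this example)
expected_items <- c(ess = 8L, isi = 7L, stopbang = 8L)

alpha_res$n_expected <- unname(expected_items[alpha_res$questionnaire_id])
alpha_res$n_dropped <- alpha_res$n_expected - alpha_res$n_items

# Questionnaires where some items were dropped before estimation
dropped <- alpha_res[alpha_res$n_dropped > 0, ]
dropped
```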
Failure modes
Questionnaires that cannot be estimated return NA with
an explanatory note:
| Situation | Note |
|---|---|
| Fewer than 2 numeric items | “Need at least 2 numeric items.” |
| Fewer than 2 complete observations | “Need at least 2 complete observations.” |
| Zero variance in row totals (alpha) | “Zero variance in row totals.” |
| More items than observations (omega) | “More items than observations; covariance matrix is singular.” |
| EFA non-convergence (omega) | “Factor analysis did not converge.” |
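Because failures are reported in the note column rather than raised as errors, it can help to separate failed rows before reporting. The sketch below does this on a mock data frame that mimics the documented columns; the values are invented for illustration.

```r
# Mock result with the documented columns; the values are made up.
omega_res <- data.frame(
  questionnaire_id = c("ess", "isi", "phq9"),
  omega = c(0.84, NA, 0.88),
  note = c(NA, "Factor analysis did not converge.", NA),
  stringsAsFactors = FALSE
)

# Rows with a note could not be estimated; the rest succeeded
failed <- omega_res[!is.na(omega_res$note), c("questionnaire_id", "note")]
ok <- omega_res[is.na(omega_res$note), ]

if (nrow(failed) > 0) {
  message(
    "Not estimated: ",
    paste0(failed$questionnaire_id, " (", failed$note, ")", collapse = "; ")
  )
}
```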
References
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.
Feldt, L. S., Woodruff, D. J., & Salih, F. A. (1987). Statistical inference for coefficient alpha. Applied Psychological Measurement, 11(1), 93–103.
McDonald, R. P. (1999). Test theory: A unified treatment. Lawrence Erlbaum Associates.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). McGraw-Hill.
Revelle, W., & Zinbarg, R. E. (2009). Coefficients alpha, beta, omega, and the glb: Comments on Sijtsma. Psychometrika, 74(1), 145–154.