Classification of psychopathology
Traditional taxonomies: categorical diagnostic systems
(e.g., DSM, ICD)
Recent models: dimensional and integrative approaches
(e.g., Kotov et al., 2017; Borsboom & Cramer, 2013)
How can we identify latent factors?
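\(M \in \mathbb{R}_{\ge 0}^{K \times G}\) is the full observed data matrix (\(K\) test items × \(G\) subjects)
NMF decomposes \(M\) into two lower-rank nonnegative matrices, \(P \in \mathbb{R}_{\ge 0}^{K \times N}\) (item-by-factor) and \(E \in \mathbb{R}_{\ge 0}^{N \times G}\) (factor-by-subject), with \(N < \min(K, G)\):
\[ M_{kg} \approx \sum_{n=1}^{N} P_{kn} E_{ng} \]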
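To make the decomposition concrete, here is a minimal sketch in base R. It uses the classic multiplicative-update rules for a squared-error loss, which is an assumption made for illustration and not the Bayesian sampler implemented in bayesNMF. The function name `nmf_mu` and the toy data are hypothetical.

```r
# Minimal NMF via multiplicative updates (squared-error loss).
# Illustrative only: this is NOT the algorithm used by bayesNMF.
nmf_mu <- function(M, N, n_iter = 500, eps = 1e-9, seed = 1) {
  set.seed(seed)
  K <- nrow(M)
  G <- ncol(M)
  P <- matrix(runif(K * N), K, N)  # item-by-factor matrix (K x N)
  E <- matrix(runif(N * G), N, G)  # factor-by-subject matrix (N x G)
  for (i in seq_len(n_iter)) {
    # Each update keeps entries nonnegative because it only multiplies
    # by ratios of nonnegative quantities.
    E <- E * (t(P) %*% M) / (t(P) %*% P %*% E + eps)
    P <- P * (M %*% t(E)) / (P %*% E %*% t(E) + eps)
  }
  list(P = P, E = E)
}

# Toy data (hypothetical): 6 items x 40 subjects built from 2 nonnegative factors
set.seed(42)
P_true <- rbind(matrix(c(1, 0), 3, 2, byrow = TRUE),
                matrix(c(0, 1), 3, 2, byrow = TRUE))
E_true <- matrix(runif(2 * 40, 0, 4), 2, 40)
M <- P_true %*% E_true

fit_nmf <- nmf_mu(M, N = 2)

# Relative reconstruction error of M by the product P E
rel_err <- norm(M - fit_nmf$P %*% fit_nmf$E, "F") / norm(M, "F")
```

The columns of `fit_nmf$P` are only identified up to permutation and scaling, so they should be matched to the generating factors before any interpretation.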
Dataset composed of ordinal variables (e.g., items measured on a Likert scale)
Estimate latent dimensions using both Factor Analysis and Non-negative Matrix Factorization (NMF)
Compare the two methods by examining the correlation matrix of factor loadings
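The three steps above can be sketched end to end on simulated data. Everything here is an assumption made for illustration: the sample sizes, the generating loadings, the use of base R `factanal` (which treats the Likert responses as continuous) and a simple multiplicative-update NMF in place of bayesNMF.

```r
set.seed(123)
G <- 300  # number of subjects (hypothetical)
K <- 6    # number of items (hypothetical)

# Two latent traits, three items loading on each
theta  <- matrix(rnorm(G * 2), G, 2)
L_true <- rbind(cbind(rep(0.8, 3), 0), cbind(0, rep(0.8, 3)))
y <- theta %*% t(L_true) + matrix(rnorm(G * K, sd = 0.6), G, K)

# Discretise the continuous responses into a 5-point Likert scale (1..5)
likert <- apply(y, 2, function(v) {
  as.numeric(cut(v, breaks = c(-Inf, -1.5, -0.5, 0.5, 1.5, Inf)))
})
colnames(likert) <- paste0("item_", 1:K)

# Step 1: exploratory factor analysis (maximum likelihood, varimax rotation)
efa   <- factanal(likert, factors = 2, rotation = "varimax")
L_efa <- unclass(efa$loadings)  # K x 2 item-by-factor loadings
colnames(L_efa) <- c("EFA1", "EFA2")

# Step 2: NMF on the item-by-subject matrix (multiplicative updates)
nmf_mu <- function(M, N, n_iter = 500, eps = 1e-9) {
  P <- matrix(runif(nrow(M) * N), nrow(M), N)
  E <- matrix(runif(N * ncol(M)), N, ncol(M))
  for (i in seq_len(n_iter)) {
    E <- E * (t(P) %*% M) / (t(P) %*% P %*% E + eps)
    P <- P * (M %*% t(E)) / (P %*% E %*% t(E) + eps)
  }
  list(P = P, E = E)
}
M <- t(likert)  # items in rows, subjects in columns
P <- nmf_mu(M, N = 2)$P
colnames(P) <- c("NMF1", "NMF2")

# Step 3: correlate the EFA loadings with the NMF item weights across items
cross_cor <- cor(L_efa, P)
round(cross_cor, 2)
```

Because factor order and sign (EFA) or scale (NMF) are arbitrary, the pattern of large absolute correlations matters more than which column pairs with which.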
How can NMF capture the underlying structure of a questionnaire (items → factor)?
confirmatory NMF?
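One simple way to read an items → factor structure off an NMF solution is to assign each item to the factor with the largest weight in its row of \(P\) and compare that assignment with the theoretical scoring key. The sketch below assumes a small hypothetical \(P\) and key; it is not a confirmatory NMF procedure, only a descriptive check.

```r
# Hypothetical item-by-factor matrix P (6 items x 2 factors), e.g. from an NMF fit
P <- matrix(c(0.9, 0.1,
              0.8, 0.2,
              0.7, 0.0,
              0.1, 0.9,
              0.2, 0.6,
              0.0, 0.8),
            ncol = 2, byrow = TRUE,
            dimnames = list(paste0("item_", 1:6), c("F1", "F2")))

# Row-normalise so that each item's weights sum to 1
P_share <- P / rowSums(P)

# Assign each item to its dominant factor
assigned <- colnames(P)[max.col(P_share, ties.method = "first")]

# Hypothetical theoretical key: which factor each item is meant to measure
key <- c("F1", "F1", "F1", "F2", "F2", "F2")

# Cross-tabulate theoretical vs. NMF-based assignment
agreement <- mean(assigned == key)
table(theoretical = key, nmf = assigned)
```

With real output the NMF factor labels are arbitrary, so the columns of `P` would first need to be matched to the theoretical scales (for example, by the cross-tabulation itself).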
If the results are encouraging …
Third step: use NMF to identify latent factors shared within the same spectrum of symptoms
Fourth step: use causal NMF (Landy et al., 2025b) to study the effect of different treatments on the same spectrum of symptoms
All materials are available on GitHub at laurasitaunipd/nmf
Borsboom, D., & Cramer, A. O. (2013). Network analysis: An integrative approach to the structure of psychopathology. Annual Review of Clinical Psychology, 9(1), 91-121.
Kotov, R., Krueger, R. F., Watson, D., Achenbach, T. M., Althoff, R. R., Bagby, R. M., … & Zimmerman, M. (2017). The Hierarchical Taxonomy of Psychopathology (HiTOP): A dimensional alternative to traditional nosologies. Journal of Abnormal Psychology, 126(4), 454.
Landy, J. M., Basava, N., & Parmigiani, G. (2025a). bayesNMF: Fast Bayesian Poisson NMF with Automatically Learned Rank Applied to Mutational Signatures. arXiv preprint arXiv:2502.18674.
Landy, J. M., Zorzetto, D., De Vito, R., & Parmigiani, G. (2025b). Causal Inference for Latent Outcomes Learned with Factor Models. arXiv preprint arXiv:2506.20549.
EFA
R package bayesNMF
Comparison of Item-by-Factor matrices
obtained through EFA and NMF
CFA
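library(lavaan) # provides cfa(), fitMeasures(), modificationIndices()
# Five-factor measurement model: each latent factor is measured by nine items
model = "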
SMD =~ bessi_1 + bessi_6 + bessi_11 + bessi_16 + bessi_21 + bessi_26 + bessi_31 + bessi_36 + bessi_41
IND =~ bessi_5 + bessi_10 + bessi_15 + bessi_20 + bessi_25 + bessi_30 + bessi_35 + bessi_40 + bessi_45
COD =~ bessi_3 + bessi_8 + bessi_13 + bessi_18 + bessi_23 + bessi_28 + bessi_33 + bessi_38 + bessi_43
SED =~ bessi_2 + bessi_7 + bessi_12 + bessi_17 + bessi_22 + bessi_27 + bessi_32 + bessi_37 + bessi_42
ESD =~ bessi_4 + bessi_9 + bessi_14 + bessi_19 + bessi_24 + bessi_29 + bessi_34 + bessi_39 + bessi_44
"
# Fit the CFA treating all items as ordinal
fit = cfa(model=model, data=dati, ordered=TRUE)
summary(fit, standardized=TRUE)
fitMeasures(fit, fit.measures=c("rmsea","srmr","cfi","nnfi"))
# Ten largest modification indices
modificationIndices(fit, sort.=TRUE)[1:10,]
Comparison of Item-by-Factor matrices
obtained through CFA and NMF
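library(corrplot)
# Standardized CFA loadings (item-by-factor)
lambda = inspect(fit, what = "std")$lambda
# Sort items by numeric index so that rows align with the NMF matrix P
lambda <- lambda[order(as.numeric(gsub("bessi_", "", rownames(lambda)))), ]
# P: item-by-factor matrix estimated with NMF, rows in item order
allLoadings = cbind(P,lambda)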
corrplot(
cor(allLoadings),
method = "color", # fill cells with colors
type = "full", # show full matrix
addCoef.col = "black", # show correlation coefficients
number.cex = 0.7, # text size for numbers
tl.cex = 0.8 # text size for labels
)
HandZone October 30th, 2025