Quantifying task-relevant representational similarity using decision variable correlation
Previous studies have compared neural activity in the visual cortex to representations in deep neural networks trained on image classification. Interestingly, while some suggest that the two sets of representations are highly similar, others argue the opposite. Here, we propose a new approach to characterize the similarity of the decision strategies of two observers (models or brains) using decision variable correlation (DVC). DVC quantifies the image-by-image correlation between decisions decoded from the internal neural representations in a classification task, and thus captures task-relevant information rather than general representational alignment. We evaluate DVC using monkey V4/IT recordings and network models trained on image classification tasks. We find that model-model similarity is comparable to monkey-monkey similarity, whereas model-monkey similarity is consistently lower. Strikingly, DVC decreases with increasing network performance on ImageNet-1k. Adversarial training does not improve model-monkey similarity along the task-relevant dimensions assessed by DVC, although it markedly increases model-model similarity. Similarly, pre-training on larger datasets does not improve model-monkey similarity. These results suggest a divergence between the task-relevant representations in monkey V4/IT and those learned by models trained on image classification tasks.
💡 Research Summary
This paper introduces Decision Variable Correlation (DVC), a novel metric designed to quantify the similarity of task‑relevant representations between biological observers (e.g., primate visual cortex) and artificial neural networks. Traditional approaches such as Representational Similarity Analysis (RSA), Centered Kernel Alignment (CKA), and Canonical Correlation Analysis (CCA) compare the overall geometry of high‑dimensional representations but ignore which dimensions actually drive behavior. Conversely, behavioral similarity measures (e.g., Cohen’s κ, I2n) are confounded by overall accuracy and decision bias. DVC bridges this gap by projecting each high‑dimensional representation onto an optimal linear decision axis (found via Linear Discriminant Analysis) for a given binary (or pairwise) classification task, thereby extracting a continuous decision variable (DV) for every stimulus. The Pearson correlation between the DVs of two observers, after correcting for measurement noise via a split‑half reliability procedure, yields a noise‑adjusted estimate of how consistently the two systems use the same task‑relevant information.
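The core computation can be sketched in a few lines. The toy example below uses synthetic data (the sizes, the shared-signal construction, and the class-demeaning step are illustrative assumptions, not the paper's exact procedure): each "observer" gets a high-dimensional representation of the same images, LDA extracts a continuous decision variable per image, and the per-class means are removed so the Pearson correlation reflects image-by-image decision fluctuations rather than the class separation itself.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Synthetic stand-ins for two observers viewing the same 400 images
# (hypothetical sizes; real data would be neural recordings or model features).
n, d = 400, 50
labels = rng.integers(0, 2, n)
shared = rng.normal(size=(n, 5))  # task-relevant signal seen by both observers

def make_observer():
    signal = shared + labels[:, None]    # class signal carried on shared dims
    noise = rng.normal(size=(n, d - 5))  # observer-specific dimensions
    return np.hstack([signal, noise])

obs_a, obs_b = make_observer(), make_observer()

def decision_variables(X, y):
    """Fit LDA and project each image onto the decision axis (continuous DV)."""
    return LinearDiscriminantAnalysis().fit(X, y).decision_function(X)

def class_demean(dv, y):
    """Remove per-class means so the correlation reflects image-by-image
    fluctuations rather than the class separation itself."""
    out = dv.astype(float).copy()
    for c in np.unique(y):
        out[y == c] -= out[y == c].mean()
    return out

dv_a = class_demean(decision_variables(obs_a, labels), labels)
dv_b = class_demean(decision_variables(obs_b, labels), labels)
dvc, _ = pearsonr(dv_a, dv_b)
print(f"DVC = {dvc:.3f}")
```

Because both observers inherit the same shared signal, their DVs co-fluctuate across images and the correlation comes out positive, illustrating the intuition behind the metric.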
The authors first validate DVC on neural recordings from two macaque monkeys (V4 and IT combined). Using a dataset of 3,200 naturalistic images spanning eight object categories, they find high within‑brain classification accuracy (≈0.94 for V4, ≈0.92 for IT) and a between‑monkey DVC of ≈0.57, indicating that the two animals employ remarkably similar image‑by‑image decision strategies.
Next, they apply DVC to 43 deep convolutional networks pretrained on ImageNet‑1k. Despite the models achieving a wide range of ImageNet top‑1 accuracies (≈70‑85 %), model‑brain DVC values are consistently lower than brain‑brain DVC, averaging below 0.30. Strikingly, DVC declines as ImageNet performance increases, revealing an inverse relationship between generic classification prowess and alignment with primate visual processing in the task‑relevant subspace. Model‑model DVC, by contrast, is comparable to brain‑brain DVC, suggesting that networks are internally consistent with each other even though they diverge from the brain.
The authors further examine two interventions that have been proposed to improve brain‑model alignment: (1) adversarial training to increase robustness, and (2) pretraining on larger, more diverse datasets (ImageNet‑21k, JFT‑300M). Robust models exhibit higher model‑model DVC but do not show any meaningful gain in model‑brain DVC. Similarly, data‑rich models do not close the gap with neural data. These findings imply that neither robustness nor sheer data volume automatically yields representations that match the decision variables used by primate V4/IT during object recognition.
Methodologically, the paper contributes a practical pipeline: (i) dimensionality reduction (e.g., PCA) to equalize feature counts across models, (ii) LDA to derive optimal decision axes for each class pair, (iii) split‑half noise correction to obtain unbiased correlation estimates, and (iv) averaging across all class‑pair DVCs to produce a single similarity score per model‑brain pair. The authors provide open‑source code for reproducibility.
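The split-half noise correction in step (iii) can be sketched as follows. This is a minimal illustration of the standard normalization (dividing the cross-observer correlation by the geometric mean of the within-observer split-half reliabilities); the paper's exact estimator may differ in detail, and the variable names and toy data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def noise_corrected_dvc(dv_a1, dv_a2, dv_b1, dv_b2):
    """Noise-corrected DVC from split-half decision variables.

    dv_x1, dv_x2: DVs for the same images, estimated from two independent
    halves of observer x's trials (a common normalization; the paper's
    exact procedure may differ).
    """
    r = lambda u, v: np.corrcoef(u, v)[0, 1]
    # Cross-observer correlation, averaged over the four split pairings.
    r_ab = np.mean([r(dv_a1, dv_b1), r(dv_a1, dv_b2),
                    r(dv_a2, dv_b1), r(dv_a2, dv_b2)])
    # Within-observer split-half reliabilities.
    r_aa, r_bb = r(dv_a1, dv_a2), r(dv_b1, dv_b2)
    return r_ab / np.sqrt(r_aa * r_bb)

# Toy check: both observers share the same true DV, and each half adds
# independent measurement noise, so the corrected estimate should be near 1
# even though the raw split correlations are attenuated by noise.
n = 2000
true_dv = rng.normal(size=n)
dv_a1, dv_a2, dv_b1, dv_b2 = (true_dv + rng.normal(size=n) for _ in range(4))
dvc_hat = noise_corrected_dvc(dv_a1, dv_a2, dv_b1, dv_b2)
print(f"noise-corrected DVC = {dvc_hat:.2f}")
```

In the full pipeline, this corrected correlation would be computed for every class pair and then averaged to give one similarity score per model-brain pair.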
In summary, the study demonstrates that high‑performing image‑classification networks, even when made more robust or trained on massive datasets, remain fundamentally misaligned with the task‑relevant representations of primate visual cortex. DVC offers a principled, accuracy‑agnostic, behavior‑focused metric that can reveal such divergences more sensitively than traditional representational analyses. The work suggests that future model development should incorporate explicit constraints derived from neural decision variables or joint optimization with behavioral data to achieve genuinely brain‑like visual processing.