This is the official implementation of Neural Collapse in Test-Time Adaptation (CVPR 2026). We propose NCTTA, a novel algorithm for test-time adaptation.
Abstract
Test-Time Adaptation (TTA) enhances model robustness to out-of-distribution (OOD) data by updating the model online during inference, yet existing methods lack theoretical insights into the fundamental causes of performance degradation under domain shifts. Recently, Neural Collapse (NC) has been proposed as an emergent geometric property of deep neural networks (DNNs), providing valuable insights for TTA. In this work, we extend NC to the sample-wise level and discover a novel phenomenon termed Sample-wise Alignment Collapse (NC3+), demonstrating that a sample's feature embedding, obtained by a trained model, aligns closely with the corresponding classifier weight. Building on NC3+, we identify that the performance degradation stems from sample-wise misalignment during adaptation, which is exacerbated under larger distribution shifts. This indicates the necessity of realigning the feature embeddings with their corresponding classifier weights. However, the misalignment makes pseudo-labels unreliable under domain shifts. To address this challenge, we propose NCTTA, a novel feature-classifier alignment method with hybrid targets that blend geometric proximity with predictive confidence to mitigate the impact of unreliable pseudo-labels. Extensive experiments demonstrate the effectiveness of NCTTA in enhancing robustness to domain shifts. For example, NCTTA outperforms Tent by 14.52% on ImageNet-C.
Our main contributions are as follows:
- NC theory is, for the first time, extended to the sample-wise level through Sample-wise Alignment Collapse (NC3+), validated theoretically and empirically.
- Comprehensive empirical evidence is provided to show that the performance degradation of pre-trained models on OOD data originates from sample-wise misalignment.
- A novel NC-guided TTA (NCTTA) method is proposed to effectively promote alignment between sample feature embeddings and their corresponding classifier weights.
- Extensive experiments demonstrate the effectiveness of NCTTA. For instance, it achieves 78.30% average accuracy on CIFAR-10-C under severe corruption and 66.61% on ImageNet-C, outperforming prior methods.
In the mild shift scenario, where test samples are corrupted but drawn from a single domain, NCTTA consistently outperforms prior methods across all benchmarks. On CIFAR-10-C and ImageNet-C, NCTTA achieves the highest average accuracy, ranking first on nearly all corruption types. Similar gains are observed on PACS and Waterbirds, where NCTTA handles spurious correlations and domain shifts more effectively than existing methods. These results demonstrate the benefit of our feature-classifier alignment under domain shifts. Moreover, under certain corruptions, such as Snow and Frost on ImageNet-C, Tent performs even worse than no_adapt. In contrast, NCTTA mitigates the shortcomings of conventional entropy loss simply by incorporating the NC loss, without the need for additional augmentations or careful designs. Below are the experimental results on ImageNet-C.
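To make the hybrid-target idea concrete, below is a minimal NumPy sketch of a feature-classifier alignment objective that blends the two signals the abstract describes: geometric proximity (cosine similarity between a sample's feature and each classifier weight) and predictive confidence (softmax over the classifier logits). This is an illustrative assumption of the mechanism, not the paper's exact loss; the function name, the temperature, and the blending weight `alpha` are all hypothetical.

```python
import numpy as np

def softmax(z):
    # numerically stable row-wise softmax
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def hybrid_alignment_loss(features, classifier_weights, alpha=0.5, tau=0.1):
    """Illustrative sketch (NOT the paper's exact objective).

    Builds a hybrid soft target per sample by blending:
      - geometric proximity: temperature-sharpened softmax over cosine
        similarities to the classifier weights, and
      - predictive confidence: softmax over the classifier logits,
    then encourages alignment between each feature and the classifier
    weights, weighted by that hybrid target.
    """
    # L2-normalize features and classifier weights for cosine similarity
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = classifier_weights / np.linalg.norm(classifier_weights, axis=1, keepdims=True)

    cos = f @ w.T                             # geometric branch: cosine to each class
    prox = softmax(cos / tau)                 # sharpened proximity distribution
    conf = softmax(features @ classifier_weights.T)  # predictive branch

    target = alpha * prox + (1 - alpha) * conf       # hybrid soft target
    # negative target-weighted cosine alignment, averaged over the batch
    return -(target * cos).sum(axis=1).mean()
```

Because each hybrid target row sums to one and cosine similarity lies in [-1, 1], the loss is bounded in [-1, 1] and is minimized when features align with the classifier weights their hybrid targets favor.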
Our codebase is built upon TTAB.
Before running the code, please install the required environment:
# Clone this repo
git clone https://github.com/Cevaaa/NCTTA.git
cd NCTTA
# Create a conda environment
conda create -n nctta python=3.10 -y
conda activate nctta
pip install -r requirements.txt

You need to prepare the required datasets in advance. Store the datasets under data_path with the following directory structure:
data_path/
├── cifar10_c/
├── cifar100_c/
├── ILSVRC/
│ ├── imagenet-c/
│ └── val/
├── pacs/
│ ├── art/
│ ├── cartoon/
│ ├── photo/
│ └── sketch/
└── waterbirds/
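A quick way to confirm your local layout matches the tree above is a small stdlib check. This helper is not part of the repo; it simply encodes the directory structure shown here and reports anything missing before you launch experiments.

```python
from pathlib import Path

# Expected layout under data_path, mirroring the README's directory tree.
# Top-level dataset dirs map to any required subdirectories.
EXPECTED = {
    "cifar10_c": [],
    "cifar100_c": [],
    "ILSVRC": ["imagenet-c", "val"],
    "pacs": ["art", "cartoon", "photo", "sketch"],
    "waterbirds": [],
}

def missing_dirs(data_path):
    """Return a list of expected directories absent under data_path."""
    root = Path(data_path)
    missing = []
    for top, subs in EXPECTED.items():
        if not (root / top).is_dir():
            missing.append(top)
            continue  # skip subdir checks when the parent itself is missing
        missing.extend(f"{top}/{s}" for s in subs if not (root / top / s).is_dir())
    return missing
```

Running `missing_dirs("/path/to/data_path")` returns an empty list when everything is in place.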
To run baseline tests, please prepare the corresponding pre-trained checkpoints for the base model. TTAB provides an improved pretraining script. Meanwhile, we release a set of checkpoints at this link, which were used to benchmark baselines and NCTTA in our paper.
For runnable examples, please refer to TTAB.
In addition to the algorithms already implemented in TTAB, we further implement the following methods:
- ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation (ViDA, Liu et al., 2023)
- Entropy is not Enough for Test-Time Adaptation: From the Perspective of Disentangled Factors (DeYO, Lee et al., 2024)
- COME: Test-time Adaptation by Conservatively Minimizing Entropy (COME, Zhan et al., 2025)
- Adapt in the Wild: Test-Time Entropy Minimization with Sharpness and Feature Regularization (SAR2, Niu et al., 2025)
- Decoupled Entropy Minimization (AdaDEM, Ma et al., 2025)
- Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting (EATA-C, Tan et al., 2025)
If you find this repository helpful for your project, please consider citing:
@article{chen2025neural,
title={Neural Collapse in Test-Time Adaptation},
author={Chen, Xiao and Du, Zhongjing and Huang, Jiazhen and Jiang, Xu and Lu, Li and Jiang, Jingyan and Wang, Zhi},
journal={arXiv preprint arXiv:2512.10421},
year={2025}
}
This repo is developed upon TTAB.
For any additional questions, feel free to email chen-x25@mails.tsinghua.edu.cn.