DHR-CLIP

Official implementation of "DHR-CLIP: Dynamic High-Resolution Object-agnostic Prompt Learning for Zero-shot Anomaly Segmentation"
by Jiyul Ham, Jun-Geal Baek.
Accepted to ICAIIC 2025 (Fukuoka, Japan).

Introduction

Zero-shot anomaly segmentation (ZSAS) detects and localizes defects in a target dataset without the need for training samples. This is particularly valuable in industrial quality control, where distributional shifts arise between training and operational environments or where data access is restricted. Recent vision-language models have demonstrated strong zero-shot performance across various visual tasks, but two issues make them difficult to apply directly to ZSAS: the granularity of local anomaly regions varies with resolution, and the models focus on class semantics rather than local defects. To address these issues, we propose DHR-CLIP, a novel approach that incorporates dynamic high-resolution processing to enhance ZSAS in industrial inspection tasks. We further adopt an object-agnostic prompt design to detect normal and anomalous patterns without relying on specific object semantics, apply deep-text prompt tuning in the text encoder for refined textual representations, and employ V-V attention layers in the vision encoder to capture detailed local features. The integrated framework identifies fine-grained anomalies by refining both image and text prompt design, providing precise localization of defects. The effectiveness of DHR-CLIP is demonstrated through comprehensive experiments on the real-world industrial datasets MVTec AD and VisA, achieving strong performance and generalization across diverse industrial scenarios.
[Figure] Overview of DHR-CLIP

[Figure] Motivation of DHR-CLIP

[Figure] Quantitative results (Table 1, Table 2)

Reproducibility

Implementation environment

  • Ubuntu==22.04.1 LTS
  • cuda==12.1.0
  • cudnn==8
  • python==3.10
  • pytorch==2.0.0
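The version pins above can be captured in a requirements file for easier reproduction. A minimal sketch, assuming a pip-based install (the torchvision pin is an assumption chosen to pair with torch 2.0.0; CUDA and cuDNN come from the system or conda environment, not from pip):

```
# hypothetical requirements.txt matching the environment listed above
torch==2.0.0
torchvision==0.15.1
```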

First, download the MVTec AD and VisA datasets, then generate the JSON files:

cd generate_dataset_json
python mvtec.py
python visa.py
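The JSON files index the dataset images for evaluation; the exact schema is defined by the repository's mvtec.py and visa.py. As a hedged illustration only, a minimal indexer for an MVTec-style directory layout (root/&lt;class&gt;/test/&lt;defect&gt;/*.png) might look like the sketch below — the field names `img_path` and `anomaly` are assumptions, not taken from the repository:

```python
import json
import os
import tempfile

def build_index(root):
    """Walk an MVTec-style layout (root/<class>/test/<defect>/*.png)
    and collect image paths with normal/anomaly labels.
    Hypothetical schema -- the repository's mvtec.py defines the real one."""
    index = {}
    for cls in sorted(os.listdir(root)):
        test_dir = os.path.join(root, cls, "test")
        if not os.path.isdir(test_dir):
            continue
        samples = []
        for defect in sorted(os.listdir(test_dir)):
            for name in sorted(os.listdir(os.path.join(test_dir, defect))):
                samples.append({
                    "img_path": os.path.join(cls, "test", defect, name),
                    # convention: the "good" folder holds normal images
                    "anomaly": 0 if defect == "good" else 1,
                })
        index[cls] = samples
    return index

# Demo on a throwaway directory tree
with tempfile.TemporaryDirectory() as root:
    for defect in ("good", "crack"):
        os.makedirs(os.path.join(root, "bottle", "test", defect))
        open(os.path.join(root, "bottle", "test", defect, "000.png"), "w").close()
    index = build_index(root)
    print(json.dumps(index, indent=2))
```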

Second, run the DHR-CLIP script:

bash run_DHRCLIP.sh
