SAM2 + Spatial Re-Identification for Identity-Preserving Video Segmentation

This repository extends Meta AI’s SAM2 foundation model to address identity fragmentation in video segmentation under:

Occlusion
Similar-looking object interaction
Object splitting and merging
Cluttered dynamic scenes

Collaborators

This project was completed for CIS 6800 – Advanced Machine Perception (University of Pennsylvania).

Team Members:

Maanasa Rajeshwer
Marika Nishi
Prakriti Prasad

Spatial re-identification and tracklet extensions were developed collaboratively as part of the course final project.

Original SAM2 repository:
https://github.com/facebookresearch/sam2

Motivation

Foundation video segmentation models excel at mask propagation but often fail to preserve object identity over time when:

Objects become partially or fully occluded
Multiple identical objects interact
Viewpoints change significantly
Objects split or merge

These failures are critical in embodied AI systems where identity consistency is required for:

Tracking
Manipulation
Multi-object reasoning
Perception-to-control pipelines

This project introduces a training-free spatial re-identification pipeline that augments SAM2 with temporal reasoning and proximity-aware matching.

Key Contributions in This Fork

This repository adds the following extensions to the original SAM2 framework:

1. Tracklet-Based Temporal Memory

Maintains short-term identity memory
Aggregates mask predictions across frames
Enables temporal smoothing
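
The idea of a short-term identity memory can be illustrated with a minimal sketch. It is not the code in this repository: the `Tracklet` class, its `window` size, and the majority-vote smoothing rule are assumptions chosen for illustration, and masks are assumed to be boolean NumPy arrays of equal shape.

```python
from collections import deque
from dataclasses import dataclass, field

import numpy as np


@dataclass
class Tracklet:
    """Short-term memory of one object's recent masks (hypothetical structure)."""
    object_id: int
    window: int = 5  # number of recent frames to remember (assumed value)
    masks: deque = field(default_factory=deque)

    def update(self, mask: np.ndarray) -> None:
        """Store the newest mask and drop masks that fall outside the window."""
        self.masks.append(mask.astype(bool))
        while len(self.masks) > self.window:
            self.masks.popleft()

    def smoothed_mask(self, threshold: float = 0.5) -> np.ndarray:
        """Per-pixel vote: keep a pixel if it was foreground in more than
        `threshold` of the stored frames."""
        if not self.masks:
            raise ValueError("tracklet has no masks yet")
        mean = np.mean(np.stack(list(self.masks)), axis=0)
        return mean > threshold


# Usage: a pixel that flickers on in a single frame is voted out.
stable = np.zeros((2, 2), dtype=bool)
stable[0, 0] = True
noisy = stable.copy()
noisy[1, 1] = True

tracklet = Tracklet(object_id=1, window=3)
for m in (stable, stable, noisy):
    tracklet.update(m)
smoothed = tracklet.smoothed_mask()
```

A fixed-length window keeps memory bounded and lets the tracklet forget stale appearance after long occlusions; the right window length would depend on frame rate and scene dynamics.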

2. Spatial Proximity-Aware Re-Identification

Uses geometric proximity constraints
Reduces ID swaps between similar objects
Improves identity stability in cluttered scenes
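
As a rough illustration of a geometric proximity constraint, the sketch below assigns each new mask to the nearest previously known track, but only within a gating radius. The function names, the centroid-distance criterion, the greedy assignment, and the `max_dist` value are all assumptions made for this example; the repository's actual matching rule may differ.

```python
import numpy as np


def centroid(mask: np.ndarray) -> np.ndarray:
    """Return the (row, col) centroid of a non-empty boolean mask."""
    rows, cols = np.nonzero(mask)
    return np.array([rows.mean(), cols.mean()])


def match_by_proximity(track_centroids: dict, detections: list,
                       max_dist: float = 50.0) -> dict:
    """Greedily assign each detection to the nearest unclaimed track.

    track_centroids: {track_id: last known (row, col) centroid}
    detections: list of boolean masks from the current frame
    max_dist: gating radius in pixels (assumed value); farther pairs are rejected
    Returns {detection_index: track_id}; unmatched detections are omitted.
    """
    pairs = []
    for det_idx, mask in enumerate(detections):
        c = centroid(mask)
        for track_id, tc in track_centroids.items():
            dist = float(np.linalg.norm(c - np.asarray(tc, dtype=float)))
            if dist <= max_dist:
                pairs.append((dist, det_idx, track_id))

    assignment, used_tracks = {}, set()
    for dist, det_idx, track_id in sorted(pairs, key=lambda p: p[0]):
        # Closest pairs are resolved first; each track is claimed at most once.
        if det_idx not in assignment and track_id not in used_tracks:
            assignment[det_idx] = track_id
            used_tracks.add(track_id)
    return assignment
```

The gate is what discourages swaps between look-alike objects: a visually identical object on the other side of the frame is never a candidate. In dense scenes where several objects sit inside the same radius, a greedy rule like this one can still pick the wrong neighbour.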

3. Optical Flow Integration

Integrates RAFT optical flow for motion consistency
Aligns masks temporally
Improves re-identification under fast motion
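
Temporal mask alignment with a dense flow field can be sketched as follows. The example does not call RAFT; it assumes a flow array has already been computed and converted to shape `(H, W, 2)` with `flow[y, x] = (dx, dy)` in pixels. That layout, the function names, and the forward-warping strategy are assumptions for illustration.

```python
import numpy as np


def warp_mask_with_flow(mask: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Forward-warp a boolean mask from frame t to frame t+1.

    mask: (H, W) boolean array for frame t
    flow: (H, W, 2) array; flow[y, x] = (dx, dy) displacement of pixel (x, y)
    """
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    new_xs = np.rint(xs + flow[ys, xs, 0]).astype(int)
    new_ys = np.rint(ys + flow[ys, xs, 1]).astype(int)
    # Discard pixels that the flow moves outside the image.
    inside = (new_xs >= 0) & (new_xs < w) & (new_ys >= 0) & (new_ys < h)
    warped = np.zeros((h, w), dtype=bool)
    warped[new_ys[inside], new_xs[inside]] = True
    return warped


def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection over union of two boolean masks (0.0 if both are empty)."""
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 0.0


# Usage: shift a one-pixel mask by dx=2, dy=1 with a constant flow field.
mask = np.zeros((5, 5), dtype=bool)
mask[1, 1] = True
flow = np.zeros((5, 5, 2), dtype=float)
flow[..., 0] = 2.0
flow[..., 1] = 1.0
warped = warp_mask_with_flow(mask, flow)  # foreground now at row 2, col 3
```

Comparing the warped mask against candidate masks in the next frame (for example with `mask_iou`) gives a motion-consistent matching score. Forward warping can leave small holes where the flow stretches the object, so a real pipeline might prefer backward warping or a morphological closing step.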

4. Evaluation on Challenging Scenarios

Tested on:

Cup shuffling sequences
Similar-looking object tracking
Sticky note tracking
Cluttered paper splitting scenarios

These experiments focus on identity fragmentation failure modes.

Architecture Overview

Multi-Frame Video
↓
SAM2 Masklet Prediction
↓
Optical Flow Motion Alignment (RAFT)
↓
Tracklet Formation
↓
Spatial Re-Identification
↓
Identity-Consistent Mask Propagation

This design preserves SAM2’s foundation capabilities while introducing temporal reasoning without retraining the model.

Installation

Follow the original SAM2 installation instructions:

git clone https://github.com/facebookresearch/sam2.git
cd sam2
pip install -e .

## Installation Assumptions

This fork assumes:

- Python >= 3.10  
- PyTorch >= 2.5  
- CUDA-enabled GPU  

Optional (for notebooks):

```bash
pip install -e ".[notebooks]"
```

Running the Spatial Re-ID Extensions

Example:

python sam2/tracklets/tracklets_demo3.py

Or explore the provided notebooks:

notebooks/tracklets_demo.ipynb
notebooks/ap2.ipynb (cup shuffling)
notebooks/ap3.ipynb (similar-looking object tracking)

Results

We evaluate identity preservation under challenging conditions:

| Scenario | Baseline SAM2 | SAM2 + Spatial Re-ID |
|---|---|---|
| Similar object crossing | ID swaps | Reduced swaps |
| Occlusion | Identity loss | Preserved |
| Object splitting | Fragmentation | Improved consistency |

Qualitative improvements are observed in clutter-heavy and ambiguous scenes.
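
The comparison above is qualitative. If a numeric measure of ID swaps were wanted, one possible approach is to count how often an annotated object changes its predicted track ID between frames. The sketch below is an assumption for illustration only: the function name and the per-frame dictionary input format are hypothetical, and it does not reproduce any evaluation actually run for this project.

```python
def count_id_switches(frames: list) -> int:
    """Count how often a ground-truth object changes its predicted ID.

    frames: one dict per frame mapping ground-truth object id -> predicted
            track id; objects that are occluded or unmatched in a frame
            are simply absent from that frame's dict.
    """
    last_seen = {}
    switches = 0
    for frame in frames:
        for gt_id, pred_id in frame.items():
            # A switch is a change relative to the last frame the object was seen.
            if gt_id in last_seen and last_seen[gt_id] != pred_id:
                switches += 1
            last_seen[gt_id] = pred_id
    return switches


# Usage: two cups trade predicted IDs in the third frame.
frames = [
    {"cup_a": 1, "cup_b": 2},
    {"cup_a": 1, "cup_b": 2},
    {"cup_a": 2, "cup_b": 1},
    {"cup_a": 2},
]
n_switches = count_id_switches(frames)
```

Because the comparison uses the last frame in which each object was visible, an identity that survives an occlusion gap counts as preserved, while one that returns under a different ID counts as a switch.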

Limitations

Still dependent on mask quality from SAM2
Proximity-based heuristics may fail in dense scenes
No retraining is performed; the method is a purely inference-time augmentation

Future Directions

Learned re-identification embeddings
Multi-camera identity consistency
Integration with embodied control policies

Attribution

This repository is based on:

Ravi et al., “SAM 2: Segment Anything in Images and Videos,” 2024
https://github.com/facebookresearch/sam2

All original SAM2 code remains under the Apache 2.0 License.

All spatial re-identification and tracklet extensions were developed for academic research.

Citation

If referencing this extension, please cite both:

SAM2 (Meta AI)
This spatial re-identification extension (CIS 6800 project)
