SNIFF - Social-Robot Navigator for Identifying and Finding Forgotten Items

A multimodal robotic platform integrating Vision-Language Models (VLMs), Large Language Models (LLMs), and autonomous navigation for assistive human-robot interaction.

Key Features

Natural Conversation Interface
- Real-time speech-to-text with OpenAI Whisper
- Contextual dialogue handling using the Llama-3.2 LLM
- Text-to-speech conversion via pyttsx3

Advanced Perception System
- Open-set object detection with YOLO-World v8
- 1080p stereo vision with depth perception
- Camera mounted at a height of 1.3 m

Autonomous Navigation
- ROS2 NavStack with SLAM-based mapping
- Waypoint-based search pattern
- Custom safety protocols for human proximity
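
The waypoint-based search listed above can be pictured as a loop that drives to each waypoint in turn and asks the detector whether the requested item is in view. The following is a minimal sketch under stated assumptions, not the project's implementation: `Waypoint`, `search_waypoints`, `go_to`, and `item_visible` are hypothetical names, the two callables stand in for the navigation stack and the object detector, and the human-proximity safety checks are left out.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Waypoint:
    """A named search location (hypothetical structure, map-frame metres)."""
    name: str
    x: float
    y: float


def search_waypoints(
    waypoints: Iterable[Waypoint],
    go_to: Callable[[Waypoint], bool],
    item_visible: Callable[[str], bool],
    item: str,
) -> Optional[Waypoint]:
    """Visit waypoints in order and return the first one where `item` is seen.

    `go_to` stands in for a navigation request and returns True on arrival;
    `item_visible` stands in for a detector query at the current pose.
    """
    for wp in waypoints:
        if not go_to(wp):
            continue  # could not reach this waypoint; try the next one
        if item_visible(item):
            return wp
    return None  # every waypoint visited without a detection


# Usage with stand-in callables (no robot required).
tables = [
    Waypoint("table_1", 1.0, 0.5),
    Waypoint("table_2", 2.0, 0.5),
    Waypoint("table_3", 2.0, 2.0),
]
visited = []


def fake_go_to(wp: Waypoint) -> bool:
    visited.append(wp.name)
    return True


def fake_detector(item: str) -> bool:
    # Pretend the item is only visible from the second table.
    return visited[-1] == "table_2"


found = search_waypoints(tables, fake_go_to, fake_detector, "backpack")
print(found.name if found else "not found")
```

Passing the navigation and detection steps in as callables keeps the search order testable without hardware; on the robot they would be replaced by calls into the navigation stack and the vision node.
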

System Architecture

| Component | Technology Stack | Key Specs |
|---|---|---|
| Hardware | Modified TurtleBot3 + Orange Pi 5 Max | 1.5 GHz NPU, 32 GB RAM, offboard CUDA compute for VLM |
| Vision | YOLO-World v8 + OpenCV | 30 FPS @ 640×480 |
| Navigation | ROS2 NavStack | 0.2 m/s avg speed |
| Conversation | Llama-3.2 + Whisper | <2 s response latency |

Performance Highlights

State Success Rates (20 trials):

- Idle Detection: 65% [95% CI: 43.6-86.4%]
- User Approach: 100%
- Conversation Handling: 100%
- Object Detection: 80% [95% CI: 62.0-98.0%]
- Final Navigation: 55% [95% CI: 32.6-77.4%]
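
The intervals above are consistent with a normal-approximation confidence interval whose standard error uses the sample (n-1) denominator. That is an inference from the numbers, not something the README states. The sketch below recomputes them under that assumption; the success counts (13, 16, and 11 out of 20) are derived from the reported percentages, and `approx_ci` is a hypothetical helper name.

```python
import math
from typing import Tuple


def approx_ci(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation CI for a success rate.

    Assumes the standard error is computed from the sample variance of the
    0/1 outcomes, i.e. p * (1 - p) / (trials - 1).
    """
    p = successes / trials
    se = math.sqrt(p * (1 - p) / (trials - 1))
    return max(0.0, p - z * se), min(1.0, p + z * se)


# Success counts derived from the reported rates over 20 trials.
stages = [("Idle Detection", 13), ("Object Detection", 16), ("Final Navigation", 11)]
for label, k in stages:
    lo, hi = approx_ci(k, 20)
    print(f"{label}: {k / 20:.0%} [95% CI: {lo * 100:.1f}-{hi * 100:.1f}%]")
```

With 20 out of 20 successes the standard error is zero, so this method yields a degenerate interval, which may be why no interval is listed for the two 100% stages.
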

Object-Specific Accuracy:

| Object | Success Rate | Relative Size |
|---|---|---|
| Backpack | 96% | Large |
| Phone | 90% | Small |
| Bowl | 80% | Medium |
| Water Bottle | 56.7% | Medium |

Key Findings

Robust Spatial Generalization
- No significant performance difference across locations (χ²=0.104, p=0.991)
- Success rates at Tables 1-4: 86.7%, 85%, 68%, 80%

Error Propagation Analysis
- Strong correlation between early-stage success and final outcome (r=0.81)
- Navigation errors accounted for 45% of total failures

Model Limitations
- Water bottles were hard to detect because of their reflective surfaces
- Idle-state failure rate of 35%, caused by environmental distractions
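
The p-value reported for the location comparison can be sanity-checked from the test statistic alone. The sketch below assumes 3 degrees of freedom (four table locations, success or failure outcome), which the README does not state, and uses the closed-form upper-tail probability of the chi-square distribution for that case. `chi2_sf_df3` is a hypothetical helper name.

```python
import math


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df.

    Closed form for df = 3:
        erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_df3(0.104)
print(f"chi2 = 0.104, df = 3 -> p = {p_value:.3f}")
```

A p-value this close to 1 means the observed differences between tables are well within what chance alone would produce, which matches the stated conclusion of no significant location effect.
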

Future Directions

- Implement multi-user detection algorithms
- Develop dynamic search patterns beyond fixed waypoints
- Integrate tactile feedback for object verification
- Optimize model quantization for edge deployment

Technologies Used: ROS2, PyTorch, YOLO-World, Llama-3.2, OpenCV
Hardware: TurtleBot3, Orange Pi 5 Max, ELP Stereo Cam, Anker PowerConf S330
