
EQA merge #254


Open
wants to merge 92 commits into base: main

Commits (92)
63041b0
Add semnav-specific / hydra-related changes to stretch_ai
blakerbuchanan Nov 5, 2024
3706b02
commit changes used for in-home hardware experiments
SaumyaSaxena Jan 10, 2025
a105448
installation update
hello-peiqi Jan 16, 2025
daa6880
grapheqa
hello-peiqi Jan 16, 2025
0e2e7ae
grapheqa fix
hello-peiqi Jan 17, 2025
665ac75
update
hello-peiqi Jan 27, 2025
6d12fbf
restore rerun
hello-peiqi Jan 27, 2025
c9d93f6
restore branch
hello-peiqi Jan 27, 2025
4e31af9
dynamem fix
hello-peiqi Jan 31, 2025
f30ddc9
revert unexpected bugs
hello-peiqi Jan 31, 2025
c9eda02
revert unexpected bugs
hello-peiqi Jan 31, 2025
b7218f0
Merge branch 'main' into hello-peiqi/grapheqa
hello-peiqi Jan 31, 2025
216ddbb
grapheqa push
hello-peiqi Feb 4, 2025
7b37896
Merge branch 'main' into hello-peiqi/grapheqa
hello-peiqi Feb 4, 2025
c9feeba
Merge branch 'hello-peiqi/grapheqa' of https://github.com/hello-robot…
hello-peiqi Feb 4, 2025
b58cd69
debug to make scene grpah building work
hello-peiqi Feb 4, 2025
4a7ea35
update
hello-peiqi Feb 6, 2025
12dbed8
update
hello-peiqi Feb 7, 2025
15c8cef
update
hello-peiqi Feb 7, 2025
8032c56
checkpoint
hello-peiqi Feb 9, 2025
a384867
update
hello-peiqi Feb 11, 2025
a17cc36
update save image
hello-peiqi Feb 12, 2025
232da6e
update save best image
hello-peiqi Feb 15, 2025
b55a649
add planner
hello-peiqi Feb 17, 2025
a18fe80
debug
hello-peiqi Feb 18, 2025
051c4b8
debug planner
hello-peiqi Feb 18, 2025
7511362
write custom question
hello-peiqi Feb 19, 2025
1850efe
add discord support
hello-peiqi Feb 20, 2025
f8676a6
add captioner
hello-peiqi Feb 21, 2025
6d51d32
add deepseek
cpaxton Feb 22, 2025
8888ed8
add qwen deepseek and make support a little cleaner and more
cpaxton Feb 22, 2025
99855dc
updates
cpaxton Feb 22, 2025
1bede03
added qwen quantization and some cleanup
cpaxton Feb 22, 2025
c788f2e
update and add bitsandbytes
cpaxton Feb 22, 2025
8dd3991
update
cpaxton Feb 22, 2025
435b479
update
hello-peiqi Feb 23, 2025
bc4fcbc
suggestions on deepseek update
hello-peiqi Feb 27, 2025
91a1ffa
Merge branch 'cpaxton/deepseek' of https://github.com/hello-robot/str…
hello-peiqi Feb 27, 2025
285b5e9
find better captioner
hello-peiqi Mar 2, 2025
db15156
update installation
hello-peiqi Mar 2, 2025
c6ba6c8
update installation
hello-peiqi Mar 2, 2025
95abd77
fix setup
hello-peiqi Mar 2, 2025
a4d54d3
fix setup
hello-peiqi Mar 3, 2025
db09627
improve running speed
hello-peiqi Mar 3, 2025
5bb2f4a
exploration update
hello-peiqi Mar 7, 2025
5fce257
fix installation
hello-peiqi Mar 10, 2025
2ee6a53
fix installation
hello-peiqi Mar 10, 2025
09f6d84
update scene graph merging
hello-peiqi Mar 12, 2025
45b4a1f
improve performance
hello-peiqi Mar 14, 2025
9f6bac8
minor update
hello-peiqi Mar 14, 2025
1f572ca
pass unit test
hello-peiqi Mar 14, 2025
812d4b8
udpate owlsam
hello-peiqi Mar 14, 2025
b5b19b5
Merge branch 'main' into hello-peiqi/grapheqa_merge
hello-peiqi Mar 14, 2025
14bab14
fix bug
hello-peiqi Mar 14, 2025
7fcd248
Merge branch 'hello-peiqi/grapheqa_merge' of https://github.com/hello…
hello-peiqi Mar 14, 2025
70167f8
update docs
hello-peiqi Mar 16, 2025
9c7f227
add captioners
hello-peiqi Mar 17, 2025
7c16cfb
update
hello-peiqi Mar 19, 2025
718df94
update
hello-peiqi Mar 19, 2025
23a74c5
switch to qwen
hello-peiqi Mar 19, 2025
80d542b
upgrade
hello-peiqi Mar 20, 2025
b165b0b
update
hello-peiqi Mar 25, 2025
3293404
update
hello-peiqi Mar 25, 2025
d8e67bc
update
hello-peiqi Mar 25, 2025
c001ef1
update discord
hello-peiqi Mar 26, 2025
2b1bfa0
fix bug
hello-peiqi Mar 26, 2025
c18a121
minor update
hello-peiqi Mar 28, 2025
0a683c7
minor update in prompt
hello-peiqi Mar 28, 2025
33b229b
Stretch EQA (#263)
hello-peiqi May 2, 2025
c20b982
Merge branch 'main' into hello-peiqi/grapheqa_merge
hello-peiqi May 2, 2025
b1550bb
remove projects folder
hello-peiqi May 2, 2025
7d55bf6
add docs
hello-peiqi May 7, 2025
b27ec84
add video
hello-peiqi May 7, 2025
c6801ff
downsample image
hello-peiqi May 7, 2025
845f15f
small update
hello-peiqi May 7, 2025
d88bad3
switch to large image
hello-peiqi May 7, 2025
62f2272
minor update to documentation
hello-ck May 7, 2025
cae3e48
documentation update
hello-ck May 7, 2025
d342d98
more documentation edits
hello-ck May 7, 2025
a369db7
some docs edits
hello-peiqi May 7, 2025
96c8e4b
documentation edits
hello-ck May 7, 2025
c0d0118
resolve conflict
hello-peiqi May 7, 2025
d7d4e0c
Merge branch 'hello-peiqi/grapheqa_merge' of https://github.com/hello…
hello-peiqi May 7, 2025
4d10072
doc update
hello-peiqi May 7, 2025
0a63963
update docs
hello-peiqi May 9, 2025
0fff6db
update docs
hello-peiqi May 9, 2025
5e6b255
save docstring
hello-peiqi May 9, 2025
a318134
add docsstring
hello-peiqi May 9, 2025
ec7cd06
add docsstring
hello-peiqi May 9, 2025
e995d6a
add docsstring
hello-peiqi May 9, 2025
7b231f0
Hello peiqi/cleanup (#267)
hello-peiqi May 12, 2025
209d04a
update? (#268)
hello-peiqi May 15, 2025
2 changes: 2 additions & 0 deletions README.md
@@ -16,6 +16,7 @@
- LLM agents
- text to speech and speech to text
- visualization and debugging
- embodied question answering

Much of the code is licensed under the Apache 2.0 license. See the [LICENSE](LICENSE) file for more information. Parts of it are derived from the Meta [HomeRobot](https://github.com/facebookresearch/home-robot) project and are licensed under the [MIT license](META_LICENSE).

@@ -121,6 +122,7 @@ Check out additional documentation for ways to use Stretch AI:
- [Add a New LLM Task](docs/adding_a_new_task.md) -- How to add a new task that can be called by an LLM
- [DynaMem](docs/dynamem.md) -- Run the LLM agent in dynamic scenes, meaning you can walk around and place objects as the robot explores
- [Data Collection for Learning from Demonstration](docs/data_collection.md) -- How to collect data for learning from demonstration
- [Embodied Question Answering](docs/eqa.md) -- Allow the robot to explore the environment and answer user questions about it.
- [Learning from Demonstration](docs/learning_from_demonstration.md) -- How to train and evaluate policies with LfD
- [Open-Vocabulary Mobile Manipulation](docs/ovmm.md) -- Experimental code which can handle more complex language commands
- [Apps](docs/apps.md) -- List of many different apps that you can run
1 change: 1 addition & 0 deletions docs/apps.md
@@ -21,6 +21,7 @@ Finally:
- [Dex Teleop data collection](#dex-teleop-for-data-collection) - Dexterously teleoperate the robot to collect demonstration data.
- [Learning from Demonstration (LfD)](learning_from_demonstration.md) - Train SOTA policies using [HuggingFace LeRobot](https://github.com/huggingface/lerobot)
- [Dynamem OVMM system](dynamem.md) - Deploy open vocabulary mobile manipulation system [Dynamem](https://dynamem.github.io)
- [Embodied question answering (EQA) system](eqa.md) - Deploy an embodied question answering system based on the ideas of [GraphEQA](https://grapheqa.github.io)

There are also some apps for [debugging](debug.md).

113 changes: 113 additions & 0 deletions docs/eqa.md
@@ -0,0 +1,113 @@
# The Stretch AI EQA Module

The **Embodied Question Answering (EQA) Module** enables a robot to actively explore its environment, gather visual and spatial data, and answer user questions about what it sees. To do this, the module has the robot explore the environment to acquire useful information, builds a semantic representation of what it observes, and processes questions from the user. Systems like the EQA module have the potential to be used in a variety of applications. For example, they could help people, including people with visual or cognitive impairments, find objects in their home. They could also let people monitor their home while away by asking the robot to check on things.

## Demo Video

[The following video](https://youtu.be/6tHGBYFkyMU) shows Stretch AI EQA running in one of our developers' homes.

_Click this large image to follow the link to YouTube:_

[![A demonstration of the EQA module in action](images/eqa.png)](https://youtu.be/6tHGBYFkyMU)

## Motivation and Methodology

In the prior EQA work [GraphEQA](https://arxiv.org/abs/2412.14480), researchers provided a multimodal large language model (mLLM), such as Google's Gemini or OpenAI's GPT, with a prompt that includes an object-centric semantic scene graph and task-relevant robot image observations. GraphEQA relies on the third-party scene graph module [Hydra](https://arxiv.org/abs/2201.13360), which is built on ROS Noetic; installing it can be difficult due to OS and software version compatibility issues. To provide a more user-friendly alternative, we adapted the methods of GraphEQA for use with existing code in the Stretch AI repo.

In GraphEQA, the mLLM is expected to answer the question based on task-relevant image observations and to plan exploration based on a scene graph string. Stretch AI has capabilities that can serve similar roles. For example, the [DynaMem system](dynamem.md) finds task-relevant images, and VLMs such as [Qwen](../src/stretch/llms/qwen_client.py) and [OpenAI GPT](../src/stretch/llms/openai_client.py) extract visual clues from image observations by listing the featured objects in each image (beds, tables, and so on).

The Stretch AI EQA module builds on these existing capabilities, resulting in a pipeline that only requires a Stretch robot, a GPU machine with 12 GB of VRAM, and an Internet connection for a cloud-based mLLM. The pipeline uses the following models (a minimal loading sketch follows this list):
- A lightweight VLM running on the local workstation. We use `Qwen-VL-2.5-3B` here.
- A vision-language encoder trained in a contrastive manner. We use `SigLip-v1-so400m` here.
- A powerful cloud-based mLLM. We use `gemini-2.5-pro-preview-03-25` here.
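
The snippet below is a minimal sketch of how the two local models could be pulled from the Hugging Face hub with `transformers`. The hub identifiers are assumptions based on the model names above, and the repo's own wrappers (for example [`qwen_client.py`](../src/stretch/llms/qwen_client.py) and the SigLIP encoder class) handle this differently in practice.

```python
# Sketch only: load the two local models used by the EQA pipeline.
# The hub IDs below are assumptions; the Stretch AI clients wrap this logic.
import torch
from transformers import AutoModel, AutoProcessor, Qwen2_5_VLForConditionalGeneration

device = "cuda" if torch.cuda.is_available() else "cpu"

# Lightweight VLM for keyword extraction and per-image visual clues.
vlm = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-3B-Instruct", torch_dtype=torch.bfloat16
).to(device)
vlm_processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-3B-Instruct")

# Contrastive vision-language encoder used to build DynaMem's voxel memory.
encoder = AutoModel.from_pretrained("google/siglip-so400m-patch14-384").to(device)
encoder_processor = AutoProcessor.from_pretrained("google/siglip-so400m-patch14-384")
```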

When it receives a new question, the robot follows a recipe to find the answer (a simplified sketch of this loop follows the list):
- Extract keywords from the question using the lightweight VLM. For example, `is there a hand sanitizer near the sink?` yields the keywords `hand sanitizer` and `sink`.
- Pan the head to look around and use DynaMem to add images to a voxel-based semantic memory, which extracts pixel-level vision-language features with the vision-language encoder and projects the 2D pixels onto 3D points in the voxel map.
- Use the lightweight VLM to identify the names of featured objects in the image observations and add these visual clues to a list.
- Query DynaMem to identify a few task-relevant images.
- Gather the image observations selected as task-relevant along with image observations corresponding to unexplored frontiers, and use this information to augment the visual clues.
- Prompt the mLLM with the relevant images and the augmented visual clues to answer the question. Following GraphEQA, we also ask the mLLM to report its confidence in the answer. If the mLLM is not confident, it should also output an image ID indicating the area that should be explored next.
- If no confident answer can be provided, the robot navigates toward the location associated with the selected image ID.
- Iterate the above process until a confident answer can be provided.
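
The pseudocode below is a simplified sketch of this loop. All of the helper names (`extract_keywords`, `add_images`, `query_images`, `frontier_images`, `answer`, `navigate_to_image`) are hypothetical placeholders for illustration; the real APIs live in [`robot_agent_eqa.py`](../src/stretch/agent/robot_agent_eqa.py) and [`voxel_dynamem.py`](../src/stretch/mapping/voxel/voxel_dynamem.py).

```python
# Hypothetical sketch of the EQA loop described above. All helper names are
# placeholders; see src/stretch/agent/robot_agent_eqa.py for the real code.
def answer_question(robot, memory, vlm, mllm, question: str, max_steps: int = 10):
    keywords = vlm.extract_keywords(question)  # e.g. ["hand sanitizer", "sink"]
    visual_clues: list[str] = []
    answer = None

    for _ in range(max_steps):
        # Look around and fold the new observations into the voxel-based memory.
        images = robot.rotate_head_and_capture()
        memory.add_images(images)
        visual_clues += [vlm.list_featured_objects(img) for img in images]

        # Retrieve a few task-relevant images plus views of unexplored frontiers.
        relevant = memory.query_images(keywords)
        frontiers = memory.frontier_images()

        # Ask the cloud mLLM for an answer, a confidence flag, and, if unsure,
        # the ID of the image whose area should be explored next.
        answer, confident, explore_id = mllm.answer(
            question, relevant + frontiers, visual_clues
        )
        if confident:
            return answer
        robot.navigate_to_image(explore_id)

    return answer  # best guess after exhausting the step budget
```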


## Understanding EQA's code structure

This module shares or extends core dependencies (mapping, perception, LLMs) with other Stretch AI modules such as AI Pickup and DynaMem. The following code is relevant to this module:

| File locations | Purpose |
| ----------------------- | ---------------------------------------------------------------- |
| [`src/stretch/app/run_eqa.py`](../src/stretch/app/run_eqa.py) | Entry point for the EQA module |
| [`src/stretch/agent/task/dynamem/dynamem_task.py`](../src/stretch/agent/task/dynamem/dynamem_task.py?plain=1#L409) | An executor wrapper for the EQA module |
| [`src/stretch/agent/robot_agent_eqa.py`](../src/stretch/agent/robot_agent_eqa.py) | Robot agent class containing all useful APIs for question answering |
| [`src/stretch/mapping/voxel/voxel_dynamem.py`](../src/stretch/mapping/voxel/voxel_dynamem.py#928) | EQA utilities added to [DynaMem voxel.py](../src/stretch/mapping/voxel/voxel_dynamem.py) |

## Instructions

### Installation and preparation

The very first step is to install all necessary packages on both your Stretch robot and your workstation, following [these instructions](./install_details.md).

Next, install the Gemini SDK following [Google's docs](https://ai.google.dev/gemini-api/docs/quickstart?lang=python) and obtain a Google API key on Tier 1. Tier 1 is a pay-as-you-go tier.

**So be careful: you will be charged for Gemini model usage while running the EQA module!**

That said, Gemini usage in this module is fairly cheap. You can check the [pricing](https://ai.google.dev/gemini-api/docs/pricing) and [rate limits](https://ai.google.dev/gemini-api/docs/rate-limits) for `gemini-2.5-pro-preview-03-25`.
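
Before starting the robot, you can sanity-check that your key works with a minimal query through the `google-genai` package listed in `setup.py`. This is only a sketch; the prompt is arbitrary, and you can substitute any Gemini model you have access to.

```python
# Minimal sanity check for the Gemini API key (a sketch, not part of the EQA code).
import os

from google import genai

client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
response = client.models.generate_content(
    model="gemini-2.5-pro-preview-03-25",
    contents="Reply with the single word: ready",
)
print(response.text)
```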

If you also want to try the Discord bot, which offers a friendlier interface than the plain terminal and command line, install its dependencies and obtain your Discord token by following [discord_bot.md](./discord_bot.md).

### Run EQA module

Launch the EQA agent via the `run_eqa` entry point. By default, the robot first rotates in place to scan its surroundings and a Rerun window opens (the Rerun contents are not saved automatically, so once you close the window you lose the visualization data); you are then asked to enter your questions in the terminal.

You need to know the IP address of your robot in order to send it commands. Once you know your `ROBOT_IP`, you can run the following commands to try the EQA module.

You also need to set your Gemini key before running the EQA scripts:

```bash
export GOOGLE_API_KEY=$YOUR_GEMINI_TOKEN
```

If you also want to try the Discord bot, set your Discord token as well:

```bash
export DISCORD_TOKEN=$YOUR_DISCORD_TOKEN
```

Then launch the EQA module:

```bash
python -m stretch.app.run_eqa --robot_ip $ROBOT_IP
```

Other options:

- `--not_rotate_in_place`, `-N`: skip the initial rotation-in-place scan
- `--discord`, `-D`: launch the Discord bot for a nicer interface than the terminal and command line
- `--save_rerun`, `--SR`: save Rerun logs to `dynamem_log/debug_*` as an `.rrd` file for offline replay (live streaming to the Rerun window is disabled)

**Example runs**:
Assume your robot IP is `192.168.1.42`.
* Skip initial rotation-in-place scan:

```bash
python -m stretch.app.run_eqa --robot_ip 192.168.1.42 -N
```
* Enable Discord for remote users:

```bash
python -m stretch.app.run_eqa --robot_ip 192.168.1.42 -D
```
* Skip the initial rotation-in-place scan, save the Rerun visualization, and enable Discord:

```bash
python -m stretch.app.run_eqa --robot_ip 192.168.1.42 -N -D --SR
```


## Contributing

This is an active component within the Stretch repository. Please follow the main [CONTRIBUTING.md](./CONTRIBUTING.md) guidelines for branching, testing, and pull requests.

---

*Last updated: May 2025*
3 changes: 3 additions & 0 deletions docs/images/eqa.png
Binary file not shown.
4 changes: 2 additions & 2 deletions install.sh
@@ -204,13 +204,13 @@ else
echo "Install detectron2 for perception (required by Detic)"
git submodule update --init --recursive
cd third_party/detectron2
pip install -e .
python -m pip install -e .

echo "Install Detic for perception"
cd ../../src/stretch/perception/detection/detic/Detic
# Make sure it's up to date
git submodule update --init --recursive
pip install -r requirements.txt
python -m pip install -r requirements.txt

# cd ../../src/stretch/perception/detection/detic/Detic
# Create folder for checkpoints and download
16 changes: 12 additions & 4 deletions src/setup.py
@@ -53,24 +53,32 @@
# From openai
"openai",
"openai-clip",
# For gemini
"google-genai",
# For Yolo
# "ultralytics",
# Hardware dependencies
"hello-robot-stretch-urdf",
"pyrealsense2",
"urchin",
# Visualization
"rerun-sdk>=0.18.0",
"rerun-sdk==0.18.0",
# For siglip encoder
"sentencepiece",
# For git tools
"gitpython",
# Configuration tools and neural networks
"hydra-core",
"timm>1.0.0",
"huggingface_hub[cli]",
"transformers>=4.39.2",
"accelerate",
"huggingface_hub[cli]>=0.24.7",
# "flash-attn",
"transformers>=4.50.0",
"retry",
"qwen_vl_utils",
"bitsandbytes",
"autoawq>=0.1.5",
"triton >= 3.0.0",
"accelerate >= 1.5.0",
"einops",
# Meta neural nets
"segment-anything",
82 changes: 32 additions & 50 deletions src/stretch/agent/robot_agent_dynamem.py
@@ -20,7 +20,6 @@
from typing import Any, Dict, List, Optional, Union
from uuid import uuid4

import cv2
import numpy as np
import rerun as rr
import rerun.blueprint as rrb
@@ -63,7 +62,10 @@


class RobotAgent(RobotAgentBase):
"""Basic demo code. Collects everything that we need to make this work."""
"""
Extends the basic demo agent in robot_agent.py with new functionality that implements DynaMem.
https://dynamem.github.io
"""

def __init__(
self,
@@ -76,7 +78,6 @@ def __init__(
show_instances_detected: bool = False,
use_instance_memory: bool = False,
realtime_updates: bool = False,
obs_sub_port: int = 4450,
re: int = 3,
manip_port: int = 5557,
log: Optional[str] = None,
@@ -108,12 +109,6 @@ def __init__(
# For placing
self.owl_sam_detector = None

# if self.parameters.get("encoder", None) is not None:
# self.encoder: BaseImageTextEncoder = get_encoder(
# self.parameters["encoder"], self.parameters.get("encoder_args", {})
# )
# else:
# self.encoder: BaseImageTextEncoder = None
self.device = "cuda" if torch.cuda.is_available() else "cpu"

if not os.path.exists("dynamem_log"):
@@ -170,13 +165,6 @@ def __init__(
self._manipulation_radius = parameters["motion_planner"]["goals"]["manipulation_radius"]
self._voxel_size = parameters["voxel_size"]

# self.image_processor = VoxelMapImageProcessor(
# rerun=True,
# rerun_visualizer=self.robot._rerun,
# log="dynamem_log/" + datetime.now().strftime("%Y%m%d_%H%M%S"),
# robot=self.robot,
# ) # type: ignore
# self.encoder = self.image_processor.get_encoder()
context = zmq.Context()
self.manip_socket = context.socket(zmq.REQ)
self.manip_socket.connect("tcp://" + server_ip + ":" + str(manip_port))
@@ -208,6 +196,9 @@ def __init__(
self._start_threads()

def create_obstacle_map(self, parameters):
"""
Creates the MaskSiglipEncoder, the Owlv2 detector, the voxel map utility class, and the voxel map navigation space utility class.
"""
if self.manipulation_only:
self.encoder = None
else:
@@ -249,7 +240,6 @@ def create_obstacle_map(self, parameters):
smooth_kernel_size=parameters.get("filters/smooth_kernel_size", -1),
use_median_filter=parameters.get("filters/use_median_filter", False),
median_filter_size=parameters.get("filters/median_filter_size", 5),
median_filter_max_error=parameters.get("filters/median_filter_max_error", 0.01),
use_derivative_filter=parameters.get("filters/use_derivative_filter", False),
derivative_filter_threshold=parameters.get("filters/derivative_filter_threshold", 0.5),
detection=self.detection_model,
@@ -267,6 +257,9 @@ def create_obstacle_map(self, parameters):
self.planner = AStar(self.space)

def setup_custom_blueprint(self):
"""
Defines the Rerun blueprint for the DynaMem module.
"""
main = rrb.Horizontal(
rrb.Spatial3DView(name="3D View", origin="world"),
rrb.Vertical(
@@ -285,35 +278,14 @@ def setup_custom_blueprint(self):
)
rr.send_blueprint(my_blueprint)

def compute_blur_metric(self, image):
def update_map_with_pose_graph(self):
"""
Computes a blurriness metric for an image tensor using gradient magnitudes.

Parameters:
- image (torch.Tensor): The input image tensor. Shape is [H, W, C].
Update our voxel map using a pose graph. Used for realtime updates.
By default DynaMem asks the robot to stop to take new observations, so this function will not be called.

Returns:
- blur_metric (float): The computed blurriness metric.
"""

# Convert the image to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Compute gradients using the Sobel operator
Gx = cv2.Sobel(gray_image, cv2.CV_64F, 1, 0, ksize=3)
Gy = cv2.Sobel(gray_image, cv2.CV_64F, 0, 1, ksize=3)

# Compute gradient magnitude
G = cv2.magnitude(Gx, Gy)

# Compute the mean of gradient magnitudes
blur_metric = G.mean()

return blur_metric

def update_map_with_pose_graph(self):
"""Update our voxel map using a pose graph"""

t0 = timeit.default_timer()
self.pose_graph = self.robot.get_pose_graph()

@@ -392,10 +364,7 @@ def update_map_with_pose_graph(self):
# if obs.is_pose_graph_node:
# self.voxel_map.add_obs(obs)
if len(self.obs_history) > 0:
obs_history = self.obs_history[-5:]
blurness = [self.compute_blur_metric(obs.rgb) for obs in obs_history]
obs = obs_history[blurness.index(max(blurness))]
# obs = self.obs_history[-1]
obs = self.obs_history[-1]
else:
obs = None

@@ -462,6 +431,10 @@ def update(self):
)

def look_around(self):
"""
Let the robot look around to check its surroundings.
Rotates the robot head to compensate for the narrow field of view of the RealSense head camera.
"""
print("*" * 10, "Look around to check", "*" * 10)
for pan in [0.6, -0.2, -1.0, -1.8]:
tilt = -0.6
@@ -482,6 +455,7 @@ def execute_action(
self,
text: str,
):
""" """
if not self._realtime_updates:
self.robot.look_front()
self.look_around()
@@ -497,6 +471,8 @@

if len(res) > 0:
print("Plan successful!")
# This means the robot has already finished all of its trajectories and should stop to manipulate the object.
# Two NaN values are appended to the trajectory to denote that the robot is reaching the target point.
if len(res) >= 2 and np.isnan(res[-2]).all():
if len(res) > 2:
self.robot.execute_trajectory(
@@ -507,9 +483,8 @@
)

return True, res[-1]
# The robot has not reached the object. Next it should look around and continue navigation
else:
# print(res)
# res[-1][2] += np.pi / 2
self.robot.execute_trajectory(
res,
pos_err_threshold=self.pos_err_threshold,
@@ -522,10 +497,12 @@
return None, None

def run_exploration(self):
"""Go through exploration. We use the voxel_grid map created by our collector to sample free space, and then use our motion planner (RRT for now) to get there. At the end, we plan back to (0,0,0).
"""
Go through exploration when the robot has not received any text query from the user.
We use the voxel_grid map created by our collector to sample free space, and then use the A* planner to get there.
"""

Args:
visualize(bool): true if we should do intermediate debug visualizations"""
# "" means the robot has not received any text query from the user and should conduct exploration just to better know the environment
status, _ = self.execute_action("")
if status is None:
print("Exploration failed! Perhaps nowhere to explore!")
@@ -668,6 +645,11 @@ def process_text(self, text, start_pose):
return traj

def navigate(self, text, max_step=10):
"""
The robot calls this function to navigate to the object.
It calls execute_action repeatedly until the robot is ready for manipulation.
"""
# Start a new rerun recording to avoid an overly large rerun video.
rr.init("Stretch_robot", recording_id=uuid4(), spawn=True)
finished = False
step = 0