SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks. Compared to its predecessor SAM 2, SAM 3 introduces the ability to exhaustively segment all instances of an open-vocabulary concept specified by a short text phrase or exemplars.
Please refer to X-AnyLabeling-Server for download, installation, and server setup instructions.
Launch the X-AnyLabeling client, then press Ctrl+A or click the AI button in the left menu bar to open the auto-labeling panel. In the model dropdown list, select Remote-Server, then choose Segment Anything 3.
Text prompts (demo video: sam3-text-prompt.mp4):
- Enter object names in the text field (e.g., `person,car,bicycle`)
- Separate multiple classes with periods or commas: `person.car.bicycle` or `dog,cat,tree`
- Click Send to initiate detection
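
The separator rule above (periods or commas between class names) can be illustrated with a minimal Python sketch. The helper name `parse_text_prompt` is hypothetical and is not taken from X-AnyLabeling; the actual client or server may normalize the text differently.

```python
import re


def parse_text_prompt(prompt: str) -> list[str]:
    """Split a text prompt on periods or commas into class names.

    Hypothetical helper for illustration only; not the X-AnyLabeling code.
    """
    parts = re.split(r"[.,]", prompt)
    # Trim surrounding whitespace and drop empty entries,
    # such as the one produced by a trailing separator.
    return [part.strip() for part in parts if part.strip()]


if __name__ == "__main__":
    print(parse_text_prompt("person.car.bicycle"))
    print(parse_text_prompt("dog, cat,tree,"))
```

Under this sketch, both separator styles yield the same kind of list, so `person.car.bicycle` and `person,car,bicycle` would be treated identically.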
Visual prompts (demo video: sam3-visual-prompt.mp4):
- Click +Rect or -Rect to activate drawing mode
- Draw bounding boxes around target objects or regions of interest (use +Rect for positive prompts, -Rect for negative prompts)
- Add multiple prompts for different object instances
- Click Run Rect to process visual cues
- Click Finish (or press `f`) to complete the object, then enter the label category and confirm; use Clear to remove all visual prompts
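
To make the positive/negative distinction concrete, here is a rough sketch of how a set of drawn rectangles might be represented before being sent for processing. Everything in it is an assumption made for illustration: the `BoxPrompt` class, the `to_payload` helper, the `boxes`/`labels` keys, and the 1/0 label encoding are hypothetical and do not describe the actual X-AnyLabeling request format.

```python
from dataclasses import dataclass


@dataclass
class BoxPrompt:
    """One drawn rectangle in pixel coordinates (hypothetical structure)."""
    x1: float
    y1: float
    x2: float
    y2: float
    positive: bool  # True for +Rect, False for -Rect


def to_payload(prompts: list[BoxPrompt]) -> dict:
    """Pack prompts into parallel lists of boxes and labels.

    Assumed encoding: label 1 marks a positive box, 0 a negative box.
    """
    return {
        "boxes": [[p.x1, p.y1, p.x2, p.y2] for p in prompts],
        "labels": [1 if p.positive else 0 for p in prompts],
    }


if __name__ == "__main__":
    prompts = [
        BoxPrompt(10, 20, 110, 220, positive=True),   # include objects like this
        BoxPrompt(300, 40, 380, 160, positive=False),  # exclude objects like this
    ]
    print(to_payload(prompts))
```

Keeping boxes and labels in parallel lists lets multiple prompts for different instances travel together, which mirrors the workflow of adding several rectangles before clicking Run Rect.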
