Commit 8055df7

DOCS-3586: Update capture data section (#4797)
1 parent ca59448 commit 8055df7

6 files changed, +100 -74 lines changed

docs/data-ai/capture-data/advanced/advanced-data-capture-sync.md

Lines changed: 23 additions & 21 deletions
@@ -8,6 +8,7 @@ type: "docs"
 platformarea: ["data"]
 description: "Advanced data capture and data sync configurations."
 date: "2025-02-10"
+updated: "2025-12-04"
 ---

 Some data use cases require advanced configuration beyond the attributes accessible in the UI.
@@ -18,7 +19,7 @@ You can also configure data capture for remote parts.

 Configure how long your synced data remains stored in the cloud:

-- **Retain data up to a certain size (for example, 100GB) or for a specific length of time (for example, 14 days):** Set `retention_policies` at the resource level.
+- **Retain data up to a certain size (for example, 100GB) or for a specific length of time (for example, 14 days):** Set `retention_policy` at the resource level.
 See the `retention_policy` field in [data capture configuration attributes](/data-ai/capture-data/advanced/advanced-data-capture-sync/#click-to-view-data-capture-attributes).
 - **Delete data captured by a machine when you delete the machine:** Control whether your cloud data is deleted when a machine or machine part is removed.
 See the `delete_data_on_part_deletion` field in the [data management service configuration attributes](/data-ai/capture-data/advanced/advanced-data-capture-sync/#click-to-view-data-management-attributes).
@@ -93,7 +94,7 @@ The following attributes are available for the data management service:
 <!-- prettier-ignore -->
 | Name | Type | Required? | Description | `viam-micro-server` Support |
 | ------------------ | ------ | --------- | ----------- | ------------------- |
-| `capture_disabled` | bool | Optional | Toggle data capture on or off for the entire machine {{< glossary_tooltip term_id="part" text="part" >}}. Note that even if capture is on for the whole part, but is not on for any individual {{< glossary_tooltip term_id="component" text="components" >}} (see Step 2), data is not being captured. <br> Default: `false` | <p class="center-text"><i class="fas fa-check" title="yes"></i></p> |
+| `capture_disabled` | bool | Optional | Toggle data capture on or off for the entire machine {{< glossary_tooltip term_id="part" text="part" >}}. Note that even if capture is on for the whole part, if it is not on for any individual {{< glossary_tooltip term_id="component" text="components" >}}, data is not being captured. <br> Default: `false` | <p class="center-text"><i class="fas fa-check" title="yes"></i></p> |
 | `capture_dir` | string | Optional | Path to the directory on your machine where you want to store captured data. If you change the directory for data capture, only new data is stored in the new directory. Existing data remains in the directory where it was stored. <br> Default: `~/.viam/capture` | <p class="center-text"><i class="fas fa-check" title="yes"></i></p> |
 | `tags` | array of strings | Optional | Tags to apply to all images or tabular data captured by this machine part. May include alphanumeric characters, underscores, and dashes. | |
 | `sync_disabled` | bool | Optional | Toggle cloud sync on or off for the entire machine {{< glossary_tooltip term_id="part" text="part" >}}. <br> Default: `false` | |
@@ -103,12 +104,12 @@ The following attributes are available for the data management service:
 | `delete_data_on_part_deletion` | bool | Optional | Whether deleting this {{< glossary_tooltip term_id="machine" text="machine" >}} or {{< glossary_tooltip term_id="part" text="machine part" >}} should result in deleting all the data captured by that machine part. <br> Default: `false` | <p class="center-text"><i class="fas fa-check" title="yes"></i></p> |
 | `delete_every_nth_when_disk_full` | int | Optional | How many files to delete when local storage meets the [fullness criteria](/data-ai/capture-data/advanced/how-sync-works/#storage). The data management service will delete every Nth file that has been captured upon reaching this threshold. Use JSON mode to configure this attribute. <br> Default: `5`, meaning that every fifth captured file will be deleted. | |
 | `maximum_num_sync_threads` | int | Optional | Max number of CPU threads to use for syncing data to the Viam Cloud. <br> Default: [runtime.NumCPU](https://pkg.go.dev/runtime#NumCPU)/2 so half the number of logical CPUs available to viam-server | |
-| `mongo_capture_config.uri` | string | Optional | The [MongoDB URI](https://www.mongodb.com/docs/v6.2/reference/connection-string/) data capture will attempt to write tabular data to after it is enqueued to be written to disk. When non-empty, data capture will capture tabular data to the configured MongoDB database and collection at that URI.<br>See `mongo_capture_config.database` and `mongo_capture_config.collection` below for database and collection defaults.<br>See [Data capture directly to MongoDB](/data-ai/capture-data/advanced/how-sync-works/#storage) for an example config.| |
-| `mongo_capture_config.database` | string | Optional | When `mongo_capture_config.uri` is non empty, changes the database data capture will write tabular data to. <br> Default: `"sensorData"` | |
-| `mongo_capture_config.collection` | string | Optional | When `mongo_capture_config.uri` is non empty, changes the collection data capture will write tabular data to.<br> Default: `"readings"` | |
-| `cache_size_kb` | float | Optional | `viam-micro-server` only. The maximum amount of storage bytes (in kilobytes) allocated to a data collector. <br> Default: `1` KB. | <p class="center-text"><i class="fas fa-check" title="yes"></i></p> |
+| `mongo_capture_config.uri` | string | Optional | The [MongoDB URI](https://www.mongodb.com/docs/v6.2/reference/connection-string/) to which data capture will attempt to write tabular data after it is enqueued to be written to disk. When non-empty, data capture will write tabular data to the configured MongoDB database and collection at that URI.<br>See `mongo_capture_config.database` and `mongo_capture_config.collection` below for database and collection defaults.<br>See [Capture directly to your own MongoDB cluster](/data-ai/capture-data/advanced/advanced-data-capture-sync/#capture-directly-to-your-own-mongodb-cluster) for example configurations.| |
+| `mongo_capture_config.database` | string | Optional | When `mongo_capture_config.uri` is non-empty, changes the database data capture will write tabular data to. <br> Default: `"sensorData"` | |
+| `mongo_capture_config.collection` | string | Optional | When `mongo_capture_config.uri` is non-empty, changes the collection data capture will write tabular data to.<br> Default: `"readings"` | |
+| `cache_size_kb` | float | Optional | `viam-micro-server` only. The maximum amount of storage (in kilobytes) allocated to a data collector. <br> Default: `1` KB. | <p class="center-text"><i class="fas fa-check" title="yes"></i></p> |
 | `file_last_modified_millis` | float | Optional | The amount of time to pass since arbitrary files were last modified until they are synced. Normal <file>.capture</file> files are synced as soon as they are able to be synced. <br> Default: `10000` milliseconds. | |
-| `disk_usage_deletion_threshold` | float | Optional | The disk usage ratio at or above which, files will be deleted if the capture directory makes up at least the specified `capture_dir_deletion_threshold` of the disk usage. If disk usage is at or above the disk usage threshold, but the capture directory is below the capture directory threshold, then file deletion will not occur but a warning will be logged periodically. Default: `0.9`. | |
+| `disk_usage_deletion_threshold` | float | Optional | The disk usage ratio at or above which files will be deleted if the capture directory makes up at least the specified `capture_dir_deletion_threshold` of the disk usage. If disk usage is at or above the disk usage threshold, but the capture directory is below the capture directory threshold, then file deletion will not occur but a warning will be logged periodically. Default: `0.9`. | |
 | `capture_dir_deletion_threshold` | float | Optional | The ratio of disk usage made up by the capture directory at or above which files will be deleted if the disk usage ratio is also above the `disk_usage_deletion_threshold`. If the ratio of disk usage of the capture directory is at or above the threshold but the disk usage is below the disk usage threshold, then file deletion will not occur but a warning will be logged periodically. Default: `0.5`. | |

 {{< /expand >}}
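
Reviewer note: the `disk_usage_deletion_threshold` and `capture_dir_deletion_threshold` rows in the table above describe a three-way decision (delete, warn, or do nothing). The Python sketch below restates that logic for illustration only; it is not Viam's implementation. The names `DeletionThresholds` and `deletion_action` are hypothetical, the defaults are the ones listed in the table, and the sketch assumes both comparisons are inclusive ("at or above").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeletionThresholds:
    """Hypothetical container; defaults copied from the attribute table."""
    disk_usage: float = 0.9   # disk_usage_deletion_threshold
    capture_dir: float = 0.5  # capture_dir_deletion_threshold


def deletion_action(
    disk_usage_ratio: float,
    capture_dir_ratio: float,
    thresholds: DeletionThresholds = DeletionThresholds(),
) -> str:
    """Return "delete", "warn", or "none" for the given usage ratios.

    Assumption: both thresholds are treated as inclusive.
    """
    disk_full = disk_usage_ratio >= thresholds.disk_usage
    capture_dir_large = capture_dir_ratio >= thresholds.capture_dir
    if disk_full and capture_dir_large:
        # Both conditions hold: files are deleted.
        return "delete"
    if disk_full or capture_dir_large:
        # Only one condition holds: no deletion, but a warning is logged.
        return "warn"
    return "none"


if __name__ == "__main__":
    for disk, capture in [(0.95, 0.6), (0.95, 0.2), (0.5, 0.6), (0.5, 0.2)]:
        print(disk, capture, deletion_action(disk, capture))
```

Deletion needs both ratios to reach their thresholds, so a full disk whose space is mostly used by something other than the capture directory only produces a warning.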
@@ -196,7 +197,7 @@ This example configuration captures data from the `ReadImage` method of a camera
 {{% /tab %}}
 {{% tab name="viam-micro-server" %}}

-This example configuration captures data from the `GetReadings` method of a temperature sensor and wifi signal sensor:
+This example configuration captures data from the `Readings` method of a temperature sensor and wifi signal sensor:

 ```json {class="line-numbers linkable-line-numbers"}
 {
@@ -267,7 +268,7 @@ This example configuration captures data from the `GetReadings` method of a temp
 {{% /tab %}}
 {{< /tabs >}}

-Example for a vision service:
+Example configuration for a vision service:

 This example configuration captures data from the `CaptureAllFromCamera` method of the vision service:

@@ -345,7 +346,9 @@ Viam supports data capture from {{< glossary_tooltip term_id="resource" text="re
 For example, if you use a {{< glossary_tooltip term_id="part" text="part" >}} that does not have a Linux operating system or does not have enough storage or processing power to run `viam-server`, you can still process and capture the data from that part's resources by adding it as a remote part.

 Currently, you can only configure data capture from remote resources in your JSON configuration.
-To add them to your JSON configuration you must explicitly add the remote resource's `type`, `model`, `name`, and `additional_params` to the `data_manager` service configuration in the `remotes` configuration:
+To add them to your JSON configuration, you must explicitly add the remote resource's `type`, `model`, `name`, and `additional_params` to the data_manager service configuration in the remotes configuration:
+
+`name` and `additional_params` to the `data_manager` service configuration in the `remotes` configuration:

 <!-- prettier-ignore -->
 | Key | Description |
@@ -428,9 +431,7 @@ The following example of a configuration with a remote part captures data from t
 "sync_disabled": true,
 "sync_interval_mins": 5,
 "tags": ["tag1", "tag2"]
-},
-"name": "data_manager",
-"type": "data_manager"
+}
 }
 ],
 "components": [],
@@ -443,16 +444,16 @@ The following example of a configuration with a remote part captures data from t
 "type": "data_manager",
 "attributes": {
 "capture_methods": [
-// Captures data from two analog readers (A1 and A2)
-{
+// Captures data from two analog readers (A1 and A2)
+{
 "method": "Analogs",
 "capture_frequency_hz": 1,
 "cache_size_kb": 10,
 "name": "rdk:component:board/my-esp32",
 "additional_params": { "reader_name": "A1" },
 "disabled": false
-},
-{
+},
+{
 "method": "Analogs",
 "capture_frequency_hz": 1,
 "cache_size_kb": 10,
@@ -467,7 +468,7 @@ The following example of a configuration with a remote part captures data from t
 "cache_size_kb": 10,
 "name": "rdk:component:board/my-esp32",
 "additional_params": {
-"pin_name": “27”
+"pin_name": "27"
 },
 "disabled": false
 }
@@ -491,14 +492,15 @@ The following example of a configuration with a remote part captures data from t
 {
 "services": [
 {
+"name": "data_manager",
+"api": "rdk:service:data_manager",
+"model": "rdk:builtin:builtin",
 "attributes": {
 "capture_dir": "",
 "sync_disabled": true,
 "sync_interval_mins": 5,
 "tags": []
-},
-"name": "data_manager",
-"type": "data_manager"
+}
 }
 ],
 "components": [],

docs/data-ai/capture-data/advanced/how-sync-works.md

Lines changed: 8 additions & 8 deletions
@@ -6,11 +6,12 @@ weight: 12
 layout: "docs"
 type: "docs"
 platformarea: ["data"]
-description: "Data capture and sync works differently for viam-server and viam-micro-server."
+description: "Data capture and sync work differently for viam-server and viam-micro-server."
 date: "2024-12-18"
+updated: "2025-12-04"
 ---

-Data capture and cloud sync works differently for `viam-server` and `viam-micro-server`.
+Data capture and cloud sync work differently for `viam-server` and `viam-micro-server`.

 {{< tabs >}}
 {{% tab name="viam-server" %}}
@@ -28,7 +29,7 @@ The data is captured locally on the machine's storage and, by default, stored in

 The relative path for the data capture directory depends on where `viam-server` is run from, as well as the operating system of the machine.

-To find the `$HOME` value, check your machine's logs on startup which will log it in the environment variables:
+To find the `$HOME` value, check your machine's logs on startup, which will log it in the environment variables:

 ```sh
 2025-01-15T14:27:26.073Z INFO rdk server/entrypoint.go:77 Starting viam-server with following environment variables {"HOME":"/home/johnsmith"}
@@ -69,15 +70,15 @@ When data is stored in the cloud, it is encrypted at rest by the cloud storage p

 ## Data integrity

-Viam's data management service is designed to safeguard against data loss, data duplication and otherwise compromised data.
+Viam's data management service is designed to safeguard against data loss, data duplication, and otherwise compromised data.

 If the internet becomes unavailable or the machine needs to restart during the sync process, the sync is interrupted.
-If the sync process is interrupted, the service will retry uploading the data at exponentially increasing intervals until the interval in between tries is at one hour, at which point the service retries the sync every hour.
+If the sync process is interrupted, the service will retry uploading the data at exponentially increasing intervals until the interval between retries reaches one hour, at which point the service retries the sync every hour.
 When the connection is restored and sync resumes, the service continues sync where it left off without duplicating data.
 If the interruption happens mid-file, sync resumes from the beginning of that file.
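
Reviewer note: the retry behavior described in this paragraph (exponentially growing intervals, capped at one hour) can be sketched as a capped backoff schedule. The Python below is a minimal illustration, not Viam's code. The function name `retry_intervals`, the one-second starting interval, and the doubling factor are assumptions; the text only states that intervals grow exponentially and stop growing at one hour.

```python
from itertools import islice
from typing import Iterator

MAX_INTERVAL_S = 3600.0  # one-hour cap, as stated in the text


def retry_intervals(
    initial_s: float = 1.0,   # assumed starting interval (not from the docs)
    factor: float = 2.0,      # assumed growth factor (not from the docs)
    cap_s: float = MAX_INTERVAL_S,
) -> Iterator[float]:
    """Yield wait times between retries: exponential growth, then constant at the cap."""
    interval = initial_s
    while True:
        yield min(interval, cap_s)
        interval = min(interval * factor, cap_s)


if __name__ == "__main__":
    # Print the first 14 waits; once the cap is reached, every wait is one hour.
    print(list(islice(retry_intervals(), 14)))
```

Capping the interval keeps a long outage from pushing the next attempt arbitrarily far into the future, so sync resumes within an hour of the connection returning.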

 To avoid syncing files that are still being written to, the data management service only syncs arbitrary files that haven't been modified in the previous 10 seconds.
-This default can be changed with the [`file_last_modified_millis` config attribute](/data-ai/capture-data/capture-sync/).
+This default can be changed with the [`file_last_modified_millis` config attribute](/data-ai/capture-data/advanced/advanced-data-capture-sync/#click-to-view-data-management-attributes).
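
Reviewer note: the "not modified in the previous 10 seconds" rule amounts to comparing a file's modification time against a configurable age. The Python sketch below illustrates that check under stated assumptions: `is_ready_to_sync` is a hypothetical helper, not part of any Viam API, and treating a file whose age exactly equals the threshold as ready is a guess.

```python
import os
import time
from typing import Optional

DEFAULT_FILE_LAST_MODIFIED_MILLIS = 10000  # default from the docs (10 seconds)


def is_ready_to_sync(
    path: str,
    file_last_modified_millis: float = DEFAULT_FILE_LAST_MODIFIED_MILLIS,
    now_s: Optional[float] = None,
) -> bool:
    """Return True if the file at `path` has been unmodified for long enough.

    `now_s` lets callers inject a clock value for testing; it defaults to
    the current time.
    """
    now = time.time() if now_s is None else now_s
    age_ms = (now - os.path.getmtime(path)) * 1000.0
    return age_ms >= file_last_modified_millis


if __name__ == "__main__":
    # A file written moments ago is usually still "settling" and is skipped.
    print(is_ready_to_sync(__file__))
```

Raising `file_last_modified_millis` trades sync latency for a lower chance of uploading a file that another process is still writing.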

 ## Automatic data deletion

@@ -116,8 +117,7 @@ When a machine loses its internet connection, it cannot resume cloud sync until

 To ensure that the machine can store all data captured while it has no connection, you need to provide enough local data storage.

-If your robot is offline and can't sync and your machine's disk fills up beyond a certain threshold, the data management service will delete captured data to free up additional space and maintain a working machine.
-For more information, see [Automatic data deletion details](/data-ai/capture-data/advanced/how-sync-works/)
+For information about automatic data deletion when storage fills up, see [Automatic data deletion](#automatic-data-deletion) above.

 Data capture supports capturing tabular data directly to MongoDB in addition to capturing to disk.
 For more information, see [Capture directly to MongoDB](/data-ai/capture-data/advanced/advanced-data-capture-sync/#capture-directly-to-your-own-mongodb-cluster).

docs/data-ai/capture-data/capture-sync.md

Lines changed: 6 additions & 5 deletions
@@ -8,6 +8,7 @@ type: "docs"
 platformarea: ["data"]
 description: "Capture data from a resource on your machine and sync the data to the cloud."
 date: "2024-12-03"
+updated: "2025-12-04"
 aliases:
 - /services/data/capture/
 - /data/capture/
@@ -41,9 +42,9 @@ aliases:
 ---

 You can use the data management service to capture data from [supported components and services](/data-ai/capture-data/capture-sync/#click-to-see-resources-that-support-data-capture-and-cloud-sync), then sync it to the cloud.
-You can also sync data from arbitrary folders on your machine.
+You can also [sync data from arbitrary folders on your machine](/data-ai/capture-data/upload-other-data/#sync-data-from-another-directory).

-## How data capture and data sync works
+## How data capture and data sync work

 The data management service writes data from your configured Viam resources to local storage on your edge device and syncs data from the edge device to the cloud:

@@ -92,13 +93,13 @@ Some models do not support all options, for example webcams do not capture point

 {{< /expand >}}

-For instructions on configuring data capture and sync with JSON, go to [Advanced data capture and sync configurations](/data-ai/capture-data/advanced/advanced-data-capture-sync/) and follow the instructions for JSON examples.
+For instructions on configuring data capture and sync with JSON, see [Advanced data capture and sync configurations](/data-ai/capture-data/advanced/advanced-data-capture-sync/).

 ## View captured data

 1. Navigate to the [**DATA** tab](https://app.viam.com/data/view).
 1. Select the [**Images**](https://app.viam.com/data/view?view=images), [**Files**](https://app.viam.com/data/view?view=files), [**Point clouds**](https://app.viam.com/data/view?view=point+clouds), or [**Sensors**](https://app.viam.com/data/view?view=sensors) subtab.
-1. Filter data by location, type of data, and more.
+1. Filter data by location, type, and more.

 ## Stop data capture or data sync

@@ -142,4 +143,4 @@ For other ways to control data synchronization, see:
 ## Next steps

 For more information on available configuration attributes and options like capturing directly to MongoDB or conditional sync, see [Advanced data capture and sync configurations](/data-ai/capture-data/advanced/advanced-data-capture-sync/).
-To leverage AI, you can now [create a dataset](/data-ai/train/create-dataset/) with the data you've captured.
+To leverage AI, you can [create a dataset](/data-ai/train/create-dataset/) with the data you've captured.
