Feat/511 Implement Data Collection and Visualization for Web3.Storage Measurement Batch #560

Winter-Soren · 2025-03-27T17:42:12Z

This PR implements collection and visualization of Web3.Storage measurement batch metrics.

Closes #511

Changes

- Created new InfluxDB bucket 'spark-batch-metrics' for dedicated batch metrics storage
- Enhanced telemetry recording with separate batch metrics measurement
- Added visualization dashboards for batch metrics

Code Changes

- Modified common/telemetry.js to add dedicated batch metrics write client
- Enhanced publish/index.js to separate batch metrics recording
- Added proper tags for improved metric querying

Metrics Collected

- batch_size_bytes: Total size of the measurement batch
- avg_measurement_size_bytes: Average size per measurement
- measurement_count: Number of measurements in batch
Tags:
- cid: Content ID of the batch
- round_index: Associated round number

Testing

- Verified metrics collection with test data
- Confirmed proper data storage in InfluxDB
- Validated dashboard visualizations

Winter-Soren · 2025-03-27T17:43:27Z

Hi @pyropy can you review this draft and let me know if I’m heading in the right direction?

Winter-Soren · 2025-03-27T17:48:16Z

@pyropy, I am also running into an issue in parallel: the server runs fine but then suddenly outputs an error stating that the InfluxDB write failed. I have attached a screenshot below.

I created InfluxDB keys from this platform: https://eu-central-1-1.aws.cloud2.influxdata.com/orgs/e0c31560a2dfff87/load-data/tokens.

I tried deleting the keys and checking the RPC nodes, but the issue persists.

juliangruber · 2025-03-28T09:00:14Z

@Winter-Soren could you please link to the issue this addresses both in the PR title and description?

bajtos

I am confused.

What problems is this pull request trying to solve?

If the goal is to start collecting telemetry about batch sizes, I would expect the pull request to simply add more fields to the existing publish point.

bajtos · 2025-03-28T09:12:35Z

publish/index.js

+    point.tag('cid', cid.toString())
+    point.tag('round_index', roundIndex.toString())


Storing CIDs and round indexes in tags will cause high cardinality that will degrade performance and/or increase our bill.

Quoting from https://docs.influxdata.com/influxdb/v2/write-data/best-practices/resolve-high-cardinality/

Tags containing highly variable information like unique IDs, hashes, and random strings lead to a large number of series, also known as high series cardinality. High series cardinality is a primary driver of high memory usage for many database workloads.
(...)
Review your tags to ensure each tag does not contain unique values for most entries.
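To illustrate the cardinality concern, here is a minimal sketch that counts distinct series for the same data recorded two ways. It assumes the usual InfluxDB model in which a series is identified by its measurement name plus tag set; the `seriesCardinality` helper and the example CIDs are hypothetical and not part of this codebase.

```javascript
// Series cardinality: the number of distinct (measurement, tag set) combinations.
// Field values do not contribute to the series key.
function seriesCardinality (points) {
  const keys = new Set()
  for (const { measurement, tags } of points) {
    const tagKey = Object.keys(tags)
      .sort()
      .map(k => `${k}=${tags[k]}`)
      .join(',')
    keys.add(`${measurement},${tagKey}`)
  }
  return keys.size
}

// Hypothetical batches: every publish has a unique CID and round index.
const batches = Array.from({ length: 1000 }, (_, i) => ({
  cid: `bafy-example-${i}`,
  roundIndex: i
}))

// Variant A: unique values stored as tags, so every point creates a new series.
const asTags = batches.map(b => ({
  measurement: 'publish',
  tags: { cid: b.cid, round_index: String(b.roundIndex) }
}))

// Variant B: the same values stored as fields, so all points share one series.
const asFields = batches.map(b => ({
  measurement: 'publish',
  tags: {},
  fields: { cid: b.cid, round_index: b.roundIndex }
}))

const taggedSeries = seriesCardinality(asTags)
const fieldSeries = seriesCardinality(asFields)
console.log({ taggedSeries, fieldSeries })
```

With unique values in tags, the series count grows with every published batch; with the same values in fields (or omitted), it stays constant.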

thanks for the suggestion, I'll rectify the commit!

+1 to what @bajtos said.

Winter-Soren · 2025-03-29T18:55:10Z

Quoting @juliangruber: "@Winter-Soren could you please link to the issue this addresses both in the PR title and description?"

@juliangruber, I have edited the name of the PR and added a link to the issue it resolves.

pyropy

Good job and thank you for your contribution @Winter-Soren! 🎉

I think the PR is going in the right direction, but a few things need fixing before continuing. Namely, let's stick to the established pattern in the common/telemetry.js file (which I've noted in the comments). We would also need to improve the naming a bit to make it more obvious which metrics we're storing.

Keep up the good work! 🚀

pyropy · 2025-03-31T09:56:48Z

publish/index.js

@@ -53,6 +53,11 @@ export const publish = async ({

  logger.log(`Publishing ${measurements.length} measurements. Total unpublished: ${totalCount}. Batch size: ${maxMeasurements}.`)

+  // Calculate batch size in bytes
+  const batchSizeBytes = Buffer.byteLength(


Have you found some other way to calculate the batch size without serialising objects to JSON? Depending on the batch size, that might consume a lot of memory.

+1

We have the following code a few lines below:

const file = new File(
  [measurements.map(m => JSON.stringify(m)).join('\n')],
  'measurements.ndjson',
  { type: 'application/json' }
)

Please refactor it so that we create only one copy of measurements.map(m => JSON.stringify(m)).join('\n').
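One possible shape for that refactoring, as a minimal sketch: serialize the batch once, then reuse the same string for both the byte count and the uploaded file. The `serializeBatch` helper and the sample measurements are hypothetical names for illustration; only `Buffer.byteLength` and the `File` construction come from the code under review.

```javascript
// Hypothetical helper: build the NDJSON payload once and derive its size from it.
function serializeBatch (measurements) {
  const payload = measurements.map(m => JSON.stringify(m)).join('\n')
  // Buffer.byteLength counts UTF-8 bytes without creating another copy of the data.
  const batchSizeBytes = Buffer.byteLength(payload, 'utf8')
  return { payload, batchSizeBytes }
}

// Sample data, for illustration only.
const measurements = [{ id: 1 }, { id: 2 }]
const { payload, batchSizeBytes } = serializeBatch(measurements)
console.log(batchSizeBytes)

// The same string can then back the uploaded file, so no second copy is serialized:
// const file = new File([payload], 'measurements.ndjson', { type: 'application/json' })
```

This still holds one serialized copy in memory, which the upload needs anyway; it only avoids building a second one just to measure the size.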

pyropy · 2025-03-31T09:57:35Z

publish/index.js

+    point.tag('cid', cid.toString())
+    point.tag('round_index', roundIndex.toString())


+1 to what @bajtos said.

bajtos · 2025-03-31T11:29:52Z

Thank you for adding a link to #511, @Winter-Soren. This pull request makes sense to me now!

Thank you for the contribution ❤️

Before we move forward, I think it's best for me and @pyropy to agree on the higher-level design.

@pyropy Why is it necessary to create a new bucket and a new telemetry writer?

As I wrote before:

If the goal is to start collecting telemetry about batch sizes, I would expect the pull request to simply add more fields to the existing publish point.

If we add more fields to the existing publish point, what problems would that cause?

pyropy · 2025-03-31T12:30:08Z

Quoting @bajtos:

Why is it necessary to create a new bucket and a new telemetry writer? (...) If we add more fields to the existing publish point, what problems would that cause?

Sorry, I had only quickly glanced over your comment. Looking at the code, I have to say you're right: we could simply extend the publish point. I'm also going to update the #511 description accordingly.

pyropy · 2025-03-31T12:35:04Z

@Winter-Soren I published a bad task description without looking deeper into the existing codebase. I have updated the task description according to the original comment published by @bajtos.

pyropy

I've added new comments according to the updated description of #511. Please let me know if you need further explanation. 🙏🏻

pyropy · 2025-03-31T12:37:00Z

common/telemetry.js

@@ -25,9 +25,17 @@ const networkInfoWriteClient = influx.getWriteApi(
  's' // precision
 )

+// Add new write client for batch metrics


Let's not add a new bucket and write client; let's just extend the existing publish metric instead.

pyropy · 2025-03-31T12:37:38Z

common/telemetry.js

 setInterval(() => {
  publishWriteClient.flush().catch(console.error)
  networkInfoWriteClient.flush().catch(console.error)
+  batchMetricsWriteClient.flush().catch(console.error)


Suggested change

-  batchMetricsWriteClient.flush().catch(console.error)

We won't need this one if we extend the existing publish metric.

pyropy · 2025-03-31T12:38:04Z

common/telemetry.js

+  recordNetworkInfoTelemetry,
+  batchMetricsWriteClient


Suggested change

-  recordNetworkInfoTelemetry,
-  batchMetricsWriteClient
+  recordNetworkInfoTelemetry

We won't need this one if we extend the existing publish metric.

pyropy · 2025-03-31T12:38:30Z

publish/index.js

  recordTelemetry('publish', point => {
+    // Existing metrics


Let's extend this metric with a new point that collects the batch size.

Let me correct that: we should add new fields to the existing point.

pyropy · 2025-03-31T12:39:20Z

publish/index.js

@@ -136,6 +143,16 @@ export const publish = async ({
    )
    point.intField('add_measurements_duration_ms', ieAddMeasurementsDuration)
  })
+
+  // Separate batch metrics recording for better organization
+  recordTelemetry('batch_metrics', point => {


Let's delete this new metric.
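Taken together, the review comments above point toward adding the batch fields to the existing publish point. A minimal sketch of that shape follows. The `recordTelemetry` and `point` stand-ins are hypothetical stubs so the example runs on its own (the real helper lives in common/telemetry.js), and the input values are made up; the field names come from the PR description.

```javascript
// Hypothetical stand-ins for the telemetry helper and the InfluxDB point.
const recorded = []
function recordTelemetry (name, fn) {
  const fields = {}
  const point = {
    intField (key, value) { fields[key] = Math.trunc(value); return point },
    floatField (key, value) { fields[key] = value; return point }
  }
  fn(point)
  recorded.push({ name, fields })
}

// Example inputs (assumed values, for illustration only).
const measurementCount = 4
const batchSizeBytes = 1000

recordTelemetry('publish', point => {
  // ...existing fields such as add_measurements_duration_ms stay as they are...
  point.intField('batch_size_bytes', batchSizeBytes)
  point.intField('measurement_count', measurementCount)
  // Guard against division by zero for an empty batch.
  point.floatField(
    'avg_measurement_size_bytes',
    measurementCount > 0 ? batchSizeBytes / measurementCount : 0
  )
})

console.log(recorded)
```

No `cid` or `round_index` tags are written here, which keeps the series cardinality of the publish measurement unchanged.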

feat: web3 storage batch metrics

f22813c

Winter-Soren requested review from bajtos, juliangruber, pyropy and NikolasHaimerl as code owners March 27, 2025 17:42

github-project-automation bot added this to Space Meridian Mar 27, 2025

bajtos requested changes Mar 28, 2025

bajtos reviewed Mar 28, 2025

Winter-Soren changed the title ~~Web3.Storage Measurement Batch Metrics Visualization~~ Feat/511 Implement Data Collection and Visualization for Web3.Storage Measurement Batch Mar 29, 2025

Winter-Soren requested a review from bajtos March 29, 2025 18:57

pyropy requested changes Mar 31, 2025

feat: add batch size metrics to publish telemetry

3a83c09

Winter-Soren force-pushed the feat/511-web3-storage-batch-metrics branch from 9bc327d to 3a83c09 Compare April 5, 2025 09:43
