Feat/511 Implement Data Collection and Visualization for Web3.Storage Measurement Batch #560

Open
wants to merge 2 commits into
base: main
7 changes: 7 additions & 0 deletions common/telemetry.js
@@ -25,6 +25,13 @@ const networkInfoWriteClient = influx.getWriteApi(
's' // precision
)

// Add new write client for batch metrics
Contributor

Let's not add a new bucket and write client; instead, let's extend the existing publish metric.

const batchMetricsWriteClient = influx.getWriteApi(
'Filecoin Station', // org
'spark-batch-metrics', // bucket
'ns' // precision
)

setInterval(() => {
publishWriteClient.flush().catch(console.error)
networkInfoWriteClient.flush().catch(console.error)
11 changes: 11 additions & 0 deletions publish/index.js
@@ -53,6 +53,11 @@ export const publish = async ({

logger.log(`Publishing ${measurements.length} measurements. Total unpublished: ${totalCount}. Batch size: ${maxMeasurements}.`)

// Calculate batch size in bytes
const batchSizeBytes = Buffer.byteLength(
Contributor

Have you found another way to calculate the batch size without serialising the objects to JSON? Depending on the batch size, that might consume a lot of memory.

Member

+1

We have the following code a few lines below:

  const file = new File(
    [measurements.map(m => JSON.stringify(m)).join('\n')],
    'measurements.ndjson',
    { type: 'application/json' }
  )

Please refactor it so that we create only one copy of measurements.map(m => JSON.stringify(m)).join('\n').
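
One possible shape for that refactor is sketched below: serialise the batch once, measure the resulting string, and hand the same string to the `File` constructor. The helper name `serializeBatch` and the sample measurements are hypothetical and not part of this PR; the sketch assumes a Node.js runtime where `Buffer` is available as a global.

```javascript
// Hypothetical helper (not in the PR): build the NDJSON payload once and
// report its UTF-8 size, so the payload is not serialised a second time.
function serializeBatch (measurements) {
  const ndjson = measurements.map(m => JSON.stringify(m)).join('\n')
  // Buffer.byteLength counts encoded bytes, not UTF-16 code units.
  const sizeBytes = Buffer.byteLength(ndjson, 'utf8')
  return { ndjson, sizeBytes }
}

// Made-up sample data, for illustration only.
const measurements = [
  { cid: 'bafy-example', status: 200 },
  { cid: 'bafy-other', status: 504 }
]
const { ndjson, sizeBytes } = serializeBatch(measurements)

// The same string can then back the upload, so no second copy is built:
// const file = new File([ndjson], 'measurements.ndjson', { type: 'application/json' })
```

This still holds one full copy of the serialised batch in memory, which the upload needs anyway; it only avoids creating a second one for the size calculation.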

measurements.map(m => JSON.stringify(m)).join('\n')
)

// Share measurements
const start = new Date()
const file = new File(
@@ -126,7 +131,9 @@ export const publish = async ({

logger.log('Done!')

// Telemetry recording with batch size metrics
recordTelemetry('publish', point => {
// Existing metrics
Contributor

Let's extend this metric with a new point that collects the batch size.

Member

Let me correct that - we should add new fields to the existing point.
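
A minimal sketch of what adding the fields to the existing point could look like, with a guard for an empty batch (in JavaScript `0 / 0` is `NaN`, which is not a meaningful field value). `FakePoint` is a hypothetical stand-in that only mimics the `intField` and `floatField` calls used in this diff, and `addBatchSizeFields` is an illustrative helper name, not code from the PR. Whether an empty batch can actually reach this code path is an assumption.

```javascript
// Hypothetical stand-in for the telemetry point; it only records fields.
class FakePoint {
  constructor () { this.fields = {} }
  intField (name, value) { this.fields[name] = Math.trunc(value); return this }
  floatField (name, value) { this.fields[name] = Number(value); return this }
}

// Illustrative helper: attach batch size fields to an existing point.
function addBatchSizeFields (point, batchSizeBytes, measurementCount) {
  point.intField('batch_size_bytes', batchSizeBytes)
  // Skip the average for an empty batch to avoid recording NaN.
  if (measurementCount > 0) {
    point.floatField('avg_measurement_size_bytes', batchSizeBytes / measurementCount)
  }
  return point
}

const point = addBatchSizeFields(new FakePoint(), 69, 2)
const emptyPoint = addBatchSizeFields(new FakePoint(), 0, 0)
```

Inside the real `recordTelemetry('publish', point => { ... })` callback, the same two calls would sit next to the existing fields.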

point.intField('round_index', roundIndex)
point.intField('measurements', measurements.length)
point.floatField('load', totalCount / maxMeasurements)
@@ -135,6 +142,10 @@ export const publish = async ({
uploadMeasurementsDuration
)
point.intField('add_measurements_duration_ms', ieAddMeasurementsDuration)

// Add batch size metrics to existing publish point
point.intField('batch_size_bytes', batchSizeBytes)
point.floatField('avg_measurement_size_bytes', batchSizeBytes / measurements.length)
})
}
