Use multi-get; native bucket scanner in segment stats computation #8469


Merged: 67 commits, Apr 8, 2025
Commits (67)
4a8f067
WIP: Improve segment index performance
fm3 Mar 20, 2025
45adb6a
implementation with triple cast on every element, not faster than scala
fm3 Mar 20, 2025
398e757
measure what else is slow
fm3 Mar 20, 2025
3b8fd9f
wip cleanup, measure other stuff
fm3 Mar 24, 2025
ecb2c9a
Merge branch 'master' into segment-index-perf
fm3 Mar 24, 2025
be8ba6f
wip: use volume bucket buffer, fossil multi-put
fm3 Mar 24, 2025
0f29acb
logging
fm3 Mar 24, 2025
79312c8
WIP rewrite segmentIndexBuffer to include caching, use Set
fm3 Mar 25, 2025
f515bfc
use new segment index buffer functionality
fm3 Mar 25, 2025
ad7eeff
WIP fossil multi-get for buckets
fm3 Mar 25, 2025
7693b78
fix setting volumeBucketDataHasChanged
fm3 Mar 25, 2025
a3a36ce
do not parallelize prefill
fm3 Mar 25, 2025
0d5c9df
Merge branch 'master' into segment-index-perf
fm3 Mar 26, 2025
8264691
use batched multi-get and multi-put
fm3 Mar 26, 2025
9353c6b
connect to temporaryStore, compress if needed
fm3 Mar 26, 2025
9dc7d31
load buckets from temporarystore, remote fallback layer
fm3 Mar 26, 2025
d84ad19
use multi-get; native bucket scanner in segment stats computation
fm3 Mar 26, 2025
e848ded
fix bucketPosition conversion
fm3 Mar 26, 2025
fabfdd8
include fallback data
fm3 Mar 26, 2025
2cea8b5
load buckets from fallback layer in one request
fm3 Mar 27, 2025
f6290c1
Merge branch 'master' into segment-index-perf
fm3 Mar 27, 2025
b9c75ce
Merge branch 'segment-index-perf' into segment-index-perf-fallback-ba…
fm3 Mar 27, 2025
ab0f2d7
some cleanup
fm3 Mar 27, 2025
9f1872f
readerOnly segmentIndexBuffer
fm3 Mar 27, 2025
10c2c92
cleanup
fm3 Mar 27, 2025
2dd76bc
cleanup
fm3 Mar 27, 2025
9bdb487
format, remove logging
fm3 Mar 27, 2025
d400771
debug wrong volume values for editable mapping (id seems to be mapped…
fm3 Mar 27, 2025
6247a9a
Merge branch 'master' into segment-index-perf
fm3 Mar 31, 2025
27129b9
request correct version of editable mapping data
fm3 Mar 31, 2025
f0ff544
changelog, migration
fm3 Mar 31, 2025
eee5bbc
cleanup
fm3 Mar 31, 2025
bf99be3
Merge branch 'master' into segment-index-perf
fm3 Mar 31, 2025
acc1c0e
Merge branch 'segment-index-perf' into volume-stats-perf
fm3 Mar 31, 2025
b6719c2
implement also for bounding box
fm3 Mar 31, 2025
4cf95d7
make cpp types more explicit
fm3 Mar 31, 2025
7142762
expose loadMultiple via bucketProvider and binaryDataService
fm3 Mar 31, 2025
4a37090
cleanup
fm3 Mar 31, 2025
e7e5356
unused import
fm3 Mar 31, 2025
28b994d
add error logging (also in tracingstore), skip requesting buckets tha…
fm3 Apr 1, 2025
5663bc6
apply mappings if needed, cleanup
fm3 Apr 1, 2025
f6ef112
lint
fm3 Apr 1, 2025
5275c20
Merge branch 'master' into segment-index-perf
fm3 Apr 1, 2025
5352edf
clean up cpp code
fm3 Apr 1, 2025
1312de4
Merge branch 'segment-index-perf' into volume-stats-perf
fm3 Apr 1, 2025
993b9cc
const, try/catch
fm3 Apr 1, 2025
d80da1b
clang-format
fm3 Apr 1, 2025
b6ac259
size_t for index
fm3 Apr 1, 2025
b55f893
applicationHealthService only exists on datastore side, it can stay t…
fm3 Apr 1, 2025
6583bb9
unused import
fm3 Apr 1, 2025
3b7b4af
Do not cache EditableMappingBucketProvider across versions
fm3 Apr 2, 2025
21bfbec
pr feedback part 1; consistent volumeLayer naming
fm3 Apr 2, 2025
5292a33
use fossil multi-get when requesting multiple segmentIds from segment…
fm3 Apr 2, 2025
2b71386
use map for more efficient lookups during gathering segmentIndex values
fm3 Apr 2, 2025
8bc345e
add one more conversion from set to seq for less map overhead
fm3 Apr 2, 2025
ab98fc4
Update webknossos-tracingstore/app/com/scalableminds/webknossos/traci…
fm3 Apr 2, 2025
a2bd486
Update webknossos-tracingstore/app/com/scalableminds/webknossos/traci…
fm3 Apr 2, 2025
a1ceb36
Merge branch 'master' into segment-index-perf
fm3 Apr 3, 2025
99bcdbe
Merge branch 'segment-index-perf' into volume-stats-perf
fm3 Apr 3, 2025
7aef564
Merge branch 'master' into volume-stats-perf
fm3 Apr 3, 2025
694a766
fixes after merge
fm3 Apr 3, 2025
f3ff1c9
changelog
fm3 Apr 3, 2025
18eb48b
Merge branch 'master' into volume-stats-perf
fm3 Apr 3, 2025
5fcafce
Merge branch 'master' into volume-stats-perf
fm3 Apr 7, 2025
b08a51f
implement pr feedback
fm3 Apr 7, 2025
410914d
Merge branch 'master' into volume-stats-perf
fm3 Apr 8, 2025
ad36432
implement cpp feedback (cache locality)
fm3 Apr 8, 2025
1 change: 1 addition & 0 deletions CHANGELOG.unreleased.md
@@ -15,6 +15,7 @@ For upgrade instructions, please check the [migration guide](MIGRATIONS.released
- The opacity of meshes can be adjusted using the 'Change Segment Color' context menu entry in the segments tab. [#8443](https://github.com/scalableminds/webknossos/pull/8443)
- The maximum available storage of an organization is now enforced during upload. [#8385](https://github.com/scalableminds/webknossos/pull/8385)
- Performance improvements for volume annotation save requests. [#8460](https://github.com/scalableminds/webknossos/pull/8460)
- Performance improvements for segment statistics (volume + bounding box in context menu). [#8469](https://github.com/scalableminds/webknossos/pull/8469)

### Changed

4 changes: 2 additions & 2 deletions util/src/main/scala/com/scalableminds/util/tools/Fox.scala
@@ -88,7 +88,7 @@ object Fox extends FoxImplicits {
}

// run serially, return individual results in list of box
def serialSequenceBox[A, B](l: List[A])(f: A => Fox[B])(implicit ec: ExecutionContext): Future[List[Box[B]]] = {
def serialSequenceBox[A, B](l: Seq[A])(f: A => Fox[B])(implicit ec: ExecutionContext): Future[List[Box[B]]] = {
def runNext(remaining: List[A], results: List[Box[B]]): Future[List[Box[B]]] =
remaining match {
case head :: tail =>
@@ -99,7 +99,7 @@
case Nil =>
Future.successful(results.reverse)
}
runNext(l, Nil)
runNext(l.toList, Nil)
}

def sequence[T](l: List[Fox[T]])(implicit ec: ExecutionContext): Future[List[Box[T]]] =
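The `Fox.serialSequenceBox` change above widens the parameter from `List[A]` to `Seq[A]`; because the inner `runNext` pattern-matches on the `head :: tail` list extractor, the input is converted once via `l.toList`. A minimal sketch of the same run-serially-and-collect-each-result pattern, using plain `Try` instead of the project's `Fox`/`Box` types (the name `serialSequenceTry` is illustrative, not from the codebase):

```scala
import scala.util.Try

// Run f over the elements serially, collecting each result as a Try,
// so one failure does not abort the remaining elements.
def serialSequenceTry[A, B](l: Seq[A])(f: A => B): List[Try[B]] = {
  @annotation.tailrec
  def runNext(remaining: List[A], results: List[Try[B]]): List[Try[B]] =
    remaining match {
      case head :: tail => runNext(tail, Try(f(head)) :: results)
      case Nil          => results.reverse
    }
  runNext(l.toList, Nil) // Seq -> List so the :: extractor applies
}
```

For example, `serialSequenceTry(Vector(1, 2, 0))(10 / _)` yields two successes followed by one `Failure(ArithmeticException)`, mirroring how `serialSequenceBox` returns one `Box` per element.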
@@ -25,7 +25,7 @@ class DataStoreModule extends AbstractModule {
bind(classOf[AgglomerateService]).asEagerSingleton()
bind(classOf[AdHocMeshServiceHolder]).asEagerSingleton()
bind(classOf[ApplicationHealthService]).asEagerSingleton()
bind(classOf[DatasetErrorLoggingService]).asEagerSingleton()
bind(classOf[DSDatasetErrorLoggingService]).asEagerSingleton()
bind(classOf[MeshFileService]).asEagerSingleton()
}
}
@@ -43,7 +43,7 @@ class DataSourceController @Inject()(
connectomeFileService: ConnectomeFileService,
segmentIndexFileService: SegmentIndexFileService,
storageUsageService: DSUsedStorageService,
datasetErrorLoggingService: DatasetErrorLoggingService,
datasetErrorLoggingService: DSDatasetErrorLoggingService,
exploreRemoteLayerService: ExploreRemoteLayerService,
uploadService: UploadService,
composeService: ComposeService,
@@ -3,9 +3,16 @@ package com.scalableminds.webknossos.datastore.dataformats
import com.scalableminds.util.accesscontext.TokenContext
import com.scalableminds.util.tools.Fox
import com.scalableminds.webknossos.datastore.models.requests.DataReadInstruction
import net.liftweb.common.Box

import scala.concurrent.ExecutionContext

trait BucketProvider {
def load(readInstruction: DataReadInstruction)(implicit ec: ExecutionContext, tc: TokenContext): Fox[Array[Byte]]

def loadMultiple(readInstructions: Seq[DataReadInstruction])(implicit ec: ExecutionContext,
tc: TokenContext): Fox[Seq[Box[Array[Byte]]]] =
Fox.serialSequenceBox(readInstructions) { readInstruction =>
load(readInstruction)
}
}
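The new `loadMultiple` above ships with a default implementation that simply loads each bucket serially and wraps every result in a `Box`; concrete providers backed by a store with a native multi-get (such as FossilDB) can override it with one batched call. A simplified sketch of that trait shape, with `Try` standing in for `Fox`/`Box` and illustrative type names that are not the real datastore API:

```scala
import scala.util.Try

// Illustrative stand-ins for the datastore's types (not the real API):
type ReadInstruction = String
type BucketBytes = Array[Byte]

trait SimpleBucketProvider {
  def load(instruction: ReadInstruction): BucketBytes

  // Default: serial fallback, one load() per instruction, one Try per result.
  // Backends with a batched read path can override this with a single call.
  def loadMultiple(instructions: Seq[ReadInstruction]): Seq[Try[BucketBytes]] =
    instructions.map(i => Try(load(i)))
}
```

Returning one boxed result per instruction (rather than failing the whole batch) lets the caller skip missing buckets while still surfacing individual failures, which matches the `Seq[Box[Array[Byte]]]` shape in the diff.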
@@ -8,4 +8,24 @@ class NativeBucketScanner() {
bytesPerElement: Int,
isSigned: Boolean,
skipZeroes: Boolean): Array[Long]

@native def countSegmentVoxels(bucketBytes: Array[Byte],
bytesPerElement: Int,
isSigned: Boolean,
segmentId: Long): Long

@native def extendSegmentBoundingBox(bucketBytes: Array[Byte],
bytesPerElement: Int,
isSigned: Boolean,
bucketLength: Int,
segmentId: Long,
bucketTopLeftX: Int,
bucketTopLeftY: Int,
bucketTopLeftZ: Int,
existingBBoxTopLeftX: Int,
existingBBoxTopLeftY: Int,
existingBBoxTopLeftZ: Int,
existingBBoxBottomRightX: Int,
existingBBoxBottomRightY: Int,
existingBBoxBottomRightZ: Int): Array[Int]
}
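The two new `@native` methods move the per-voxel scan into C++ for speed. As a reading aid, here is a pure-Scala reference for what `countSegmentVoxels` computes: count the elements of the bucket byte array equal to `segmentId`. This sketch assumes little-endian element encoding and handles only the unsigned case (the native method's `isSigned` flag is elided here); it is an illustration of the semantics, not the actual implementation:

```scala
import java.nio.{ByteBuffer, ByteOrder}

// Reference semantics for the JNI countSegmentVoxels: count voxels whose
// value equals segmentId, for 1/2/4/8-byte elements (assumed little-endian,
// unsigned; the native scanner additionally handles signed element classes).
def countSegmentVoxelsScala(bucketBytes: Array[Byte],
                            bytesPerElement: Int,
                            segmentId: Long): Long = {
  val buf = ByteBuffer.wrap(bucketBytes).order(ByteOrder.LITTLE_ENDIAN)
  val numElements = bucketBytes.length / bytesPerElement
  var count = 0L
  var i = 0
  while (i < numElements) {
    val value = bytesPerElement match {
      case 1 => buf.get(i) & 0xffL
      case 2 => buf.getShort(i * 2) & 0xffffL
      case 4 => buf.getInt(i * 4) & 0xffffffffL
      case 8 => buf.getLong(i * 8)
      case _ => throw new IllegalArgumentException("unsupported bytesPerElement")
    }
    if (value == segmentId) count += 1
    i += 1
  }
  count
}
```

Doing this scan natively avoids boxing each voxel into a `SegmentInteger` on the JVM, which is the per-element overhead the PR removes.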
@@ -3,8 +3,10 @@ package com.scalableminds.webknossos.datastore.helpers
import com.scalableminds.util.geometry.{BoundingBox, Vec3Int}
import com.scalableminds.util.tools.{Fox, FoxImplicits}
import com.scalableminds.webknossos.datastore.geometry.Vec3IntProto
import com.scalableminds.webknossos.datastore.models.datasource.DataLayer
import com.scalableminds.webknossos.datastore.models.{AdditionalCoordinate, SegmentInteger}
import com.scalableminds.webknossos.datastore.models.datasource.{DataLayer, ElementClass}
import com.scalableminds.webknossos.datastore.models.AdditionalCoordinate
import net.liftweb.common.{Box, Empty, Failure, Full}
import net.liftweb.common.Box.tryo
import play.api.libs.json.{Json, OFormat}

import scala.concurrent.ExecutionContext
@@ -19,35 +21,44 @@

trait SegmentStatistics extends ProtoGeometryImplicits with FoxImplicits {

def calculateSegmentVolume(segmentId: Long,
mag: Vec3Int,
additionalCoordinates: Option[Seq[AdditionalCoordinate]],
getBucketPositions: (Long, Vec3Int) => Fox[Set[Vec3IntProto]],
getTypedDataForBucketPosition: (
Vec3Int,
Vec3Int,
Option[Seq[AdditionalCoordinate]]) => Fox[Array[SegmentInteger]])(
protected def bucketScanner: NativeBucketScanner

def calculateSegmentVolume(
segmentId: Long,
mag: Vec3Int,
additionalCoordinates: Option[Seq[AdditionalCoordinate]],
getBucketPositions: (Long, Vec3Int) => Fox[Set[Vec3IntProto]],
getDataForBucketPositions: (
Seq[Vec3Int],
Vec3Int,
Option[Seq[AdditionalCoordinate]]) => Fox[(Seq[Box[Array[Byte]]], ElementClass.Value)])(
implicit ec: ExecutionContext): Fox[Long] =
for {
bucketPositionsProtos: Set[Vec3IntProto] <- getBucketPositions(segmentId, mag)
bucketPositionsInMag = bucketPositionsProtos.map(vec3IntFromProto)
volumeBoxes <- Fox.serialSequence(bucketPositionsInMag.toList)(bucketPosition =>
for {
dataTyped: Array[SegmentInteger] <- getTypedDataForBucketPosition(bucketPosition, mag, additionalCoordinates)
count = dataTyped.count(segmentInteger => segmentInteger.toLong == segmentId)
} yield count.toLong)
counts <- Fox.combined(volumeBoxes.map(_.toFox))
(bucketBoxes, elementClass) <- getDataForBucketPositions(bucketPositionsInMag.toSeq, mag, additionalCoordinates)
counts <- Fox.serialCombined(bucketBoxes.toList) {
case Full(bucketBytes) =>
tryo(
bucketScanner.countSegmentVoxels(bucketBytes,
ElementClass.bytesPerElement(elementClass),
ElementClass.isSigned(elementClass),
segmentId)).toFox
case Empty => Full(0L).toFox
case f: Failure => f.toFox
}
} yield counts.sum

// Returns the bounding box in voxels in the target mag
def calculateSegmentBoundingBox(segmentId: Long,
mag: Vec3Int,
additionalCoordinates: Option[Seq[AdditionalCoordinate]],
getBucketPositions: (Long, Vec3Int) => Fox[Set[Vec3IntProto]],
getTypedDataForBucketPosition: (
Vec3Int,
Vec3Int,
Option[Seq[AdditionalCoordinate]]) => Fox[Array[SegmentInteger]])(
def calculateSegmentBoundingBox(
segmentId: Long,
mag: Vec3Int,
additionalCoordinates: Option[Seq[AdditionalCoordinate]],
getBucketPositions: (Long, Vec3Int) => Fox[Set[Vec3IntProto]],
getDataForBucketPositions: (
Seq[Vec3Int],
Vec3Int,
Option[Seq[AdditionalCoordinate]]) => Fox[(Seq[Box[Array[Byte]]], ElementClass.Value)])(
implicit ec: ExecutionContext): Fox[BoundingBox] =
for {
allBucketPositions: Set[Vec3IntProto] <- getBucketPositions(segmentId, mag)
@@ -58,14 +69,16 @@ trait SegmentStatistics extends ProtoGeometryImplicits with FoxImplicits {
Int.MinValue,
Int.MinValue,
Int.MinValue) //topleft, bottomright
_ <- Fox.serialCombined(relevantBucketPositions.iterator)(
bucketPosition =>
extendBoundingBoxByData(mag,
segmentId,
boundingBoxMutable,
bucketPosition,
additionalCoordinates,
getTypedDataForBucketPosition))
(bucketBoxes, elementClass) <- getDataForBucketPositions(relevantBucketPositions.toSeq,
mag,
additionalCoordinates)
_ <- Fox.serialCombined(relevantBucketPositions.zip(bucketBoxes)) {
case (bucketPosition, Full(bucketData)) =>
Fox.successful(
extendBoundingBoxByData(segmentId, boundingBoxMutable, bucketPosition, bucketData, elementClass))
case (_, Empty) => Fox.successful(())
case (_, f: Failure) => f.toFox
}
} yield
if (boundingBoxMutable.exists(item => item == Int.MaxValue || item == Int.MinValue)) {
BoundingBox.empty
@@ -93,44 +106,30 @@
.map(vec3IntFromProto)
}

private def extendBoundingBoxByData(mag: Vec3Int,
segmentId: Long,
mutableBoundingBox: scala.collection.mutable.ListBuffer[Int],
private def extendBoundingBoxByData(segmentId: Long,
boundingBoxMutable: scala.collection.mutable.ListBuffer[Int],
bucketPosition: Vec3Int,
additionalCoordinates: Option[Seq[AdditionalCoordinate]],
getTypedDataForBucketPosition: (
Vec3Int,
Vec3Int,
Option[Seq[AdditionalCoordinate]]) => Fox[Array[SegmentInteger]]): Fox[Unit] =
for {
dataTyped: Array[SegmentInteger] <- getTypedDataForBucketPosition(bucketPosition, mag, additionalCoordinates)
bucketTopLeftInTargetMagVoxels = bucketPosition * DataLayer.bucketLength
_ = scanDataAndExtendBoundingBox(dataTyped, bucketTopLeftInTargetMagVoxels, segmentId, mutableBoundingBox)
} yield ()

private def scanDataAndExtendBoundingBox(dataTyped: Array[SegmentInteger],
bucketTopLeftInTargetMagVoxels: Vec3Int,
segmentId: Long,
mutableBoundingBox: scala.collection.mutable.ListBuffer[Int]): Unit =
for {
x <- 0 until DataLayer.bucketLength
y <- 0 until DataLayer.bucketLength
z <- 0 until DataLayer.bucketLength
index = z * DataLayer.bucketLength * DataLayer.bucketLength + y * DataLayer.bucketLength + x
} yield {
if (dataTyped(index).toLong == segmentId) {
val voxelPosition = bucketTopLeftInTargetMagVoxels + Vec3Int(x, y, z)
extendBoundingBoxByPosition(mutableBoundingBox, voxelPosition)
}
}

private def extendBoundingBoxByPosition(mutableBoundingBox: scala.collection.mutable.ListBuffer[Int],
position: Vec3Int): Unit = {
mutableBoundingBox(0) = Math.min(mutableBoundingBox(0), position.x)
mutableBoundingBox(1) = Math.min(mutableBoundingBox(1), position.y)
mutableBoundingBox(2) = Math.min(mutableBoundingBox(2), position.z)
mutableBoundingBox(3) = Math.max(mutableBoundingBox(3), position.x)
mutableBoundingBox(4) = Math.max(mutableBoundingBox(4), position.y)
mutableBoundingBox(5) = Math.max(mutableBoundingBox(5), position.z)
bucketBytes: Array[Byte],
elementClass: ElementClass.Value): Unit = {
val bucketTopLeftInTargetMagVoxels = bucketPosition * DataLayer.bucketLength
val extendedBBArray: Array[Int] =
bucketScanner.extendSegmentBoundingBox(
bucketBytes,
ElementClass.bytesPerElement(elementClass),
ElementClass.isSigned(elementClass),
DataLayer.bucketLength,
segmentId,
bucketTopLeftInTargetMagVoxels.x,
bucketTopLeftInTargetMagVoxels.y,
bucketTopLeftInTargetMagVoxels.z,
boundingBoxMutable(0),
boundingBoxMutable(1),
boundingBoxMutable(2),
boundingBoxMutable(3),
boundingBoxMutable(4),
boundingBoxMutable(5)
)
(0 until 6).foreach(i => boundingBoxMutable(i) = extendedBBArray(i))
}

}
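The diff replaces the per-voxel Scala loop with one native `extendSegmentBoundingBox` call per bucket, but the merge semantics are unchanged from the deleted `extendBoundingBoxByPosition`: the six-element buffer holds `(minX, minY, minZ, maxX, maxY, maxZ)`, initialized to `Int.MaxValue`/`Int.MinValue` so the first matching voxel defines the box. A sketch of that merge step, kept from the old Scala code:

```scala
// Merge one voxel position into a mutable bounding box laid out as
// (minX, minY, minZ, maxX, maxY, maxZ) — the same layout the native
// extendSegmentBoundingBox receives and returns.
def extendByPosition(bbox: Array[Int], x: Int, y: Int, z: Int): Unit = {
  bbox(0) = math.min(bbox(0), x)
  bbox(1) = math.min(bbox(1), y)
  bbox(2) = math.min(bbox(2), z)
  bbox(3) = math.max(bbox(3), x)
  bbox(4) = math.max(bbox(4), y)
  bbox(5) = math.max(bbox(5), z)
}
```

This also explains the final check in `calculateSegmentBoundingBox`: if any entry still equals `Int.MaxValue` or `Int.MinValue` after all buckets are scanned, no voxel matched and the result is `BoundingBox.empty`.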
@@ -19,7 +19,9 @@ case class DataServiceDataRequest(
dataLayer: DataLayer,
cuboid: Cuboid,
settings: DataServiceRequestSettings
)
) {
def isSingleBucket: Boolean = cuboid.isSingleBucket(DataLayer.bucketLength)
}

case class DataReadInstruction(
baseDir: Path,