Skip to content

Conversation

@lkts
Copy link
Contributor

@lkts lkts commented Mar 27, 2025

This PR introduces usage of FallbackSyntheticSourceBlockLoader for point and geo_point fields and adds relevant tests.

@lkts lkts added >enhancement auto-backport Automatically create backport pull requests when merged :StorageEngine/Mapping The storage related side of mappings v8.19.0 labels Mar 27, 2025
@lkts lkts requested review from dnhatn and martijnvg March 27, 2025 22:40
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@elasticsearchmachine
Copy link
Collaborator

Hi @lkts, I've created a changelog YAML for you.

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks Sasha!

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks good, I just have two questions

  1. Maybe I missed something, but how is the x-pack/plugin/logsdb/property-tests/.jqwik-database file being used?
  2. The other one is on GeoPointFieldMapper.

return new BlockDocValuesReader.LongsBlockLoader(name());
}

// _ignored_source field will only be present if there is no doc_values
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand this, can someone not just use synthetic source and enable doc values? In that case there wouldn't be _source to fallback to?

Copy link
Contributor Author

@lkts lkts Mar 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure i fully understand the question. This is needed due to the blContext.fieldExtractPreference() == DOC_VALUES check above. There are two scenarios possible once we arrive here:

  1. Stored source - we'll just use blockLoaderFromSource
  2. Synthetic source. However because of the fieldExtractPreference() check above it is still possible that doc_values are present here. So we have two sub-cases:
    • doc_values are enabled - _ignored_source_ does not exist since we have doc_values and we use blockLoaderFromSource which reads "classic" synthetic source.
    • doc_values are disabled - we know that _ignored_source field is present and use special block loader.

I think the ideal scenario here is to implement yet another block loader that will directly read doc_values and transform them into a WKB format bypassing the synthetic source.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, cool, I get this now. Would you be able to add this comment to line 540?

@lkts
Copy link
Contributor Author

lkts commented Mar 28, 2025

1. Maybe I missed something, but how is the `x-pack/plugin/logsdb/property-tests/.jqwik-database` file being used?

It's not :)

if (syntheticSourceKeep.equals("all")) {
return exactValuesFromSource(values, nullValue);
}
if (syntheticSourceKeep.equals("arrays") && extractedFieldValues.documentHasObjectArrays()) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting case and i suspect it is not intentional.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate why this isn't intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't apply custom logic of synthetic_source_keep: "arrays" to fields that have parsesArrayValue() set to true. But in this context we do.

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lkts lkts merged commit f3ccde6 into elastic:main Apr 1, 2025
17 checks passed
@lkts lkts deleted the synthetic_source_block_loader_geo3 branch April 1, 2025 19:55
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 125816

@lkts
Copy link
Contributor Author

lkts commented Apr 1, 2025

💚 All backports created successfully

Status Branch Result
8.x

Questions ?

Please refer to the Backport tool documentation

lkts added a commit that referenced this pull request Apr 1, 2025
…125816) (#126079)

* Use FallbackSyntheticSourceBlockLoader for point and geo_point (#125816)

(cherry picked from commit f3ccde6)

# Conflicts:
#	server/src/test/java/org/elasticsearch/index/mapper/blockloader/BooleanFieldBlockLoaderTests.java
#	test/framework/src/main/java/org/elasticsearch/index/mapper/BlockLoaderTestCase.java
#	test/framework/src/main/java/org/elasticsearch/index/mapper/NumberFieldBlockLoaderTestCase.java

* compilation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged backport pending >enhancement :StorageEngine/Mapping The storage related side of mappings Team:StorageEngine v8.19.0 v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants