Commit 1bce184

ESQL: Document warnings behavior in CsvTests (#125441) (#125501)
`CsvTests` differs slightly from tests against real Elasticsearch indices in how it emits warnings, and this is worth documenting. I've also added an explanation to `SingleValueMatchQuery` of *exactly* when it emits a warning, because that's not *exactly* the same as when the compute engine would emit one. The resulting documents are the same, but the warnings are not.
1 parent 8f6e0ff commit 1bce184

2 files changed, +23 -1 lines changed
x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/querydsl/query/SingleValueMatchQuery.java

+11 -1
@@ -37,7 +37,17 @@
 import java.util.Objects;
 
 /**
- * Finds all fields with a single-value. If a field has a multi-value, it emits a {@link Warnings}.
+ * Finds all fields with a single-value. If a field has a multi-value, it emits
+ * a {@link Warnings warning}.
+ * <p>
+ * Warnings are only emitted if the {@link TwoPhaseIterator#matches}. Meaning that,
+ * if the other query skips the doc either because the index doesn't match or because it's
+ * {@link TwoPhaseIterator#matches} doesn't match, then we won't log warnings. So it's
+ * most safe to say that this will emit a warning if the document would have
+ * matched but for having a multivalued field. If the document doesn't match but
+ * "almost" matches in some fairly lucene-specific ways then it *might* emit
+ * a warning.
+ * </p>
  */
 public final class SingleValueMatchQuery extends Query {

x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/CsvTests.java

+12
@@ -30,6 +30,7 @@
 import org.elasticsearch.compute.operator.DriverRunner;
 import org.elasticsearch.compute.operator.exchange.ExchangeSinkHandler;
 import org.elasticsearch.compute.operator.exchange.ExchangeSourceHandler;
+import org.elasticsearch.compute.querydsl.query.SingleValueMatchQuery;
 import org.elasticsearch.core.Releasables;
 import org.elasticsearch.core.Tuple;
 import org.elasticsearch.index.IndexMode;
@@ -153,6 +154,17 @@
  * it’s creating its own Source physical operator, aggregation operator (just a tiny bit of it) and field extract operator.
  * <p>
  * To log the results logResults() should return "true".
+ * <p>
+ * This test never pushes to Lucene because there isn't a Lucene index to push to. It always runs everything in
+ * the compute engine. This yields the same results modulo a few things:
+ * <ul>
+ *     <li>Warnings for multivalued fields: See {@link SingleValueMatchQuery} for an in depth discussion, but the
+ *         short version is this class will always emit warnings on multivalued fields but tests that run against
+ *         a real index are only guaranteed to emit a warning if the document would match all filters <strong>except</strong>
+ *         it has a multivalue field.</li>
+ *     <li>Sorting: This class emits values in the order they appear in the {@code .csv} files that power it. A real
+ *         index emits documents a fair random order. Multi-shard and multi-node tests doubly so.</li>
+ * </ul>
  */
 // @TestLogging(value = "org.elasticsearch.xpack.esql:TRACE,org.elasticsearch.compute:TRACE", reason = "debug")
 public class CsvTests extends ESTestCase {
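
The warnings difference described in the two Javadoc comments can be illustrated with a small standalone sketch. It is not Elasticsearch code: `WarningSketch`, `Doc`, `computeEngine` and `pushedDown` are hypothetical names, and the filter `a == 1 AND b == 2` is an assumed example. It models only the guaranteed behavior (a warning when the document would match every other filter) and ignores the "almost matches" cases where a real index *might* still warn.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the two evaluation paths for
// the assumed filter: a == 1 AND b == 2, where `a` may be multivalued.
public class WarningSketch {
    // Stand-in for a document; not an Elasticsearch type.
    public static final class Doc {
        final List<Integer> a;
        final int b;

        public Doc(List<Integer> a, int b) {
            this.a = a;
            this.b = b;
        }
    }

    // Compute-engine style (the path CsvTests always takes): the
    // single-value check runs on every row, so every multivalued `a` warns.
    public static List<Doc> computeEngine(List<Doc> docs, List<String> warnings) {
        List<Doc> out = new ArrayList<>();
        for (Doc d : docs) {
            boolean aMatches = false;
            if (d.a.size() != 1) {
                warnings.add("multivalued field [a]");
            } else {
                aMatches = d.a.get(0) == 1;
            }
            if (aMatches && d.b == 2) {
                out.add(d);
            }
        }
        return out;
    }

    // Pushed-down style: the other filter can skip a document before the
    // single-value check ever sees it, so skipped documents never warn.
    public static List<Doc> pushedDown(List<Doc> docs, List<String> warnings) {
        List<Doc> out = new ArrayList<>();
        for (Doc d : docs) {
            if (d.b != 2) {
                continue; // skipped by the other filter: no warning
            }
            if (d.a.size() != 1) {
                warnings.add("multivalued field [a]");
                continue;
            }
            if (d.a.get(0) == 1) {
                out.add(d);
            }
        }
        return out;
    }
}
```

Under these assumptions both paths return the same documents, but a document with a multivalued `a` and a non-matching `b` produces a warning only on the compute-engine path.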
