[IcebergIO] Support filter pushdown during reads #34827
ahmedabu98 merged 16 commits into apache:master
sdks/java/io/iceberg/src/main/java/org/apache/beam/sdk/io/iceberg/FilterUtils.java
Assigning reviewers. If you would like to opt out of this review, comment R: @robertwb for label java. Available commands:
The PR bot will only process comments in the main thread (not review comments).

Hi @ahmedabu98, are you still trying to fix the tests or is this truly ready for review again? Thanks!

Test failures are irrelevant, this is ready for a review

Hi @robertwb and @kennknowles - please review when you get a chance. Thanks!
 * Utilities that convert between a SQL filter expression and an Iceberg {@link Expression}. Uses
 * Apache Calcite semantics.
 *
 * <p>Note: Only supports top-level fields (i.e. cannot reference nested fields).
Let's make sure we clearly fail for unsupported queries.
sdks/java/io/iceberg/src/main/java/org/apache/beam/sdk/io/iceberg/ScanTaskReader.java
sdks/java/io/iceberg/src/main/java/org/apache/beam/sdk/io/iceberg/FilterUtils.java
              call,
              schema);
        case NOT_EQ:
          return convertFieldAndLiteral(
Let's try to do 100% test coverage for this file.
Tried to do that in FilterUtilsTest. Let me know if anything is missing
sdks/java/io/iceberg/src/test/java/org/apache/beam/sdk/io/iceberg/FilterUtilsTest.java
sdks/java/io/iceberg/src/test/java/org/apache/beam/sdk/io/iceberg/IcebergIOReadTest.java
sdks/java/io/iceberg/src/main/java/org/apache/beam/sdk/io/iceberg/IcebergScanConfig.java
Failing test is unrelated -- merging now

Any ETA when this change will be released officially?
Part of #34789
Allows users to pass a SQL expression to filter files and rows when scanning. For example:
"colA" = 'SUCCESS' AND "date" < '2025-05-06'
Uses Apache Calcite to parse SQL expressions (see doc reference: https://calcite.apache.org/docs/reference.html)
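
To illustrate what "filter files and rows when scanning" means for the example expression, here is a minimal conceptual sketch in plain Java. It is not the code from this PR and does not use Beam, Iceberg, or Calcite: the class `FilterPushdownSketch`, the `Row` and `DataFile` types, and the per-file `minDate` statistic are all hypothetical names made up for the illustration, and the filter is hard-coded rather than parsed from SQL. It assumes dates are ISO-8601 strings, so string ordering matches date ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FilterPushdownSketch {
  // Upper bound from the example filter: "date" < '2025-05-06'.
  static final String DATE_BOUND = "2025-05-06";

  /** Hypothetical row holding the two columns referenced by the example filter. */
  static final class Row {
    final String colA;
    final String date; // ISO-8601 (yyyy-MM-dd), so string order matches date order

    Row(String colA, String date) {
      this.colA = colA;
      this.date = date;
    }
  }

  /** Hypothetical data file carrying a minimum-value statistic for "date". */
  static final class DataFile {
    final String minDate;
    final List<Row> rows;

    DataFile(String minDate, List<Row> rows) {
      this.minDate = minDate;
      this.rows = rows;
    }
  }

  /** Row-level check for: "colA" = 'SUCCESS' AND "date" < '2025-05-06'. */
  static boolean rowMatches(Row row) {
    return "SUCCESS".equals(row.colA) && row.date.compareTo(DATE_BOUND) < 0;
  }

  /** File-level check: if even the smallest date is not below the bound, no row can match. */
  static boolean fileMayMatch(DataFile file) {
    return file.minDate.compareTo(DATE_BOUND) < 0;
  }

  /** Keeps only the files whose statistics say they might contain matching rows. */
  static List<DataFile> filesToRead(List<DataFile> files) {
    List<DataFile> kept = new ArrayList<>();
    for (DataFile file : files) {
      if (fileMayMatch(file)) {
        kept.add(file);
      }
    }
    return kept;
  }

  /** Reads the surviving files and applies the row-level filter to each row. */
  static List<Row> scan(List<DataFile> files) {
    List<Row> out = new ArrayList<>();
    for (DataFile file : filesToRead(files)) {
      for (Row row : file.rows) {
        if (rowMatches(row)) {
          out.add(row);
        }
      }
    }
    return out;
  }

  /** Small made-up data set: one file overlaps the filter range, one does not. */
  static List<DataFile> sampleFiles() {
    DataFile early = new DataFile("2025-05-01", Arrays.asList(
        new Row("SUCCESS", "2025-05-01"),
        new Row("FAILED", "2025-05-02"),
        new Row("SUCCESS", "2025-05-07")));
    DataFile late = new DataFile("2025-05-10", Arrays.asList(
        new Row("SUCCESS", "2025-05-10")));
    return Arrays.asList(early, late);
  }

  public static void main(String[] args) {
    List<DataFile> files = sampleFiles();
    System.out.println("files read: " + filesToRead(files).size() + " of " + files.size());
    System.out.println("rows kept: " + scan(files).size());
  }
}
```

In this sketch the second file is skipped without reading any of its rows, which is the benefit of pushing the filter down to the scan; the row-level check is still needed because a file that passes the statistics check can contain non-matching rows.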