feat: Support table format: Iceberg, Delta, and Hudi #5650
Signed-off-by: HaoXuAI <[email protected]>
protos/feast/core/DataSource.proto (outdated)

    string date_partition_column_format = 5;

    // Table Format (e.g. iceberg, delta, etc)
    string table_format = 6;
TODO, create TableFormat proto, consolidate with FileFormat proto
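To make the TODO concrete, here is a minimal sketch of how the two concepts might be kept apart on the Python side. The names `FileFormat`, `TableFormat`, and `parse_table_format` are hypothetical and are not taken from this PR or from the Feast API; the format lists simply mirror the ones mentioned in this conversation.

```python
from enum import Enum
from typing import Optional


class FileFormat(Enum):
    """Physical encoding of individual data files (hypothetical type)."""

    PARQUET = "parquet"
    AVRO = "avro"
    CSV = "csv"
    JSON = "json"


class TableFormat(Enum):
    """Table layer that manages metadata on top of data files (hypothetical type)."""

    ICEBERG = "iceberg"
    DELTA = "delta"
    HUDI = "hudi"


def parse_table_format(value: Optional[str]) -> Optional[TableFormat]:
    """Map a free-form string such as the `table_format` field to an enum member.

    An empty or missing value is treated as "no table format", which is how an
    unset proto3 string field (default "") would arrive here.
    """
    if not value:
        return None
    try:
        return TableFormat(value.strip().lower())
    except ValueError:
        supported = ", ".join(fmt.value for fmt in TableFormat)
        raise ValueError(
            f"Unsupported table format {value!r}; expected one of: {supported}"
        ) from None
```

Keeping the string field and validating it at the edge, as assumed here, is one way to defer the proto consolidation without accepting arbitrary values.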
+1 on the inclusion of all 3 formats. Still, I think we might be able to design the data-source side better, so that data source definitions don't tie the sources to specific offline stores. For example, right now I think we can have the best of both worlds if we instead add all these formats as separate, independent data sources (…
      query: The query to be executed in Spark.
      path: The path to file data.
    - file_format: The format of the file data.
    + file_format: The underlying file format (parquet, avro, csv, json).
why not consolidate now?
+1
@franciscojavierarceo @tokoko Consolidating with FileFormat and adding new data sources could break backward compatibility, so I want to do it step by step.
That makes sense.
@HaoXuAI Why would new data sources break backwards compatibility though? |
There will be some proto changes. I'm not 100% sure whether there will be API changes exposed to users, but I think that might be the case.
@franciscojavierarceo @ntkathole mind taking a look?
franciscojavierarceo
left a comment
@HaoXuAI I don't see us actually using or testing the Spark table, Iceberg, or Hudi formats outside of our definitions. Can you add that?
Can you also add documentation that these formats are now supported?
Otherwise lgtm.
I'm going to add the TableFormat proto in the next PR; after that I'll add the docs. I think the tests will need to be changed as well.
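The kind of check requested above could look roughly like the following sketch. It assumes a hypothetical helper, `spark_reader_format`, that picks the string passed to Spark's `DataFrameReader.format`; neither the helper nor its precedence rule is taken from this PR. The values "iceberg", "delta", and "hudi" are the short names those Spark connectors register.

```python
from typing import Optional

TABLE_FORMATS = {"iceberg", "delta", "hudi"}
FILE_FORMATS = {"parquet", "avro", "csv", "json"}


def spark_reader_format(file_format: str, table_format: Optional[str] = None) -> str:
    """Return the name to pass to spark.read.format(...) (hypothetical helper).

    Assumption: when a table format is set it takes precedence, because the
    table layer decides how the underlying data files are located and read.
    """
    if file_format not in FILE_FORMATS:
        raise ValueError(f"Unsupported file format: {file_format!r}")
    if table_format is None:
        return file_format
    if table_format not in TABLE_FORMATS:
        raise ValueError(f"Unsupported table format: {table_format!r}")
    return table_format


def check_all_formats() -> None:
    """Exercise every table format rather than only declaring it."""
    # Without a table format, the plain file format is used.
    assert spark_reader_format("parquet") == "parquet"
    # With a table format, that format name is what the reader receives.
    for fmt in sorted(TABLE_FORMATS):
        assert spark_reader_format("parquet", fmt) == fmt
```

A real test would go one step further and read a small table through a Spark session for each format; this sketch only covers the selection logic, which runs without Spark installed.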
Signed-off-by: hao-xu5 <[email protected]>
@franciscojavierarceo mind taking another look?
franciscojavierarceo
left a comment
Lgtm
* add support for table format such as Iceberg, Delta, Hudi etc.
  Signed-off-by: HaoXuAI <[email protected]>
* linting
  Signed-off-by: HaoXuAI <[email protected]>
* linting
  Signed-off-by: HaoXuAI <[email protected]>
* add tests
  Signed-off-by: HaoXuAI <[email protected]>
* fix tests
  Signed-off-by: HaoXuAI <[email protected]>
* fix tests
  Signed-off-by: HaoXuAI <[email protected]>
* linting
  Signed-off-by: HaoXuAI <[email protected]>
* add tableformat proto
  Signed-off-by: hao-xu5 <[email protected]>
* update
  Signed-off-by: hao-xu5 <[email protected]>
* update doc
  Signed-off-by: hao-xu5 <[email protected]>
* fix linting
  Signed-off-by: hao-xu5 <[email protected]>
* fix test
  Signed-off-by: hao-xu5 <[email protected]>

---------

Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: hao-xu5 <[email protected]>
Co-authored-by: hao-xu5 <[email protected]>
# [0.57.0](v0.56.0...v0.57.0) (2025-11-13)

### Bug Fixes

* Improve trino to feast type mapping with (real,varchar,timestamp,decimal) ([#5691](#5691)) ([f855ad2](f855ad2))
* Materialize API - ODFV views not looked-up (thinks views non existant) - crashes materialize ([#5716](#5716)) ([1b050b3](1b050b3))
* Support historical feature retrieval with start_date/end_date in RemoteOfflineStore ([#5703](#5703)) ([ad32756](ad32756))
* Thread safe Clickhouse offline store ([#5710](#5710)) ([5f446ed](5f446ed))

### Features

* Add annotations to cronjob CRDs ([#5701](#5701)) ([be6e6c2](be6e6c2))
* Add batch commit mode for MySQL OnlineStore ([#5699](#5699)) ([3cfe4eb](3cfe4eb))
* Add possibility to materialize only latest values, to increase performance ([#5713](#5713)) ([8d77b72](8d77b72))
* Support table format: Iceberg, Delta, and Hudi ([#5650](#5650)) ([2915ad1](2915ad1))
What this PR does / why we need it:
examples:
Which issue(s) this PR fixes:
Misc