@teskje teskje commented Jan 29, 2026

This PR adds topological sorting of catalog entries to the coordinator bootstrap process. Previously, entries were sorted only by catalog ID, but ID order no longer always matches dependency order, and entries must be bootstrapped in dependency order.

The sort treats indexes specially so that they are created as early as possible. This ensures that any object depending on an indexed relation can also make use of the index. On main, this change reduces the number of dataflow operators in mz_catalog_server from 11863 to 9948 and the number of arrangements from 842 to 763.

To avoid code duplication, this PR pulls out the existing topo sort implementation from apply.rs into a util module and makes it generic.
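
For readers unfamiliar with the approach, here is a minimal sketch of a generic topological sort that prefers indexes among entries that become ready at the same time. It is not the code from this PR: `topo_sort_by_key`, `Kind`, and `bootstrap_order` are hypothetical names chosen for illustration, and the real catalog types are replaced by plain numeric IDs.

```rust
use std::cmp::Reverse;
use std::collections::{BTreeMap, BinaryHeap};

/// Topologically sorts `nodes` (node -> its dependencies) so that every node
/// appears after all of its dependencies. Among nodes that are ready at the
/// same time, the one with the smallest `priority` key is emitted first.
///
/// Returns `None` if the dependency graph contains a cycle.
pub fn topo_sort_by_key<K, P>(
    nodes: &BTreeMap<K, Vec<K>>,
    priority: impl Fn(&K) -> P,
) -> Option<Vec<K>>
where
    K: Ord + Clone,
    P: Ord,
{
    // Count unresolved dependencies per node and record reverse edges.
    let mut pending: BTreeMap<&K, usize> = BTreeMap::new();
    let mut dependents: BTreeMap<&K, Vec<&K>> = BTreeMap::new();
    for (node, deps) in nodes {
        // Dependencies outside of `nodes` are treated as already satisfied.
        let known: Vec<&K> = deps.iter().filter(|d| nodes.contains_key(*d)).collect();
        pending.insert(node, known.len());
        for dep in known {
            dependents.entry(dep).or_default().push(node);
        }
    }

    // Min-heap of ready nodes, ordered by (priority, node).
    let mut ready: BinaryHeap<Reverse<(P, &K)>> = pending
        .iter()
        .filter(|(_, count)| **count == 0)
        .map(|(node, _)| Reverse((priority(*node), *node)))
        .collect();

    let mut order = Vec::with_capacity(nodes.len());
    while let Some(Reverse((_, node))) = ready.pop() {
        order.push(node.clone());
        for dependent in dependents.get(&node).into_iter().flatten() {
            let count = pending.get_mut(dependent).expect("dependent is a known node");
            *count -= 1;
            if *count == 0 {
                ready.push(Reverse((priority(*dependent), *dependent)));
            }
        }
    }

    // Nodes that never became ready are blocked by a cycle.
    (order.len() == nodes.len()).then_some(order)
}

/// Hypothetical stand-in for a catalog entry kind; not the real catalog type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Kind {
    Index, // declared first, so it compares smaller and is emitted earlier
    Other,
}

/// Orders hypothetical entries (ID -> (kind, dependency IDs)) for bootstrap.
pub fn bootstrap_order(entries: &BTreeMap<u64, (Kind, Vec<u64>)>) -> Option<Vec<u64>> {
    let deps: BTreeMap<u64, Vec<u64>> = entries
        .iter()
        .map(|(id, (_, deps))| (*id, deps.clone()))
        .collect();
    topo_sort_by_key(&deps, |id| entries[id].0)
}
```

With a table `1`, a view `2` on that table, an index `3` on the table, and a view `4` on view `2`, plain ID order would create view `2` before index `3`. The sketch instead emits the index as soon as the table exists, so both views are created after it. Using a priority heap rather than a plain queue is just one way to express "indexes first"; the PR may implement the preference differently.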

Motivation

  • This PR fixes a previously unreported bug.

mz_catalog_server has a number of dataflows that don't make use of available indexes (discussed in a Slack thread).

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

This commit adds topological sorting of catalog entries to the
coordinator bootstrap process. Previously, we were only sorting by
catalog IDs, but ID order doesn't always match dependency order
nowadays, and it is important that the entries are bootstrapped in
dependency order.

In the sorting we treat indexes specially, to ensure they get created as
early as possible. This makes sure any object depending on an indexed
relation is also able to make use of the index.

To avoid code duplication, this commit pulls out the existing topo sort
implementation from `apply.rs` into a util module and makes it generic.