Skip to content

Conversation

@leoguillaume
Copy link
Member

@leoguillaume leoguillaume commented Nov 17, 2025

Summary by CodeRabbit

  • Refactor

    • Standardized internal database session provider across the API to improve reliability, consistency, and maintainability (no user-facing behavior changes).
  • Chores

    • Updated coverage badge metric and adjusted internal dependency wiring to streamline infrastructure and diagnostics.

@leoguillaume leoguillaume self-assigned this Nov 17, 2025
@coderabbitai
Copy link

coderabbitai bot commented Nov 17, 2025

Walkthrough

The PR replaces all uses of get_db_session with get_postgres_session across endpoints, helpers, and utilities, removes get_db_session from api/sql/session.py, and refactors get_postgres_session implementation to use an async context-manager pattern.

Changes

Cohort / File(s) Summary
Admin Endpoints
api/endpoints/admin/organizations.py, api/endpoints/admin/providers.py, api/endpoints/admin/roles.py, api/endpoints/admin/routers.py, api/endpoints/admin/tokens.py, api/endpoints/admin/users.py
Swapped dependency from get_db_sessionget_postgres_session in all CRUD endpoints; updated imports only.
Public Endpoints
api/endpoints/audio.py, api/endpoints/auth.py, api/endpoints/chat.py, api/endpoints/chunks.py, api/endpoints/collections.py, api/endpoints/documents.py, api/endpoints/embeddings.py, api/endpoints/files.py, api/endpoints/me.py, api/endpoints/models.py, api/endpoints/ocr.py, api/endpoints/proconnect/__init__.py, api/endpoints/rerank.py, api/endpoints/search.py, api/endpoints/usage.py
Replaced session dependency to Depends(get_postgres_session) and updated imports; no other logic changes.
Utilities & Helpers
api/helpers/_accesscontroller.py, api/utils/hooks_decorator.py, api/utils/lifespan.py
Updated usages/imports to use get_postgres_session in helper flows and lifecycle loops.
Session Core
api/sql/session.py
Removed the legacy get_db_session() dependency provider and its scaffold.
Session Provider Refactor
api/utils/dependencies.py
Reworked get_postgres_session() implementation to acquire sessions via session_factory() with async with (context-manager) instead of async iterating global_context.postgres_session().
Minor Updates
.github/badges/coverage.json, api/tasks/celery_app.py, api/tests/unit/test_utils/test_carbon.py
Coverage badge updated; linting noqa added (E402); test formatting whitespace adjusted.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant Endpoint
  participant Dependencies as "get_postgres_session()"
  participant Postgres
  Client->>Endpoint: HTTP Request
  Endpoint->>Dependencies: Depends(get_postgres_session)
  Note over Dependencies,Postgres: async context-manager\nsession_factory() as session
  Dependencies-->>Endpoint: AsyncSession
  Endpoint->>Postgres: DB queries using AsyncSession
  Postgres-->>Endpoint: Results
  Endpoint-->>Client: HTTP Response
  Note right of Dependencies: session closed on exit
Loading

(Optional comparison — old vs new)

sequenceDiagram
  autonumber
  participant Endpoint
  participant Old as "get_db_session (generator)"
  participant New as "get_postgres_session (async with)"
  Endpoint->>Old: async for session in get_db_session()...
  Old-->>Endpoint: yielded session (generator pattern)
  Endpoint->>Old: continue/exit -> generator cleanup
  Endpoint->>New: async with session_factory() as session
  New-->>Endpoint: session (context-managed)
  Note right of New: deterministic enter/exit semantics
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Consistent, repetitive import/dependency swap across many files reduces per-file review time.
  • Pay close attention to:
    • api/utils/dependencies.py — confirm async context-manager lifecycle and cleanup semantics.
    • api/sql/session.py — ensure the removed provider has no remaining references.
    • api/utils/lifespan.py & api/utils/hooks_decorator.py — verify loops and background setup still function with the new provider.

Possibly related PRs

Suggested reviewers

  • blanch0t

Poem

🐰 I hopped through imports, quick and spry,
Swapped old sessions for Postgres sky-high.
Async-with cradles each fleeting connection,
Clean exits now guard our DB affection. ✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: renaming the get_db_session dependency provider to get_postgres_session across the codebase.
Docstring Coverage ✅ Passed Docstring coverage is 87.69% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix-postgres-session

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
api/utils/dependencies.py (1)

57-62: <!-- [scratchpad_start]
[task verify_session_cleanup_1/1]

[observations]

  • File: api/utils/dependencies.py, lines 57-62
  • Code structure: async with session_factory() as session: followed by yield session then explicit if session.in_transaction(): await session.close()
  • session_factory is SQLAlchemy's sessionmaker created with class_=AsyncSession
  • The explicit close() happens after yield, inside the context manager
  • No test files found that specifically test session cleanup with transaction states
  • The pattern is used only in this one location in the codebase

[analysis]

  • This is a FastAPI dependency generator (async generator with yield)
  • The context manager (aexit) will be called after the yield returns
  • Calling session.close() before context manager exit could cause double-close issues
  • Need to verify SQLAlchemy's AsyncSession behavior with context managers

[pending]

  • What does AsyncSession.aexit do?
  • Does session.close() conflict with the context manager cleanup?
  • Is there a transaction-specific reason for the close() call?
  • Should this be removed or moved outside the context manager?

[actions]

  • Search web for SQLAlchemy AsyncSession context manager behavior
  • Determine if explicit close() inside async with block is problematic
    </scratchpad_end] -->

I need to verify the SQLAlchemy AsyncSession behavior to determine if the explicit close() call causes issues.Based on my verification, I now have sufficient information to provide my analysis.

Remove the explicit session.close() call inside the async context manager.

The async context manager automatically closes the AsyncSession at the end of the block, which is equivalent to calling AsyncSession.close(). The explicit session.close() call on line 62 is redundant and unnecessary.

Additionally, you don't have to call session.close() since you are using a context manager and it will do it for you when you exit from the with block. The current pattern of calling close() inside the async with block before the context manager's __aexit__ executes violates the standard lifecycle and may cause unexpected behavior.

The correct pattern for this FastAPI dependency should be:

async def get_postgres_session() -> AsyncSession:
    """
    Get a PostgreSQL session from the global context.

    Returns:
        AsyncSession: A PostgreSQL session instance.
    """
    session_factory = global_context.postgres_session_factory
    async with session_factory() as session:
        yield session

Remove lines 60-62 entirely. The AsyncSession.close() method closes out transactional resources and ORM objects, ends any transaction in progress, and releases any AsyncConnection objects, then leaves the AsyncSession in a state which it may be used again, so the conditional check for in_transaction() is also unnecessary—the context manager handles this automatically.

api/endpoints/admin/providers.py (1)

78-90: Critical: Add missing router parameter to get_provider function.

The get_provider function (line 78) calls model_registry.get_providers(router_id=router, ...) on line 87, but router is not defined in the function signature. This will cause a NameError at runtime. Add router as a path or query parameter similar to the get_providers function (line 99).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f6f448 and 837f9ad.

📒 Files selected for processing (29)
  • .github/badges/coverage.json (1 hunks)
  • api/endpoints/admin/organizations.py (6 hunks)
  • api/endpoints/admin/providers.py (5 hunks)
  • api/endpoints/admin/roles.py (6 hunks)
  • api/endpoints/admin/routers.py (6 hunks)
  • api/endpoints/admin/tokens.py (5 hunks)
  • api/endpoints/admin/users.py (6 hunks)
  • api/endpoints/audio.py (2 hunks)
  • api/endpoints/auth.py (1 hunks)
  • api/endpoints/chat.py (2 hunks)
  • api/endpoints/chunks.py (3 hunks)
  • api/endpoints/collections.py (5 hunks)
  • api/endpoints/documents.py (5 hunks)
  • api/endpoints/embeddings.py (2 hunks)
  • api/endpoints/files.py (2 hunks)
  • api/endpoints/me.py (6 hunks)
  • api/endpoints/models.py (3 hunks)
  • api/endpoints/ocr.py (2 hunks)
  • api/endpoints/proconnect/__init__.py (3 hunks)
  • api/endpoints/rerank.py (2 hunks)
  • api/endpoints/search.py (2 hunks)
  • api/endpoints/usage.py (2 hunks)
  • api/helpers/_accesscontroller.py (2 hunks)
  • api/sql/session.py (0 hunks)
  • api/tasks/celery_app.py (1 hunks)
  • api/tests/unit/test_utils/test_carbon.py (1 hunks)
  • api/utils/dependencies.py (1 hunks)
  • api/utils/hooks_decorator.py (3 hunks)
  • api/utils/lifespan.py (3 hunks)
💤 Files with no reviewable changes (1)
  • api/sql/session.py
🧰 Additional context used
🧬 Code graph analysis (24)
api/helpers/_accesscontroller.py (1)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/endpoints/proconnect/__init__.py (1)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/endpoints/documents.py (1)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/endpoints/chat.py (1)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/endpoints/usage.py (1)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/endpoints/admin/users.py (1)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/endpoints/ocr.py (1)
api/utils/dependencies.py (4)
  • get_model_registry (23-31)
  • get_postgres_session (49-62)
  • get_redis_client (34-46)
  • get_request_context (12-20)
api/utils/hooks_decorator.py (1)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/endpoints/search.py (1)
api/utils/dependencies.py (3)
  • get_model_registry (23-31)
  • get_postgres_session (49-62)
  • get_redis_client (34-46)
api/endpoints/auth.py (1)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/endpoints/models.py (1)
api/utils/dependencies.py (3)
  • get_model_registry (23-31)
  • get_postgres_session (49-62)
  • get_request_context (12-20)
api/endpoints/admin/routers.py (1)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/endpoints/audio.py (1)
api/utils/dependencies.py (4)
  • get_model_registry (23-31)
  • get_postgres_session (49-62)
  • get_redis_client (34-46)
  • get_request_context (12-20)
api/endpoints/admin/tokens.py (1)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/endpoints/files.py (1)
api/utils/dependencies.py (4)
  • get_model_registry (23-31)
  • get_postgres_session (49-62)
  • get_redis_client (34-46)
  • get_request_context (12-20)
api/endpoints/chunks.py (1)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/endpoints/admin/providers.py (1)
api/utils/dependencies.py (2)
  • get_model_registry (23-31)
  • get_postgres_session (49-62)
api/endpoints/collections.py (3)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/utils/exceptions.py (1)
  • CollectionNotFoundException (79-81)
api/helpers/_accesscontroller.py (1)
  • AccessController (32-281)
api/endpoints/me.py (4)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/helpers/_accesscontroller.py (1)
  • AccessController (32-281)
api/endpoints/admin/users.py (1)
  • get_user (101-110)
api/helpers/_identityaccessmanager.py (1)
  • get_user (671-694)
api/endpoints/embeddings.py (1)
api/utils/dependencies.py (3)
  • get_model_registry (23-31)
  • get_postgres_session (49-62)
  • get_redis_client (34-46)
api/endpoints/admin/organizations.py (1)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/endpoints/admin/roles.py (1)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/utils/lifespan.py (1)
api/utils/dependencies.py (1)
  • get_postgres_session (49-62)
api/endpoints/rerank.py (1)
api/utils/dependencies.py (3)
  • get_postgres_session (49-62)
  • get_redis_client (34-46)
  • get_request_context (12-20)
🔇 Additional comments (36)
.github/badges/coverage.json (1)

1-1: Verify whether this auto-generated badge file should be manually committed.

Coverage badge files are typically auto-generated by CI/CD pipelines and should not be manually updated or committed. Doing so can cause the badge to drift from actual test coverage and creates unnecessary merge conflicts.

Confirm that:

  1. This badge is meant to be manually updated (vs. auto-generated by your CI/CD pipeline)
  2. The coverage increase reflects genuine improvements from the refactoring
api/tests/unit/test_utils/test_carbon.py (1)

67-72: LGTM - formatting improvement.

The multi-line formatting improves readability without changing behavior.

api/tasks/celery_app.py (1)

99-99: LGTM - appropriate lint suppression.

Adding E402 suppression is correct here since the routing import must occur after Celery app configuration to ensure proper task registration.

api/utils/hooks_decorator.py (3)

15-15: LGTM - import updated correctly.

The import change from get_db_session to get_postgres_session aligns with the PR objective.


206-212: LGTM - usage pattern is correct.

The async for session in get_postgres_session(): pattern correctly consumes the async generator.


233-255: LGTM - consistent usage pattern.

Both log_usage and update_budget follow the same correct async iteration pattern for session acquisition.

api/endpoints/files.py (2)

20-20: LGTM - import updated correctly.

Import change aligns with the PR-wide migration to get_postgres_session.


33-33: LGTM - dependency updated correctly.

The session dependency now uses get_postgres_session, consistent with other endpoints.

api/endpoints/rerank.py (2)

12-12: LGTM - import updated correctly.

Import statement now includes get_postgres_session as part of the PR-wide refactoring.


24-24: LGTM - dependency updated correctly.

Session dependency correctly switched to get_postgres_session.

api/endpoints/ocr.py (2)

18-18: LGTM - import updated correctly.

Import change aligns with the repository-wide migration to get_postgres_session.


34-34: LGTM - dependency updated correctly.

Session dependency correctly uses get_postgres_session.

api/endpoints/chat.py (2)

16-16: LGTM - import updated correctly.

Import statement correctly includes get_postgres_session as part of the refactoring.


39-39: LGTM - dependency updated correctly.

The chat endpoint's session dependency correctly switched to get_postgres_session.

api/endpoints/models.py (1)

12-12: LGTM! Consistent refactoring applied.

The import and dependency injection updates are correct and consistent across both endpoints.

Also applies to: 30-30, 52-52

api/endpoints/proconnect/__init__.py (1)

18-18: LGTM! OAuth endpoints updated consistently.

The session dependency has been correctly updated in both the callback and logout endpoints.

Also applies to: 104-104, 175-175

api/endpoints/usage.py (1)

11-11: LGTM! Usage endpoint updated correctly.

The session dependency is properly updated.

Also applies to: 31-31

api/endpoints/admin/tokens.py (1)

11-11: LGTM! All admin token endpoints updated consistently.

The session dependency has been correctly updated across all four endpoints.

Also applies to: 26-26, 51-51, 71-71, 95-95

api/endpoints/audio.py (1)

22-22: LGTM! Audio endpoint updated correctly.

The session dependency is properly updated.

Also applies to: 40-40

api/endpoints/chunks.py (1)

9-9: LGTM! Chunk endpoints updated consistently.

The session dependency has been correctly updated in both chunk retrieval endpoints.

Also applies to: 21-21, 42-42

api/helpers/_accesscontroller.py (1)

15-15: Session dependency properly integrated into AccessController.

The refactoring is complete and correct. Line 15 properly imports get_postgres_session, and line 49 correctly declares the session parameter with Depends(get_postgres_session). The session is actively used throughout the __call__ method and all helper methods (_check_api_key, _check_chat_completions, _check_limits, etc.) for authentication, authorization, and database queries. FastAPI's dependency injection ensures all endpoints using AccessController as a dependency automatically receive a properly scoped session instance.

api/endpoints/embeddings.py (1)

10-10: LGTM! Session dependency successfully migrated.

The import and dependency injection have been correctly updated from get_db_session to get_postgres_session. The endpoint logic remains unchanged.

Also applies to: 22-22

api/endpoints/admin/roles.py (1)

10-10: LGTM! All role endpoints migrated successfully.

All five role management endpoints have been consistently updated to use get_postgres_session. Import and dependency injections are correct throughout.

Also applies to: 25-25, 46-46, 66-66, 92-92, 115-115

api/endpoints/auth.py (1)

7-7: LGTM! Authentication endpoint migrated successfully.

The login endpoint has been correctly updated to use get_postgres_session. Import and dependency injection are both correct.

Also applies to: 14-14

api/endpoints/search.py (2)

13-13: LGTM! Session dependency successfully migrated.

The session dependency has been correctly updated to use get_postgres_session.

Also applies to: 24-24


27-27: Verify scope: Additional dependency change beyond session migration.

The request_context parameter has been changed to use Depends(get_request_context) instead of a direct ContextVar reference. While this may be intentional, it extends beyond the PR's stated objective of renaming the session provider.

Ensure this change is deliberate and doesn't introduce unintended behavioral differences in how the request context is accessed.

api/endpoints/admin/users.py (1)

11-11: LGTM! All user management endpoints migrated successfully.

All five user management endpoints have been consistently updated to use get_postgres_session. Import and dependency injections are correct throughout.

Also applies to: 26-26, 55-55, 74-74, 102-102, 126-126

api/endpoints/me.py (1)

10-10: LGTM! All user profile endpoints migrated successfully.

All six endpoints for user profile and API key management have been consistently updated to use get_postgres_session. Import and dependency injections are correct throughout.

Also applies to: 17-17, 31-31, 53-53, 73-73, 88-88, 108-108

api/endpoints/documents.py (1)

40-40: LGTM! All document endpoints migrated successfully.

All four document management endpoints have been consistently updated to use get_postgres_session. Import and dependency injections are correct throughout.

Also applies to: 50-50, 130-130, 151-151, 180-180

api/endpoints/admin/organizations.py (1)

17-17: LGTM! All organization endpoints migrated successfully.

All five organization management endpoints have been consistently updated to use get_postgres_session. Import and dependency injections are correct throughout.

Also applies to: 32-32, 46-46, 61-61, 76-76, 94-94

api/endpoints/admin/providers.py (2)

15-15: LGTM: Import updated correctly.

The import change from get_db_session to get_postgres_session aligns with the PR objective.


29-29: LGTM: Dependency injection updated consistently with proper transaction cleanup.

Verification confirms the change is safe:

  • Calling session.close() on an active transaction automatically rolls back the transaction, preventing resource leaks or dangling uncommitted changes.

  • FastAPI's dependency system with yield executes cleanup code after the response is sent, properly closing the session, which is the idiomatic pattern for managing database sessions.

  • The codebase already has explicit session.commit() calls throughout helper classes (_documentmanager, _identityaccessmanager, _modelregistry) that handle business logic, ensuring committed transactions before cleanup.

The four endpoints now consistently use get_postgres_session for database access with safe transaction semantics.

api/endpoints/collections.py (2)

8-8: LGTM: Import updated correctly.

The import change from get_db_session to get_postgres_session is consistent with the PR objective.


16-16: LGTM: Dependency injection updated consistently.

All five collection endpoints now use get_postgres_session instead of get_db_session. The changes maintain the same session type (AsyncSession) and are applied uniformly across create, read, update, and delete operations.

Also applies to: 43-43, 67-67, 91-91, 113-113

api/endpoints/admin/routers.py (2)

10-10: LGTM: Import updated correctly.

The import change from get_db_session to get_postgres_session aligns with the PR objective.


20-20: LGTM: Dependency injection updated consistently.

All five router administration endpoints now use get_postgres_session instead of get_db_session. The changes are uniform across create, delete, update, and get operations, maintaining the same session type.

Also applies to: 47-47, 67-67, 97-97, 116-116

from api.sql.session import get_db_session
from api.utils.configuration import get_configuration
from api.utils.context import global_context
from api.utils.dependencies import get_postgres_session
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

LGTM! Consider refactoring the session acquisition pattern.

The changes are consistent with the rename. However, the async for session in get_postgres_session(): pattern is unusual for a generator that yields once.

Consider refactoring to use the session factory directly for better clarity:

-    async for session in get_postgres_session():
+    async with global_context.postgres_session_factory() as session:
         global_context.model_registry = ModelRegistry(
             app_title=configuration.settings.app_title,

This pattern is more idiomatic for context managers and makes the intent clearer. Apply the same pattern at line 132 in _setup_document_manager.

Also applies to: 94-94, 132-132

🤖 Prompt for AI Agents
In api/utils/lifespan.py around lines 24, 94, and 132, the code uses "async for
session in get_postgres_session()" to acquire a session from an async generator
that yields only once; refactor these sites to call the session factory as an
async context manager (e.g., "async with get_postgres_session() as session:") so
the intent is clearer and more idiomatic, and apply the same change at the two
other noted lines (_setup_document_manager at line 132 and the other site at
line 94).

@leoguillaume leoguillaume merged commit 7e993da into main Nov 17, 2025
1 check passed
This was referenced Nov 17, 2025
@tibo-pdn tibo-pdn deleted the fix-postgres-session branch December 11, 2025 08:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants