feat: support json_serializer parameter in create_engine() by waiho-gumloop · Pull Request #823 · googleapis/python-spanner-sqlalchemy

waiho-gumloop · 2026-02-28T00:41:44Z

Summary

Fixes #822

SpannerDialect now accepts json_serializer and json_deserializer kwargs from create_engine(), matching the standard SQLAlchemy convention used by PostgreSQL, MySQL, SQLite, and other dialects. Previously, passing these raised TypeError.

Approach: serialize-then-wrap

The Spanner pipeline wraps values in JsonObject first, then serializes later in _helpers._make_param_value_pb via obj.serialize(). This differs from other dialects where _json_serializer replaces json.dumps directly.

To bridge this gap without modifying JsonObject or the core google-cloud-spanner library:

The user's json_serializer(value) produces a JSON string with custom types already handled
JsonObject.from_str() parses it back into a JsonObject containing only native Python types
When _helpers.py later calls obj.serialize(), standard json.dumps works — no custom types remain

This is a one-extra-json.loads round-trip per JSON value, which is negligible for typical payloads.

Backward compatible: When no json_serializer is provided, the dialect behaves identically to before (_json_serializer = JsonObject).

Example

from sqlalchemy import create_engine
import json, datetime

def my_serializer(obj):
    return json.dumps(obj, default=lambda o: o.isoformat() if hasattr(o, 'isoformat') else str(o))

engine = create_engine(
    "spanner:///projects/p/instances/i/databases/d",
    json_serializer=my_serializer,
)

Changes

sqlalchemy_spanner.py: Added __init__ to SpannerDialect accepting json_serializer / json_deserializer, plus _make_json_serializer() factory
test/unit/test_json_serializer.py: 21 unit + integration tests covering the factory, dialect wiring, SQLAlchemy bind_processor pipeline, null handling, and Spanner _helpers.py compatibility

Test plan

Unit tests for _make_json_serializer factory (JsonObject passthrough, callable wrapping, arrays, nulls, nested datetimes)
Dialect integration tests (kwarg acceptance, get_cls_kwargs introspection, class attribute isolation)
End-to-end tests simulating the full JSON.bind_processor → JsonObject → serialize() pipeline
Spanner _helpers.py compatibility test (isinstance check + bare serialize() call)

Made with Cursor

SpannerDialect now accepts `json_serializer` and `json_deserializer` kwargs, matching the standard SQLAlchemy convention used by PostgreSQL and other dialects. Previously, passing `json_serializer` to `create_engine()` raised `TypeError` because the dialect's `__init__` did not declare these parameters. The implementation uses a serialize-then-wrap strategy: the user's `json_serializer` function pre-serializes values (handling custom types like `datetime`), then the result is parsed back into a `JsonObject` via `from_str()`. This preserves the existing Spanner client pipeline (`_helpers.py` expects `JsonObject` instances) while allowing custom type handling — without subclassing or modifying `JsonObject` itself. Example usage: engine = create_engine( "spanner:///...", json_serializer=lambda obj: json.dumps(obj, cls=MyEncoder), ) Made-with: Cursor

gemini-code-assist · 2026-02-28T00:41:58Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces support for json_serializer and json_deserializer parameters in the create_engine() function for the SpannerDialect, bringing it in line with standard SQLAlchemy behavior. It addresses the challenge of integrating custom JSON serialization logic with Spanner's existing JsonObject wrapping mechanism through a "serialize-then-wrap" approach, ensuring custom types are handled correctly while maintaining compatibility.

Highlights

JSON Serializer Support: Enabled json_serializer and json_deserializer parameters in SpannerDialect's create_engine() method, allowing users to provide custom JSON serialization and deserialization functions.
"Serialize-then-wrap" Strategy: Implemented a unique approach where user-provided serializers first convert objects to JSON strings, which are then re-parsed into JsonObject instances, ensuring compatibility with Spanner's internal data handling without modifying core libraries.
Backward Compatibility: The dialect retains its previous behavior when no custom json_serializer is provided, ensuring existing applications are unaffected.
Comprehensive Testing: Added a new dedicated test file with 21 unit and integration tests covering the new factory, dialect wiring, SQLAlchemy bind processor pipeline, null handling, and compatibility with Spanner's _helpers.py.

Changelog

google/cloud/sqlalchemy_spanner/sqlalchemy_spanner.py
- Introduced _make_json_serializer function to adapt user-provided JSON serializers to Spanner's JsonObject pipeline.
- Modified SpannerDialect.__init__ to accept and apply json_serializer and json_deserializer arguments.
test/unit/test_json_serializer.py
- Added new file test/unit/test_json_serializer.py containing extensive tests for the _make_json_serializer factory, SpannerDialect's JSON serialization capabilities, and end-to-end scenarios.

Activity

No human activity (comments, reviews, etc.) has been recorded on this pull request yet.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request adds support for custom json_serializer and json_deserializer in create_engine, aligning the Spanner dialect with other SQLAlchemy dialects. The implementation uses a 'serialize-then-wrap' strategy which is well-documented. The accompanying tests are comprehensive.

However, I've identified a significant issue with how JSON null values are handled, both with the new custom serializer and in the existing default behavior. The current implementation incorrectly maps Python None to SQL NULL instead of JSON null when JSON(none_as_null=False) is used, which is the default. This prevents storing JSON null values. I've left detailed comments on the implementation and the corresponding test case with suggestions on how to address this to ensure compliance with SQLAlchemy's documented behavior.

gemini-code-assist · 2026-02-28T00:47:40Z

google/cloud/sqlalchemy_spanner/sqlalchemy_spanner.py

+    def _factory(value):
+        json_str = json_serializer(value)
+        return JsonObject.from_str(json_str)


This implementation incorrectly handles JSON null values. When SQLAlchemy processes a JSON type with none_as_null=False (the default), a Python None value should be stored as a JSON null.

However, in this _factory:

json_serializer(None) returns the string 'null'.

JsonObject.from_str('null') creates a JsonObject(None) because json.loads('null') is None.

JsonObject(None).serialize() returns None, which the DBAPI driver interprets as a SQL NULL.

This prevents storing JSON null values and contradicts the expected behavior of none_as_null=False.

To fix this, you should handle None values (and 'null' strings from the serializer) specially to produce an object that serializes to the string 'null'. This likely requires a private JsonObject subclass that overrides the serialize() method for this purpose.

gemini-code-assist · 2026-02-28T00:47:40Z

test/unit/test_json_serializer.py

+    def test_callable_handles_null(self):
+        factory = _make_json_serializer(lambda v: json.dumps(v))
+        obj = factory(None)
+        assert isinstance(obj, JsonObject)
+        assert obj.serialize() is None


This test seems to confirm an incorrect behavior. factory(None) should produce a JsonObject that represents a JSON null, which should serialize to the string 'null', not Python None (which represents a SQL NULL).

As implemented, factory(None) calls json.dumps(None) which is 'null', then JsonObject.from_str('null') which is JsonObject(None), and JsonObject(None).serialize() returns None. This means a Python None is always converted to a SQL NULL, which prevents storing JSON null values when using JSON(none_as_null=False).

This test should be updated to assert the correct behavior, which is serialization to 'null'. For example: assert obj.serialize() == 'null'.

waiho-gumloop requested a review from a team as a code owner February 28, 2026 00:41

product-auto-label bot added the api: spanner Issues related to the googleapis/python-spanner-sqlalchemy API. label Feb 28, 2026

gemini-code-assist bot reviewed Feb 28, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: support json_serializer parameter in create_engine()#823

feat: support json_serializer parameter in create_engine()#823
waiho-gumloop wants to merge 1 commit intogoogleapis:mainfrom
waiho-gumloop:feat/support-json-serializer-kwarg

waiho-gumloop commented Feb 28, 2026

Uh oh!

gemini-code-assist bot commented Feb 28, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Feb 28, 2026

Uh oh!

gemini-code-assist bot Feb 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

waiho-gumloop commented Feb 28, 2026

Summary

Example

Changes

Test plan

Uh oh!

gemini-code-assist bot commented Feb 28, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Feb 28, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Feb 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant