@fredroy fredroy commented Feb 3, 2026

I noticed this issue some time ago: having a Controller in a scene slows down the simulation considerably, especially on macOS, even if the Controller does nothing. And the more Controllers there are, the slower it gets.

DISCLAIMER: this was mostly the work of Claude, which detected/suggested the issues (lookups were slowing down the simulation) and generated the solution.
I just ran the benchmarks/tests to make sure it works, but as with everything involving sofapython3 and/or pybind11, I cannot prove everything is okay/well-done. So a deep review from experts would be appreciated 🫠

In any case, the modifications lead to a dramatic speedup:
(see the scene included in this PR, which creates an empty scene with a given number of Controllers that do nothing)

Ubuntu 22.04 (gcc12, i7 13700k)

before:
Scene with 1 controllers and 10000 steps took 2.464146852493286 seconds.
Scene with 5 controllers and 10000 steps took 12.076464414596558 seconds.
Scene with 10 controllers and 10000 steps took 24.062500715255737 seconds.

after:
Scene with 1 controllers and 10000 steps took 0.04976940155029297 seconds.
Scene with 5 controllers and 10000 steps took 0.09446001052856445 seconds.
Scene with 10 controllers and 10000 steps took 0.1459205150604248 seconds.

--> with 10 controllers, about 165x faster... 😮

Windows (MSVC2026, i7 11800h)

before:
Scene with 1 controllers and 10000 steps took 6.102800607681274 seconds.
Scene with 5 controllers and 10000 steps took 27.300215482711792 seconds.
Scene with 10 controllers and 10000 steps took 54.59787082672119 seconds.
after:
Scene with 1 controllers and 10000 steps took 0.12163424491882324 seconds.
Scene with 5 controllers and 10000 steps took 0.18189406394958496 seconds.
Scene with 10 controllers and 10000 steps took 0.27340126037597656 seconds.

--> with 10 controllers, 200x faster... 😲

macOS (xcode26, M3 max)

before:
Scene with 1 controllers and 10000 steps took 8.079632759094238 seconds.
Scene with 5 controllers and 10000 steps took 40.43093395233154 seconds.
Scene with 10 controllers and 10000 steps took 79.13048505783081 seconds.

after:
Scene with 1 controllers and 10000 steps took 0.03541707992553711 seconds.
Scene with 5 controllers and 10000 steps took 0.06284904479980469 seconds.
Scene with 10 controllers and 10000 steps took 0.09451079368591309 seconds.

--> with 10 controllers, 837x faster... 🤪
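
The speedup factors can be recomputed from the 10-controller timings listed above. A small Python sketch (the dictionary layout and the `speedup` helper are only for illustration; the numbers are copied from the runs above):

```python
# Timings in seconds for 10 controllers and 10000 steps,
# copied from the "before" / "after" runs listed above.
timings = {
    "Ubuntu 22.04": (24.062500715255737, 0.1459205150604248),
    "Windows": (54.59787082672119, 0.27340126037597656),
    "macOS": (79.13048505783081, 0.09451079368591309),
}


def speedup(before, after):
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before / after


speedups = {name: speedup(before, after) for name, (before, after) in timings.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.0f}x faster")
```

The same helper can be applied to the 1- and 5-controller rows to see how the gain grows with the number of Controllers.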

Summary of Modifications

  The changes in this branch (speedup_controller) optimize the Controller_Trampoline class in the Python bindings by adding a caching mechanism for Python method lookups:

  Key Changes:

  1. New caching infrastructure (in Binding_Controller.h):
  - Added member variables to cache:
    - m_pySelf - cached Python self reference (avoids repeated py::cast(this))
    - m_methodCache - unordered_map storing Python method objects by name
    - m_onEventMethod - cached fallback "onEvent" method
    - m_hasOnEvent / m_cacheInitialized - state flags

  2. New methods (in Binding_Controller.cpp):
  - initializePythonCache() - initializes the cache on first use
  - getCachedMethod() - retrieves methods from cache (or looks them up once and caches)
  - callCachedMethod() - calls a cached Python method with an event
  - Constructor and destructor that manage the lifetime of the cached Python objects while holding the GIL

  3. Optimized handleEvent():
  - Previously: every event caused py::cast(this), py::hasattr(), and attr() lookups
  - Now: uses cached method references, avoiding repeated Python attribute lookups

  4. Optimized getClassName():
  - Uses the cached m_pySelf when available instead of casting each time
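
To illustrate the idea behind the changes listed above, here is a minimal pure-Python analogue of the caching strategy. It is only a sketch, not the actual C++/pybind11 code of this PR: the class `CachedEventDispatcher`, its attribute names, and the `lookups` counter are hypothetical, and the `"on" + event name` convention for handler names is an assumption made for the example.

```python
class CachedEventDispatcher:
    """Hypothetical pure-Python analogue of the cached method lookup."""

    def __init__(self, controller):
        self._controller = controller   # plays the role of m_pySelf
        self._method_cache = {}         # plays the role of m_methodCache
        # The fallback handler is looked up once (m_onEventMethod / m_hasOnEvent).
        self._on_event = getattr(controller, "onEvent", None)
        self.lookups = 0                # instrumentation for this example only

    def _get_cached_method(self, name):
        # Look each name up at most once; a miss (None) is cached as well,
        # so events without a dedicated handler do not trigger new lookups.
        if name not in self._method_cache:
            self.lookups += 1
            self._method_cache[name] = getattr(self._controller, name, None)
        return self._method_cache[name]

    def handle_event(self, event_name, event):
        # Assumed naming convention: handler is "on" + event name.
        method = self._get_cached_method("on" + event_name)
        if method is None:
            method = self._on_event     # fall back to the generic handler
        if method is not None:
            return method(event)
        return None


class MyController:
    """Toy controller with one specific handler and a generic fallback."""

    def __init__(self):
        self.count = 0

    def onAnimateBeginEvent(self, event):
        self.count += 1

    def onEvent(self, event):
        pass


controller = MyController()
dispatcher = CachedEventDispatcher(controller)
for _ in range(10000):
    dispatcher.handle_event("AnimateBeginEvent", {"dt": 0.01})
    dispatcher.handle_event("AnimateEndEvent", {"dt": 0.01})

print(dispatcher.lookups, controller.count)
```

After 20000 dispatched events, only two attribute lookups have been performed (one per event name). One design consequence in this sketch: a handler attached to the controller after its name was first looked up is not picked up unless the cache is invalidated.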

  Purpose:

  This is a performance optimization that reduces overhead when handling frequent events (like AnimateBeginEvent, AnimateEndEvent), which can be called many times per simulation step. The caching eliminates repeated Python/C++ boundary crossings for method lookups.

@fredroy added the labels enhancement, pr: status to review, pr: clean-fix, pr: highlighted in next release, and topic for next dev-meeting on Feb 3, 2026.

@alxbilger alxbilger left a comment

Is the cache system also relevant for other trampoline classes?
