The Wayback Machine - https://web.archive.org/web/20230310205252/https://github.com/RustPython/RustPython/issues/1690
Investigate adoption of the PyPy approach to object layout #1690

Open
dralley opened this issue Jan 15, 2020 · 1 comment
Labels
A-design About RustPython's own implementation RFC Request for comments

Comments

@dralley
Contributor

dralley commented Jan 15, 2020

PyPy takes a completely different approach to object layout than CPython, one that saves a large share of per-object memory overhead (25%-50%) when many objects have the same layout (the same set of attributes).

I haven't seen any information about whether the added indirection carries a slight performance penalty, although I believe the PyPy JIT is able to optimize it away. I'm also not sure whether it makes native code extensions more difficult. The savings are big enough to warrant serious consideration, though.

Details here:

https://dev.nextthought.com/blog/2018/08/cpython-vs-pypy-memory-usage.html
https://morepypy.blogspot.com/2010/11/efficiently-implementing-python-objects.html#using-maps-for-memory-efficient-instances
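
For readers who have not followed the links, here is a minimal sketch of the "maps" idea in plain Python. It is an illustration only, not PyPy's or RustPython's actual code: the names `Map`, `Instance`, `EMPTY_MAP`, `with_attribute`, `read`, and `write` are all hypothetical. The point is that each instance keeps only a pointer to a shared layout descriptor plus a compact list of values, instead of a full per-instance dictionary.

```python
class Map:
    """Shared layout descriptor: attribute name -> index into an instance's storage."""

    def __init__(self, indexes=None):
        self.indexes = indexes or {}
        self.transitions = {}  # attribute name -> Map reached by adding that attribute

    def with_attribute(self, name):
        # Reuse an existing transition so that instances adding the same
        # attributes in the same order end up sharing a single Map.
        if name not in self.transitions:
            new_indexes = dict(self.indexes)
            new_indexes[name] = len(self.indexes)
            self.transitions[name] = Map(new_indexes)
        return self.transitions[name]


EMPTY_MAP = Map()


class Instance:
    """Per-object state: one map pointer plus a compact list of values."""

    __slots__ = ("map", "storage")

    def __init__(self):
        self.map = EMPTY_MAP
        self.storage = []

    def write(self, name, value):
        index = self.map.indexes.get(name)
        if index is None:
            # New attribute: move to the (possibly shared) successor map.
            self.map = self.map.with_attribute(name)
            self.storage.append(value)
        else:
            self.storage[index] = value

    def read(self, name):
        index = self.map.indexes.get(name)
        if index is None:
            raise AttributeError(name)
        return self.storage[index]


a, b = Instance(), Instance()
for obj, (x, y) in ((a, (1, 2)), (b, (3, 4))):
    obj.write("x", x)
    obj.write("y", y)

print(a.map is b.map)  # both instances share one layout descriptor
```

The attribute names live once in the shared `Map`; each instance pays only for its values. The extra hop through `self.map.indexes` is the indirection mentioned above.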

@dralley dralley added the RFC Request for comments label Jan 15, 2020
@dralley
Contributor Author

dralley commented Jan 16, 2020

It seems there's at least one consequence of this choice for PyPy, though a minor one. As the PyPy documentation puts it:

sys.getsizeof() always raises TypeError. This is because a memory profiler using this function is most likely to give results inconsistent with reality on PyPy. It would be possible to have sys.getsizeof() return a number (with enough work), but that may or may not represent how much memory the object uses. It doesn’t even make really sense to ask how much one object uses, in isolation with the rest of the system. For example, instances have maps, which are often shared across many instances; in this case the maps would probably be ignored by an implementation of sys.getsizeof(), but their overhead is important in some cases if they are many instances with unique maps. Conversely, equal strings may share their internal string data even if they are different objects—or empty containers may share parts of their internals as long as they are empty. Even stranger, some lists create objects as you read them; if you try to estimate the size in memory of range(10**6) as the sum of all items’ size, that operation will by itself create one million integer objects that never existed in the first place. Note that some of these concerns also exist on CPython, just less so. For this reason we explicitly don’t implement sys.getsizeof().
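
The sharing problem described above can be reproduced on CPython, where `sys.getsizeof()` does return a number. The sketch below assumes CPython (on PyPy the call raises `TypeError`, as quoted), and `naive_size`, `payload`, and `rows` are hypothetical names chosen for illustration. It shows a per-object sum over-counting memory when several references point at one shared object.

```python
import sys


def naive_size(container):
    """Shallow size of the container plus the shallow size of each item.

    No de-duplication: an object referenced N times is counted N times.
    """
    return sys.getsizeof(container) + sum(sys.getsizeof(item) for item in container)


payload = "x" * 10_000
rows = [payload, payload, payload]  # three references to a single string object

counted = naive_size(rows)
deduplicated = sys.getsizeof(rows) + sys.getsizeof(payload)

print(counted - deduplicated)  # bytes attributed to copies that do not exist
```

The difference is exactly two extra copies of the string's shallow size. A shared map raises the same accounting question: whether its size belongs to every instance, to one of them, or to none.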

@youknowone youknowone added the A-design About RustPython's own implementation label Apr 18, 2022

2 participants