Investigate adoption of the PyPy approach to object layout #1690

dralley · 2020-01-15T17:15:29Z

PyPy uses a completely different approach to object layout from CPython's. It saves a large amount of memory (25% - 50%) in per-object overhead when there are many objects with the same layout (the same attributes).

I haven't seen any information on whether the added indirection carries a slight performance penalty, although I believe the PyPy JIT is able to optimize it away. I'm also not sure whether it makes native code extensions more difficult. The savings are big enough to warrant serious consideration, though.

Details here:

https://dev.nextthought.com/blog/2018/08/cpython-vs-pypy-memory-usage.html
https://morepypy.blogspot.com/2010/11/efficiently-implementing-python-objects.html#using-maps-for-memory-efficient-instances
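
As a rough illustration of the map idea described in the linked posts, here is a minimal sketch in plain Python. The names `Map`, `Instance`, `with_attribute`, `read` and `write` are hypothetical, chosen for this example; this is not PyPy's actual implementation, only a simplified model of instances that share one layout descriptor and keep just their values in a flat list.

```python
class Map:
    """Hypothetical shared layout descriptor: attribute name -> slot index."""

    def __init__(self):
        self.indexes = {}      # attribute name -> position in Instance.storage
        self.transitions = {}  # attribute name -> Map that also has that attribute

    def index_of(self, name):
        # -1 means "this layout has no such attribute"
        return self.indexes.get(name, -1)

    def with_attribute(self, name):
        # Reuse the same successor map for every instance that adds `name`
        # to this layout, so equal layouts end up sharing one Map object.
        if name not in self.transitions:
            new_map = Map()
            new_map.indexes = dict(self.indexes)
            new_map.indexes[name] = len(self.indexes)
            self.transitions[name] = new_map
        return self.transitions[name]


EMPTY_MAP = Map()


class Instance:
    """Hypothetical instance: a pointer to a shared Map plus a list of values."""

    def __init__(self):
        self.map = EMPTY_MAP
        self.storage = []

    def read(self, name):
        index = self.map.index_of(name)
        if index == -1:
            raise AttributeError(name)
        return self.storage[index]

    def write(self, name, value):
        index = self.map.index_of(name)
        if index != -1:
            self.storage[index] = value
        else:
            # New attribute: move to the successor layout and grow the storage.
            self.map = self.map.with_attribute(name)
            self.storage.append(value)


a = Instance()
a.write("x", 1)
a.write("y", 2)

b = Instance()
b.write("x", 10)
b.write("y", 20)

# Same attributes added in the same order: both instances share one Map.
shared = a.map is b.map

# A different insertion order produces a different layout in this sketch.
c = Instance()
c.write("y", 5)
c.write("x", 6)
different = c.map is not a.map
```

In this model each instance pays only for its value list and one map pointer, while the name-to-index dictionary is stored once per layout. The extra lookup through `self.map` is the indirection mentioned above.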

dralley · 2020-01-16T01:57:43Z

It seems like there's at least one consequence of this choice for PyPy, though minor:

sys.getsizeof() always raises TypeError. This is because a memory profiler using this function is most likely to give results inconsistent with reality on PyPy. It would be possible to have sys.getsizeof() return a number (with enough work), but that may or may not represent how much memory the object uses. It doesn’t even make really sense to ask how much one object uses, in isolation with the rest of the system. For example, instances have maps, which are often shared across many instances; in this case the maps would probably be ignored by an implementation of sys.getsizeof(), but their overhead is important in some cases if they are many instances with unique maps. Conversely, equal strings may share their internal string data even if they are different objects—or empty containers may share parts of their internals as long as they are empty. Even stranger, some lists create objects as you read them; if you try to estimate the size in memory of range(10**6) as the sum of all items’ size, that operation will by itself create one million integer objects that never existed in the first place. Note that some of these concerns also exist on CPython, just less so. For this reason we explicitly don’t implement sys.getsizeof().
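
To make the point about per-object sizes concrete, here is a small sketch. The helper `shallow_size` is a hypothetical name for this example. It assumes only the behaviour described above (a `TypeError` on PyPy) and the fact that on CPython `sys.getsizeof()` reports the object's own size without following references.

```python
import sys


def shallow_size(obj):
    """Return sys.getsizeof(obj), or None if the interpreter refuses to answer."""
    try:
        return sys.getsizeof(obj)
    except TypeError:
        # The quoted text says PyPy raises TypeError here.
        return None


big = "x" * 10_000
container = [big]

big_size = shallow_size(big)
container_size = shallow_size(container)

if big_size is not None and container_size is not None:
    # The list's reported size covers only the list object itself (its header
    # and its pointer to the string), not the string data it keeps alive.
    print("string:", big_size, "bytes; list holding it:", container_size, "bytes")
else:
    print("sys.getsizeof() is not available on this interpreter")
```

Code that depends on `sys.getsizeof()` for memory profiling would need a fallback like this, or a different measurement strategy, on an interpreter that uses shared maps.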

dralley added the RFC Request for comments label Jan 15, 2020

jamestwebber mentioned this issue May 23, 2020

[RFC] What are the compatibility goals for RustPython? #1940

Open

youknowone added the A-design About RustPython's own implementation label Apr 18, 2022

