Is there any difference between the memory management of mutable and immutable data structures in Python?
-
I wanted to understand whether there is any difference between the memory management of mutable and immutable data structures in Python. I was reading a couple of articles, but I am still confused about this.
Thanks!
-
Hello @andy09,
welcome to the Plugin Café and thank you for reaching out to us. This is a forum dedicated to developing plugins for Cinema 4D. We also welcome general programming questions here, and you have already posted in the correct forum for such questions.
Your question is quite clear, but at the same time it is so broad that it is unanswerable (at least for me). Memory management is just a term for how a programming language reserves and frees memory for an application. This can range from automated memory management, a.k.a. garbage collection, as employed by Python or .NET languages such as C#, to manual memory management as employed by C++ (there are also automated memory management libraries for C++). So, your question does not make much sense without the context of a language.
When you are referring to our C++ API, specifically the maxon API, then you are in a managed memory environment and do not have to worry about allocating and freeing memory (most of the time). In C++ (and other languages) there is also the additional layer of stack and heap allocated memory, where the former provides a small amount of memory (the size depends on the OS) which is automatically freed and bound to the local scope.
Usually, it does not make a difference if a type is mutable or not in C++ regarding the form of memory management which can be applied to it, but the stack can easily become too small for a mutable type which holds a large set of data. The problem with the term "mutable" is also that in conjunction with not specifying a language, it is unclear to me what you are talking about. In Python and Java for example, strings are immutable from a technical perspective, and the reason they are is memory management. But that is likely not what you are talking about?
To get a clearer answer, you should provide more context.
Cheers,
Ferdinand
edit: Eh, it is Python. I think you have edited your posting, or I did not have enough coffee. Well, in Python there is no manual memory management worth mentioning, everything is handled by Python's GC (garbage collector). There are also not many immutable types in Python which are truly immutable in a practical sense; the only examples in the standard library I can think of are tuples and frozen collections. Most atomic types such as integers, strings, Booleans, etc. are also immutable in a technical sense, but that has little impact on us as users, as we can view and treat them as mutable.
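To illustrate what "handled by Python's GC" means in practice, here is a minimal sketch. It assumes CPython, where most objects are freed by reference counting as soon as the last reference disappears, and the gc module cleans up reference cycles. Payload is a hypothetical helper class introduced only for this example, and other interpreters may free objects later than shown here.

```python
import gc
import weakref


class Payload:
    """Hypothetical helper type; a plain object() cannot be weakly referenced."""


def is_freed_after_del():
    obj = Payload()
    probe = weakref.ref(obj)  # a weak reference does not keep obj alive
    del obj                   # drops the last strong reference
    return probe() is None    # assumption: CPython frees the object right away


def is_cycle_freed_after_collect():
    a, b = Payload(), Payload()
    a.other, b.other = b, a   # the two objects now reference each other
    probe = weakref.ref(a)
    del a, b                  # reference counts do not reach zero here
    gc.collect()              # the cycle collector reclaims the pair
    return probe() is None


print(is_freed_after_del(), is_cycle_freed_after_collect())
```

In neither case does the script free memory itself; it only drops references.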
-
Thanks, @ferdinand, for your answer. Actually, I was asking about Python, which I forgot to mention (I have edited the question). I was reading article 1 and article 2 to understand the difference, but I was confused, so I asked here.
Anyway, thank you!
-
Hello @andy09,
Yeah, I just saw it; I probably did not have enough coffee. I edited my initial posting a bit before you posted your answer.
Regarding your articles: They are mostly unnecessarily technical and talk about what I did refer to under "technically immutable types".
As a brief explanation: When you have two variables in Python which both get the value 42 assigned,

    a = 42
    b = 42

then what happens is that Python will allocate an integer object in its object table, and a and b will point to that object.

    Code | Python Backend Object Table
         |
    a <--+------Object(42)
    b <--|
         |      Object("Bob is your uncle!")
         |      ...
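The sharing sketched in this table can be observed directly. This is a minimal sketch assuming CPython, which caches the integers from -5 to 256 as an implementation detail; larger values or other interpreters are not guaranteed to behave the same way.

```python
# Assumes CPython: small integers are cached, so both names
# end up bound to the very same object.
a = 42
b = 42

# 'is' compares object identity, '==' compares values.
same_object = a is b
same_value = a == b

print(same_object, same_value)
```

The identity check is only meant as a peek behind the curtain; regular code should compare integers with ==.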
This is different from C++, for example, where writing Int32 a = 42; will just reserve 4 bytes of memory for that variable a, and if there is an Int32 b = 42;, it will do the exact same thing again. So, the Python approach (which is also used by many other languages) is more memory efficient within this simplified view, as you only must store a value once.

The problem is now what happens to b when you mutate a.

    a = 42
    b = 42
    a = a + 1

What is now the value of b? Since we said a and b point to the same 42 object, one could assume that both a and b are 43. Which is of course not true, only a has the value of 43. Here comes into play the fact that integers are immutable in Python. What really happens here could be described as:

    a = CreateNewObject(a + 1)

and a schematic view would then be this:

    Code | Python Backend Object Table
         |
    a <---------Object(43)
    b <---------Object(42)
         |      Object("Bob is your uncle!")
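The rebinding shown in this schematic can be checked with id(), which returns an identifier for the object a name currently refers to. A minimal sketch; the variable names id_before, rebound, and b_unchanged are only there for illustration.

```python
a = 42
b = a                  # b is bound to the same object as a
id_before = id(a)

a = a + 1              # the name a is rebound to the object for 43

rebound = id(a) != id_before   # a now refers to a different object
b_unchanged = (b == 42)        # the object b refers to was never touched

print(a, b, rebound, b_unchanged)
```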
So, things share objects to save memory, but some objects cannot be altered, as this would throw a wrench into this shared access model. Another popular example is strings, which follow the same model in Python and other languages. When you have 1000 variables storing the string "Bob is your uncle!", then there is (within this simplified view) exactly one object in Python's object table storing that value. When one of the variables is being modified, it is not the string object that is modified, as this would also modify the value of all other 999 variables; instead, a new object is created.

So, technically integers, strings, and other atomic types are immutable in Python, but effectively they are not; it is just some very technical information to confuse beginners which has no real value.
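A minimal sketch of the shared-string idea. Whether equal strings end up as one object automatically is an interpreter implementation detail, so this example assumes nothing about that and uses sys.intern() from the standard library to request the sharing explicitly.

```python
import sys

# sys.intern() returns one canonical object per distinct string value.
# The text is built at runtime so that the example does not rely on
# compile-time merging of identical literals.
parts = ["Bob", "is", "your", "uncle!"]
first = sys.intern(" ".join(parts))
second = sys.intern(" ".join(parts))
shared = first is second

# "Modifying" one name builds a new string; the other name is untouched.
first += " And more."
still_shared = first is second

print(shared, still_shared, second)
```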
This is also why the list in your link is confusing at best. int, float, and str are technically immutable in Python, while tuple and frozen collections are practically immutable. Which is demonstrated by the snippet below:

    # A tuple in Python
    a = (1, 2, 3, 4)
    # We cannot do this, as tuples are intentionally immutable in Python, we cannot change their data.
    a[0] = 2  # raises TypeError
    # A string in Python
    s = "Bob is your uncle!"
    # We can very much do this, so from a user's perspective strings are mutable.
    s += " And Marry your aunt!"
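A small sketch of what the snippet above does behind the scenes. The list is an addition for contrast and not part of the original example: += on a string rebinds the name to a new object, while += on a list changes the existing object, which becomes visible through a second name.

```python
# Immutable: += on a string rebinds the name to a new object.
s = "Bob"
s_alias = s
s += " is your uncle!"

# Mutable (added for contrast): += on a list changes the object in place.
items = [1, 2, 3]
items_alias = items
items += [4]

# Item assignment on a tuple is rejected outright.
t = (1, 2, 3, 4)
try:
    t[0] = 2
    tuple_error = None
except TypeError as exc:
    tuple_error = type(exc).__name__

print(s_alias)       # the old string object is unchanged
print(items_alias)   # the alias sees the appended element
print(tuple_error)
```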
Long story short: This is all very technical without providing real value or insight for most users. Whether a type is mutable or immutable also has no impact on memory management in any tangible form for the user in Python. Most atomic types are immutable to save memory (i.e., a form of memory management), but that is not something we users have to worry about.
Cheers,
Ferdinand
-
Thanks a ton, @ferdinand, for your response. Coffee on me.
-
Nice postings, Ferdinand.
I don't want to get too nitpicky, but you wrote "So, the Python approach (which is also used by many other languages) is more memory efficient, as you only have to store a value once."
In general, that is probably not wrong. In conjunction with the example of two 32-bit integers, it is at least questionable to me. The constant is four bytes, usually in a process' read-only segment (well, Python... not sure it has something like a read-only segment. Or rather its process has, but I am not sure it is being (or can be) put to use from the script language itself). But if treated as objects, these objects need to be referred to, which I would expect to happen via pointers, which on a modern 64-bit system are almost always 64 bits, i.e., 8 bytes, wide. So in my mind, the example leads to 8 bytes in the C++ world and 20 bytes in Python (two pointers plus the actual value being pointed to, completely ignoring any additional overhead from object management, like use counts). I am pretty sure Python has optimized ways of realizing this situation, but I somewhat doubt that it can get as efficient as C++ (only speaking for this exact example).
And it would surely be a bad idea to try to mimic this behavior in C++ for a given number of integer constants, just because it is said to be more efficient. This is the actual reason for me posting this at all.
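The estimate above can be checked roughly from within Python. A minimal sketch, assuming CPython: sys.getsizeof() reports the size of the int object including its object header, and struct.calcsize("P") gives the size of a native pointer. The exact numbers depend on the interpreter version and platform, and the names python_estimate and cpp_estimate are only illustrative.

```python
import struct
import sys

# Size of one native pointer on this platform (8 on typical 64-bit builds).
pointer_size = struct.calcsize("P")

# Size of the int object holding 42, including its object header.
int_object_size = sys.getsizeof(42)

# Two names referring to one shared int object, ignoring the namespace itself.
python_estimate = 2 * pointer_size + int_object_size

# Two plain 32-bit integers, as in the C++ example.
cpp_estimate = 2 * 4

print(pointer_size, int_object_size, python_estimate, cpp_estimate)
```

On CPython the object header alone is larger than the four bytes of payload, which is the overhead the 20-byte figure deliberately ignores.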
-
Hello @a_block,
yeah, you are right. I somewhat expected such a comment from Maxime when writing this answer, because this is something we often talk about. What I am constantly pressing for is that we must abstract and compress things, as this is what documentation is meant to do.
You are of course completely right that Python (or Java, C#, etc.) is not more memory efficient with this model than C++, as the whole infrastructure to make this model happen takes up much more space than it saves, except for a very small set of outliers. Objects are pretty heavy in Python, and can also have weird behaviors which make it very hard to predict their size. There is this video by mCoding which explores some of the madness of Python lists (which is caused by the subject discussed here). Added to that is that this relative reduction of space complexity is bought with a heavy increase of runtime complexity, as this all must be managed. But I did hide this away to make the subject more accessible.

There are also other caveats with my examples, as strings in such languages usually exhibit properties of immutability. It is, for example, not possible to do this in Python:

    string = "Hello World!"
    string[0] = "X"

which I also ignored. I just followed here what is relevant for the user, who I assumed to not have a broad technical background and was struggling with these concepts. I.e., I compressed and abstracted the subject a bit. I just tied together what is meant by mutability in Python and how this is related to memory management for a non-expert audience.
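For completeness, a minimal sketch of what happens when the item assignment above is attempted, together with the usual workaround of building a new string. The variable names are only illustrative.

```python
text = "Hello World!"

try:
    text[0] = "X"          # str does not support item assignment
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

# The usual workaround builds a new string instead of changing the old one.
changed = "X" + text[1:]

print(error_name, changed, text)
```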
And I would not be able to properly explain this to an expert audience anyway, since I am only a technical librarian.
Cheers,
Ferdinand
-
I liked your approach of explanation and I do not want to diminish the value of your posts. Quite the opposite.
I only stumbled across the absoluteness of this one sentence, totally aware that you probably know better and only chose a simple example with a brevity of explanation suitable for this forum. I just wanted to make sure that this is not what potential readers take away if they do not fully understand it.
-
Yeah, I understood that. I did not take any offense here; I just gave a little insight into why I sometimes simplify things to a degree where an expert might say: "But this is not quite right!".
I modified the sentence you had an issue with to:
So, the Python approach (which is also used by many other languages) is more memory efficient within this simplified view, as you only must store a value once.
To more clearly indicate that this is an abstraction. I still would not point out that non-managed languages are effectively more memory efficient than managed ones, as that is not the subject here and actively obfuscates one of the reasons why this is done in these managed languages: To save memory.