Why do deep and shallow copies behave differently for "primitive" and "non-primitive" objects?

I know the difference between shallow and deep copy in Python, and the question is not about when to use one or the other. However, I find this trivial example pretty amusing and non-intuitive:

from copy import deepcopy

a = 0
b = deepcopy(a)
c = a
a += 1
print(a, b, c)

output: 1 0 0

from copy import deepcopy

a = [0]
b = deepcopy(a)
c = a
a[0] += 1
print(a, b, c)

output: [1] [0] [1]

I would like to know the reason why this design choice was made, since the two snippets of code are, in my opinion, quite equivalent, yet their output is completely different. To be more explicit: I wonder why = acts as a deep copy in the case of a "primitive" variable and as a shallow copy in the case of a "non-primitive" (but still built-in) variable like a list. I personally find this behavior counter-intuitive. Note: I used Python 3.

The catch here is mutability and immutability.

There is no such thing as primitive and non-primitive in Python; everything is an object of some type, and some types are just built in.

You need to understand how Python stores data in variables. Assuming you come from a C background, you can think of all Python variables as pointers: every variable stores a reference to the location where its value actually lives.

The builtin id function somewhat lets us look into where the value of a variable is actually stored.

>>> x = 12345678
>>> id(x)
1886797010128
>>> y = x
>>> id(y)
1886797010128
>>> y += 1
>>> y
12345679
>>> x
12345678
>>> id(y)
1886794729648

The variable x points to location 1886797010128, and that location holds the value 12345678. int is an immutable type in Python, which means that the data stored at location 1886797010128 can't be changed.

When we assign y = x, y now also points to the same address, since it's not necessary to allocate more memory for the same value.

When y is changed (remember that int is an immutable type and its value can't be changed), a new int is created at the new location 1886794729648, and y now points to this new int object at the new address.

The same happens when you try to update the value of a variable that holds immutable data.

>>> id(x)
140707671077744
>>> x = 30
>>> id(x)
140707671078064

Changing the value of a variable that has immutable data simply makes the variable point to a new object with the updated value.


This is not the case with mutable types like list.

>>> a = [1, 2, 3]
>>> b = a
>>> id(a), id(b)
(1886794896456, 1886794896456)
>>> b.append(4)
>>> a
[1, 2, 3, 4]
>>> b
[1, 2, 3, 4]
>>>

a is a list and is mutable; changing it using methods like append actually mutates the value at address 1886794896456. Since a and b both point to that address, a change made through b is also visible through a.
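
To make the distinction concrete, here is a small sketch (the variable names are illustrative, not from the question) contrasting in-place mutation with rebinding on the same list. It suggests that what matters is the operation performed, not only the type: append mutates the shared object, while b = b + [5] builds a new list and rebinds b.

```python
a = [1, 2, 3]
b = a                          # b refers to the same list object as a

b.append(4)                    # mutates the shared list in place
same_after_append = a is b     # still one object with two names

b = b + [5]                    # builds a new list and rebinds b; a is untouched
same_after_rebind = a is b

print(a, b)                                   # [1, 2, 3, 4] [1, 2, 3, 4, 5]
print(same_after_append, same_after_rebind)   # True False
```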


For a mutable object such as a list, deepcopy creates a new object at a different memory location with the same value as its parameter, i.e. the object passed to it.
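
As a rough illustration of how this differs from a shallow copy, the sketch below uses a nested list (the names original, shallow and deep are made up for this example). copy.copy creates a new outer list that still shares the inner lists, while copy.deepcopy copies the inner lists as well.

```python
import copy

original = [[0], [1]]
shallow = copy.copy(original)      # new outer list, same inner lists
deep = copy.deepcopy(original)     # new outer list and new inner lists

original[0][0] += 1                # mutate one of the inner lists

print(original)   # [[1], [1]]
print(shallow)    # [[1], [1]]  (shares the inner lists with original)
print(deep)       # [[0], [1]]  (fully independent)
print(shallow is original, shallow[0] is original[0])   # False True
print(deep[0] is original[0])                           # False
```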

I would like to know the reason why this design choice was made

This is simply because of how Python is designed as an object-oriented language. Similar behavior can be seen with objects in Java.

I personally find this behavior counter-intuitive

Intuition comes from practice. Practicing one language does not tell you how other languages work. Different languages have different design patterns and conventions, and I think some effort should be put into learning what they are for the language we are about to use.

c = a is neither a shallow copy nor a deep copy, no matter what a refers to. It's even shallower than that - it only copies a reference. Both c and a hold references to the same object after this assignment.

It is not possible to modify the value of an int in Python. When you use += on an int, Python assigns a (reference to a) new int to wherever you retrieved the original int from.

For the first case, a += 1 reassigns the a variable, while b and c continue to refer to the ints they referred to before the assignment.

For the second case, a[0] += 1 reassigns cell 0 of the list a refers to. b continues to refer to the copy, which is unchanged, and c continues to refer to the same list a refers to. Since this list has changed state, the change is visible through the c variable.
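
The second case can be spelled out step by step. This is only a sketch of the question's snippet: for an int element, a[0] += 1 amounts to a[0] = a[0] + 1, and old_item is a name added here purely for illustration.

```python
from copy import deepcopy

a = [0]
b = deepcopy(a)    # an independent list with the same contents
c = a              # a second name for the same list, no copy at all

old_item = a[0]
a[0] = a[0] + 1    # what a[0] += 1 amounts to for an int element

print(a is c)           # True: a and c are one list object
print(a is b)           # False: b is a separate list
print(old_item, a[0])   # 0 1: the int 0 did not change, the slot now holds another int
print(a, b, c)          # [1] [0] [1]
```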


Incidentally, deepcopy is designed to produce a deep copy, in the sense that (arbitrarily deep) modifications to the return value will not modify the argument, and vice versa. Since it is not possible to modify the value of an int in Python, an int counts as a (deep) copy of itself, and indeed, the deepcopy implementation simply returns its argument if its argument is an int.

>>> import copy
>>> x = 1000
>>> copy.deepcopy(x) is x
True
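
A slightly longer sketch of the same point, assuming CPython's copy module (the identity results for int and str are implementation behavior, not something to rely on in program logic). The names n, s, lst and dup are illustrative.

```python
import copy

n = 1000
s = "hello"
lst = [n, s]

int_reused = copy.deepcopy(n) is n    # immutable int: returned as-is in CPython
str_reused = copy.deepcopy(s) is s    # immutable str: returned as-is in CPython

dup = copy.deepcopy(lst)              # mutable list: a new list object is built
list_reused = dup is lst
element_reused = dup[0] is n          # but its immutable elements are reused

print(int_reused, str_reused)         # True True
print(list_reused, element_reused)    # False True
```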

It is commonplace to link objects between them during copies and not primitives.

The difference between your snippets is that, in the second one, c is a copy of the list a, and a list is an object, so they are linked. Whereas c was a "copy" of a primitive in the first snippet, which doesn't link.

Comments
  • Please share a Minimal, Complete, and Verifiable example
  • c=a isn't a shallow copy - it's even shallower than that. It only copies the reference.
  • So I suppose the reason why the integers work as they do is to do as little work as possible, i.e. to acquire new memory resources only when I really need them. Is it the right line of reasoning?
  • @alessiolapolla: It's not really about memory efficiency. The "references for everything" design arises naturally when trying to design a language that allows complex data structures without explicit pointers, and then if you let basic numeric types be mutable in such a design, you end up having to make a zillion explicit copies all over the place to prevent ints and floats from changing unexpectedly. Numbers get passed around far too often for that to be practical.
  • y doesn't point to the same object as x after y = x just to save memory. It points to the same object because variable assignment means "evaluate the expression on the right and store the resulting reference in the variable on the left". Regardless of mutability or immutability, this assignment is not allowed to create a new object or to make y point somewhere other than where x points.
  • Ints are objects too in Python.
  • Yes, you're right, but they kept the convention that you can't link "primitives", even though they are technically objects too.
  • That's not what's going on. There are no "linked" copies. Rather, Python variables hold references to objects, and c=a causes c and a to hold references to the same object - it does not copy any objects.
  • What I am saying is that python's ints do not reference the same object in order to respect the convention you'll find in other programming languages such as java: a=0; b=a; a+=1; print(b) prints 0.
  • That's not true. Two Python variables can reference the same int just fine, and indeed, c and a do reference the same int after c=a in the first snippet. They just stop referencing the same int after a+=1, because that assignment causes a to start referencing a different int.