Strange behavior of "is" with np.ndarray elements

The built-in "is" operator behaves strangely for elements of an np.ndarray.

Although the ids of the rhs and the lhs are the same, the "is" operator returns False (this behavior is specific to np.ndarray).

import numpy as np

a = np.array([1.,])
b = a.view()
print(id(a[0]) == id(b[0]))  # True
print(a[0] is b[0])  # False

This strange behavior happens even without a copy or a view.

a = np.array([1.,])
print(a[0] is a[0])  # False

Does anyone know the mechanism of this strange behavior (and possibly the evidence or specification)?

Post Script: Please re-think the two examples.

  1. If this is a list, this phenomenon is not observed.
a = [0., 1., 2.,]
b = []
b.append(a[0])
print(a[0] is b[0])  # True
  2. a[0] and b[0] refer to the exact same object.
a = np.array([1.,])
b = a.view()
b[0] = 0.
print(a[0])  # 0.0
print(id(a[0]) == id(b[0]))  # True

Note: This question may be a duplicate, but I'm still a bit confused.

a = np.array([1.,])
b = a.view()
x = a[0]
y = b[0]
print(id(a[0]))  # 139746064667728
print(id(b[0]))  # 139746064667728
print(id(a[0]) == id(b[0])) # True
print(id(a[0]) == id(x)) # False
print(id(x) == id(y))  # False
  1. Is a[0] a temporary object?
  2. Is the id of a temporary object reused?
  3. Doesn't that contradict the specification? (https://docs.python.org/3.7/reference/expressions.html#is)
6.10.3. Identity comparisons
The operators is and is not test for object identity: x is y is true if and only if x and y are the same object. Object identity is determined using the id() function. x is not y yields the inverse truth value.
  4. If the id is reused for temporary objects, why is the id different in this case?
>>> id(100000000000000000 + 1) == id(100000000000000001)
True
>>> id(100000000000000000 + 1) == id(100000000000000000)
False

This is simply due to the difference in how is and == work: the is operator doesn't compare values, it checks whether the two operands refer to the same object.

For example if you do:

print(a is a)

The output will be: True

Each time a[0] is evaluated, a new object is created for that operand, and this can be observed with a simple test using the id function.

print(id(a[0]),a[0] is a[0],id(a[0]))

The output will be:

140296834593128 False 140296834593248

As for the additional question of why lists don't behave the way NumPy arrays do: it comes down to how they are constructed. NumPy arrays were designed to be more efficient in processing and in storage than a normal Python list.

So every time you access an element of a NumPy array, a new object is created with its own id, as you can observe from the following code:

a = np.array([0., 1., 2.,])
b = []
b.append(a[0])
print(id(a[0]),a[0] is b[0],id(b[0]))

Here are the outputs of multiple re-runs of the same code in jupyter-lab:

140296834595096 False 140296834594496
140296834595120 False 140296834594496
140296834595120 False 140296834594496
140296834595216 False 140296834594496
140296834595288 False 140296834594496

Notice something strange? The id of the NumPy array element differs with each re-run, while the id of the object stored in the list stays the same. This explains the strange behaviour of NumPy arrays in your question.

If you want to read more on this behaviour, I suggest the NumPy docs.

a[0] is of type <class 'numpy.float64'>. When you do the comparison, it creates two instances of the class, so the is check fails. However, if you do the following you will get what you wanted, because now both sides reference the same object.

x = a[0]
print(x is x)  # True

This is covered by id() vs `is` operator. Is it safe to compare `id`s? Does the same `id` mean the same object? . In this particular case:

  1. a[0] and b[0] are created anew each time

    In [7]: a[0] is a[0]
    Out[7]: False
    
  2. In id(a[0]) == id(b[0]), each object is immediately discarded after taking its id, and b[0] just happened to take up the id of the recently-discarded a[0]. Even if this happens each time in your version of CPython for this particular expression (due to a specific evaluation order and heap organization), this is an implementation detail and you can't rely on it.
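
As a minimal sketch of point 2 (the variable names are illustrative, not from the question): when the scalar objects are kept alive at the same time, their ids are guaranteed to be distinct, so the coincidence disappears.

```python
import numpy as np

a = np.array([1.0])
b = a.view()

# Keep every scalar alive by storing it in a list.
# Objects with overlapping lifetimes must have distinct ids.
alive = [a[0], b[0], a[0], b[0]]
ids = [id(obj) for obj in alive]

print(len(set(ids)) == len(alive))  # True: four live objects, four ids
print(alive[0] is alive[1])         # False: separate objects
print(alive[0] == alive[1])         # True: same value
```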

Numpy stores array data as a raw data buffer. When you access the data like a[0], it reads from the buffer and constructs a python object for it. Thus, calling a[0] twice will construct 2 python objects. is checks for identity, so 2 different objects will compare false.
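
A short sketch of the contrast, assuming a standard NumPy install (the variable names are made up for illustration). A list and an object-dtype array store references to existing Python objects, whereas a float64 array stores raw values and has to build a new scalar object on each access.

```python
import numpy as np

f = 1.5
as_list = [f]
as_object_array = np.array([f], dtype=object)  # stores a reference to f
as_float_array = np.array([f])                 # float64, raw data buffer

print(as_list[0] is f)           # True
print(as_object_array[0] is f)   # True
print(as_float_array[0] is f)    # False
print(type(as_float_array[0]))   # <class 'numpy.float64'>

# Two accesses build two separate scalar objects with equal values.
x = as_float_array[0]
y = as_float_array[0]
print(x is y, x == y)            # False True
```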

The annotated walkthrough below should make the process much clearer:

NOTE: The id numbers are sequential only to serve as examples; in practice you'd get arbitrary-looking numbers. The multiple id 3s in the example may not always be the same number either. It is just possible that they are, because id 3 is repeatedly freed and thus reusable.

a = np.array([1.,])
b = a.view()
x = a[0]    # python reads a[0], creates new object id 1.
y = b[0]    # python reads b[0] which reads a[0], creates new object id 2. (1 is used by object x)

print(id(a[0]))  # python reads a[0], creates new object id 3.
                 # After this call, the object id 3 a[0] is no longer used.
                 # Its lifetime has ended and id 3 is freed.

print(id(b[0]))  # python reads b[0] which reads a[0], creates new object id 3. 
                 # id 3 has been freed and is reusable.
                 # After this call, the object id 3 b[0] is no longer used.
                 # Its lifetime has ended and id 3 is freed (again).

print(id(a[0]) == id(b[0])) # This runs in 2 steps.
                            # First id(a[0]) is run. Just like above, this creates an object with id 3.
                            # Then a[0] is disposed of since no references to it remain. id 3 is freed again.
                            # Then id(b[0]) is run. Again, it creates an object with id 3 (since id 3 is free).
                            # So, id(a[0]) == 3, id(b[0]) == 3. They are equal.

print(id(a[0]) == id(x)) # As above, id(a[0]) can create an object with id 3, while x keeps
                         # its reference to the id 1 object. 3 != 1.

print(id(x) == id(y))  # x references id 1 object, y references id 2 object. 1 != 2

Regarding

>>> id(100000000000000000 + 1) == id(100000000000000001)
True
>>> id(100000000000000000 + 1) == id(100000000000000000)
False

id allocation and garbage collection are implementation details. What is guaranteed is that, at a single point in time, two different objects have different ids and two references to the same object yield the same id. The problem is that some expressions are not atomic (i.e. they do not run at a single point in time).

Python may decide to reuse or not to reuse freed id numbers as it wishes, depending on the implementation. In this case, the ids matched in one case and not in the other. (A likely explanation for id(100000000000000000 + 1) == id(100000000000000001) is that CPython evaluates the constant expression at compile time and stores equal constants only once, so both calls see the same object at the same location in memory.)
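
One way to probe the integer case on CPython is to inspect the constants of the compiled expression. This is only a sketch: it relies on CPython implementation details (compile-time constant folding and merging of equal constants), other interpreters or versions may behave differently, and the helper name constants_of is made up for illustration.

```python
def constants_of(source):
    """Return the constants stored in the code object compiled from source."""
    return compile(source, "<demo>", "eval").co_consts

same = constants_of("id(100000000000000000 + 1) == id(100000000000000001)")
diff = constants_of("id(100000000000000000 + 1) == id(100000000000000000)")

# On CPython the addition is typically folded at compile time, so the
# folded result shows up among the constants of both expressions.
print(same)
print(diff)
```

If the first tuple contains 100000000000000001 only once, both id() calls receive the same, still-alive constant object, which would give True without any id reuse. In the second expression the two constants are different objects that exist at the same time, so their ids must differ.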

A big part of the confusion here is the nature of a[0] in the case of an array.

For a list, b[0] is an actual element of b. We can illustrate this by making a list of mutable items (other lists):

In [22]: b = [[0],[1],[2],[3]]
In [23]: b1 = b[0]
In [24]: b1
Out[24]: [0]
In [25]: b[0].append(10)
In [26]: b
Out[26]: [[0, 10], [1], [2], [3]]
In [27]: b1
Out[27]: [0, 10]
In [28]: b1.append(20)
In [29]: b
Out[29]: [[0, 10, 20], [1], [2], [3]]

Mutating b[0] and mutating b1 act on the same object.

For an array:

In [35]: a = np.array([0,1,2,3])
In [36]: c = a.view()
In [37]: a1 = a[0]
In [38]: a += 1
In [39]: a
Out[39]: array([1, 2, 3, 4])
In [40]: c
Out[40]: array([1, 2, 3, 4])
In [41]: a1
Out[41]: 0

An in-place change in a does not change a1, even though it did change c.

__array_interface__ shows us where the databuffer for an array is stored - think of it, in a loose sense, as the memory address of that buffer.

In [42]: a.__array_interface__['data']
Out[42]: (31233216, False)
In [43]: c.__array_interface__['data']
Out[43]: (31233216, False)
In [44]: a1.__array_interface__['data']
Out[44]: (28513712, False)

The view has the same databuffer. But a1 does not. a[0:1] is a single element view of a, and does share the data buffer.

In [45]: a[0:1].__array_interface__['data']
Out[45]: (31233216, False)
In [46]: a[1:2].__array_interface__['data']  # 8 bytes over
Out[46]: (31233224, False)

So id(a[0]) tells us next to nothing about a. Comparing ids only tells us something about how memory slots are recycled, or not, when constructing Python objects.
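
The same point can be checked without reading addresses by hand. A minimal sketch using np.shares_memory from NumPy's public API; the variable names are illustrative:

```python
import numpy as np

a = np.array([0, 1, 2, 3])
c = a.view()   # full view of a
s = a[0:1]     # one-element slice, also a view
a1 = a[0]      # scalar built from the buffer at access time

print(np.shares_memory(a, c))  # True
print(np.shares_memory(a, s))  # True

# Writing through a is visible in both views but not in the scalar.
a[0] = 99
print(c[0], s[0], a1)          # 99 99 0
```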


Comments
  • Possible duplicate of id() vs `is` operator. Is it safe to compare `id`s? Does the same `id` mean the same object?
  • No, this is not a duplicated question. In that case, the temporary object for which we cannot estimate its lifetime was discussed. However, in this case, the objective is a memory buffer which can also be confirmed even by memory-checking tools.
  • a[0] does not give direct access to the databuffer of a.
  • Re: edit: as per the duplicate, the result of id(foo()) == id(bar()) is undefined behavior, and for int specifically, there's another factor at play.
  • No, I do not refer to that difference. Remember, Python's "is" (usually) returns True when the lhs and rhs's ids are the same. For example, a = [0, 1, 2]; b = []; b.append(a[0]); print(a[0] is b[0]) # True. This is only specific to numpy.
  • @YukiHashimoto if the ids are the same then it should be True; please elaborate a little more on what you are saying.
  • Thanks again. Please try to reproduce the code in my question. The ids of a[0] and "another" a[0] are the same, but the is operator returns False.
  • @YukiHashimoto that is because numpy arrays and lists are quite different from each other on how they are called, I will edit my answer in a minute to explain.
  • I doubt it is possible for two objects with different lifetimes to have the same id. Put very crudely, an id is like a memory address; two things can't be in the same place at a given time.
  • Thanks for the comment. However, in the first case, a[0] and b[0] are not different objects. For example, a = np.array([1.,]); b = a.view(); b[0] = 0; print(a[0]) # 0.
  • No, a[0] and b[0] are still different numpy objects. They reference, or unbox, the same value in a. For a list alist[0] is the actual object in the list, so id's match. But the 'contents' of array a is a flat data buffer. a[0] is not an item in the buffer.
  • OK, I understood what happened... But doesn't this contradict the Python documentation?
  • @YukiHashimoto No. And the duplicate explains why: because in id(a[0]) == id(b[0]) , a[0] and b[0] do not exist at the same time.
  • OK, but why is their id the same?