How to properly mask a numpy 2D array?

Say I have a two dimensional array of coordinates that looks something like

x = np.array([[1,2],[2,3],[3,4]])

Earlier in my code, I generated a mask that ends up looking something like

mask = [False,False,True]

When I try to use this mask on the 2D coordinate array, I get an error:

newX = np.ma.compressed(np.ma.masked_array(x,mask))

numpy.ma.core.MaskError: Mask and data not compatible: data size is 6, mask size is 3.

which makes sense, I suppose. So I tried to simply use the following mask instead:

mask2 = np.column_stack((mask,mask))
newX = np.ma.compressed(np.ma.masked_array(x,mask2))

And what I get is close:

array([1,2,2,3])

to what I would expect (and want):

array([[1,2],[2,3]])

Is there an easier way to do this?

Is this what you are looking for?

import numpy as np
x[~np.array(mask)]
# array([[1, 2],
#        [2, 3]])
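
A minimal sketch of why the conversion to an array matters, assuming the `x` and `mask` from the question. A plain Python list does not support `~`, so the mask is converted to a boolean array before it is inverted. The names `keep` and `new_x` are only illustrative.

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [3, 4]])
mask = [False, False, True]            # True marks the rows to drop

# Convert the list to a boolean array, then invert it:
# True now marks the rows to keep.
keep = ~np.asarray(mask, dtype=bool)

# A 1-D boolean index selects along axis 0, so the result stays 2-D.
new_x = x[keep]
print(new_x.shape)
```

Because whole rows are selected, the second dimension is untouched and no masked array is needed.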

Or, using a NumPy masked array:

newX = np.ma.array(x, mask=np.column_stack((mask, mask)))
newX

# masked_array(data =
#  [[1 2]
#  [2 3]
#  [-- --]],
#              mask =
#  [[False False]
#  [False False]
#  [ True  True]],
#        fill_value = 999999)

Your x is 3x2:

In [379]: x
Out[379]: 
array([[1, 2],
       [2, 3],
       [3, 4]])

Make a 3 element boolean mask:

In [380]: rowmask=np.array([False,False,True])

That can be used to select the rows where it is True, or where it is False. In both cases the result is 2d:

In [381]: x[rowmask,:]
Out[381]: array([[3, 4]])

In [382]: x[~rowmask,:]
Out[382]: 
array([[1, 2],
       [2, 3]])

This is without using the MaskedArray subclass. To make such an array, we need a mask that matches x in shape. There is no provision for masking along just one dimension.

In [393]: xmask=np.stack((rowmask,rowmask),-1)  # column stack

In [394]: xmask
Out[394]: 
array([[False, False],
       [False, False],
       [ True,  True]], dtype=bool)

In [395]: np.ma.MaskedArray(x,xmask)
Out[395]: 
masked_array(data =
 [[1 2]
 [2 3]
 [-- --]],
             mask =
 [[False False]
 [False False]
 [ True  True]],
       fill_value = 999999)

Applying compressed to that produces a raveled array: array([1, 2, 2, 3])

Since masking is element by element, it could mask one element in row 1, two in row 2, and so on. So in general, compressing (removing the masked elements) will not yield a 2d array. The flattened form is the only general choice.
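
To illustrate the point about flattening, here is a small sketch with a hypothetical scattered mask (not taken from the question) in which each row keeps a different number of elements. No rectangular 2d result is possible, so `compressed` can only return a 1d array.

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [3, 4]])

# Hypothetical mask: one element masked in row 0, none in row 1, both in row 2.
scattered = np.array([[True, False],
                      [False, False],
                      [True, True]])
m = np.ma.masked_array(x, mask=scattered)

per_row = m.count(axis=1)   # number of unmasked elements in each row
flat = m.compressed()       # all unmasked values as a 1-D array

print(per_row, flat)
```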

np.ma makes the most sense when there's a scattering of masked values. It isn't of much value if you want to select, or deselect, whole rows or columns.

===============

Here are more typical masked arrays:

In [403]: np.ma.masked_inside(x,2,3)
Out[403]: 
masked_array(data =
 [[1 --]
 [-- --]
 [-- 4]],
             mask =
 [[False  True]
 [ True  True]
 [ True False]],
       fill_value = 999999)

In [404]: np.ma.masked_equal(x,2)
Out[404]: 
masked_array(data =
 [[1 --]
 [-- 3]
 [3 4]],
             mask =
 [[False  True]
 [ True False]
 [False False]],
       fill_value = 2)

In [406]: np.ma.masked_outside(x,2,3)
Out[406]: 
masked_array(data =
 [[-- 2]
 [2 3]
 [3 --]],
             mask =
 [[ True False]
 [False False]
 [False  True]],
       fill_value = 999999)

Since none of these solutions worked for me, I thought I would write down what did, in case it is useful for somebody else. I use Python 3.x and I worked on two 3D arrays. One, which I call data_3D, contains float values of recordings in a brain scan, and the other, template_3D, contains integers which represent regions of the brain. I wanted to choose those values from data_3D that correspond to an integer region_code in template_3D:

my_mask = np.in1d(template_3D, region_code).reshape(template_3D.shape)
data_3D_masked = data_3D[my_mask]

which gives me a 1D array of only the relevant recordings.
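
A sketch of the same idea on toy data, assuming small made-up arrays in place of the real scan (`template_3d`, `data_3d` and the values in them are illustrative only). It uses `np.isin`, which keeps the shape of its first argument, so the `reshape` step is not needed. Recent NumPy releases also deprecate `np.in1d` in favour of `np.isin`.

```python
import numpy as np

# Toy stand-ins for the region template and the recordings.
template_3d = np.array([[[1, 2], [2, 3]],
                        [[3, 1], [2, 2]]])
data_3d = np.arange(8, dtype=float).reshape(2, 2, 2)
region_code = 2

# Boolean mask with the same shape as template_3d.
region_mask = np.isin(template_3d, region_code)

# Boolean indexing with a full-shape mask returns a 1-D array.
selected = data_3d[region_mask]
print(selected)
```

For a single integer code, `template_3d == region_code` would give the same mask. `np.isin` is mainly useful when `region_code` is a list of several codes.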

In your last example, the problem is not the mask. It is your use of compressed. From the docstring of compressed:

Return all the non-masked data as a 1-D array.

So compressed flattens the nonmasked values into a 1-d array. (It has to, because there is no guarantee that the compressed data will have an n-dimensional structure.)

Take a look at the masked array before you compress it:

In [8]: np.ma.masked_array(x, mask2)

Out[8]: 
masked_array(data =
 [[1 2]
 [2 3]
 [-- --]],
             mask =
 [[False False]
 [False False]
 [ True  True]],
       fill_value = 999999)
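
If the goal is to drop the masked rows while keeping a 2d result, one possible sketch (assuming the `x` and `mask2` from the question) uses `np.ma.compress_rows`, which removes every row that contains at least one masked value. The second variant does the same with plain boolean indexing on the mask.

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [3, 4]])
mask = np.array([False, False, True])
mask2 = np.column_stack((mask, mask))
masked = np.ma.masked_array(x, mask2)

# Drop whole rows that contain any masked value; returns a plain 2-D ndarray.
rows_kept = np.ma.compress_rows(masked)

# Equivalent without the helper: keep rows where no element is masked.
row_has_mask = np.ma.getmaskarray(masked).any(axis=1)
rows_kept_alt = masked.data[~row_has_mask]

print(rows_kept)
```

Note that `compress_rows` discards a row even if only one of its elements is masked, so it only matches the question's intent when rows are masked as a whole.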

With np.where you can do all sorts of things:

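# mask is 1-D (one entry per row), so add an axis to broadcast it across the columns
x_maskd = np.where(np.asarray(mask)[:, None], x, 0)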
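
A short sketch of what `np.where` does with the question's data, assuming the 1d row mask from the question. A mask of shape (3,) does not broadcast against a (3, 2) array, so it is given a trailing axis first. The variable names are illustrative.

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [3, 4]])
mask = np.array([False, False, True])

# Shape (3, 1) broadcasts across both columns of x.
col_mask = mask[:, None]

# Keep x where the mask is True, put 0 elsewhere.
kept_where_true = np.where(col_mask, x, 0)

# The opposite convention: zero out the rows where the mask is True.
zeroed_where_true = np.where(col_mask, 0, x)

print(kept_where_true)
print(zeroed_where_true)
```

Unlike boolean indexing, `np.where` always returns an array with the original shape, so rows are replaced rather than removed.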

Comments
  • Ah I see, so what I was trying does work, I just can't compress it. Hm. Is there a way to remove masked elements of an array without losing the dimensionality of the array? np.ma.compressed() does both.
  • I don't know too much about masked arrays either, probably the same level as you. Just trying to make it work. Well, if you are trying to remove elements, I think logical indexing is not a bad way.
  • You're right, it's correct before I compress it. I will read the documentation for a way to remove masked elements while preserving array dimensionality. Thanks
  • If I understand what you are trying to do, @Psidom's first suggestion looks reasonable. In particular, you probably don't need a masked array. Just index a regular array with a boolean array.