Mask out specific values from an array

Question

Example:

I have an array:

array([[1, 2, 0, 3, 4],
       [0, 4, 2, 1, 3],
       [4, 3, 2, 0, 1],
       [4, 2, 3, 0, 1],
       [1, 0, 2, 3, 4],
       [4, 3, 2, 0, 1]], dtype=int64)

I have a set (variable length, order doesn't matter) of "bad" values:

{2, 3}

I want to return the mask that hides these values:

array([[False,  True, False,  True, False],
       [False, False,  True, False,  True],
       [False,  True,  True, False, False],
       [False,  True,  True, False, False],
       [False, False,  True,  True, False],
       [False,  True,  True, False, False]], dtype=bool)

What's the simplest way to do this in NumPy?


Tags: python, arrays, numpy, membership | 2016-12-25 06:12 | 3 Answers

Answers ( 3 )

  1. 2016-12-25 07:12

    There may be simpler ways, but here is one: wrap a set-membership test in np.vectorize and apply it to the array.

    import numpy as np
    
    a = np.array([[1, 2, 0, 3, 4],
           [0, 4, 2, 1, 3],
           [4, 3, 2, 0, 1],
           [4, 2, 3, 0, 1],
           [1, 0, 2, 3, 4],
           [4, 3, 2, 0, 1]], dtype=np.int64)
    
    # Wrap a plain Python membership test so it is applied element by element
    f = np.vectorize(lambda x: x in {2, 3})
    print(f(a))
    

    Output:

    [[False  True False  True False]
     [False False  True False  True]
     [False  True  True False False]
     [False  True  True False False]
     [False False  True  True False]
     [False  True  True False False]]
    
  2. 2016-12-25 07:12

    Use np.in1d, which returns a flattened mask of the matching occurrences, then reshape that mask back to the input array's shape to get the desired output:

    np.in1d(a,[2,3]).reshape(a.shape)
    

    Sample run:

    In [5]: a
    Out[5]: 
    array([[1, 2, 0, 3, 4],
           [0, 4, 2, 1, 3],
           [4, 3, 2, 0, 1],
           [4, 2, 3, 0, 1],
           [1, 0, 2, 3, 4],
           [4, 3, 2, 0, 1]])
    
    In [6]: np.in1d(a,[2,3]).reshape(a.shape)
    Out[6]: 
    array([[False,  True, False,  True, False],
           [False, False,  True, False,  True],
           [False,  True,  True, False, False],
           [False,  True,  True, False, False],
           [False, False,  True,  True, False],
           [False,  True,  True, False, False]], dtype=bool)
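
A note for newer NumPy versions: np.isin (available since NumPy 1.13) performs the same membership test but keeps the shape of its input, so no reshape is needed, and np.in1d is deprecated as of NumPy 2.0. A minimal sketch, assuming the array a from the question; the name bad is chosen here only for illustration:

```python
import numpy as np

a = np.array([[1, 2, 0, 3, 4],
              [0, 4, 2, 1, 3],
              [4, 3, 2, 0, 1],
              [4, 2, 3, 0, 1],
              [1, 0, 2, 3, 4],
              [4, 3, 2, 0, 1]], dtype=np.int64)

bad = {2, 3}  # illustrative name; any length, order does not matter

# np.isin expects a sequence of test values. A Python set passed directly
# is treated as a single object, so convert it to a list first.
mask = np.isin(a, list(bad))
print(mask)
```

The resulting mask has the same shape as a, so it can be used directly for boolean indexing or as the mask of a masked array.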
    
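  3. 2016-12-25 09:12

    With x as the array from the question, compare it against each bad value and combine the results with np.any:

    In [965]: np.any([x==i for i in (2,3)],axis=0)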
    Out[965]: 
    array([[False,  True, False,  True, False],
           [False, False,  True, False,  True],
           [False,  True,  True, False, False],
           [False,  True,  True, False, False],
           [False, False,  True,  True, False],
           [False,  True,  True, False, False]], dtype=bool)
    

    This does iterate, but if the set of values (2,3) is small relative to the size of x, it is still fast. In fact, for small ar2, np.in1d does this:

            mask = np.zeros(len(ar1), dtype=np.bool)
            for a in ar2:
                mask |= (ar1 == a)
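
The same accumulate-with-OR idea can be applied directly to the 2-D array, which avoids the flatten and reshape steps. This is only a sketch of the technique, not the library's actual code; x is assumed to be the array from the question and bad is an illustrative name:

```python
import numpy as np

x = np.array([[1, 2, 0, 3, 4],
              [0, 4, 2, 1, 3],
              [4, 3, 2, 0, 1],
              [4, 2, 3, 0, 1],
              [1, 0, 2, 3, 4],
              [4, 3, 2, 0, 1]], dtype=np.int64)

bad = (2, 3)  # illustrative name for the values to hide

# Start from an all-False mask with the array's shape, then OR in
# one vectorized equality comparison per bad value.
mask = np.zeros(x.shape, dtype=bool)
for value in bad:
    mask |= (x == value)

print(mask)
```

The Python-level loop runs once per bad value, not once per array element, which is why it stays cheap when the set of bad values is short.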
    

    Making a masked array from this (with mask holding the boolean array computed above):

    In [970]: np.ma.MaskedArray(x,mask)
    Out[970]: 
    masked_array(data =
     [[1 -- 0 -- 4]
     [0 4 -- 1 --]
     [4 -- -- 0 1]
     [4 -- -- 0 1]
     [1 0 -- -- 4]
     [4 -- -- 0 1]],
                 mask =
     [[False  True False  True False]
     [False False  True False  True]
     [False  True  True False False]
     [False  True  True False False]
     [False False  True  True False]
     [False  True  True False False]],
           fill_value = 999999)
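
To try the masked-array step end to end, here is a sketch that assumes x is the array from the question and builds mask with the np.any expression from this answer. Recent NumPy versions print masked arrays in a different layout from the session above, so the printed form may not match it exactly:

```python
import numpy as np

x = np.array([[1, 2, 0, 3, 4],
              [0, 4, 2, 1, 3],
              [4, 3, 2, 0, 1],
              [4, 2, 3, 0, 1],
              [1, 0, 2, 3, 4],
              [4, 3, 2, 0, 1]], dtype=np.int64)

# True wherever x equals one of the bad values
mask = np.any([x == i for i in (2, 3)], axis=0)

# Entries where mask is True are hidden from later computations
mx = np.ma.MaskedArray(x, mask=mask)

print(mx)
print(mx.sum())         # reductions skip the masked entries
print(mx.compressed())  # 1-D array of the values that remain visible
```

Reductions such as sum() ignore the masked positions, and compressed() returns only the unmasked values as a flat array.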
    