python - Find missing values in NumPy array of dtype obj
I'm being driven crazy by a NumPy array of dtype object that contains a missing value (in the example below, it is the penultimate value).

    >>> a
    array([0, 3, 'braund, mr. owen harris', 'male', 22.0, 1, 0, 'a/5 21171', 7.25, nan, 's'], dtype=object)

I want to find this missing value programmatically, with a function that returns a boolean vector with True in the elements that correspond to missing values in the array (as in the example below).

    >>> some_function(a)
    array([False, False, False, False, False, False, False, False, False, True, False], dtype=bool)

I tried isnan to no avail.

    >>> isnan(a)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

I also attempted performing the operation explicitly on every element of the array with apply_along_axis, but the same error is returned.

    >>> apply_along_axis(isnan, 0, a)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Can anyone explain to me (1) what I'm doing wrong and (2) what I can do to solve this problem? From the error, I gather that it has to do with one of the elements not being of an appropriate type. What is the easiest way around this issue?
Another workaround is:

    In [148]: [item != item for item in a]
    Out[148]: [False, False, False, False, False, False, False, False, False, True, False]

since NaNs are not equal to themselves. Note, however, that it is possible to define custom objects which, like NaN, are not equal to themselves:
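    class Foo(object):
        def __cmp__(self, obj):  # honored by Python 2 only
            return -1

        def __ne__(self, obj):  # Python 3 ignores __cmp__, so define != directly
            return True

    foo = Foo()
    assert foo != foo

So using item != item does not necessarily mean that item is a NaN.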
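If objects that are unequal to themselves are a concern, one option is to test the type before testing for NaN. The following is only a sketch: the helper name is_nan_mask is made up for illustration, and it assumes that "missing" means a float NaN and nothing else (None, for instance, would not be flagged).

```python
import math

import numpy as np


def is_nan_mask(arr):
    """Return a boolean array that is True where an element is a float NaN.

    Hypothetical helper, not part of NumPy.
    """
    # Only Python floats and NumPy floating scalars are tested with isnan;
    # strings, ints and custom objects are reported as not missing.
    return np.array(
        [isinstance(item, (float, np.floating)) and math.isnan(item)
         for item in arr],
        dtype=bool,
    )


a = np.array([0, 3, 'braund, mr. owen harris', 'male', 22.0, 1, 0,
              'a/5 21171', 7.25, np.nan, 's'], dtype=object)
mask = is_nan_mask(a)
```

With the array from the question, mask is True only at the penultimate position, and an object that merely claims to be unequal to itself is not mistaken for a NaN.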
Also note that it is a good idea to avoid NumPy arrays of dtype object if possible.
- They are not particularly quick: operations on their contents devolve into Python calls on the underlying Python objects. A normal Python list may have better performance.
- Unlike numeric arrays, which can be more space efficient than a Python list of numbers, object arrays are not particularly space efficient, since every item is a reference to a Python object.
- They are not particularly convenient, since many NumPy operations do not work on arrays of dtype object. isnan is one such example.
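To illustrate the last point, here is a small sketch with made-up values that contrasts a float array with an object array holding the same numbers. It assumes every element is numeric, so the astype(float) conversion at the end would not work on the mixed array from the question.

```python
import numpy as np

values = [22.0, 7.25, np.nan]  # illustrative data only
as_float = np.array(values, dtype=float)
as_object = np.array(values, dtype=object)

# isnan works directly on a float array.
float_mask = np.isnan(as_float)

# On an object array the same call raises TypeError.
try:
    np.isnan(as_object)
    object_supported = True
except TypeError:
    object_supported = False

# If every element is numeric, converting to float makes isnan usable again.
converted_mask = np.isnan(as_object.astype(float))
```

Keeping numeric columns in their own float arrays, and strings elsewhere, avoids the problem altogether.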