nan – Page 4 – Tarik Billa

Pandas: print column name with missing values

August 10, 2023 by Tarik

df.isnull().any() generates a boolean Series (True if the column has a missing value, False otherwise). You can use it to index into df.columns: df.columns[df.isnull().any()] returns an Index of the columns that have missing values. df = pd.DataFrame({'A': [1, 2, 3], 'B': [1, 2, np.nan], 'C': [4, 5, 6], 'D': [np.nan, np.nan, np.nan]}) df Out: … Read more
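A minimal, self-contained sketch of the approach described in this excerpt, using the same four-column frame. The names has_missing and cols_with_missing are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [1, 2, np.nan],
    "C": [4, 5, 6],
    "D": [np.nan, np.nan, np.nan],
})

# One boolean per column: True where the column holds at least one NaN.
has_missing = df.isnull().any()

# Use the boolean Series to index the column labels, then convert to a plain list.
cols_with_missing = df.columns[has_missing].tolist()
print(cols_with_missing)  # ['B', 'D']
```

Calling .tolist() is optional; without it the result stays a pandas Index.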

Select non-null rows from a specific column in a DataFrame and take a sub-selection of other columns

August 8, 2023 by Tarik

You can pass a boolean mask to your df based on notnull() of the 'Survive' column and select the columns of interest: In [2]: # make some data df = pd.DataFrame(np.random.randn(5,7), columns=['Survive', 'Age', 'Fare', 'Group_Size', 'deck', 'Pclass', 'Title']) df.loc[2, 'Survive'] = np.nan df Out[2]: Survive Age Fare Group_Size deck Pclass Title 0 1.174206 -0.056846 0.454437 0.496695 1.401509 … Read more
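The masking step could be sketched as follows. This is an illustration under assumptions rather than the post's full code: the frame is cut down to three of the columns, the random data is seeded, and the name subset is hypothetical.

```python
import numpy as np
import pandas as pd

# Small stand-in frame; only 'Survive', 'Age' and 'Fare' are kept for brevity.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((5, 3)), columns=["Survive", "Age", "Fare"])
df.loc[2, "Survive"] = np.nan  # make one 'Survive' value missing

# Rows where 'Survive' is not null, restricted to the columns of interest.
subset = df.loc[df["Survive"].notnull(), ["Age", "Fare"]]
print(subset)
```

Doing the row filter and the column selection in a single .loc call avoids chained indexing.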

Comparing Double.NaN with itself

August 2, 2023 by Tarik

The reason for the difference is simple, if not obvious. If you use the equality operator ==, then you’re using the IEEE test for equality. If you’re using the Equals(object) method, then you have to maintain the contract of object.Equals(object). When you implement this method (and the corresponding GetHashCode method), you have to maintain that … Read more

Why does “np.inf // 2” result in NaN and not infinity?

August 1, 2023 by Tarik

I’m going to be the person who just points at the C level implementation without any attempt to explain intent or justification: *mod = fmod(vx, wx); div = (vx - *mod) / wx; It looks like in order to calculate divmod for floats (which is called when you just do floor division) it first calculates … Read more
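The two quoted C statements can be traced by hand in plain Python. This is only a sketch: the NaN that C's fmod returns for an infinite first argument is written out as a literal instead of being computed, and the built-in float is used here in place of NumPy.

```python
import math

vx = math.inf
wx = 2.0

# In C, fmod(inf, 2.0) yields NaN; that result is hard-coded here as a stand-in.
mod = math.nan

# Second statement from the excerpt: inf - NaN is NaN, and NaN / 2 is NaN.
div = (vx - mod) / wx
print(math.isnan(div))  # True

# Python's built-in float shows the same contrast between // and /.
floor_div = vx // wx
true_div = vx / wx
print(math.isnan(floor_div), true_div)  # True inf
```

Once the remainder is NaN, every later arithmetic step propagates it, which is why floor division loses the infinity that true division keeps.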

Can an integer be NaN in C++?

July 31, 2023 by Tarik

No, NaN is a floating point value. Every possible value of an int is a number. Edit The standard says: 6.2.6.2 40) Some combinations of padding bits might generate trap representations, for example, if one padding bit is a parity bit. Regardless, no arithmetic operation on valid values can generate a trap representation other than … Read more

NumPy: calculate averages with NaNs removed

July 29, 2023 by Tarik

I think what you want is a masked array: dat = np.array([[1,2,3], [4,5,np.nan], [np.nan,6,np.nan], [np.nan,np.nan,np.nan]]) mdat = np.ma.masked_array(dat,np.isnan(dat)) mm = np.mean(mdat,axis=1) print(mm.filled(np.nan)) # the desired answer Edit: Combining all of the timing data from timeit import Timer setupstr=""" import numpy as np from scipy.stats.stats import nanmean dat = np.random.normal(size=(1000,1000)) ii = np.ix_(np.random.randint(0,99,size=50),np.random.randint(0,99,size=50)) dat[ii] = … Read more
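A runnable version of the masked-array idea from this excerpt, written for Python 3. The name row_means is introduced here for illustration; the truncated timing code is not reproduced.

```python
import numpy as np

dat = np.array([
    [1, 2, 3],
    [4, 5, np.nan],
    [np.nan, 6, np.nan],
    [np.nan, np.nan, np.nan],
])

# Mask every NaN so it is ignored by the reduction.
mdat = np.ma.masked_array(dat, np.isnan(dat))

# Row-wise mean over the unmasked values; a fully masked row stays masked.
mm = np.mean(mdat, axis=1)

# Turn masked results back into NaN to get an ordinary float array.
row_means = mm.filled(np.nan)
print(row_means)
```

Rows 0 to 2 average to 2.0, 4.5 and 6.0, while the all-NaN row comes back as NaN.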

How do I declare NaN (not a number) in Ruby?

July 29, 2023 by Tarik

Since Ruby 1.9.3 there is a constant for the NaN value: Float::NAN # => NaN

Why does Release/Debug have a different result for std::min?

July 23, 2023 by Tarik

In IEEE 754, every ordered or equality comparison involving NaN yields false, no matter what the other operand is (only != yields true). slope > 0; // false slope < 0; // false slope == 0; // false And, more importantly for you slope < DBL_MAX; // false DBL_MAX < slope; // false So it seems that the compiler reorders the parameters/uses … Read more
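The same comparison rules can be observed in Python, offered here as an analogy and not as a model of std::min itself. The names slope and DBL_MAX mirror the excerpt; Python's built-in min is a different function from the C++ one, but its result also depends on argument order when one argument is NaN.

```python
import math
import sys

slope = math.nan
DBL_MAX = sys.float_info.max  # largest finite double, standing in for C's DBL_MAX

# Every ordered or equality comparison against NaN is False.
comparisons = [slope > 0, slope < 0, slope == 0, slope < DBL_MAX, DBL_MAX < slope]
print(comparisons)  # [False, False, False, False, False]

# min keeps its first argument unless a later one compares strictly smaller,
# so swapping the arguments changes the result.
nan_first = min(slope, DBL_MAX)
nan_second = min(DBL_MAX, slope)
print(math.isnan(nan_first), nan_second == DBL_MAX)  # True True
```

Because no comparison with NaN succeeds, whichever argument is examined first wins, so any reordering of the operands changes the outcome.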

Pandas – check if ALL values are NaN in Series

June 29, 2023 by Tarik

Yes, that’s correct, but I think a more idiomatic way would be: mys.isnull().all()
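A short sketch of that check on two hypothetical Series. The name mys follows the excerpt, while mixed, all_missing and mixed_all_missing are made up for contrast.

```python
import numpy as np
import pandas as pd

mys = pd.Series([np.nan, np.nan, np.nan])  # every value missing
mixed = pd.Series([np.nan, 1.0])           # only one value missing

all_missing = mys.isnull().all()
mixed_all_missing = mixed.isnull().all()
print(all_missing, mixed_all_missing)  # True False
```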

Counting the number of missing/NaN in each row

June 28, 2023 by Tarik

You could first check whether each element is NaN with isnull() and then take the row-wise sum with sum(axis=1): In [195]: df.isnull().sum(axis=1) Out[195]: 0 0 1 0 2 0 3 3 4 0 5 0 dtype: int64 And if you want the output as a list: In [196]: df.isnull().sum(axis=1).tolist() Out[196]: [0, 0, 0, 3, 0, 0] … Read more
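The excerpt does not show the frame behind these counts, so the sketch below builds a hypothetical six-row frame whose fourth row is entirely missing. The column names and the variable names are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 3 has NaN in all three columns.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, np.nan, 5.0, 6.0],
    "b": [1.0, 2.0, 3.0, np.nan, 5.0, 6.0],
    "c": [1.0, 2.0, 3.0, np.nan, 5.0, 6.0],
})

# isnull() gives a boolean frame; summing across axis=1 counts True per row.
missing_per_row = df.isnull().sum(axis=1)
missing_list = missing_per_row.tolist()
print(missing_list)  # [0, 0, 0, 3, 0, 0]
```

Summing works because True counts as 1 and False as 0.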