Pandas: print column name with missing values

df.isnull().any() generates a boolean Series (True if the column has a missing value, False otherwise). You can use it to index into df.columns: df.columns[df.isnull().any()] returns an Index of the columns that have missing values.

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [1, 2, np.nan], 'C': [4, 5, 6], 'D': [np.nan, np.nan, np.nan]})
    df
    Out: …
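As a minimal, self-contained sketch of the indexing described above, the following rebuilds the same example frame and lists the affected columns. The variable names `has_missing` and `cols_with_missing` are illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

# Same example frame as in the excerpt above.
df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [1, 2, np.nan],
    "C": [4, 5, 6],
    "D": [np.nan, np.nan, np.nan],
})

# One boolean per column: True where the column holds at least one NaN.
has_missing = df.isnull().any()

# Index df.columns with that mask, then convert to a plain list for printing.
cols_with_missing = df.columns[has_missing].tolist()
print(cols_with_missing)  # ['B', 'D']
```

Calling `.tolist()` is optional; without it the result stays a pandas Index, which is usually fine for further selection such as `df[df.columns[has_missing]]`.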

Select non-null rows from a specific column in a DataFrame and take a sub-selection of other columns

You can pass a boolean mask to your df based on notnull() of the 'Survive' column and select the columns of interest:

    In [2]:
    # make some data
    df = pd.DataFrame(np.random.randn(5,7), columns=['Survive', 'Age', 'Fare', 'Group_Size', 'deck', 'Pclass', 'Title'])
    df.loc[2, 'Survive'] = np.nan
    df
    Out[2]:
        Survive       Age      Fare  Group_Size      deck  Pclass  Title
    0  1.174206 -0.056846  0.454437    0.496695  1.401509 …
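The excerpt is cut off before the selection itself, so the following is only a sketch of one way to apply the mask with `.loc`. The choice of columns in `wanted` and the seeded random generator are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Seeded generator so the sketch is reproducible (an assumption, not in the source).
rng = np.random.default_rng(0)
cols = ["Survive", "Age", "Fare", "Group_Size", "deck", "Pclass", "Title"]
df = pd.DataFrame(rng.standard_normal((5, 7)), columns=cols)

# Introduce one missing value in 'Survive', as in the excerpt.
df.loc[2, "Survive"] = np.nan

# Hypothetical sub-selection of columns; pick whichever columns you need.
wanted = ["Age", "Fare", "Group_Size", "deck", "Pclass", "Title"]

# Row mask from notnull() and column list, applied in a single .loc call.
subset = df.loc[df["Survive"].notnull(), wanted]
print(subset)
```

Doing both selections in one `.loc` call avoids chained indexing, which matters if you later assign into the result.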

Comparing Double.NaN with itself

The reason for the difference is simple, if not obvious. If you use the equality operator ==, then you’re using the IEEE test for equality. If you’re using the Equals(object) method, then you have to maintain the contract of object.Equals(object). When you implement this method (and the corresponding GetHashCode method), you have to maintain that …
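The same split between IEEE comparison and a reflexive notion of equality shows up outside .NET. The sketch below uses Python as an analogue, not C#: `==` follows IEEE rules, while list membership checks object identity before `==`, so a NaN is still found in a list that contains it. All variable names are illustrative.

```python
import math

x = float("nan")

# IEEE comparison: NaN is not equal to anything, including itself.
ieee_equal = (x == x)        # False
ieee_not_equal = (x != x)    # True

# The reliable way to test for NaN.
is_nan = math.isnan(x)       # True

# Containers need reflexive behaviour to work, so membership checks
# identity first and only then falls back to ==.
found_in_list = x in [x]     # True

print(ieee_equal, ieee_not_equal, is_nan, found_in_list)
```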

Can an integer be NaN in C++?

No, NaN is a floating-point value. Every possible value of an int is a number. Edit: The standard says: 6.2.6.2 40) Some combinations of padding bits might generate trap representations, for example, if one padding bit is a parity bit. Regardless, no arithmetic operation on valid values can generate a trap representation other than …

NumPy: calculate averages with NaNs removed

I think what you want is a masked array:

    dat = np.array([[1,2,3], [4,5,np.nan], [np.nan,6,np.nan], [np.nan,np.nan,np.nan]])
    mdat = np.ma.masked_array(dat, np.isnan(dat))
    mm = np.mean(mdat, axis=1)
    print(mm.filled(np.nan))  # the desired answer

Edit: Combining all of the timing data:

    from timeit import Timer
    setupstr="""
    import numpy as np
    from scipy.stats.stats import nanmean
    dat = np.random.normal(size=(1000,1000))
    ii = np.ix_(np.random.randint(0,99,size=50),np.random.randint(0,99,size=50))
    dat[ii] = …
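A compact, runnable version of the masked-array approach is sketched below. It uses `np.ma.masked_invalid`, which is a swapped-in shortcut for the explicit `masked_array(dat, np.isnan(dat))` call in the answer, and the name `row_means` is illustrative.

```python
import numpy as np

dat = np.array([
    [1, 2, 3],
    [4, 5, np.nan],
    [np.nan, 6, np.nan],
    [np.nan, np.nan, np.nan],
])

# Mask every NaN (and inf) entry so it is ignored by reductions.
mdat = np.ma.masked_invalid(dat)

# Row-wise mean over the unmasked values; a fully masked row stays masked.
mm = mdat.mean(axis=1)

# Replace masked results with NaN to get an ordinary ndarray back.
row_means = mm.filled(np.nan)
print(row_means)  # [2.  4.5 6.  nan]
```

`np.nanmean(dat, axis=1)` gives the same values on this data, though it emits a RuntimeWarning for the row that contains only NaN.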
