It also depends on the meaning of 0 in your data.
- If these are genuine ‘0’ values, then your approach is fine.
- If ‘0’ is a placeholder for a value that was not measured (i.e. ‘NaN’), then it might make more sense to replace all ‘0’ occurrences with ‘NaN’ first. The mean calculation then excludes NaN values by default:

        import numpy as np
        import pandas as pd

        df = pd.DataFrame([1, 0, 2, 3, 0], columns=['a'])
        df = df.replace(0, np.nan)  # treat 0 as "not measured"
        df.mean()  # NaN values are skipped by default
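To make the difference between the two interpretations concrete, here is a minimal sketch that computes the mean both ways on the same toy data. The variable names (`mean_with_zeros`, `masked`, and so on) are just illustrative, and it assumes that every 0 in the column is a placeholder, which may not hold for your data.

```python
import numpy as np
import pandas as pd

# Same toy data as in the answer; the column name 'a' is arbitrary.
df = pd.DataFrame([1, 0, 2, 3, 0], columns=['a'])

# Case 1: zeros are real measurements, so they count toward the mean.
mean_with_zeros = df['a'].mean()         # (1 + 0 + 2 + 3 + 0) / 5

# Case 2: zeros mark missing measurements, so mask them as NaN first.
masked = df['a'].replace(0, np.nan)
mean_without_zeros = masked.mean()       # (1 + 2 + 3) / 3, NaN is skipped

# skipna=True is the default; with skipna=False any NaN makes the result NaN.
mean_strict = masked.mean(skipna=False)

print(mean_with_zeros, mean_without_zeros, mean_strict)
```

If only some zeros are placeholders, a blanket `replace` is too coarse, and you would need some other way to tell the two kinds of zero apart before masking.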