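So, you want to get rid of any row that has a ‘NaN’ (null, or "not a number") value, because some functions either fail on NaN or can’t be told to ignore it. Here I drop the rows where the wind_mph column is NaN, then reset the index so it runs from 0 again without gaps: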

obs1.dropna(how = 'all', subset = ['wind_mph'], inplace = True)
obs1 = obs1.reset_index(drop=True)
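
To see what those two calls do, here is a minimal sketch on a tiny made-up frame. Only the column name wind_mph comes from the snippet above; the station column, the values, and the names obs_demo and cleaned are assumptions for illustration. It uses the non-inplace form, which returns a new frame instead of modifying the original.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name 'wind_mph' comes from the post.
obs_demo = pd.DataFrame({
    "station": ["A", "B", "C", "D"],
    "wind_mph": [5.0, np.nan, 12.5, np.nan],
})

# Drop rows where wind_mph is NaN, then renumber the index starting at 0.
cleaned = obs_demo.dropna(subset=["wind_mph"]).reset_index(drop=True)

print(cleaned)
```

Without reset_index(drop=True) the surviving rows would keep their old labels (0 and 2 here), which can be surprising if you later look rows up by label.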

Another important point: pandas only treats a value as missing if it was loaded as NaN, so whatever placeholder your data uses for "no value" has to be converted to the canonical ‘NaN’. Here is a way to convert text to ‘NaN’ at load time. This also cleans up the automatic type conversion for columns: if 99.9% of your column data is float but one value is text or something, pandas will throw an internal exception during conversion and keep the column as object. The data my program generates puts “<no_value_provided>” in empty fields; the parameter to use is na_values (and it can be a list if your data has more than one notation).

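import dateutil.parser
import pandas as pd

# Parse the first 20 characters of the timestamp and ignore any timezone info.
# Note: date_parser is deprecated as of pandas 2.0.
def date_utc(x): return dateutil.parser.parse(x[:20], ignoretz=True)
obs1 = pd.read_csv(target_csv, parse_dates=[9], date_parser=date_utc,
                    dtype = { 'wind_mph': 'float64'},
                    na_values = "<no_value_provided>")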
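
A small self-contained sketch of the effect, assuming a made-up two-column CSV held in memory. The station column, the sample rows, and the second marker "missing" are hypothetical; only wind_mph and “<no_value_provided>” come from the post. It reads the same text twice, once without na_values and once with a list of markers.

```python
import io

import pandas as pd

# Hypothetical CSV text; the placeholder string is the one used in the post.
csv_text = """station,wind_mph
A,5.0
B,<no_value_provided>
C,12.5
"""

# Without na_values the placeholder is kept as text, so the column is not float.
raw = pd.read_csv(io.StringIO(csv_text))
print(raw["wind_mph"].dtype)

# With na_values (a list works too) the placeholder becomes NaN and the
# column is inferred as float64. "missing" is a made-up second marker.
clean = pd.read_csv(io.StringIO(csv_text),
                    na_values=["<no_value_provided>", "missing"])
print(clean["wind_mph"].dtype)
print(clean["wind_mph"].isna().sum())
```

Once the placeholder is read as NaN, the dropna call from the first snippet can remove those rows.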

 
