In a DataFrame I have created a Month column by taking the month number from another column that has the format 2015.06.12. I first extracted the .06. part with:
str.extract('(\.\d{2}\.)')
And then I removed the dots like this:
str.extract('(\d{2})')
Now I want to change those numbers to the corresponding month names. I have seen that this is possible with the calendar module, and I have tried to do it like this:
import calendar
def month(x):
    return calendar.month_name[x]
df['Month'] = df.Month.apply(month)
But I get the error: "list indices must be integers or slices, not float"
I should also mention that there are NaN cells, which in principle I want to leave as they are.
Any suggestions how I can do it?
Thank you!
The problem is the NaN values, as I think you suspect. NaN is actually represented as a float, hence the error when using it as an index into calendar.month_name. In fact, a column containing NaN can't be converted to int without losing the NaN.
You can deal with it in different ways. Following your reasoning, a very simple one is to check inside the function whether the received value (x) is NaN or not, and only index calendar.month_name when it is not.
However, I would not get so complicated with regex and calendar. In principle you can do this with pandas in a vectorized way, just by converting the date strings to datetime type and using the method pandas.Series.dt.month_name.
I don't know what language you want for the month names, but you can change it by specifying the locale you want using the locale argument. The default is en_US.
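A reproducible sketch of both approaches described above. The sample DataFrame, the column names Date, Month_calendar and Month_pandas, and the exact regex are assumptions made for illustration, not the asker's real data. The first part keeps the regex and calendar idea but skips NaN inside the function; the second part parses the strings as datetimes and uses dt.month_name.

```python
import calendar

import pandas as pd

# Hypothetical sample data in the question's "YYYY.MM.DD" format,
# including one missing value.
df = pd.DataFrame({"Date": ["2015.06.12", "2016.01.03", None, "2017.12.25"]})

# Approach 1: regex + calendar, with a NaN guard.
# str.extract yields strings (NaN where nothing matches); astype(float)
# makes them numeric while keeping the NaN.
month_number = df["Date"].str.extract(r"\.(\d{2})\.", expand=False).astype(float)


def month(x):
    # Leave missing values as they are; otherwise index with an int.
    if pd.isna(x):
        return x
    return calendar.month_name[int(x)]


df["Month_calendar"] = month_number.apply(month)

# Approach 2: vectorized, no regex and no calendar module.
# Missing dates become NaT, and month_name() returns NaN for them.
dates = pd.to_datetime(df["Date"], format="%Y.%m.%d")
df["Month_pandas"] = dates.dt.month_name()

# For another language, pass a locale that is installed on your system,
# for example (assumption, may not exist on your machine):
# dates.dt.month_name(locale="es_ES.UTF-8")

print(df)
```

Both new columns should hold the month names for the valid rows and stay missing for the row without a date. The second approach is usually preferable because it avoids a Python-level function call per row.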