I have a DataFrame with a column of dates stored as type object,
and I need to create a column that shows only the year. The column has this format:
date
0 2019-06-08
1 2019-06-08
2 2019-06-04
...
When I split it using str.split("-"), each value is divided into a sublist, like this:
0 [2019, 06, 08]
1 [2019, 06, 08]
2 [2019, 06, 04]
so when I try to select only the first element of those sublists by doing df[:,0] or by using axis, I get an error because each list is a separate object. I have also tried the parameter maxsplit, but I get an error. How can I select the first element of all the lists?
First, note that the argument maxsplit does not exist in pandas.Series.str.split; the parameter is named n. It does not solve the problem, but it can be used for efficiency.
Simply reuse pandas.Series.str on the output of pandas.Series.str.split so that it can be indexed in vectorized form, for example df["date"].str.split("-").str[0].
It would be more direct, though, to just use slicing if your format is always yyyy-*, for example df["date"].str[:4].
If you also want to convert to integer, apply pandas.Series.astype, for example df["date"].str[:4].astype(int).
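The three approaches described in this answer can be sketched together as follows. This is a minimal example, assuming the column is named date as in the question; the sample DataFrame and the output column names (year_split, year_slice, year_int) are made up for illustration.

```python
import pandas as pd

# Sample data mirroring the question: dates stored as strings (dtype object).
df = pd.DataFrame({"date": ["2019-06-08", "2019-06-08", "2019-06-04"]})

# Approach 1: split on "-" and index each resulting list through the
# .str accessor. n=1 stops after the first split, since only the first
# piece is needed.
df["year_split"] = df["date"].str.split("-", n=1).str[0]

# Approach 2: slice the first four characters. This assumes the format
# is always yyyy-*.
df["year_slice"] = df["date"].str[:4]

# Approach 3: convert the extracted year from string to integer.
df["year_int"] = df["date"].str[:4].astype(int)

print(df)
```

Slicing avoids building an intermediate list for every row, so it is likely the lighter option when the year always occupies the first four characters; the split version is more tolerant of years with a different number of digits.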