I have a date column that I load from a CSV. The rows are loaded in the order in which they appear in the CSV, not in calendar order (my data source does not sort the dates in calendar order either).
Fecha Pais count
0 2017-06-01 Argentina 1
1 2017-06-01 China 31230
2 2017-06-01 Ecuador 1
3 2017-06-01 Egypt 2
4 2017-06-01 Latvia 360
5 2017-06-01 Portugal 1
6 2017-06-01 Slovak Republic 2
7 2017-06-01 Taiwan 2
8 2017-06-01 Ukraine 31
9 2017-06-01 United Kingdom 1
10 2017-06-02 Argentina 2
11 2017-06-02 Canada 1
12 2017-06-02 China 3980
13 2017-06-02 Slovak Republic 3
14 2017-06-02 Sweden 1
15 2017-06-02 Ukraine 99
16 2017-05-30 Argentina 1
17 2017-05-30 China 4022
18 2017-05-30 Ecuador 1
19 2017-05-30 France 16
20 2017-05-30 Germany 2
21 2017-05-30 Indonesia 1
22 2017-05-30 No Identificado 56
23 2017-05-30 Romania 1
24 2017-05-30 Russia 4
25 2017-05-30 Sweden 158
26 2017-05-30 Taiwan 1
27 2017-05-30 Ukraine 31
28 2017-05-30 Vietnam 18
29 2017-05-31 Argentina 3
30 2017-05-31 China 14477
31 2017-05-31 Czechia 35
32 2017-05-31 India 6
33 2017-05-31 Liberia 1
34 2017-05-31 No Identificado 1
35 2017-05-31 Republic of Korea 1
36 2017-05-31 Russia 1
37 2017-05-31 United States 3
If I apply a reverse sort at the data source, the file instead lists the dates in the order 31, 30, 2, 1.
Either way, the plot follows the order of the array (column) rather than calendar order (30, 31, 1, 2), so the dates come out as either 31, 30, 2, 1 or 1, 2, 30, 31.
How can I sort the date column in calendar order?
My code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import cm
from matplotlib.dates import DateFormatter, DayLocator, AutoDateLocator, AutoDateFormatter
df = pd.read_csv("72hcountcountry.csv", delimiter=',', parse_dates = ['Fecha','count'], dayfirst=True)
grupos = df.groupby(['Pais'])
print(df)
fig, ax = plt.subplots()
color=iter(cm.rainbow(np.linspace(0,1,len(grupos))))
for nombre, grupo in grupos:
    ax.plot_date(x = grupo['Fecha'], y = grupo['count'], color = next(color), marker='o', ls = 'solid', label=nombre)
locator = DayLocator()
formatter = AutoDateFormatter(locator)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(formatter)
ax.autoscale_view()
ax.grid(True)
fig.autofmt_xdate()
ax.margins(0.05)
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 0.8, box.height])
ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))
plt.show()
I think the problem is that the data in the "Fecha" column is of type object.
If you convert it to type datetime, you should be able to sort it by date.
If you then check the dtypes of the DataFrame, you will see that the column is now datetime.
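A minimal sketch of that conversion, assuming a hypothetical three-row sample in the same layout as the question's CSV (the inline `csv_text` stands in for the real file). It reads the column as plain text, converts it with `pd.to_datetime`, and sorts with `sort_values`:

```python
import io
import pandas as pd

# Hypothetical sample, deliberately out of calendar order.
csv_text = """Fecha,Pais,count
2017-06-01,China,31230
2017-05-30,China,4022
2017-05-31,China,14477
"""

df = pd.read_csv(io.StringIO(csv_text))
print(df.dtypes)  # Fecha is still text here (object or a string dtype)

# Convert the text to real timestamps, then sort chronologically.
df["Fecha"] = pd.to_datetime(df["Fecha"], format="%Y-%m-%d")
df = df.sort_values("Fecha").reset_index(drop=True)

print(df.dtypes)  # Fecha now has a datetime64 dtype
print(df)
```

Note that `sort_values` returns a new DataFrame, so the result has to be assigned back (or later code must use the returned object).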
I only used a couple of rows and changed the dates, but it works for me now.
UPDATE
The order of the array/column is not altered: the original order is kept, and the plot is drawn in the array's order, not the sorted order.
I used
df = df.sort_values(["Fecha"])
Still, the plot remains the same.
The column in the array does already have a date type.
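One possible explanation, offered as an assumption because the updated script is not shown, is that `grupos` was created before the sort, so the groups still hold the rows in their original order. A sketch that sorts first and groups afterwards, using the Argentina and China rows from the question as inline data:

```python
import io
import pandas as pd

# Argentina and China rows from the question, in the CSV's original order.
csv_text = """Fecha,Pais,count
2017-06-01,Argentina,1
2017-06-02,Argentina,2
2017-05-30,Argentina,1
2017-05-31,Argentina,3
2017-06-01,China,31230
2017-06-02,China,3980
2017-05-30,China,4022
2017-05-31,China,14477
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Fecha"])

# Sort first, then group: each group keeps the sorted row order.
df = df.sort_values(["Pais", "Fecha"])
grupos = df.groupby("Pais")

series = {}
for nombre, grupo in grupos:
    # Within each group the dates are now increasing, so a line drawn
    # through these points would run in calendar order.
    fechas = grupo["Fecha"].dt.strftime("%Y-%m-%d")
    series[nombre] = list(zip(fechas, grupo["count"]))
    print(nombre, series[nombre])
```

With the groups built this way, passing `grupo['Fecha']` and `grupo['count']` to the plotting call inside the loop should connect the points from 30 May through 2 June for each country.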