I am using the information from the following file:
To read it and display it I use the following:
import pandas as pd
import chardet
import numpy as np
### If you want to run the example code, the next statement takes
### quite a while because it detects the encoding used by the file
### (with the reduced file it is instantaneous). It returns:
### {'encoding': 'Windows-1252', 'confidence': 0.73, 'language': ''}
with open('Path to the CSV', 'rb') as f:
    result = chardet.detect(f.read())
datos = pd.read_csv('Path to the file',
                    encoding=result['encoding'], low_memory=False)
df = pd.DataFrame(datos)
limpiarNaN = df.replace(np.nan, '0')
archivoConvertidoInt = limpiarNaN.describe(include = [np.number])
archivoConvertidoInt = limpiarNaN.replace("NR", "0")
archivoConvertidoInt['Wsets'] = archivoConvertidoInt.Wsets.astype(float)
archivoConvertidoInt['WRank'] = archivoConvertidoInt.WRank.astype(float)
archivoConvertidoInt.set_index("Location", inplace = True)
(archivoConvertidoInt.loc[['Adelaide', 'St. Petersburg'], 'Series':'Round']
    .groupby(["Location", "Series", "Court", "Surface", "Round"])
    [["Series"]].count())
This returns the following:
As you can see, it shows me information about the locations of Adelaide and St. Petersburg.
Now:
1) How can I show the same information, not only for Adelaide and St. Petersburg but for all the locations between them, inclusive? If I use : to separate them, I get a syntax error.
2) By adding a sum at the end I can find the total number of series. How can I add it as a new row?
How can I get something like this:
The index of the dataframe (the "Location" column) is not sorted, so the rows for the same location are not contiguous, which prevents you from using .loc["Localidad1":"Localidad2"]. You have to call .sort_index() first. On the other hand, I think you have some extra brackets around the range of cities.
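A minimal sketch of both points, on a few made-up rows (only the column names come from the question; the values and the names ordered, subset and counts are illustrative assumptions):

```python
import pandas as pd

# Made-up rows standing in for the real CSV; only the column names
# "Location", "Series" and "Round" are taken from the question.
df = pd.DataFrame({
    "Location": ["Doha", "Adelaide", "Chennai", "Adelaide", "Zagreb", "Doha"],
    "Series":   ["A", "A", "A", "A", "B", "A"],
    "Round":    ["1st Round", "1st Round", "The Final",
                 "The Final", "1st Round", "The Final"],
}).set_index("Location")

# Sorting the index makes rows with the same label contiguous, so a
# label slice (inclusive on both ends) is well defined.
ordered = df.sort_index()

# Note the single pair of brackets: the slice goes directly inside .loc[...].
subset = ordered.loc["Adelaide":"Doha"]

counts = subset.groupby(["Location", "Series", "Round"])["Series"].count()
print(counts)
```

Here the slice keeps Adelaide, Chennai and Doha and leaves out Zagreb, because label slices with .loc include both endpoints.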
With the example dataframe you posted, I think the following does what you were looking for (I've changed St. Petersburg to Montecarlo, since St. Petersburg was not in the extract you posted):
It generates a long dataframe that starts like this:
and ends like this:
Regarding adding a row with the total, in principle it should be as simple as:
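For a dataframe with a plain, single-level index, a sketch of that simple case could look like this (the frame flat and its values are made up for illustration):

```python
import pandas as pd

# Made-up counts with a plain (single-level) index.
flat = pd.DataFrame({"Series": [17, 16]}, index=["Adelaide", "Doha"])

# Assigning to a label that does not exist yet appends a new row.
flat.loc["Total"] = flat.sum()
print(flat)
```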
But the problem in your case is that the dataframe has a multi-index, and adding a row that is not multi-indexed forces the rest of the dataframe to stop being multi-indexed (each index entry is converted into a tuple). To avoid this, let's create that last row with a multi-index as well, specifying text for all five levels of the index. So:
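One possible way to do this, sketched on a made-up table with two index levels instead of five, is to build the total as a one-row dataframe with its own multi-index and join it with pd.concat (the names counts, total_row and with_total are illustrative, and this is not necessarily the only way to do it):

```python
import pandas as pd

# Made-up counts indexed by a two-level MultiIndex; the real table has five levels.
index = pd.MultiIndex.from_tuples(
    [("Adelaide", "1st Round"), ("Adelaide", "The Final"), ("Doha", "1st Round")],
    names=["Location", "Round"],
)
counts = pd.DataFrame({"Series": [16, 1, 16]}, index=index)

# Build the total as a one-row frame whose index has the same number of
# levels (text for every level), so the result keeps its MultiIndex.
total_index = pd.MultiIndex.from_tuples([("Total", "")], names=counts.index.names)
total_row = pd.DataFrame({"Series": [counts["Series"].sum()]}, index=total_index)

with_total = pd.concat([counts, total_row])
print(with_total)
```

With five levels the total key would simply carry five entries, for example "Total" followed by four empty strings.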
Now the table ends like this: