I have the following DataFrame.
import pandas as pd
lista = [['Producto1', '06/22/2012', 66.72721, 17.995],
['Producto1', '09/18/2017', 240.56891, 19.244],
['Producto1', '01/08/2018', 219.24285, 17.459],
['Producto1', '03/06/2018', 667.32977, 18.245],
['Producto2', '10/07/2018', 49.328, 254.64],
['Producto3', '27/04/2016', 733.27266, 13.7643],
['Producto3', '06/12/2019', 1213.05103, 13.71],
['Producto4', '10/08/2012', 44986.57615, 10.4718] ]
df_aux = pd.DataFrame(lista)
df_aux.columns = ["Producto", "Fecha", "Cantidad", 'Precio']
df_aux['Total'] = df_aux['Cantidad'] * df_aux['Precio']
df_aux
This returns:
Producto Fecha Cantidad Precio Total
0 Producto1 06/22/2012 66.72721 17.9950 1200.756144
1 Producto1 09/18/2017 240.56891 19.2440 4629.508104
2 Producto1 01/08/2018 219.24285 17.4590 3827.760918
3 Producto1 03/06/2018 667.32977 18.2450 12175.431654
4 Producto2 10/07/2018 49.32800 254.6400 12560.881920
5 Producto3 27/04/2016 733.27266 13.7643 10092.984874
6 Producto3 06/12/2019 1213.05103 13.7100 16630.929621
7 Producto4 10/08/2012 44986.57615 10.4718 471090.428128
The DataFrame has five columns: 'Producto', 'Fecha', 'Cantidad', 'Precio', and 'Total'. I want to group by product and obtain a DataFrame with only two columns: the sums of 'Cantidad' and 'Total'. I run:
df_productos = df_aux.groupby(by=['Producto']).sum()
df_productos
Which returns:
Cantidad Precio Total
Producto
Producto1 1193.86874 72.9430 21833.456820
Producto2 49.32800 254.6400 12560.881920
Producto3 1946.32369 27.4743 26723.914495
Producto4 44986.57615 10.4718 471090.428128
With the groupby() method, can I keep it from returning (summing) the 'Precio' column? I would appreciate any suggestions.
To define which columns and aggregations you want, you can use the agg() method of pandas groupby objects. You can also use iloc and/or loc to select the columns you need. Something like:
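As a minimal sketch of both approaches, assuming the column names from the question and only a few of its rows (the name df_productos_loc is just illustrative):

```python
import pandas as pd

# Abbreviated copy of the question's data (assumption: same column names).
lista = [
    ['Producto1', '06/22/2012', 66.72721, 17.995],
    ['Producto1', '09/18/2017', 240.56891, 19.244],
    ['Producto2', '10/07/2018', 49.328, 254.64],
    ['Producto3', '27/04/2016', 733.27266, 13.7643],
]
df_aux = pd.DataFrame(lista, columns=['Producto', 'Fecha', 'Cantidad', 'Precio'])
df_aux['Total'] = df_aux['Cantidad'] * df_aux['Precio']

# Option 1: tell agg() exactly which columns to aggregate and how.
# 'Precio' is never listed, so it does not appear in the result.
df_productos = df_aux.groupby('Producto').agg({'Cantidad': 'sum', 'Total': 'sum'})

# Option 2: sum the numeric columns, then keep only the wanted ones with loc.
df_productos_loc = (
    df_aux.groupby('Producto')
    .sum(numeric_only=True)
    .loc[:, ['Cantidad', 'Total']]
)

print(df_productos)
```

Option 1 is usually preferable because the unwanted column is never aggregated at all, and each column can get a different aggregation if needed.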