How can I filter the DataFrame so that, for every variety, it shows the row with the highest score, along with some other conditions?
For example, with this line I can check them one at a time by changing the variety:
dataDos[(dataDos.variedad == 'malbec') & (dataDos.puntaje==
dataDos.puntaje[dataDos.variedad == 'malbec'].max())]
This would show the most expensive Malbec in all of Mexico City:
How can I apply this to the entire DataFrame so that it shows the row with the highest score for each variety (these would be the rows highlighted in blue)?
UPDATE
I reached the desired goal by doing the following, but I'm leaving the question open because there has to be a simpler solution:
import pandas as pd

variedades = data.variedad.unique().tolist()
filas = []
for x in variedades:
    # Narrow down step by step: variety, top score, top price, earliest creation
    filtroUno = data[data.variedad == x]
    filtroDos = filtroUno[filtroUno.puntaje == filtroUno.puntaje.max()]
    filtroTres = filtroDos[filtroDos.precio_en_pesos == filtroDos.precio_en_pesos.max()]
    filtroCuatro = filtroTres[filtroTres.creacion == filtroTres.creacion.min()]
    filas.append(filtroCuatro)
# DataFrame.append was removed in pandas 2.0, so the pieces are concatenated once
dfPuntajes = pd.concat(filas, ignore_index=True)
dfPuntajes['id'] = dfPuntajes.index
dfPuntajes.sort_values('puntaje', ascending=False).head(10)
Well, I don't quite follow your solution, since it apparently keeps the row with the highest score and not the one with the highest price, as the question asked.
In addition, you apply not just one filter but several in succession. I understand that what you want is this: if two rows have the same score, keep the one with the highest price, and if several also share that price, keep the one with the earliest creation date. This doesn't match what you asked initially, but in any case I think I have a shorter solution.
The trick is to first sort the dataframe by the desired criteria (descending score, descending price and ascending creation date). Then, on that sorted dataframe, you drop the duplicates in the "variedad" column, which keeps only the first row of each variety.
Namely:
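A minimal sketch of that idea, assuming the column names from your code (variedad, puntaje, precio_en_pesos, creacion). The example rows and the names ordenado and resultado are made up for illustration:

```python
import pandas as pd

# Made-up example data; only the column names come from the question.
data = pd.DataFrame({
    "variedad": ["malbec", "malbec", "malbec", "syrah", "syrah"],
    "puntaje": [90, 95, 95, 88, 88],
    "precio_en_pesos": [300, 500, 450, 200, 200],
    "creacion": pd.to_datetime(
        ["2020-01-01", "2020-03-01", "2020-02-01", "2020-05-01", "2020-04-01"]
    ),
})

# Sort so the preferred row of each variety comes first:
# highest score, then highest price, then earliest creation date.
ordenado = data.sort_values(
    ["puntaje", "precio_en_pesos", "creacion"],
    ascending=[False, False, True],
)

# Keep only the first row encountered for each variety.
resultado = ordenado.drop_duplicates(subset="variedad", keep="first")
print(resultado)
```

Note that, unlike the loop, this keeps exactly one row per variety even when several rows tie on all three columns. If you also need the id column, resetting the index with resultado.reset_index(drop=True) and then assigning the index to 'id' should reproduce it.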
And it outputs the same as with your method:
I have tested your method and the one I propose on the same example data you give. The only difference is the id column, which I don't know why you reassign.