I have a list of 140 items. For each item I request 3 pieces of data from an API and store them in lists. I reuse the same lists throughout: I receive the data, store it in the lists, add it to the DataFrame, and then clear the lists to repeat, which ends up adding about 420 columns to my DataFrame. The code works fine, without problems, but the console shows me the following:
PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
My code is something like this:
import os
import pandas as pd

data = [1, 2, 3, 4, 5, 7, 8, 9, ..., 140]  # the 140 item ids
alto = []
largo = []
ancho = []
df = pd.DataFrame()
for i in data:
    alto.clear()
    largo.clear()
    ancho.clear()
    obj = client_get_data(i)
    """ API response:
    it gives me about 50 results similar to these, since the objects
    change their dimensions over time and I query from the moment the
    object was created
    [
        [
            150,      // alto (height)
            '20',     // ancho (width)
            '70',     // largo (length)
            1354255   // ignore
        ]
    ]"""
    for element in obj:
        alto.append(element[0])
        ancho.append(element[1])
        largo.append(element[2])
    # add the lists to the dataframe and convert two of the values to
    # float for their final use, since the API returns them as strings
    df[f'alto {i}'] = alto
    df[f'ancho {i}'] = ancho
    df[f'ancho {i}'] = pd.to_numeric(df[f'ancho {i}'], downcast="float")
    df[f'largo {i}'] = largo
    df[f'largo {i}'] = pd.to_numeric(df[f'largo {i}'], downcast="float")
    df[f'volumen {i}'] = df[f'alto {i}'] * df[f'ancho {i}'] * df[f'largo {i}']
    # drop those columns from the dataframe since I won't use them anymore
    df = df.drop([f'alto {i}', f'ancho {i}', f'largo {i}'], axis=1)

# to finish, I save the data to an Excel file
df.to_excel(os.path.join(os.path.dirname(__file__), "data.xlsx"), index=False)
Would there be a way to add the data to the DataFrame without that warning popping up?
The best way to add "too many" columns to the DataFrame turned out to be the following: use pd.concat, which builds a DataFrame from several pieces of data at once (in my case, lists), whereas append copies the whole frame before adding the new content, which is what caused the performance warning I was seeing. So, after modifying those columns as needed and dropping the ones I didn't want, I use pd.concat to grow the final DataFrame, the one that will be used once all the data has been collected. That would be all; the warning no longer appears and the code performs better. I hope my problem has been of help to someone else.
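A minimal sketch of that idea, assuming the same client_get_data helper from the question and placeholder item ids: each finished "volumen" column is collected in a plain Python list, and everything is joined with a single pd.concat(axis=1) at the end, so the frame is never grown column by column:

import os
import pandas as pd

data = list(range(1, 141))  # placeholder for the 140 item ids

pieces = []  # one finished 'volumen' column per item
for i in data:
    obj = client_get_data(i)  # API helper assumed from the question
    alto = pd.Series([row[0] for row in obj])
    ancho = pd.to_numeric(pd.Series([row[1] for row in obj]), downcast="float")
    largo = pd.to_numeric(pd.Series([row[2] for row in obj]), downcast="float")
    # compute the volume up front; the intermediate columns never
    # enter the DataFrame, so nothing has to be dropped later
    pieces.append((alto * ancho * largo).rename(f"volumen {i}"))

# join all 140 columns in one call instead of inserting them one by one
df = pd.concat(pieces, axis=1)
df.to_excel(os.path.join(os.path.dirname(__file__), "data.xlsx"), index=False)

Note that this assumes every item returns the same number of rows (about 50, per the question); pd.concat aligns the Series by index, so columns of different lengths would be padded with NaN rather than raising an error.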