I have a CSV file that gives, for each identifier, the RGB color that corresponds to it. It looks like this:
Identificador RGB
"1103", "(0,255,0)"
"1102", "(255,255,0)"
"1101", "(153,51,153)"
As you can see, all the fields are strings; ideally the identifier would be an integer and RGB a tuple of integers.
My goal is to add a third column to the dataframe with the hex conversion. For this I have a very simple function that converts RGB to hexadecimal, and I use the pandas library to apply it:
from pandas import read_csv, DataFrame, concat

def rgb_to_hex(rgb):
    return '%02x%02x%02x' % rgb

def getColors():
    # Select the file and the columns of interest
    df = read_csv('from/colors.csv')
    df = df[['Identificador','RGB']]
    df['HEX'] = df['RGB'].apply(rgb_to_hex)

def main():
    getColors()
    print('FIN')

if __name__ == '__main__':
    main()
The problem is the one mentioned above: RGB is a string and I need a tuple. How can this be done? I have tried df['HEX'] = df['RGB'].apply(tuple(rgb_to_hex)) and it still gives an error.
Can someone guide me on what I should do?
Thank you very much and greetings
PS: I already know how to convert the identifier column to numeric, with df["Identificador"] = to_numeric(df["Identificador"]); the problem is the tuple.
The function eval(cadena) evaluates the contents of a string as if it were a Python expression. In your case, if the string contains "(153,51,153)", the result of eval() will be the tuple (153,51,153), just what you need. So rgb_to_hex can call eval() on the string before formatting it.
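The code that originally followed here is missing, so this is only a minimal sketch of the idea, assuming each RGB cell holds a string such as "(153,51,153)" and that the contents of the CSV are fully trusted:

```python
def rgb_to_hex(rgb):
    # rgb arrives as a string like "(153,51,153)";
    # eval() turns it into the tuple (153, 51, 153).
    # Only acceptable when the file contents are trusted.
    return '%02x%02x%02x' % eval(rgb)


print(rgb_to_hex("(153,51,153)"))  # 993399
print(rgb_to_hex("(0,255,0)"))     # 00ff00
```

With this version, the line df['HEX'] = df['RGB'].apply(rgb_to_hex) from the question should work unchanged.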
However, this solution is potentially dangerous if the string rgb contains not the tuple you expect but some other Python expression. You would be allowing code injection (someone could prepare a malicious dataframe with code in that column for you to execute via eval()).
For this particular case, you can also process the string "by hand": remove the opening and closing parentheses, split what remains on the commas, and convert each resulting piece to an integer.
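The manual approach could be sketched roughly as follows. The helper name parse_rgb is hypothetical (it does not come from the original answer), and the sketch assumes every cell has the form "(r,g,b)" with integer components:

```python
def parse_rgb(text):
    # Hypothetical helper: "(153,51,153)" -> (153, 51, 153).
    # Drop surrounding whitespace, then the outer parentheses.
    inner = text.strip().strip('()')
    # Split on commas and convert each piece to an integer.
    return tuple(int(part) for part in inner.split(','))


def rgb_to_hex(rgb):
    # rgb is still the raw string read from the CSV.
    return '%02x%02x%02x' % parse_rgb(rgb)


print(parse_rgb("(153,51,153)"))   # (153, 51, 153)
print(rgb_to_hex("(255,255,0)"))   # ffff00
```

As before, df['HEX'] = df['RGB'].apply(rgb_to_hex) should then fill the new column. If you would rather not parse by hand, the standard library's ast.literal_eval is another option: it accepts only Python literals (tuples, numbers, strings and so on), so it avoids the code injection risk of eval().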