Create a DataFrame to list the best-selling brands from the following dataset:
What I did to solve the problem was to hardcode the brands and then group by them in the new DataFrame. In short, the heart of the solution is here:
import numpy as np
# pd.np was removed in pandas 2.0, so NumPy is imported directly.
# Each np.where falls through to the next one when the title does not match.
df['brand'] = np.where(df.title.str.contains("Samsung"), "Samsung",
np.where(df.title.str.contains("Sansung"), "Samsung",
np.where(df.title.str.contains("Galaxy"), "Samsung",
np.where(df.title.str.contains("Samgung"), "Samsung",
np.where(df.title.str.contains("Philco"), "Philco",
np.where(df.title.str.contains("Kanji"), "Kanji",
np.where(df.title.str.contains("Nokia"), "Nokia",
np.where(df.title.str.contains("Lenovo"), "Lenovo",
np.where(df.title.str.contains("Xiaomi"), "Xiaomi",
np.where(df.title.str.contains("Energizer"), "Energizer",
np.where(df.title.str.contains("Vodafone"), "Vodafone",
np.where(df.title.str.contains("Bgh"), "Bgh",
np.where(df.title.str.contains("Zte"), "Zte",
np.where(df.title.str.contains("Nextel"), "Nextel",
np.where(df.title.str.contains("Blu"), "Blu",
np.where(df.title.str.contains("Quantum"), "Quantum",
np.where(df.title.str.contains("Micromax"), "Micromax",
np.where(df.title.str.contains("Cat"), "Cat",
np.where(df.title.str.contains("Lava"), "Lava",
np.where(df.title.str.contains("Infinix"), "Infinix",
np.where(df.title.str.contains("Wiko"), "Wiko",
np.where(df.title.str.contains("Tecno"), "Tecno",
np.where(df.title.str.contains("Meizu"), "Meizu",
np.where(df.title.str.contains("Vivo"), "Vivo",
np.where(df.title.str.contains("Asus"), "Asus",
np.where(df.title.str.contains("Oneplus"), "Oneplus",
np.where(df.title.str.contains("Microsoft"), "Microsoft",
np.where(df.title.str.contains("Huawei"), "Huawei",
np.where(df.title.str.contains("Sony"), "Sony",
np.where(df.title.str.contains("Lg"), "Lg",
np.where(df.title.str.contains("Panasonic"), "Panasonic",
np.where(df.title.str.contains("Plum"), "Plum",
np.where(df.title.str.contains("Yu"), "Yu",
np.where(df.title.str.contains("Verykool"), "Verykool",
np.where(df.title.str.contains("Blackberry"), "Blackberry",
np.where(df.title.str.contains("Alcatel"), "Alcatel",
np.where(df.title.str.contains("Apple"), "Iphone",
np.where(df.title.str.contains("Iphone"), "Iphone",
np.where(df.title.str.contains("Htc"), "Htc",
np.where(df.title.str.contains("Motorola"), "Motorola",
np.where(df.title.str.contains("Moto"), "Motorola",
np.where(df.title.str.contains("Acer"), "Acer",
np.where(df.title.str.contains("Google"), "Google",
np.where(df.title.str.contains("Honor"), "Honor",
np.where(df.title.str.contains("Oppo"), "Oppo",
np.where(df.title.str.contains("Realme"), "Realme", "Otros"))))))))))))))))))))))))))))))))))))))))))))))
However, what I want to do is check whether a cell contains any of the values given in a list, iterating until a match is found. But when I try, for example, to write a function that returns all the names in a list, I do it as follows:
def listPhoneBrands(list_phones):
    for brand in list_phones:
        return brand
The function returns only the first element (because return terminates the function). So: how should a function be written so that it returns all the elements of the list passed to it? And is there a better way to evaluate the contents of a DataFrame than the way I am doing it?
The rest of the code can be found in the GitHub repository: hardcoding_solution
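For reference, one common way to hand back every element is to use yield instead of return, which turns the function into a generator. The following is only a minimal sketch; the name list_phone_brands and the sample list are made up for illustration.

```python
def list_phone_brands(list_phones):
    """Yield each brand in turn instead of returning after the first one."""
    for brand in list_phones:
        # yield pauses the function and resumes it on the next iteration,
        # whereas return would end the function immediately.
        yield brand


# Sample input, chosen only for illustration.
brands = list(list_phone_brands(["Samsung", "Nokia", "Lenovo"]))
```

If a plain list is all that is needed, building the list inside the function and returning it once after the loop works just as well.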
Ultimately, I'm looking for a more scalable solution.
To keep things organized, we can create a dictionary where each key is a brand and each value is a list of the text variants that map to that brand.
Then we write a function that picks the brand according to those rules and apply it to the column using map.
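The dictionary-plus-map idea described above could be sketched roughly as follows. The names BRAND_ALIASES and find_brand, the subset of brands, and the sample titles are all assumptions made for illustration, since the real dataset is not shown. Matching is case-sensitive, like str.contains in the question, and the first brand in the dictionary that matches wins.

```python
import pandas as pd

# Hypothetical mapping: brand -> text variants that identify it.
# Only a few of the brands from the question are listed here.
BRAND_ALIASES = {
    "Samsung": ["Samsung", "Sansung", "Samgung", "Galaxy"],
    "Motorola": ["Motorola", "Moto"],
    "Iphone": ["Apple", "Iphone"],
    "Nokia": ["Nokia"],
}


def find_brand(title, aliases=BRAND_ALIASES, default="Otros"):
    """Return the first brand with a variant found in the title, else the default."""
    if not isinstance(title, str):  # missing values (None / NaN)
        return default
    for brand, variants in aliases.items():
        if any(variant in title for variant in variants):
            return brand
    return default


# Illustrative titles, not taken from the real dataset.
df = pd.DataFrame(
    {"title": ["Samsung Galaxy S9", "Moto G6 Plus", "Apple Iphone 8", "Generic Phone", None]}
)
df["brand"] = df["title"].map(find_brand)

# Number of rows per brand, most frequent first.
brand_counts = df["brand"].value_counts()
```

Adding a brand or a new misspelling then only means editing the dictionary, not the matching logic.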