I have the following code:
import random
import pandas as pd
from datetime import datetime

inicio = datetime(2017, 1, 30)
final = datetime(2019, 3, 21)

# 10,000 random dates between inicio and final
datos = []
for i in range(0, 10000):
    datos.append(inicio + (final - inicio) * random.random())
df = pd.DataFrame(datos)
df.rename(columns={0: "Fecha"}, inplace=True)

# Labels Proceso1 .. Proceso10
procesos = []
for a in range(1, 11):
    procesos.append('Proceso' + str(a))

# Repeat each label 1,000 times
total = 0
proceso = []
for i in range(0, 10):
    for j in range(0, 1000):
        proceso.append(procesos[total])
    total += 1
datosProceso = pd.DataFrame(proceso)
datosProceso.rename(index=str, columns={0: "Proceso"}, inplace=True)
This creates two DataFrames with 10,000 rows each: one holds 10,000 random dates and the other holds 10,000 process labels split across 10 processes, i.e. 1,000 rows of Proceso1, 1,000 rows of Proceso2, and so on.
How can I join the two DataFrames into a single one with two columns, process and date? I tried adding IDs and using concat, but that does not give the result I want, join gives me an error, etc.
If I skip the two quickly built DataFrames, I can do it the following way, but it takes 2 to 3 minutes because it goes row by row, overwriting the value at each position, and it also raises SettingWithCopyWarning:
import random
import pandas as pd
from datetime import datetime

inicio = datetime(2017, 1, 30)
final = datetime(2019, 3, 21)

datos = []
for i in range(0, 10000):
    datos.append(inicio + (final - inicio) * random.random())
df = pd.DataFrame(datos)
df.rename(columns={0: "Fecha"}, inplace=True)
df['proceso'] = ''

procesos = []
for a in range(1, 11):
    procesos.append('Proceso' + str(a))

total = 0
posicion = 0
for i in range(0, 10):
    for j in range(0, 1000):
        # Row-by-row chained assignment: slow, and the source of the warning
        df['proceso'][posicion] = procesos[total]
        print(posicion)
        posicion += 1
    total += 1
What complicates things is the index of the two DataFrames. They are different (df has the default integer index, while the index of datosProceso was converted to strings by rename(index=str, ...)), so options such as merge do not work as we would like in these cases. What can be done is to reset the index of both DataFrames to a simple sequential number, which we can do with reset_index(). Conceptually, the two DataFrames then end up with the same index (0 to 9999), so merge() now does what we expect: it joins both DataFrames row by row.
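As a rough sketch of that idea, assuming the two frames are named df and datosProceso as in the question: the list comprehensions are only a shorter way to rebuild the same data, and passing drop=True to reset_index() is my own choice, so that the old index is discarded rather than kept as an extra column.

```python
import random
from datetime import datetime

import pandas as pd

inicio = datetime(2017, 1, 30)
final = datetime(2019, 3, 21)

# Rebuild the two frames from the question (same data, shorter construction).
df = pd.DataFrame(
    {"Fecha": [inicio + (final - inicio) * random.random() for _ in range(10000)]}
)
datosProceso = pd.DataFrame(
    {"Proceso": [f"Proceso{a}" for a in range(1, 11) for _ in range(1000)]}
)
datosProceso.rename(index=str, inplace=True)  # string index, as in the question

# Give both frames the same sequential integer index.
# drop=True (an assumption of this sketch) throws the old index away.
df = df.reset_index(drop=True)
datosProceso = datosProceso.reset_index(drop=True)

# Join on the index: row i of df is paired with row i of datosProceso.
resultado = df.merge(datosProceso, left_index=True, right_index=True)
print(resultado.head())
```

Since both frames share the same index after the reset, resultado has one row per original position, with the columns Fecha and Proceso side by side.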