Asked: 2020-01-27 15:12:02 +0800 CST

Unnesting a dictionary in a DataFrame (Python)

Good day,

I have a problem unnesting the following

df=
    Node1                               ; Node2
0   (22, {'Y': '996.3', 'X': 773.6})    ;(56, {'Y': '996.1', 'X': 773.1})
1   (23, {'Y': '996.5', 'X': 773.8})    ;(57, {'Y': '996.30', 'X': 773.2})
2   (24, {'Y': '996.8', 'X': 773.6})    ;(58, {'Y': '996.16', 'X': 773.69})
3   (25, {'Y': '996.7', 'X': 773.6})    ;(59, {'Y': '996.60', 'X': 773.15})
[4 rows x 2 columns]

type(df)
pandas.core.frame.DataFrame

How can I unnest this DataFrame and convert it into the following?

    Node1                ; Node2
    Num1;    Y1;    X1   ; Num2;     Y2;     X2
0     22; 996.3; 773.6   ;   56;  996.1;  773.1
1     23; 996.5; 773.8   ;   57; 996.30;  773.2
2     24; 996.8; 773.6   ;   58; 996.16; 773.69
3     25; 996.7; 773.6   ;   59; 996.60; 773.15

I would be grateful for your help or any ideas on how to do this.

Cheers

1 Answer

abulafia · Answer 1 · 2020-01-27T16:01:42+08:00

There are several difficulties with your input data. Each cell in your dataframe is a tuple, the second element of that tuple is a dictionary, and on top of that the dictionary values are either strings (in Y) or floats (in X). I understand that you want all of them converted to float in your final structure.

This is the value of df that I will use as input, the same as the one you provided, as shown by print():

>>> print(df)
                              Node1                               Node2
0  (22, {'Y': '996.3', 'X': 773.6})    (56, {'Y': '996.1', 'X': 773.1})
1  (23, {'Y': '996.5', 'X': 773.8})   (57, {'Y': '996.30', 'X': 773.2})
2  (24, {'Y': '996.8', 'X': 773.6})  (58, {'Y': '996.16', 'X': 773.69})
3  (25, {'Y': '996.7', 'X': 773.6})  (59, {'Y': '996.60', 'X': 773.15})
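
The question does not show how df was built, so the following is only a sketch of one way to reconstruct it for experimenting with the steps below. The construction through a dict of lists is an assumption; only the cell values come from the question.

```python
import pandas as pd

# Each cell is a (number, dict) tuple. As in the question, the Y values
# are strings and the X values are floats.
df = pd.DataFrame({
    "Node1": [
        (22, {"Y": "996.3", "X": 773.6}),
        (23, {"Y": "996.5", "X": 773.8}),
        (24, {"Y": "996.8", "X": 773.6}),
        (25, {"Y": "996.7", "X": 773.6}),
    ],
    "Node2": [
        (56, {"Y": "996.1", "X": 773.1}),
        (57, {"Y": "996.30", "X": 773.2}),
        (58, {"Y": "996.16", "X": 773.69}),
        (59, {"Y": "996.60", "X": 773.15}),
    ],
})

print(df)
```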

The first thing that comes to mind is to take a column, for example df.Node1, and use its values to create a new dataframe with pd.DataFrame.from_items(). This constructor expects a list whose elements are tuples (which matches) whose second elements are dictionaries (which also matches).

However, it does not produce the desired result:

>>> print(pd.DataFrame.from_items(list(df.Node1)))
      22     23     24     25
X  773.6  773.8  773.6  773.6
Y  996.3  996.5  996.8  996.7

But we are very close. If we transpose this (the .T attribute), which swaps rows and columns, we almost have it. Along the way, I can use .applymap() to convert all elements to float:

>>> print(pd.DataFrame.from_items(list(df.Node1)).T.applymap(float))
        X      Y
22  773.6  996.3
23  773.8  996.5
24  773.6  996.8
25  773.6  996.7
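
Note that pd.DataFrame.from_items() was deprecated in pandas 0.23 and removed in pandas 1.0, so the call above only runs on older versions. The following is a sketch of an equivalent for newer versions, not the approach used in this answer: it assumes the node numbers are unique (dict() would silently drop duplicates) and selects the column order explicitly instead of relying on dictionary key order.

```python
import pandas as pd

node1 = pd.Series([
    (22, {"Y": "996.3", "X": 773.6}),
    (23, {"Y": "996.5", "X": 773.8}),
    (24, {"Y": "996.8", "X": 773.6}),
    (25, {"Y": "996.7", "X": 773.6}),
], name="Node1")

# dict() turns the list of (number, dict) tuples into {number: dict}.
# orient="index" makes each number a row label, so no transpose is needed.
p1 = pd.DataFrame.from_dict(dict(node1.tolist()), orient="index")

# Y holds strings and X holds floats; astype(float) makes both numeric.
p1 = p1.astype(float)

# Pick the column order explicitly rather than depending on key order.
p1 = p1[["X", "Y"]]

print(p1)
```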

All that is missing is for the numbers 22, 23, 24, 25 to be a regular column called "Num1" instead of the index, and for the columns "X" and "Y" to be renamed to "X1" and "Y1". This can be done by giving the index the name "Num1" and then calling reset_index().

Once that is done, we can do the same with the column Node2 and finally use pd.concat() to concatenate the dataframes obtained in each case.

The following code implements these ideas:

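import pandas as pd

# Note: DataFrame.from_items() is only available in pandas < 1.0.
# Expand Node1: transpose so the node numbers become the index, then cast to float.
p1 = pd.DataFrame.from_items(list(df.Node1)).T.applymap(float)
p1.index.name = "Num1"        # this name becomes the column name after reset_index()
p1.columns = ["X1", "Y1"]
p1.reset_index(inplace=True)

# Same steps for Node2.
p2 = pd.DataFrame.from_items(list(df.Node2)).T.applymap(float)
p2.index.name = "Num2"
p2.columns = ["X2", "Y2"]
p2.reset_index(inplace=True)

# Place both results side by side.
r = pd.concat([p1, p2], axis=1)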

The result in r is:

   Num1     X1     Y1  Num2      X2      Y2
0    22  773.6  996.3    56  773.10  996.10
1    23  773.8  996.5    57  773.20  996.30
2    24  773.6  996.8    58  773.69  996.16
3    25  773.6  996.7    59  773.15  996.60
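
Since the same steps are repeated for each column, they can be wrapped in a function. This is a minimal sketch for pandas 1.0 or later, where from_items() is no longer available; the helper name unnest and its suffix parameter are hypothetical, and it assumes every dictionary has exactly the keys "X" and "Y" and that node numbers within a column are unique.

```python
import pandas as pd

def unnest(column, suffix):
    """Expand a column of (number, {"X": ..., "Y": ...}) tuples.

    Hypothetical helper: returns a frame with the columns
    Num<suffix>, X<suffix>, Y<suffix>.
    """
    frame = pd.DataFrame.from_dict(dict(column.tolist()), orient="index")
    frame = frame.astype(float)                  # strings and floats -> float
    frame = frame[["X", "Y"]].add_suffix(suffix) # X -> X1, Y -> Y1, ...
    # Name the index, then move it into a regular column.
    return frame.rename_axis("Num" + suffix).reset_index()

df = pd.DataFrame({
    "Node1": [(22, {"Y": "996.3", "X": 773.6}), (23, {"Y": "996.5", "X": 773.8}),
              (24, {"Y": "996.8", "X": 773.6}), (25, {"Y": "996.7", "X": 773.6})],
    "Node2": [(56, {"Y": "996.1", "X": 773.1}), (57, {"Y": "996.30", "X": 773.2}),
              (58, {"Y": "996.16", "X": 773.69}), (59, {"Y": "996.60", "X": 773.15})],
})

# Both partial frames share the default 0..3 index, so they align row by row.
r = pd.concat([unnest(df["Node1"], "1"), unnest(df["Node2"], "2")], axis=1)
print(r)
```

Passing keys=["Node1", "Node2"] to pd.concat() works with this helper in the same way as with the original code.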

Additional note

I'm not entirely sure if you also want the columns to be hierarchical, that is, to have the headers "Node1" and "Node2" grouping the three respective columns. If this is the case, just change the last line to:

r = pd.concat([p1, p2], axis=1, keys=["Node1", "Node2"])

to get:

  Node1               Node2                
   Num1     X1     Y1  Num2      X2      Y2
0    22  773.6  996.3    56  773.10  996.10
1    23  773.8  996.5    57  773.20  996.30
2    24  773.6  996.8    58  773.69  996.16
3    25  773.6  996.7    59  773.15  996.60

In this case, though, I don't see any point in renaming the columns X, Y to X1, Y1 and X2, Y2. They could very well keep their original names (with the columns Num1 and Num2 both named Num). That is:

p1 = pd.DataFrame.from_items(list(df.Node1)).T.applymap(float)
p1.index.name = "Num"
p1.reset_index(inplace=True)

p2 = pd.DataFrame.from_items(list(df.Node2)).T.applymap(float)
p2.index.name = "Num"
p2.reset_index(inplace=True)
r = pd.concat([p1, p2], axis=1, keys=["Node1", "Node2"])

Resulting in:

  Node1               Node2                
    Num      X      Y   Num       X       Y
0    22  773.6  996.3    56  773.10  996.10
1    23  773.8  996.5    57  773.20  996.30
2    24  773.6  996.8    58  773.69  996.16
3    25  773.6  996.7    59  773.15  996.60

There is no ambiguity in those repeated names, thanks to the hierarchy. To access, for example, the X column of Node2, you could write:

r[("Node2", "X")]

0    773.10
1    773.20
2    773.69
3    773.15
Name: (Node2, X), dtype: float64
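
Beyond selecting a single column with a tuple, the hierarchy also allows selecting a whole node or one coordinate across all nodes. The sketch below uses a shortened two-row stand-in for r (an assumption made to keep the example self-contained) with the same two-level column layout.

```python
import pandas as pd

# Two-row stand-in for r with the same hierarchical columns.
p1 = pd.DataFrame({"Num": [22, 23], "X": [773.6, 773.8], "Y": [996.3, 996.5]})
p2 = pd.DataFrame({"Num": [56, 57], "X": [773.1, 773.2], "Y": [996.1, 996.3]})
r = pd.concat([p1, p2], axis=1, keys=["Node1", "Node2"])

# Selecting by the top-level label returns a sub-frame with columns Num, X, Y.
node2 = r["Node2"]

# xs() on the second column level picks X for every node;
# the resulting columns are labelled by the remaining level (Node1, Node2).
all_x = r.xs("X", axis=1, level=1)

print(node2)
print(all_x)
```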
