
Question

kev

Asked: 2021-10-31 04:08:29 +0800 CST

How to group data inside a loop in R


I have the dataset downloaded from the Google mobility report, and the idea is to process it and group it by country and by month in order to get a more concise summary. When I loop through all the mobility variables and export them to Excel, the values come out empty or all the same. I leave the script for your consideration.

library(dplyr)
library(lubridate)
library(tidyr)
library(writexl)

Global_Mobility_Report <- read.csv("C:/Global_Mobility_Report.csv")
View(Global_Mobility_Report)

Global_Mobility_Report$date <- as.Date(Global_Mobility_Report$date)                                     
Global_Mobility_Report$mes <- month(Global_Mobility_Report$date)

names <- colnames(Global_Mobility_Report)

for (i in 9:14) {
  
  data = paste("Global_Mobility_Report$", names[i], sep = "")
  
  df <- Global_Mobility_Report %>%
    group_by(mes, country_region) %>% 
    summarise(value_mean = mean(data, na.rm = TRUE))
  
  dataset <- df %>% 
    spread(mes, value_mean)
  nombre = paste(names[i], ".xlsx")
  
  write_xlsx(dataset,nombre)
  
}

The result I expect is an Excel file with the months in the first row and the countries in the first column, with one sheet per variable.

1 Answer

mpaladino · Answer 1 · 2021-10-31T07:54:37+08:00

Here's what I think is a solution, although I'm not entirely clear on the problem. Anyway it could help you solve it or understand it better.

Libraries:

library(readr)
library(tidyverse) # for dplyr and purrr; they could be loaded separately
library(lubridate) # for month()


gmr <- read_csv("Global_Mobility_Report.csv", 
                col_types = cols(date = col_datetime(format = "%Y-%m-%d"))) # date is parsed as a date-time here, at read time

Summary of averages

Instead of using a loop and fixed column positions (9:14), I use summarise(across()) and select the columns by taking advantage of the fact that the ones of interest all end in the same suffix: "from_baseline". I pass the argument na.rm = TRUE to mean() because, if any value is missing, the function otherwise returns NA.

gmr %>% 
  mutate(mes = month(date)) %>% 
  group_by(country_region, mes) %>% 
  summarise(across(ends_with("from_baseline"),  
            mean, na.rm = TRUE)) -> medias_tidy
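
To see what summarise(across()) does, and one likely reason the loop in the question gives empty values, here is a minimal sketch on made-up data. The toy data frame and its values are assumptions for illustration, not the real report. The `~ mean(.x, na.rm = TRUE)` spelling is the lambda form that, as far as I know, newer dplyr versions prefer over passing na.rm through across().

```r
library(dplyr)

# Toy data (hypothetical values) mimicking the shape of the mobility report
toy <- tibble(
  country_region = c("A", "A", "B", "B"),
  mes            = c(1, 1, 1, 1),
  parks_percent_change_from_baseline       = c(10, 20, NA, 4),
  residential_percent_change_from_baseline = c(1, 3, 5, 7)
)

# In the question's loop, `data` is a character string, not a column,
# so mean() cannot average it and returns NA (with a warning)
data <- paste("toy$", "parks_percent_change_from_baseline", sep = "")
string_mean <- suppressWarnings(mean(data, na.rm = TRUE))

# across() applies mean() to every column selected by ends_with()
toy_means <- toy %>%
  group_by(country_region, mes) %>%
  summarise(across(ends_with("from_baseline"), ~ mean(.x, na.rm = TRUE)),
            .groups = "drop")

toy_means
```

Each row of toy_means is one country/month pair, and the missing value in country B is dropped before averaging instead of turning the whole mean into NA.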

Given the way I work, the format of the object medias_tidy looks to me like the final format: each row is an observation (the average for a country and month) and each column is a variable (type of activity). However, from your use of spread() I understand that you want each column to be a month/variable combination and each row a country. If not, you could define the expected result more precisely in the question; you can always edit it.

List by country

Since I think you want an Excel file for each country, I start by converting that data.frame into a list in which each element is a country. That sets up the next step, which is to write one file per country. Because I now have a list, I need an iterator to perform operations on each element. Here I use map(), in which an anonymous function can be declared with the symbol ~ instead of function(x) {}. Inside map() I pivot the data so that each column is a combination of month and variable, and I name those columns by pasting the month and the variable together with names_glue.

pivot_wider() supersedes spread().

I get a list where each item is a data.frame with a single row. Are you looking for that?

split(medias_tidy, medias_tidy$country_region) %>% 
  map(~pivot_wider(.x, 
                   names_from  = mes, 
                   values_from = ends_with("from_baseline"), 
                   names_glue  = "{mes}_{.value}")) -> lista_ancha_por_pais
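
The split-then-pivot step above can be sketched on a small made-up table. The object names toy_means and toy_wide and all values are hypothetical; only the pivot_wider() arguments mirror the answer's code.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Toy monthly means (hypothetical values): two countries, two months
toy_means <- tibble(
  country_region = c("A", "A", "B", "B"),
  mes            = c(1, 2, 1, 2),
  parks_percent_change_from_baseline       = c(15, 12, 4, 6),
  residential_percent_change_from_baseline = c(2, 3, 6, 5)
)

# One list element per country, each pivoted to a single wide row
toy_wide <- split(toy_means, toy_means$country_region) %>%
  map(~ pivot_wider(.x,
                    names_from  = mes,
                    values_from = ends_with("from_baseline"),
                    names_glue  = "{mes}_{.value}"))

names(toy_wide)     # the list is named after the countries
names(toy_wide$A)   # columns combine month and variable
```

Each element ends up with one row and one column per month/variable pair, which is why the result for the real data is a list of single-row data frames.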

Write the files

To write it to disk I use purrr::iwalk(), which belongs to the same family as map() but, instead of returning a list, is called only for its side effect, in this case writing the file. I use iwalk() because it lets me iterate simultaneously over a list (argument .x) and over the names of the list (argument .y). I go over the list above and, for each item, write a file named after the item (the country) with the extension .csv. In this test I write .csv instead of .xlsx because I don't have the library for writing .xlsx installed, but it is more or less the same.

iwalk(lista_ancha_por_pais[1:2], 
      ~write_csv(.x, 
                 file = paste0(.y, ".csv")) # recent readr versions name this argument file; path is deprecated
      )
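
A minimal sketch of how iwalk() pairs each element with its name, using a made-up two-element list and a temporary directory so nothing is written next to your data. The names toy_list and out_dir are hypothetical. It uses file =, the argument name in recent readr versions (older versions used path =).

```r
library(purrr)
library(readr)

# Toy named list (hypothetical values): one data frame per country
toy_list <- list(
  A = data.frame(country_region = "A", value = 15),
  B = data.frame(country_region = "B", value = 4)
)

# Write into a temporary directory to keep the example self-contained
out_dir <- tempdir()

# .x is the list element (a data frame), .y is its name
iwalk(toy_list,
      ~ write_csv(.x, file = file.path(out_dir, paste0(.y, ".csv"))))

written <- file.exists(file.path(out_dir, c("A.csv", "B.csv")))
written
```

iwalk() returns its input invisibly, so the call is used only for the files it creates.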

I hope this helps. If it does not solve the problem, say so in a comment or, better yet, make the question more specific. I understand that applying iterators and handling lists has a steep learning curve, but functional programming is very powerful, and creating the objects step by step makes it much easier to identify errors than doing everything inside a loop.

All the code together:

library(tidyverse)
library(lubridate)
gmr <- read_csv("Global_Mobility_Report.csv", 
                col_types = cols(date = col_datetime(format = "%Y-%m-%d")))

gmr %>% 
  mutate(mes = month(date)) %>% 
  group_by(country_region, mes) %>% 
  summarise(across(ends_with("from_baseline"),  
            mean, na.rm = TRUE)) -> medias_tidy

split(medias_tidy, medias_tidy$country_region) %>% 
  map(~pivot_wider(.x, 
                   names_from  = mes, 
                   values_from = ends_with("from_baseline"), 
                   names_glue  = "{mes}_{.value}")) -> lista_ancha_por_pais

iwalk(lista_ancha_por_pais[1:2], 
      ~write_csv(.x, 
                 file = paste0(.y, ".csv"))
      )

Update

Following a comment by @kev that specifies the problem better, here is a solution.

library(tidyverse)
library(lubridate)
gmr <- read_csv("Global_Mobility_Report.csv", 
                col_types = cols(date = col_datetime(format = "%Y-%m-%d")))

gmr %>% 
  mutate(mes = month(date)) %>% 
  group_by(country_region, mes) %>% 
  summarise(across(ends_with("from_baseline"),  
                   mean, na.rm = TRUE)) -> medias_tidy

medias_tidy %>% 
# Switch to long format so the data can then be split by variable
  pivot_longer(cols = retail_and_recreation_percent_change_from_baseline:residential_percent_change_from_baseline, 
               names_to = "variable", 
               values_to = "valor") %>%
  split(.$variable) %>% 
# Within each list element (one per variable), switch to wide format so that each month is a column
  map(~pivot_wider(.x, names_from = mes, values_from = valor)) -> lista_variables_meses

library(openxlsx) # A gem: it writes xlsx without the rJava dependency that other libraries have


write.xlsx(lista_variables_meses, file = "test_excel.xlsx") # Given a list of data.frames, by default it writes one named sheet per data.frame
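
The long-then-wide reshaping in the update can be checked on a small made-up table before running it on the full report. The names toy_means and toy_by_variable and all values are hypothetical. Dropping the constant variable column and shortening the list names are my own additions, not part of the answer: Excel limits sheet names to 31 characters, and the full variable names are longer than that.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Toy monthly means (hypothetical values): two countries, two months
toy_means <- tibble(
  country_region = c("A", "A", "B", "B"),
  mes            = c(1, 2, 1, 2),
  parks_percent_change_from_baseline       = c(15, 12, 4, 6),
  residential_percent_change_from_baseline = c(2, 3, 6, 5)
)

toy_by_variable <- toy_means %>%
  # Long format: one row per country, month and variable
  pivot_longer(cols = ends_with("from_baseline"),
               names_to = "variable",
               values_to = "valor") %>%
  # One list element per variable
  split(.$variable) %>%
  # Countries in rows, months in columns; the constant column is dropped
  map(~ .x %>%
        select(-variable) %>%
        pivot_wider(names_from = mes, values_from = valor))

# Shorten the names so they fit within Excel's 31-character sheet-name limit
names(toy_by_variable) <- sub("_percent_change_from_baseline$", "",
                              names(toy_by_variable))

toy_by_variable$parks
```

Each element has the layout asked for in the question: countries in the first column and one column per month.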
