I have a CSV file about COVID and I want to find out who has received more PCR tests, men or women. The SEXE column records whether the individual is male or female, and the PCR column records the number of PCR tests performed on each individual. However, I don't know how to calculate this. I think I will have to write an if/else conditional so that if the individual is a man, the PCR count is added to one total, and if it is a woman, it is added to another, and then compare the two totals to see who has received more. If so, I need to learn more about the if conditional, since I still don't know how to use it very well in the shell. I can't think of another way to do it.
This is what I have done so far to locate the data I need from the CSV file:
cat covid19.csv | cut -d "," -f4,8 | head -15
Where the first 15 results would be the following:
SEXE,PCR
Home,2
Dona,0
Dona,1
Dona,0
Dona,1
Dona,0
Dona,0
Home,0
Home,0
Home,0
Home,1
Home,0
Dona,1
Dona,5
Would anyone know how I could calculate the sum of the PCRs for men and for women? I don't know whether sed or awk could be used in this case, but I am not allowed to use them in this exercise, so I am looking for an alternative.
If you install datamash, you could do it with:
Starting from a CSV file like this:
So when you run the datamash command, you get:
Where:
-t , sets the field separator to a comma
-g 1 groups the rows by the first field
sum 2 sums the second column within each group

This assumes the file is as I described above. For any other file, substitute the "1" with the column that contains the gender and the "2" with the column that holds the number of PCRs.
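Putting those options together, the command might look like the sketch below. It assumes a two-column file (gender, then PCR count) built from the rows shown in the question. The --header-in and -s options are my additions: the first skips the SEXE,PCR header line, and the second sorts the input, which datamash needs before grouping.

```shell
# Hypothetical two-column sample built from the rows shown in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
SEXE,PCR
Home,2
Dona,0
Dona,1
Dona,0
Dona,1
Dona,0
Dona,0
Home,0
Home,0
Home,0
Home,1
Home,0
Dona,1
Dona,5
EOF

if command -v datamash >/dev/null 2>&1; then
    # -t ,         fields are separated by commas
    # --header-in  the first line is a header, so skip it
    # -s           sort by the group column before grouping
    # -g 1 sum 2   group on field 1 and sum field 2
    totals=$(datamash -t , --header-in -s -g 1 sum 2 < "$sample")
    printf '%s\n' "$totals"
else
    echo "datamash is not installed" >&2
fi
rm -f "$sample"
```

For the real file, the two columns could be fed in with the cut command from the question, for example cut -d "," -f4,8 covid19.csv | datamash -t , --header-in -s -g 1 sum 2.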
Update
I came up with another way with a script and using only bash:
Resulting in:
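A minimal sketch of such a bash-only script, assuming a two-column file (gender, then PCR count) made from the rows in the question. The variable names sex, pcr, home_total and dona_total are only illustrative.

```shell
# Hypothetical two-column sample built from the rows shown in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
SEXE,PCR
Home,2
Dona,0
Dona,1
Dona,0
Dona,1
Dona,0
Dona,0
Home,0
Home,0
Home,0
Home,1
Home,0
Dona,1
Dona,5
EOF

home_total=0
dona_total=0

# Split each line on the comma: the first field is the gender,
# the second is the number of PCR tests.
while IFS=, read -r sex pcr; do
    if [ "$sex" = "Home" ]; then
        home_total=$((home_total + pcr))
    elif [ "$sex" = "Dona" ]; then
        dona_total=$((dona_total + pcr))
    fi  # any other line, such as the SEXE,PCR header, is skipped
done < "$sample"
rm -f "$sample"

echo "Home: $home_total"
echo "Dona: $dona_total"

# Compare the two totals to see who received more tests.
if [ "$home_total" -gt "$dona_total" ]; then
    echo "Men received more PCR tests"
elif [ "$dona_total" -gt "$home_total" ]; then
    echo "Women received more PCR tests"
else
    echo "Both received the same number of PCR tests"
fi
```

For the 14 sample rows this gives 3 for Home and 8 for Dona.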
In the script, inside the while loop, I assumed that the first variable is the gender and the second, after the comma, is the number. If not, assign the variable names according to their corresponding position in your CSV file.

I would do something like this (ugly, but it works):
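One possible pipeline in that spirit, as a sketch only: it assumes a two-column file (gender, then PCR count) made from the rows in the question, and sum_for is a hypothetical helper name. It uses grep, cut and paste, with no sed or awk.

```shell
# Hypothetical two-column sample built from the rows shown in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
SEXE,PCR
Home,2
Dona,0
Dona,1
Dona,0
Dona,1
Dona,0
Dona,0
Home,0
Home,0
Home,0
Home,1
Home,0
Dona,1
Dona,5
EOF

# sum_for LABEL FILE: total of column 2 for rows whose column 1 is LABEL.
sum_for() {
    # grep keeps the rows for one gender, cut keeps only the counts,
    # and paste joins them into a single expression such as 2+0+0+0+1+0.
    terms=$(grep "^$1," "$2" | cut -d, -f2 | paste -sd+ -)
    # Shell arithmetic evaluates the expression; 0 is used if nothing matched.
    echo $(( ${terms:-0} ))
}

home_total=$(sum_for Home "$sample")
dona_total=$(sum_for Dona "$sample")
rm -f "$sample"

echo "Home: $home_total"
echo "Dona: $dona_total"
```

For the full file, the same function could be pointed at the output of the cut command from the question saved to a temporary file.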
Doing this kind of thing in bash is a bit tricky. I would recommend that you look at Python.