Let's imagine that I have a csv file with the variables (columns) Name and Job, such that:
Pedro Ingeniero
Juan Electricista
Antonio Electricista
Jose Arquitecto
Roberto Ingeniero
Is there any way to use sed to list the Job field together with the number of observations for each value, or is it not possible? I have been searching the internet for how to filter by fields and count records with sed, but so far without much success.
Continuing with the example, the desired output would be:
Ingeniero 2,
Electricista 2,
Arquitecto 1.
Thanks in advance.
In the end I've seen that I can't do what I want using sed alone, so what I've done is delete the empty lines with sed and use cut, sort and uniq to count the repeated lines:
sed '/^$/d' fichero.csv | cut -d ',' -f2 | sort | uniq -c | sort -nr
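To see that pipeline at work, here is a minimal sketch. It assumes the real file is comma-separated (Name,Job) and contains some empty lines, which is what the cut -d ',' and sed '/^$/d' steps suggest; the sample file below is hypothetical and built only for the demonstration.

```shell
# Hypothetical comma-separated version of the sample, with empty lines in between.
datos=$(mktemp)
cat > "$datos" <<'EOF'
Pedro,Ingeniero

Juan,Electricista
Antonio,Electricista

Jose,Arquitecto
Roberto,Ingeniero
EOF

# Drop empty lines, keep the 2nd comma-separated field, count each value,
# and order the result by count, highest first.
resultado=$(sed '/^$/d' "$datos" | cut -d ',' -f2 | sort | uniq -c | sort -nr)
printf '%s\n' "$resultado"
rm -f "$datos"
```

Note that with space-separated lines like the ones shown in the question, cut -d ',' finds no delimiter and passes each line through whole, so the delimiter would have to be ' ' instead.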
Sed is a tool for parsing streams (not surprisingly, the name stands for stream editor, s-ed). To process data and operate on it, it is better to combine it with sort and so on, as you did, or to use something more powerful like Awk, Perl or who knows what. In this case, Awk does it for us in a rather elegant way:
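Putting together the two blocks explained right after this, a minimal sketch of the one-liner could look like the following. It assumes whitespace-separated fields, as in the sample from the question, and writes that sample to a temporary file just for the demonstration. The final sort is my addition, there only because Awk does not guarantee any order for for (item in contador).

```shell
# Sample data from the question, whitespace-separated (assumed layout).
datos=$(mktemp)
cat > "$datos" <<'EOF'
Pedro Ingeniero
Juan Electricista
Antonio Electricista
Jose Arquitecto
Roberto Ingeniero
EOF

# Count how often each value of the 2nd field appears, then print
# every job with its count once the whole file has been read.
resultado=$(awk '{contador[$2]++} END{for (item in contador) print item, contador[item]}' "$datos" | sort)
printf '%s\n' "$resultado"
rm -f "$datos"
```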
Where:
{contador[$2]++}
Defines an array contador[] whose indices are the values in the 2nd column of your file. Every time a line has "Ingeniero", "Electricista" or another job in that column, the array adds one to the matching element, for example contador[Ingeniero], so that it finally ends up holding values such as contador[Ingeniero]=2, contador[Arquitecto]=1 and so on.
END{for(item in contador) print item, contador[item]}
Once Awk has finished traversing the file, the END{} block is executed. In it, we iterate over the contents of the array contador[] and print each index together with the count stored for it.
In your case, it returns:
You can also try this:
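A minimal sketch of this alternative, assuming the same whitespace-separated sample (again written to a temporary file only for the demonstration). In it, $0=$2 replaces the whole record with its second field, and the trailing 1 is an always-true pattern that makes Awk print the record.

```shell
# Sample data from the question, whitespace-separated (assumed layout).
datos=$(mktemp)
cat > "$datos" <<'EOF'
Pedro Ingeniero
Juan Electricista
Antonio Electricista
Jose Arquitecto
Roberto Ingeniero
EOF

# Reduce every line to its 2nd field, sort so that equal values are
# adjacent, and let uniq -c prefix each distinct value with its count.
resultado=$(awk '{$0=$2} 1' "$datos" | sort | uniq -c)
printf '%s\n' "$resultado"
rm -f "$datos"
```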
This just applies a dirty awk trick in which the second field becomes the entire record. The output is then ordered with sort, and the occurrences are counted with uniq -c, resulting in:
Now, if you really love sed with all your heart, or it's a matter of life or death, you can use this script that I adapted from this answer:
Just save it to a file called sed_script (or whatever you like) and give it execute permissions with chmod u+x sed_script.
Finally, run it followed by the name of your file:
As the comments in the script indicate, comment and uncomment this or that profession as you like.