I have a CSV file with 16 columns and I would like to count the missing values in each column. These are the first rows of the file:
2009-01-09,,,0,,,,700.0,0.0,14,1.0,,,,3010,14
2009-01-10,,,0,,,,3050.0,0.0,61,1.0,,,,13129,61
2009-01-11,,,0,,,,4650.0,0.0,93,1.0,,,,20033,93
2009-01-12,,,7,,,,4700.0,0.0,102,1.0,5,,0.0,22031,94
2009-01-13,,,0,,,,6150.0,0.0,123,1.0,,,,26527,123
2009-01-14,,,1,,,,6450.0,0.0,133,1.0,0,,0.0,28276,129
2009-01-15,,,8,,,,6300.0,0.0,140,1.0,6,,0.0,30061,126
2009-01-16,,,2,,,,5400.0,0.0,114,1.0,0,,0.0,23854,108
2009-01-17,,,0,,,,5450.0,0.0,109,1.0,,,,23528,109
Practically all the references I have found on the internet are about counting the lines of a file, not its missing data.
Is there a way to count the cells that hold null values?
Thanks in advance.
You can do it with `awk`. Taking the rows shown above as input, we can run a short `awk` program that examines every field of every line and counts the empty ones.
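The original program did not survive in this copy of the answer; a sketch that fits the description (count empty fields per column; the file name `data.csv` is my assumption) could be:

```shell
# Recreate the sample rows from the question (file name is arbitrary)
cat > data.csv <<'EOF'
2009-01-09,,,0,,,,700.0,0.0,14,1.0,,,,3010,14
2009-01-10,,,0,,,,3050.0,0.0,61,1.0,,,,13129,61
2009-01-11,,,0,,,,4650.0,0.0,93,1.0,,,,20033,93
2009-01-12,,,7,,,,4700.0,0.0,102,1.0,5,,0.0,22031,94
2009-01-13,,,0,,,,6150.0,0.0,123,1.0,,,,26527,123
2009-01-14,,,1,,,,6450.0,0.0,133,1.0,0,,0.0,28276,129
2009-01-15,,,8,,,,6300.0,0.0,140,1.0,6,,0.0,30061,126
2009-01-16,,,2,,,,5400.0,0.0,114,1.0,0,,0.0,23854,108
2009-01-17,,,0,,,,5450.0,0.0,109,1.0,,,,23528,109
EOF

# Count the empty fields in each column; -F , splits every line on commas
awk -F , '
{
    for (i = 1; i <= NF; i++)   # NF = number of fields in this row
        if ($i == "") miss[i]++
    if (NF > cols) cols = NF    # remember the widest row seen
}
END {
    for (i = 1; i <= cols; i++)
        printf "column %d: %d empty\n", i, miss[i] + 0
}' data.csv
```

With the nine rows above this reports 9 empty cells for columns 2, 3, 5, 6, 7 and 13, 5 for columns 12 and 14, and 0 for the rest (64 in total).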
In this script, for each row, we check each field for empty values; with `-F ,` we tell `awk` that the fields are delimited by commas.

Another option is `grep`, which does not count the empty fields but does show them. There is no point in pasting the output here, because it depends on the color `grep` applies with the `--color` parameter: the one-liner prints whole lines, but with the empty fields between commas highlighted in red (it would be worth asking whether this highlighting can only be had with `grep`
).

Another option that occurs to me, which seems somewhat inefficient (which is why I leave it for last), is something like this:
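The command itself was lost in this copy; an inefficient line-by-line loop that matches the description (one process spawned and one output produced per row; `data.csv` again stands for the file with the rows from the question) might be:

```shell
# Hypothetical reconstruction: for every row, print how many of its fields are empty.
# Spawning one awk per line is what makes this approach inefficient.
while IFS= read -r line; do
    printf '%s\n' "$line" |
        awk -F , '{ n = 0; for (i = 1; i <= NF; i++) if ($i == "") n++; print n " empty field(s)" }'
done < data.csv
```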
In this case, each row will generate one line of output.
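Going back to the `grep` option: the exact one-liner also did not survive in this copy, but something in this spirit would match the description (my guess, not necessarily the original; `--color` is a GNU grep feature and `data.csv` holds the rows from the question):

```shell
# Print every line that contains an empty field, with the consecutive commas
# (i.e. the empty fields) highlighted in red on a terminal.
# An empty first or last field would need "^," and ",$" added to the pattern.
grep --color ',,' data.csv
```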