linux - How to handle missing values while calculating average in shell script?
I have a dataset with many missing values, marked by a double slash (//). Part of the data:
input.txt
30
//
10
40
23
44
//
//
31
//
54
//
and so on.
I want to calculate the average over each 6-row interval without considering the missing values. I am trying this, but I am not getting what I need.
awk '{if ($1 -ne "//") sum += $1} (NR%6)==0 {print(sum/NR)}' input.txt
It is giving
24.5
19.33
but the answer should be
29.4
42.5
You need to modify the awk script a bit to obtain the output:
$ awk '!/\//{sum += $1; count++} NR%6==0{print sum/count;sum=count=0}'
Test
$ awk '!/\//{sum += $1; count++} NR%6==0{print sum/count;sum=count=0}' file
29.4
42.5
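For comparison, here is a rough sketch of why the attempt in the question prints 24.5 and 19.33. It divides by NR, which is the running line number for the whole file rather than the number of values in the group, and it never resets sum. Also, awk has no -ne operator (that is a shell test operator; the awk inequality operator is !=). The temporary file and the variable names attempt and fixed are only illustrative.

```shell
# Recreate the 12 sample values from the question, one per line.
input=$(mktemp)
printf '%s\n' 30 '//' 10 40 23 44 '//' '//' 31 '//' 54 '//' > "$input"

# The attempt from the question: sum keeps growing and the divisor is NR,
# so it prints 147/6 after line 6 and 232/12 after line 12.
attempt=$(awk '{if ($1 -ne "//") sum += $1} (NR%6)==0 {print(sum/NR)}' "$input")

# The corrected command: count only lines without a slash, reset after each group.
fixed=$(awk '!/\//{sum += $1; count++} NR%6==0{print sum/count; sum=count=0}' "$input")

echo "attempt:"; echo "$attempt"
echo "fixed:";   echo "$fixed"
rm -f "$input"
```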
What does it do?
!/\//{sum += $1; count++}
!/\// is the pattern. It checks that the line doesn't contain a /.
{sum += $1; count++} is the action performed when the line doesn't contain a /. It adds column 1, $1, to sum and increments count, which tells how many numbers awk has seen since the last print.
NR%6==0{print sum/count;sum=count=0}
When the number of records, NR, is a multiple of 6, it prints the average, sum/count, and resets the count and sum variables.
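The same logic may be easier to follow written out as a commented, multi-line awk program. This is only a sketch: the temporary file stands in for the question's input.txt, and the variable name averages is made up for illustration.

```shell
# Hypothetical sample file holding the 12 values from the question, one per line.
sample=$(mktemp)
printf '%s\n' 30 '//' 10 40 23 44 '//' '//' 31 '//' 54 '//' > "$sample"

averages=$(awk '
    # Pattern: the line contains no slash, so treat it as a number.
    # Action: add column 1 to the running sum and count the value.
    !/\// { sum += $1; count++ }

    # Pattern: the record number is a multiple of 6.
    # Action: print the average of the values seen, then reset both variables.
    NR % 6 == 0 { print sum / count; sum = count = 0 }
' "$sample")

echo "$averages"
rm -f "$sample"
```

With the sample above, the first group averages five numbers and the second group averages two.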
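Edit
To print the average for the last set of lines, which may be fewer than 6 in number, you can use an END block:
$ awk '!/\//{sum += $1; count++} NR%6==0{print sum/count;sum=count=0} END{if (count) print sum/count}' file
29.4
42.5
END{if (count) print sum/count}
The END block is executed when the end of the file is reached. The if (count) guard skips the final print when no numbers are left over, for example when the number of lines is an exact multiple of 6. Without it, sum/count would divide by zero there.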
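To see the END block handle a short final group, here is a small sketch. It uses the 12 values from the question plus two extra lines (8 and 12) that are invented purely for illustration, so the last group has only 2 lines. The END action here checks count before dividing, so nothing is printed when no numbers are left over.

```shell
# 12 values from the question plus two hypothetical trailing values.
data=$(mktemp)
printf '%s\n' 30 '//' 10 40 23 44 '//' '//' 31 '//' 54 '//' 8 12 > "$data"

with_tail=$(awk '
    !/\// { sum += $1; count++ }
    NR % 6 == 0 { print sum / count; sum = count = 0 }
    # Leftover lines: print their average only if at least one number was seen.
    END { if (count) print sum / count }
' "$data")

echo "$with_tail"
rm -f "$data"
```

The third line of output is the average of the two leftover values.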
Edit 2
As an edge case, when no numbers occur in a group of 6 lines, the above script can lead to a divide-by-zero error. The print statement can be written to handle this case:
NR%6==0{print count ? (sum/count) : count;sum=count=0}
This is a basic ternary operator that checks whether count is non-zero. If it is, it prints the divided value; otherwise it prints count, which is 0.
Test
$ awk '!/\//{sum += $1; count++} NR%6==0{print count ? (sum/count) : count;sum=count=0}' file
29.4
42.5
0
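The file used for this test is not shown, so the following is just one way to reproduce it. It assumes a third group of six lines containing only //, appended to the 12 values from the question. The names gaps and guarded are made up for the example.

```shell
# 12 values from the question, followed by an assumed group of six missing values.
gaps=$(mktemp)
printf '%s\n' 30 '//' 10 40 23 44 '//' '//' 31 '//' 54 '//' \
              '//' '//' '//' '//' '//' '//' > "$gaps"

# count is 0 for the third group, so the ternary prints count instead of dividing.
guarded=$(awk '!/\//{sum += $1; count++} NR%6==0{print count ? (sum/count) : count; sum=count=0}' "$gaps")

echo "$guarded"
rm -f "$gaps"
```

Note that a printed 0 here means "no values in this group", which looks the same as a group whose values really average to 0.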