linux - How to handle missing values while calculating average in shell script?
I have a dataset with many missing values, marked with a double slash (//). Part of the data is

input.txt
30
//
10
40
23
44
//
//
31
//
54
//

and so on. I want to calculate the average over each 6-row interval without considering the missing values. I am trying this, but I am not getting what I need.

awk '{if ($1 -ne "//") sum += $1} (NR%6)==0 {print(sum/NR)}' input.txt

It is giving

24.5
19.33

but the answer should come out as

29.4
42.5
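You need to modify the awk script a bit to obtain the output. The attempt in the question divides by NR, the total number of lines read so far, and never resets sum, so missing lines still count in the denominator and each result includes the earlier blocks. Note also that -ne is a shell test operator, not an awk comparison operator.

$ awk '!/\//{sum += $1; count++} NR%6==0{print sum/count; sum=count=0}'

Test

$ awk '!/\//{sum += $1; count++} NR%6==0{print sum/count; sum=count=0}' file
29.4
42.5

What does it do?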
!/\//{sum += $1; count++}

    !/\// is a pattern that checks that the line doesn't contain a /.

    {sum += $1; count++} is the action performed when the line doesn't contain a /. It adds column 1, $1, to sum and increments count, which tells how many numbers awk has seen since the last print.

NR%6==0{print sum/count; sum=count=0}

    When the number of records, NR, is a multiple of 6, it prints the average, sum/count, and resets the count and sum variables.
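As a self-contained check of the explanation above, the sketch below pipes the twelve sample values from the question into the same awk program instead of reading a file. The data and result variable names are only illustrative and are not part of the original answer.

```shell
# Sample data from the question, one value per line; "//" marks a missing value.
data=$(printf '%s\n' 30 // 10 40 23 44 // // 31 // 54 //)

# Lines containing "/" are skipped; every 6th line prints the block average
# and resets both accumulators.
result=$(printf '%s\n' "$data" |
  awk '!/\//{sum += $1; count++} NR%6==0{print sum/count; sum=count=0}')

echo "$result"
```

The first block averages 30, 10, 40, 23 and 44 (147/5), and the second averages 31 and 54 (85/2).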
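Edit

To print the last set of lines, which may be fewer than 6 in number, you can use an END block:

$ awk '!/\//{sum += $1; count++} NR%6==0{print sum/count; sum=count=0} END{if (count) print sum/count}' file
29.4
42.5

END{if (count) print sum/count}

    The END block is executed when the end of the file is reached. The count check skips the final print when no numbers are left over, for example when the number of lines is an exact multiple of 6; without it, the final division would be a division by zero.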
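To see the END block handle a leftover group, here is a minimal sketch that appends three made-up lines (20, //, 40) to the sample data, so the input ends with an incomplete block. The extra values are an assumption for illustration. The if (count) check in END is a small safeguard, added so that an input with no leftover numbers does not trigger a final division by zero.

```shell
# The twelve values from the question plus three made-up trailing lines
# (20, //, 40) that do not fill a complete block of six.
result=$(printf '%s\n' 30 // 10 40 23 44 // // 31 // 54 // 20 // 40 |
  awk '!/\//{sum += $1; count++}
       NR%6==0{print sum/count; sum=count=0}
       END{if (count) print sum/count}')

echo "$result"
```

The third output line is the average of the two leftover numbers, (20 + 40) / 2.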
Edit 2

There is an edge case: when no numbers occur in a block of 6 lines, the above script can lead to a divide-by-zero error. The print statement can be written to handle this case:

NR%6==0{print count ? (sum/count) : count; sum=count=0}

This is a basic ternary operator, which checks whether count is non-zero. If it is, it prints the divided value; otherwise it prints count, which is 0.

Test

$ awk '!/\//{sum += $1; count++} NR%6==0{print count ? (sum/count) : count; sum=count=0}' file
29.4
42.5
0
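The two edits can also be combined. The sketch below is one possible combination and is not shown in the original answer. It uses the ternary guard for full blocks and an END block for a trailing partial block, with an if (count) check added so the final print is skipped when nothing is left over. The input is the question's data followed by a made-up block of six missing values and a made-up partial block (20, //, 40).

```shell
# Question data, then six made-up missing lines, then a made-up partial block.
result=$(printf '%s\n' 30 // 10 40 23 44  // // 31 // 54 //  // // // // // //  20 // 40 |
  awk '!/\//{sum += $1; count++}
       NR%6==0{print count ? (sum/count) : count; sum=count=0}
       END{if (count) print sum/count}')

echo "$result"
```

The all-missing block prints 0 instead of failing, and the partial block at the end still gets its own average.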