linux - How to handle missing values while calculating average in shell script? -


I have a dataset with many missing values, marked by a double slash (//). Part of the data:

input.txt

30
//
10
40
23
44
//
//
31
//
54
//

and so on.

I want to calculate the average over each interval of 6 rows, without considering the missing values. I am trying this, but it does not give what I need.

awk '{if ($1 -ne "//") sum += $1} (NR%6)==0 {print(sum/NR)}' input.txt

It is giving

24.5
19.33

but the answer should be

29.4
42.5

You need to modify the awk script a bit to obtain the output:

$ awk '!/\//{sum += $1; count++} NR%6==0{print sum/count;sum=count=0}'

Test

$ awk '!/\//{sum += $1; count++} NR%6==0{print sum/count;sum=count=0}' file
29.4
42.5

What does it do?

  • !/\//{sum += $1; count++}

    • !/\// is a pattern that checks whether the line does not contain a /.

    • {sum += $1; count++} is the action performed when the line does not contain a /. It adds column 1, $1, to sum and increments count, which tells how many numbers awk has seen until the next print.

  • NR%6==0{print sum/count;sum=count=0} When the number of records, NR, is a multiple of 6, it prints the average, sum/count, and resets the count and sum variables.
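The two rules above can be sketched as a commented, multi-line script. This is only an illustration: the data is the 12-line sample from the question, piped in through a shell variable, and the names data and result are assumptions introduced here.

```shell
# Sample values from the question, one per line; "//" marks a missing value.
data=$(printf '%s\n' 30 // 10 40 23 44 // // 31 // 54 //)

result=$(printf '%s\n' "$data" | awk '
  # Rule 1: for lines without a slash, add the value and count it.
  !/\// { sum += $1; count++ }
  # Rule 2: after every 6th input line, print the average and reset.
  NR % 6 == 0 { print sum / count; sum = count = 0 }
')

echo "$result"
```

The first group holds five numbers (147 / 5) and the second holds two (85 / 2), so the missing lines no longer dilute the averages.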

Edit

To print the last set of lines, which may be fewer than 6 in number, you can use an END block:

$ awk '!/\//{sum += $1; count++} NR%6==0{print sum/count;sum=count=0} END{print sum/count}' file
29.4
42.5

  • END{print sum/count} The END block is executed when the end of the file is reached.
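One caveat with the END block: if the number of lines is an exact multiple of 6, count has already been reset to 0 when END runs, so an unguarded sum/count there would divide by zero. A minimal sketch of one way around this, assuming a simple if (count) guard; the function name avg6 and the two trailing values 20 and 30 are made up for illustration.

```shell
# Hypothetical helper: averages stdin in groups of 6 lines, skipping "//".
avg6() {
  awk '
    !/\// { sum += $1; count++ }
    NR % 6 == 0 { print sum / count; sum = count = 0 }
    # Print the leftover group only if it holds at least one number.
    END { if (count) print sum / count }
  '
}

# 14 lines: two full groups plus a trailing group of two numbers.
partial=$(printf '%s\n' 30 // 10 40 23 44 // // 31 // 54 // 20 30 | avg6)

# 12 lines: exactly two full groups, so END has nothing left to print.
exact=$(printf '%s\n' 30 // 10 40 23 44 // // 31 // 54 // | avg6)

echo "$partial"
echo "$exact"
```

With the guard in place the trailing group of 20 and 30 yields a third average, while the 12-line input prints only the two group averages.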

Edit 2

There is an edge case: when no numbers occur within 6 lines, the above script can lead to a divide-by-zero error. The print statement can be written to handle this case.

print count ? (sum/count) : count;sum=count=0}
  • This is a basic ternary operator that checks whether count is non-zero. If it is, it prints the divided value; otherwise it prints count, which is 0.

Test

$ awk '!/\//{sum += $1; count++} NR%6==0{print count ? (sum/count) : count;sum=count=0}' file
29.4
42.5
0
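To see the ternary guard in action, the sketch below appends a third group made up entirely of missing values to the sample data. That extra group is an assumption added here for illustration, as are the variable names data and result.

```shell
# Sample data plus an assumed third group of six missing values.
data=$(printf '%s\n' 30 // 10 40 23 44 // // 31 // 54 // // // // // // //)

result=$(printf '%s\n' "$data" | awk '
  !/\// { sum += $1; count++ }
  # When count is 0 the ternary prints count itself instead of dividing.
  NR % 6 == 0 { print count ? (sum / count) : count; sum = count = 0 }
')

echo "$result"
```

Note that a printed 0 could also be a genuine average of zero; if the two cases need to be told apart, the else branch could print a marker string instead, though that goes beyond what the answer proposes.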
