Replacing missing data with the mean of a subgroup in R
I have a table in which there is missing data that I'd like to replace with the mean of other, related data, based on some conditions. I have some toy data to show the problem below:
var1   var2  var3
123.1  2.1   113
166.5  2.1   113
200.3  2.1   112
NA     2.1   113
NA     2.1   NA
212.1  3.3   112
...    ...   ...
What I'd like to be able to do is fill in the NA values of var1 with the mean of var1 in the case where they both have the same var2 and var3.

That is, the first NA in the var1 column matches the 1st and 2nd entries on both var2 and var3, so it would get the value (123.1 + 166.5) / 2 = 144.8.

The second NA in the var1 column is missing its var3 information, so it would be given the mean of the other var1 values with var2 = 2.1.
I'm relatively new to R, and I can't seem to get the conditional logic correct. Thanks in advance!
"What I'd like to be able to do is fill in the NA values of var1 with the mean of var2 in the case where they both have the same var3."

Hmm... I don't think that's what you want, but this would do that:
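# Mean of var2 within each var3 group, ignoring NAs
means <- tapply(var2, var3, mean, na.rm = TRUE)
# Fill each missing var1 with the mean for its var3 group
var1[is.na(var1)] <- means[match(var3[is.na(var1)], sort(unique(var3)))]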
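For the fill the question actually describes (match on both var2 and var3 first, then fall back to var2 alone), a minimal sketch in base R might look like the following. It assumes the data sits in a data frame, here called df, which is a name made up for illustration, as is the helper fill_mean. It also assumes the fallback mean should use only the originally observed var1 values, which the question leaves open.

```r
# Toy data from the question (only the six rows shown; the "..." rows are omitted).
# The name `df` is an assumption for illustration.
df <- data.frame(
  var1 = c(123.1, 166.5, 200.3, NA, NA, 212.1),
  var2 = c(2.1, 2.1, 2.1, 2.1, 2.1, 3.3),
  var3 = c(113, 113, 112, 113, NA, 112)
)

# Hypothetical helper: replace NAs in x with the mean of its non-missing values.
fill_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# Step 1: among rows where var3 is known, fill NAs in var1 with the mean
# of var1 for rows sharing the same var2 and var3.
has_v3 <- !is.na(df$var3)
filled <- df$var1
filled[has_v3] <- ave(df$var1[has_v3], df$var2[has_v3], df$var3[has_v3],
                      FUN = fill_mean)

# Step 2: anything still missing (e.g. var3 is NA) falls back to the mean of
# the originally observed var1 values with the same var2.
v2_mean <- ave(df$var1, df$var2, FUN = function(x) mean(x, na.rm = TRUE))
still_na <- is.na(filled)
filled[still_na] <- v2_mean[still_na]

df$var1 <- filled
print(df)
```

On the toy rows, the first NA becomes the mean of 123.1 and 166.5, and the second becomes the mean of 123.1, 166.5 and 200.3. If the fallback should instead include values filled in step 1, compute v2_mean from filled rather than from df$var1.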