Replacing missing data with the mean of a subgroup in R


I have a table with missing data that I'd like to replace with the mean of other, related data, based on some conditions. I have some toy data to show the problem below:

var1     var2    var3
123.1    2.1     113
166.5    2.1     113
200.3    2.1     112
NA       2.1     113
NA       2.1     NA
212.1    3.3     112
...      ...     ...

What I'd like to be able to do is fill in the NA values of var1 with the mean of var1 in the cases where both rows have the same var2 and var3.

I.e., the first NA in the var1 column matches the 1st and 2nd entries on both var2 and var3, so it would get a value of (123.1 + 166.5) / 2.

The second NA in the var1 column is missing its var3 information, so it would be given the mean of all the other var1 values where var2 = 2.1.

I'm relatively new to R and can't seem to get the conditional logic correct. Thanks in advance!

"What I'd like to be able to do is fill in the NA values of var1 with the mean of var2 in the cases where both have the same var3."

Hmm... I don't think that's what you want, but that would be:

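# Mean of var2 within each var3 group (one entry per sorted unique var3 value)
means <- tapply(var2, var3, mean, na.rm = TRUE)
# Replace each missing var1 with the mean belonging to its var3 group
var1[is.na(var1)] <- means[match(var3[is.na(var1)], sort(unique(var3)))]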
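For the behaviour the question describes (fill each NA in var1 with the mean of var1 over rows sharing both var2 and var3, falling back to rows sharing only var2 when var3 is missing or no match exists), a minimal base R sketch might look like the following. The data frame name `df` and the helper `known_mean` are assumptions made for illustration; they do not appear in the original post.

```r
# Toy data from the question; the data frame name `df` is an assumption.
df <- data.frame(
  var1 = c(123.1, 166.5, 200.3, NA, NA, 212.1),
  var2 = c(2.1, 2.1, 2.1, 2.1, 2.1, 3.3),
  var3 = c(113, 113, 112, 113, NA, 112)
)

# Hypothetical helper: mean of the non-missing values, or NA if there are none.
known_mean <- function(values) {
  values <- values[!is.na(values)]
  if (length(values) == 0) NA_real_ else mean(values)
}

# Work on a copy so that filled-in values never feed into later means.
filled <- df$var1
for (i in which(is.na(df$var1))) {
  # which() drops NA comparisons, so a missing var3 simply yields no matches.
  both      <- which(df$var2 == df$var2[i] & df$var3 == df$var3[i])
  only_var2 <- which(df$var2 == df$var2[i])

  m <- known_mean(df$var1[both])
  if (is.na(m)) m <- known_mean(df$var1[only_var2])  # fall back to var2 only
  filled[i] <- m
}
df$var1 <- filled
```

On the toy data, the first NA becomes (123.1 + 166.5) / 2 = 144.8 and the second becomes (123.1 + 166.5 + 200.3) / 3 = 163.3. The explicit loop is slow on large tables, but it keeps the fallback rule easy to read and adjust.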
