Replacing missing data with the mean of a subgroup in R

June 15, 2012

I have a table in which there is missing data that I'd like to replace with the mean of other, related data, based on certain conditions. I have some toy data to show the problem below:

var1     var2    var3
123.1    2.1     113
166.5    2.1     113
200.3    2.1     112
NA       2.1     113
NA       2.1     NA
212.1    3.3     112
...      ...     ...

What I'd like to be able to do is fill in the NA values of var1 with the mean of var1 in the case that both rows have the same var2 and var3.

I.e., the first NA in the var1 column, which matches the 1st and 2nd entries on both var2 and var3, would get the value (123.1 + 166.5) / 2.

The second NA in the var1 column is missing its var3 information, so it would be given the mean of the other var1 values where var2 = 2.1.

I'm relatively new to R and can't seem to get the conditional logic correct. Thanks in advance!

"What I'd like to be able to do is fill in the NA values of var1 with the mean of var2 in the case that both have the same var3."

Hmm... I don't think that's what you want, but here is how you would do that:

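    # Mean of var2 within each var3 group (tapply orders groups by sorted var3 value)
    means <- tapply(var2, var3, mean, na.rm=TRUE)
    # For each row with a missing var1, look up the mean belonging to its var3 group
    var1[is.na(var1)] <- means[match(var3[is.na(var1)], sort(unique(var3)))]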
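
For the behaviour described in the question itself (fill var1 from rows that share both var2 and var3, and fall back to rows that share only var2 when var3 is missing), a minimal sketch in base R might look like the following. The data frame `df` holds the toy rows from the question, and the function name `fill_by_subgroup` is made up for illustration. It also assumes that an empty var2 + var3 subgroup should trigger the var2-only fallback, which the question only implies.

```r
# Toy data from the question.
df <- data.frame(
  var1 = c(123.1, 166.5, 200.3, NA, NA, 212.1),
  var2 = c(2.1, 2.1, 2.1, 2.1, 2.1, 3.3),
  var3 = c(113, 113, 112, 113, NA, 112)
)

# Hypothetical helper: returns var1 with its NAs filled from subgroup means.
fill_by_subgroup <- function(d) {
  filled <- d$var1
  for (i in which(is.na(d$var1))) {
    same2  <- d$var2 == d$var2[i]            # rows with the same var2
    same23 <- same2 & d$var3 == d$var3[i]    # rows with the same var2 and var3
    # which() drops NA comparisons, so a missing var3 yields an empty subgroup
    m <- mean(d$var1[which(same23)], na.rm = TRUE)
    # An empty subgroup gives NaN: fall back to the var2-only mean
    if (is.nan(m)) m <- mean(d$var1[which(same2)], na.rm = TRUE)
    filled[i] <- m
  }
  filled
}

df$var1_filled <- fill_by_subgroup(df)
```

With the toy data, the fourth row should receive (123.1 + 166.5) / 2 and the fifth row the mean of the three observed var1 values where var2 = 2.1. The means are always taken from the original var1 column, so an earlier fill never feeds into a later one.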
