C# - Using IEnumerable.Except on KeyCollection vs. exploiting Dictionary.ContainsKey for mutual subtractions and intersection in relation to performance
I have two dictionaries of type Dictionary<string, object>. I need to find the intersection (I mean the intersection of keys) and the A\B and B\A subtractions, and then perform actions on the objects (in fact the objects are Entity Framework entities, and I have to mark their state as Modified, Added and Deleted respectively, though that's not relevant to the question). Imagine the simplest Venn diagram.
I want to do this in the most efficient way. I think I have two choices:
1) Implement a set of generic extension methods that internally operate with IEnumerable methods on the KeyCollection, like ExceptByKey, for example:
    public static Dictionary<TKey, TValue> ExceptByKeys<TKey, TValue>(
        this Dictionary<TKey, TValue> dict1,
        Dictionary<TKey, TValue> dict2)
    {
        return dict1.Keys
            .Except(dict2.Keys)
            .ToDictionary(key => key, key => dict1[key]);
    }
Then I would operate with these methods to separately process each of the three groups. I know that the KeyCollection.Contains method internally uses the Dictionary<TKey, TValue>.ContainsKey method, so both are O(1). So my Except method would run in O(n), is that right? I would need to use it once for each dictionary and also somehow detect the intersected part, which can be done implicitly by first iterating over all entities in one dictionary and marking them as belonging to the intersection. So, is it something like O(n) + O(n + m)?
2) Iterate over the dictionaries, calling the ContainsKey method on the other dictionary for each element and doing the appropriate thing. This seems to me a better solution because I get only O(n + m) complexity.
So, my questions are:
- Am I right in my calculations?
- Is there a better way that I have not thought of to accomplish what I want?
Update 19/06/2015
So I've chosen the second option and it works OK. Here's my implementation in the wild:
    using (var he = new HostEntities())
    {
        var dbHardDrives = he.HardDrive
            .Where(_ => _.HostName == _address)
            .ToDictionary(_ => _.Name, _ => _);

        foreach (var dbHd in dbHardDrives)
        {
            if (wmiHardDrives.ContainsKey(dbHd.Key))
            {
                // Key in both: detach the loaded entity, mark the WMI one as modified.
                he.Entry(dbHd.Value).State = EntityState.Detached;
                he.Entry(wmiHardDrives[dbHd.Key]).State = EntityState.Modified;
            }
            else
            {
                // Key only in the database: delete.
                he.Entry(dbHd.Value).State = EntityState.Deleted;
            }
        }

        foreach (var wmiHd in wmiHardDrives)
        {
            // Key only in the WMI data: add.
            if (!dbHardDrives.ContainsKey(wmiHd.Key))
            {
                he.Entry(wmiHd.Value).State = EntityState.Added;
            }
        }

        he.SaveChanges();
    }
Your reasoning looks sound to me. LINQ's Except() loads the second collection into a hash set and then iterates over the first collection, performing lookups against that set - O(n + m). Your extension method is therefore O(n + m) too. As you mention, if you want to calculate the three sets of additions, deletions and intersections, you will have to call it multiple times, making option 2 preferable.
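To make the comparison concrete, here is a rough sketch of what option 1 amounts to when all three groups are needed. The class and method names (KeySetSketch, SplitWithLinq) are made up for illustration and do not come from the question. Each LINQ call is O(n + m) on its own, but each one builds its own internal set, so the keys end up being traversed three times.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: names are hypothetical, not from the question.
public static class KeySetSketch
{
    // Option 1: three independent LINQ set operations over the key collections.
    // Every call hashes one key collection and streams the other.
    public static (List<TKey> OnlyInA, List<TKey> InBoth, List<TKey> OnlyInB)
        SplitWithLinq<TKey, TValue>(
            Dictionary<TKey, TValue> a,
            Dictionary<TKey, TValue> b)
    {
        var onlyInA = a.Keys.Except(b.Keys).ToList();    // A \ B
        var inBoth = a.Keys.Intersect(b.Keys).ToList();  // A ∩ B
        var onlyInB = b.Keys.Except(a.Keys).ToList();    // B \ A
        return (onlyInA, inBoth, onlyInB);
    }
}
```

The asymptotic cost is still linear, so the difference from option 2 is a constant factor plus the extra temporary sets, not a change in complexity class.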
You are trying to do an outer join and be able to evaluate the left, inner and right items separately. For an O(n + m) solution you can use this:
    public static JoinResult<TKey> JoinKeys<TKey, TValue>(
        this IDictionary<TKey, TValue> first,
        IDictionary<TKey, TValue> second)
    {
        var left = new List<TKey>();
        var inner = new HashSet<TKey>(); // HashSet to optimize lookups
        var right = new List<TKey>();

        foreach (var l in first.Keys) // O(n)
        {
            if (second.ContainsKey(l))
                inner.Add(l);
            else
                left.Add(l);
        }

        foreach (var r in second.Keys) // O(m) (longhand for clarity)
        {
            if (!inner.Contains(r))
                right.Add(r);
        }

        return new JoinResult<TKey>
        {
            Left = left,
            Inner = inner,
            Right = right
        };
    }

    public class JoinResult<T>
    {
        public IEnumerable<T> Left { get; set; }
        public IEnumerable<T> Inner { get; set; }
        public IEnumerable<T> Right { get; set; }
    }
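Since the question needs the entity objects and not just the keys, a variation on the same two-pass idea could hand back the values directly. This is only a sketch: DictionaryDiffSketch, Diff and DiffByKey are hypothetical names, and it uses TryGetValue so that a matched entry costs one lookup instead of a ContainsKey followed by an indexer call.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: names are hypothetical, not from the answer above.
public static class DictionaryDiffSketch
{
    // Values found only in `first`, matched pairs, and values found only in `second`.
    public sealed class Diff<TValue>
    {
        public List<TValue> Left { get; } = new List<TValue>();
        public List<(TValue First, TValue Second)> Inner { get; } =
            new List<(TValue First, TValue Second)>();
        public List<TValue> Right { get; } = new List<TValue>();
    }

    public static Diff<TValue> DiffByKey<TKey, TValue>(
        IDictionary<TKey, TValue> first,
        IDictionary<TKey, TValue> second)
    {
        var result = new Diff<TValue>();

        foreach (var pair in first) // O(n)
        {
            // One lookup both tests membership and fetches the counterpart.
            if (second.TryGetValue(pair.Key, out var match))
                result.Inner.Add((pair.Value, match));
            else
                result.Left.Add(pair.Value);
        }

        foreach (var pair in second) // O(m)
        {
            if (!first.ContainsKey(pair.Key))
                result.Right.Add(pair.Value);
        }

        return result;
    }
}
```

In the question's scenario, with the database dictionary as `first` and the WMI dictionary as `second`, Left would correspond to the entities to delete, Inner to the pairs to detach and mark as modified, and Right to the entities to add.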