Merging two dataframes, removing duplicates and aggregation in R


I have 2 data frames in R, named house and candidates.

house

          house        region  military_strength
    1     stark         north              20000
    2 targaryen  slaver's bay             110000
    3 lannister   westerlands              60000
    4 baratheon    stormlands              40000
    5    tyrell         reach              30000

candidates

          house                name   region
    1 lannister     jamie lannister  westros
    2     stark          robb stark    north
    3     stark          arya stark  westros
    4 lannister     cersi lannister  westros
    5 targaryen  daenerys targaryen  mereene
    6 baratheon    robert baratheon  westros
    7   mormont       jorah mormont  mereene

I want to merge the 2 data frames on the basis of house. I have done:

    merge(candidates, house, by="house", sort=FALSE)

The output:

          house                name  region.x      region.y  military_strength
    1 lannister     jamie lannister   westros   westerlands              60000
    2 lannister     cersi lannister   westros   westerlands              60000
    3     stark          robb stark     north         north              20000
    4     stark          arya stark   westros         north              20000
    5 targaryen  daenerys targaryen   mereene  slaver's bay             110000
    6 baratheon    robert baratheon   westros    stormlands              40000
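To reproduce the merge, the sketch below rebuilds both inputs by hand from the tables above. The use of stringsAsFactors = FALSE and the name merged are assumptions made for illustration, not part of the original post. By default merge() keeps only the houses present in both data frames, which is why tyrell and mormont are missing from the output.

```r
# Rebuild the two inputs from the tables in the question (values typed by hand).
house <- data.frame(
  house = c("stark", "targaryen", "lannister", "baratheon", "tyrell"),
  region = c("north", "slaver's bay", "westerlands", "stormlands", "reach"),
  military_strength = c(20000, 110000, 60000, 40000, 30000),
  stringsAsFactors = FALSE
)
candidates <- data.frame(
  house = c("lannister", "stark", "stark", "lannister",
            "targaryen", "baratheon", "mormont"),
  name = c("jamie lannister", "robb stark", "arya stark", "cersi lannister",
           "daenerys targaryen", "robert baratheon", "jorah mormont"),
  region = c("westros", "north", "westros", "westros",
             "mereene", "westros", "mereene"),
  stringsAsFactors = FALSE
)

# Same call as in the question: join on "house" without sorting the result.
# Both inputs have a "region" column, so merge() adds the suffixes .x and .y.
merged <- merge(candidates, house, by = "house", sort = FALSE)

# Houses that appear in only one of the inputs are dropped by the default merge.
setdiff(union(house$house, candidates$house), merged$house)
```

If the unmatched candidates should be kept as well, merge() accepts all.x = TRUE, which would leave NA in the columns coming from house.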

I want to remove the second candidate of every house (if any), and its military_strength should be added to the first candidate of the same house.

For example:

4 stark         arya stark      westros     north                   20000 

would be removed, but its 20000 would be added to the military_strength of row 3 (robb stark). How can I do this in an appropriate way?

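Starting with the data.frame df1 obtained after merge(), one could proceed as follows: ave() replaces each military_strength with the sum over its house, and duplicated() then drops every row after the first for each house.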

    df1$military_strength <- with(df1, ave(military_strength, house, FUN=sum))
    df1[!duplicated(df1$house),]
    #      house               name region.x     region.y military_strength
    #1 lannister    jamie lannister  westros  westerlands            120000
    #3     stark         robb stark    north        north             40000
    #5 targaryen daenerys targaryen  mereene slaver's bay            110000
    #6 baratheon   robert baratheon  westros   stormlands             40000

Data used in this example:

    df1 <- structure(list(house = structure(c(2L, 2L, 3L, 3L, 4L, 1L),
                     .Label = c("baratheon", "lannister", "stark", "targaryen"),
                     class = "factor"), name = structure(c(4L, 2L, 5L, 1L, 3L, 6L),
                     .Label = c("arya stark", "cersi lannister", "daenerys targaryen",
                     "jamie lannister", "robb stark", "robert baratheon"),
                     class = "factor"), region.x = structure(c(3L, 3L, 2L, 3L, 1L, 3L),
                     .Label = c("mereene", "north", "westros"), class = "factor"),
                     region.y = structure(c(4L, 4L, 2L, 2L, 1L, 3L),
                     .Label = c("slaver's bay", "the north", "the stormlands",
                      "the westerlands"), class = "factor"),
                     military_strength = c(60000L, 60000L, 20000L, 20000L, 110000L,
                     40000L)), .Names = c("house", "name", "region.x", "region.y",
                     "military_strength"), class = "data.frame", row.names = c("1",
                     "2", "3", "4", "5", "6"))
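As a quick sanity check of the two steps, here is a minimal sketch that runs them on a hand-typed copy of the merged data and compares the result with per-house totals computed separately. The names df2, totals and res, the tapply() cross-check, and the use of plain character columns instead of factors are assumptions added for illustration and are not part of the original answer.

```r
# Hand-typed copy of the merged data, with the region columns left out.
df2 <- data.frame(
  house = c("lannister", "lannister", "stark", "stark", "targaryen", "baratheon"),
  name = c("jamie lannister", "cersi lannister", "robb stark", "arya stark",
           "daenerys targaryen", "robert baratheon"),
  military_strength = c(60000, 60000, 20000, 20000, 110000, 40000),
  stringsAsFactors = FALSE
)

# Independent per-house totals, computed before the column is overwritten.
totals <- tapply(df2$military_strength, df2$house, sum)

# Step 1: replace every value with the total of its house.
# FUN has to be named in full; an unnamed or misspelled argument would be
# taken by ave() as another grouping variable.
df2$military_strength <- ave(df2$military_strength, df2$house, FUN = sum)

# Step 2: keep only the first row of each house.
res <- df2[!duplicated(df2$house), ]

# The surviving rows should carry exactly the per-house totals.
all(res$military_strength == totals[res$house])
res
```

Because duplicated() flags every occurrence after the first, the row that survives for each house is the one that comes first in the merged data, so the row order after merge() decides which candidate is kept.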
