r - Rbind, having data frames within data frames causes errors? -


I have a data.frame that in turn contains data.frames. Using rbind on two identical sets of data (e.g. rbind(k, k)) throws an error:

    Error in xpdrows.data.frame(x, rows, new.rows) :
      number of items to replace is not a multiple of replacement length

Below is the structure of the data object.

    > str(k)
    'data.frame':   25 obs. of  18 variables:
     $ location         :'data.frame':  25 obs. of  5 variables:
      ..$ address   :'data.frame':  25 obs. of  1 variable:
      .. ..$ streetaddress: chr  "astrakangatan 110a" "västmannagatan 85c" "doktor abelins gata 6" "standarvägen 1" ...
      ..$ position  :'data.frame':  25 obs. of  2 variables:
      .. ..$ latitude : num  59.4 59.3 59.3 59.3 59.3 ...
      .. ..$ longitude: num  17.8 18 18.1 18 18 ...
      ..$ namedareas:List of 25
      .. ..$ : chr "hässelby"
      .. ..$ : chr "vasastan"
      .. ..$ : chr "södermalm"
      .. ..$ : chr "gamla Älvsjö"
      .. ..$ : chr "fruängen-hägersten"
      .. ..$ : chr "södermalm"
      .. ..$ : chr "kungsholmen"
      .. ..$ : chr "fruängen"
      .. ..$ : chr "Årsta"
      .. ..$ : chr "telefonplan"
      .. ..$ : chr "kista"
      .. ..$ : chr "Östberga"
      .. ..$ : chr "hägerstensåsen"
      .. ..$ : chr "Östermalm"
      .. ..$ : chr "Årsta"
      .. ..$ : chr "bromma blackeberg"
      .. ..$ : chr "similar listings overwritten here"
      .. ..$ : chr "traneberg"
      .. ..$ : chr "kungsholmen"
      .. ..$ : chr "skärholmen"
      .. ..$ : chr "katarina"
      .. ..$ : chr "farsta stadsdelsområde"
      .. ..$ : chr "kista"
      .. ..$ : chr "bromma"
      .. ..$ : chr "akalla"
      ..$ region    :'data.frame':  25 obs. of  2 variables:
      .. ..$ municipalityname: chr  "stockholm" "stockholm" "stockholm" "stockholm" ...
      .. ..$ countyname      : chr  "stockholms län" "stockholms län" "stockholms län" "stockholms län" ...
      ..$ distance  :'data.frame':  25 obs. of  1 variable:
      .. ..$ ocean: int  NA 2325 1223 6360 NA 329 2630 NA 2837 5537 ...
     $ listprice        : int  1900000 4100000 4875000 2950000 1995000 1395000 2450000 2250000 2550000 1995000 ...
     $ rent             : int  4678 1586 3092 3983 2587 520 1437 3644 2936 2707 ...
     $ floor            : num  1 1 NA 1 3 0.5 1 6 3 NA ...
     $ livingarea       : num  60 40 70 91 37 11 28 59 54 42 ...
     $ source           :'data.frame':  25 obs. of  4 variables:
      ..$ name: chr  "husmanhagberg" "bosthlm" "gripsholms fastighetsförmedling" "fastighetsbyrån" ...
      ..$ id  : int  1610 1499 9895524 1573 58 713 2091 1566 1566 1566 ...
      ..$ type: chr  "broker" "broker" "broker" "broker" ...
      ..$ url : chr  "http://www.husmanhagberg.se/" "http://www.bosthlm.se/" "http://gripsholms.se/" "http://www.fastighetsbyran.se/" ...
     $ rooms            : num  2 2 2.5 3.5 2 1 1 2 2 2 ...
     $ published        : Date, format: "2015-07-17" "2015-07-16" "2015-07-15" "2015-07-10" ...
     $ constructionyear : int  2006 NA 1929 1937 NA 1929 1930 2014 1949 1944 ...
     $ objecttype       : chr  "lägenhet" "lägenhet" "lägenhet" "lägenhet" ...
     $ booliid          : int  1920703 1919949 1896584 1917520 1918145 1918049 1917638 1849399 1916805 1826479 ...
     $ solddate         : Date, format: "2015-07-21" "2015-07-19" "2015-07-20" "2015-07-20" ...
     $ soldprice        : int  2000000 4100000 5175000 4200000 2500000 1850000 2820000 2600000 2900000 2230000 ...
     $ url              : chr  "https://www.booli.se/bostad/lagenhet/hasselby/astrakangatan+110a/1920703" "https://www.booli.se/bostad/lagenhet/vasastan/vastmannagatan+85c/1919949" "https://www.booli.se/bostad/lagenhet/sodermalm/doktor+abelins+gata+6/1896584" "https://www.booli.se/bostad/lagenhet/gamla+alvsjo/standarvagen+1/1917520" ...
     $ isnewconstruction: int  NA NA NA NA NA NA NA NA NA NA ...
     $ plotarea         : int  NA NA NA NA 0 NA NA NA NA NA ...
     $ areasize         : Factor w/ 10 levels "10","20","30",..: 6 4 7 9 3 1 2 5 5 4 ...
     $ pricediff        : int  100000 0 300000 1250000 505000 455000 370000 350000 350000 235000 ...

Is using data frames within data frames ill-advised? Or have I made a mistake?

@simong, your answer is great. However, I'm stumbling upon a non-unique row-names error. Using nrbind() works fine for individual columns or data frames, e.g. mapply(nrbind, k$location, k$location), but somehow it doesn't run on the whole data.frame. Changing the row names via row.names still throws the error.

    > nrbind <- function(x,y) if(is.data.frame(x)) rbind(x,y) else c(x,y)
    > as.data.frame( mapply(nrbind, k, k) )
    Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
      duplicate 'row.names' are not allowed
    In addition: Warning message:
    non-unique values when setting 'row.names': ‘1’, ‘10’, ‘11’, ‘12’, ‘13’, ‘14’, ‘15’, ‘16’, ‘17’, ‘18’, ‘19’, ‘2’, ‘20’, ‘21’, ‘22’, ‘23’, ‘24’, ‘25’, ‘3’, ‘4’, ‘5’, ‘6’, ‘7’, ‘8’, ‘9’

It took me a while to produce a "nested" data frame like yours. Personally, I would avoid nesting data frames like that. A minor adjustment to your code would enable you to use the standard functions in R (see option 2 below).

However, if you insist on having nested data frames, you can "mimic" the functionality of rbind using mapply, with a mix of rbind and c applied depending on whether the elements of the data frame are data frames themselves. I have written a small example with two data frames (see option 1 below).

Option 1: single level of nesting

    a <- letters[1:5]
    xy <- data.frame(x=1:5, y=5:1)

    k <- data.frame(a)
    k[["xy"]] <- xy

    # 'data.frame':   5 obs. of  2 variables:
    #  $ a : Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
    #  $ xy:'data.frame':     5 obs. of  2 variables:
    #   ..$ x: int  1 2 3 4 5
    #   ..$ y: int  5 4 3 2 1

    # rbind() for nested data frames, c() for ordinary columns
    nrbind <- function(x,y) if(is.data.frame(x)) rbind(x,y) else c(x,y)
    as.data.frame( mapply(nrbind, k, k) )

    #    a xy.x xy.y
    # 1  1    1    5
    # 2  2    2    4
    # 3  3    3    3
    # 4  4    4    2
    # 5  5    5    1
    # 6  1    1    5
    # 7  2    2    4
    # 8  3    3    3
    # 9  4    4    2
    # 10 5    5    1

Note that the function nrbind above is "quick and dirty". However, adjusting it to suit your needs should be straightforward.

Also note that the result of mapply is not a nested data frame anymore. Therefore, in order to use option 1 repeatedly, you would have to extend the function nrbind instead of running mapply repeatedly.
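One possible way to extend it so that the nesting survives is sketched below. The function name nested_rbind is hypothetical and not part of the original answer, and the sketch assumes that both inputs have identical column names and nesting structure. It rebuilds the result column by column instead of calling as.data.frame on the whole list.

```r
# Hypothetical helper: row-binds two identically structured data frames
# and keeps nested data frames nested in the result.
nested_rbind <- function(x, y) {
  stopifnot(identical(names(x), names(y)))
  # Recurse into nested data frames, concatenate ordinary columns
  cols <- Map(function(xi, yi) {
    if (is.data.frame(xi)) nested_rbind(xi, yi) else c(xi, yi)
  }, x, y)
  # Start from an empty data frame with the right number of rows,
  # then attach each column (nested or not) by name
  out <- data.frame(row.names = seq_len(nrow(x) + nrow(y)))
  for (nm in names(cols)) out[[nm]] <- cols[[nm]]
  out
}

a  <- letters[1:5]
xy <- data.frame(x = 1:5, y = 5:1)
k  <- data.frame(a = a, stringsAsFactors = FALSE)
k[["xy"]] <- xy

twice  <- nested_rbind(k, k)      # still nested, so it can be reused
thrice <- nested_rbind(twice, k)
str(twice)
```

Because the result has the same shape as the inputs, the function can be applied repeatedly. How c() treats factor columns depends on the R version, so character columns are used here.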

Option 2: redefine as a regular data.frame

    k <- data.frame(a=a, xy=xy)

    # 'data.frame':   5 obs. of  3 variables:
    #  $ a   : Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
    #  $ xy.x: int  1 2 3 4 5
    #  $ xy.y: int  5 4 3 2 1

    rbind(k, k)
    # same result as above

Using a regular data frame is the preferred way of doing this.
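If the nested data frame already exists and cannot easily be rebuilt, it can probably be flattened after the fact. The following is a minimal sketch in base R; flatten_df and the example objects are hypothetical names, and the I() guard for plain list columns is an assumption that has only been checked here on nested data frames.

```r
# Hypothetical helper: turns nested data frames into ordinary columns
# named parent.child, one nesting level per pass.
flatten_df <- function(df) {
  # Repeat until no column is itself a data frame
  while (any(vapply(df, is.data.frame, logical(1)))) {
    cols <- lapply(df, function(col) {
      # Protect plain list columns so data.frame() does not split them up
      if (is.list(col) && !is.data.frame(col)) I(col) else col
    })
    df <- do.call(data.frame, c(cols, list(stringsAsFactors = FALSE)))
  }
  df
}

inner <- data.frame(a = letters[1:5], stringsAsFactors = FALSE)
inner[["z"]] <- data.frame(z1 = 1:5, z2 = 5:1)
nested <- data.frame(a = letters[1:5], stringsAsFactors = FALSE)
nested[["b"]]  <- inner
nested[["xy"]] <- data.frame(x = 1:5, y = 5:1)

flat <- flatten_df(nested)
both <- rbind(flat, flat)   # plain rbind works on the flattened copy
names(flat)
```

Once flattened, the standard functions apply, at the cost of losing the nested structure.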

[Edit:] Option 3: higher levels of nesting

Originally, I didn't see that your data frame was nested multiple times. The two options above work for a single level of nesting or no nesting at all.

Multiple levels of nesting can be addressed by making the whole thing recursive.

    b <- data.frame(a)
    b[["z"]] <- data.frame(z1=1:5, z2=5:1)
    k <- data.frame(a)
    k[["b"]] <- b; k[["xy"]] <- xy

    # 'data.frame':   5 obs. of  3 variables:
    #  $ a : Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
    #  $ b :'data.frame':     5 obs. of  2 variables:
    #   ..$ a: Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
    #   ..$ z:'data.frame':   5 obs. of  2 variables:
    #   .. ..$ z1: int  1 2 3 4 5
    #   .. ..$ z2: int  5 4 3 2 1
    #  $ xy:'data.frame':     5 obs. of  2 variables:
    #   ..$ x: int  1 2 3 4 5
    #   ..$ y: int  5 4 3 2 1

    recursive.rbind <- function(x,y){
      ll <- lapply(seq_along(x), function(i){
        # descend into nested data frames; bind ordinary columns directly
        if(is.data.frame(x[[i]])) recursive.rbind(x[[i]],y[[i]]) else rbind(x[i],y[i])
      })
      names(ll) <- names(x)
      as.data.frame(ll)
    }

    recursive.rbind(k,k)

    #    a b.a b.z.z1 b.z.z2 xy.x xy.y
    # 1  a   a      1      5    1    5
    # 2  b   b      2      4    2    4
    # 3  c   c      3      3    3    3
    # 4  d   d      4      2    4    2
    # 5  e   e      5      1    5    1
    # 6  a   a      1      5    1    5
    # 7  b   b      2      4    2    4
    # 8  c   c      3      3    3    3
    # 9  d   d      4      2    4    2
    # 10 e   e      5      1    5    1
