R - rbind: having data frames within data frames causes errors?
I have a data.frame that in turn contains data.frames. Using rbind on 2 identical sets of the data (e.g. rbind(k, k)) throws an error:

    Error in xpdrows.data.frame(x, rows, new.rows) : number of items to replace is not a multiple of replacement length

Below is the structure of the data object.
    > str(k)
    'data.frame': 25 obs. of 18 variables:
     $ location         :'data.frame': 25 obs. of 5 variables:
      ..$ address   :'data.frame': 25 obs. of 1 variable:
      .. ..$ streetaddress: chr "astrakangatan 110a" "västmannagatan 85c" "doktor abelins gata 6" "standarvägen 1" ...
      ..$ position  :'data.frame': 25 obs. of 2 variables:
      .. ..$ latitude : num 59.4 59.3 59.3 59.3 59.3 ...
      .. ..$ longitude: num 17.8 18 18.1 18 18 ...
      ..$ namedareas:List of 25
      .. ..$ : chr "hässelby"
      .. ..$ : chr "vasastan"
      .. ..$ : chr "södermalm"
      .. ..$ : chr "gamla Älvsjö"
      .. ..$ : chr "fruängen-hägersten"
      .. ..$ : chr "södermalm"
      .. ..$ : chr "kungsholmen"
      .. ..$ : chr "fruängen"
      .. ..$ : chr "Årsta"
      .. ..$ : chr "telefonplan"
      .. ..$ : chr "kista"
      .. ..$ : chr "Östberga"
      .. ..$ : chr "hägerstensåsen"
      .. ..$ : chr "Östermalm"
      .. ..$ : chr "Årsta"
      .. ..$ : chr "bromma blackeberg"
      .. ..$ : chr "similar listings overwritten here"
      .. ..$ : chr "traneberg"
      .. ..$ : chr "kungsholmen"
      .. ..$ : chr "skärholmen"
      .. ..$ : chr "katarina"
      .. ..$ : chr "farsta stadsdelsområde"
      .. ..$ : chr "kista"
      .. ..$ : chr "bromma"
      .. ..$ : chr "akalla"
      ..$ region    :'data.frame': 25 obs. of 2 variables:
      .. ..$ municipalityname: chr "stockholm" "stockholm" "stockholm" "stockholm" ...
      .. ..$ countyname      : chr "stockholms län" "stockholms län" "stockholms län" "stockholms län" ...
      ..$ distance  :'data.frame': 25 obs. of 1 variable:
      .. ..$ ocean: int NA 2325 1223 6360 NA 329 2630 NA 2837 5537 ...
     $ listprice        : int 1900000 4100000 4875000 2950000 1995000 1395000 2450000 2250000 2550000 1995000 ...
     $ rent             : int 4678 1586 3092 3983 2587 520 1437 3644 2936 2707 ...
     $ floor            : num 1 1 NA 1 3 0.5 1 6 3 NA ...
     $ livingarea       : num 60 40 70 91 37 11 28 59 54 42 ...
     $ source           :'data.frame': 25 obs. of 4 variables:
      ..$ name: chr "husmanhagberg" "bosthlm" "gripsholms fastighetsförmedling" "fastighetsbyrån" ...
      ..$ id  : int 1610 1499 9895524 1573 58 713 2091 1566 1566 1566 ...
      ..$ type: chr "broker" "broker" "broker" "broker" ...
      ..$ url : chr "http://www.husmanhagberg.se/" "http://www.bosthlm.se/" "http://gripsholms.se/" "http://www.fastighetsbyran.se/" ...
     $ rooms            : num 2 2 2.5 3.5 2 1 1 2 2 2 ...
     $ published        : Date, format: "2015-07-17" "2015-07-16" "2015-07-15" "2015-07-10" ...
     $ constructionyear : int 2006 NA 1929 1937 NA 1929 1930 2014 1949 1944 ...
     $ objecttype       : chr "lägenhet" "lägenhet" "lägenhet" "lägenhet" ...
     $ booliid          : int 1920703 1919949 1896584 1917520 1918145 1918049 1917638 1849399 1916805 1826479 ...
     $ solddate         : Date, format: "2015-07-21" "2015-07-19" "2015-07-20" "2015-07-20" ...
     $ soldprice        : int 2000000 4100000 5175000 4200000 2500000 1850000 2820000 2600000 2900000 2230000 ...
     $ url              : chr "https://www.booli.se/bostad/lagenhet/hasselby/astrakangatan+110a/1920703" "https://www.booli.se/bostad/lagenhet/vasastan/vastmannagatan+85c/1919949" "https://www.booli.se/bostad/lagenhet/sodermalm/doktor+abelins+gata+6/1896584" "https://www.booli.se/bostad/lagenhet/gamla+alvsjo/standarvagen+1/1917520" ...
     $ isnewconstruction: int NA NA NA NA NA NA NA NA NA NA ...
     $ plotarea         : int NA NA NA NA 0 NA NA NA NA NA ...
     $ areasize         : Factor w/ 10 levels "10","20","30",..: 6 4 7 9 3 1 2 5 5 4 ...
     $ pricediff        : int 100000 0 300000 1250000 505000 455000 370000 350000 350000 235000 ...
Is using data frames within data frames ill-advised? Or have I made a mistake?
@simong, your answer is great, but I'm stumbling upon a non-unique row-names error. Using nrbind() works fine for individual columns or data frames, e.g. mapply(nrbind, k$location, k$location), but somehow it doesn't run when applied to the whole data.frame. Even if I change the row names (row.names), it still throws the error.
    > nrbind <- function(x,y) if(is.data.frame(x)) rbind(x,y) else c(x,y)
    > as.data.frame( mapply(nrbind, k, k) )
    Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
      duplicate 'row.names' are not allowed
    In addition: Warning message:
    non-unique values when setting 'row.names': ‘1’, ‘10’, ‘11’, ‘12’, ‘13’, ‘14’, ‘15’, ‘16’, ‘17’, ‘18’, ‘19’, ‘2’, ‘20’, ‘21’, ‘22’, ‘23’, ‘24’, ‘25’, ‘3’, ‘4’, ‘5’, ‘6’, ‘7’, ‘8’, ‘9’
It took me a while to produce a "nested" data frame like yours. Personally, I would avoid nesting data frames like that. A minor adjustment to your code would enable you to use the standard functions in R (see option 2 below).
However, if you insist on having nested data frames, you can "mimic" the functionality of rbind using mapply and a mix of rbind and c, applied depending on whether the elements of the data frame are data frames themselves. I have written a small example with 2 data frames (see option 1 below).
Option 1: single level of nesting
    a <- letters[1:5]
    xy <- data.frame(x=1:5, y=5:1)
    k <- data.frame(a)
    k[["xy"]] <- xy
    # 'data.frame': 5 obs. of 2 variables:
    #  $ a : Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
    #  $ xy:'data.frame': 5 obs. of 2 variables:
    #   ..$ x: int 1 2 3 4 5
    #   ..$ y: int 5 4 3 2 1

    nrbind <- function(x,y) if(is.data.frame(x)) rbind(x,y) else c(x,y)
    as.data.frame( mapply(nrbind, k, k) )
    #    a xy.x xy.y
    # 1  1    1    5
    # 2  2    2    4
    # 3  3    3    3
    # 4  4    4    2
    # 5  5    5    1
    # 6  1    1    5
    # 7  2    2    4
    # 8  3    3    3
    # 9  4    4    2
    # 10 5    5    1
Note that the function nrbind above is "quick and dirty". However, adjusting it to suit your needs should be straightforward.

Also note that the result of mapply is not a nested data frame anymore. Therefore, in order to use option 1 repeatedly, you would have to extend the function nrbind instead of running mapply repeatedly.
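One way such an extension could look is to rebuild the nested structure after combining the columns, so that the result can be passed in again. The sketch below is only an illustration: the name nested_rbind is made up for this example, it assumes both inputs have identical column names and nesting, and it has only been thought through for the small toy data frame, not for the real data.

```r
# Hypothetical helper (not part of the original answer): row-binds two
# data frames of identical structure and keeps nested data frames nested.
nested_rbind <- function(x, y) {
  stopifnot(identical(names(x), names(y)))
  # Combine column by column; recurse into columns that are data frames.
  cols <- Map(function(cx, cy) {
    if (is.data.frame(cx)) nested_rbind(cx, cy) else c(cx, cy)
  }, x, y)
  # Reassemble with [[<- so that data frame columns are not flattened.
  out <- data.frame(row.names = seq_len(nrow(x) + nrow(y)))
  for (nm in names(cols)) out[[nm]] <- cols[[nm]]
  out
}

# Toy data as in option 1; stringsAsFactors = FALSE is set explicitly
# because c() on factors behaves differently across R versions.
a  <- letters[1:5]
xy <- data.frame(x = 1:5, y = 5:1)
k  <- data.frame(a = a, stringsAsFactors = FALSE)
k[["xy"]] <- xy

kk  <- nested_rbind(k, k)    # 10 rows, xy is still a nested data frame
kkk <- nested_rbind(kk, k)   # the result can be fed back in: 15 rows
```

Because the columns are put back with [[<-, the nesting survives, unlike with as.data.frame(mapply(...)).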
Option 2: redefine as a regular data.frame
    k <- data.frame(a=a, xy=xy)
    # 'data.frame': 5 obs. of 3 variables:
    #  $ a   : Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
    #  $ xy.x: int 1 2 3 4 5
    #  $ xy.y: int 5 4 3 2 1

    rbind(k, k)
    # same result as above
Using a regular data frame is the preferred way of doing this.
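If the nested data frame already exists and cannot easily be rebuilt, it could instead be flattened before calling rbind. The following is a minimal sketch under that assumption; flatten_df is a hypothetical helper, not something from the original code. List columns are copied over unchanged, and whether rbind then copes with them is not covered here.

```r
# Hypothetical helper (not from the original answer): turns nested data
# frame columns into ordinary columns named "outer.inner".
flatten_df <- function(df, prefix = "") {
  out <- data.frame(row.names = seq_len(nrow(df)))
  for (nm in names(df)) {
    col  <- df[[nm]]
    full <- paste0(prefix, nm)
    if (is.data.frame(col)) {
      # Recurse, carrying the column name along as a prefix.
      sub <- flatten_df(col, prefix = paste0(full, "."))
      for (s in names(sub)) out[[s]] <- sub[[s]]
    } else {
      out[[full]] <- col
    }
  }
  out
}

# Same toy data as in option 1.
a  <- letters[1:5]
xy <- data.frame(x = 1:5, y = 5:1)
k  <- data.frame(a = a, stringsAsFactors = FALSE)
k[["xy"]] <- xy

flat <- flatten_df(k)        # columns: a, xy.x, xy.y
both <- rbind(flat, flat)    # plain rbind now applies
```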
[Edit:] Option 3: higher levels of nesting
Originally, I didn't see that your data frame is nested multiple times. The 2 options above work for a single level of nesting or no nesting at all.

Multiple levels of nesting can be addressed by making the whole thing recursive.
    b <- data.frame(a)
    b[["z"]] <- data.frame(z1=1:5, z2=5:1)
    k <- data.frame(a)
    k[["b"]] <- b; k[["xy"]] <- xy
    # 'data.frame': 5 obs. of 3 variables:
    #  $ a : Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
    #  $ b :'data.frame': 5 obs. of 2 variables:
    #   ..$ a: Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
    #   ..$ z:'data.frame': 5 obs. of 2 variables:
    #   .. ..$ z1: int 1 2 3 4 5
    #   .. ..$ z2: int 5 4 3 2 1
    #  $ xy:'data.frame': 5 obs. of 2 variables:
    #   ..$ x: int 1 2 3 4 5
    #   ..$ y: int 5 4 3 2 1

    recursive.rbind <- function(x,y){
      ll <- lapply(seq_along(x), function(i){
        if(is.data.frame(x[[i]])) nrbind(x[[i]],y[[i]]) else rbind(x[i],y[i])
      })
      names(ll) <- names(x)
      as.data.frame(ll)
    }

    recursive.rbind(k,k)
    #    a b.a b.z.z1 b.z.z2 xy.x xy.y
    # 1  a   a      1      5    1    5
    # 2  b   b      2      4    2    4
    # 3  c   c      3      3    3    3
    # 4  d   d      4      2    4    2
    # 5  e   e      5      1    5    1
    # 6  a   a      1      5    1    5
    # 7  b   b      2      4    2    4
    # 8  c   c      3      3    3    3
    # 9  d   d      4      2    4    2
    # 10 e   e      5      1    5    1
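As written, recursive.rbind hands nested data frames to nrbind, which is a plain rbind, so the function itself never descends further. A variant that calls itself at every level might look like the sketch below. The name deep.rbind is made up for this illustration, and the sketch assumes both inputs share the same columns and nesting; like the original, it returns a flat data frame.

```r
# Hypothetical variant of recursive.rbind (not from the original answer)
# that calls itself for every nested data frame it encounters.
deep.rbind <- function(x, y) {
  ll <- lapply(seq_along(x), function(i) {
    if (is.data.frame(x[[i]])) {
      deep.rbind(x[[i]], y[[i]])   # descend one level
    } else {
      rbind(x[i], y[i])            # one-column data frames
    }
  })
  names(ll) <- names(x)
  as.data.frame(ll)                # flattens, prefixing nested names
}

# Toy data with two levels of nesting, as in option 3.
a  <- letters[1:5]
xy <- data.frame(x = 1:5, y = 5:1)
b  <- data.frame(a = a, stringsAsFactors = FALSE)
b[["z"]] <- data.frame(z1 = 1:5, z2 = 5:1)
k  <- data.frame(a = a, stringsAsFactors = FALSE)
k[["b"]] <- b
k[["xy"]] <- xy

res <- deep.rbind(k, k)   # flat data frame with 10 rows
```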