R data.table重塑数据

问题描述:

我使用data.table重塑数据。R data.table重塑数据

library(data.table) 
market <- data.table(
    stkcd=c(1,2), 
    type =c(1,0), 
    roa2013=c(2,3), 
    roa2014=c(4,5), 
    lev2013=c(6,7), 
    lev2016=c(8,9)) 
market 
#  stkcd type roa2013 roa2014 lev2013 lev2016 
# 1:  1 1  2  4  6  8 
# 2:  2 0  3  5  7  9 
melt(market, 
    measure.vars = patterns("^roa", "^lev"), 
    variable.name = "year", 
    value.name = c("roa","lev")) 
#  stkcd type year roa lev 
# 1:  1 1 1 2 6 
# 2:  2 0 1 3 7 
# 3:  1 1 2 4 8 
# 4:  2 0 2 5 9 

这就是最终数据的样子。

#  stkcd type year roa lev 
# 1  1 1 2013 2 6 
# 2  1 1 2014 4 NA 
# 3  1 1 2016 NA 8 
# 4  2 0 2013 3 7 
# 5  2 0 2014 5 NA 
# 6  2 0 2016 NA 9 

有没有人有任何好的方法呢? 谢谢。

+0

对于命名的“年“的值,请参阅[在融化使用模式之后将'变量'列的数值表示转换为原始字符串](https://stackoverflow.com/questions/41883573/convert-numeric-representation-of-variable-column-to-original -string-以下)。 – Henrik

+0

谢谢。我会尝试重塑{统计数据}。 – Cheng

我们可以用splitstackshape轻松做到这一点。创建感兴趣的列的数字和非数字部分之间的分隔符,然后使用merged.stack重塑成“长”和改变” .time_1`列名‘年’

library(splitstackshape) 
names(market) <- sub("(\\d+)", "_\\1", names(market)) 
res <- merged.stack(market, var.stubs = c("roa", "lev"), sep="_") 
setnames(res, ".time_1", "year") 
res 
# stkcd type year roa lev 
#1:  1 1 2013 2 6 
#2:  1 1 2014 4 NA 
#3:  1 1 2016 NA 8 
#4:  2 0 2013 3 7 
#5:  2 0 2014 5 NA 
#6:  2 0 2016 NA 9 
+0

谢谢。这是一个好方法。 – Cheng

+0

@Cheng谢谢您的评论。你也可以勾选[here](https://stackoverflow.com/help/someone-answers) – akrun

1.使用重塑{}统计,

library(data.table) 
market <- data.table(
    stkcd=c(1,2), 
    type =c(1,0), 
    roa2013=c(2,3), 
    roa2014=c(4,5), 
    lev2013=c(6,7), 
    lev2016=c(8,9)) 

market[,`:=`(roa2016=NA,lev2014=NA)] 
long <- reshape(market, 
     idvar = "stkcd", 
     varying = c("roa2013","lev2013", 
        "roa2014","lev2014", 
        "roa2016","lev2016"), 
     sep = "", 
     timevar = "year", 
     direction = "long") 
setorder(long,stkcd,year) 
long 
#  stkcd type year roa lev 
# 1:  1 1 2013 2 6 
# 2:  1 1 2014 4 NA 
# 3:  1 1 2016 NA 8 
# 4:  2 0 2013 3 7 
# 5:  2 0 2014 5 NA 
# 6:  2 0 2016 NA 9 

2.str_extract海峡

library(data.table) 
library(stringr) 
market <- data.table(
    stkcd=c(1,2), 
    type =c(1,0), 
    roa2013=c(2,3), 
    roa2014=c(4,5), 
    lev2013=c(6,7), 
    lev2016=c(8,9)) 
market 
long <- melt(market, 
      id.vars = c("stkcd","type")) 
long[,`:=`(year=str_extract(variable,pattern = "[0-9]{4}"), 
      vars=str_extract(variable,pattern = "[a-zA-Z]{1,}"))][,variable:=NULL] 
long <- dcast(long, stkcd + type + year ~ vars, value.var = "value") 
long 
#  stkcd type year lev roa 
# 1:  1 1 2013 6 2 
# 2:  1 1 2014 NA 4 
# 3:  1 1 2016 8 NA 
# 4:  2 0 2013 7 3 
# 5:  2 0 2014 NA 5 
# 6:  2 0 2016 9 NA 

...