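Shiny - Crosstab with the column totals and row totals as the first row/column
Problem description:
I want to generate a crosstab with row and column totals. I tried the gmodels package for this, because its output looks better than what the ordinary table functions produce. The look of the table matters, since it ultimately has to be displayed with Shiny. The problem is that the column totals and row totals appear at the end of the rows and columns. How can I get the totals as the first row and first column of the table?
Below is a sample of my data.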
library(data.table)
Location <- sample(c("location A","location B","location C","location D","location E"), 20, replace = TRUE)
Brand <- sample(c("Brand A","Brand B","Brand C"), 20, replace = TRUE)
Year <- rep(c("Year 2014","Year 2015"), 10)
Q1 <- sample(1:5, 20, replace = TRUE)
Q2 <- sample(1:5, 20, replace = TRUE)
mydata <- as.data.table(cbind(Location, Brand, Year, Q1, Q2))
The real data is huge, which is why it is a data.table. The code I use to produce the crosstab is:
library("gmodels")
mydata[,CrossTable(Location,Brand,prop.c = T,prop.r = F,prop.t = F,prop.chisq = F,chisq = F,format = "SPSS")]
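This produces output, but the totals are placed at the end of the rows and at the end of the columns. The total column is also missing the column %. How can I get the totals as the first row and first column, and with percentages as well?
Please suggest a way out.
Answer
Have you tried the sjPlot package? It has a very nice function, sjt.xtab, that produces crosstabs (contingency tables) similar to what you are looking for. It has many options to explore; I have used a few of them below. You can check ?sjt.xtab to see the other available options. The code below produces a table with column percentages and with a total column and a total row.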
library(sjPlot)
sjt.xtab(mydata$Location, mydata$Brand,
         show.col.prc = TRUE,
         show.summary = FALSE,
         show.na = FALSE,
         wrap.labels = 50,
         tdcol.col = "#f90470",
         emph.total = TRUE,
         emph.color = "#3aaee5",
         use.viewer = TRUE,
         CSS = list(css.table = "border: 1px solid;",
                    css.tdata = "border: 1px solid;"))
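I had already found the sjPlot package and used it for this case. Yes, it is quite useful and meets the requirements; the only thing is that it does not put the totals in the first row and first column. Still, it is better than the other table outputs. I forgot to post this as an answer, so thank you for posting it. – user1412
Answer
Maybe something like this would do?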
myCT <- function(mydata) {
  # Count table: one row per Location, one column per Brand.
  # data.table's dcast() does not implement `margins`, so totals are added by hand.
  mydata_ct_n <- dcast.data.table(mydata, Location ~ Brand, fun.aggregate = length)
  # Row totals
  mydata_ct_n[, all := rowSums(.SD), by = Location]
  # Column totals, placed as the first row
  mydata_ct_n <- rbind(mydata_ct_n[, lapply(.SD, sum), .SDcols = 2:ncol(mydata_ct_n)],
                       mydata_ct_n, fill = TRUE)
  mydata_ct_n$Location[1] <- "all"
  # Move the totals column and the labels to the front
  foocols <- c("all", "Location")
  setcolorder(mydata_ct_n, c(foocols, setdiff(colnames(mydata_ct_n), foocols)))
  # Percentage table: body cells are column percentages
  mydata_ct_p <- copy(mydata_ct_n)
  body_rows <- 2:nrow(mydata_ct_p)
  for (j in 3:ncol(mydata_ct_p)) {
    set(mydata_ct_p, j = j, value = as.numeric(mydata_ct_p[[j]]))
    set(mydata_ct_p, i = body_rows, j = j,
        value = round(100 * mydata_ct_p[body_rows, j, with = FALSE] / mydata_ct_p[[j]][1], 0))
  }
  # First row: each Brand total as a percentage of the grand total
  set(mydata_ct_p, 1L, 3L:ncol(mydata_ct_p),
      round(100 * mydata_ct_p[1L, 3L:ncol(mydata_ct_p), with = FALSE] / mydata_ct_p[["all"]][1], 0))
  # Combine both tables into "pct% (n)" strings
  for (j in 3:ncol(mydata_ct_p)) {
    set(mydata_ct_p, j = j, value = as.character(mydata_ct_p[[j]]))
    set(mydata_ct_n, j = j, value = as.character(mydata_ct_n[[j]]))
    set(mydata_ct_p, j = j,
        value = paste0(mydata_ct_p[[j]], "% (", mydata_ct_n[[j]], ")"))
  }
  return(mydata_ct_p)
}
Location <- sample(c("location A","location B","location C","location D","location E"),20,replace = T)
Brand <- sample(c("Brand A","Brand B","Brand C"),20,replace = T)
Year <- rep(c("Year 2014","Year 2015"),10)
Q1 <- sample(1:5,20,replace = T)
Q2 <- sample(1:5,20,replace = T)
mydata <- as.data.table(cbind(Location,Brand,Year,Q1,Q2))
out <- myCT(mydata)
print(out)
#    all   Location Brand A Brand B Brand C
# 1:  20        all 30% (6) 35% (7) 35% (7)
# 2:   3 location A  0% (0) 43% (3)  0% (0)
# 3:   5 location B 33% (2) 14% (1) 29% (2)
# 4:   5 location C 50% (3)  0% (0) 29% (2)
# 5:   4 location D 17% (1) 29% (2) 14% (1)
# 6:   3 location E  0% (0) 14% (1) 29% (2)
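Since the question asks for the table to be displayed with Shiny, here is a minimal sketch of how a table like the one above might be rendered. It assumes the shiny package is installed; the `out` data frame is a hypothetical stand-in that copies the first two rows of the printed output, and the output id "crosstab" is an arbitrary name chosen for illustration.

```r
library(shiny)

# Hypothetical stand-in for the table returned by myCT();
# the values copy the first two rows of the printed output above.
out <- data.frame(
  all = c(20, 3),
  Location = c("all", "location A"),
  `Brand A` = c("30% (6)", "0% (0)"),
  `Brand B` = c("35% (7)", "43% (3)"),
  `Brand C` = c("35% (7)", "0% (0)"),
  check.names = FALSE  # keep the spaces in the column names
)

ui <- fluidPage(
  tableOutput("crosstab")  # "crosstab" is an arbitrary output id
)

server <- function(input, output, session) {
  output$crosstab <- renderTable(out)
}

app <- shinyApp(ui, server)

# Only launch the app in an interactive session
if (interactive()) runApp(app)
```

Because the cells are already formatted strings, renderTable() should show them unchanged; styling beyond that would need additional work.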
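You probably don't want `cbind` here. Look at `str(mydata)` and notice that all the columns have been coerced to string/character type. Maybe you want `reshape2::dcast(mydata, Location ~ Brand, margins = TRUE)` here? – Frank
Since `CrossTable` returns NULL, your only option is to modify its source code to suit your needs. –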
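The coercion mentioned in this comment can be seen in a short sketch. It assumes the data.table package is installed; the seed is a hypothetical addition so that the sample is reproducible, and `coerced` is an illustrative name.

```r
library(data.table)

set.seed(1)  # hypothetical seed, only for reproducibility

# Passing the columns directly to data.table() keeps each column's type
mydata <- data.table(
  Location = sample(paste("location", LETTERS[1:5]), 20, replace = TRUE),
  Brand    = sample(paste("Brand", LETTERS[1:3]), 20, replace = TRUE),
  Year     = rep(c("Year 2014", "Year 2015"), 10),
  Q1       = sample(1:5, 20, replace = TRUE),
  Q2       = sample(1:5, 20, replace = TRUE)
)

str(mydata)  # Q1 and Q2 stay integer

# For comparison: cbind() first builds a character matrix,
# so every column ends up as character
coerced <- as.data.table(cbind(Q1 = mydata$Q1, Brand = mydata$Brand))
class(coerced$Q1)
```

With numeric Q1 and Q2 columns, later summaries such as means per Brand work without converting the columns back.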
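As an alternative to modifying `CrossTable`, a totals-first table can also be sketched in base R with `table()` and `addmargins()`. This is only a rough sketch, not the gmodels layout: the seed and the names `counts`, `with_totals`, `totals_first` and `col_pct` are illustrative assumptions.

```r
set.seed(42)  # hypothetical seed, only for reproducibility
Location <- sample(paste("location", LETTERS[1:5]), 20, replace = TRUE)
Brand    <- sample(paste("Brand", LETTERS[1:3]), 20, replace = TRUE)

counts <- table(Location, Brand)

# addmargins() appends a "Sum" row and a "Sum" column at the end
with_totals <- addmargins(counts)

# Move the "Sum" row and column to the front
n_r <- nrow(with_totals)
n_c <- ncol(with_totals)
totals_first <- with_totals[c(n_r, 1:(n_r - 1)), c(n_c, 1:(n_c - 1))]

# Column percentages: divide every cell by the total in the first row
col_pct <- round(100 * sweep(totals_first, 2, totals_first[1, ], "/"))

totals_first
col_pct
```

For display in Shiny, one option might be to pass `as.data.frame.matrix(totals_first)` to `renderTable()` with `rownames = TRUE`.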