R: Merging rows in the same data table and concatenating certain columns

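Question:

I have a data table in R. I want to merge the rows that share the same customerID and concatenate the elements of the other columns for the merged rows.

I want to go from this: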

   title  author customerID
1 title1 author1          1
2 title2 author2          2
3 title3 author3          1

to this:

           title           author Group.1
1 title1, title3 author1, author3       1
2         title2          author2       2

Answer

The aggregate function should help you find a solution:

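dat = data.frame(title = c("title1", "title2", "title3"),
                 author = c("author1", "author2", "author3"),
                 customerID = c(1, 2, 1))
# Group by customerID and combine the remaining columns with c()
aggregate(dat[-3], by=list(dat$customerID), c)
# The output below comes from a session where title and author were factors
# (the data.frame default before R 4.0.0), so the integer codes are shown:
#   Group.1 title author
# 1       1  1, 3   1, 3
# 2       2     2      2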

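To get the actual strings, just make sure you add stringsAsFactors = FALSE when creating the data frame. If your data have already been converted to factors, you can use something like dat[c(1, 2)] = apply(dat[-3], 2, as.character) to convert them to character first, and then: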

aggregate(dat[-3], by=list(dat$customerID), c)
#   Group.1          title           author
# 1       1 title1, title3 author1, author3
# 2       2         title2          author2
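If plain character columns are wanted rather than the list columns that c() produces, one possible variation (not part of the original answer) is to pass paste with a collapse argument as the aggregating function. A minimal sketch, assuming the same dat as above and naming the grouping column customerID instead of the default Group.1:

```r
dat <- data.frame(title = c("title1", "title2", "title3"),
                  author = c("author1", "author2", "author3"),
                  customerID = c(1, 2, 1),
                  stringsAsFactors = FALSE)

# Extra arguments to aggregate() are forwarded to FUN, so each group
# is collapsed into a single comma-separated string.
res <- aggregate(dat[c("title", "author")],
                 by = list(customerID = dat$customerID),
                 FUN = paste, collapse = ", ")

res$title   # "title1, title3" "title2"
res$author  # "author1, author3" "author2"
```

Because every group now yields exactly one string, the result columns are ordinary character vectors and can be written out with write.csv or similar without further conversion.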

Thanks, that works! – 2012-07-09 09:43:41

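@HarryPalmer, I'm not sure I understand your follow-up question. Assuming you've assigned the output of 'aggregate' to another object, say 'temp', then 'temp$title' is a list (something like 'list("0" = c("title1", "title3"), "1" = "title2")'). In this example the 'title' and 'author' columns are lists. What are you looking for? – A5C1D2H2I1M1N2O1R2T1 2012-07-09 10:50:10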
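To see the list columns described in this comment, the aggregated result can be inspected directly. A small sketch, assuming the character version of dat from the answer and the object name 'temp' used in the comment:

```r
dat <- data.frame(title = c("title1", "title2", "title3"),
                  author = c("author1", "author2", "author3"),
                  customerID = c(1, 2, 1),
                  stringsAsFactors = FALSE)
temp <- aggregate(dat[-3], by = list(dat$customerID), c)

is.list(temp$title)   # TRUE: each cell holds a whole vector
lengths(temp$title)   # number of titles in each group (2 and 1 here)
temp$title[[1]]       # the titles collected for the first group
```

Note that aggregate only returns list columns when the groups differ in size. If every group had the same number of rows, it would simplify the result to a vector or matrix column unless simplify = FALSE is passed.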

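Hmm, I think I get it now, thanks. I was confused about the data types. One more question, please: how do I remove the duplicates that appear in the list elements of the columns/rows after aggregating? I tried data1 – 2012-07-09 12:10:41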
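The thread does not show a reply to this follow-up. One way it could be approached is to apply unique to each list element after aggregating, or to deduplicate inside the aggregating function. A sketch under assumptions: dat2 is hypothetical data with a repeated title for customer 1, and titles_unique and res are illustrative names.

```r
# Hypothetical data: customer 1 has the same title twice
dat2 <- data.frame(title = c("title1", "title2", "title1"),
                   author = c("author1", "author2", "author3"),
                   customerID = c(1, 2, 1),
                   stringsAsFactors = FALSE)

# Variant 1: aggregate as in the answer, then deduplicate each list cell
temp <- aggregate(dat2[-3], by = list(customerID = dat2$customerID), c)
titles_unique <- lapply(temp$title, unique)

# Variant 2: deduplicate and collapse to one string per group in one step
res <- aggregate(dat2[-3], by = list(customerID = dat2$customerID),
                 FUN = function(x) paste(unique(x), collapse = ", "))
res$title   # "title1" "title2"
res$author  # "author1, author3" "author2"
```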

Answer

Maybe not the best solution, but it is easy to understand:

df <- data.frame(author = LETTERS[1:5], title = LETTERS[1:5],
                 id = c(1, 2, 1, 2, 3), stringsAsFactors = FALSE)

uniqueIds <- unique(df$id)

# Preallocate one row per unique id; the loop overwrites every cell
mergedDf <- df[seq_along(uniqueIds), ]

for (i in seq_along(uniqueIds)) {
    rows <- df$id == uniqueIds[i]   # rows belonging to the current id
    mergedDf[i, "id"] <- uniqueIds[i]
    mergedDf[i, "author"] <- paste(df[rows, "author"], collapse = ",")
    mergedDf[i, "title"] <- paste(df[rows, "title"], collapse = ",")
}

mergedDf 
# author title id 
#1 A,C A,C 1 
#2 B,D B,D 2 
#3  E  E 3

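Nice, but R has some built-in functions for working with grouped data. For this case the best approach would be 'aggregate(df[-3], by = list(df$id), c)', but 'by(df[-3], df$id, c)' also gives you the same result, just in a completely different format. – A5C1D2H2I1M1N2O1R2T1 2012-07-06 17:20:06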
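For comparison, the two calls named in this comment can be run side by side on the df from the answer. This is only a sketch to illustrate the difference in format; agg and grp are illustrative names.

```r
df <- data.frame(author = LETTERS[1:5], title = LETTERS[1:5],
                 id = c(1, 2, 1, 2, 3), stringsAsFactors = FALSE)

# aggregate() returns a data frame with one row per id
agg <- aggregate(df[-3], by = list(id = df$id), c)
class(agg)          # "data.frame"
agg$author[[1]]     # authors for id 1

# by() returns an object of class "by": one list of columns per id
grp <- by(df[-3], df$id, c)
class(grp)          # "by"
grp[["1"]]$author   # authors for id 1
```

The grouped values are the same in both cases. The aggregate result is usually easier to pass on to other data frame operations, while the by result prints one block per group.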

@mrdwab: thx, I don't use data frames often and didn't know about the 'aggregate' function. – sgibb 2012-07-06 17:22:58
