Trouble making the year into a separate object
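Hi all, I know I've seen posts like this before, but for some reason none of the suggestions I tried have worked. Basically, what I want to do is take the dates from the variable named "Production.Period.End.Date", which are formatted dd/mm/yyyy, and split each part of those dates into separate objects for analysis. My reason for doing this is to take the yearly average kWh production, labeled "Period_kWh_Production", and track how it changes over time. I've pasted my code below in case it helps.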
setwd("C:/Users/fredd/Dropbox/Grad_Life/Spring_2017/AFM/Final_Paper/")
KWTProd.df = read.csv("Merge1//Kwht_Production_07-15.csv", header=T)
##Did this to verify "Production.Period.End.Date"
names(KWTProd.df)
##
names(KWTProd.df)
[1] "Application.Number"
[2] "Program.Administrator"
[3] "Program"
[4] "Total.Cost"
[5] "System.Owner.Sector"
[6] "Host.Customer.Sector"
[7] "Host.Customer.Physical.Address.City"
[8] "Host.Customer.Physical.Address.County"
[9] "Host.Customer.Physical.Address.Zip.Code"
[10] "PBI.Payment.."
[11] "Production.Period.End.Date"
[12] "Period_kWh_Production" <-IT EXISTS ##
##
##Did this to plot changes of Period_kWh_Production over time##
plot(Period_kWh_Production ~ Production.Period.End.Date, data = KWTProd.df)
##Tried to do this to aggregate data in average##
aggregate(Period_kWh_Production~Production.Period.End.Date,KWTProd.df,mean)
##Still too noisy and can't find the mean by year :C##
as.date(Production.Period.End.Date, data = KWTProd.df)
##Says "Production.Period.End.Date" Not found BUT IT EXISTS##
##Tried this to group and summarise by year but it says: Error in UseMethod("mutate_") :
no applicable method for 'mutate_' applied to an object of class "function" ##
summary <- df %>%
mutate(dates = dmy(Production.Period.End.Date),
year = year(Production.Period.End.Date)) %>%
group_by(year) %>%
summarise(mean = mean(x, na.rm = TRUE),
sd = sd(x, na.rm = TRUE))
##Trying this but have no clue how I am supposed to use this##
regexpr("<dd>")
This code depends on the dplyr and lubridate packages. You haven't provided sample data, so it is untested.
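library(lubridate)
library(dplyr)
# Start from the actual data frame; a bare `df` resolves to the stats::df function
summary <- KWTProd.df %>%
  mutate(end_date = dmy(Production.Period.End.Date),   # parse dd/mm/yyyy text into a Date
         production_year = year(end_date)) %>%         # pull the year out of the parsed date
  group_by(production_year) %>%
  summarise(mean_kwH = mean(Period_kWh_Production, na.rm = TRUE),
            sd_kwH = sd(Period_kWh_Production, na.rm = TRUE))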
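To make the idea above concrete, here is a minimal sketch run on a tiny made-up data frame. The object name `toy` and its values are assumptions for illustration only; the two column names are the ones from the question. It also pulls out the day and month, since the question asks for each part of the date.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data standing in for KWTProd.df; the values are made up
toy <- data.frame(
  Production.Period.End.Date = c("31/01/2014", "28/02/2014",
                                 "31/01/2015", "28/02/2015"),
  Period_kWh_Production = c(100, 200, 300, 500),
  stringsAsFactors = FALSE
)

yearly <- toy %>%
  mutate(end_date = dmy(Production.Period.End.Date),  # dd/mm/yyyy text -> Date
         production_year  = year(end_date),
         production_month = month(end_date),
         production_day   = day(end_date)) %>%
  group_by(production_year) %>%
  summarise(mean_kWh = mean(Period_kWh_Production, na.rm = TRUE),
            sd_kWh   = sd(Period_kWh_Production, na.rm = TRUE))

print(yearly)
```

Note that the pipeline has to start from a real data frame. If no object called `df` exists in your session, R resolves `df` to the function `stats::df`, which would explain an error about `mutate_` being applied to an object of class "function".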
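I tried it, but for some reason I keep getting errors: Error: unexpected ')' in: "summarize(mean_kwH = mean(Period_kWh_Production,na.rm = TRUE), sd_kwH = sd(Period_kWh_Production),na.rm = TRUE))" and then: no applicable method for 'mutate_' applied to an object of class "function"
If you add data to your question, we can help. The usual way is to call the function `dput` and paste the result. I suggest you look at http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example. I edited the answer to remove an extra ). – epi99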
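On the `dput` suggestion: with a large data set, one option is to pass only a few rows of the relevant columns, so the output stays short enough to paste. A minimal sketch, where `example_df` is a made-up stand-in for the real data frame (swap in `KWTProd.df`):

```r
# Hypothetical stand-in for the real data frame; replace with KWTProd.df
example_df <- data.frame(
  Production.Period.End.Date = c("31/01/2014", "28/02/2014", "31/03/2014",
                                 "30/04/2014", "31/05/2014"),
  Period_kWh_Production = c(100, 200, 150, 175, 225),
  stringsAsFactors = FALSE
)

# Keep only the columns the question is about, and only the first few rows
sample_rows <- head(
  example_df[, c("Production.Period.End.Date", "Period_kWh_Production")],
  3
)

# Prints an R expression that rebuilds sample_rows; paste that into the question
dput(sample_rows)
```

Anyone answering can then assign the pasted expression to a variable and get the same small data frame back.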
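Sorry, I made this harder than it needed to be, but dput seemed to flood my console with numbers because it is a large dataset. I don't know if this helps, but based on the comments in the link you sent me, I used Pastebin to cut down the amount of output shown, but I still get this result: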
I don't know much about the code, but the regex is '\d{2}/\d{2}/\d{4}' – sln
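For what it's worth, that pattern could be used in base R roughly as follows. This is only a sketch on made-up values: the vectors `dates` and `kwh` are hypothetical, and capture groups and anchors have been added to the regex so each part can be extracted. Backslashes are doubled because the pattern sits inside an R string.

```r
# Hypothetical sample values; the last date is deliberately malformed
dates <- c("31/01/2014", "28/02/2014", "31/01/2015", "n/a")
kwh   <- c(100, 200, 300, 400)

# Same regex as above, anchored and with one capture group per date part
pattern <- "^(\\d{2})/(\\d{2})/(\\d{4})$"
ok <- grepl(pattern, dates, perl = TRUE)   # TRUE where the text looks like dd/mm/yyyy

parts <- data.frame(
  day   = as.integer(sub(pattern, "\\1", dates[ok], perl = TRUE)),
  month = as.integer(sub(pattern, "\\2", dates[ok], perl = TRUE)),
  year  = as.integer(sub(pattern, "\\3", dates[ok], perl = TRUE)),
  kwh   = kwh[ok]
)

# Mean production per year, base R only
yearly_mean <- tapply(parts$kwh, parts$year, mean)
print(yearly_mean)
```

A regex only checks the shape of the text, so something like 99/99/2014 would still pass. Parsing with `as.Date(dates, format = "%d/%m/%Y")` (capital D; R is case-sensitive, and base R has no `as.date`) gives `NA` for impossible dates instead.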