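Select a column by rule across columns
Question:
This question is a variation of an earlier topic, but I can't work out why I am not getting the correct result.
My data: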
dput(temp)
structure(list(MB = c("4001826", "4007824", "4007948", "4010876",
"4015215"), Margin = c(900, 30733.0616, 15525, 2689.05865, 4340
), T1 = c(300, 11296.491, 38810, 1379.44, 870), T2 = c(360, 12706.491,
46404, 1466.44, 1050), T3 = c(390, 13430.491, 49781, 1574.44,
1141), T4 = c(420, 15146.491, 55274, 1720.44, 1230), T5 = c(900,
30972.2633, 109829.852, 1807.44, 2670), T6 = c(960, 41017.3059,
119443.9056, 2718.2, 2850), T7 = c(1020, 42079.3059, 128232.9056,
2907.2, 3020), T8 = c(1200, 44461.3059, 151137.9056, 3314.2,
3540), T9 = c(1500, 46936.3059, 180746.9056, 3746.2, 4400), T10 = c(1800,
48246.3059, 199116.9056, 3746.2, 5260), T11 = c(1530, 35279.3059,
144154.9056, 2748.2, 4415), T12 = c(1500, 33350.3059, 138818.9056,
2881.2, 4330), T13 = c(1500, 34719.3059, 140508.9056, 2893.2,
4330), T14 = c(1800, 58092.3059, 205687.9056, 2463.2, 5220),
T15 = c(390, 35438.0846, 68364.8492, 2987.1718, 1172), T16 = c(390,
32038.0139, 64451.0925, 2655.5102, 1162), T17 = c(390, 30219.2716,
67860.3977, 2462.239, 1162), T18 = c(608.397, 49543.5875,
113689.9478, 3643.7126, 1872), T19 = c(660, 34080.84615,
85176.3018, 2284.9598, 1923)), .Names = c("MB", "Margin",
"T1", "T2", "T3", "T4", "T5", "T6", "T7", "T8", "T9", "T10",
"T11", "T12", "T13", "T14", "T15", "T16", "T17", "T18", "T19"
), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"))
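What I want to do is find the T column whose value is closest to Margin while being less than or equal to it. So my rule is:
- min(T value <= Margin), i.e. the T value nearest to Margin without exceeding it
I tried this: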
temp <- gather(temp, Closest_column, val, T1:T19) %>%
group_by(MB) %>%
slice(max(which(val <= Margin)[1]))
But two odd things happen. First, temp comes back with 4 rows instead of 5; second, the result is incorrect:
head(temp)
# A tibble: 4 x 4
# Groups: MB [4]
MB Margin Closest_column val
<chr> <dbl> <chr> <dbl>
1 4001826 900.000 T1 300.00
2 4007824 30733.062 T1 11296.49
3 4010876 2689.059 T1 1379.44
4 4015215 4340.000 T1 870.00
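For the row with Margin = 900.00, the assigned closest column is T1, but it should be T5, where the value equals Margin (Margin = 900.00, T5 = 900.00).
For the row with Margin = 30733.062, it should be T17, which has the smallest difference from Margin (Margin = 30733.062, T17 = 30219.27).
Any clue as to where I went wrong?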
Answer
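I modified your code a bit. The rules are changed as follows:
- Filter the records where val is smaller than or equal to Margin.
- Arrange by the absolute difference between Margin and each T column.
- If there are ties, choose the T column with the smaller column number.
Here is the code. temp2 is the final output.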
temp2 <- temp %>%
gather(Col, val, T1:T19) %>%
# Filter those records with val smaller than or equal to Margin
filter(val <= Margin) %>%
# Calculate the absolute difference between Margin and val
mutate(Diff = abs(val - Margin)) %>%
# Create factor for the Closest_column
mutate(Col = factor(Col, levels = paste0("T", 1:(ncol(temp) - 2)))) %>%
# Sort by MB, Diff, then Col
arrange(MB, Diff, Col) %>%
group_by(MB) %>%
slice(1) %>%
rename(Closest_column = Col)
temp2
# A tibble: 4 x 5
# Groups: MB [4]
MB Margin Closest_column val Diff
<chr> <dbl> <fctr> <dbl> <dbl>
1 4001826 900.000 T5 900.00 0.00000
2 4007824 30733.062 T17 30219.27 513.79000
3 4010876 2689.059 T16 2655.51 33.54845
4 4015215 4340.000 T12 4330.00 10.00000
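The output above has four rows because MB 4007948 has no T value at or below its Margin (15525), so the filter step drops that group entirely. If a row for it is still wanted, one option is to return NA for such cases. The following is a minimal base R sketch of that idea, not the method used in the answer: `closest_below` is a hypothetical helper, and `small` is a reduced copy of the question's data that keeps only five of the T columns.

```r
# Hypothetical helper (not part of the original answer): for one row, return
# the name of the T column whose value is closest to Margin without exceeding
# it, or NA when every T value is larger than Margin.
closest_below <- function(margin, t_values) {
  ok <- which(t_values <= margin)            # positions of eligible columns
  if (length(ok) == 0) return(NA_character_) # nothing at or below Margin
  # which.max() returns the first maximum, so ties go to the earlier column
  names(t_values)[ok[which.max(t_values[ok])]]
}

# Reduced copy of the question's data (only T1, T5, T12, T16 and T17 are kept)
small <- data.frame(
  MB     = c("4001826", "4007824", "4007948", "4010876", "4015215"),
  Margin = c(900, 30733.0616, 15525, 2689.05865, 4340),
  T1     = c(300, 11296.491, 38810, 1379.44, 870),
  T5     = c(900, 30972.2633, 109829.852, 1807.44, 2670),
  T12    = c(1500, 33350.3059, 138818.9056, 2881.2, 4330),
  T16    = c(390, 32038.0139, 64451.0925, 2655.5102, 1162),
  T17    = c(390, 30219.2716, 67860.3977, 2462.239, 1162),
  stringsAsFactors = FALSE
)

t_cols <- c("T1", "T5", "T12", "T16", "T17")

# Apply the helper row by row; unlist() turns one row into a named vector
small$Closest_column <- vapply(
  seq_len(nrow(small)),
  function(i) closest_below(small$Margin[i], unlist(small[i, t_cols])),
  character(1)
)

small[, c("MB", "Margin", "Closest_column")]
```

With this reduced data the helper gives T5, T17, NA, T16 and T12, so all five MB values are kept and the third one is marked as having no eligible column.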
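This is only because there is one case where the condition is not TRUE for any element of that group. In that case, what do you want as output? Also, regarding the other query, we are only checking `which` values are smaller than the other, and then getting the max of the index rather than of the value. So you may need 'gather(temp, closest_column, val, T1:T19) %>% group_by(MB) %>% slice(which.min(abs(val - Margin)))' – akrun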
The third row is the problem: testwise %rowwise()%> mutate(lessthanistrue = any(c(T1:T19) – CPak
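The point in the comments about taking the max of the index rather than of the value can be seen on a small vector. This is only an illustrative sketch in base R: the values come from the first row of the question's data, and the names `eligible`, `first_only` and `best` are made up for the example.

```r
# Values taken from the first row of the question's data (MB 4001826)
val    <- c(T1 = 300, T2 = 360, T5 = 900, T6 = 960)
margin <- 900

# Positions of the values that do not exceed margin: 1, 2, 3 (T1, T2, T5)
eligible <- which(val <= margin)

# The question's expression: [1] keeps only the first eligible position,
# so max() sees a single number and the first eligible column is chosen.
first_only <- max(eligible[1])

# Compare the values rather than the positions to find the largest
# eligible value, i.e. the one closest to margin from below.
best <- eligible[which.max(val[eligible])]

names(val)[first_only]   # "T1"
names(best)              # "T5"
```

Note that `which.min(abs(val - Margin))` on its own does not enforce `val <= Margin`, so it may pick a column above Margin when that one happens to be closer. Filtering first, as the answer does, avoids that.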