Selecting a column by a rule across columns

Problem description:

This question is a variation of another topic, but I don't understand why I am not getting the correct result.

My data:

dput(temp) 
structure(list(MB = c("4001826", "4007824", "4007948", "4010876", 
"4015215"), Margin = c(900, 30733.0616, 15525, 2689.05865, 4340 
), T1 = c(300, 11296.491, 38810, 1379.44, 870), T2 = c(360, 12706.491, 
46404, 1466.44, 1050), T3 = c(390, 13430.491, 49781, 1574.44, 
1141), T4 = c(420, 15146.491, 55274, 1720.44, 1230), T5 = c(900, 
30972.2633, 109829.852, 1807.44, 2670), T6 = c(960, 41017.3059, 
119443.9056, 2718.2, 2850), T7 = c(1020, 42079.3059, 128232.9056, 
2907.2, 3020), T8 = c(1200, 44461.3059, 151137.9056, 3314.2, 
3540), T9 = c(1500, 46936.3059, 180746.9056, 3746.2, 4400), T10 = c(1800, 
48246.3059, 199116.9056, 3746.2, 5260), T11 = c(1530, 35279.3059, 
144154.9056, 2748.2, 4415), T12 = c(1500, 33350.3059, 138818.9056, 
2881.2, 4330), T13 = c(1500, 34719.3059, 140508.9056, 2893.2, 
4330), T14 = c(1800, 58092.3059, 205687.9056, 2463.2, 5220), 
    T15 = c(390, 35438.0846, 68364.8492, 2987.1718, 1172), T16 = c(390, 
    32038.0139, 64451.0925, 2655.5102, 1162), T17 = c(390, 30219.2716, 
    67860.3977, 2462.239, 1162), T18 = c(608.397, 49543.5875, 
    113689.9478, 3643.7126, 1872), T19 = c(660, 34080.84615, 
    85176.3018, 2284.9598, 1923)), .Names = c("MB", "Margin", 
"T1", "T2", "T3", "T4", "T5", "T6", "T7", "T8", "T9", "T10", 
"T11", "T12", "T13", "T14", "T15", "T16", "T17", "T18", "T19" 
), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame")) 

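What I want to do is find the T column whose value is closest to Margin while being less than or equal to it. So my rule is:

  • min(T value <= Margin), i.e. the smallest gap between Margin and a T value that does not exceed it

This is what I tried: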

library(dplyr)
library(tidyr)

temp <- gather(temp, Closest_column, val, T1:T19) %>%
    group_by(MB) %>%
    slice(max(which(val <= Margin)[1]))

But two strange things happen. First, temp comes back with 4 rows instead of 5. Second, the result is incorrect:

head(temp) 
# A tibble: 4 x 4 
# Groups: MB [4] 
     MB Margin Closest_column  val 
    <chr>  <dbl>   <chr> <dbl> 
1 4001826 900.000    T1 300.00 
2 4007824 30733.062    T1 11296.49 
3 4010876 2689.059    T1 1379.44 
4 4015215 4340.000    T1 870.00 

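For the first row, the assigned closest column is T1, but it should be T5, whose value is equal to the margin (Margin = 900.00, T5 = 900.00).

For the second row, it should be T17, which has the smallest difference from the margin (Margin = 30733.062, T17 = 30219.27).

Any clue what I am doing wrong?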

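The missing row is only because there is one group for which the condition is not TRUE for any element. What do you want as output in that case? As for the other issue, the code only checks 'which' values are smaller than Margin and then takes the max of the index rather than of the value. So you may need 'gather(temp, closest_column, val, T1:T19) %>% group_by(MB) %>% slice(which.min(abs(val - Margin)))' – akrun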
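
To make the comment above concrete, here is a minimal base R sketch of what the indexing in the question does for a single group. The vectors are a hypothetical cut-down copy of the first and third rows of temp, and the names margin, val, ok, best and none are made up for illustration.

```r
# Hypothetical cut-down copy of the first row of temp (Margin = 900)
margin <- 900
val    <- c(T1 = 300, T2 = 360, T3 = 390, T4 = 420, T5 = 900, T6 = 960)

which(val <= margin)          # positions 1 to 5 satisfy the rule
which(val <= margin)[1]       # the [1] keeps only the first position
max(which(val <= margin)[1])  # max() of a single number, so T1 is always picked

# Restrict to the qualifying positions first, then take the largest value
ok   <- which(val <= margin)
best <- ok[which.max(val[ok])]
names(best)                   # "T5"

# Cut-down third row: Margin = 15525 and every T value is larger
none <- which(c(T1 = 38810, T2 = 46404) <= 15525)
none[1]                       # NA, because no position qualifies
```

An NA index leaves slice() with nothing to keep, which appears to be why one MB group is missing from the output in the question. Note also that which.min(abs(val - Margin)) on its own does not enforce val <= Margin, so it can return a column whose value is above the margin.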

The third row is the problem: testwise %rowwise()%> mutate(lessthanistrue = any(c(T1:T19) … – CPak

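I modified your code a bit. The rules are changed as follows:

  1. Keep only the records where val is less than or equal to Margin.
  2. Sort by the absolute difference between Margin and each T column.
  3. If there is a tie, choose the T column with the smaller column number.

Here is the code. temp2 is the final output.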

library(dplyr)
library(tidyr)

# temp is the original wide data frame from the question
temp2 <- temp %>%
    gather(Col, val, T1:T19) %>%
    # Keep only the records with val smaller than or equal to Margin
    filter(val <= Margin) %>%
    # Calculate the absolute difference between Margin and val
    mutate(Diff = abs(val - Margin)) %>%
    # Turn Col into a factor ordered T1..T19 so that ties sort by column number
    mutate(Col = factor(Col, levels = paste0("T", 1:(ncol(temp) - 2)))) %>%
    # Sort by MB, Diff, then Col
    arrange(MB, Diff, Col) %>%
    # Keep the first (closest) record of each MB
    group_by(MB) %>%
    slice(1) %>%
    rename(Closest_column = Col)
temp2 
# A tibble: 4 x 5 
# Groups: MB [4] 
     MB Margin Closest_column  val  Diff 
    <chr>  <dbl>   <fctr> <dbl>  <dbl> 
1 4001826 900.000    T5 900.00 0.00000 
2 4007824 30733.062   T17 30219.27 513.79000 
3 4010876 2689.059   T16 2655.51 33.54845 
4 4015215 4340.000   T12 4330.00 10.00000
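
The output above still has only 4 rows, because the MB whose T values are all above Margin is removed by the filter. If that row should be kept with NA instead, the following is one possible sketch, not the only way to do it. It uses pivot_longer(), which supersedes gather() in current tidyr, and it assumes a small made-up subset of the question's data (the names small, closest and result are hypothetical) so that it runs on its own.

```r
library(dplyr)
library(tidyr)

# Illustrative subset of the question's data: three rows, five T columns
small <- tibble(
  MB     = c("4001826", "4007948", "4015215"),
  Margin = c(900, 15525, 4340),
  T1     = c(300, 38810, 870),
  T5     = c(900, 109829.852, 2670),
  T9     = c(1500, 180746.9056, 4400),
  T12    = c(1500, 138818.9056, 4330),
  T13    = c(1500, 140508.9056, 4330)
)

closest <- small %>%
  pivot_longer(T1:T13, names_to = "Closest_column", values_to = "val") %>%
  # Keep only the T values that do not exceed Margin
  filter(val <= Margin) %>%
  group_by(MB) %>%
  # which.max() returns the first maximum, so ties go to the earlier column
  summarise(Closest_column = Closest_column[which.max(val)],
            val = max(val))

# Join back so that an MB without any qualifying T column is kept with NA
result <- small %>%
  select(MB, Margin) %>%
  left_join(closest, by = "MB")

result$Closest_column  # "T5" NA "T12"
```

Taking the largest qualifying value is equivalent to taking the smallest difference from Margin, since every remaining val is at most Margin.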