Interleaving SAS data sets (by common patient ID)

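Problem description:

I need to interleave SAS data sets, but only for patient IDs that exist in both of them. In a MERGE statement I would use the IN= option and a subsetting IF, but here I need to stack the data rather than merge it. The data sets have the same variables.

Any ideas?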

Wouldn't an inner join (on ID) do the trick? – SMW

If the obs is FIRST.ID and IN data set 2, then data set 1 has no obs for that ID. –

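Answer

If you have duplicates per ID in one or both data sets, there are a number of other solutions. This is the one most similar to your MERGE idea.

In a double DoW loop, you loop through the data twice per ID: once to check your condition, and once to actually output. This lets you look at all rows for each ID, see whether your condition holds, and then read them again to act on that condition.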

data have_1;
    do id = 1 to 20 by 2;
        output;
        output;
    end;
run;

data have_2;
    do id = 1 to 20 by 3;
        output;
        output;
    end;
run;

data want;
    _a=0; *initialize temporary variables;
    _b=0; *they will be cleared once for each ID;
    do _n_ = 1 by 1 until (last.id);
        set have_1(in=a) have_2(in=b);
        by id;
        if a then _a=1; *save that value temporarily;
        if b then _b=1; *again temporary;
    end;
    do _n_ = 1 by 1 until (last.id);
        set have_1 have_2;
        by id;
        if _a and _b then output; *only output the rows that have both _a and _b;
    end;
run;
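As a quick sanity check on the double DoW step above, the sketch below counts the rows per ID in want. With the sample data, have_1 holds the odd IDs 1 to 19 and have_2 holds IDs 1, 4, 7, ..., 19, each twice, so only IDs 1, 7, 13 and 19 are common and each should appear with 4 rows (16 rows in total). The column alias n_rows is just an illustrative name.

```sas
proc sql;
    /* one line per ID that survived the filter, with its row count */
    select id, count(*) as n_rows
    from want
    group by id;
quit;
```

Note that _a and _b are ordinary data step variables, so they also end up in want unless you drop them, for example with a DROP statement.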

It works perfectly. Thank you! –

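Answer

This is a bit of a faff, but if the data sets have the same structure, you could try the code below. It assumes you are matching on the variable ID.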

proc sql;
    select t1.*
    from TABLE_A t1
    where ID in (select ID from TABLE_B)
    union all
    select t2.*
    from TABLE_B t2
    where ID in (select ID from TABLE_A)
    ;
quit;
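For readers less familiar with PROC SQL: TABLE_A and TABLE_B are placeholders for the two data sets, t1 and t2 are table aliases, and t1.* means "all columns of t1". The query as posted only prints its result. A minimal sketch that instead writes the result to a data set and sorts it by ID, using the have_1 and have_2 sample data from the first answer, might look like this. The output name want_sql is a hypothetical choice, and the sketch assumes both data sets have their columns in the same order, since UNION ALL lines columns up by position.

```sas
proc sql;
    /* want_sql is a hypothetical output table name */
    create table want_sql as
    select t1.*
    from have_1 t1
    where id in (select id from have_2)   /* keep only IDs also in have_2 */
    union all
    select t2.*
    from have_2 t2
    where id in (select id from have_1)   /* keep only IDs also in have_1 */
    order by id;
quit;
```

Multiple observations per ID are kept because UNION ALL does not remove duplicates. The order of rows within one ID is not guaranteed, so this is not a strict interleave in the data step sense.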

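Thanks. However, it doesn't really work, probably because of my SQL skills. Is TABLE_A the data set? What is t1.*? Maybe I should have mentioned that there are multiple observations per patient_ID. –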

If you could give a small example of your existing input and the output you want, then maybe someone can suggest a better answer :) –

Answer

If each ID has only one row in each of the two data sets, this is easy to do in a data step.

data have_1;
    do id = 1 to 20 by 2;
        output;
    end;
run;

data have_2;
    do id = 1 to 20 by 3;
        output;
    end;
run;

data want;
    set have_1 have_2;
    by id;
    if not (first.id and last.id);
run;

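Basically, you only output a row if it is not the first row or not the last row for its ID, which is true if and only if the ID is in both data sets. This does not work if an ID has more than one row in either data set.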
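To illustrate that limitation, here is a small sketch that applies the same filter to data with two rows per ID. The names dup_1, dup_2 and wrong are hypothetical and chosen so they do not overwrite the data sets above.

```sas
/* two rows per ID, same ID pattern as before */
data dup_1;
    do id = 1 to 20 by 2;
        output;
        output;
    end;
run;

data dup_2;
    do id = 1 to 20 by 3;
        output;
        output;
    end;
run;

data wrong;
    set dup_1 dup_2;
    by id;
    /* every BY group now has at least two rows, so no row is both
       first.id and last.id and nothing is filtered out */
    if not (first.id and last.id);
run;
```

Here wrong keeps all 34 input rows, including IDs such as 3 that exist only in dup_1, so with repeated IDs an approach like the double DoW loop is needed instead.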
