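How do I get the first non-null value from a column of values in BigQuery?
I want to extract the first non-null value from a column, ordered by timestamp, for each id. Can anyone share their thoughts? Thanks.
What have I tried so far?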
FIRST_VALUE(column) OVER (PARTITION BY id ORDER BY timestamp)
Input :-
id,column,timestamp
1,NULL,10:30 am
1,NULL,10:31 am
1,'xyz',10:32 am
1,'def',10:33 am
2,NULL,11:30 am
2,'abc',11:31 am
Output(expected) :-
1,'xyz',10:30 am
1,'xyz',10:31 am
1,'xyz',10:32 am
1,'xyz',10:33 am
2,'abc',11:30 am
2,'abc',11:31 am
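As far as I know, BigQuery has no options like 'IGNORE NULLS' or 'NULLS LAST'. Given that, this is the simplest solution I could come up with; I would be glad to see a simpler one. Assuming the input data is in a table "original_data":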
select w2.id, w1.column, w2.timestamp
from
(select id,column,timestamp
from
(select id,column,timestamp, row_number()
over (partition BY id ORDER BY timestamp) position
FROM original_data
where column is not null
)
where position=1
) w1
right outer join
original_data as w2
on w1.id = w2.id
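To check the join idea against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's `sqlite3` module. SQLite is only a stand-in for BigQuery here, and the column names `val` and `ts` are hypothetical replacements for `column` and `timestamp`. The right outer join is written as the equivalent left join from `original_data`, since older SQLite versions lack RIGHT JOIN.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE original_data (id INTEGER, val TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO original_data VALUES (?, ?, ?)",
    [
        (1, None, "10:30"),
        (1, None, "10:31"),
        (1, "xyz", "10:32"),
        (1, "def", "10:33"),
        (2, None, "11:30"),
        (2, "abc", "11:31"),
    ],
)

query = """
SELECT w2.id, w1.val, w2.ts
FROM original_data AS w2
LEFT JOIN (
    -- earliest non-null row per id
    SELECT id, val
    FROM (
        SELECT id, val,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY ts) AS position
        FROM original_data
        WHERE val IS NOT NULL
    )
    WHERE position = 1
) AS w1 ON w1.id = w2.id
ORDER BY w2.id, w2.ts
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every row of `original_data` is kept, and each one carries the earliest non-null value of its id. An id with no non-null value at all would get NULL.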
Quick update: using "IGNORE NULLS" is now supported: https://cloud.google.com/bigquery/docs/release-notes#november_2_2017 – Sourygna
Try this old string-manipulation trick:
Select
ID,
Column,
ttimestamp,
LTRIM(Right(CColumn,20)) as CColumn,
FROM
(SELECT
ID,
Column,
ttimestamp,
MIN(Concat(RPAD(IF(Column is null, '9999999999999999',STRING(ttimestamp)),20,'0'),LPAD(Column,20,' '))) OVER (Partition by ID) CColumn
FROM (
SELECT
*
FROM (Select 1 as ID, STRING(NULL) as Column, 0.4375 as ttimestamp),
(Select 1 as ID, STRING(NULL) as Column, 0.438194444444444 as ttimestamp),
(Select 1 as ID, 'xyz' as Column, 0.438888888888889 as ttimestamp),
(Select 1 as ID, 'def' as Column, 0.439583333333333 as ttimestamp),
(Select 2 as ID, STRING(NULL) as Column, 0.479166666666667 as ttimestamp),
(Select 2 as ID, 'abc' as Column, 0.479861111111111 as ttimestamp)
))
You can modify your SQL like this to get the data you want.
FIRST_VALUE(column)
OVER (
PARTITION BY id
ORDER BY
CASE WHEN column IS NULL then 0 ELSE 1 END DESC,
timestamp
)
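The ordering logic of this expression can be tried on the question's sample data. The sketch below uses an in-memory SQLite database via Python's `sqlite3` as a stand-in (an assumption, since a commenter reports that BigQuery rejected the CASE expression inside ORDER BY), so it demonstrates the idea rather than confirming BigQuery behaviour. The column names `val` and `ts` are hypothetical replacements for `column` and `timestamp`.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE original_data (id INTEGER, val TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO original_data VALUES (?, ?, ?)",
    [
        (1, None, "10:30"),
        (1, None, "10:31"),
        (1, "xyz", "10:32"),
        (1, "def", "10:33"),
        (2, None, "11:30"),
        (2, "abc", "11:31"),
    ],
)

query = """
SELECT id,
       FIRST_VALUE(val) OVER (
           PARTITION BY id
           -- non-null rows sort first, then by time within each group
           ORDER BY CASE WHEN val IS NULL THEN 0 ELSE 1 END DESC, ts
       ) AS first_val,
       ts
FROM original_data
ORDER BY id, ts
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the non-null rows are sorted to the front of each partition, the first row of the window is always the earliest non-null value, so FIRST_VALUE returns it for every row of that id.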
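MikeD, are you sure this query works? I am trying it and I get the error message: "In analytic expressions, ORDER BY must refer to a named column. Found CASE" – goRunToStack
SELECT id,
(SELECT TOP(1) column FROM test1 WHERE id = 1 AND column IS NOT NULL ORDER BY autoID DESC) AS name, timestamp FROM yourTable
Output :-
1,'xyz',10:30 am
1,'xyz',10:31 am
1,'xyz',10:32 am
1,'xyz',10:33 am
2,'abc',11:30 am
2,'abc',11:31 am
The original statement and your sample output seem inconsistent. It looks like you want to fill the NULL values with the first non-NULL value. –
No, I need the first non-null value as the output for every row at the column level. – Teja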