1) 多行转单列
user_id | login_date |
1001 | 2024-09-01 |
1001 | 2024-09-02 |
1002 | 2024-09-03 |
1003 | 2024-09-05 |
比较常见的
select user_id,concat_ws(',',collect_list(login_date)) login_date_set
from tbl
group by user_id
延伸下,集合中保证日期的有序
启动多个mr,顺序会有问题
1) 以前做项目中有用到,这是通用的解法
a. 按指定顺序排序,用rn拼下,做前缀
b. 用sort_array
c.在regexp_replace用替换,'\\d+\:'
2) 熟悉mr的原理,用第2种
DISTRIBUTE BY
确保相同 key 值的行到同一个reducer, SORT BY 内部重新排序
select user_id,concat_ws(',',collect_list(login_date)) login_date_set
from
( select
from
(
select user_id
,login_date
from tbl
order by login_date asc
)
distribute by user_id
sort by login_date asc
)
group by user_id
2) 多列转多行
chinese | math | english |
90 | 80 | 99 |
select subject
,score
from (
select str_to_map(concat('math',:,math,'chinese',':',chinese,'english',':',english)) as res
from tbl_detail
)t later view explode(t.res) tmp as subject,score
标签:09,及列,Hive,转列,user,date,login,id,select
From: https://blog.csdn.net/2402_86289011/article/details/142306517