首页 > 其他分享 >[Clickhouse] Clickhouse 函数 : 数据类型转换

[Clickhouse] Clickhouse 函数 : 数据类型转换

时间:2024-09-18 15:15:27浏览次数:17  
标签:类型转换 toDate 01 函数 10 -- 2021 array Clickhouse

0 引言

如无特殊说明,ck版本为 21.3.4.25

1 数据类型的支持情况

查看当前受支持的数据类型

select * from system.data_type_families 
-- select name,case_insensitive,alias_to from system.data_type_families; 

output

name                           |case_insensitive|alias_to   |
-------------------------------+----------------+-----------+
Polygon                        |               0|           |
Ring                           |               0|           |
MultiPolygon                   |               0|           |
IPv6                           |               0|           |
IntervalSecond                 |               0|           |
IPv4                           |               0|           |
UInt32                         |               0|           |
IntervalYear                   |               0|           |
IntervalQuarter                |               0|           |
IntervalMonth                  |               0|           |
Int64                          |               0|           |
IntervalDay                    |               0|           |
IntervalHour                   |               0|           |
UInt256                        |               0|           |
Int16                          |               0|           |
LowCardinality                 |               0|           |
AggregateFunction              |               0|           |
Nothing                        |               0|           |
Decimal256                     |               1|           |
Tuple                          |               0|           |
Array                          |               0|           |
Enum16                         |               0|           |
IntervalMinute                 |               0|           |
FixedString                    |               0|           |
String                         |               0|           |
DateTime                       |               1|           |
Map                            |               0|           |
UUID                           |               0|           |
Decimal64                      |               1|           |
Nullable                       |               0|           |
Enum                           |               0|           |
Int32                          |               0|           |
UInt8                          |               0|           |
Date                           |               1|           |
Decimal32                      |               1|           |
Point                          |               0|           |
Float64                        |               0|           |
DateTime64                     |               1|           |
Int128                         |               0|           |
Decimal128                     |               1|           |
Int8                           |               0|           |
SimpleAggregateFunction        |               0|           |
Nested                         |               0|           |
Decimal                        |               1|           |
Int256                         |               0|           |
IntervalWeek                   |               0|           |
UInt64                         |               0|           |
Enum8                          |               0|           |
DateTime32                     |               1|           |
UInt16                         |               0|           |
Float32                        |               0|           |
INET6                          |               1|IPv6       |
INET4                          |               1|IPv4       |
BINARY                         |               1|FixedString|
NATIONAL CHAR VARYING          |               1|String     |
BINARY VARYING                 |               1|String     |
NCHAR LARGE OBJECT             |               1|String     |
NATIONAL CHARACTER VARYING     |               1|String     |
NATIONAL CHARACTER LARGE OBJECT|               1|String     |
NATIONAL CHARACTER             |               1|String     |
NATIONAL CHAR                  |               1|String     |
CHARACTER VARYING              |               1|String     |
LONGBLOB                       |               1|String     |
MEDIUMTEXT                     |               1|String     |
TEXT                           |               1|String     |
TINYBLOB                       |               1|String     |
VARCHAR2                       |               1|String     |
CHARACTER LARGE OBJECT         |               1|String     |
DOUBLE PRECISION               |               1|Float64    |
LONGTEXT                       |               1|String     |
NVARCHAR                       |               1|String     |
INT1 UNSIGNED                  |               1|UInt8      |
VARCHAR                        |               1|String     |
CHAR VARYING                   |               1|String     |
MEDIUMBLOB                     |               1|String     |
NCHAR                          |               1|String     |
CHAR                           |               1|String     |
SMALLINT UNSIGNED              |               1|UInt16     |
TIMESTAMP                      |               1|DateTime   |
FIXED                          |               1|Decimal    |
TINYTEXT                       |               1|String     |
NUMERIC                        |               1|Decimal    |
DEC                            |               1|Decimal    |
TINYINT UNSIGNED               |               1|UInt8      |
INTEGER UNSIGNED               |               1|UInt32     |
INT UNSIGNED                   |               1|UInt32     |
CLOB                           |               1|String     |
MEDIUMINT UNSIGNED             |               1|UInt32     |
BOOL                           |               1|Int8       |
SMALLINT                       |               1|Int16      |
INTEGER SIGNED                 |               1|Int32      |
NCHAR VARYING                  |               1|String     |
INT SIGNED                     |               1|Int32      |
TINYINT SIGNED                 |               1|Int8       |
BIGINT SIGNED                  |               1|Int64      |
BINARY LARGE OBJECT            |               1|String     |
SMALLINT SIGNED                |               1|Int16      |
MEDIUMINT                      |               1|Int32      |
INTEGER                        |               1|Int32      |
INT1 SIGNED                    |               1|Int8       |
BIGINT UNSIGNED                |               1|UInt64     |
BYTEA                          |               1|String     |
INT                            |               1|Int32      |
SINGLE                         |               1|Float32    |
FLOAT                          |               1|Float32    |
MEDIUMINT SIGNED               |               1|Int32      |
BOOLEAN                        |               1|Int8       |
DOUBLE                         |               1|Float64    |
INT1                           |               1|Int8       |
CHAR LARGE OBJECT              |               1|String     |
TINYINT                        |               1|Int8       |
BIGINT                         |               1|Int64      |
CHARACTER                      |               1|String     |
BYTE                           |               1|Int8       |
BLOB                           |               1|String     |
REAL                           |               1|Float32    |

查看指定字段的数据类型

select toTypeName(now())

ClickHouse 可以在数据表中存储多种数据类型。
Clickhouse 所有的数据类型存储在system.data_type_families表中,可以检查数据类型名称是否区分大小写

2 日期时间类型

Date32 类型

toDate(...) 函数

select  
    -- toDate('20240506') -- 将报错 : DB::Exception: Cannot parse date: value is too short: Cannot parse Date from String: While processing toDate('20240506') (version 21.3.4.25)
	toDate('2022-10-22') as date0 -- 2022-10-22 
	
	, toTypeName(toDate('2022-10-22')) -- 'Date'
	
	, toDate('2022-10-22 10:27:00') as date1 -- 2022-10-22 
	, toDateTime64(now(), 3, 'Asia/Shanghai') as date2 -- 2022-10-22 10:29:38.000 +0800 
	, toDate( toDateTime64(now(), 3, 'Asia/Shanghai') ) as date3 -- 2022-10-22 

	-- , toUInt16(toDate('1970-01-01')) -- 0 
	-- , toUInt16(toDate('2022-10-22')) -- 0 
	, toDate(19287) -- 2022-10-22 

DateTime 类型

数据类型 是否大小写敏感 别名 取值范围 最小单位 备注说明
DateTime [1970-01-01 00:00:00, 2106-02-07 06:28:15] 秒(s)

综合案例

查询指定月份的天数

select toDayOfMonth(subtractDays(addMonths(toStartOfMonth(toDate('2022-10-25')), 1), 1))
-- 31
  • 补充: MYSQL 实现
select DAYOFMONTH(last_day('2022-10-10'))
-- 31

统计每月每天都满勤的设备数(人员数)

-- 外层查询
SELECT
	month -- 月份
	, count(deviceId) as activeDevices -- 每个月内每天都活跃的设备
FROM (
	-- 内层查询

	select
		*
	FROM (
		SELECT
			deviceId 
			-- groupArray(deviceId) as deviceIdArray
			--, groupArray(`month`) as month
			,  replace(month, '-', '')  as month
			-- , toDate( concat(month, '-01') ) as xx
			,  toDayOfMonth(subtractDays(addMonths(toStartOfMonth( toDate( concat(month, '-01') ) ), 1), 1) )  as totalMonthDays -- 当月总天数
			-- , groupArray(month) as monthArray
			--, groupArray(totalMonthDays) as totalMonthDaysArray
			, groupArray(`date`) as dates
			, length(dates) as realMonthDays -- 当月的实际活跃天数 (也可理解为:当月的实际上班天数)
		FROM (
			SELECT
				deviceId
				, formatDateTime( create_time , '%Y-%m' , 'Asia/Shanghai') as `month` -- 'yyyyMM' / '%Y-%m-%d %H:%M:%S'
				--, toDayOfMonth(subtractDays(addMonths(toStartOfMonth( create_time ), 1), 1) ) as totalMonthDays -- 当月总天数
				, toDate(create_time, 'Asia/Shanghai' ) as `date` -- "yyyy-MM-dd"
			FROM bdp_dwd.dwd_vehicle_status_record_ri_d -- MergeTree
			where
				1 = 1
				-- and create_time < toDateTime(toUInt64({{endTime}})/1000 ,'{{timeZone}}') -- [必填参数] 13位时间戳
				-- and create_time <= toDateTime64('{{endTime}}', 3 ,'{{timeZone}}') -- [必填参数] 13位时间戳
			group by
				deviceId
				, month
				-- , monthTotalDays
				, `date`
			order by deviceId asc, `date` asc
	
		)
		group by 
			deviceId 
			, month 
			-- , monthTotalDays
		order by deviceId
	)
	where 1 = 1
		and realMonthDays = totalMonthDays -- 当月实际活跃天数 = 当月总天数

	
	union all 
	
	select 'XXGERGREY001' as deviceId ,'202408' as month , toUInt8(31) as totalMonthDays , [] as dates , 31 as realMonthDays
	
	union all
	
	select 'XXGERGREY002' as deviceId ,'202408' as month , toUInt8(31) as totalMonthDays , [] as dates , 31 as realMonthDays

)
group by 
	month
order by month desc 	
  • 内层查询结果
XXGERGREY002	202408	31	[]	31
XXGERGREY001	202408	31	[]	31
XXGERGREY003	202401	31	['2024-01-01','2024-01-02','2024-01-03','2024-01-04','2024-01-05','2024-01-06','2024-01-07','2024-01-08','2024-01-09','2024-01-10','2024-01-11','2024-01-12','2024-01-13','2024-01-14','2024-01-15','2024-01-16','2024-01-17','2024-01-18','2024-01-19','2024-01-20','2024-01-21','2024-01-22','2024-01-23','2024-01-24','2024-01-25','2024-01-26','2024-01-27','2024-01-28','2024-01-29','2024-01-30','2024-01-31']	31
XXGERGREY003	202309	30	['2023-09-01','2023-09-02','2023-09-03','2023-09-04','2023-09-05','2023-09-06','2023-09-07','2023-09-08','2023-09-09','2023-09-10','2023-09-11','2023-09-12','2023-09-13','2023-09-14','2023-09-15','2023-09-16','2023-09-17','2023-09-18','2023-09-19','2023-09-20','2023-09-21','2023-09-22','2023-09-23','2023-09-24','2023-09-25','2023-09-26','2023-09-27','2023-09-28','2023-09-29','2023-09-30']	30
  • 外层统计结果
month / activeDevices
202408	3
202402	1
202401	1
202310	1
202309	1
202308	1

3 Array 类型

Array 类型

生成 Array

方式0:创建1个数组 :array()、[]

> SELECT array(1, 2) AS x, toTypeName(x) 

┌─x─────┬─toTypeName(array(1, 2))┐ 

│        [1,2] │ Array(UInt8)            │ 

└───────┴───────────┘ 

 

> SELECT array(1, 2, NULL) AS x, toTypeName(x) 

┌─x──────────┬─toTypeName(array(1, 2, NULL))─┐ 

│ [1,2,NULL] │ Array(Nullable(UInt8))        │ 

└────────────┴──────────────┘ 

 

> SELECT [1, 2] AS x, toTypeName(x) 

┌─x─────┬─toTypeName([1, 2])─┐ 

│ [1,2] │ Array(UInt8)       │ 

└───────┴─────────┘ 

注意事项:不允许1个数组内的数据类型不同。

> SELECT array(1, 'a') 

Received exception from server (version 1.1.54388): 

Code: 386. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception: There is no supertype for types UInt8, String because some of them are String/FixedString and some of them are not. 

方式1:范围内生成Array:range

select  
	range( 
		toUInt32(toDate('2021-01-01')),  toUInt32(toDate('2021-01-10')) + 1 
	)

-- out
range(toUInt32(toDate('2021-01-01')), plus(toUInt32(toDate('2021-01-10')), 1))|
------------------------------------------------------------------------------+
[18628,18629,18630,18631,18632,18633,18634,18635,18636,18637]                 |

方式3:元素映射与转换:arrayMap

select  
	arrayMap( 
		x -> toDate(x),  -- toUInt32 转 toDate
		range( toUInt32(toDate('2021-01-01')),toUInt32(toDate('2021-01-10')) + 1 ) 
	) 
arrayMap(lambda(tuple(x), toDate(x)), range(toUInt32(toDate('2021-01-01')), plus(toUInt32(toDate('2021-01-10')), 1)))              |
-----------------------------------------------------------------------------------------------------------------------------------+
['2021-01-01','2021-01-02','2021-01-03','2021-01-04','2021-01-05','2021-01-06','2021-01-07','2021-01-08','2021-01-09','2021-01-10']|

获取指定下标的数组元素:arrayElement(array, index)

  • demo1
select  
	arrayElement(['a', 'b', 'c'], 1) as a1 -- 正常下标范围 
	, arrayElement(['a', 'b', 'c'], -1) as a2 -- 负数: 反向 
	, arrayElement(['a', 'b', 'c'], -3) as a3 -- 负数: 反向 
	, arrayElement(['a', 'b', 'c'], 5) as a4 -- 下标超界: <空> 

out

a1|a2|a3|a4|
--+--+--+--+
a |c |a |  |
  • demo2
select  
	toDate('2022-11-03') as `date` 
	, [toDate('2022-10-27'), toDate('2022-10-28'),toDate('2022-10-29'),toDate('2022-10-31'),toDate('2022-11-01'),toDate('2022-11-02'),toDate('2022-11-03')] as datesA 
	, length(datesA) as `length` 
	, arrayElement(datesA, -7) as `date2` -- 负数:反向数(从1开始) 
	, `date` - INTERVAL 6 day `date3` 
	, `date2` = `date3` as resultA -- 1: true / 0: false 

	, [toDate('2022-10-28'), toDate('2022-10-29'),toDate('2022-10-30'),toDate('2022-10-31'),toDate('2022-11-01'),toDate('2022-11-02'),toDate('2022-11-03')] as datesB 
	, arrayElement(datesB, -7) as `date4` 
	, `date4` = `date3` as resultB 

out

date      |datesA                                                                                      |length|date2     |date3     |resultA|datesB                                                                                      |date4     |resultB|
----------+--------------------------------------------------------------------------------------------+------+----------+----------+-------+--------------------------------------------------------------------------------------------+----------+-------+
2022-11-03|['2022-10-27','2022-10-28','2022-10-29','2022-10-31','2022-11-01','2022-11-02','2022-11-03']|     7|2022-10-27|2022-10-28|      0|['2022-10-28','2022-10-29','2022-10-30','2022-10-31','2022-11-01','2022-11-02','2022-11-03']|2022-10-28|      1|

获取指定元素的下标:indexOf(array, element)

select  
	indexOf(['a', 'b', 'c'], 'a') -- 正常数据类型 [命中] 
	,indexOf(['a', 'b', 'c'], 'd') -- 正常数据类型 [未命中] 0 
	-- , indexOf(['a', 'b', 'c'], 1) -- 异常数据类型: 报错 
indexOf(['a', 'b', 'c'], 'a')|indexOf(['a', 'b', 'c'], 'd')|
-----------------------------+-----------------------------+
                            1|                            0|

获得数组元素个数/长度:arrayCount | length

  • 方式1 arrayCount(array)
select arrayCount([1,2,3,4,5]) -- arrayCount(Array(UInt8)) : arrayCount 函数 仅支持 元素为 UInt8 的数组 

-- 5 

 

select arrayCount( 
	lambda( 
		tuple(x), 
		x  
	)
	, [1, 2, 3, 4] 
) 

-- 4 


-- 也可自己定义列名: SELECT arrayCount(x -> x, [1, 2, 3, 4]) AS arrayLength; 
  • 方式2 length(array)
select length(['1', '2', '3', '4', '9']) 

-- 5 

数组元素值求和:arraySum(array)

select  
	arraySum([1,2,3]) -- 6 
	, arraySum([1.1, 2.4, 3.6]) -- 7.1 
	-- , arraySum(['1','2','3']) -- 不支持 非数值型的元素(将报错) 
	-- , arraySum([toDate(1664553600), toDate(1665158399)]) -- -- 不支持 非数值型的元素(将报错) 

WHERE 子句中按条件过滤 Array 字段

select  
	scores, name 
from (
	select [2, 3, 5] as scores, 'johnny' as name 
	union all 
	select [2, 7, 9] as scores, 'jane' as name 
) x 
where 1 = 1 
    and arrayExists(x -> x > 2, scores)

遍历元素并转为下标:arrayEnumerate(array)

  • 函数说明 : arrayEnumerate(array)
  • 作用:类比Oracle、Hive的开窗函数row_number()
  • 用途:Query SQL 中 生成排序序号、组内序号等
  • 关联函数:arrayEnumerateDense、arrayEnumerateUniq
  • 函数 arrayEnumerate(arr) : 类比开窗函数: ROW_NUMBER()
  • return: array(1,2,3,..., length(arr))
  • 该函数通常跟ARRAY JOIN关键字一起试用,在应用ARRAY JOIN后为每个数组进行计算一次
  • 函数 arrayEnumerateDense(arr) : 类比开窗函数: DENSE_RANK()
  • 函数 arrayEnumerateUniq(arr)
  • return: 与源数组大小相同的数组,其中每个元素表示与其下标对应的源数组元素在源数组中出现的次数。
  • 例如:arrayEnumerateUniq( [10,20,10,30 ])= [1,1,2,1 ]
  • demo1
select
	range( 
		toUInt32(toDate('2021-01-01')),toUInt32(toDate('2021-01-10')) + 1 
	) as arr1
	, arrayEnumerate( arr1 ) as arr2

out

arr1                                                         |arr2                  |
-------------------------------------------------------------+----------------------+
[18628,18629,18630,18631,18632,18633,18634,18635,18636,18637]|[1,2,3,4,5,6,7,8,9,10]|
  • demo2
SELECT  
    [ '2020-05-01','2020-05-02','2020-05-03', '2020-05-01','2020-05-01', '2020-05-02' ] as time, 
    arrayEnumerate(time) as row_number, 
    arrayEnumerateDense(time) as dense_rank, 
    arrayEnumerateUniq(time) as uniq_rank;  -- 每个数元素出现的次数 

-- 响应结果: 
  time 	/ 	row_number / 	dense_rank / 	uniq_rank 
  ['2020-05-01','2020-05-02','2020-05-03','2020-05-01','2020-05-01','2020-05-02']	[1,2,3,4,5,6]	[1,2,3,1,1,2]	[1,1,1,2,3,2] 
  • demo3 实现开窗函数的效果

实现开窗函数row_number()的效果

SELECT  
	problemLevel 
	-- ,arrayJoin_array_ecuComponent 
	-- ,arrayJoin_array_faultCount 
	
	, array_ecuComponent
	, array_faultCount
	, arrayEnumerate(array_faultCount)
	
	, arrayEnumerate_array_faultCount 
FROM ( 
    SELECT 'A' as problemLevel , ['BLE', 'VCU'] as array_ecuComponent , [2, 9] as array_faultCount 
    union all 
    SELECT 'B' as problemLevel , ['ABM','AC'] as array_ecuComponent , [6, 5] as array_faultCount 
) 
ARRAY JOIN 
    arrayEnumerate(array_faultCount) as arrayEnumerate_array_faultCount -- 类似于 开窗函数: row_number() 
;

out

problemLevel|array_ecuComponent|array_faultCount|arrayEnumerate(array_faultCount)|arrayEnumerate_array_faultCount|
------------+------------------+----------------+--------------------------------+-------------------------------+
B           |['ABM','AC']      |[6,5]           |[1,2]                           |                              1|
B           |['ABM','AC']      |[6,5]           |[1,2]                           |                              2|
A           |['BLE','VCU']     |[2,9]           |[1,2]                           |                              1|
A           |['BLE','VCU']     |[2,9]           |[1,2]                           |                              2|
  • demo4 M个数组(每个数组含N个元素)个数组转为N行M列记录+不重复的排序序号(row_number())
SELECT  
	problemLevel 
	,arrayJoin_array_ecuComponent 
	,arrayJoin_array_faultCount 
	,arrayEnumerate_array_faultCount 
FROM ( 
    SELECT 'A' as problemLevel , ['BLE', 'VCU'] as array_ecuComponent , [2, 9] as array_faultCount 
    union all 
    SELECT 'B' as problemLevel , ['ABM','AC'] as array_ecuComponent , [6, 5] as array_faultCount 
) 
ARRAY JOIN 
    array_ecuComponent as arrayJoin_array_ecuComponent -- 从 array 恢复回: ecuComponent 
    ,array_faultCount as arrayJoin_array_faultCount --  从 array 恢复回: faultCount 
    ,arrayEnumerate(array_faultCount) as arrayEnumerate_array_faultCount -- 类似于 开窗函数: row_number() 
; 

out

problemLevel|arrayJoin_array_ecuComponent|arrayJoin_array_faultCount|arrayEnumerate_array_faultCount|
------------+----------------------------+--------------------------+-------------------------------+
B           |ABM                         |                         6|                              1|
B           |AC                          |                         5|                              2|
A           |BLE                         |                         2|                              1|
A           |VCU                         |                         9|                              2|

数组元素过滤:arrayFilter(lambdaExpression, array)

  • 语法格式
arrayFilter(func, arr1, …) 

func是一个lambda表达式,通过作用于数组中每一个元素后,只留下结果为非0的部分。

  • demo
SELECT 
	arrayFilter( 
		x -> x LIKE '%World%'  
		, ['Hello', 'abc World'] 
	) AS res 


┌─res─────────——————————————──┐ 
│ ['abc World']               │ 
└─————————————————────────────┘ 


SELECT 
    arrayFilter( 
        (i, x) -> (x LIKE '%World%'), 
        arrayEnumerate(arr), 
        ['Hello', 'abc World'] AS arr
	) AS res 

┌─res─——┐ 
│ [2]   │ 
└───————┘ 
  • demo 过滤获取大于6的数值
SELECT  
    arrayFilter(x -> (x > 6), [2, 5, 7, 8, 9]) as arr1
	, arrayJoin( arr1 ) AS arrayFilter 

arr1   |arrayFilter|
-------+-----------+
[7,8,9]|          7|
[7,8,9]|          8|
[7,8,9]|          9|

数组 JOIN : array join 、 arrayJoin(...)

  • demo1
with cityRankInfo as (
	select * from (
		select 'hubei' as province , ['wuhan','xiangyang'] as cityArray, [1,2] as rankArray
		union all
		select 'guangdong' as province , ['guangzhou','shenzhen','zhuhai'] as cityArray, [1,2,3] as rankArray
		union all
		select 'beijing' as province , [] as cityArray, [10] as rankArray
		union all
		select 'shanghai' as province , [] as cityArray, [20] as rankArray
		union all
		select 'hongkong' as province , [] as cityArray, [] as rankArray
	)
)
-- select * from cityRankInfo

SELECT 
	province
	, cityArray
	, city 
FROM cityRankInfo 
ARRAY JOIN cityArray as city;

out

province |cityArray                        |city     |
---------+---------------------------------+---------+
hubei    |['wuhan','xiangyang']            |wuhan    |
hubei    |['wuhan','xiangyang']            |xiangyang|
guangdong|['guangzhou','shenzhen','zhuhai']|guangzhou|
guangdong|['guangzhou','shenzhen','zhuhai']|shenzhen |
guangdong|['guangzhou','shenzhen','zhuhai']|zhuhai   |
  • demo2 arrayJoin 函数
select 
    range( toUInt32(toDate('2021-01-02')) , toUInt32( toDate('2021-01-10') ) + 1 ) as arr1
	, arrayJoin( arr1	) as arr2


-- out:
arr1                                                   |arr2 |
-------------------------------------------------------+-----+
[18629,18630,18631,18632,18633,18634,18635,18636,18637]|18629|
[18629,18630,18631,18632,18633,18634,18635,18636,18637]|18630|
[18629,18630,18631,18632,18633,18634,18635,18636,18637]|18631|
[18629,18630,18631,18632,18633,18634,18635,18636,18637]|18632|
[18629,18630,18631,18632,18633,18634,18635,18636,18637]|18633|
[18629,18630,18631,18632,18633,18634,18635,18636,18637]|18634|
[18629,18630,18631,18632,18633,18634,18635,18636,18637]|18635|
[18629,18630,18631,18632,18633,18634,18635,18636,18637]|18636|
[18629,18630,18631,18632,18633,18634,18635,18636,18637]|18637|
  • demo3
select 
	range( toUInt32(toDate('2021-01-01')),toUInt32(toDate('2021-01-05')) + 1 ) as arr1
	, arrayMap( x -> toDate(x), arr1 ) as arr2 
	, arrayJoin( arr2	) as day1;


-- out:
arr1                           |arr2                                                              |day1      |
-------------------------------+------------------------------------------------------------------+----------+
[18628,18629,18630,18631,18632]|['2021-01-01','2021-01-02','2021-01-03','2021-01-04','2021-01-05']|2021-01-01|
[18628,18629,18630,18631,18632]|['2021-01-01','2021-01-02','2021-01-03','2021-01-04','2021-01-05']|2021-01-02|
[18628,18629,18630,18631,18632]|['2021-01-01','2021-01-02','2021-01-03','2021-01-04','2021-01-05']|2021-01-03|
[18628,18629,18630,18631,18632]|['2021-01-01','2021-01-02','2021-01-03','2021-01-04','2021-01-05']|2021-01-04|
[18628,18629,18630,18631,18632]|['2021-01-01','2021-01-02','2021-01-03','2021-01-04','2021-01-05']|2021-01-05|

分组聚合/分组的数组 :groupArray(columnName)

  • demo1 全部元素合为一组
select  
	groupArray(user_id) as tags  
from ( 
	select 1 as user_id, 'A' as tag 
	union all 
	select 2 as user_id, 'B' as tag 
	union all 
	select 3 as user_id, 'A' as tag 
	union all 
	select 4 as user_id, 'B' as tag 
	union all 
	select 5 as user_id, 'C' as tag 
) user_tag 

out

tags       |
-----------+
[1, 2, 3, 4, 5]|
  • demo2 按照指定维度(tag)对指定字段(userId)分组
select  
	tag
	, groupArray(user_id) as tags  
from ( 
	select 1 as user_id, 'A' as tag 
	union all 
	select 2 as user_id, 'B' as tag 
	union all 
	select 3 as user_id, 'A' as tag 
	union all 
	select 4 as user_id, 'B' as tag 
	union all 
	select 5 as user_id, 'C' as tag 
) user_tag
group by tag

out

tag|tags |
---+-----+
B  |[2,4]|
C  |[5]  |
A  |[1,3]|
  • demo3 N行M列记录 转为 M个数组(每个数组含N个元素)

按照 problemLevel 对 ecuComponent 和 faultCount 分组聚合

SELECT  
	problemLevel 
	,groupArray(ecuComponent) as array_ecuComponent 
	,groupArray(faultCount) as array_faultCount 
FROM ( 
	( SELECT 'ABM' as ecuComponent, 6 AS faultCount, 'A' as problemLevel ) 
	UNION ALL 
	( SELECT 'BLE' as ecuComponent, 2 AS faultCount, 'B' as problemLevel ) 
	UNION ALL 
	( SELECT 'AC' as ecuComponent, 5 AS faultCount, 'A' as problemLevel ) 
	UNION ALL 
	( SELECT 'VCU' as ecuComponent, 9 AS faultCount, 'B' as problemLevel) 
) 
group by problemLevel; 


-- out:
problemLevel|array_ecuComponent|array_faultCount|
------------+------------------+----------------+
B           |['BLE','VCU']     |[2,9]           |
A           |['ABM','AC']      |[6,5]           |

分组去重的数组:groupUniqArray(columnName)

SELECT  
	problemLevel 
	,groupUniqArray(ecuComponent) as array_ecuComponent -- 组内元素去重的数组 
	,groupUniqArray(faultCount) as array_faultCount -- 组内元素去重的数组 
FROM ( 
	( SELECT 'ABM' as ecuComponent, 6 AS faultCount, 'A' as problemLevel ) 
	UNION ALL 
	( SELECT 'BLE' as ecuComponent, 2 AS faultCount, 'B' as problemLevel ) 
	UNION ALL 
	( SELECT 'AC' as ecuComponent, 5 AS faultCount, 'A' as problemLevel ) 
	UNION ALL 
	( SELECT 'VCU' as ecuComponent, 9 AS faultCount, 'B' as problemLevel) 
	UNION ALL 
	( SELECT 'VCU' as ecuComponent, 7 AS faultCount, 'B' as problemLevel) 
) 
group by problemLevel; 

out

problemLevel|array_ecuComponent|array_faultCount|
------------+------------------+----------------+
B           |['BLE','VCU']     |[9,2,7]         |
A           |['AC','ABM']      |[6,5]           |

数组元素去重:arrayDistinct(columnName)

截取部分元素:arraySlice

  • 语法
arraySlice(array, offset[, length]) 
  • 作用:返回一个子数组,包含从指定位置的指定长度的元素。

  • 参数

  • array – 数组。
  • offset – 数组的偏移。正值表示左侧的偏移量,负值表示右侧的缩进值。数组下标从1开始。
  • length - 子数组的长度。如果指定负值,则该函数返回[offset,array_length - length]。如果省略该值,则该函数返回[offset,the_end_of_array]。
  • demo
select 
    [9,2,3,7,5,6]            as arr
	, arraySlice(arr , 1, 3) as a   -- [9, 2, 3] 
	, arraySlice(arr , 0, 3) as b   -- [ ] (说明: 下标offset从1开始) 


-- out:
arr          |a      |b |
-------------+-------+--+
[9,2,3,7,5,6]|[9,2,3]|[]|

标签:类型转换,toDate,01,函数,10,--,2021,array,Clickhouse
From: https://www.cnblogs.com/johnnyzen/p/18418100

相关文章

  • Q:LISTAGG()函数用法笔记(oracle)
    .LISTAGG()函数作为普通函数使用时就是查询出来的结果列转为行☆LISTAGG函数既是分析函数,也是聚合函数有两种用法:1、分析函数,如:row_number()、rank()、dense_rank()等,用法相似listagg(合并字段,连接符)withingroup(orderby合并的字段的排序)over(partitionby分组字段)......
  • 鹏哥C语言42---函数调用相关练习
    #define_CRT_SECURE_NO_WARNINGS#include<stdio.h>//------------------------------------打印1000-2000年之间的闰年---------------------------------------------------//闰年的判断规则有两个//1.能被4整除,但是不能被100整除//2.能被400整除也是闰年/*intmain(......
  • Redis 字典的哈希函数和 rehash 操作详解
    Redis字典的哈希函数和rehash操作详解在Redis中,字典(HashTable)是一种重要的数据结构,用于存储键值对。下面解释Redis字典的哈希函数和rehash操作。一、哈希函数作用Redis的字典使用哈希函数将键转换为一个整数索引,这个索引用于确定键值对在哈希表中的存储位......
  • 56.【C语言】字符函数和字符串函数(strtok函数)(未完)
    目录12.strtok函数(较复杂)*简单使用总结:*优化12.strtok函数(较复杂)*简单使用strtok:stringintotokenscplusplus的介绍点我跳转翻译:函数strtokchar*strtok(char*str,constchar*delimiters);总结:delimiters参数指向一个字符串,定义了用......
  • C++信奥老师解一本通题 1164:digit函数
    ​【题目描述】在程序中定义一函数digit(n,k),它能分离出整数n从右边数第k个数字。【输入】正整数n和k。【输出】一个数字。【输入样例】318593【输出样例】8#include<iostream>usingnamespacestd;intdigit(longlongn,intk){ if(k==1) returnn%10......
  • ClickHouse-Kafka Engine 正确的使用方式
    Kafka是大数据领域非常流行的一款分布式消息中间件,是实时计算中必不可少的一环,同时一款OLAP系统能否对接Kafka也算是考量是否具备流批一体的衡量指标之一。ClickHouse的Kafka表引擎能够直接与Kafka系统对接,进而订阅Kafka中的Topic并实时接受消息数据。众所周......