文章目录
- Flink 系列文章
- 一、maven依赖
- 二、示例:时态表的join(scala版本)
- 1)、统计需求对应的SQL
- 2)、Without connnector 实现代码
- 3)、With CSVConnector 实现代码
本文给以scala的语言给出来Table API 针对时态表的join操作。
本文除了maven依赖外,没有其他依赖。
本文需要有kafka的运行环境。
本文更详细的内容可参考文章:
17、Flink 之Table API: Table API 支持的操作(1)17、Flink 之Table API: Table API 支持的操作(2)
本文maven依赖参考文章:【flink番外篇】9、Flink Table API 支持的操作示例(1)-通过Table API和SQL创建表 中的依赖,为节省篇幅不再赘述。
二、示例:时态表的join(scala版本)
该示例来源于:https://developer.aliyun.com/article/679659
假设有一张订单表Orders和一张汇率表Rates,那么订单来自于不同的地区,所以支付的币种各不一样,那么假设需要统计每个订单在下单时候Yen币种对应的金额。
1)、统计需求对应的SQL
SELECT o.currency, o.amount, r.rate
o.amount * r.rate AS yen_amount
FROM
Orders AS o,
LATERAL TABLE (Rates(o.rowtime)) AS r
WHERE r.currency = o.currency
2)、Without connnector 实现代码
object TemporalTableJoinTest {
def main(args: Array[String]): Unit = {
val env = StreamExecutionEnvironment.getExecutionEnvironment
val tEnv = TableEnvironment.getTableEnvironment(env)
env.setParallelism(1)
// 设置时间类型是 event-time env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
// 构造订单数据
val ordersData = new mutable.MutableList[(Long, String, Timestamp)]
ordersData.+=((2L, "Euro", new Timestamp(2L)))
ordersData.+=((1L, "US Dollar", new Timestamp(3L)))
ordersData.+=((50L, "Yen", new Timestamp(4L)))
ordersData.+=((3L, "Euro", new Timestamp(5L)))
//构造汇率数据
val ratesHistoryData = new mutable.MutableList[(String, Long, Timestamp)]
ratesHistoryData.+=(("US Dollar", 102L, new Timestamp(1L)))
ratesHistoryData.+=(("Euro", 114L, new Timestamp(1L)))
ratesHistoryData.+=(("Yen", 1L, new Timestamp(1L)))
ratesHistoryData.+=(("Euro", 116L, new Timestamp(5L)))
ratesHistoryData.+=(("Euro", 119L, new Timestamp(7L)))
// 进行订单表 event-time 的提取
val orders = env
.fromCollection(ordersData)
.assignTimestampsAndWatermarks(new OrderTimestampExtractor[Long, String]())
.toTable(tEnv, 'amount, 'currency, 'rowtime.rowtime)
// 进行汇率表 event-time 的提取
val ratesHistory = env
.fromCollection(ratesHistoryData)
.assignTimestampsAndWatermarks(new OrderTimestampExtractor[String, Long]())
.toTable(tEnv, 'currency, 'rate, 'rowtime.rowtime)
// 注册订单表和汇率表
tEnv.registerTable("Orders", orders)
tEnv.registerTable("RatesHistory", ratesHistory)
val tab = tEnv.scan("RatesHistory");
// 创建TemporalTableFunction
val temporalTableFunction = tab.createTemporalTableFunction('rowtime, 'currency)
//注册TemporalTableFunction
tEnv.registerFunction("Rates",temporalTableFunction)
val SQLQuery =
"""
|SELECT o.currency, o.amount, r.rate,
| o.amount * r.rate AS yen_amount
|FROM
| Orders AS o,
| LATERAL TABLE (Rates(o.rowtime)) AS r
|WHERE r.currency = o.currency
|""".stripMargin
tEnv.registerTable("TemporalJoinResult", tEnv.SQLQuery(SQLQuery))
val result = tEnv.scan("TemporalJoinResult").toAppendStream[Row]
// 打印查询结果
result.print()
env.execute()
}
}
- OrderTimestampExtractor 实现如下
import java.SQL.Timestamp
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor
import org.apache.flink.streaming.api.windowing.time.Time
class OrderTimestampExtractor[T1, T2]
extends BoundedOutOfOrdernessTimestampExtractor[(T1, T2, Timestamp)](Time.seconds(10)) {
override def extractTimestamp(element: (T1, T2, Timestamp)): Long = {
element._3.getTime
}
}
3)、With CSVConnector 实现代码
在实际的生产开发中,都需要实际的Connector的定义,下面我们以CSV格式的Connector定义来开发Temporal Table JOIN Demo。
1、genEventRatesHistorySource
def genEventRatesHistorySource: CsvTableSource = {
val csvRecords = Seq(
"ts#currency#rate",
"1#US Dollar#102",
"1#Euro#114",
"1#Yen#1",
"3#Euro#116",
"5#Euro#119",
"7#Pounds#108"
)
// 测试数据写入临时文件
val tempFilePath =
FileUtils.writeToTempFile(csvRecords.mkString(CommonUtils.line), "csv_source_rate", "tmp")
// 创建Source connector
new CsvTableSource(
tempFilePath,
Array("ts","currency","rate"),
Array(
Types.LONG,Types.STRING,Types.LONG
),
fieldDelim = "#",
rowDelim = CommonUtils.line,
ignoreFirstLine = true,
ignoreComments = "%"
)
}
2、genRatesOrderSource
def genRatesOrderSource: CsvTableSource = {
val csvRecords = Seq(
"ts#currency#amount",
"2#Euro#10",
"4#Euro#10"
)
// 测试数据写入临时文件
val tempFilePath =
FileUtils.writeToTempFile(csvRecords.mkString(CommonUtils.line), "csv_source_order", "tmp")
// 创建Source connector
new CsvTableSource(
tempFilePath,
Array("ts","currency", "amount"),
Array(
Types.LONG,Types.STRING,Types.LONG
),
fieldDelim = "#",
rowDelim = CommonUtils.line,
ignoreFirstLine = true,
ignoreComments = "%"
)
}
3、主程序
import java.io.File
import org.apache.flink.api.common.typeinfo.{TypeInformation, Types}
import org.apache.flink.book.utils.{CommonUtils, FileUtils}
import org.apache.flink.table.sinks.{CsvTableSink, TableSink}
import org.apache.flink.table.sources.CsvTableSource
import org.apache.flink.types.Row
object CsvTableSourceUtils {
def genWordCountSource: CsvTableSource = {
val csvRecords = Seq(
"words",
"Hello Flink",
"Hi, Apache Flink",
"Apache FlinkBook"
)
// 测试数据写入临时文件
val tempFilePath =
FileUtils.writeToTempFile(csvRecords.mkString("$"), "csv_source_", "tmp")
// 创建Source connector
new CsvTableSource(
tempFilePath,
Array("words"),
Array(
Types.STRING
),
fieldDelim = "#",
rowDelim = "$",
ignoreFirstLine = true,
ignoreComments = "%"
)
}
def genRatesHistorySource: CsvTableSource = {
val csvRecords = Seq(
"rowtime ,currency ,rate",
"09:00:00 ,US Dollar , 102",
"09:00:00 ,Euro , 114",
"09:00:00 ,Yen , 1",
"10:45:00 ,Euro , 116",
"11:15:00 ,Euro , 119",
"11:49:00 ,Pounds , 108"
)
// 测试数据写入临时文件
val tempFilePath =
FileUtils.writeToTempFile(csvRecords.mkString("$"), "csv_source_", "tmp")
// 创建Source connector
new CsvTableSource(
tempFilePath,
Array("rowtime","currency","rate"),
Array(
Types.STRING,Types.STRING,Types.STRING
),
fieldDelim = ",",
rowDelim = "$",
ignoreFirstLine = true,
ignoreComments = "%"
)
}
def genEventRatesHistorySource: CsvTableSource = {
val csvRecords = Seq(
"ts#currency#rate",
"1#US Dollar#102",
"1#Euro#114",
"1#Yen#1",
"3#Euro#116",
"5#Euro#119",
"7#Pounds#108"
)
// 测试数据写入临时文件
val tempFilePath =
FileUtils.writeToTempFile(csvRecords.mkString(CommonUtils.line), "csv_source_rate", "tmp")
// 创建Source connector
new CsvTableSource(
tempFilePath,
Array("ts","currency","rate"),
Array(
Types.LONG,Types.STRING,Types.LONG
),
fieldDelim = "#",
rowDelim = CommonUtils.line,
ignoreFirstLine = true,
ignoreComments = "%"
)
}
def genRatesOrderSource: CsvTableSource = {
val csvRecords = Seq(
"ts#currency#amount",
"2#Euro#10",
"4#Euro#10"
)
// 测试数据写入临时文件
val tempFilePath =
FileUtils.writeToTempFile(csvRecords.mkString(CommonUtils.line), "csv_source_order", "tmp")
// 创建Source connector
new CsvTableSource(
tempFilePath,
Array("ts","currency", "amount"),
Array(
Types.LONG,Types.STRING,Types.LONG
),
fieldDelim = "#",
rowDelim = CommonUtils.line,
ignoreFirstLine = true,
ignoreComments = "%"
)
}
/**
* Example:
* genCsvSink(
* Array[String]("word", "count"),
* Array[TypeInformation[_] ](Types.STRING, Types.LONG))
*/
def genCsvSink(fieldNames: Array[String], fieldTypes: Array[TypeInformation[_]]): TableSink[Row] = {
val tempFile = File.createTempFile("csv_sink_", "tem")
if (tempFile.exists()) {
tempFile.delete()
}
new CsvTableSink(tempFile.getAbsolutePath).configure(fieldNames, fieldTypes)
}
}
4、运行结果
以上,本文给以scala的语言给出来Table API 针对时态表的join操作。
标签:join,val,示例,flink,currency,new,Array,Types,Euro From: https://blog.51cto.com/alanchan2win/9232420