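While capturing MySQL binlog with StreamSets, I found that DATETIME values are converted to timestamps, but the timestamps that come out have a time zone problem: they are shifted relative to the actual time.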
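To make the symptom concrete, here is a minimal sketch using java.time. It assumes the shift comes from the DATETIME wall-clock value being read as if it were UTC, which is consistent with the fix below but is not something I verified in the connector itself. The class and method names (DatetimeOffsetDemo, asUtcMillis, asZoneMillis) and the Asia/Shanghai zone are only illustrative.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class DatetimeOffsetDemo {
    // Epoch millis when the DATETIME wall-clock value is read as UTC
    // (assumed to be what the unpatched connector effectively returns).
    static long asUtcMillis(LocalDateTime dt) {
        return dt.toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    // Epoch millis when the same wall-clock value is read in the given zone.
    static long asZoneMillis(LocalDateTime dt, ZoneId zone) {
        return dt.atZone(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        LocalDateTime dt = LocalDateTime.of(2024, 4, 26, 12, 0, 0);
        ZoneId zone = ZoneId.of("Asia/Shanghai"); // example zone, UTC+8
        long diffHours = (asUtcMillis(dt) - asZoneMillis(dt, zone)) / 3_600_000L;
        System.out.println("Read as UTC:   " + asUtcMillis(dt));
        System.out.println("Read as local: " + asZoneMillis(dt, zone));
        System.out.println("Difference in hours: " + diffHours); // 8 for Asia/Shanghai
    }
}
```

With a UTC+8 zone the two readings differ by exactly 8 hours, which is the kind of shift described above.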
A Baidu search turned up an earlier write-up of the fix: https://blog.csdn.net/weixin_38751513/article/details/131662819
Here is a detailed record of how I applied it:
Our StreamSets is deployed through CDH, so the first step is to find which version of mysql-binlog-connector-java StreamSets depends on.
1. On the server where StreamSets is installed, run: find / -name 'mysql-binlog-connector-java*'
The lib is located at: /mnt/bdaaslv/opt/cloudera/parcels/STREAMSETS_DATACOLLECTOR-3.22.0/streamsets-libs/streamsets-datacollector-mysql-binlog-lib/lib
2. cd into that directory and check the version. Mine is 0.23.4.
3. Get the source for that version. The repository is https://github.com/osheroff/mysql-binlog-connector-java. Find the tag that matches your version; mine is 0.23.4, so the tag is https://github.com/osheroff/mysql-binlog-connector-java/releases/tag/v0.23.4
4. Download the source, then modify it as described in the write-up linked above:
// In the AbstractRowsEventDataDeserializer class, add this method
// Needs java.util.Calendar and java.util.TimeZone imported in this class, if they are not already.
private long convertLocalTimestamp(long millis) {
    TimeZone tz = TimeZone.getDefault();
    Calendar c = Calendar.getInstance(tz);
    long localMillis = millis;
    int offset, time;

    c.set(1970, Calendar.JANUARY, 1, 0, 0, 0);
    // set(...) above keeps the MILLISECOND field from the current time, so zero it explicitly.
    c.set(Calendar.MILLISECOND, 0);

    // Add milliseconds
    while (localMillis > Integer.MAX_VALUE) {
        c.add(Calendar.MILLISECOND, Integer.MAX_VALUE);
        localMillis -= Integer.MAX_VALUE;
    }
    c.add(Calendar.MILLISECOND, (int) localMillis);

    // Stupidly, the Calendar will give us the wrong result if we use getTime() directly.
    // Instead, we calculate the offset and do the math ourselves.
    time = c.get(Calendar.MILLISECOND);
    time += c.get(Calendar.SECOND) * 1000;
    time += c.get(Calendar.MINUTE) * 60 * 1000;
    time += c.get(Calendar.HOUR_OF_DAY) * 60 * 60 * 1000;
    offset = tz.getOffset(c.get(Calendar.ERA), c.get(Calendar.YEAR), c.get(Calendar.MONTH),
            c.get(Calendar.DAY_OF_MONTH), c.get(Calendar.DAY_OF_WEEK), time);

    // Shift the UTC-based value by the JVM default time zone's offset.
    return (millis - offset);
}
// Change the return value of the asUnixTime method
protected Long asUnixTime(int year, int month, int day, int hour, int minute, int second, int millis) {
    // https://dev.mysql.com/doc/refman/5.0/en/datetime.html
    if (year == 0 || month == 0 || day == 0) {
        return invalidDateAndTimeRepresentation;
    }
    // Original: return UnixTime.from(year, month, day, hour, minute, second, millis);
    return convertLocalTimestamp(UnixTime.from(year, month, day, hour, minute, second, millis));
}
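As a sanity check on what the patched asUnixTime should return, the same correction can be sketched with java.time and compared against ZonedDateTime. This is not the patch itself, only a standalone approximation of it; PatchCheck and toLocalEpochMillis are hypothetical names, and the zone is passed in explicitly, whereas the patch uses TimeZone.getDefault() and therefore depends on the time zone of the JVM running StreamSets.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class PatchCheck {
    // Takes epoch millis produced by reading a wall-clock value as UTC and
    // subtracts the zone's offset at that wall-clock time.
    static long toLocalEpochMillis(long utcReadMillis, ZoneId zone) {
        LocalDateTime wallClock =
                LocalDateTime.ofInstant(Instant.ofEpochMilli(utcReadMillis), ZoneOffset.UTC);
        int offsetSeconds = zone.getRules().getOffset(wallClock).getTotalSeconds();
        return utcReadMillis - offsetSeconds * 1000L;
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("Asia/Shanghai"); // example zone
        LocalDateTime dt = LocalDateTime.of(2024, 4, 26, 12, 0, 0);

        long utcRead = dt.toInstant(ZoneOffset.UTC).toEpochMilli();
        long corrected = toLocalEpochMillis(utcRead, zone);
        long expected = dt.atZone(zone).toInstant().toEpochMilli();

        System.out.println("corrected == expected: " + (corrected == expected));
    }
}
```

After deploying the patched jar, a few captured DATETIME values can be compared against this kind of calculation to confirm that the JVM default time zone is the one you expect.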
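5. Build the jar, upload it, and replace the original one. Tip: back up the original jar before replacing it.
6. Restart StreamSets from the CM (Cloudera Manager) page. Done.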
From: https://www.cnblogs.com/mytg/p/18160686