Flink SQL: Consuming Kafka Stream Data and Printing It to the Console


Purpose
Use Flink SQL to create a streaming job that reads data from the Kafka topic "test" and writes it into a sink table, "print_table", whose print connector echoes every row to the console.
Heads-up: before launching sql-client, make sure a yarn-session is already running.
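If one isn't running yet, something like the following starts it (a minimal sketch; the /opt/flink install path is an assumption, and on older Flink versions the client is launched as sql-client.sh embedded):

/opt/flink/bin/yarn-session.sh -d    # start a detached Flink session on YARN
/opt/flink/bin/sql-client.sh         # open the SQL client against that session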
First, create the Kafka source table with CREATE TABLE:
CREATE TABLE test_source (
  objId STRING,
  data STRING,
  capTime STRING,
  dataType STRING,
  channelCode STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'test',
  'properties.bootstrap.servers' = '172.16.121.194:9092',
  'properties.group.id' = 'test-dataq-01',
  'format' = 'json',
  'scan.startup.mode' = 'earliest-offset'
);
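Before wiring up the sink, you can sanity-check the source table straight from sql-client (the SELECT submits its own short-lived job and streams results back into the client):

DESCRIBE test_source;
SELECT * FROM test_source;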

Next, create the "print_table" sink:
CREATE TABLE print_table (
  objId STRING,
  data STRING,
  capTime STRING,
  dataType STRING,
  channelCode STRING
) WITH (
  'connector' = 'print'
);
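Since the sink mirrors the source schema column for column, the same table can also be declared with Flink SQL's CREATE TABLE ... LIKE clause instead of repeating every column (an equivalent sketch, same names as above):

CREATE TABLE print_table
WITH ('connector' = 'print')
LIKE test_source (EXCLUDING ALL);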
Now insert the data from test_source into print_table:
INSERT INTO print_table
SELECT objId, data, capTime, dataType, channelCode
FROM test_source;
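This INSERT INTO submits a long-running streaming job. Optionally, give the job a recognizable name before submitting so it's easier to spot in the YARN and Flink UIs (the name 'kafka-to-print' is arbitrary, and the quoted SET syntax assumes a reasonably recent sql-client):

SET 'pipeline.name' = 'kafka-to-print';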

Next, check the YARN application list.
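The same information is available from the standard YARN CLI on any cluster node:

yarn application -list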
Click into the application to reach the Flink Web UI.
Now produce some JSON records to the test topic:
/opt/kafka/bin/kafka-console-producer.sh --bootstrap-server 172.16.121.194:9092 --topic test
{"objId":"12345","data":"example data 1","capTime":"2023-11-07T08:00:00","dataType":"exampleType","channelCode":"ABCDEF"}
{"objId":"54321","data":"example data 2","capTime":"2023-11-07T08:15:00","dataType":"anotherType","channelCode":"GHIJKL"}
{"objId":"99999","data":"more example data","capTime":"2023-11-07T08:30:00","dataType":"additionalType","channelCode":"ZYXWVU"}
{"objId":"11111","data":"extra data","capTime":"2023-11-07T08:45:00","dataType":"extraType","channelCode":"QRSTUV"}
{"objId":"77777","data":"additional example data","capTime":"2023-11-07T09:00:00","dataType":"moreType","channelCode":"MNBVCX"}
{"objId":"88888","data":"more and more data","capTime":"2023-11-07T09:15:00","dataType":"typeX","channelCode":"POIUYT"}
{"objId":"22222","data":"different data","capTime":"2023-11-07T09:30:00","dataType":"typeY","channelCode":"LAKSDJ"}
{"objId":"66666","data":"sample data","capTime":"2023-11-07T09:45:00","dataType":"testType","channelCode":"QWERTY"}
{"objId":"44444","data":"new data","capTime":"2023-11-07T10:00:00","dataType":"newType","channelCode":"ZXCVBN"}
{"objId":"55555","data":"fresh data","capTime":"2023-11-07T10:15:00","dataType":"freshType","channelCode":"EDCRFV"}
In the Flink Web UI you can see the records flowing through the job.
And the rows are printed to the console, i.e. the TaskManager's stdout.
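The print connector writes each row prefixed with its changelog kind (and, at parallelism greater than 1, the subtask index), so the first sample record should appear roughly as:

+I[12345, example data 1, 2023-11-07T08:00:00, exampleType, ABCDEF]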
Done.


Tags: Big Data Operations
