Integrating Trino with Hudi (Part 4)



Installation and Deployment

This article starts from an environment where the Trino component is already deployed, and walks through connecting Trino to Hudi so that Trino can query Hudi tables normally.

1. Add the Hudi connector configuration

Under the Trino deployment directory, create a hudi.properties file in etc/catalog, place the cluster's core-site.xml and hdfs-site.xml under etc, and point to them from hudi.properties. Then distribute hudi.properties, core-site.xml, and hdfs-site.xml to all Trino nodes.

vim hudi.properties

connector.name=hudi
hive.metastore.uri=thrift://hadoop001:9083
hive.config.resources=/opt/trino/etc/core-site.xml,/opt/trino/etc/hdfs-site.xml
hive.metastore.thrift.impersonation.enabled=true
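Note that hive.metastore.thrift.impersonation.enabled=true makes Trino access the metastore on behalf of the end user, which typically also requires the OS user running Trino to be whitelisted as a Hadoop proxy user on the cluster side. A minimal sketch of the core-site.xml entries, assuming Trino runs as a user named trino (adjust the user name to your deployment):

<!-- core-site.xml on the cluster; the user name "trino" is an assumption -->
<property>
  <name>hadoop.proxyuser.trino.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.trino.groups</name>
  <value>*</value>
</property>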

2. Restart Trino

/opt/trino/bin/launcher stop
/opt/trino/bin/launcher start
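To confirm the restart picked up the new catalog, check the launcher status and list the catalogs from a client session; this assumes the same /opt/trino layout used above:

/opt/trino/bin/launcher status
#from any Trino CLI session, the new catalog should now appear
trino> show catalogs;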

Smoke Test

1. Create Hudi tables with Spark

export SPARK_VERSION=3.3
#run from the Spark deployment directory
./bin/spark-sql --packages org.apache.hudi:hudi-spark$SPARK_VERSION-bundle_2.12:0.14.0 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' --conf 'spark.kryo.registrator=org.apache.spark.HoodieSparkKryoRegistrar' --proxy-user=hive

use test;
CREATE TABLE hudi_table2 (
    ts BIGINT,
    uuid STRING,
    rider STRING,
    driver STRING,
    fare DOUBLE,
    city STRING
) USING HUDI
TBLPROPERTIES (
  primaryKey = 'uuid'
)
PARTITIONED BY (city);

INSERT INTO hudi_table2
VALUES
(1695159649087,'334e26e9-8355-45cc-97c6-c31daf0df330','rider-A','driver-K',19.10,'san_francisco'),
(1695091554788,'e96c4396-3fad-413a-a942-4cb36106d721','rider-C','driver-M',27.70 ,'san_francisco'),
(1695046462179,'9909a8b1-2d15-4d3d-8ec9-efc48c536a00','rider-D','driver-L',33.90 ,'san_francisco'),
(1695332066204,'1dced545-862b-4ceb-8b43-d2a568f6616b','rider-E','driver-O',93.50,'san_francisco'),
(1695516137016,'e3cf430c-889d-4015-bc98-59bdce1e530c','rider-F','driver-P',34.15,'sao_paulo'    ),
(1695376420876,'7a84095f-737f-40bc-b62f-6b69664712d2','rider-G','driver-Q',43.40 ,'sao_paulo'    ),
(1695173887231,'3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04','rider-I','driver-S',41.06 ,'chennai'      ),
(1695115999911,'c8abbe79-8d89-47ea-b4ce-4d224bae5bfa','rider-J','driver-T',17.85,'chennai');

CREATE TABLE fare_adjustment (ts BIGINT, uuid STRING, rider STRING, driver STRING, fare DOUBLE, city STRING) 
USING HUDI
TBLPROPERTIES (
  primaryKey = 'uuid'
) PARTITIONED BY (city);

INSERT INTO fare_adjustment VALUES 
(1695091554788,'e96c4396-3fad-413a-a942-4cb36106d721','rider-C','driver-M',-2.70 ,'san_francisco'),
(1695530237068,'3f3d9565-7261-40e6-9b39-b8aa784f95e2','rider-K','driver-U',64.20 ,'san_francisco'),
(1695241330902,'ea4c36ff-2069-4148-9927-ef8c1a5abd24','rider-H','driver-R',66.60 ,'sao_paulo'    ),
(1695115999911,'c8abbe79-8d89-47ea-b4ce-4d224bae5bfa','rider-J','driver-T',1.85,'chennai'      );
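Note that the uuid deleted below exists only in fare_adjustment, so the DELETE assumes the adjustment rows were first merged into hudi_table2 (presumably the step shown in the figures). A sketch of that merge, adapted from the Hudi Spark SQL quickstart:

MERGE INTO hudi_table2 AS target
USING fare_adjustment AS source
ON target.uuid = source.uuid
WHEN MATCHED THEN UPDATE SET target.fare = target.fare + source.fare, target.ts = source.ts
WHEN NOT MATCHED THEN INSERT *;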

DELETE FROM hudi_table2 WHERE uuid = '3f3d9565-7261-40e6-9b39-b8aa784f95e2';

[Figure 1]

[Figure 2]


2. Test from the Trino client

Run the tests as the test user. Because the test user does not have privileges on the hudi_hive database, grant them first.


/opt/trino/trino-cli --server https://172.16.121.0:8443 --keystore-path /opt/trino/etc/trino.jks --keystore-password admin@123 --catalog hudi --user test --password --debug
#enter the test user's password
trino> use hudi_hive;
USE
trino:hudi_hive> show tables;
      Table       
------------------
 fare_adjustment  
 host_hudi        
 hudi_table       
 hudi_table2      
 hudi_table_keyed 
 student          
 tbl_customer     
(7 rows)


trino:hudi_hive> select * from hudi_table2;
 _hoodie_commit_time | _hoodie_commit_seqno  |          _hoodie_record_key          | _hoodie_partition_path |                            _hoodie_file_n
---------------------+-----------------------+--------------------------------------+------------------------+------------------------------------------
 20231123161346626   | 20231123161346626_1_0 | e3cf430c-889d-4015-bc98-59bdce1e530c | city=sao_paulo         | 5d42cc1a-c3cb-499b-a203-98b4372c30b6-0_1-
 20231123161346626   | 20231123161346626_1_1 | 7a84095f-737f-40bc-b62f-6b69664712d2 | city=sao_paulo         | 5d42cc1a-c3cb-499b-a203-98b4372c30b6-0_1-
 20231123161346626   | 20231123161346626_2_0 | 3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04 | city=chennai           | b0b19f4e-7530-4df6-ab29-638afcea17ef-0_2-
 20231123161346626   | 20231123161346626_2_1 | c8abbe79-8d89-47ea-b4ce-4d224bae5bfa | city=chennai           | b0b19f4e-7530-4df6-ab29-638afcea17ef-0_2-
 20231123161346626   | 20231123161346626_0_0 | 334e26e9-8355-45cc-97c6-c31daf0df330 | city=san_francisco     | 9d63dd6d-16dc-4a0d-bdb1-4be1aceab3a5-0_0-
 20231123161346626   | 20231123161346626_0_1 | e96c4396-3fad-413a-a942-4cb36106d721 | city=san_francisco     | 9d63dd6d-16dc-4a0d-bdb1-4be1aceab3a5-0_0-
 20231123161346626   | 20231123161346626_0_2 | 9909a8b1-2d15-4d3d-8ec9-efc48c536a00 | city=san_francisco     | 9d63dd6d-16dc-4a0d-bdb1-4be1aceab3a5-0_0-
 20231123161346626   | 20231123161346626_0_3 | 1dced545-862b-4ceb-8b43-d2a568f6616b | city=san_francisco     | 9d63dd6d-16dc-4a0d-bdb1-4be1aceab3a5-0_0-
(8 rows)
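Reads work as expected. Beyond plain scans, the connector also exposes Hudi's timeline as a hidden metadata table, which is a quick way to confirm commits from the Trino side (a sketch; metadata-table support depends on the Trino version):

trino:hudi_hive> select timestamp, action, state from "hudi_table2$timeline";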

MERGE: the syntax is not supported.

[Figure 5]

DELETE: the Trino Hudi connector does not support modifying table rows.

[Figure 6]

Time travel: the syntax is not supported.

[Figure 7]
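For reference, the failures shown in the figures correspond to statements of roughly this shape (illustrative sketches: the timestamp is made up, and the exact error messages vary by Trino version):

trino:hudi_hive> merge into hudi_table2 t using fare_adjustment s on t.uuid = s.uuid when matched then update set fare = t.fare + s.fare;
trino:hudi_hive> delete from hudi_table2 where uuid = 'c8abbe79-8d89-47ea-b4ce-4d224bae5bfa';
trino:hudi_hive> select * from hudi_table2 for timestamp as of timestamp '2023-11-23 16:13:46';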

