Presto开发语句简介
根据presto中的结构配置,catalog表示连接,主要看presto中catalog文件夹下的配置,一般包含hive、mysql等,其中可以根据业务的不同设置多个配置文件。schema表示连接中的数据库,table表示库中的表。
下面以hive数据源为例,使用命令通过指定catalog连接到hive数据源,查看含有的schema信息。
/opt/cloudera/parcels/presto/bin/presto --server hadoop01:8089 --catalog=hive show schemas;
数据类型
Boolean: true, false
Integer: tinyint, smallint, integer, bigint
Floating-Point: real, double
Fixed-Precision:DECIMAL
String:varchar, char, varbinary, json
Date and Time: date, time, time with time zone, timestamp, timestamp with time zone, interval year to month, interval day to second
Structural: array, map, row
Network Address: ipaddress
HyperLogLog: HyperLogLog, P4HyperLogLog
Quantile Digest: QDigest
一、建库语句
1、简单建/删库语句
create schema ods; show schemas;
drop schema ods; show schemas;
2、使用数据库
use hive.ods;
二、建表语句
1、建表格式:
CREATE TABLE [ IF NOT EXISTS ] table_name ( { column_name data_type [ COMMENT comment ] [ WITH ( property_name = expression [, ...] ) ] | LIKE existing_table_name [ { INCLUDING | EXCLUDING } PROPERTIES ] } [, ...] ) [ COMMENT table_comment ] [ WITH ( property_name = expression [, ...] ) ] - IF NOT EXISTS 比较安全,防止TABLE已存在的报错 - WITH 可以给TABLE添加属性
eg:
create table hive.ods.presto_test( "sid" int comment '学生id', "sname" varchar(50) comment '学生姓名', "cname" varchar(50) comment '班级名称', "total_score" int comment '考试总分' )comment 'presto测试表' WITH (format = 'ORC');
2、查看创建表的基本信息:
desc hive.ods.presto_test;
SHOW CREATE TABLE hive.ods.presto_test;
3、修改表信息
重命名 ALTER TABLE name RENAME TO new_name 添加字段 ALTER TABLE name ADD COLUMN column_name data_type [ COMMENT comment ] [ WITH ( property_name = expression [, ...] ) ] 删除字段 ALTER TABLE name DROP COLUMN column_name 重命名字段 ALTER TABLE name RENAME COLUMN column_name TO new_column_name
4、使用查询语句,创建表
CREATE TABLE [ IF NOT EXISTS ] table_name [ ( column_alias, ... ) ] [ COMMENT table_comment ] [ WITH ( property_name = expression [, ...] ) ] AS query [ WITH [ NO ] DATA ] eg: 创建一个新表presto_test_as,字段sid, sname分别来自于表orders的字段sid, sname CREATE TABLE presto_test_as (sid,sname ) AS SELECT sid,sname FROM presto_test;
5、删除表
DROP TABLE [ IF EXISTS ] table_name - IF EXISTS 可以防止TABLE不存在时的报错 eg: 删除名为web的TABLE DROP TABLE web 如果名为web的TABLE存在,则删除它 DROP TABLE IF EXISTS sales
三、插入语句
INSERT INTO table_name [ ( column [, ... ] ) ] query eg: 插入1条数据 insert into presto_test_as values(1,'九月'); 插入多条数据 insert into presto_test_as values(2,'二龙'),(3,'猛犸');
只能用insert into,insert overwrite报错
四、删除语句
具体的删除语句能不能执行看具体的connector是否支持。
DELETE FROM table_name [ WHERE condition ]
五、查询语句
select查询语法
[ WITH with_query [, ...] ] SELECT [ ALL | DISTINCT ] select_expr [, ...] [ FROM from_item [, ...] ] [ WHERE condition ] [ GROUP BY [ ALL | DISTINCT ] grouping_element [, ...] ] [ HAVING condition] [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select ] [ ORDER BY expression [ ASC | DESC ] [, ...] ] [ LIMIT [ count | ALL ] ] 以下是这些参数可能的格式 - from_item table_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ] from_item join_type from_item [ ON join_condition | USING ( join_column [, ...] ) ] - join_type [ INNER ] JOIN LEFT [ OUTER ] JOIN RIGHT [ OUTER ] JOIN FULL [ OUTER ] JOIN CROSS JOIN - grouping_element () expression GROUPING SETS ( ( column [, ...] ) [, ...] ) CUBE ( column [, ...] ) ROLLUP ( column [, ...] )
1、with语句
WITH x AS (SELECT a, MAX(b) AS b FROM t GROUP BY a) SELECT a, b FROM x; 等同于 SELECT a, b FROM ( SELECT a, MAX(b) AS b FROM t GROUP BY a ) AS x; 也可以用于多条定义 WITH t1 AS (SELECT a, MAX(b) AS b FROM x GROUP BY a), t2 AS (SELECT a, AVG(d) AS d FROM y GROUP BY a) SELECT t1.*, t2.* FROM t1 JOIN t2 ON t1.a = t2.a; 也可以链式使用 WITH x AS (SELECT a FROM t), y AS (SELECT a AS b FROM x), z AS (SELECT b AS c FROM y) SELECT c FROM z;
2、group by
当在select语句中使用group by时,所有输出表达式都必须是聚合函数或group by子句中存在的列。
select sid,count(sname) as num from presto_test_as group by 1; select sid,count(sname) as num from presto_test_as group by sid; select count(sname) as num from presto_test_as group by sid;
3、grouping sets
可以指定多个列进行分组,结果列中不属于分组列的将被设置为NUll。
具有复杂分组语法(GROUPING SETS, CUBE 或 ROLLUP)的查询只从基础数据源读取一次,而使用UNION ALL的查询将读取基础数据三次。这就是当数据源不具有确定性时,使用UNION ALL的查询可能会产生不一致的结果的原因。
eg:有表presto_test,
select * from presto_test;
select sname,cname,sum(total_score) from presto_test group by grouping sets((sname),(sname,cname),(sid));
这个的查询在逻辑上等同于多个分组查询的union all:
select sname,null, sum(total_score) from presto_test group by sname union all select sname,cname,sum(total_score) from presto_test group by sname,cname union all select null,null,sum(total_score) from presto_test group by sid;
六、其他语句
声明一个名为statement_name的SQL PREPARE statement_name FROM statement 执行名为statement_name的声明 EXECUTE statement_name [ USING parameter1 [ , parameter2, ... ] ] 删除名为statement_name的声明 DEALLOCATE PREPARE statement_name PREPARE my_select2 FROM SELECT sname FROM presto_test_as WHERE sid > ? and sid <= ?; EXECUTE my_select2 USING 1, 3; DEALLOCATE PREPARE my_select2; EXECUTE my_select2 USING 1, 3;