Notes on the GenericUDF Workflow (Reposted, with My Own Edits)
Overview
This post reorganizes and reproduces [1].
Environment
| Component | Version |
| Hadoop | 3.1.2 |
| ZooKeeper | 3.6.0 |
| MySQL | 8.0.22-0ubuntu0.20.04.2 |
| Hive | 2.3.7 |
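Hive Preparation
The hardest part of this experiment is populating the test table, which relies on the dummy-table technique in Hive [5]. Hive does not accept complex types such as array<map<string,int>> in an INSERT ... VALUES clause, so each row is instead written as an INSERT ... SELECT over a one-row dummy subquery, (select 1) x.
Unless you have studied this specifically, the trick is not at all obvious.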
create table tb_test2 (name string, score_list array<map<string,int>>);
insert into tb_test2 select "A", array(map("math",100,"english",90,"history",85)) from (select 1) x;
insert into tb_test2 select "A", array(map("math",95,"english",80,"history",100)) from (select 1) x;
insert into tb_test2 select "A", array(map("math",80,"english",90,"history",100)) from (select 1) x;

After the inserts, the table looks like this:
0: jdbc:hive2://Desktop:10000> select * from tb_test2;
+----------------+-------------------------------------------+
| tb_test2.name  | tb_test2.score_list                       |
+----------------+-------------------------------------------+
| A              | [{"history":85,"english":90,"math":100}]  |
| A              | [{"history":100,"english":80,"math":95}]  |
| A              | [{"history":100,"english":90,"math":80}]  |
+----------------+-------------------------------------------+
Registering the GenericUDF
| Registration command (run in hive/beeline) | Note |
| add jar /home/appleyuchi/桌面/Flink_Code/FLINK读写各种数据源/Java/target/table_api-1.0-SNAPSHOT.jar; | Adds the custom JAR that contains the UDF to the session |
| create temporary function hellonew as 'helloGenericUDFNew'; | Registers the class helloGenericUDFNew from that JAR as the function hellonew |
Using the GenericUDF
First, check that Hive itself is working:

use db1;
select * from tb_test2;
+----------------+-------------------------------------------+
| tb_test2.name  | tb_test2.score_list                       |
+----------------+-------------------------------------------+
| A              | [{"history":85,"english":90,"math":100}]  |
| A              | [{"history":100,"english":80,"math":95}]  |
| A              | [{"history":100,"english":90,"math":80}]  |
+----------------+-------------------------------------------+

Then call the registered GenericUDF:

use db1;
select hellonew(tb_test2.name, tb_test2.score_list) from tb_test2;
+--------------------------------+
| _c0                            |
+--------------------------------+
| {"name":"A","totalscore":275}  |
| {"name":"A","totalscore":275}  |
| {"name":"A","totalscore":270}  |
+--------------------------------+
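The results make the purpose of this GenericUDF clear: for each row, it sums every score in score_list and returns the name together with the total. For the first row, 100 + 90 + 85 = 275.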
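The aggregation described above can be sketched in plain Java, without any Hive dependencies. This is only an illustration of the logic implied by the query output, not the code from the linked repository: the class name TotalScoreSketch and the methods totalScore and toResult are hypothetical, and the real UDF may well return a Hive struct (which beeline prints in this JSON-like form) rather than a string.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plain-Java sketch of the per-row aggregation; not the author's UDF.
final class TotalScoreSketch {

    // Sums every score in every map of the list; null lists, maps and values are skipped.
    static int totalScore(List<Map<String, Integer>> scoreList) {
        int total = 0;
        if (scoreList == null) {
            return total;
        }
        for (Map<String, Integer> scores : scoreList) {
            if (scores == null) {
                continue;
            }
            for (Integer score : scores.values()) {
                if (score != null) {
                    total += score;
                }
            }
        }
        return total;
    }

    // Formats the result in the same shape as the beeline output (assumption: string form
    // is used here only for display; it does no escaping of the name).
    static String toResult(String name, List<Map<String, Integer>> scoreList) {
        return "{\"name\":\"" + name + "\",\"totalscore\":" + totalScore(scoreList) + "}";
    }

    public static void main(String[] args) {
        // Mirrors the first row of tb_test2.
        Map<String, Integer> row = new LinkedHashMap<>();
        row.put("math", 100);
        row.put("english", 90);
        row.put("history", 85);
        System.out.println(toResult("A", Collections.singletonList(row)));
        // prints {"name":"A","totalscore":275}
    }
}
```

In an actual GenericUDF, the same loop would run inside evaluate, with the list and map contents read through the ObjectInspectors received in initialize rather than through java.util types directly.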
Full code:
https://gitee.com/appleyuchi/Flink_Code/blob/master/FLINK读写各种数据源/Java/src/main/java/helloGenericUDFNew.java
The official Hive documentation states that "collection" refers to arrays or maps [4].
References:
[1] Hive GenericUDF2
[2] A GenericUDF example: generating word vectors from strings
[3] How can I insert a key-value pair into a hive map?
[4] Hive Tutorial
[5] How to insert array<map<string,int>> into hive table?