GenericUDF usage workflow notes (reposted, with my own additions)

Published: 2023/12/31

Overview

This article organizes and reproduces the walkthrough in [1].

Environment

Component	Version
Hadoop	3.1.2
Zookeeper	3.6.0
Mysql	8.0.22-0ubuntu0.20.04.2
Hive	2.3.7

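Hive preparation

The hardest part of this experiment is that inserting an array<map<string,int>> value into a Hive table requires selecting from a dummy table [5].

Without having studied this specifically, it is not an obvious trick.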

create table tb_test2 (name string, score_list array<map<string,int>>);
insert into tb_test2 select "A", array(map("math",100,"english",90,"history",85)) from (select 1) x;
insert into tb_test2 select "A", array(map("math",95,"english",80,"history",100)) from (select 1) x;
insert into tb_test2 select "A", array(map("math",80,"english",90,"history",100)) from (select 1) x;

After the inserts, the table looks like this:

0: jdbc:hive2://Desktop:10000> select * from tb_test2;
+----------------+-------------------------------------------+
| tb_test2.name  | tb_test2.score_list                       |
+----------------+-------------------------------------------+
| A              | [{"history":85,"english":90,"math":100}]  |
| A              | [{"history":100,"english":80,"math":95}]  |
| A              | [{"history":100,"english":90,"math":80}]  |
+----------------+-------------------------------------------+

Registering the GenericUDF

Registration command (run in hive/beeline)	Notes
add jar /home/appleyuchi/桌面/Flink_Code/FLINK读写各种数据源/Java/target/table_api-1.0-SNAPSHOT.jar;	Points Hive at the custom dependency JAR
create temporary function hellonew as 'helloGenericUDFNew';	Registers the class helloGenericUDFNew from the JAR as the function hellonew

Using the GenericUDF

Step 1: check that Hive itself works

Hive command:

use db1;
select * from tb_test2;

Result:

+----------------+-------------------------------------------+
| tb_test2.name  | tb_test2.score_list                       |
+----------------+-------------------------------------------+
| A              | [{"history":85,"english":90,"math":100}]  |
| A              | [{"history":100,"english":80,"math":95}]  |
| A              | [{"history":100,"english":90,"math":80}]  |
+----------------+-------------------------------------------+

Step 2: call the registered GenericUDF

Hive command:

use db1;
select hellonew(tb_test2.name, tb_test2.score_list) from tb_test2;

Result:

+-------------------------------+
|              _c0              |
+-------------------------------+
| {"name":"A","totalscore":275} |
| {"name":"A","totalscore":275} |
| {"name":"A","totalscore":270} |
+-------------------------------+

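The results make the effect of this GenericUDF clear: for each row, it sums all the scores in score_list and returns the person's name together with the total score.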
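To make the arithmetic behind those totals concrete, here is a minimal plain-Java sketch of the per-row computation. It is not the author's helloGenericUDFNew class and deliberately uses no Hive API. The class name TotalScoreSketch and the methods totalScore and format are hypothetical names chosen for illustration, and the output string simply mimics the shape of the query result shown earlier.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: mirrors the observed behaviour of hellonew
// on one row, without depending on Hive.
final class TotalScoreSketch {

    // Sums every score in a value shaped like Hive's array<map<string,int>>.
    static int totalScore(List<Map<String, Integer>> scoreList) {
        int total = 0;
        for (Map<String, Integer> scores : scoreList) {
            for (int score : scores.values()) {
                total += score;
            }
        }
        return total;
    }

    // Builds a string in the same shape as the query output.
    static String format(String name, int total) {
        return "{\"name\":\"" + name + "\",\"totalscore\":" + total + "}";
    }

    public static void main(String[] args) {
        // First row of tb_test2: math 100, english 90, history 85.
        List<Map<String, Integer>> row =
                List.of(Map.of("math", 100, "english", 90, "history", 85));
        System.out.println(format("A", totalScore(row)));
        // prints {"name":"A","totalscore":275}
    }
}
```

In a real GenericUDF the list and map would arrive through Hive's ObjectInspector machinery rather than as plain Java collections; the linked source file shows how the author handles that part.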

Full code:

https://gitee.com/appleyuchi/Flink_Code/blob/master/FLINK读写各种数据源/Java/src/main/java/helloGenericUDFNew.java

The official Hive documentation states that "collection" refers to arrays or maps [4].

Reference:

[1]Hive GenericUDF2

[2]A GenericUDF example: generating word vectors from strings

[3]How can I insert a key-value pair into a hive map?

[4]Hive Tutorial

[5]How to insert array<map<string,int>> into hive table?
