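When Hive's built-in functions cannot satisfy a query requirement, users can write a user-defined function (UDF) tailored to their business logic and then call it from HiveQL.
For example, suppose that to protect user privacy, queries must replace the middle four digits of a user's phone number with asterisks, so that 18001292688 is displayed as 180****2688. A custom function can implement this requirement.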
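Before involving Hive at all, the masking rule itself can be sketched in plain Java. This is a minimal illustration, assuming a phone number is an 11-character string; the class name `PhoneMask`, the method `mask`, and the error text are hypothetical and not part of any Hive API.

```java
// Hypothetical helper for illustration only; not part of the Hive API.
final class PhoneMask {
    // Assumed error text returned for input that is not 11 characters long.
    static final String ERROR = "Invalid phone number!";

    // Keeps the first three and last four characters and masks the four in between.
    static String mask(String phone) {
        if (phone == null || phone.length() != 11) {
            return ERROR;
        }
        return phone.substring(0, 3) + "****" + phone.substring(7);
    }

    public static void main(String[] args) {
        System.out.println(mask("18001292688")); // prints 180****2688
    }
}
```

The UDF only needs to wrap this logic in an `evaluate()` method so that Hive can call it once per row.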
Create a project named MyUDF and add the Maven dependencies:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.example</groupId>
    <artifactId>MyUDF</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <hive.version>2.1.1-cdh6.1.0</hive.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>jdk.tools</groupId>
            <artifactId>jdk.tools</artifactId>
            <version>1.8</version>
            <scope>system</scope>
            <systemPath>${JAVA_HOME}/lib/tools.jar</systemPath>
        </dependency>
        <!-- Hadoop common package -->
        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.10.2</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.hive/hive-exec -->
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-exec</artifactId>
            <version>${hive.version}</version>
        </dependency>
    </dependencies>

    <!-- Add the CDH repository -->
    <repositories>
        <repository>
            <id>nexus-aliyun</id>
            <url>http://maven.aliyun.com/nexus/content/groups/public</url>
        </repository>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
        </repository>
    </repositories>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.6.0</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                    <encoding>UTF-8</encoding>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
Create the class hive.demo.MyUDF:
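package hive.demo;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * Hive user-defined function class.
 */
public class MyUDF extends UDF {

    /**
     * A UDF class needs a method named evaluate(), which Hive calls for each row.
     *
     * @param text the argument passed in when the function is called
     * @return the phone number with its middle four digits masked
     */
    public String evaluate(Text text) {
        String result = "Invalid phone number!";
        // Text.getLength() is the UTF-8 byte length, which equals the digit count for ASCII input.
        if (text != null && text.getLength() == 11) {
            String inputStr = text.toString();
            StringBuilder sb = new StringBuilder();
            sb.append(inputStr.substring(0, 3)); // first three digits
            sb.append("****");                   // mask for the middle four digits
            sb.append(inputStr.substring(7));    // last four digits
            result = sb.toString();
        }
        return result;
    }
}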
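One caveat about the length check in `evaluate()`: Hadoop's `Text` stores its content as UTF-8, and `getLength()` reports the number of bytes, so any 11-byte value passes, including values that are not digits. The following plain-Java sketch illustrates the difference and shows one possible stricter check, assuming a valid phone number is exactly 11 ASCII digits. `PhoneCheck` and its methods are hypothetical names introduced for illustration only.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Hive or Hadoop.
final class PhoneCheck {
    // Number of bytes in the UTF-8 encoding of s, which is what a byte-length check measures.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Assumed stricter rule: exactly 11 ASCII digits.
    static boolean isElevenDigits(String s) {
        return s != null && s.matches("[0-9]{11}");
    }

    public static void main(String[] args) {
        String letters = "abcdefghijk";          // 11 bytes, no digits
        String mixed = "\u4e2d\u6587abcde";      // 7 characters, 11 bytes in UTF-8
        String phone = "18001292688";            // 11 digits

        for (String s : new String[] {letters, mixed, phone}) {
            System.out.println(utf8Length(s) + " bytes, valid=" + isElevenDigits(s));
        }
    }
}
```

All three inputs are 11 bytes long, but only the last one satisfies the digit check. Whether the stricter rule is appropriate depends on the phone number formats present in the data.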
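Package the project as MyUDF.jar and upload it to a directory on the server, for example /home/hadoop/.
Then run the following in the Hive CLI:
hive> add jar /home/hadoop/MyUDF.jar;
Register the class under a function name: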
CREATE TEMPORARY FUNCTION formatPhone AS 'hive.demo.MyUDF';
Create a table to test the custom function:
CREATE TABLE t_user(id INT, phone STRING);
INSERT INTO TABLE t_user
SELECT 1, '13123567589'
UNION ALL SELECT 2, '15898705673'
UNION ALL SELECT 3, '18001292688';