Writing a custom function hello, registering it in the Hive source code, and recompiling

1 Write your own UDF, hello

package cn.zhangjin.hive.udf;

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * A simple UDF: hello
 *
 * @author zj
 * @create 2019-02-22 17:51
 */
@Description(name = "hello",
        value = "_FUNC_(input_str) - returns 'Hello: input_str'",
        extended = "Example:\n"
                + "  > SELECT _FUNC_('wxk') FROM src LIMIT 1;\n"
                + "  'Hello: wxk'\n")
public class hello extends UDF {
    // Hive locates evaluate() by reflection; Text is the Writable type for strings.
    public Text evaluate(Text input) {
        // Return NULL for NULL input instead of the literal "Hello: null".
        if (input == null) {
            return null;
        }
        return new Text("Hello: " + input);
    }
}
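
Before building against Hive, the string logic of evaluate() can be checked on its own. The following is a minimal sketch, not part of the original project: the class name HelloLogicSketch and the method greet are hypothetical, String stands in for org.apache.hadoop.io.Text so that nothing from Hadoop is needed on the classpath, and returning null for null input is an assumption of this sketch.

```java
// Stand-in for the evaluate() logic of the hello UDF.
// String replaces org.apache.hadoop.io.Text so this runs without Hadoop or Hive jars.
final class HelloLogicSketch {

    // Prefixes the input with "Hello: ".
    // Returning null for null input is an assumption of this sketch.
    static String greet(String input) {
        if (input == null) {
            return null;
        }
        return "Hello: " + input;
    }

    public static void main(String[] args) {
        System.out.println(greet("wxk")); // prints Hello: wxk
        System.out.println(greet(null));  // prints null
    }
}
```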

pom.xml configuration (properties, repositories, and dependencies)

    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <hadoop.version>2.6.0-cdh5.7.0</hadoop.version>
        <hive.version>1.1.0-cdh5.7.0</hive.version>
    </properties>

    <repositories>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
        </repository>

    </repositories>

    <!-- Plugin repositories -->
    <pluginRepositories>

        <pluginRepository>
            <id>jeesite-repos</id>
            <name>Jeesite Repository</name>
            <url>http://maven.aliyun.com/nexus/content/groups/public</url>
        </pluginRepository>

    </pluginRepositories>

    <dependencies>

        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>${hadoop.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-exec</artifactId>
            <version>${hive.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>${hive.version}</version>
        </dependency>

        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.10</version>
            <scope>test</scope>
        </dependency>
    </dependencies>

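2 Download the Hive source code

Source download: http://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.7.0-src.tar.gz

See the FunctionRegistry class, which is where Hive's built-in functions are registered.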

3 Modify the code

(1) Adapt the UDF and place it in the source tree

Copy hello.java into the hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/udf directory.
vi hello.java
Change the package declaration from package com.****.hello; to package org.apache.hadoop.hive.ql.udf;

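(2) Modify FunctionRegistry.java

vi hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java
Below the long list of imports at the top of the file, add an import for the new UDF so that it can be registered:
import org.apache.hadoop.hive.ql.udf.hello;

In the static block near the top of the file, add system.registerUDF("hello", hello.class, false); (the third argument is false because hello is an ordinary function, not an operator).
The result looks like this:
static {
    system.registerGenericUDF("concat", GenericUDFConcat.class);
    system.registerUDF("hello", hello.class, false);
    system.registerUDF("substr", UDFSubstr.class, false);
    // ... the remaining registrations are unchanged
}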
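
To make it clearer why one line in a static block is enough to turn hello into a built-in, here is a minimal sketch of a name-to-class registry. It is an illustration only and not Hive's implementation: MiniRegistrySketch, its register and call methods, and the nested Hello class are all hypothetical, String stands in for Text, and the case-insensitive lookup is a choice made for this sketch.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical miniature of a name -> class function registry.
// It only illustrates the idea; Hive's real FunctionRegistry does much more.
final class MiniRegistrySketch {

    // Stand-in UDF: any class with a public evaluate(String) method.
    public static final class Hello {
        public String evaluate(String input) {
            return "Hello: " + input;
        }
    }

    private static final Map<String, Class<?>> FUNCTIONS = new LinkedHashMap<>();

    // Populated once when the class is loaded, like the static block in the text.
    static {
        register("hello", Hello.class);
    }

    // Stores the class under a lower-cased name (a choice made for this sketch).
    static void register(String name, Class<?> udfClass) {
        FUNCTIONS.put(name.toLowerCase(Locale.ROOT), udfClass);
    }

    // Looks the function up by name and invokes evaluate(String) reflectively.
    static String call(String name, String arg) throws ReflectiveOperationException {
        Class<?> udfClass = FUNCTIONS.get(name.toLowerCase(Locale.ROOT));
        if (udfClass == null) {
            throw new IllegalArgumentException("Unknown function: " + name);
        }
        Object udf = udfClass.getDeclaredConstructor().newInstance();
        Method evaluate = udfClass.getMethod("evaluate", String.class);
        return (String) evaluate.invoke(udf, arg);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(FUNCTIONS.keySet());   // prints [hello]
        System.out.println(call("HELLO", "wxk")); // prints Hello: wxk
    }
}
```

Because the registration runs at class-load time, every session sees the function without any CREATE FUNCTION or ADD JAR step, which is the point of compiling the UDF into Hive itself.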

4 Recompile the source

Run a Maven install build (mvn install). Here the project was imported into IntelliJ IDEA and built from there.

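5 Upload the rebuilt jar

You can either redeploy Hive from the new build, or simply copy the rebuilt hive-exec-1.1.0-cdh5.7.0.jar over the one in the existing Hive installation. Both approaches work.

Here I chose the second option and only replaced hive-exec-1.1.0-cdh5.7.0.jar in the existing installation.

Upload it to Hive's lib directory:

/mnt/software/hive-1.1.0-cdh5.7.0/lib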
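
Before restarting Hive, it can be worth confirming that the rebuilt jar really contains the new class. The following is a minimal sketch using only the JDK's java.util.jar API. The class name JarClassCheckSketch and its methods are hypothetical, and the default jar path is an assumption assembled from the lib directory and jar name mentioned above; pass a different path as the first argument if yours differs.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Checks whether a jar contains the .class entry for a given class name.
final class JarClassCheckSketch {

    // Converts a fully qualified class name to its entry path inside a jar.
    static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true if the jar at jarPath has an entry for className.
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getJarEntry(entryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Default path is an assumption based on the directory and jar name in the text.
        String jarPath = args.length > 0
                ? args[0]
                : "/mnt/software/hive-1.1.0-cdh5.7.0/lib/hive-exec-1.1.0-cdh5.7.0.jar";
        boolean found = containsClass(jarPath, "org.apache.hadoop.hive.ql.udf.hello");
        System.out.println(found ? "hello.class found" : "hello.class missing");
    }
}
```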

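6 Restart Hive

List the built-in functions:

hive> show functions;

hello now appears in the list, so it has been registered.

7 Test the function to confirm it works

For example, given the evaluate method above, SELECT hello('wxk'); should return Hello: wxk.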

Original article: https://www.cnblogs.com/QuestionsZhang/p/10420076.html