UDTF: one in, many out
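A UDTF (User-Defined Table-Generating Function) accepts one input and produces multiple outputs (multiple rows and/or columns). It is typically used for parsing work, for example parsing a URL and extracting the information it contains.
To write one, extend GenericUDTF and implement three methods: initialize, which describes the output (the names and types of the returned columns); process, which holds the actual per-row logic and usually calls the parent class's forward method to write out data; and close, which is called last and, like cleanup in an MR program, is where resources are released.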
A simple example that splits one column into two output columns: name --> name, email (where email is name + "@foxmail.com")
package com.hive.udtf;

import java.util.ArrayList;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

public class myudtf extends GenericUDTF {

    // Validates the arguments and declares the output schema: two string columns.
    @Override
    public StructObjectInspector initialize(StructObjectInspector argOIs) throws UDFArgumentException {
        if (argOIs.getAllStructFieldRefs().size() != 1) {
            throw new UDFArgumentException("myudtf takes exactly one argument");
        }
        // Output column names
        ArrayList<String> fieldname = new ArrayList<String>();
        fieldname.add("name");
        fieldname.add("email");
        // Output column types, in the same order as the names
        ArrayList<ObjectInspector> fieldoi = new ArrayList<ObjectInspector>();
        fieldoi.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        fieldoi.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        return ObjectInspectorFactory.getStandardStructObjectInspector(fieldname, fieldoi);
    }

    // Called once per input row; each forward() call writes one output row.
    @Override
    public void process(Object[] args) throws HiveException {
        if (args.length == 1) {
            String name = args[0].toString();
            String email = name + "@foxmail.com";
            super.forward(new String[]{name, email});
        }
    }

    // Called after the last input row; here it writes one extra marker row.
    @Override
    public void close() throws HiveException {
        super.forward(new String[]{"complete", "finish"});
    }
}
Test
hive (workdb)> add jar /home/liuwl/opt/datas/myudtf.jar;
hive (workdb)> create temporary function myudtf as 'com.hive.udtf.myudtf';
hive (workdb)> select myudtf(ename) as (name,email) from emp;

Result:

name      email
SMITH     SMITH@foxmail.com
ALLEN     ALLEN@foxmail.com
WARD      WARD@foxmail.com
JONES     JONES@foxmail.com
MARTIN    MARTIN@foxmail.com
BLAKE     BLAKE@foxmail.com
CLARK     CLARK@foxmail.com
SCOTT     SCOTT@foxmail.com
KING      KING@foxmail.com
TURNER    TURNER@foxmail.com
ADAMS     ADAMS@foxmail.com
JAMES     JAMES@foxmail.com
FORD      FORD@foxmail.com
MILLER    MILLER@foxmail.com
complete  finish

The last row (complete, finish) is the one written by close().
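The URL-parsing use case mentioned at the start can be sketched in a similar way. The following is only a minimal illustration in plain Java, with no Hive dependency, of the "one URL in, many rows out" logic that a process method might contain. The class name UrlQuerySplitter, the method split and the example URL are hypothetical and not part of Hive or of the example above; the sketch also assumes the URL is well formed and does not URL-decode the values.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Hive API): turns one URL into many (key, value) rows.
final class UrlQuerySplitter {

    // Returns one String[]{key, value} per query parameter,
    // or an empty list when the URL has no query string.
    // URI.create throws IllegalArgumentException for a malformed URL.
    static List<String[]> split(String url) {
        List<String[]> rows = new ArrayList<>();
        String query = URI.create(url).getRawQuery();
        if (query == null || query.isEmpty()) {
            return rows;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // skip empty segments such as "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            // A parameter without '=' gets an empty value.
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            rows.add(new String[] {key, value});
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed sample URL, for illustration only.
        for (String[] row : split("http://example.com/item?id=42&ref=home")) {
            System.out.println(row[0] + "\t" + row[1]);
        }
    }
}
```

Inside a GenericUDTF, process would presumably call forward once for each row returned by such a helper, with initialize declaring two string columns (for example key and value) in the same way as the name/email example.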