• Accessing Hadoop resources from PHP via Thrift


    PHP can connect to HBase through Thrift; in the same way, PHP can read Hadoop (HDFS) resources through Thrift.
    Prerequisites:
    PHP needs the Thrift library.
    Packages: hadoop-0.20.2\src\contrib\thriftfs\gen-php
    Source code:
    <?php
    	// ROOTPATH is expected to be defined by the application as its base directory
    	$GLOBALS['THRIFT_ROOT'] = ROOTPATH . '/lib/thrift';
    	require_once($GLOBALS['THRIFT_ROOT'].'/Thrift.php');
    	require_once($GLOBALS['THRIFT_ROOT'].'/transport/TSocket.php');
    	require_once($GLOBALS['THRIFT_ROOT'].'/transport/TBufferedTransport.php');
    	require_once($GLOBALS['THRIFT_ROOT'].'/protocol/TBinaryProtocol.php');
    	require_once($GLOBALS["THRIFT_ROOT"] . "/packages/hadoopfs/ThriftHadoopFileSystem.php");
    	$hadoop_socket = new TSocket("localhost", 59256);
    	$hadoop_socket -> setSendTimeout(10000); // Ten seconds
    	$hadoop_socket -> setRecvTimeout(20000); // Twenty seconds
    	$hadoop_transport = new TBufferedTransport($hadoop_socket);
    	$hadoop_protocol = new TBinaryProtocol($hadoop_transport);
    	$hadoopClient = new ThriftHadoopFileSystemClient($hadoop_protocol);
    	$hadoop_transport -> open();
    	try {
    		// create directory
    		$dirpathname = new hadoopfs_Pathname(array("pathname" => "/user/root/hadoop"));
    		if($hadoopClient -> exists($dirpathname) == TRUE) {
    			echo $dirpathname -> pathname . " exists.\n";
    		} else {
    			$result = $hadoopClient -> mkdirs($dirpathname);
    		}
    		// put file
    		$filepathname = new hadoopfs_Pathname(array("pathname" => $dirpathname -> pathname . "/hello.txt"));
    		$localfile = fopen("hello.txt", "rb");
    		if($localfile === false) {
    			throw new Exception("cannot open local file hello.txt");
    		}
    		$hdfsfile = $hadoopClient -> create($filepathname);
    		while(true) {
    			$data = fread($localfile, 1024);
    			if(strlen($data) == 0)
    				break;
    			$hadoopClient -> write($hdfsfile, $data);
    		}
    		$hadoopClient -> close($hdfsfile);
    		fclose($localfile);
    		// get file
    		echo "read file:\n";
    		print_r($filepathname);
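    		$offset = 0;
    		$hdfsfile = $hadoopClient -> open($filepathname);
    		print_r($hdfsfile);
    		while(true) {
    			// read() takes an absolute offset, so advance it after every chunk
    			$data = $hadoopClient -> read($hdfsfile, $offset, 1024);
    			if(strlen($data) == 0)
    				break;
    			print $data;
    			$offset += strlen($data);
    		}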
    		$hadoopClient -> close($hdfsfile);
    		echo "listStatus:\n";
    		$result = $hadoopClient -> listStatus($dirpathname);
    		print_r($result);
    		foreach($result as $key => $value) {
    			if($value -> isdir == "1")
    				print "dir\t";
    			else
    				print "file\t";
    			print $value -> block_replication . "\t" . $value -> length . "\t" . $value -> modification_time . "\t" . $value -> permission . "\t" . $value -> owner . "\t" . $value -> group . "\t" . $value -> path . "\n";
    		}
    	} catch(Exception $e) {
    		print_r($e);
    	}
    	// close the transport whether or not an exception was thrown
    	$hadoop_transport -> close();
    ?>
    

    Start the Hadoop Thrift server:
    hadoop-0.20.2\src\contrib\thriftfs\scripts\start_thrift_server.sh 59256

    problem one:
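    Files are created in the local file system instead of in HDFS.
    Cause:
    At startup the Thrift server loads only the default configuration files, and with those defaults the file system is the local one.
    Fix:
    Edit start_thrift_server.sh so that the Hadoop conf directory is on the classpath: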
    TOP=/usr/local/hadoop-0.20.2
    CLASSPATH=$CLASSPATH:$TOP/conf

    problem two:
    java.lang.NullPointerException
        at org.apache.hadoop.thriftfs.HadoopThriftServer$HadoopThriftHandler.write(HadoopThriftServer.java:282)
        at org.apache.hadoop.thriftfs.api.ThriftHadoopFileSystem$Processor$write.process(Unknown Source)
        at org.apache.hadoop.thriftfs.api.ThriftHadoopFileSystem$Processor.process(Unknown Source)
        at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
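    Cause:
    The handle id that the Java server returns (the key of its handle map) is a long. 32-bit PHP cannot store a 64-bit integer, so the id is converted to a float and loses precision. The id that PHP sends back therefore no longer matches the one the server issued, which is consistent with the lookup in write() finding nothing and failing with the NullPointerException.
    private long nextId = new Random().nextLong();
    Value returned by Java: 4207488029786584864
    Value received by PHP: 4.2074880297866E+18
    Value Java receives back from PHP: 4207488029786585088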
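    The rounding can be reproduced on the Java side alone. The following is a minimal sketch, assuming that the PHP client effectively holds the id as an IEEE 754 double; the class and method names are hypothetical and not part of Hadoop.

```java
// Sketch: shows what happens to a 64-bit id that passes through a double.
class HandleIdPrecisionDemo {
    // Simulates a client that can only hold the id as a double (assumption:
    // this is what 32-bit PHP does with a value too large for its integer type).
    static long roundTripThroughDouble(long id) {
        double asFloat = (double) id; // rounded to the nearest representable double
        return (long) asFloat;        // what the server would get back
    }

    public static void main(String[] args) {
        long sent = 4207488029786584864L; // the id quoted in the article
        long received = roundTripThroughDouble(sent);
        System.out.println("sent:     " + sent);
        System.out.println("received: " + received);
        System.out.println("equal:    " + (sent == received));
    }
}
```

    A double carries 53 significant bits, so ids of this magnitude are rounded to a multiple of 512; smaller ids would survive the trip unchanged, which is why the failure depends on the random starting value.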
    Fix:
    Edit hadoop-0.20.2\src\contrib\thriftfs\if\hadoopfs.thrift.
    Change
    struct ThriftHandle {
      i64 id
    }
    to:
    struct ThriftHandle {
      string id
    }
    Regenerate the PHP packages:
    thrift --gen php hadoopfs.thrift
    Modify the org.apache.hadoop.thriftfs.api.ThriftHandle class.
    Change
    public long id;
    to:
    public String id;
    Update the code that uses it in
    org.apache.hadoop.thriftfs.HadoopThriftServer
    Change
    long id = insert(out);
    ThriftHandle obj = new ThriftHandle(id);
    to:
    long id = insert(out);
    String _id = String.valueOf(id);
    ThriftHandle obj = new ThriftHandle(_id);
    Update the other code that uses the handle id in the same way.
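    Once the id travels as a string, every place that receives a handle has to parse it back to a long before using it as a map key. The sketch below illustrates that idea with a hypothetical stand-in for the server's handle table; the class name, the use of Object in place of the real stream types, and the fixed starting id are all assumptions made for illustration, not the actual HadoopThriftServer code.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a stand-in for the server's handle table.
class StringHandleTable {
    private final Map<Long, Object> streams = new HashMap<>();
    // Fixed here so the example is repeatable; the article's server seeds it randomly.
    private long nextId = 4207488029786584864L;

    // Stores a stream and returns its id as a decimal string for the wire.
    synchronized String insert(Object stream) {
        long id = nextId++;
        streams.put(id, stream);
        return String.valueOf(id);
    }

    // Parses the string id sent by the client; returns null if the id is unknown.
    synchronized Object lookup(String id) {
        return streams.get(Long.parseLong(id));
    }

    // Removes and returns the stream registered under the given string id.
    synchronized Object remove(String id) {
        return streams.remove(Long.parseLong(id));
    }

    public static void main(String[] args) {
        StringHandleTable table = new StringHandleTable();
        Object stream = new Object();
        String id = table.insert(stream);
        System.out.println("id on the wire: " + id);
        System.out.println("lookup succeeds: " + (table.lookup(id) == stream));
    }
}
```

    Because the client only echoes the string back, it never has to represent the number itself, so the integer width of the PHP build no longer matters.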
    Repackage, then start the Hadoop Thrift server again:
    hadoop-0.20.2\src\contrib\thriftfs\scripts\start_thrift_server.sh 59256

    PHP can now connect to Hadoop and access its resources.

  • Original article: https://www.cnblogs.com/ooooo/p/2235992.html