Building a distributed big data platform on Hadoop is a common approach. Hadoop grew out of the design of Google's GFS (Google File System) and is an open-source implementation of it. This post collects reference materials for learning big data technology, especially those useful when setting up a Hadoop environment. If I find better references in later study, I will keep updating the list.
References:
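1. "Principles and Applications of Big Data Technology: Concepts, Storage, Processing, Analysis, and Applications" (《大数据技术原理与应用—概念、存储、处理、分析与应用》)
(Lin Ziyu (林子雨), ed., Posts & Telecom Press (人民邮电出版社), 2nd edition, February 2017)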
2. Hadoop: The Definitive Guide, Tom White, 4th Edition, 2015.4.
(http://vdisk.weibo.com/s/u5ntMYF7_5pe)
3. https://www.tutorialspoint.com/hadoop/index.htm
(Introduces the basic concepts of big data and focuses mainly on setting up the Hadoop environment, covered in detail.)
"The Apache Software Foundation is a cornerstone of the modern Open Source software ecosystem, supporting some of the most widely used and important software solutions powering today's Internet economy." - Mark Driver, Research Vice President, Gartner
The Apache project list covers most of the software used in big data technology, for example Hadoop, Spark, Mahout, ZooKeeper, Sqoop, Pig, Hive, HBase, Flume, and so on. You can download the release you need and then learn to install the software by following its guide. Being able to do this is a basic requirement for studying big data technology.