《HBase 权威指南》学习笔记一引言

《HBase 权威指南》学习笔记一引言
解决数据库多写问题，同事推荐使用hbase，并做了HBase培训，也看到老大tim参会说淘宝用hbase替代部分mysql核心应用，学习研究下看是否适用

分布式计算的谬论.：

1 The network is reliable.
2 Latency is zero.
3 Bandwidth is infinite.
4 The network is secure.
5 Topology doesn't change.
6 There is one administrator.
7 Transport cost is zero.
8 The network is homogeneous.

下载版本0.92.1 889个文件 285749 行java代码（find . -name '*.java'|wc -l）

《HBase 权威指南》目录摘要：
1. hbase演进
  
  November 2006
  Google releases paper on BigTable
  February 2007
  Initial HBase prototype created as Hadoop contrib§
  October 2007
  First “usable” HBase (Hadoop 0.15.0)
  January 2008
  Hadoop becomes an Apache top-level project, HBase becomes subproject
  October 2008
  HBase 0.18.1 released
  January 2009
  HBase 0.19.0 released
  September 2009
  HBase 0.20.0 released, the performance release
  May 2010
  HBase becomes an Apache top-level project
  June 2010
  HBase 0.89.20100621, first developer release
  January 2011
  HBase 0.90.0 released, the durability and stability release
  Mid 2011
  HBase 0.92.0 released, tagged as coprocessor and security release
2. rdbms的局限性
  举例“Hush, the HBase URL Shortener”这个应用，随访问量增大要加slave，加cache，只能做简单查询，考虑读写的不断优化和扩展，分表分库，在应用层面改程序，做sharding，买好的硬件，以及随后的不尽噩梦。
3. HBase的面向column的表
  
  the most basic unit is a column. One or more columns form a
  row that is addressed uniquely by a row key. A number of rows, in turn, form a table,
  and there can be many of them. Each column may have multiple versions, with each
  distinct value contained in a separate cell.
  (Table, RowKey, Family, Column, Timestamp) → Value 可在编程语言中表达为：
  
  SortedMap<RowKey, List<SortedMap<Column, List<Value, Timestamp>>>> （p19）
  相同rowkey会有不同时间戳的数据，对应不同的版本,数据存储在HFiles中，索引保存在内存中，默认64KB，HFiles又被保存在Hadoop Distributed File System（hdfs）中，确保在跨服务器的数据写入不会丢失。索引存储在文件块的最后面.
4. HBase的anto-sharding
  
  region去管理监控做sharding。“Each region is served by exactly one region server, and each of these servers can serve
  
  many regions at any time"
5. 数据写入流程
  
  When data is updated it is first written to a commit log, called a write-ahead log (WAL)
  in HBase, and then stored in the in-memory memstore. Once the data in memory has
  exceeded a given maximum value, it is flushed as an HFile to disk. After the flush, the
  commit logs can be discarded up to the last unflushed modification. While the system
  is flushing the memstore to disk, it can continue to serve readers and writers without
  having to block them.
  
  Since flushing memstores to disk causes more and more HFiles to be created, HBase
  has a housekeeping mechanism that merges the files into larger ones using compaction.
  There are two types of compaction: minor compactions and major compactions.（p24）
6. HBase组成部分
  the client library, one master server, and many region servers.HBase master server 使用zookeeper管理region servers，负载均衡，去掉繁忙服务器。hbase相比google bigtable，增加了" push-down predicates, that is, filters,
  
  reducing data transferred over the network"
相关阅读:
WinFrom 第三方控件 TeleRik控件
 部署项目可以将源码删掉
 XDocument简单入门
 通过HttpWebRequest在后台对WebService进行调用
 C#中HttpWebRequest的用法详解
 HttpWebRequest类
 系统之间通讯方式—SOAP（web service）
Linux strace命令
 使用 Linux 的 strace 命令跟踪/调试程序的常用选项
 Linux umount的device is busy问题
原文地址：https://www.cnblogs.com/shenguanpu/p/2521259.html

《HBase 权威指南》学习笔记一 引言

《HBase 权威指南》学习笔记一引言