HBase Basics
-
1. HBase Table Design
In HBase, you will find two different types of tables: system tables and user
tables. System tables are used internally by HBase to keep track of meta information
such as the tables’ access control lists (ACLs), metadata for the tables and regions,
namespaces, and so on. There should be no need for you to look at those tables. User
tables are what you will create for your use cases. They will belong to the default
namespace unless you create and use a specific one.
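
To make the namespace rule concrete, here is a minimal sketch in plain Python (not the HBase client API). It assumes the usual `namespace:table` naming form, with `default` as the namespace for unqualified names and `hbase` as the namespace holding system tables such as `hbase:meta`. The function names are hypothetical and only for illustration.

```python
DEFAULT_NAMESPACE = "default"  # namespace used when a table name has no prefix
SYSTEM_NAMESPACE = "hbase"     # namespace that holds HBase's system tables


def split_table_name(full_name):
    """Split 'namespace:table' into (namespace, table)."""
    namespace, sep, table = full_name.partition(":")
    if not sep:
        # No namespace prefix: the table belongs to the default namespace.
        return DEFAULT_NAMESPACE, full_name
    return namespace, table


def is_system_table(full_name):
    """Return True when the table lives in the system namespace."""
    namespace, _ = split_table_name(full_name)
    return namespace == SYSTEM_NAMESPACE
```

For example, `split_table_name("customers")` yields `("default", "customers")`, while `is_system_table("hbase:meta")` is true.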
A concrete example:

https://www.tutorialspoint.com/hbase/hbase_create_data.htm

Only columns where there is a value are stored in the underlying filesystem.
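
The sparse-storage point can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not how HBase lays out data on disk; the `SparseTable` class and the row and column names are made up for the example.

```python
from collections import defaultdict


class SparseTable:
    """Toy table that keeps a cell only when a value was actually written."""

    def __init__(self):
        # row key -> {"family:qualifier": value}
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value):
        self._rows[row_key][column] = value

    def stored_cells(self):
        """Count the cells that physically exist in the toy table."""
        return sum(len(columns) for columns in self._rows.values())


table = SparseTable()
table.put("cust-001", "profile:name", "Ada")
table.put("cust-001", "profile:email", "ada@example.com")
table.put("cust-002", "profile:name", "Grace")  # no email: nothing is stored for it
```

Two rows and two possible columns would mean four slots in a fixed-schema table, but only the three written cells exist here: `table.stored_cells()` returns 3.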

Tables are split into regions, where each region stores a specific range
of data (a contiguous range of row keys). The regions are assigned to RegionServers, which serve each region’s content.
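
As a rough sketch of how a row key maps to a region, the lookup below uses a sorted list of region start keys. It assumes each region covers the range from its start key (inclusive) up to the next region's start key (exclusive). The region boundaries and server names are hypothetical, and strings stand in for the byte arrays that HBase actually compares.

```python
import bisect

# Hypothetical layout: (start_key, region_server), sorted by start key.
# The first region starts at the empty key, so it covers everything below "g".
REGIONS = [
    ("", "rs1.example.com"),
    ("g", "rs2.example.com"),
    ("p", "rs3.example.com"),
]
START_KEYS = [start for start, _ in REGIONS]


def locate_region(row_key):
    """Return the (start_key, server) pair of the region holding row_key."""
    # The owning region is the last one whose start key is <= row_key.
    index = bisect.bisect_right(START_KEYS, row_key) - 1
    return REGIONS[index]
```

With this layout, `locate_region("apple")` falls in the first region, `locate_region("g")` in the second (start keys are inclusive), and `locate_region("zebra")` in the third.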
A column family is an HBase-specific concept that you will not find in RDBMS applications. For the same region, different column families will store the data into different files and can be configured differently. Data with the same access pattern and the same format should be grouped into the same column family.

As an example regarding the format, if you need to store a lot of textual metadata information for customer profiles in addition to image files for each customer’s profile photo, you might want to store them into two different column families: one compressed (where all the textual information will be stored), and one not compressed (where the image files will be stored).

As an example regarding the access pattern, if some information is mostly read and almost never written, and some is mostly written and almost never read, you might want to separate them into two different column families. If the different columns you want to store have a similar format and access pattern, group them within the same column family.

Stores
We will find one store per column family. A store object groups one memstore and zero or more store files (called HFiles). This is the entity that will store all the information written into the table and will also be used when data needs to be read from the table.

HFiles
HFiles are created when the memstores are full and must be flushed to disk. HFiles are eventually compacted together over time into bigger files. They are the HBase file format used to store table data. HFiles are composed of different types of blocks (e.g., index blocks and data blocks). HFiles are stored in HDFS, so they benefit from Hadoop persistence and replication.

Blocks
HFiles are composed of blocks. Those blocks should not be confused with HDFS blocks. One HDFS block might contain multiple HFile blocks. HFile blocks are usually between 8 KB and 1 MB, and the default size is 64 KB. If compression is configured for a given table, HBase will still generate 64 KB blocks but will then compress them, so the size of the compressed block on disk varies with the data and the compression format. Larger blocks create fewer index entries and are good for sequential table access, while smaller blocks create more index entries and are better for random reads.
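
The trade-off between block size and index size can be estimated with simple arithmetic. The sketch below assumes one index entry per data block and blocks of exactly the configured size, which is a simplification (real HFiles have multi-level indexes and blocks that only approximate the target size). The 1 GiB file size is a hypothetical value.

```python
import math


def index_entries(hfile_bytes, block_bytes):
    """Approximate number of data blocks, and so index entries, in one HFile."""
    return math.ceil(hfile_bytes / block_bytes)


HFILE_SIZE = 1024 ** 3  # a hypothetical 1 GiB HFile

for block_kb in (8, 64, 1024):
    entries = index_entries(HFILE_SIZE, block_kb * 1024)
    print(f"{block_kb:>5} KB blocks -> {entries} index entries")
```

Under these assumptions, 8 KB blocks give 131072 entries, the default 64 KB gives 16384, and 1 MB blocks give 1024. Smaller blocks mean a finer-grained index and less data read per random lookup, at the cost of a larger index.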
Each row is stored in a specific format. Figure 2-4 represents the format of an individual HBase cell.

Node roles:

Master Server
• Region assignment
• Load balancing
• RegionServer recovery
• Region split completion monitoring
• Tracking active and dead servers

Unlike HBase RegionServers, the HBase Master doesn’t have much workload and can be installed on servers with less memory and fewer cores. Building HBase Masters (and other master services like NameNodes, ZooKeeper, etc.) on robust hardware with the OS on RAID drives, dual power supplies, etc. is highly recommended.

RegionServer
A RegionServer (RS) is the application hosting and serving the HBase regions and therefore the HBase data. Even if it is technically possible to host more than one RegionServer on a physical host, it is recommended to run only one server per host and to give it the resources you would otherwise have shared between the two servers.

Table (HBase table)
Region (Regions for the table)
Store (Store per ColumnFamily for each Region for the table)
MemStore (MemStore for each Store for each Region for the table)
StoreFile (StoreFiles for each Store for each Region for the table)
Block (Blocks within a StoreFile within a Store for each Region for the table)
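
The hierarchy above can be sketched as a toy model in plain Python: a region holds one store per column family, and each store holds one memstore plus a list of store files. This is an illustration only, not HBase's implementation. The flush threshold counted in cells is a hypothetical stand-in (HBase flushes based on memstore size in bytes), and the block layer is left out.

```python
class Store:
    """One store per column family: one memstore plus zero or more store files."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical limit, in cells
        self.memstore = {}     # recent writes, held in memory
        self.store_files = []  # flushed, sorted, immutable files (oldest first)

    def put(self, row_key, value):
        self.memstore[row_key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the memstore out as a new sorted store file and empty it."""
        if self.memstore:
            self.store_files.append(sorted(self.memstore.items()))
            self.memstore = {}

    def get(self, row_key):
        """Read from the memstore first, then from the newest store file back."""
        if row_key in self.memstore:
            return self.memstore[row_key]
        for store_file in reversed(self.store_files):
            for key, value in store_file:
                if key == row_key:
                    return value
        return None

    def compact(self):
        """Merge all store files into one; newer values overwrite older ones."""
        merged = {}
        for store_file in self.store_files:
            merged.update(store_file)
        self.store_files = [sorted(merged.items())] if merged else []


class Region:
    """A region holds one Store per column family."""

    def __init__(self, families):
        self.stores = {family: Store() for family in families}


region = Region(["profile", "photo"])
for i in range(4):
    region.stores["profile"].put(f"row-{i}", f"value-{i}")
```

After the four writes, the `profile` store has flushed once: it holds one store file with three rows and one row still in the memstore, while the untouched `photo` store has no files at all.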


