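I. Entering the HBase Shell Client
First run the start-dfs.sh script in a Linux shell terminal to start HDFS, then run the start-hbase.sh script to start HBase. If the HBase environment variables are configured on the Linux system, you can run the hbase shell command from any directory to enter the HBase Shell command-line environment; exit leaves the HBase Shell. (I am running HBase in pseudo-distributed mode.)
(1) help: show help (or use help 'command name' to see how a specific command is used)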
hbase:055:0> help
HBase Shell, version 2.5.6, r6bac842797dc26bedb7adc0759358e4c8fd5a992, Sat Oct 14 23:36:46 PDT 2023
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
COMMAND GROUPS:
Group name: general
Commands: processlist, status, table_help, version, whoami
Group name: ddl
Commands: alter, alter_async, alter_status, clone_table_schema, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, list_regions, locate_region, show_filters
Group name: namespace
Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables
Group name: dml
Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve
II. General Commands
(1) status: show the HBase cluster status
hbase:051:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 5.0000 average load
Took 2.3419 seconds
(2) version: show HBase version information
hbase:052:0> version
2.5.6, r6bac842797dc26bedb7adc0759358e4c8fd5a992, Sat Oct 14 23:36:46 PDT 2023
Took 0.0029 seconds
(3) whoami: show the system user currently logged in to HBase
hbase:053:0> whoami
root (auth:SIMPLE)
groups: root
Took 0.1906 seconds
(4) table_help: show help for HBase table operations
table_help
Help for table-reference commands.
You can either create a table via 'create' and then manipulate the table via commands like 'put', 'get', etc.
See the standard help information for how to use each of these commands.
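III. Namespace Operations
A namespace is HBase's logical grouping of tables, used to divide tables by business area. For example, an upper-layer application can place the tables of different business lines in different namespaces to isolate their data from one another. Namespaces and tables have a one-to-many relationship: a namespace can contain many tables, but a table belongs to exactly one namespace. A namespace in HBase is similar to a database in MySQL.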
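The one-to-many relationship can be sketched in a few lines of Python. This is only a toy illustration, not HBase code: the helper names split_table_name and group_by_namespace are hypothetical. It assumes the 'namespace:table' naming form used in this article, with unprefixed names falling into the default namespace.

```python
from typing import Dict, List, Tuple

DEFAULT_NAMESPACE = "default"  # namespace used when a table name has no prefix


def split_table_name(name: str) -> Tuple[str, str]:
    """Split 'ns:table' into (namespace, table); bare names go to 'default'."""
    namespace, sep, table = name.partition(":")
    if not sep:
        return DEFAULT_NAMESPACE, name
    return namespace, table


def group_by_namespace(names: List[str]) -> Dict[str, List[str]]:
    """Group table names by namespace: one namespace maps to many tables."""
    groups: Dict[str, List[str]] = {}
    for name in names:
        namespace, table = split_table_name(name)
        groups.setdefault(namespace, []).append(table)
    return groups
```

For example, group_by_namespace(["students", "ns1:t1"]) places students under default and t1 under ns1, and each table appears under exactly one namespace.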
(1) list_namespace: list all namespaces
hbase:036:0> list_namespace
NAMESPACE
default
hbase
ns1
3 row(s)
Took 0.0654 seconds
(2) list_namespace_tables: list the tables in a given namespace
hbase:038:0> list_namespace_tables 'ns1'
TABLE
t1
1 row(s)
Took 0.1958 seconds
=> ["t1"]
(3) create_namespace: create a namespace
hbase:039:0> create_namespace 'ns2'
Took 0.4516 seconds
(4) describe_namespace: show the description of a namespace
hbase:041:0> describe_namespace 'ns2'
DESCRIPTION
{NAME => 'ns2'}
(5) alter_namespace: modify the properties of a namespace
hbase:045:0> alter_namespace 'ns2',{METHOD=>'set','name'=>'new'}
Took 1.6665 seconds
hbase:046:0> describe_namespace 'ns2'
DESCRIPTION
{NAME => 'ns2', name => 'new'}
Quota is disabled
Took 0.2023 seconds
hbase:047:0> alter_namespace 'ns2',{METHOD=>'unset','NAME'=>'name'}
Took 0.1602 seconds
hbase:048:0> describe_namespace 'ns2'
DESCRIPTION
{NAME => 'ns2'}
Quota is disabled
Took 0.0878 seconds
(6) drop_namespace: delete a namespace
hbase:049:0> drop_namespace 'ns2'
Took 0.4194 seconds
hbase:050:0> list_namespace
NAMESPACE
default
hbase
ns1
3 row(s)
Took 0.0610 seconds
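IV. DDL Commands
The DDL group contains many commands, used mainly to manage HBase tables: creating, altering, dropping, listing, enabling, and disabling tables.
(1) create: create a table (at least one column family name must be specified when creating a table)
In the default namespace, create the table students with one column family named info, using the default column family attributes.
# To keep the default column family settings, just give the column family name when creating the table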
hbase:070:0> create 'students','info'
2024-03-21 00:25:59,140 INFO [main] client.HBaseAdmin (HBaseAdmin.java:postOperationResult(3591)) - Operation: CREATE, Table Name: default:students, procId: 131 completed
Created table students
Took 3.4314 seconds
=> Hbase::Table - students
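In the ns1 namespace, create the table t1 with one column family named f1 and custom column family attributes.
# Specify column family attributes while creating the table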
hbase:071:0> create 'ns1:t1',{NAME=>'f1',VERSIONS=>5}
2024-03-21 00:28:43,872 INFO [main] client.HBaseAdmin (HBaseAdmin.java:postOperationResult(3591)) - Operation: CREATE, Table Name: ns1:t1, procId: 134 completed
Created table ns1:t1
Took 2.3465 seconds
=> Hbase::Table - ns1:t1
(2) list: list all tables
hbase:072:0> list
TABLE
students
ns1:t1
2 row(s)
Took 0.2459 seconds
=> ["students", "ns1:t1"]
(3) describe/desc: show the table schema
hbase:074:0> describe 'students'
Table students is ENABLED
students, {TABLE_ATTRIBUTES => {METADATA => {'hbase.store.file-tracker.impl' => 'DEFAULT'}}}
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', INDEX_BLOCK_ENCODING => 'NONE', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE'
, DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0
', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLO
CKSIZE => '65536 B (64KB)'}
1 row(s)
Quota is disabled
Took 1.1379 seconds
(4) exists: check whether the specified table name exists
hbase:075:0> exists 'ns1:t1'
Table ns1:t1 does exist
Took 0.0714 seconds
=> true
(5) alter: modify a table by adding, changing, or deleting column families
Add a column family score to the students table.
hbase:076:0> alter 'students','score'
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 3.8994 seconds
hbase:077:0> desc 'students'
Table students is ENABLED
students, {TABLE_ATTRIBUTES => {METADATA => {'hbase.store.file-tracker.impl' => 'DEFAULT'}}}
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', INDEX_BLOCK_ENCODING => 'NONE', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE'
, DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0
', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLO
CKSIZE => '65536 B (64KB)'}
{NAME => 'score', INDEX_BLOCK_ENCODING => 'NONE', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE
', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '
0', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BL
OCKSIZE => '65536 B (64KB)'}
2 row(s)
Quota is disabled
Took 0.0753 seconds
Change the VERSIONS attribute of the score column family of the students table to 5.
hbase:078:0> alter 'students',NAME=>'score',VERSIONS=>5
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 1.9776 seconds
Delete the score column family from the students table.
hbase:079:0> alter 'students',NAME=>'score',METHOD=>'delete'
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.5517 seconds
hbase:080:0> desc 'students'
Table students is ENABLED
students, {TABLE_ATTRIBUTES => {METADATA => {'hbase.store.file-tracker.impl' => 'DEFAULT'}}}
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', INDEX_BLOCK_ENCODING => 'NONE', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE'
, DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0
', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLO
CKSIZE => '65536 B (64KB)'}
1 row(s)
Quota is disabled
Took 0.2625 seconds
(6) disable: disable a table
hbase:083:0> disable 'students'
2024-03-21 00:52:29,298 INFO [main] client.HBaseAdmin (HBaseAdmin.java:rpcCall(926)) - Started disable of students
2024-03-21 00:52:29,810 INFO [main] client.HBaseAdmin (HBaseAdmin.java:postOperationResult(3591)) - Operation: DISABLE, Table Name: default:students, procId: 158 completed
Took 0.5354 seconds
(7) enable: enable a table
hbase:084:0> enable 'students'
2024-03-21 00:52:51,239 INFO [main] client.HBaseAdmin (HBaseAdmin.java:rpcCall(866)) - Started enable of students
2024-03-21 00:52:51,882 INFO [main] client.HBaseAdmin (HBaseAdmin.java:postOperationResult(3591)) - Operation: ENABLE, Table Name: default:students, procId: 161 completed
Took 0.6636 seconds
(8) drop: delete a table (the table must be disabled first, then dropped)
hbase:085:0> disable 'ns1:t1'
2024-03-21 00:53:34,360 INFO [main] client.HBaseAdmin (HBaseAdmin.java:rpcCall(926)) - Started disable of ns1:t1
2024-03-21 00:53:34,727 INFO [main] client.HBaseAdmin (HBaseAdmin.java:postOperationResult(3591)) - Operation: DISABLE, Table Name: ns1:t1, procId: 164 completed
Took 0.3910 seconds
hbase:086:0> drop 'ns1:t1'
2024-03-21 00:53:43,300 INFO [main] client.HBaseAdmin (HBaseAdmin.java:postOperationResult(3591)) - Operation: DELETE, Table Name: ns1:t1, procId: 167 completed
Took 0.7044 seconds
hbase:087:0> list
TABLE
students
1 row(s)
Took 0.0301 seconds
=> ["students"]
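The disable-before-drop rule can be pictured as a small state machine. The Python sketch below is a toy model for illustration only. The ToyAdmin class and its methods are hypothetical and are not part of any HBase client API. It only encodes the rule stated above: a table starts out enabled and cannot be dropped until it has been disabled.

```python
class ToyAdmin:
    """Toy model of the table lifecycle: create -> disable -> drop."""

    def __init__(self):
        self._enabled = {}  # table name -> True if the table is enabled

    def _require(self, name):
        # Every operation except create needs an existing table.
        if name not in self._enabled:
            raise KeyError(f"table {name} does not exist")

    def create(self, name):
        if name in self._enabled:
            raise ValueError(f"table {name} already exists")
        self._enabled[name] = True  # new tables start out enabled

    def disable(self, name):
        self._require(name)
        self._enabled[name] = False

    def enable(self, name):
        self._require(name)
        self._enabled[name] = True

    def drop(self, name):
        self._require(name)
        if self._enabled[name]:
            # Mirrors the rule above: an enabled table cannot be dropped.
            raise RuntimeError(f"table {name} is enabled; disable it first")
        del self._enabled[name]

    def list(self):
        return sorted(self._enabled)
```

Calling drop on an enabled table raises an error in this model, while disable followed by drop removes the table from list(), which matches the sequence shown in the transcript.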
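V. DML Commands
The DML group contains many commands, used mainly to operate on the data in a table: full-table scans, reading a single row, writing data, and deleting data.
(1) put: insert data (a single put command writes one cell; it cannot insert several at once)
Add 2 cells to the info column family of the students table:
row key s001, column qualifier name, cell value Jack; row key s001, column qualifier age, cell value 20;
Add 2 cells to the score column family of the students table (the score column family was deleted in the alter example above, so it must be added back first, for example with alter 'students','score'):
row key s001, column qualifier Chinese, cell value 90; row key s001, column qualifier Math, cell value 85;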
hbase:088:0> put 'students','s001','info:name','Jack'
Took 2.2121 seconds
hbase:092:0> put 'students','s001','info:age','20'
Took 0.0543 seconds
hbase:096:0> put 'students','s001','score:Chinese','90'
Took 0.0380 seconds
hbase:097:0> put 'students','s001','score:Math','85'
Took 0.0148 seconds
Update a cell value in the students table: for row key s001, column qualifier name, change the cell value from 'Jack' to 'Mike'.
hbase:089:0> put 'students','s001','info:name','Mike'
Took 0.2027 seconds
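The reason a second put acts as an update is that a cell is addressed by its row key and its 'family:qualifier' column, so writing to the same coordinates replaces what a plain read returns. A minimal Python sketch of that addressing follows, as a toy model under stated assumptions: the ToyTable class and split_column helper are hypothetical, and the model keeps only the newest value per cell and ignores timestamps.

```python
def split_column(spec):
    """Split 'family:qualifier' into (family, qualifier)."""
    family, sep, qualifier = spec.partition(":")
    if not sep:
        raise ValueError(f"expected 'family:qualifier', got {spec!r}")
    return family, qualifier


class ToyTable:
    """Toy table: row key -> {'family:qualifier': newest value}."""

    def __init__(self, families):
        self.families = set(families)  # column families fixed by the schema
        self.rows = {}

    def put(self, row, column, value):
        family, _ = split_column(column)
        if family not in self.families:
            # Qualifiers are free-form, but the family must already exist.
            raise KeyError(f"unknown column family: {family}")
        # Same row and column: the new value replaces the old one.
        self.rows.setdefault(row, {})[column] = value

    def get(self, row):
        # Return the row's cells ordered by column name.
        return dict(sorted(self.rows.get(row, {}).items()))
```

In this model, putting 'Jack' and then 'Mike' into ('s001', 'info:name') leaves a single cell holding 'Mike', and a put into a family that is not in the schema fails.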
(2) get: query data
Query all data for row key s001 in the students table.
hbase:098:0> get 'students','s001'
COLUMN CELL
info:age timestamp=2024-03-21T01:11:05.881, value=20
info:name timestamp=2024-03-21T01:04:05.318, value=Mike
score:Chinese timestamp=2024-03-21T01:16:34.006, value=90
score:Math timestamp=2024-03-21T01:16:48.854, value=85
1 row(s)
Took 0.1583 seconds
Query the cell value for row key s001 and column score:Chinese.
hbase:103:0> get 'students','s001','score:Chinese'
COLUMN CELL
score:Chinese timestamp=2024-03-21T01:16:34.006, value=90
1 row(s)
Took 0.0142 seconds
Query student Mike's multiple math scores. (Make sure the VERSIONS attribute of the score column family is greater than 1.)
hbase:108:0> alter 'students',NAME=>'score',VERSIONS=>5
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 3.6026 seconds
1) Write several math scores for student Mike
hbase:100:0> put 'students','s001','score:Math','92'
Took 0.0871 seconds
hbase:101:0> put 'students','s001','score:Math','100'
Took 0.0158 seconds
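2) Read student Mike's most recent math score (by default, get returns only the most recently written cell value, that is, the one with the largest timestamp).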
hbase:102:0> get 'students','s001','score:Math'
COLUMN CELL
score:Math timestamp=2024-03-21T01:23:13.668, value=100
1 row(s)
Took 0.0771 seconds
3) Query student Mike's 3 math scores.
hbase:111:0> get 'students','s001',{COLUMN=>'score:Math',VERSIONS=>3}
COLUMN CELL
score:Math timestamp=2024-03-21T01:32:55.629, value=85
score:Math timestamp=2024-03-21T01:32:35.808, value=92
score:Math timestamp=2024-03-21T01:23:13.668, value=100
1 row(s)
Took 0.2027 seconds
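The interaction between the VERSIONS attribute and get can be sketched with a toy multi-version cell in Python. This is an illustration under stated assumptions, not HBase's storage logic: the VersionedCell class is hypothetical, timestamps are plain integers, and old versions are trimmed eagerly on every write, which real HBase may handle differently internally.

```python
class VersionedCell:
    """Toy cell that keeps at most max_versions values, newest first."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self._versions = []  # list of (timestamp, value), newest first

    def put(self, value, timestamp):
        self._versions.append((timestamp, value))
        # Order by timestamp, largest (newest) first.
        self._versions.sort(key=lambda tv: tv[0], reverse=True)
        # Drop anything beyond the configured number of versions.
        del self._versions[self.max_versions:]

    def get(self, versions=1):
        # A default read returns only the newest value; asking for more
        # returns up to that many values, newest first.
        return [value for _, value in self._versions[:versions]]
```

With max_versions=5, three writes followed by get(versions=3) return all three values, newest first, while a plain get() returns only the newest. With the default max_versions=1, older values are not kept at all, which is why the column family is altered to VERSIONS=>5 before the extra scores are written.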
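(3) scan: scan data (reads the latest-timestamp version of every column in every column family of every row)
1) For a more useful demonstration, first write several more rows to the students table.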
hbase:112:0> put 'students','s002','info:name','Tom'
hbase:113:0> put 'students','s002','info:age','19'
hbase:114:0> put 'students','s002','score:Chinese','87'
hbase:115:0> put 'students','s002','score:Math','70'
hbase:116:0> put 'students','s003','info:name','Lucy'
hbase:117:0> put 'students','s003','info:age','18'
hbase:122:0> put 'students','s003','score:Chinese','80'
hbase:123:0> put 'students','s003','score:Math','90'
2) Run a full-table scan on the students table.
hbase:124:0> scan 'students'
ROW COLUMN+CELL
s001 column=info:age, timestamp=2024-03-21T01:11:05.881, value=20
s001 column=info:name, timestamp=2024-03-21T01:04:05.318, value=Mike
s001 column=score:Chinese, timestamp=2024-03-21T01:16:34.006, value=90
s001 column=score:Math, timestamp=2024-03-21T01:32:55.629, value=85
s002 column=info:Chinese, timestamp=2024-03-21T01:43:05.067, value=89
s002 column=info:Math, timestamp=2024-03-21T01:43:22.209, value=95
s002 column=info:age, timestamp=2024-03-21T01:42:45.619, value=19
s002 column=info:name, timestamp=2024-03-21T01:42:33.917, value=Tom
s003 column=info:age, timestamp=2024-03-21T01:44:16.318, value=18
s003 column=info:name, timestamp=2024-03-21T01:43:45.397, value=Lucy
s003 column=score:Chinese, timestamp=2024-03-21T01:45:40.078, value=80
s003 column=score:Math, timestamp=2024-03-21T01:45:53.743, value=90
3 row(s)
Took 0.2514 seconds
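3) Scan a given row-key range of the students table.
# If only the start row key STARTROW is given, the scan runs to the end of the table, including both STARTROW and the last row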
hbase:135:0> scan 'students',{STARTROW=>'s002'}
ROW COLUMN+CELL
s002 column=info:age, timestamp=2024-03-21T01:42:45.619, value=19
s002 column=info:name, timestamp=2024-03-21T01:42:33.917, value=Tom
s002 column=score:Chinese, timestamp=2024-03-21T01:48:44.123, value=87
s002 column=score:Math, timestamp=2024-03-21T01:48:59.578, value=70
s003 column=info:age, timestamp=2024-03-21T01:44:16.318, value=18
s003 column=info:name, timestamp=2024-03-21T01:43:45.397, value=Lucy
s003 column=score:Chinese, timestamp=2024-03-21T01:45:40.078, value=80
s003 column=score:Math, timestamp=2024-03-21T01:45:53.743, value=90
2 row(s)
Took 0.6646 seconds
# If both STARTROW and STOPROW are given, the range includes STARTROW but excludes STOPROW
hbase:136:0> scan 'students',{STARTROW=>'s002',STOPROW=>'s003'}
ROW COLUMN+CELL
s002 column=info:age, timestamp=2024-03-21T01:42:45.619, value=19
s002 column=info:name, timestamp=2024-03-21T01:42:33.917, value=Tom
s002 column=score:Chinese, timestamp=2024-03-21T01:48:44.123, value=87
s002 column=score:Math, timestamp=2024-03-21T01:48:59.578, value=70
1 row(s)
Took 0.0566 seconds
4) Scan the info column family of the students table.
hbase:137:0> scan 'students',{COLUMNS=>['info']}
ROW COLUMN+CELL
s001 column=info:age, timestamp=2024-03-21T01:11:05.881, value=20
s001 column=info:name, timestamp=2024-03-21T01:04:05.318, value=Mike
s002 column=info:age, timestamp=2024-03-21T01:42:45.619, value=19
s002 column=info:name, timestamp=2024-03-21T01:42:33.917, value=Tom
s003 column=info:age, timestamp=2024-03-21T01:44:16.318, value=18
s003 column=info:name, timestamp=2024-03-21T01:43:45.397, value=Lucy
3 row(s)
Took 0.2538 seconds
5) Scan the name column under the info column family of the students table.
hbase:138:0> scan 'students',{COLUMNS=>['info:name']}
ROW COLUMN+CELL
s001 column=info:name, timestamp=2024-03-21T01:04:05.318, value=Mike
s002 column=info:name, timestamp=2024-03-21T01:42:33.917, value=Tom
s003 column=info:name, timestamp=2024-03-21T01:43:45.397, value=Lucy
3 row(s)
Took 0.0447 seconds
6) Scan the first 2 rows of the students table.
hbase:139:0> scan 'students',{LIMIT=>2}
ROW COLUMN+CELL
s001 column=info:age, timestamp=2024-03-21T01:11:05.881, value=20
s001 column=info:name, timestamp=2024-03-21T01:04:05.318, value=Mike
s001 column=score:Chinese, timestamp=2024-03-21T01:16:34.006, value=90
s001 column=score:Math, timestamp=2024-03-21T01:32:55.629, value=85
s002 column=info:age, timestamp=2024-03-21T01:42:45.619, value=19
s002 column=info:name, timestamp=2024-03-21T01:42:33.917, value=Tom
s002 column=score:Chinese, timestamp=2024-03-21T01:48:44.123, value=87
s002 column=score:Math, timestamp=2024-03-21T01:48:59.578, value=70
2 row(s)
Took 0.0328 seconds
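The scan options used above (STARTROW, STOPROW, COLUMNS, LIMIT) can be summarized in a short Python sketch. It is a toy model, not HBase's scanner: the scan function and its parameters are hypothetical names chosen to mirror the shell options, and rows are held in a plain dict keyed by ASCII strings, whose sort order is assumed to match the row-key order of the table.

```python
from bisect import bisect_left


def _matches(column, columns):
    """True if 'family:qualifier' matches a whole-family or exact-column spec."""
    if columns is None:
        return True
    family = column.split(":", 1)[0]
    return any(spec == column or spec == family for spec in columns)


def scan(rows, startrow=None, stoprow=None, columns=None, limit=None):
    """Scan rows in key order over the half-open range [startrow, stoprow)."""
    keys = sorted(rows)
    lo = bisect_left(keys, startrow) if startrow is not None else 0
    # bisect_left places stoprow itself outside the slice (exclusive bound).
    hi = bisect_left(keys, stoprow) if stoprow is not None else len(keys)
    result = []
    for key in keys[lo:hi]:
        cells = {c: v for c, v in rows[key].items() if _matches(c, columns)}
        if not cells:
            continue  # a row with no matching columns is not returned
        result.append((key, cells))
        if limit is not None and len(result) >= limit:
            break  # LIMIT counts rows, not cells
    return result
```

With rows s001 to s003 loaded, scan(rows, startrow="s002", stoprow="s003") returns only s002, and scan(rows, limit=2) returns two rows with all of their cells, in line with the row counts reported in the transcripts.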
(4) count: count the rows in a table
hbase:141:0> count 'students'
3 row(s)
Took 1.1445 seconds
=> 3
(5) delete: delete a specified cell
Delete a specified cell from the students table (the cell with column qualifier Math under column family score in row s003).
hbase:143:0> delete 'students','s003','score:Math'
Took 0.0677 seconds
hbase:144:0> get 'students','s003'
COLUMN CELL
info:age timestamp=2024-03-21T01:44:16.318, value=18
info:name timestamp=2024-03-21T01:43:45.397, value=Lucy
score:Chinese timestamp=2024-03-21T01:45:40.078, value=80
1 row(s)
Took 0.1472 seconds
(6) deleteall: delete a specified row
1) Delete the row with row key s003 from the students table.
hbase:145:0> deleteall 'students','s003'
Took 0.1017 seconds
hbase:146:0> get 'students','s003'
COLUMN CELL
0 row(s)
Took 0.0270 seconds
2) Delete all data in column score:Chinese for row key s002 in the students table.
hbase:148:0> deleteall 'students','s002','score:Chinese'
Took 0.0182 seconds
hbase:149:0> get 'students','s002'
COLUMN CELL
info:age timestamp=2024-03-21T01:42:45.619, value=19
info:name timestamp=2024-03-21T01:42:33.917, value=Tom
score:Math timestamp=2024-03-21T01:48:59.578, value=70
1 row(s)
Took 0.0348 seconds
(7) append: append content to the cell of a specified column
For row key s002, append son to the value Tom in column info:name, giving Tomson.
hbase:007:0> append 'students','s002','info:name','son'
CURRENT VALUE = Tomson
Took 0.9325 seconds
hbase:008:0> get 'students','s002'
COLUMN CELL
info:age timestamp=2024-03-21T01:42:45.619, value=19
info:name timestamp=2024-03-21T06:16:33.838, value=Tomson
score:Math timestamp=2024-03-21T01:48:59.578, value=70
1 row(s)
Took 0.4914 seconds
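The effect of append on a cell value is byte concatenation, which a few lines of Python can illustrate. This is a toy sketch: the append function here is a hypothetical helper operating on a plain dict, and treating a missing cell as empty is an assumption of the model.

```python
def append(row_cells, column, suffix):
    """Append bytes to a cell value and return the new value."""
    current = row_cells.get(column, b"")  # assumption: missing cell acts as empty
    row_cells[column] = current + suffix
    return row_cells[column]
```

Starting from {"info:name": b"Tom"}, append(cells, "info:name", b"son") yields b"Tomson", the same value the shell reports as CURRENT VALUE.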
(8) truncate: clear all data in a table
Clear the contents of the students table.
hbase:009:0> truncate 'students'
Truncating 'students' table (it may take a while):
Disabling table...
2024-03-21 06:17:52,730 INFO [main] client.HBaseAdmin (HBaseAdmin.java:rpcCall(926)) - Started disable of students
2024-03-21 06:17:57,843 INFO [main] client.HBaseAdmin (HBaseAdmin.java:postOperationResult(3591)) - Operation: DISABLE, Table Name: default:students, procId: 194 completed
Truncating table...
2024-03-21 06:17:57,898 INFO [main] client.HBaseAdmin (HBaseAdmin.java:rpcCall(806)) - Started truncating students
2024-03-21 06:18:02,439 INFO [main] client.HBaseAdmin (HBaseAdmin.java:postOperationResult(3591)) - Operation: TRUNCATE, Table Name: default:students, procId: 197 completed
Took 10.4274 seconds
hbase:010:0> scan 'students'
ROW COLUMN+CELL
0 row(s)
Took 6.5341 seconds