Broken pipe errors when accessing HBase with happybase: two "earth-shattering" bugs
Published: 2025-12-02 Author: 千家信息网 editors
千家信息网, last updated 2025-12-02
When reading from and writing to HBase through the Thrift interface with happybase, a Broken pipe error appears. Troubleshooting steps:
1. Check the HBase logs:
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
17/05/12 18:08:41 INFO util.VersionInfo: HBase 1.2.0-cdh6.10.1
17/05/12 18:08:41 INFO util.VersionInfo: Source code repository file:///data/jenkins/workspace/generic-package-centos64-7-0/topdir/BUILD/hbase-1.2.0-cdh6.10.1 revision=Unknown
17/05/12 18:08:41 INFO util.VersionInfo: Compiled by jenkins on Mon Mar 20 02:46:09 PDT 2017
17/05/12 18:08:41 INFO util.VersionInfo: From source with checksum c6d9864e1358df7e7f39d39a40338b4e
17/05/12 18:08:41 INFO thrift.ThriftServerRunner: Using default thrift server type
17/05/12 18:08:41 INFO thrift.ThriftServerRunner: Using thrift server type threadpool
17/05/12 18:08:42 WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
17/05/12 18:08:42 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
17/05/12 18:08:42 INFO impl.MetricsSystemImpl: HBase metrics system started
17/05/12 18:08:42 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
17/05/12 18:08:42 INFO http.HttpRequestLog: Http request log for http.requests.thrift is not defined
17/05/12 18:08:42 INFO http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter)
17/05/12 18:08:42 INFO http.HttpServer: Added global filter 'clickjackingprevention' (class=org.apache.hadoop.hbase.http.ClickjackingPreventionFilter)
17/05/12 18:08:42 INFO http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context thrift
17/05/12 18:08:42 INFO http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
17/05/12 18:08:42 INFO http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
17/05/12 18:08:42 INFO http.HttpServer: Jetty bound to port 9095
17/05/12 18:08:42 INFO mortbay.log: jetty-6.1.26.cloudera.4
17/05/12 18:08:42 WARN mortbay.log: Can't reuse /tmp/Jetty_0_0_0_0_9095_thrift____.vqpz9l, using /tmp/Jetty_0_0_0_0_9095_thrift____.vqpz9l_5120175032480185058
17/05/12 18:08:43 INFO mortbay.log: Started SelectChannelConnector@0.0.0.0:9095
17/05/12 18:08:43 INFO thrift.ThriftServerRunner: starting TBoundedThreadPoolServer on /0.0.0.0:9090 with readTimeout 300000ms; min worker threads=128, max worker threads=1000, max queued requests=1000
.../05/08 15:05:51 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x645132bf connecting to ZooKeeper ensemble=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181
17/05/08 15:05:51 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181 sessionTimeout=60000 watcher=hconnection-0x64513-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181, baseZNode=/hbase
17/05/08 15:05:51 INFO zookeeper.ClientCnxn: Opening socket connection to server cdh-slave3/192.168.10.219:2181. Will not attempt to authenticate using SASL (unknown error)
17/05/08 15:05:51 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.10.23:43170, server: cdh-slave3/192.168.10.219:2181
17/05/08 15:05:51 INFO zookeeper.ClientCnxn: Session establishment complete on server cdh-slave3/192.168.10.219:2181, sessionid = 0x35bd74a77802148, negotiated timeout = 60000
[caitinggui@cdh-master-slave1 example]$ 17/05/08 15:32:50 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x35bd74a77802148
17/05/08 15:32:51 INFO zookeeper.ZooKeeper: Session: 0x35bd74a77802148 closed
17/05/08 15:32:51 INFO zookeeper.ClientCnxn: EventThread shut down
17/05/08 15:38:53 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0xb876351 connecting to ZooKeeper ensemble=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181
17/05/08 15:38:53 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181 sessionTimeout=60000 watcher=hconnection-0xb8763510x0, quorum=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181, baseZNode=/hbase
17/05/08 15:38:53 INFO zookeeper.ClientCnxn: Opening socket connection to server cdh-master-slave1/192.168.10.23:2181. Will not attempt to authenticate using SASL (unknown error)
17/05/08 15:38:53 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.10.23:35526, server: cdh-master-slave1/192.168.10.23:2181
17/05/08 15:38:53 INFO zookeeper.ClientCnxn: Session establishment complete on server cdh-master-slave1/192.168.10.23:2181, sessionid = 0x15ba3ddc6cc90d4, negotiated timeout = 60000
Initial inference: HBase has some timeout configured, and that timeout is what drops the connection.
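The suspected mechanism (a server-side read timeout closing an idle connection, after which the client's next writes fail) can be reproduced without HBase at all. The following is a minimal sketch using only Python's standard `socket` module. `serve_once` and `idle_client_breaks` are hypothetical names made up for this illustration, and the timings are arbitrary. It is not how the HBase Thrift server is implemented.

```python
import socket
import threading
import time


def serve_once(server_sock, read_timeout):
    """Accept one client and drop it if no bytes arrive within read_timeout seconds."""
    conn, _ = server_sock.accept()
    conn.settimeout(read_timeout)
    try:
        while conn.recv(1024):
            pass  # keep reading while the client stays active
    except socket.timeout:
        pass  # client was idle too long: fall through and close
    finally:
        conn.close()


def idle_client_breaks(read_timeout=0.2, idle=0.6):
    """Return the name of the connection error seen after idling, or None."""
    server = socket.socket()
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    worker = threading.Thread(
        target=serve_once, args=(server, read_timeout), daemon=True
    )
    worker.start()
    client = socket.create_connection(server.getsockname())
    try:
        client.sendall(b"ping")
        time.sleep(idle)  # stay silent longer than the server's read timeout
        for _ in range(20):
            client.sendall(b"ping")  # first write may succeed, later ones fail
            time.sleep(0.05)
    except ConnectionError as exc:
        return type(exc).__name__
    finally:
        client.close()
        server.close()
    return None


if __name__ == "__main__":
    print(idle_client_breaks())
```

The exact exception (broken pipe or connection reset) depends on the operating system, which is why the sketch catches the common base class `ConnectionError`.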
2. Check the official documentation. No obviously relevant timeout parameter turned up.
3. Google similar problems
A similar report:
HBASE-14926: Hung ThriftServer; no timeout on read from client; if client crashes, worker thread gets stuck reading
Details
Type: Bug
Status: RESOLVED
Priority: Major
Resolution: Fixed
Affects Version/s: 2.0.0, 1.2.0, 1.1.2, 1.3.0, 1.0.3, 0.98.16
Fix Version/s: 2.0.0, 1.2.0, 1.3.0, 0.98.17
Component/s: Thrift
Labels: None
Hadoop Flags: Reviewed
Release Note: Adds a timeout to server read from clients. Adds new configs hbase.thrift.server.socket.read.timeout for setting read timeout on server socket in milliseconds. Default is 60000;
Description
Thrift server is hung. All worker threads are doing this:
"thrift-worker-0" daemon prio=10 tid=0x00007f0bb95c2800 nid=0xf6a7 runnable [0x00007f0b956e0000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:152)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    - locked <0x000000066d859490> (a java.io.BufferedInputStream)
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
    at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:601)
    at org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:470)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
    at org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)
    at org.apache.hadoop.hbase.thrift.CallQueue$Call.run(CallQueue.java:64)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
They never recover.
I don't have client side logs.
We've been here before: HBASE-4967 "connected client thrift sockets should have a server side read timeout" but this patch only got applied to fb branch (and thrift has changed since then)
PS: source https://issues.apache.org/jira/browse/HBASE-14926
4. Google "hbase.thrift.server.socket.read.timeout"
One of the results reads:
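Background
The test environment is a distributed Hadoop setup on three servers. Versions: hadoop-2.7.3; Hbase-1.2.4; zookeeper-3.4.9. Data is written to HBase through the Thrift C++ interface. Every run starts out writing normally and begins to report errors after a while. The previously used hbase-0.94.27 never showed this problem with the same configuration and had always worked fine.
Fixing the Thrift interface error
A packet capture shows that the HBase server answered with an RST packet, which broke the connection. Starting the server with bin/hbase thrift start -threadpool shows that readTimeout is set to 60s. Testing confirmed that the problem is indeed tied to this setting. The option had never been configured; reading the code shows that 60s is the default, used whenever the option is absent. Adding the setting to conf/hbase-site.xml is therefore enough (the value is in milliseconds):
    <property>
      <name>hbase.thrift.server.socket.read.timeout</name>
      <value>6000000</value>
    </property>
PS: source http://blog.csdn.net/wwlhz/article/details/56012053
So I added the parameter and restarted HBase Thrift, and the problem went away.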
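To check which value a given hbase-site.xml actually carries for this key, a small standard-library sketch along these lines may help. The XML string is illustrative and assumes the usual Hadoop-style `<configuration>`/`<property>` layout. `get_property` is a hypothetical helper written for this example, not part of HBase or happybase.

```python
import xml.etree.ElementTree as ET

READ_TIMEOUT_KEY = "hbase.thrift.server.socket.read.timeout"

# Illustrative content only; a real hbase-site.xml holds many more properties.
SAMPLE_SITE_XML = """
<configuration>
  <property>
    <name>hbase.thrift.server.socket.read.timeout</name>
    <value>6000000</value>
  </property>
</configuration>
"""


def get_property(xml_text, name, default=None):
    """Return the <value> text of the property called `name`, or `default`."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else default
    return default


if __name__ == "__main__":
    # Falls back to the documented default of 60000 ms when the key is absent.
    print(get_property(SAMPLE_SITE_XML, READ_TIMEOUT_KEY, default="60000"))
```

For a file on disk, the same function can be fed the result of `open(path).read()`, where `path` points at the cluster's own configuration file.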
5. Read the source code, which shows:
// https://github.com/apache/hbase/blob/master/hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java
...
  public static final String THRIFT_SERVER_SOCKET_READ_TIMEOUT_KEY = "hbase.thrift.server.socket.read.timeout";
  public static final int THRIFT_SERVER_SOCKET_READ_TIMEOUT_DEFAULT = 60000;
...
  // Read the timeout from the configuration, falling back to the 60000 ms default
  int readTimeout = conf.getInt(THRIFT_SERVER_SOCKET_READ_TIMEOUT_KEY, THRIFT_SERVER_SOCKET_READ_TIMEOUT_DEFAULT);
  TServerTransport serverTransport = new TServerSocket(
      new TServerSocket.ServerSocketTransportArgs().
          bindAddr(new InetSocketAddress(listenAddress, listenPort)).
          backlog(backlog).
          clientTimeout(readTimeout));
Problem solved~~~
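Since the read timeout is given in milliseconds, it is easy to misjudge how long a value really is. A quick conversion sketch follows; `describe_ms` is a hypothetical helper, and the three inputs are the default from the source code, the value from the quoted post, and the readTimeout printed in the startup log above.

```python
from datetime import timedelta


def describe_ms(value_ms):
    """Render a millisecond count as H:MM:SS for easier reading."""
    return str(timedelta(milliseconds=value_ms))


if __name__ == "__main__":
    for label, value in [
        ("source default", 60000),
        ("quoted post", 6000000),
        ("startup log", 300000),
    ]:
        print(label, value, "ms =", describe_ms(value))
```

So the default is one minute, the value from the quoted post is one hour and forty minutes, and the value in the startup log is five minutes.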
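6. But was the problem really solved?
In fact there was still a problem. After a while I noticed that a continuous scan was disconnected again after a little over 20 minutes. Another painful round of searching showed this to be an issue in this HBase version itself: it treats every connection as idle by default, whether or not it is in use, and it has a setting named hbase.thrift.connection.max-idletime. I therefore set this option to 31104000, intended as one year (31104000 seconds is 360 days; confirm which unit your version expects, because the same number read as milliseconds is only about 8.6 hours). On CDH it should be set in the management page (see the figure).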
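Independently of the server-side settings, a client can guard against a dropped connection by reconnecting and retrying. The sketch below is deliberately library-agnostic. `call_with_reconnect` is a hypothetical helper, not a happybase API. With happybase, `connect` could be something like `lambda: happybase.Connection("thrift-host")` (the host name is made up) and `operation` something like `lambda c: c.table("t").row(b"k")`. Which exception types a dropped Thrift connection raises depends on the happybase and Thrift versions, so `retry_on` is a parameter rather than a fixed list.

```python
import time


def call_with_reconnect(connect, operation, retries=2, delay=0.0,
                        retry_on=(ConnectionError,)):
    """Run operation(conn); if the connection was dropped, reconnect and retry.

    connect   -- zero-argument callable returning a fresh connection
    operation -- callable taking the connection and returning a result
    retries   -- number of extra attempts after the first failure
    """
    conn = connect()
    for attempt in range(retries + 1):
        try:
            return operation(conn)
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the original error
            time.sleep(delay)
            conn = connect()  # the old transport is dead, open a new one


if __name__ == "__main__":
    state = {"connects": 0}

    class FakeConnection:
        """Stand-in connection whose first instance fails like a broken pipe."""

        def __init__(self, broken):
            self.broken = broken

        def get(self):
            if self.broken:
                raise BrokenPipeError("Broken pipe")
            return "row"

    def connect():
        state["connects"] += 1
        return FakeConnection(broken=state["connects"] == 1)

    print(call_with_reconnect(connect, lambda c: c.get()))
```

A retry like this suits short, idempotent calls. A long scan interrupted halfway would start over from the beginning unless the caller tracks the last row key seen and resumes from there.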

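General steps when you run into a problem:
If the goal is to grow technically:
1. Read the logs, find where the error is reported, and get a first fix on the problem
2. Read the official documentation
3. Google similar problems, or read the source code to pin the problem down
If the goal is to fix it fast:
1. Read the logs, find where the error is reported, and get a first fix on the problem
2. Google similar problems
3. Read the official documentation, or read the source code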