hdfs - Only one datanode can run in a multinode Hadoop setup
I am trying to set up a multinode Hadoop cluster, currently with two nodes: one namenode/datanode (host A) and a second datanode (host B). The strange thing is that only one datanode can be running at a time, either host A or host B. If I remove host B from the conf/slaves file and keep only host A, the system uses host A as the datanode. If I put both host A and host B in conf/slaves, only host B shows up as a datanode in the system.
The following is the log from host A when it does not work:
2013-07-31 10:18:16,074 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = a.mydomain.com/192.168.1.129
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.4
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct 3 05:13:58 UTC 2012
************************************************************/
2013-07-31 10:18:16,317 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-07-31 10:18:16,334 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2013-07-31 10:18:16,335 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-07-31 10:18:16,335 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2013-07-31 10:18:16,470 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2013-07-31 10:18:16,842 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
2013-07-31 10:18:16,855 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at 50010
2013-07-31 10:18:16,858 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
2013-07-31 10:18:16,932 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4JLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2013-07-31 10:18:17,038 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2013-07-31 10:18:17,053 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dfs.webhdfs.enabled = false
2013-07-31 10:18:17,054 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50075
2013-07-31 10:18:17,054 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50075 webServer.getConnectors()[0].getLocalPort() returned 50075
2013-07-31 10:18:17,054 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
2013-07-31 10:18:17,054 INFO org.mortbay.log: jetty-6.1.26
2013-07-31 10:18:17,437 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50075
2013-07-31 10:18:17,444 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2013-07-31 10:18:17,446 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source DataNode registered.
2013-07-31 10:18:17,786 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
2013-07-31 10:18:17,790 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort50020 registered.
2013-07-31 10:18:17,791 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort50020 registered.
2013-07-31 10:18:17,794 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(a.mydomain.com:50010, storageID=DS-1991287861-192.168.1.129-50010-1373314691613, infoPort=50075, ipcPort=50020)
2013-07-31 10:18:17,817 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting asynchronous block report scan
2013-07-31 10:18:17,820 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.1.129:50010, storageID=DS-1991287861-192.168.1.129-50010-1373314691613, infoPort=50075, ipcPort=50020) In DataNode.run, data = FSDataset{dirpath='/disk2/clustering/support/hdfs/data/current'}
2013-07-31 10:18:17,824 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2013-07-31 10:18:17,825 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2013-07-31 10:18:17,827 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: starting
2013-07-31 10:18:17,827 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
2013-07-31 10:18:17,829 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: starting
2013-07-31 10:18:17,830 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: starting
2013-07-31 10:18:17,831 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic block scanner.
2013-07-31 10:18:17,831 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Finished asynchronous block report scan in 14ms
2013-07-31 10:18:17,845 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Generated rough (lockless) block report in 12 ms
2013-07-31 10:18:17,848 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Reconciled asynchronous block report against current state in 2 ms
2013-07-31 10:18:20,828 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Reconciled asynchronous block report against current state in 0 ms
2013-07-31 10:18:20,838 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is shutting down: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.UnregisteredDatanodeException: Data node 192.168.1.129:50010 is attempting to report storage ID DS-1991287861-192.168.1.129-50010-1373314691613. Node 192.168.1.128:50010 is expected to serve this storage.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDatanode(FSNamesystem.java:4608)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processReport(FSNamesystem.java:3460)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReport(NameNode.java:1001)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
    at org.apache.hadoop.ipc.Client.call(Client.java:1070)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at $Proxy5.blockReport(Unknown Source)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:958)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
    at java.lang.Thread.run(Thread.java:662)
2013-07-31 10:18:20,839 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:50075
2013-07-31 10:18:20,942 INFO org.apache.hadoop.ipc.Server: Stopping server on 50020
2013-07-31 10:18:20,943 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: exiting
2013-07-31 10:18:20,944 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: exiting
2013-07-31 10:18:20,944 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 50020
2013-07-31 10:18:20,943 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: exiting
2013-07-31 10:18:20,945 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
2013-07-31 10:18:20,945 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2013-07-31 10:18:20,945 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Waiting for threadgroup to exit, active threads is 1
2013-07-31 10:18:20,945 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.1.129:50010, storageID=DS-1991287861-192.168.1.129-50010-1373314691613, infoPort=50075, ipcPort=50020):DataXceiveServer:java.nio.channels.AsynchronousCloseException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:185)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:157)
    at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:131)
    at java.lang.Thread.run(Thread.java:662)
2013-07-31 10:18:20,945 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting DataXceiveServer
2013-07-31 10:18:21,905 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Exiting DataBlockScanner thread.
2013-07-31 10:18:21,945 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Waiting for threadgroup to exit, active threads is 0
2013-07-31 10:18:22,047 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
2013-07-31 10:18:22,048 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
2013-07-31 10:18:22,048 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.1.129:50010, storageID=DS-1991287861-192.168.1.129-50010-1373314691613, infoPort=50075, ipcPort=50020):Finishing DataNode in: FSDataset{dirpath='/disk2/clustering/support/hdfs/data/current'}
2013-07-31 10:18:22,050 WARN org.apache.hadoop.metrics2.util.MBeans: Hadoop:service=DataNode,name=DataNodeInfo
javax.management.InstanceNotFoundException: Hadoop:service=DataNode,name=DataNodeInfo
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1094)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:415)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:403)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:506)
    at org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:71)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.unRegisterMXBean(DataNode.java:522)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:737)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1471)
    at java.lang.Thread.run(Thread.java:662)
2013-07-31 10:18:22,051 INFO org.apache.hadoop.ipc.Server: Stopping server on 50020
2013-07-31 10:18:22,051 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
2013-07-31 10:18:22,051 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Waiting for threadgroup to exit, active threads is 0
2013-07-31 10:18:22,051 WARN org.apache.hadoop.metrics2.util.MBeans: Hadoop:service=DataNode,name=FSDatasetState-DS-1991287861-192.168.1.129-50010-1373314691613
javax.management.InstanceNotFoundException: Hadoop:service=DataNode,name=FSDatasetState-DS-1991287861-192.168.1.129-50010-1373314691613
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1094)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:415)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:403)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:506)
    at org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:71)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.shutdown(FSDataset.java:2067)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:799)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1471)
    at java.lang.Thread.run(Thread.java:662)
2013-07-31 10:18:22,052 WARN org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: AsyncDiskService has already shut down.
2013-07-31 10:18:22,052 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2013-07-31 10:18:22,055 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at a.mydomain.com/192.168.1.129
I would appreciate any insights. Thanks.
Edit: the configuration files are as follows:
core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://a.mydomain.com:9000</value>
  </property>
</configuration>

hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/disk2/clustering/support/hdfs/data</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/disk2/clustering/support/hdfs/name</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/disk2/clustering/support/hdfs/tmp</value>
  </property>
</configuration>

slaves (on host A only; host B keeps the default localhost)

a.mydomain.com
b.mydomain.com

masters (on host A only; host B keeps the default localhost)

a.mydomain.com
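As a quick sanity check, the hdfs-site.xml above can be parsed with a few lines of Python. This is only a sketch (not part of the original setup): it confirms the file is well-formed XML and that dfs.data.dir points where expected, the property that turns out to matter for this problem.

```python
# Parse the hdfs-site.xml from the question and collect its properties.
import xml.etree.ElementTree as ET

HDFS_SITE = """\
<configuration>
  <property><name>dfs.replication</name><value>2</value></property>
  <property><name>dfs.data.dir</name><value>/disk2/clustering/support/hdfs/data</value></property>
  <property><name>dfs.name.dir</name><value>/disk2/clustering/support/hdfs/name</value></property>
  <property><name>hadoop.tmp.dir</name><value>/disk2/clustering/support/hdfs/tmp</value></property>
</configuration>
"""

# Map each <property> element to a name -> value entry.
props = {p.findtext("name"): p.findtext("value")
         for p in ET.fromstring(HDFS_SITE).findall("property")}
print(props["dfs.data.dir"])  # /disk2/clustering/support/hdfs/data
```

If the XML were malformed, ET.fromstring would raise a ParseError here, whereas Hadoop itself can fail much less loudly.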
Update: I added a host C to the cluster and made host A serve only as the namenode (not namenode/datanode). The problem remains the same: only one host can run a datanode. Any ideas? Many thanks.
I solved the problem. Reading through the error message

Node 192.168.1.128:50010 is expected to serve this storage

I found that when I copied the setup from one server to another, I had also copied the local HDFS data directory (dfs.data.dir). That created the conflict: both datanodes reported the same storage ID, so the namenode only accepted one of them. Once I cleaned out the contents of the local dfs.data.dir on the second node, its datanode started without issue.
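A small sketch of the underlying conflict: Hadoop records a storage ID in dfs.data.dir/current/VERSION when a datanode first formats its storage, so copying dfs.data.dir between machines duplicates that ID and the namenode rejects the second node's block reports. The directory layout and VERSION contents below are fabricated for illustration, but the storageID key is what a real VERSION file carries.

```python
# Simulate two datanodes whose dfs.data.dir was copied from one another
# and show that they end up reporting the same storage ID.
import os
import tempfile

def storage_id(version_path):
    """Return the storageID value from a datanode VERSION file."""
    with open(version_path) as f:
        for line in f:
            if line.startswith("storageID="):
                return line.strip().split("=", 1)[1]
    return None

root = tempfile.mkdtemp()
ids = {}
for node in ("host_a", "host_b"):
    # Each node's dfs.data.dir/current/VERSION -- identical because copied.
    current = os.path.join(root, node, "current")
    os.makedirs(current)
    with open(os.path.join(current, "VERSION"), "w") as f:
        f.write("storageID=DS-1991287861-192.168.1.129-50010-1373314691613\n")
    ids[node] = storage_id(os.path.join(current, "VERSION"))

if ids["host_a"] == ids["host_b"]:
    # This is the situation behind UnregisteredDatanodeException:
    # wiping dfs.data.dir on one node lets it generate a fresh ID on restart.
    print("duplicate storage ID:", ids["host_a"])
```

Comparing the storageID line of each node's VERSION file is a quick way to confirm this diagnosis on a real cluster before deleting anything.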