java - Import data from MapReduce to HBase (TableOutputFormat error)
I am trying to save data from a MapReduce job into HBase. I wrote a script that worked well on an older version of Hadoop (CDH3u4). After I upgraded to the newest version (CDH 5.0.2), the script stopped working.
When I run the program on the newest version, I get the following error:
Exception in thread "main" java.lang.RuntimeException: java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:211)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:455)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1292)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1313)
    at com.nrholding.t0_mr.main.eloghbaseimport.main(eloghbaseimport.java:89)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:389)
    at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:366)
    at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:247)
    at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:188)
    at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:150)
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:206)
    ... 17 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:387)
    ... 22 more
Caused by: java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:195)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479)
    at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
    at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:801)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:633)
    ... 27 more
Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 33 more
It seems the problem is in TableOutputFormat. I have checked that:
- The proper libraries are on the library path.
- The ZooKeeper quorum and ZooKeeper port are set in hbase-site.xml.
- The table wp_json exists in HBase.
Here is the code that causes the problem:
    public static void main(String args[]) throws Exception {
        Configuration conf = new Configuration();
        conf.set("hbase.zookeeper.quorum", "zookeeper_server1,zookeeper_server2,zookeeper_server3");
        conf.set(TableOutputFormat.OUTPUT_TABLE, "wp_json");

        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        String input = otherArgs[0];

        Job job = Job.getInstance(conf, "eloghbaseimport");

        // Input text files in HDFS
        FileInputFormat.addInputPath(job, new Path(input));

        job.setJarByClass(eloghbaseimport.class);
        job.setMapperClass(map.class);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(TableOutputFormat.class);

        job.waitForCompletion(true);
    }
When I use NullOutputFormat, the job runs fine, but nothing is written to HBase.
The part of TableOutputFormat responsible for the error is here:
163   /**
164    * Returns the output committer.
165    *
166    * @param context  The current context.
167    * @return The committer.
168    * @throws IOException When creating the committer fails.
169    * @throws InterruptedException When the job is aborted.
170    * @see org.apache.hadoop.mapreduce.OutputFormat#getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext)
171    */
172   @Override
173   public OutputCommitter getOutputCommitter(TaskAttemptContext context)
174   throws IOException, InterruptedException {
175     return new TableOutputCommitter();
176   }
177
178   public Configuration getConf() {
179     return conf;
180   }
181
182   @Override
183   public void setConf(Configuration otherConf) {
184     this.conf = HBaseConfiguration.create(otherConf);
185
186     String tableName = this.conf.get(OUTPUT_TABLE);
187     if(tableName == null || tableName.length() <= 0) {
188       throw new IllegalArgumentException("Must specify table name");
189     }
190
191     String address = this.conf.get(QUORUM_ADDRESS);
192     int zkClientPort = this.conf.getInt(QUORUM_PORT, 0);
193     String serverClass = this.conf.get(REGION_SERVER_CLASS);
194     String serverImpl = this.conf.get(REGION_SERVER_IMPL);
195
196     try {
197       if (address != null) {
198         ZKUtil.applyClusterKeyToConf(this.conf, address);
199       }
200       if (serverClass != null) {
201         this.conf.set(HConstants.REGION_SERVER_IMPL, serverImpl);
202       }
203       if (zkClientPort != 0) {
204         this.conf.setInt(HConstants.ZOOKEEPER_CLIENT_PORT, zkClientPort);
205       }
206       this.table = new HTable(this.conf, tableName);
207       this.table.setAutoFlush(false, true);
208       LOG.info("Created table instance for " + tableName);
209     } catch(IOException e) {
210       LOG.error(e);
211       throw new RuntimeException(e);
212     }
213   }
The error is caused by the last exception in the trace:
Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
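In a deeply nested trace like this one, the innermost "Caused by" entry is usually the one that matters. As a small illustration (not part of the original job), the chain of causes can be walked programmatically to surface it. The class name RootCause and the method rootCauseOf are hypothetical names chosen for this sketch, and the exception chain in main is simulated rather than taken from a real run.

```java
import java.io.IOException;

public class RootCause {

    /** Follows getCause() links until the innermost exception is reached. */
    public static Throwable rootCauseOf(Throwable t) {
        Throwable current = t;
        // Stop when there is no further cause (or a cause that points to itself).
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Simulated chain that mimics the nesting seen in the trace above.
        Throwable chain = new RuntimeException(
                new IOException(
                        new ClassNotFoundException("org.cloudera.htrace.Trace")));

        // Prints the innermost exception of the simulated chain.
        System.out.println(rootCauseOf(chain));
    }
}
```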
Most likely a JAR is missing from the classpath. The class mentioned above may be referenced indirectly by your code (here, through the HBase client). Try putting the JAR that contains this class on the classpath.
Hope this helps!
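One way to confirm this before changing anything is to ask the JVM whether it can load the class and, if so, where it comes from. The following is a minimal sketch using only the JDK; ClasspathCheck and locate are hypothetical names introduced here, and the check is only meaningful when it runs with the same classpath as the failing job.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    /**
     * Returns the location (JAR or directory) a class is loaded from,
     * or null if the class cannot be loaded with the current classpath.
     */
    public static String locate(String className) {
        try {
            // 'false' avoids running static initializers; we only want to find the class.
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK have no code source.
            return src == null ? "(JDK runtime)" : src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace.
        String name = args.length > 0 ? args[0] : "org.cloudera.htrace.Trace";
        String where = locate(name);
        System.out.println(where == null
                ? name + " is NOT on the classpath"
                : name + " found in " + where);
    }
}
```

If the class is reported as missing, the JAR that provides it has to be added to the classpath used to submit the job. On CDH installations this class is typically shipped in an htrace-core JAR alongside the HBase libraries, but that location is an assumption and should be verified on the cluster in question.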