Data in HBase is not structured as it should be - Twitter Flume
Hello users, greetings!
I have installed Flume on Cloudera 4.6, and I am trying to collect tweets from Twitter.
I created an HDFS sink and an HBase sink, and I am gathering tweets, but the data in HBase is not structured.
Because the data is not structured, I can't run queries on it with Impala.
I created the table tweets with these column families: {NAME => 'tweet'}, {NAME => 'retweet'}, {NAME => 'entities'}, {NAME => 'user'}
Here is my Flume configuration: http://pastebin.com/4b5d3r8q
I am following this tutorial, but I don't know about the serializer:
https://github.com/aronmacdonald/twitter_hbase_impala Do I have to build a jar?
This is what I have in HBase: http://pastebin.com/angbsvb7 in the column tweets...
I recompiled flume-sources-1.0-SNAPSHOT.jar from the git repository https://github.com/cloudera/cdh-twitter-example and used it; there is no problem when using 'TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource'
Install Maven, then download the cdh-twitter-example repository.
Unzip it, then run the following inside it (as mentioned):
$ cd flume-sources
$ mvn package
$ cd ..
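After the build, it can help to confirm that the freshly built jar is really there before pointing Flume at it. The following is only a minimal sketch: the function name find_built_jar is hypothetical, and it assumes that Maven writes the artifact to flume-sources/target/ under the name flume-sources-1.0-SNAPSHOT.jar, which is the usual Maven layout but is not confirmed above.

```shell
# Hypothetical helper (not part of cdh-twitter-example): print the path of
# the jar produced by "mvn package", or fail with a message on stderr.
# Assumption: Maven places the artifact in flume-sources/target/.
find_built_jar() {
    repo_dir="$1"
    jar="$repo_dir/flume-sources/target/flume-sources-1.0-SNAPSHOT.jar"
    if [ -f "$jar" ]; then
        # The jar exists: print its path so the caller can copy it.
        printf '%s\n' "$jar"
    else
        echo "jar not found: $jar (did 'mvn package' succeed?)" >&2
        return 1
    fi
}
```

For example, find_built_jar ./cdh-twitter-example prints the jar path if the build succeeded; that file is the one to put on Flume's classpath in place of the prebuilt jar.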
This problem appeared when the twitter4j version was updated from 2.2.6 to 3.x, which removed the method setIncludeEntities, and the jar is not up to date.
PS: do not download the prebuilt version, it is still the old one.