jar - Data in HBase is not structured as it should be - Twitter Flume


Greetings, users!

I have installed Flume on Cloudera 4.6 and am trying to collect tweets from Twitter.

I created an HDFS sink and an HBase sink, and I am gathering tweets, but the data in HBase is not structured.

Because the data is not structured, I can't run queries on it with Impala.

I created the table with: create 'tweets', {NAME => 'tweet'}, {NAME => 'retweet'}, {NAME => 'entities'}, {NAME => 'user'}

My Flume configuration is here: http://pastebin.com/4b5d3r8q

I am following a tutorial, but I don't know about the serializer.

https://github.com/aronmacdonald/twitter_hbase_impala Do I have to build the jar?

This is what I have in HBase: http://pastebin.com/angbsvb7 (in the column tweets...)
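
To make "structured" concrete, here is a minimal Python sketch (not the tutorial's serializer) of how one tweet could be spread across the four column families of the table above. It assumes each Flume event body is a single tweet as JSON text. The names flatten and tweet_to_cells, and the rule that picks a column family, are hypothetical and only for illustration.

```python
import json

# Hypothetical rule: these top-level tweet keys get their own column family;
# every other key is stored under the 'tweet' family.
NESTED_FAMILIES = {
    "user": "user",
    "entities": "entities",
    "retweeted_status": "retweet",
}


def flatten(value, prefix=""):
    """Yield (dotted_key, scalar) pairs for a nested JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}{key}.")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}{index}.")
    else:
        # Drop the trailing dot that was added while descending.
        yield prefix[:-1], value


def tweet_to_cells(raw_json):
    """Map one tweet (JSON text) to a dict of 'family:qualifier' -> value."""
    tweet = json.loads(raw_json)
    cells = {}
    for key, value in tweet.items():
        family = NESTED_FAMILIES.get(key, "tweet")
        # For nested objects with their own family, the family name already
        # says where the data came from, so the top-level key is not repeated.
        if family != "tweet" and isinstance(value, dict):
            prefix = ""
        else:
            prefix = f"{key}."
        for qualifier, scalar in flatten(value, prefix):
            cells[f"{family}:{qualifier}"] = scalar
    return cells


# Made-up sample tweet using a few standard Twitter JSON field names.
sample = json.dumps({
    "id": 1,
    "text": "hello #flume",
    "user": {"screen_name": "alice", "followers_count": 10},
    "entities": {"hashtags": [{"text": "flume"}]},
})

for column, value in sorted(tweet_to_cells(sample).items()):
    print(column, "=", value)
```

With a layout like this, every field sits in its own column instead of the whole tweet landing in one cell, which is the shape a column-mapped query needs.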

I recompiled and used flume-sources-1.0-SNAPSHOT.jar from git (https://github.com/cloudera/cdh-twitter-example), and there is no problem when using 'TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource'.

Install Maven, then download the cdh-twitter-example repository.

Unzip it, then run the following inside it (as mentioned there):

$ cd flume-sources

$ mvn package

$ cd ..

This problem appeared when the twitter4j version was updated from 2.2.6 to 3.x, which removed the method setIncludeEntities, and the jar is not up to date.

PS: do not download the prebuilt version; it is still the old one.

