mapreduce - Aggregate pipes together in Hadoop
So I have a giant pipe in Cascading that looks like this:
k1 - v1
k1 - v2
k1 - v3
k2 - v4
k2 - v5
k2 - v6
Is there any way to aggregate them using an Every pipe so that the output looks something like this:
k1 - {v1, v2, v3}
k2 - {v4, v5, v6}
Thank you!
Edit:
My code so far:
I am calling an Every pipe:
    outputPipe = new Every(outputPipe, Fields.ALL, new SomeBuffer());
and overriding the operate method in the buffer:
    @Override
    public void operate(FlowProcess flowProcess, BufferCall bufferCall) {
        TupleEntry group = bufferCall.getGroup();

        // get the current argument values in this grouping
        Iterator<TupleEntry> arguments = bufferCall.getArgumentsIterator();

        // build the strings that hold our result values
        String result = "";
        String key = "";

        if (arguments.hasNext()) {
            TupleEntry argument = arguments.next();
            key = argument.getString("key") + "\t";
        }

        while (arguments.hasNext()) {
            TupleEntry argument = arguments.next();
            result += argument.getString("value") + "\t";
        }

        bufferCall.getOutputCollector().add(new Tuple(key, result));
    }
The output is kind of strange, though. I keep getting strange results when reading in the file, so I'm guessing the logic in my Every pipe is wrong.
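One thing that may explain part of the strange output: the first `arguments.next()` call reads the key from the first entry of the group but never appends that entry's value, so the first value of each key (v1, v4) would be dropped. As far as I know, an Every with a Buffer also has to follow a GroupBy (or CoGroup) on the key field, which the snippet above does not show. The sketch below is plain Java with no Cascading dependency, so it only illustrates the iteration logic. `GroupJoinSketch`, `joinGroup`, and `aggregate` are hypothetical names, `Map.Entry` stands in for `TupleEntry`, and the `LinkedHashMap` bucketing stands in for the GroupBy.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupJoinSketch {

    // Stand-in for a single Buffer.operate() call: `group` plays the role of
    // bufferCall.getArgumentsIterator() for one key. Assumes a non-empty group.
    static String joinGroup(Iterator<Map.Entry<String, String>> group) {
        String key = null;
        List<String> values = new ArrayList<>();
        while (group.hasNext()) {
            Map.Entry<String, String> entry = group.next();
            if (key == null) {
                key = entry.getKey();      // read the key from the first entry...
            }
            values.add(entry.getValue());  // ...but still keep that entry's value
        }
        return key + "\t" + String.join("\t", values);
    }

    // Stand-in for the GroupBy step: bucket pairs by key, keeping first-seen order,
    // then run joinGroup once per bucket.
    static List<String> aggregate(List<Map.Entry<String, String>> pairs) {
        Map<String, List<Map.Entry<String, String>>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, String> pair : pairs) {
            groups.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair);
        }
        List<String> lines = new ArrayList<>();
        for (List<Map.Entry<String, String>> group : groups.values()) {
            lines.add(joinGroup(group.iterator()));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<>("k1", "v1"));
        pairs.add(new SimpleEntry<>("k1", "v2"));
        pairs.add(new SimpleEntry<>("k1", "v3"));
        pairs.add(new SimpleEntry<>("k2", "v4"));
        pairs.add(new SimpleEntry<>("k2", "v5"));
        pairs.add(new SimpleEntry<>("k2", "v6"));
        // Prints one tab-separated line per key, values in input order.
        aggregate(pairs).forEach(System.out::println);
    }
}
```

In the real buffer, the same idea would mean using a single loop over the arguments iterator (or reading the key from `bufferCall.getGroup()`, assuming the grouping field is the key) instead of consuming the first entry separately.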