mapreduce - Aggregate pipes together in Hadoop


I have a giant pipe in Cascading that looks like this:

    k1 - v1
    k1 - v2
    k1 - v3
    k2 - v4
    k2 - v5
    k2 - v6

Is there any way to aggregate them using an Every pipe so that the output looks something like this:

    k1 - {v1, v2, v3}
    k2 - {v4, v5, v6}
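To make the target shape concrete, here is a minimal plain-Java sketch of the grouping I am after. It does not use Cascading at all; the class name `GroupValuesSketch` and the method `groupByKey` are made up for illustration, and the key/value pairs are just the placeholders from above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: plain Java, not Cascading code.
class GroupValuesSketch {

    // Collects every value under its key, keeping keys in first-seen order.
    // Each input pair is a two-element array: {key, value}.
    static Map<String, List<String>> groupByKey(List<String[]> pairs) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String[] pair : pairs) {
            grouped.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<String[]> pairs = Arrays.asList(
            new String[] {"k1", "v1"},
            new String[] {"k1", "v2"},
            new String[] {"k1", "v3"},
            new String[] {"k2", "v4"},
            new String[] {"k2", "v5"},
            new String[] {"k2", "v6"});

        // Print one line per key with all of its values.
        groupByKey(pairs).forEach((key, values) ->
            System.out.println(key + " - " + values));
    }
}
```

The point is only that each key should appear once, followed by all of its values.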

Thank you!

Edit:

My code so far:

I am calling the Every pipe:

    outputPipe = new Every(outputPipe, Fields.ALL, new SomeBuffer());

and overriding the operate method in my Buffer:

    @Override
    public void operate( FlowProcess flowProcess, BufferCall bufferCall )
    {
        TupleEntry group = bufferCall.getGroup();

        // get all the current argument values for this grouping
        Iterator<TupleEntry> arguments = bufferCall.getArgumentsIterator();

        // create strings to hold our result values
        String result = "";
        String key = "";

        if (arguments.hasNext()) {
            TupleEntry argument = arguments.next();
            key = argument.getString("key") + "\t";
        }

        while (arguments.hasNext()) {
            TupleEntry argument = arguments.next();
            result += argument.getString("value") + "\t";
        }

        bufferCall.getOutputCollector().add(new Tuple(key, result));
    }

The output is kind of strange, though. I keep getting strange results when reading in the file, so I'm guessing the logic in my Every pipe is wrong.
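One thing that stands out in the loop itself, independent of how the pipe is wired: the first `arguments.next()` call is used only to read the key, so the value carried by that first tuple never reaches `result`. The sketch below models this with a plain `Iterator` rather than Cascading's `TupleEntry`; the class `BufferLoopSketch` and both method names are hypothetical, and each element is assumed to be a `{key, value}` pair. It may not be the only cause of the odd output.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical model of the loop in operate(); plain Java, not Cascading code.
class BufferLoopSketch {

    // Mirrors the posted loop: the first element is consumed for its key only,
    // so its value is never appended to the result.
    static String postedLoop(Iterator<String[]> arguments) {
        String key = "";
        String result = "";
        if (arguments.hasNext()) {
            String[] argument = arguments.next();
            key = argument[0] + "\t";
        }
        while (arguments.hasNext()) {
            String[] argument = arguments.next();
            result += argument[1] + "\t";
        }
        return key + result;
    }

    // Variant that reads both the key and the value from every element.
    static String adjustedLoop(Iterator<String[]> arguments) {
        String key = "";
        StringBuilder result = new StringBuilder();
        while (arguments.hasNext()) {
            String[] argument = arguments.next();
            key = argument[0];
            result.append(argument[1]).append("\t");
        }
        return key + "\t" + result;
    }

    public static void main(String[] args) {
        List<String[]> group = Arrays.asList(
            new String[] {"k1", "v1"},
            new String[] {"k1", "v2"},
            new String[] {"k1", "v3"});

        System.out.println(postedLoop(group.iterator()));
        System.out.println(adjustedLoop(group.iterator()));
    }
}
```

With the three `k1` pairs, the first call yields the key followed by `v2` and `v3` only, while the second also includes `v1`.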

