java - Reducer loop strange behavior


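I'm new to MapReduce and I'm trying to join two different types of lines that come from two different CSV files.

The map phase is fine: I load the two files, A and B, and give the lines I want to match the same key.

In the reducer I'm seeing strange behavior that I can't understand. Lines from A start with accident#, and lines from B start with meteo#. I want to identify whether a line comes from A or B and then take the rest of the line. When I test this code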

    for (Text val : values) {
        StringTokenizer line = new StringTokenizer(val.toString(), "#");
        String comparable = line.nextToken();
        context.write(key, new Text(comparable));
    }

I get the following output, which is fine:

    2015-12-31;x    meteo
    2015-12-31;x    accident
    2015-12-31;x    accident
    2015-12-31;x    accident
    2015-12-31;x    accident

Then I tried this:

    for (Text val : values) {
        StringTokenizer line = new StringTokenizer(val.toString(), "#");
        String comparable = line.nextToken();
        if (comparable.equals("meteo"))
            comparable = line.nextToken();
        context.write(key, new Text(comparable));
    }

Output:

    2015-12-31;x    ;17.8;14:00;9.1;04:40;25;12:20;19;19:00;0;0;0
    2015-12-31;x    accident
    2015-12-31;x    accident
    2015-12-31;x    accident
    2015-12-31;x    accident

which is also fine. The next thing I tried was to store the meteo line:

    String meteo;
    for (Text val : values) {
        meteo = "hi";
        StringTokenizer line = new StringTokenizer(val.toString(), "#");
        String comparable = line.nextToken();
        if (comparable.equals("meteo"))
            meteo = line.nextToken();
        context.write(key, new Text(meteo));
    }

Output:

    2015-12-31;x    hi
    2015-12-31;x    hi
    2015-12-31;x    hi
    2015-12-31;x    hi
    2015-12-31;x    hi

when the result I expected was:

    2015-12-31;x    ;17.8;14:00;9.1;04:40;25;12:20;19;19:00;0;0;0
    2015-12-31;x    hi
    2015-12-31;x    hi
    2015-12-31;x    hi
    2015-12-31;x    hi

This is a simplification of my problem, but it shows the strange behavior. My final objective is to append the meteo line to every accident line with the same key, and if this does not work I don't know how I can do it. (My idea is to take the meteo line, store it, and append it to every accident line.)
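To make the goal concrete, here is a minimal sketch of the store-and-append idea in plain Java, outside Hadoop. The class name `JoinSketch`, the helper `joinGroup`, and the shortened sample values are hypothetical and only for illustration. It assumes the values of one key arrive as strings tagged `meteo#` or `accident#`, in any order, which is why the accident lines are buffered until the whole group has been read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JoinSketch {

    // Hypothetical helper: joins the values of ONE key.
    // Each value is expected to look like "meteo#<rest>" or "accident#<rest>".
    static List<String> joinGroup(Iterable<String> values) {
        String meteo = null;                        // the stored meteo line, if any
        List<String> accidents = new ArrayList<>(); // buffered accident lines

        for (String val : values) {
            int sep = val.indexOf('#');
            if (sep < 0) {
                continue; // untagged value: skipped in this sketch
            }
            String tag = val.substring(0, sep);
            String rest = val.substring(sep + 1);
            if (tag.equals("meteo")) {
                meteo = rest;
            } else if (tag.equals("accident")) {
                accidents.add(rest);
            }
        }

        // Append the meteo line to every accident line of this key.
        List<String> joined = new ArrayList<>();
        for (String accident : accidents) {
            joined.add(meteo == null ? accident : accident + meteo);
        }
        return joined;
    }

    public static void main(String[] args) {
        // Shortened, made-up values; the meteo part starts with ';' as in the output above.
        List<String> values = Arrays.asList(
                "accident#a1;x", "meteo#;17.8;14:00", "accident#a2;x");
        for (String line : joinGroup(values)) {
            System.out.println(line);
        }
    }
}
```

In a real reducer the same loop would run over `Iterable<Text>`, calling `val.toString()` on each value before storing it.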

Edit

Below I'm adding the code of the mapper and the exact input, to clarify the problem.

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer lines = new StringTokenizer(value.toString(), "\n");
        while (lines.hasMoreTokens()) {
            StringTokenizer line = new StringTokenizer(lines.nextToken(), ";");
            String csvLine = "";            // this is the output value
            String id = "";                 // this is the output key
            String atr = line.nextToken();  // the first attribute differentiates meteo from accident lines
            boolean isMeteo = atr.equals("0201x");

            if (!isMeteo) {
                // accident line: look for the attributes that make up the date in the key (i == 6, 7, 8)
                int i = 1;
                csvLine = atr;
                while (line.hasMoreTokens()) {
                    String aux = line.nextToken();
                    csvLine += ";" + aux;
                    if (i == 6) {
                        id = aux;
                    } else if (i == 7 || i == 8) {
                        int x = Integer.parseInt(aux);
                        if (x < 10) aux = "0" + aux;
                        id += "-" + aux;
                    } else if (i == 13) {
                        // this is the x in the key; it identifies the meteo station
                        // (not important for this problem)
                        aux = aux.substring(0, aux.length() - 1);
                        id += ";" + aux;
                        csvLine = csvLine.substring(0, csvLine.length() - 1);
                    }
                    ++i;
                }
            } else {
                id = line.nextToken();  // the second column holds the complete date string
                id += ";x";             // this file has the data of meteo station x
                csvLine += ";" + toCsvLine(line);
            }
            Text outKey = new Text(id);
            Text outValue = new Text(csvLine);
            context.write(outKey, outValue);
        }
    }

    public String toCsvLine(StringTokenizer st) {
        String x = st.nextToken();
        while (st.hasMoreTokens()) {
            x += ";" + st.nextToken();
        }
        return x;
    }

In the accidents file I take the columns needed to build the day id (year-month-day), and in the meteo file I take the date column as the id. In csvLine I have the CSV line that I want. Then I write the key (id) and the value (csvLine).

Here is my input data (only 2 days, as a representative example):
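As an illustration of the key construction just described, here is a small sketch in plain Java. The class `DayKeySketch` and its two helpers are hypothetical names, not part of my job. It assumes the column positions of the sample input (year, month, day and station at zero-based indexes 6, 7, 8 and 13 in the accidents file, the date at index 1 in the meteo file) and uses `String.split` instead of `StringTokenizer`.

```java
import java.util.Locale;

public class DayKeySketch {

    // Hypothetical helper: builds "yyyy-MM-dd;station" from an accident CSV line.
    // Assumes year, month, day at indexes 6, 7, 8 and the station at index 13.
    static String accidentKey(String csvLine) {
        String[] f = csvLine.trim().split(";", -1); // -1 keeps empty fields
        int month = Integer.parseInt(f[7]);
        int day = Integer.parseInt(f[8]);
        // %02d pads month and day to two digits.
        return String.format(Locale.ROOT, "%s-%02d-%02d;%s", f[6], month, day, f[13].trim());
    }

    // Hypothetical helper: the meteo file already holds the full date in its second column.
    // The ";x" suffix is the station id, hard-coded as in the mapper.
    static String meteoKey(String csvLine) {
        String[] f = csvLine.trim().split(";", -1);
        return f[1] + ";x";
    }

    public static void main(String[] args) {
        String accident = "2015s009983;ciutat vella;la barceloneta;mar;dc;laboral;"
                + "2015;12;30;22;altres;4581220,92;432258,31;x";
        String meteo = "0201x;2015-12-30;18.6;14:50;12.2;07:00;;26;13:20;17;13:10;;0;;;";
        System.out.println(accidentKey(accident));
        System.out.println(meteoKey(meteo));
    }
}
```

With the sample lines, both helpers produce the same key, so the two records would end up in the same reduce group.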

meteox.csv :

    0201x;2015-12-30;18.6;14:50;12.2;07:00;;26;13:20;17;13:10;;0;;;
    0201x;2015-12-31;17.8;14:00;9.1;04:40;;25;12:20;19;19:00;;;0;0;0

accidents.csv :

    2015s009983;ciutat vella;la barceloneta;mar;dc;laboral;2015;12;30;22;altres;4581220,92;432258,31;x
    2015s009984;sant martí;sant martí de provençals;cantàbria;dc;laboral;2015;12;30;20;col.lisió fronto-lateral;4585862,62;433330,95;x
    2015s009985;eixample;la nova esquerra de l'eixample;calàbria;dj;laboral;2015;12;31;00;caiguda (dues rodes);4582094,15;428800,57;x
    2015s009987;eixample;la dreta de l'eixample;gràcia;dj;laboral;2015;12;31;02;col.lisió lateral;4582944,96;430133,41;x
    2015s009988;eixample;la nova esquerra de l'eixample;aragó;dj;laboral;2015;12;31;07;abast;4581873,45;429312,63;x
    2015s009989;ciutat vella;la barceloneta;marítim de la barceloneta;dj;laboral;2015;12;31;08;abast;4581518,06;432606,87;x

