hadoop - Fail to write SequenceFile with Pig
I want to store some Pig variables in a Hadoop SequenceFile so that I can run external MapReduce jobs on them.
Suppose my data has a (chararray, int) schema:
(Hello, 1) (test, 2) (example, 3)
I wrote this storage function:

    import java.io.IOException;
    import java.util.logging.Level;
    import java.util.logging.Logger;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.OutputFormat;
    import org.apache.hadoop.mapreduce.RecordWriter;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
    import org.apache.pig.StoreFunc;
    import org.apache.pig.data.Tuple;

    public class StoreTest extends StoreFunc {

        private String storeLocation;
        private RecordWriter writer;
        private Job job;

        public StoreTest() {
        }

        @Override
        public OutputFormat getOutputFormat() throws IOException {
            // return new TextOutputFormat();
            return new SequenceFileOutputFormat();
        }

        @Override
        public void setStoreLocation(String location, Job job) throws IOException {
            this.storeLocation = location;
            this.job = job;
            System.out.println("Load Location " + storeLocation);
            FileOutputFormat.setOutputPath(job, new Path(location));
            System.out.println("Out Path " + FileOutputFormat.getOutputPath(job));
        }

        @Override
        public void prepareToWrite(RecordWriter writer) throws IOException {
            this.writer = writer;
        }

        @Override
        public void putNext(Tuple tuple) throws IOException {
            try {
                Text k = new Text((String) tuple.get(0));
                IntWritable v = new IntWritable((Integer) tuple.get(1));
                writer.write(k, v);
            } catch (InterruptedException ex) {
                Logger.getLogger(StoreTest.class.getName()).log(Level.SEVERE, null, ex);
            }
        }
    }

and this Pig code:

    register MyUDFs.jar;
    x = load '/user/pinoli/input' as (a:chararray, b:int);
    store x into '/user/pinoli/output/' using StoreTest();

However, the store fails and I get this error:

    ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.io.IOException: wrong key class: org.apache.hadoop.io.Text is not class org.apache.hadoop.io.LongWritable

Is there any way to fix it?
The problem is that you have not set the output key/value classes. You can do this in the setStoreLocation() method:

    @Override
    public void setStoreLocation(String location, Job job) throws IOException {
        this.storeLocation = location;
        this.job = job;
        this.job.setOutputKeyClass(Text.class);          // !!!
        this.job.setOutputValueClass(IntWritable.class); // !!!
        ...
    }

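To see why the error message names LongWritable even though the code never mentions it: as far as I know, a job that does not set an output key class falls back to LongWritable, and the SequenceFile writer rejects any key whose class differs from the declared one. Here is a minimal stand-alone sketch of that check. CheckedWriter is a hypothetical stand-in (not a Hadoop class), and Long/String play the roles of LongWritable/Text so it runs without Hadoop on the classpath.

```java
import java.io.IOException;

/** Hypothetical stand-in for a SequenceFile writer: it remembers the
 *  declared key class and rejects keys of any other class. */
final class CheckedWriter {
    private final Class<?> keyClass;

    CheckedWriter(Class<?> declaredKeyClass) {
        this.keyClass = declaredKeyClass;
    }

    void append(Object key) throws IOException {
        // Exact class comparison: even a subclass of the declared class is rejected.
        if (key.getClass() != keyClass) {
            throw new IOException("wrong key class: "
                    + key.getClass().getName() + " is not " + keyClass);
        }
    }
}

final class WrongKeyClassDemo {
    public static void main(String[] args) {
        // Long.class plays the role of the default key class (assumption: LongWritable).
        CheckedWriter writer = new CheckedWriter(Long.class);
        try {
            writer.append("Hello"); // String plays the role of Text
        } catch (IOException e) {
            // prints "wrong key class: java.lang.String is not class java.lang.Long"
            System.out.println(e.getMessage());
        }
    }
}
```

Declaring the key class that putNext() actually writes, which is what setOutputKeyClass() does for the real job, makes the check pass.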
I guess you want to use your storer with different key/value types. In that case you can pass the types to the constructor. For example:
    private Class
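The snippet above is cut off, so here is a rough sketch of the general idea rather than the exact code: resolve class names passed as constructor strings and fail early if they are not suitable. KeyValueTypes is a hypothetical helper name; Comparable and Object are stand-ins for the bounds you would use in the real storer (presumably WritableComparable and Writable), chosen so the example runs without Hadoop.

```java
/** Hypothetical holder for key/value classes resolved from constructor arguments. */
@SuppressWarnings("rawtypes")
final class KeyValueTypes {
    final Class<? extends Comparable> keyClass; // stand-in bound for the key type
    final Class<?> valueClass;                  // stand-in bound for the value type

    KeyValueTypes(String keyClassName, String valueClassName) {
        this.keyClass = resolve(keyClassName, Comparable.class);
        this.valueClass = resolve(valueClassName, Object.class);
    }

    /** Loads the named class and checks that it is a subtype of the given bound. */
    static <T> Class<? extends T> resolve(String className, Class<T> bound) {
        try {
            return Class.forName(className).asSubclass(bound);
        } catch (ClassNotFoundException | ClassCastException e) {
            throw new IllegalArgumentException("Invalid key/value type: " + className, e);
        }
    }
}

final class KeyValueTypesDemo {
    public static void main(String[] args) {
        KeyValueTypes types = new KeyValueTypes("java.lang.String", "java.lang.Integer");
        System.out.println(types.keyClass.getName() + " / " + types.valueClass.getName());
    }
}
```

In the storer itself, the two resolved classes would then be handed to setOutputKeyClass() and setOutputValueClass() in setStoreLocation() instead of the hard-coded Text.class and IntWritable.class.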
Then, set the correct types in the Pig script:

    register MyUDFs.jar;
    DEFINE myStorer StoreTest('org.apache.hadoop.io.Text', 'org.apache.hadoop.io.IntWritable');
    x = load '/user/pinoli/input' as (a:chararray, b:int);
    store x into '/user/pinoli/output/' using myStorer();