hadoop MultipleInputs fails with ClassCastException (get fileName)

Source: http://stackoverflow.com/questions/11130145/hadoop-multipleinputs-fails-with-classcastexception

Following up on my comment, the Javadoc for TaggedInputSplit confirms that you are probably wrongly casting the input split to a FileSplit:
    /**
     * An {@link InputSplit} that tags another InputSplit with extra data for use
     * by {@link DelegatingInputFormat}s and {@link DelegatingMapper}s.
     */

My guess is that your setup method looks something like this:
    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Throws ClassCastException when the job uses MultipleInputs,
        // because the split is then a TaggedInputSplit, not a FileSplit.
        FileSplit split = (FileSplit) context.getInputSplit();
    }

Unfortunately, TaggedInputSplit is not publicly visible, so you can't easily do an instanceof-style check followed by a cast and a call to TaggedInputSplit.getInputSplit() to get the actual underlying FileSplit. So you'll need to either update the source yourself and recompile and redeploy, post a JIRA ticket asking for this to be fixed in a future version (if it hasn't already been actioned in 2+), or perform some nasty, nasty reflection hackery to get to the underlying InputSplit.
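To see why the direct cast fails, here is a minimal sketch that needs no Hadoop on the classpath. `Split`, `PathSplit`, and `TaggedSplit` are hypothetical stand-ins for `InputSplit`, `FileSplit`, and `TaggedInputSplit`; they only mimic the wrapper relationship described in the Javadoc, not the real classes.

```java
public class CastFailureDemo {
    // Hypothetical stand-in for InputSplit.
    static abstract class Split {}

    // Hypothetical stand-in for FileSplit.
    static class PathSplit extends Split {
        final String path;
        PathSplit(String path) { this.path = path; }
    }

    // Hypothetical stand-in for TaggedInputSplit: it wraps another split
    // but is NOT a subclass of PathSplit, so it cannot be cast to one.
    static class TaggedSplit extends Split {
        private final Split inner;
        TaggedSplit(Split inner) { this.inner = inner; }
        Split getInputSplit() { return inner; }
    }

    // Returns true if casting the split to PathSplit throws ClassCastException.
    static boolean castFails(Split split) {
        try {
            PathSplit ignored = (PathSplit) split;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Split plain = new PathSplit("/data/a.txt");
        Split tagged = new TaggedSplit(plain);
        System.out.println("plain cast fails:  " + castFails(plain));
        System.out.println("tagged cast fails: " + castFails(tagged));
    }
}
```

The wrapper and the wrapped split share only a common base type, which is why the workaround has to go through the wrapper's getInputSplit() method rather than a cast.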
This is completely untested:
    // Imports needed at the top of the mapper's source file:
    import java.io.IOException;
    import java.lang.reflect.Method;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        InputSplit split = context.getInputSplit();
        Class<? extends InputSplit> splitClass = split.getClass();

        FileSplit fileSplit = null;
        if (splitClass.equals(FileSplit.class)) {
            fileSplit = (FileSplit) split;
        } else if (splitClass.getName().equals(
                "org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit")) {
            // begin reflection hackery...
            try {
                Method getInputSplitMethod = splitClass.getDeclaredMethod("getInputSplit");
                getInputSplitMethod.setAccessible(true);
                fileSplit = (FileSplit) getInputSplitMethod.invoke(split);
            } catch (Exception e) {
                // wrap and re-throw error
                throw new IOException(e);
            }
            // end reflection hackery
        }
        // If fileSplit is non-null here, fileSplit.getPath().getName()
        // gives the name of the input file.
    }

Reflection Hackery Explained:
Because TaggedInputSplit is declared package-private (it has no access modifier), it's not visible to classes outside the org.apache.hadoop.mapreduce.lib.input package, and therefore you cannot reference that class in your setup method. To get around this, we perform a number of reflection-based operations:
Inspecting the class name, we can test for the type TaggedInputSplit using its fully qualified name:
    splitClass.getName().equals("org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit")
We know we want to call the TaggedInputSplit.getInputSplit() method to recover the wrapped input split, so we use the Class.getDeclaredMethod(..) reflection method to acquire a reference to the method:
    Method getInputSplitMethod = splitClass.getDeclaredMethod("getInputSplit");
The class still isn't publicly visible, so we call setAccessible(true) on the method to suppress Java's access checks; without it, invoke(..) would throw an IllegalAccessException:
    getInputSplitMethod.setAccessible(true);
Finally, we invoke the method on the reference to the input split and cast the result to a FileSplit (optimistically hoping it's an instance of this type!):
    fileSplit = (FileSplit) getInputSplitMethod.invoke(split);
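The steps above can be tried without a Hadoop installation using the sketch below. It is an illustration under stated assumptions, not the Hadoop code: `Split`, `PathSplit`, and `TaggedSplit` are hypothetical stand-ins for `InputSplit`, `FileSplit`, and `TaggedInputSplit`, the wrapper's accessor is made private to imitate a method the caller cannot invoke directly, and the simple class name is compared only to keep the demo independent of any package. It also replaces the optimistic cast with an instanceof check.

```java
import java.lang.reflect.Method;

public class ReflectiveUnwrapDemo {
    // Hypothetical stand-in for InputSplit.
    static abstract class Split {}

    // Hypothetical stand-in for FileSplit.
    static class PathSplit extends Split {
        final String path;
        PathSplit(String path) { this.path = path; }
    }

    // Hypothetical stand-in for TaggedInputSplit with a non-public accessor.
    static class TaggedSplit extends Split {
        private final Split inner;
        TaggedSplit(Split inner) { this.inner = inner; }
        private Split getInputSplit() { return inner; }
    }

    // Returns the underlying PathSplit, unwrapping a TaggedSplit via reflection.
    static PathSplit unwrap(Split split) {
        Class<? extends Split> splitClass = split.getClass();
        if (splitClass.equals(PathSplit.class)) {
            return (PathSplit) split;
        }
        // Step 1: recognise the wrapper by name instead of by type.
        // (Against Hadoop you would compare the fully qualified name.)
        if (!splitClass.getSimpleName().equals("TaggedSplit")) {
            throw new IllegalStateException("Unexpected split type: " + splitClass.getName());
        }
        try {
            // Step 2: look up the non-public accessor on the wrapper class.
            Method getInputSplitMethod = splitClass.getDeclaredMethod("getInputSplit");
            // Step 3: suppress access checks for this Method object.
            getInputSplitMethod.setAccessible(true);
            // Step 4: invoke it, then verify the type before casting.
            Object innerSplit = getInputSplitMethod.invoke(split);
            if (innerSplit instanceof PathSplit) {
                return (PathSplit) innerSplit;
            }
            throw new IllegalStateException("Wrapped split is not a PathSplit: " + innerSplit);
        } catch (ReflectiveOperationException e) {
            // Wrap and re-throw so callers see one failure type.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Split tagged = new TaggedSplit(new PathSplit("/data/a.txt"));
        System.out.println(unwrap(tagged).path);
    }
}
```

Checking the type before the cast turns a surprising ClassCastException into an error message that names the actual wrapped type, which helps when an input format does not produce file-based splits.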
Reposted from: https://www.cnblogs.com/sunxucool/p/3727200.html