
Parsing XmlInputFormat elements larger than the HDFS block size

Source: Internet

I'm new to Hadoop MapReduce (4 days, to be precise) and I've been asked to perform distributed XML parsing on a cluster. As per my (re)search on the Internet, it should be fairly easy using Mahout's XmlInputFormat, but my task is to make sure that the system works for huge (~5TB) XML files.



