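I have been studying Alluxio for a while now, but it does not seem to be a good fit for our use case. For the scenarios Alluxio is suited to, see the article "Points to note when using Alluxio to improve MR job and Spark job performance".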

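This is not to say that Alluxio is bad. In scenarios that match its intended use, Alluxio delivers very significant performance gains, and in China it is used by companies such as Baidu, Qunar, and Alibaba. As an in-memory caching layer for accelerating local Spark or MapReduce jobs, however, it may not be very mature yet.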

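The Ignite developers and the Alluxio developers have crossed paths before, and an Ignite author once raised the same question as the title of this post. For details, see: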

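  1. "Dare to say Apache Ignite is better than Tachyon? Post deleted!": This is the Chinese translation. Its title seems intended to stir up readers' emotions, so please look at Ignite and Alluxio objectively.
  2. Apache Ignite (incubating) vs Tachyon: This is the English original by the Ignite author, which was deleted. Recommended reading.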

According to the Ignite author, this is where Ignite is better than Alluxio:

And here's the catch BTW: file system caching in Ignite is a part of its 'data fabric' paradigm like the services, advanced clustering, distributed messaging, ACID real-time transactions, etc. Adding HDFS and MR acceleration layer was pretty straight-forward as it was build on the advanced Ignite core, which has been in the real-world production for 5+ years. However. it is very hard to achieve the same level of enterprise computing when you start from an in-memory file system like Tachyon. Not bashing anything - just saying.

Pay particular attention to this sentence:

Adding HDFS and MR acceleration layer was pretty straight-forward as it was build on the advanced Ignite core, which has been in the real-world production for 5+ years. 

In short, the passage above says that Ignite is more effective as an acceleration layer for HDFS and MR, mainly because that layer is built on the Ignite core.

My guess is that Alluxio falls short as an acceleration layer because it has not implemented its own Alluxio-native acceleration layer for MapReduce and Spark.

I have also asked this question in the Ignite open-source community. You can follow the thread for updates:
Apache Ignite vs alluxio