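At the data analysis level, the focus is on processing algorithms (see [[大数据算法|big data algorithms]]) and statistical analysis tools.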
= Current State of Big Data Storage, Management, and Processing =
#The Spark project is currently the most popular big data platform.
#No software framework yet dominates the real-time analytics platform space.
= Hadoop =
* [http://hadoop.apache.org Hadoop]
= Spark =
# Zaharia, Matei, et al. "Spark: cluster computing with working sets." Proceedings of the 2nd USENIX conference on Hot topics in cloud computing. Vol. 10. 2010.
= Druid =
#Yang, Fangjin, et al. "Druid: a real-time analytical data store." SIGMOD 2014.
#Yang, Fangjin, et al. "The RADStack: Open source lambda architecture for interactive analytics." Proceedings of the 50th Hawaii International Conference on System Sciences. 2017.
= References =