Differences between revisions of "Big Data Algorithms"

Latest revision as of 15:22, 22 August 2018 (Wednesday)

See Data Analysis.

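@@ Line 1: / Line 1: @@
-=Big Data Algorithms=
-===Data Analytics===
+See [[数据分析|Data Analysis]].
-Data analytics means running aggregate operations such as SUM, TopN, and Rank over the attribute values of a dataset. Real-time response is generally required.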
-* [https://lucene.apache.org/core/4_5_0/core/org/apache/lucene/util/BroadWord.html Broadword Implementation of Rank]
-A big data analytics platform is a distributed software system that implements data analytics.
-* [http://kylin.io Apache Kylin]
-* [http://druid.io/ Druid]
-# Navarro, Gonzalo, and Eliana Providel. "Fast, small, simple rank/select on bitmaps." In International Symposium on Experimental Algorithms, pp. 295-306. Springer Berlin Heidelberg, 2012.
-# Vigna, Sebastiano. "Broadword implementation of rank/select queries." In International Workshop on Experimental and Efficient Algorithms, pp. 154-168. Springer Berlin Heidelberg, 2008.
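-===Cardinality Estimation===
-Cardinality estimation approximates the number of distinct elements in a set, for example the number of unique IP addresses that visit a website.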
-* [https://github.com/Microsoft/CardinalityEstimation Cardinality Estimation Algorithm]
-# Flajolet, Philippe, Éric Fusy, Olivier Gandouet, and Frédéric Meunier. "Hyperloglog: the analysis of a near-optimal cardinality estimation algorithm." DMTCS Proceedings 1 (2008).
-# Heule, Stefan, Marc Nunkesser, and Alexander Hall. "HyperLogLog in practice: algorithmic engineering of a state of the art cardinality estimation algorithm." In Proceedings of the 16th International Conference on Extending Database Technology, pp. 683-692. ACM, 2013.
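
The removed cardinality-estimation text cites the HyperLogLog papers for counting distinct values such as unique visitor IPs. A minimal sketch of that idea follows; it is not taken from the page or from the linked Microsoft library. The class name `HyperLogLog`, the precision parameter `p`, and the use of SHA-1 as the hash function are all assumptions made for illustration, and the bias constant used is the standard one for 128 or more registers.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch (hypothetical class, not a library API)."""

    def __init__(self, p=12):
        # p index bits give m = 2**p registers; the constant below assumes m >= 128.
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Derive a 64-bit hash from the item (SHA-1 is an arbitrary choice here).
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width                  # first p bits select a register
        rest = h & ((1 << width) - 1)     # remaining bits
        # Position of the leftmost 1-bit in the remaining bits (1-based);
        # an all-zero remainder yields width + 1.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        # Raw estimate: normalized harmonic mean of 2**register.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return estimate


if __name__ == "__main__":
    hll = HyperLogLog(p=12)
    for i in range(10000):
        ip = f"10.0.{i // 256}.{i % 256}"
        hll.add(ip)
        hll.add(ip)  # repeated visits do not change the estimate
    print(round(hll.count()))
```

With `p=12` the sketch keeps 4096 small registers regardless of how many items are added, and the expected relative error is roughly 1.04 / sqrt(4096), about 1.6%. Adding the same item twice leaves the registers unchanged, which is why repeat visits from one IP are counted once.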
