The central limit theorem

As the LLN applied to this business case shows, the k-means clustering project must provide a reasonable set of centroids and clusters (regions of locations for long-duration phone calls).

This approach can now be extended to the central limit theorem (CLT), which states, in machine learning parlance, that when training on a large dataset, a subset of mini-batch samples is sufficient. The following two conditions define the main properties of the central limit theorem:

The variance between the data points of the subset (mini-batch) remains reasonable. In this case, filtering only long-duration calls solves the problem.
The mini-batches follow a normal distribution pattern, with mini-batch variances close to the variance of the whole dataset.
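
The two conditions above can be checked empirically. The following is a minimal sketch, not the book's implementation: it assumes synthetic, skewed call durations (an exponential distribution with a 30-minute mean, chosen purely for illustration), and the helper `minibatch_stats`, the batch size, and the number of batches are all hypothetical. It draws random mini-batches, then compares their means and variances with those of the whole dataset.

```python
import math
import random
import statistics


def minibatch_stats(data, batch_size, n_batches, seed=0):
    """Draw random mini-batches and return the mean and variance of each one."""
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(n_batches):
        batch = rng.sample(data, batch_size)  # sampling without replacement
        means.append(statistics.fmean(batch))
        variances.append(statistics.variance(batch))
    return means, variances


# Assumption: synthetic long-duration calls (minutes), deliberately non-normal.
data_rng = random.Random(42)
durations = [data_rng.expovariate(1 / 30.0) for _ in range(20_000)]

full_mean = statistics.fmean(durations)
full_var = statistics.pvariance(durations)

means, variances = minibatch_stats(durations, batch_size=200, n_batches=300)

# Condition 2: mini-batch variances stay close to the whole-dataset variance.
avg_batch_var = statistics.fmean(variances)

# CLT: the mini-batch means cluster around the full mean, with a spread of
# roughly sqrt(full_var / batch_size), even though the raw data is skewed.
observed_spread = statistics.stdev(means)
expected_spread = math.sqrt(full_var / 200)

print(f"full mean      : {full_mean:.2f}")
print(f"mean of batches: {statistics.fmean(means):.2f}")
print(f"full variance  : {full_var:.2f}")
print(f"avg batch var  : {avg_batch_var:.2f}")
print(f"spread of means: {observed_spread:.2f} (expected about {expected_spread:.2f})")
```

If the average mini-batch variance drifts far from the whole-dataset variance, the first condition is likely violated, and a filter such as keeping only long-duration calls may be needed before sampling.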
