This article benchmarks several common sorting algorithms and summarizes the results.
1. Summary and comparison
Wikipedia has a very detailed comparison of the algorithms:
https://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms
2. Source code
Project source: https://github.com/loverszhaokai/ALG
Sorting algorithm source: https://github.com/loverszhaokai/ALG/blob/master/src/sort.cc
Sorting test source: https://github.com/loverszhaokai/ALG/blob/master/test/sort_test.cc
3. Testing
3.1 Test data
There are four test configurations:
const struct ArrSize arr_sizes[] = {
    { 1000000, 20 },    // 1 million * 20 = 20 million
    { 10000000, 20 },   // 10 million * 20 = 200 million
    { 100000, 200 },    // 100 thousand * 200 = 20 million
    { 100000, 2000 },   // 100 thousand * 2000 = 200 million
};
For example, { 1000000, 20 } means one million arrays of 20 elements each, i.e. int arr[1000000][20];
The array contents are generated randomly by the following code:
static int generate_arrays(int **a, int size_1d, int size_2d)
{
    srand(time(NULL));

    // mod = mod + mod / 2;
    static const int mods[] = {
        1, 2, 3, 4, 6, 9, 13, 19, 28, 42, 63, 94, 141, 211, 316, 474, 711,
        1066, 1599, 2398, 3597, 5395, 8092, 12138, 18207, 27310, 40965,
        61447, 92170, 138255, 207382, 311073, 466609, 699913, 1049869,
        1574803, 2362204, 3543306, 5314959, 7972438, 11958657, 17937985,
        26906977, 40360465, 60540697, 90811045, 136216567, 204324850,
        306487275, 459730912, 689596368, 1034394552
    };

    int mi;

    for (int iii = 0; iii < size_1d; iii++) {
        for (int jjj = 0; jjj < size_2d; jjj++) {
            // Pick a random modulus, then a random value below it
            mi = rand() % (sizeof(mods) / sizeof(mods[0]));
            a[iii][jjj] = rand() % mods[mi];
        }
    }

    return 0;
}
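The comment "mod = mod + mod / 2" describes how the mods table appears to have been built: apart from the leading 1, each entry is the previous one plus half of it (integer division). The sketch below rebuilds such a table to make that rule concrete. build_mods is a hypothetical helper that is not part of the repository, and the entry count of 52 is simply the length of the table shown above.

```cpp
#include <climits>
#include <vector>

// Hypothetical helper (not from the repository): rebuilds a table like
// `mods` by repeatedly applying mod = mod + mod / 2 with integer division.
std::vector<int> build_mods(int count) {
    std::vector<int> mods;
    if (count <= 0) return mods;

    // 1 + 1 / 2 == 1, so the rule cannot move past 1; list it separately.
    mods.push_back(1);

    int mod = 2;
    while (static_cast<int>(mods.size()) < count) {
        mods.push_back(mod);
        if (mod > INT_MAX - mod / 2) break;  // stop before int overflow
        mod += mod / 2;
    }
    return mods;
}
```

Because the modulus is picked uniformly from this table, small moduli are chosen as often as large ones, so the generated data contains many small values and duplicates (rand() % 1 is always 0, for instance).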
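3.2 Test method
1. Randomly generate an array array_orig[size_1d][size_2d]. In the example above, size_1d is one million and size_2d is 20.
2. Copy it to array_expected[size_1d][size_2d] and sort each row of the copy with STL sort. This copy serves as the expected, correctly sorted result.
3. Test one sorting algorithm, for example bubble_sort():
   3.1 Copy array_orig[size_1d][size_2d] to array[size_1d][size_2d].
   3.2 Sort each row of array[size_1d][size_2d] with bubble_sort().
   3.3 Check that array[size_1d][size_2d] and array_expected[size_1d][size_2d] are equal.
The code is as follows: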
typedef void (* SortFunction)(int a[], const int size);

static int test_sort(const char *function_name, SortFunction sort_f)
{
    TimeUtil tu;
    int **array;

    if (copy_arrays(&array, array_orig, size_1d, size_2d) != 0)
        goto err;

    // Run: only the sorting itself is timed
    tu.restart();
    for (int iii = 0; iii < size_1d; iii++)
        sort_f(array[iii], size_2d);
    tu.stop();

    // Verify against the result produced by STL sort
    if (compare_array(array, array_expected, size_1d, size_2d) != 0)
        goto err;

    cout << std::setw(20) << function_name
         << std::setw(10) << " total run time = "
         << std::setw(10) << (int)tu.get_total_run_time() << " ms"
         << std::setw(18) << " when arrays is: "
         << std::setw(12) << size_1d << " * " << size_2d << endl;

    free_arrays(array, size_1d);
    return 0;

err:
    free_arrays(array, size_1d);
    return -1;
}
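The same copy, sort, and compare procedure can be sketched in a self-contained form. This is only an illustration under stated assumptions: TimeUtil, copy_arrays, compare_array and free_arrays belong to the repository and are not shown here, so the sketch swaps in std::vector and std::chrono::steady_clock. The names run_sort_check, CheckResult and demo_insertion_sort are hypothetical, and the value range 0..999999 is an arbitrary choice.

```cpp
#include <algorithm>
#include <chrono>
#include <random>
#include <vector>

typedef void (*SortFunction)(int a[], const int size);

// Stand-in algorithm under test (not the repository's implementation).
void demo_insertion_sort(int a[], const int size) {
    for (int i = 1; i < size; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > key) {  // shift larger elements right
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
}

struct CheckResult {
    bool correct;       // true if every row matches the std::sort reference
    double elapsed_ms;  // time spent inside sort_f only
};

CheckResult run_sort_check(SortFunction sort_f, int size_1d, int size_2d,
                           unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, 999999);

    // Step 1: generate the original data.
    std::vector<std::vector<int>> orig(size_1d, std::vector<int>(size_2d));
    for (auto &row : orig)
        for (int &x : row) x = dist(gen);

    // Step 2: build the expected result with std::sort.
    std::vector<std::vector<int>> expected = orig;
    for (auto &row : expected) std::sort(row.begin(), row.end());

    // Step 3: sort a fresh copy with the function under test, timing only
    // the sorting loop, then compare against the expected result.
    std::vector<std::vector<int>> work = orig;
    auto start = std::chrono::steady_clock::now();
    for (auto &row : work) sort_f(row.data(), size_2d);
    auto stop = std::chrono::steady_clock::now();

    CheckResult r;
    r.correct = (work == expected);
    r.elapsed_ms =
        std::chrono::duration<double, std::milli>(stop - start).count();
    return r;
}
```

A call such as run_sort_check(demo_insertion_sort, 1000, 20, 42) returns whether all rows were sorted correctly together with the time spent sorting. Copying the input before timing keeps the copy cost out of the measurement, as in the original test_sort().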
3.3 Test results
/*
Test data:

It takes 664.014 ms to generate arrays: 1000000 * 20
    TEST_insert_sort total run time =    183 ms when arrays is:  1000000 * 20
       TEST_stl_sort total run time =    245 ms when arrays is:  1000000 * 20
     TEST_quick_sort total run time =    310 ms when arrays is:  1000000 * 20
    TEST_select_sort total run time =    411 ms when arrays is:  1000000 * 20
     TEST_merge_sort total run time =    483 ms when arrays is:  1000000 * 20
    TEST_bubble_sort total run time =    530 ms when arrays is:  1000000 * 20

It takes 6642.19 ms to generate arrays: 10000000 * 20
    TEST_insert_sort total run time =   1862 ms when arrays is: 10000000 * 20
       TEST_stl_sort total run time =   2446 ms when arrays is: 10000000 * 20
     TEST_quick_sort total run time =   3071 ms when arrays is: 10000000 * 20
    TEST_select_sort total run time =   4106 ms when arrays is: 10000000 * 20
     TEST_merge_sort total run time =   4791 ms when arrays is: 10000000 * 20
    TEST_bubble_sort total run time =   5324 ms when arrays is: 10000000 * 20

It takes 801.532 ms to generate arrays: 100000 * 200
    TEST_insert_sort total run time =    665 ms when arrays is:   100000 * 200
       TEST_stl_sort total run time =    431 ms when arrays is:   100000 * 200
     TEST_quick_sort total run time =    513 ms when arrays is:   100000 * 200
    TEST_select_sort total run time =   1578 ms when arrays is:   100000 * 200
     TEST_merge_sort total run time =    756 ms when arrays is:   100000 * 200
    TEST_bubble_sort total run time =   3317 ms when arrays is:   100000 * 200

It takes 9591.41 ms to generate arrays: 100000 * 2000
    TEST_insert_sort total run time =  52265 ms when arrays is:   100000 * 2000
       TEST_stl_sort total run time =   5953 ms when arrays is:   100000 * 2000
     TEST_quick_sort total run time =   7375 ms when arrays is:   100000 * 2000
    TEST_select_sort total run time = 112452 ms when arrays is:   100000 * 2000
     TEST_merge_sort total run time =  10539 ms when arrays is:   100000 * 2000
    TEST_bubble_sort total run time = 236275 ms when arrays is:   100000 * 2000

sort_test.cc total run time=482889 ms
*/
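3.4 Analysis
1. For small arrays (e.g. 20 elements), insertion sort is the fastest: 183 ms versus 245 ms for STL sort on 1000000 * 20.
2. For larger arrays (200 and 2000 elements), STL sort is the fastest of the algorithms tested. (How STL sort is implemented still needs further study.)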
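One way to read these numbers is to compare how each timing grows when the array length goes from 200 to 2000 with the number of arrays fixed at 100000. The sketch below computes the observed growth factor and the factor that an idealized n^2 or n log n cost model would predict. The function names are hypothetical, and the cost models are textbook approximations rather than measurements.

```cpp
#include <cmath>

// Observed growth factor between two timings (in ms).
double observed_growth(double ms_small, double ms_large) {
    return ms_large / ms_small;
}

// Growth predicted by an idealized n^2 cost model.
double predicted_quadratic_growth(double n_small, double n_large) {
    double r = n_large / n_small;
    return r * r;
}

// Growth predicted by an idealized n * log(n) cost model.
double predicted_nlogn_growth(double n_small, double n_large) {
    return (n_large * std::log(n_large)) / (n_small * std::log(n_small));
}
```

With the timings listed above, insertion sort grows by about 79x (665 ms to 52265 ms), while selection sort and bubble sort grow by about 71x. The n^2 model predicts 100x, so these are the same order of magnitude. STL sort, quick sort and merge sort each grow by about 14x, close to the roughly 14.3x predicted by the n log n model. This is consistent with the quadratic algorithms falling behind as the arrays get larger.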
4. References
1. Sorting algorithm. https://en.wikipedia.org/wiki/Sorting_algorithm
2. Benchmarks: 14 Sorting Algorithms and PHP Arrays. http://kukuruku.co/hub/php/benchmarks-14-sorting-algorithms-and-php-arrays
3. Compare sorting algorithms' performance. http://rosettacode.org/wiki/Compare_sorting_algorithms'_performance