Java--常见排序算法

前言堆排序 (Heap Sort)

最近遇到一个求解Top N的场景，从1亿条数据中，找出最大或者最小的10个数。怎么办？不可能对数据进行全排序吧，哪里有那么大的内存空间！谷歌搜索了相关的解决方案，最终定位在使用堆排序解决这个问题。

摘要

1、什么是二叉树？

2、什么是堆？

3、堆排序原理？

4、堆排序的Java实现。

5、堆排序的Scala实现。

主要内容

一、什么是二叉树

参考：https://zh.wikipedia.org/wiki/%E4%BA%8C%E5%8F%89%E6%A0%91

要了解堆首先需要了解下二叉树（英语：Binary tree），在计算机科学中，二叉树是每个节点最多有两个子树的树结构。通常子树被称作“左子树”（left subtree）和“右子树”（right subtree）。二叉树常被用于实现二叉查找树和二叉堆。

二叉树的每个结点至多只有二棵子树（不存在度大于 2 的结点），二叉树的子树有左右之分，次序不能颠倒。二叉树的第 i 层至多有 2i - 1 个结点；深度为 k 的二叉树至多有 2k - 1 个结点；对任何一棵二叉树 T，如果其终端结点数为 n0，度为 2 的结点数为 n2，则n0 = n2 + 1。

树和二叉树的三个主要差别：

树的结点个数至少为 1，而二叉树的结点个数可以为 0
树中结点的最大度数没有限制，而二叉树结点的最大度数为 2
树的结点无左、右之分，而二叉树的结点有左、右之分

二叉树又分为完全二叉树（complete binary tree）和满二叉树（full binary tree）

(1)满二叉树：一棵深度为 k，且有 2k - 1 个节点称之为满二叉树

Java--常见排序算法

深度为 3 的满二叉树 full binary tree。

(2)完全二叉树：深度为 k，有 n 个节点的二叉树，当且仅当其每一个节点都与深度为 k 的满二叉树中序号为 1 至 n 的节点对应时，称之为完全二叉树

Java--常见排序算法

深度为 3 的完全二叉树 complete binary tree

二、什么是堆？

堆（二叉堆）可以视为一棵完全的二叉树，完全二叉树的一个“优秀”的性质是，除了最底层之外，每一层都是满的，这使得堆可以利用数组来表示（普通的一般的二叉树通常用链表作为基本容器表示），每一个结点对应数组中的一个元素。

如下图，是一个堆和数组的相互关系：

Java--常见排序算法

《算法导论》中谈到：对于给定的某个结点的下标 i，可以很容易的计算出这个结点的父结点、左孩子结点和右孩子节点的下标(基于下标以1开始)：

Parent(i) = floor(i/2)，i 的父节点下标(向下取整)
Left(i) = 2i，i 的左子节点下标
Right(i) = 2i + 1，i 的右子节点下标

二叉堆一般分为两种：最大堆和最小堆。

最大堆：

最大堆的最大元素在根结点（堆顶）
堆中每个父节点的元素值都大于等于其孩子结点

最小堆：

最小堆的最小元素值在根结点（堆顶）
堆中每个父节点的元素值都小于等于其孩子结点

Java--常见排序算法

三、堆排序原理

堆排序就是把最大堆堆顶的最大数取出，将剩余的堆继续调整为最大堆，再次将堆顶的最大数取出，这个过程持续到剩余数只有一个时结束。

在堆中定义以下几种操作：

最大堆调整（Max-Heapify）：将堆的末端子节点作调整，使得子节点永远小于父节点，保证最大堆性质
创建最大堆（Build-Max-Heap）：将堆所有数据重新排序，使其成为最大堆
堆排序（Heap-Sort）：移除位在第一个数据的根节点，并做最大堆调整的递归运算

这里我们需要注意：数组都是 Zero-Based，这就意味着我们的堆数据结构模型要发生改变。

Java--常见排序算法

相应的，几个计算公式也要作出相应调整：

Parent(i) = floor((i-1)/2)，i 的父节点下标
Left(i) = 2i + 1，i 的左子节点下标
Right(i) = 2(i + 1)，i 的右子节点下标

下面我们一个一个地看关于堆排序的3个操作：

(1)操作一：最大堆调整（Max-Heapify），保证最大堆的性质

Java--常见排序算法

Java代码实现如下：

[java]view
plain copy
 print?

package com.ngaa.bigdata.common.utils.sort;  

/** 

 * Created by yangjf on 20171030. 

 * Update date: 

 * Time: 22:03 

 * Project: ngaa-cdn-java-sdk 

 * Package: com.ngaa.utils 

 * Describe : 最大堆和最小堆的排序 

 * <p> 

 * Result of Test: test ok 

 * Command: 

 * </p><p> 

 * Email:  [email protected] 

 * Status：Using online 

 * </p><p> 

 * Please note: 

 * Must be checked once every time you submit a configuration file is correct! 

 * Data is priceless! Accidentally deleted the consequences! 

 */  

public class HeapSortUtil {  

    // i节点的父亲节点下标  

    private int parent(int i) {  

        return (int) (Math.floor(i / 2) - 1);  

    }  

    // i节点的左节点下标  

    private int left(int i) {  

        return 2 * i + 1;  

    }  

    // i节点的右节点下标  

    private int right(int i) {  

        return 2 * (i + 1);  

    }  

    // 交换下标为i的元素和下标为i的数组元素的值  

    private void swap(int[] a, int i, int j) {  

        int temp = a[i];  

        a[i] = a[j];  

        a[j] = temp;  

    }  

    // 使以i为根的子树成为最大堆，并保持最大堆的性质  

    private void maxHeapify(int[] a, int index, int heapSize) {  

        int l = left(index);        // 左儿子的下标  

        int r = right(index);       // 右儿子的下标  

        int largestIndex;        // 最大值的下标  

        //如果左儿子节点小于等于堆大小，左节点大于当前值;  

        if (l < heapSize && a[l] > a[index]) {  

            largestIndex = l;  

        } else {  

            largestIndex = index;  

        }  

        // 如果右儿子节点小于等于堆大小，右节点大于最大节点值;  

        if (r < heapSize && a[r] > a[largestIndex]) {  

            largestIndex = r;  

        }  

        // 如果最大值的index不等于当前根i，则交换根节点位置  

        if (largestIndex != index) {  

            swap(a, index, largestIndex);  

            // 递归调用避免违反最大堆的性质  

            maxHeapify(a, largestIndex, heapSize);  

        }  

    }  

}  

</p>

(2)操作二：创建最大堆（Build-Max-Heap）

创建最大堆（Build-Max-Heap）的作用是将一个数组改造成一个最大堆，接受数组和堆大小两个参数，Build-Max-Heap 将自下而上的调用 Max-Heapify 来改造数组，建立最大堆。因为 Max-Heapify 能够保证下标 i 的结点之后结点都满足最大堆的性质，所以自下而上的调用 Max-Heapify 能够在改造过程中保持这一性质。如果最大堆的数量元素是 n，那么 Build-Max-Heap 从 Parent(n) 开始，往上依次调用 Max-Heapify。流程如下：

Java--常见排序算法

Java实现的代码如下：

[java]view
plain copy
 print?

package com.ngaa.bigdata.common.utils.sort;  

/** 

 * Created by yangjf on 20171030. 

 * Update date: 

 * Time: 22:03 

 * Project: ngaa-cdn-java-sdk 

 * Package: com.ngaa.utils 

 * Describe : 最大堆和最小堆的排序 

 * <p> 

 * Result of Test: test ok 

 * Command: 

 * </p><p> 

 * Email:  [email protected] 

 * Status：Using online 

 * </p><p> 

 * Please note: 

 * Must be checked once every time you submit a configuration file is correct! 

 * Data is priceless! Accidentally deleted the consequences! 

 */  

public class BuildMaxheap {  

    // i节点的父亲节点下标  

    private int parent(int i) {  

        return (int) (Math.floor(i / 2) - 1);  

    }  

    // i节点的左节点下标  

    private int left(int i) {  

        return 2 * i + 1;  

    }  

    // i节点的右节点下标  

    private int right(int i) {  

        return 2 * (i + 1);  

    }  

    // 交换下标为i的元素和下标为i的数组元素的值  

    private void swap(int[] a, int i, int j) {  

        int temp = a[i];  

        a[i] = a[j];  

        a[j] = temp;  

    }  

    // 使以i为根的子树成为最大堆，并保持最大堆的性质  

    private void maxHeapify(int[] a, int index, int heapSize) {  

        int l = left(index);        // 左儿子的下标  

        int r = right(index);       // 右儿子的下标  

        int largestIndex;     // 最大值的下标  

        //如果左儿子节点小于等于堆大小，左节点大于当前值;  

        if (l < heapSize && a[l] > a[index]) {  

            largestIndex = l;  

        } else {  

            largestIndex = index;  

        }  

        // 如果右儿子节点小于等于堆大小，右节点大于最大节点值;  

        if (r < heapSize && a[r] > a[largestIndex]) {  

            largestIndex = r;  

        }  

        // 如果最大值的index不等于当前根i，则交换根节点位置  

        if (largestIndex != index) {  

            swap(a, index, largestIndex);  

            // 递归调用避免违反最大堆的性质  

            maxHeapify(a, largestIndex, heapSize);  

        }  

    }  

    // 创建最大堆  

    private void buildMaxHeapify(int[] a, int heapSize) {  

        int parentIndex = parent(a.length);  

        for (int i = parentIndex; i >= 0; i--) {  

            maxHeapify(a, i, heapSize);  

        }  

    }  

}  

</p>

(3)操作三：堆排序（Heap-Sort）

堆排序（Heap-Sort）是堆排序的接口算法，Heap-Sort先调用Build-Max-Heap将数组改造为最大堆，然后将堆顶和堆底元素交换，之后将底部上升，最后重新调用Max-Heapify保持最大堆性质。由于堆顶元素必然是堆中最大的元素，所以一次操作之后，堆中存在的最大元素被分离出堆，重复n-1次之后，数组排列完毕。整个流程如下图：

Java--常见排序算法

Java实现如下：

[java]view
plain copy
 print?

package com.ngaa.bigdata.common.utils.sort;  

/** 

 * Created by yangjf on 20171030. 

 * Update date: 

 * Time: 22:03 

 * Project: ngaa-cdn-java-sdk 

 * Package: com.ngaa.utils 

 * Describe : 最大堆和最小堆的排序 

 * <p> 

 * Result of Test: test ok 

 * Command: 

 * </p><p> 

 * Email:  [email protected] 

 * Status：Using online 

 * </p><p> 

 * Please note: 

 * Must be checked once every time you submit a configuration file is correct! 

 * Data is priceless! Accidentally deleted the consequences! 

 */  

public class HeapSort {  

    // i节点的父亲节点下标  

    private int parent(int i) {  

        return (int) (Math.floor(i / 2) - 1);  

    }  

    // i节点的左节点下标  

    private int left(int i) {  

        return 2 * i + 1;  

    }  

    // i节点的右节点下标  

    private int right(int i) {  

        return 2 * (i + 1);  

    }  

    // 交换下标为i的元素和下标为i的数组元素的值  

    private void swap(int[] a, int i, int j) {  

        int temp = a[i];  

        a[i] = a[j];  

        a[j] = temp;  

    }  

    // 使以i为根的子树成为最大堆，并保持最大堆的性质  

    private void maxHeapify(int[] a, int index, int heapSize) {  

        int l = left(index);        // 左儿子的下标  

        int r = right(index);       // 右儿子的下标  

        int largestIndex;     // 最大值的下标  

        //如果左儿子节点小于等于堆大小，左节点大于当前值;  

        if (l < heapSize && a[l] > a[index]) {  

            largestIndex = l;  

        } else {  

            largestIndex = index;  

        }  

        // 如果右儿子节点小于等于堆大小，右节点大于最大节点值;  

        if (r < heapSize && a[r] > a[largestIndex]) {  

            largestIndex = r;  

        }  

        // 如果最大值的index不等于当前根i，则交换根节点位置  

        if (largestIndex != index) {  

            swap(a, index, largestIndex);  

            // 递归调用避免违反最大堆的性质  

            maxHeapify(a, largestIndex, heapSize);  

        }  

    }  

    // 创建最大堆  

    private void buildMaxHeapify(int[] a, int heapSize) {  

        int parentIndex = parent(a.length);  

        for (int i = parentIndex; i >= 0; i--) {  

            maxHeapify(a, i, heapSize);  

        }  

    }  

    // 对a数组升序排序：使用最大堆  

    public void heapAscSort(int[] a, int headSize) {  

        buildMaxHeapify(a, headSize);  

        for (int i = a.length - 1; i > 0; i--) {  

            swap(a, 0, i);  

            headSize = headSize - 1;     // 通过减小headSize，去掉节点i  

            maxHeapify(a, 0, headSize);  // 还原位置，避免违反最大堆性质  

        }  

    }  

}  

</p>

四、堆排序的Java实现

(1)堆排序算法实现

[java]view
plain copy
 print?

package com.ngaa.bigdata.common.utils.sort;  

/** 

 * Created by yangjf on 20171030. 

 * Update date: 

 * Time: 22:03 

 * Project: ngaa-cdn-java-sdk 

 * Package: com.ngaa.utils 

 * Describe : 最大堆和最小堆的排序 

 * <p> 

 * Result of Test: test ok 

 * Command: 

 * </p><p> 

 * Email:  [email protected] 

 * Status：Using online 

 * </p><p> 

 * Please note: 

 * Must be checked once every time you submit a configuration file is correct! 

 * Data is priceless! Accidentally deleted the consequences! 

 */  

public class HeapSortUtil {  

    // i节点的父亲节点下标  

    private int parent(int i) {  

        return (int) (Math.floor(i / 2) - 1);  

    }  

    // i节点的左节点下标  

    private int left(int i) {  

        return 2 * i + 1;  

    }  

    // i节点的右节点下标  

    private int right(int i) {  

        return 2 * (i + 1);  

    }  

    // 交换下标为i的元素和下标为i的数组元素的值  

    private void swap(int[] a, int i, int j) {  

        int temp = a[i];  

        a[i] = a[j];  

        a[j] = temp;  

    }  

    // 使以i为根的子树成为最大堆，并保持最大堆的性质  

    private void maxHeapify(int[] a, int index, int heapSize) {  

        int l = left(index);        // 左儿子的下标  

        int r = right(index);       // 右儿子的下标  

        int largestIndex;     // 最大值的下标  

        //如果左儿子节点小于等于堆大小，左节点大于当前值;  

        if (l < heapSize && a[l] > a[index]) {  

            largestIndex = l;  

        } else {  

            largestIndex = index;  

        }  

        // 如果右儿子节点小于等于堆大小，右节点大于最大节点值;  

        if (r < heapSize && a[r] > a[largestIndex]) {  

            largestIndex = r;  

        }  

        // 如果最大值的index不等于当前根i，则交换根节点位置  

        if (largestIndex != index) {  

            swap(a, index, largestIndex);  

            // 递归调用避免违反最大堆的性质  

            maxHeapify(a, largestIndex, heapSize);  

        }  

    }  

    // 使以i为根的子树成为最小堆，并保持最小堆的性质  

    private void minHeapify(int[] a, int index, int heapSize) {  

        int l = left(index);        // 左儿子的下标  

        int r = right(index);       // 右儿子的下标  

        int largestIndex;     // 最大值的下标  

        //如果左儿子节点小于等于堆大小，左节点小于当前值;  

        if (l < heapSize && a[l] < a[index]) {  

            largestIndex = l;  

        } else {  

            largestIndex = index;  

        }  

        // 如果右儿子节点小于等于堆大小，右节点小于最大节点值;  

        if (r < heapSize && a[r] < a[largestIndex]) {  

            largestIndex = r;  

        }  

        // 如果最大值的index不等于当前根i，则交换根节点位置  

        if (largestIndex != index) {  

            swap(a, index, largestIndex);  

            // 递归调用避免违反最小堆的性质  

            minHeapify(a, largestIndex, heapSize);  

        }  

    }  

    // 创建最大堆  

    private void buildMaxHeapify(int[] a, int heapSize) {  

        int parentIndex = parent(a.length);  

        for (int i = parentIndex; i >= 0; i--) {  

            maxHeapify(a, i, heapSize);  

        }  

    }  

    // 创建最小堆  

    private void buildMinHeapify(int[] a, int heapSize) {  

        int parentIndex = parent(a.length);  

        for (int i = parentIndex; i >= 0; i--) {  

            minHeapify(a, i, heapSize);  

        }  

    }  

    // 对a数组降序排序：使用最小堆  

    public void heapDescSort(int[] a, int headSize) {  

        buildMinHeapify(a, headSize);  

        for (int i = a.length - 1; i > 0; i--) {  

            swap(a, 0, i);  

            headSize = headSize - 1;     // 通过减小headSize，去掉节点i  

            minHeapify(a, 0, headSize);  // 还原位置，避免违反最小堆性质  

        }  

    }  

    // 对a数组升序排序：使用最大堆  

    public void heapAscSort(int[] a, int headSize) {  

        buildMaxHeapify(a, headSize);  

        for (int i = a.length - 1; i > 0; i--) {  

            swap(a, 0, i);  

            headSize = headSize - 1;     // 通过减小headSize，去掉节点i  

            maxHeapify(a, 0, headSize);  // 还原位置，避免违反最大堆性质  

        }  

    }  

}  

</p>

(2)使得数组始终保持升序或者降序

[java]view
plain copy
 print?

package com.ngaa.bigdata.common.utils.sort;  

/** 

 * Created by yangjf on 20171030. 

 * Update date: 

 * Time: 8:46 

 * Project: ngaa-cdn-java-sdk 

 * Package: com.ngaa.utils 

 * Describe : 找到最大堆和最小堆的排序 

 * <p> 

 * Result of Test: test ok 

 * Command: 

 * </p><p> 

 * Email:  [email protected] 

 * Status：Using online 

 * </p><p> 

 * Please note: 

 * Must be checked once every time you submit a configuration file is correct! 

 * Data is priceless! Accidentally deleted the consequences! 

 */  

public class FindTopNUtils {  

    private static HeapSortUtil heapSortUtil = new HeapSortUtil();  

    /** 

     * 方法的目的：使得数组a始终保持降序排序 

     * 

     * @param a     堆数组:例如 a={10,9,8,7,6,5,4,3,2,1} 

     * @param value 输入的值 

     * @throws Exception 异常 

     */  

    public synchronized void findMaxTopN(int[] a, int value) throws Exception {  

        try {  

            int arraySize = a.length; // 数组长度  

            /** 

             * tmp的值可能性是 

             *  (1)大于最大的元素：   tmp>heap[0] 

             *  (2)处于最小和最大之间：heap[arraySize-1]<tmp<heap[0] *="" (3)舍弃值：value="heap[0]" 、="" value="heap[arraySize-1]" 和="" <="" heap[arraysize-1](小于最小值)="" if="" (value=""> a[0] || (a[arraySize - 1] < value && value < a[0])) { 

                // 阶梯交换值：即将最小的值用value替换 

                a[arraySize - 1] = value; 

                // 保证最小堆的性质 

                heapSortUtil.heapDescSort(a, arraySize); 

            } 

        } catch (Exception minE) { 

            throw new RuntimeException(minE); 

        } 

    } 

    /** 

     * 方法的目的：使得数组a始终保持升序排序 

     * 

     * @param a     堆数组:例如 a={1,2,3,4,5,6,7,8} 

     * @param value 输入的值 

     * @throws Exception 异常 

     */  

    public synchronized void findMinTopN(int[] a, int value) throws Exception {  

        try {  

            int arraySize = a.length; // 数组长度  

            /** 

             * tmp的值可能性是 

             *  (1)小于最小值：       tmp<heap[0] *="" (2)处于最小和最大之间：heap[0]<tmp<heap[arraysize-1]="" (3)舍弃值：tmp="heap[0]" 、="" tmp="heap[arraySize-1]" (4)tmp="">heap[arraySize-1](大于数组最大值) 

             * 

             * 

             */  

            if (value < a[0] || (a[0] < value && value < a[arraySize - 1])) {  

                // 阶梯交换值：即将最大的值用value替换  

                a[arraySize - 1] = value;  

                // 保证最大堆的性质  

                heapSortUtil.heapAscSort(a, arraySize);  

            }  

            // 为了避免数组初始时没有元素加入，需要添加：value>a[0]  

            if (value > a[0] && a[0] == 0) {  

                // 阶梯交换值：即将第一个元素用value替换  

                a[0] = value;  

                // 保证最大堆的性质  

                heapSortUtil.heapAscSort(a, arraySize);  

            }  

        } catch (Exception maxE) {  

            throw new RuntimeException(maxE);  

        }  

    }  

}  

</heap[0]></tmp<heap[0]></p>

(3)测试排序是否正常

准备一个文件：number.txt

包含的内容是1千万条随机数：

Java--常见排序算法

测试的Java代码如下:

[java]view
plain copy
 print?

package com.ngaa.bigdata.scala.test;  

import com.ngaa.bigdata.common.utils.sort.FindTopNUtils;  

import java.io.BufferedReader;  

import java.io.File;  

import java.io.FileReader;  

import java.io.IOException;  

/** 

 * Created by yangjf on 20171030. 

 * Update date: 

 * Time: 13:53 

 * Project: sparkmvn 

 * Package: com.ngaa.bigdata.scala.core 

 * Describe : 

 * <p> 

 * Result of Test: test ok,test error 

 * Command: 

 * </p><p> 

 * Email:  [email protected] 

 * Status：Using online 

 * </p><p> 

 * Please note: 

 * Must be checked once every time you submit a configuration file is correct! 

 * Data is priceless! Accidentally deleted the consequences! 

 */  

public class TestTopNForJava {  

    public static void main(String[] args) throws IOException {  

        int topN = 10;  

        String file = "G:/tmp/number.txt";  

        // 最小的前n个数  

        int[] minTopN = findTopNMin(topN, file);  

        for (int minArray : minTopN) {  

            System.out.println("-------->" + minArray);  

        }  

        System.out.println("-----------------------------------------------------------------------------");  

        // 最大的前n个数  

        int[] maxTopN = findTopNMax(topN, file);  

        for (int maxArray : maxTopN) {  

            System.out.println("-------->" + maxArray);  

        }  

    }  

    //求最大的前topN个数  

    static int[] findTopNMax(int topN, String filePath) throws NumberFormatException, IOException {  

        File file = new File(filePath);  

        int[] heap = new int[topN];  //创建长度为topN的数组  

        FileReader fr = new FileReader(file);  

        BufferedReader br = new BufferedReader(fr);  

        String line = null;  

        FindTopNUtils heapSort = new FindTopNUtils();  

        int i = 0;  //初始下标为0  

        while ((line = br.readLine()) != null) {  

            //如果元素有值  

            if (line.trim().length() > 0) {  

                int tmp = Integer.  

                        parseInt(line);  

                try {  

                    heapSort.findMaxTopN(heap, tmp);  // 获取最大的前N个数  

                } catch (Exception e) {  

                    e.printStackTrace();  

                }  

            }  

        }  

        br.close();  

        fr.close();  

        return heap;  

    }  

    //求最小的前topN个数  

    static int[] findTopNMin(int topN, String filePath) throws NumberFormatException, IOException {  

        File file = new File(filePath);  

        int[] heap = new int[topN];  //创建长度为topN的数组  

        FileReader fr = new FileReader(file);  

        BufferedReader br = new BufferedReader(fr);  

        String line = null;  

        FindTopNUtils heapSort = new FindTopNUtils();  

        int i = 0;  //初始下标为0  

        while ((line = br.readLine()) != null) {  

            //如果元素有值  

            if (line.trim().length() > 0) {  

                int tmp = Integer.  

                        parseInt(line);  

                try {  

                    heapSort.findMinTopN(heap, tmp);    // 获取最小的前N个数  

                } catch (Exception e) {  

                    e.printStackTrace();  

                }  

            }  

        }  

        br.close();  

        fr.close();  

        return heap;  

    }  

}  

</p>

五、堆排序的Scala实现

(1)堆排序算法

[plain]view
plain copy
 print?

package com.ngaa.bigdata.common.utils.sort  

/**  

  * Created by yangjf on 20171030.  

  * Update date:  

  * Time: 10:09  

  * Project: sparkmvn  

  * Package: com.ngaa.bigdata.common.utils.sort  

  * Describe :  

  *          This class is the largest stack and the smallest heap sort for the second element of the ancestor.  

  *  

  * Result of Test: test ok  

  * Command:  

  *  

  * Email:  [email protected]  

  * Status：Using online  

  *  

  * Please note:  

  * Must be checked once every time you submit a configuration file is correct!  

  * Data is priceless! Accidentally deleted the consequences!  

  *  

  */  

class SortByHeapUtils extends Serializable{  

  def parent(i: Int): Int = {  

     (Math.floor(i / 2) - 1).asInstanceOf[Int]  

  }  

  def left(i: Int): Int = {  

    2 * i + 1  

  }  

  def right(i: Int): Int = {  

    2 * (i + 1)  

  }  

  def swap(array: Array[(String, Long)], i: Int, j: Int): Unit = {  

    val tmp = array(i)  

    array(i) = array(j)  

    array(j) = tmp  

  }  

  def minHeapify(a: Array[(String, Long)], index: Int, heapSize: Int): Any = {  

    val l = left(index)  

    val r = right(index)  

    var largestIndex: Int = 0  

    if (l < heapSize && (a(l)._2 < a(index)._2)) {  

      largestIndex = l  

    } else {  

      largestIndex = index  

    }  

    if (r < heapSize && a(r)._2 < a(largestIndex)._2) {  

      largestIndex = r  

    }  

    if (largestIndex != index) {  

      swap(a, index, largestIndex)  

      minHeapify(a, largestIndex, heapSize)  

    }  

  }  

  def maxHeapify(a: Array[(String, Long)], index: Int, heapSize: Int): Any = {  

    val l = left(index)  

    val r = right(index)  

    var largestIndex: Int = 0  

    if (l < heapSize && (a(l)._2 > a(index)._2)) {  

      largestIndex = l  

    } else {  

      largestIndex = index  

    }  

    if (r < heapSize && a(r)._2 > a(largestIndex)._2) {  

      largestIndex = r  

    }  

    if (largestIndex != index) {  

      swap(a, index, largestIndex)  

      maxHeapify(a, largestIndex, heapSize)  

    }  

  }  

  def buildMinHeapify(a: Array[(String, Long)], heapSize: Int): Unit = {  

    val parentIndex: Int = parent(a.length)  

    for (i <- parentIndex to 0 by -1) {  

      minHeapify(a, i, heapSize)  

    }  

  }  

  def buildMaxHeapify(a: Array[(String, Long)], heapSize: Int): Unit = {  

    val parentIndex: Int = parent(a.length)  

    for (i <- parentIndex to 0 by -1) {  

      maxHeapify(a, i, heapSize)  

    }  

  }  

  def heapDescSort(a: Array[(String, Long)], headSize: Int) {  

    buildMinHeapify(a, headSize)  

    var headSizeTmp = headSize  

    for (i <- a.length - 1 to 0 by -1) {  

      swap(a, 0, i)  

      headSizeTmp -= 1  

      minHeapify(a, 0, headSizeTmp)  

    }  

  }  

  def heapAscSort(a: Array[(String, Long)], headSize: Int) {  

    buildMaxHeapify(a, headSize)  

    var headSizeTmp = headSize  

    for (i <- a.length - 1 to 0 by -1) {  

      swap(a, 0, i)  

      headSizeTmp -= 1  

      maxHeapify(a, 0, headSizeTmp)  

    }  

  }  

}

(2)保持数组降序或者升序

[plain]view
plain copy
 print?

package com.ngaa.bigdata.common.utils.sort  

import com.ngaa.bigdata.common.model.global.NgaaException  

import com.ngaa.bigdata.common.traits.HeapSort  

/**  

  * Created by yangjf on 20171030.  

  * Update date:  

  * Time: 11:54  

  * Project: sparkmvn  

  * Package: com.ngaa.bigdata.common.utils.sort  

  * Describe :  

  *        The Scala version looks for the largest number of N and the smallest number of N numbers in the tuple.  

  *  

  * Result of Test: test ok  

  * Command:  

  *  

  * Email:  [email protected]  

  * Status：Using online  

  *  

  * Please note:  

  * Must be checked once every time you submit a configuration file is correct!  

  * Data is priceless! Accidentally deleted the consequences!  

  *  

  */  

class FindSortTopN extends HeapSort with Serializable{  

  private val sortByHeapUtils = new SortByHeapUtils  

  @throws(classOf[NgaaException])  

  override def findMaxTopN(a: Array[(String, Long)], value: (String, Long)): Unit = {  

    try {  

      val arraySize: Int = a.length // 数组长度  

      /**  

        * tmp的值可能性是  

        * (1)大于最大的元素：   tmp>heap[0]  

        * (2)处于最小和最大之间：heap[arraySize-1]< tmp < heap[0]  

        * (3)舍弃值：value=heap[0] 、 value=heap[arraySize-1] 和 value < heap[arraySize-1](小于最小值)  

        */  

      if (value._2 >= a(0)._2 || (a(arraySize - 1)._2 < value._2 && value._2 < a(0)._2)) {  

        // 阶梯交换值：即将最小的值用value替换  

        a(arraySize - 1) = value  

        // 保证最小堆的性质  

        sortByHeapUtils.heapDescSort(a, arraySize)  

      }  

    }  

    catch {  

      case minE: Exception => throw new RuntimeException(minE)  

    }  

  }  

  @throws(classOf[NgaaException])  

  override def findMinTopN(a: Array[(String, Long)], value: (String, Long)): Unit = {  

    try {  

      val arraySize = a.length; // 数组长度  

      /**  

        * tmp的值可能性是  

        * (1)小于最小值：       tmp<heap[0] *="" (2)处于最小和最大之间：heap[0]<tmp<heap[arraysize-1]="" (3)舍弃值：tmp="heap[0]" 、="" tmp="heap[arraySize-1]" (4)tmp="">heap[arraySize-1](大于数组最大值)  

        *  

        *  

        */  

      if (value._2 < a(0)._2 || (a(0)._2 < value._2 && value._2 < a(arraySize - 1)._2)) {  

        // 阶梯交换值：即将最大的值用value替换  

        a(arraySize - 1) = value  

        // 保证最大堆的性质  

        sortByHeapUtils.heapAscSort(a, arraySize)  

      }  

      // 为了避免数组初始时没有元素加入，需要添加：value>a[0]  

      if (value._2 > a(0)._2 && a(0)._2 == 0) {  

        // 阶梯交换值：即将第一个元素用value替换  

        a(0) = value  

        // 保证最大堆的性质  

        sortByHeapUtils.heapAscSort(a, arraySize)  

      }  

    } catch {  

      case maxE: Exception => throw new RuntimeException(maxE)  

    }  

  }  

  @throws(classOf[NgaaException])  

  override def initArray(array: Array[(String, Long)],initValue:(String,Long)=("init",0l)): Unit = {  

   for(i <- array.indices ){  

      array(i)=initValue  

   }  

  }  

}  

</heap[0]>

注：

代码中涉及的文件内容如下

[plain]view
plain copy
 print?

package com.ngaa.bigdata.common.traits  

import com.ngaa.bigdata.common.model.global.NgaaException  

/**  

  * Created by yangjf on 20171030.  

  * Update date:  

  * Time: 11:19  

  * Project: sparkmvn  

  * Package: com.ngaa.bigdata.common.traits  

  * Describe : Heap sort interface  

  *  

  * Result of Test: test ok  

  * Command:  

  *  

  * Email:  [email protected]  

  * Status：Using online  

  *  

  * Please note:  

  * Must be checked once every time you submit a configuration file is correct!  

  * Data is priceless! Accidentally deleted the consequences!  

  *  

  */  

 trait HeapSort  extends Serializable{  

  /**  

    *  Initialize the array  

    * @param array       Input array  

    * @param initValue   Init value  

    * @throws com.ngaa.bigdata.common.model.global.NgaaException exception  

    */  

    @throws(classOf[NgaaException])  

    def initArray(array:Array[(String, Long)],initValue:(String,Long)=("init",0l))  

  /**  

    * Discover the largest number of N numbers in the Tuple.  

    * @param array Input array.  

    * @param tuple Tuple  

    * @throws com.ngaa.bigdata.common.model.global.NgaaException exception  

    */  

    @throws(classOf[NgaaException])  

    def findMaxTopN(array:Array[(String, Long)],tuple:(String,Long))  

  /**  

    * Discover the smallest number of N numbers in the Tuple.  

    * @param array Input array.  

    * @param tuple Tuple  

    * @throws com.ngaa.bigdata.common.model.global.NgaaException exception  

    */  

    @throws(classOf[NgaaException])  

    def findMinTopN(array:Array[(String, Long)],tuple:(String,Long))  

}

参考文章：

1、堆排序：http://student.zjzk.cn/course_ware/data_structure/web/paixu/paixu8.4.2.1.htm

2、算法-堆排序：http://ind.ntou.edu.tw/~litsnow/al98/pdf/Algorithm-Ch6-Heapsort.pdf

3、堆排序：http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/Sorting/heapSort.htm

4、排序算法：http://www.sorting-algorithms.com/

5、计算机算法：http://www.nowamagic.net/algorithm/algorithm_HeapSortStudy.php

6、堆排序wiki：https://zh.wikipedia.org/wiki/%E5%A0%86%E6%8E%92%E5%BA%8F

Java--常见排序算法

相关推荐