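• ArrayList Source Code Analysis


    Nothing in this world is too late to learn, as long as you are willing to put your heart into it. Please never stop; become the person you want to be.

    Preface

    As mentioned in learn from collection framework design, the collection framework has two parts, Collection and Map. Collection is further divided into three kinds: List, Set, and Queue. This article begins by analyzing the implementation of ArrayList.

    ArrayList Inheritance Hierarchy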


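    As shown in the figure above, ArrayList implements the RandomAccess (supports fast random access), Cloneable (can be cloned), and Serializable (supports serialization and deserialization) interfaces as well as the List interface, and it extends AbstractList, the abstract skeletal implementation of List.
    The first three interfaces are marker interfaces: they declare no methods for an implementing class to implement.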
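
    Because a marker interface carries no methods, client code can only test for it with instanceof. The sketch below is an illustration, not JDK code: the class MarkerInterfaceDemo and its helpers isRandomAccess and sum are hypothetical names, and the loop-selection strategy is just one common way to use the RandomAccess marker.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

public class MarkerInterfaceDemo {
    // Hypothetical helper: reports whether a list advertises fast indexed access.
    static boolean isRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    // Hypothetical helper: chooses the loop style based on the marker interface.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0; i < list.size(); i++) {
                total += list.get(i);   // indexed access is cheap for array-backed lists
            }
        } else {
            for (int value : list) {    // the iterator avoids repeated index walks
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>();
        List<Integer> linkedList = new LinkedList<>();
        for (int i = 1; i <= 3; i++) {
            arrayList.add(i);
            linkedList.add(i);
        }
        System.out.println(isRandomAccess(arrayList));
        System.out.println(isRandomAccess(linkedList));
        System.out.println(sum(arrayList) + " " + sum(linkedList));
    }
}
```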

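    Let us now look directly at some of ArrayList's internal mechanisms.

    Internal Implementation

    Data Structure

    Internally it maintains an array of type Object[], the elementData field, while the size field records the number of elements in the list.

    Initialization

    ArrayList has three overloaded constructors:
    First: initialize the ArrayList with a given initial capacity.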

    /**
     * Constructs an empty list with the specified initial capacity.
     *
     * @param  initialCapacity  the initial capacity of the list
     * @throws IllegalArgumentException if the specified initial capacity
     *         is negative
     */
    public ArrayList(int initialCapacity) {
        if (initialCapacity > 0) { // allocate the backing array with the requested initial capacity
            this.elementData = new Object[initialCapacity];
        } else if (initialCapacity == 0) {
            this.elementData = EMPTY_ELEMENTDATA; // a shared empty array of length 0, i.e. {}
        } else { // an array length cannot be negative, so throw
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        }
    }
    

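    Second: use the default capacity. The backing array initially has length 0 and grows to the default capacity of 10 when the first element is added.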

    /**
     * Constructs an empty list with an initial capacity of ten.
     */
    public ArrayList() {
        this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA; // DEFAULTCAPACITY_EMPTY_ELEMENTDATA is a shared empty array of length 0
    }
    

    Third: build the ArrayList from a given collection.

    /**
     * Constructs a list containing the elements of the specified
     * collection, in the order they are returned by the collection's
     * iterator.
     *
     * @param c the collection whose elements are to be placed into this list
     * @throws NullPointerException if the specified collection is null
     */
    public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray(); // note: toArray() already allocates a new array; the copy below adds a second allocation, but only when the runtime type is not Object[]
        if ((size = elementData.length) != 0) { // the collection contains elements
            // c.toArray might (incorrectly) not return Object[] (see 6260652)
            if (elementData.getClass() != Object[].class)
                elementData = Arrays.copyOf(elementData, size, Object[].class);
        } else { // the collection is empty: use the predefined array of length 0
            // replace with empty array.
            this.elementData = EMPTY_ELEMENTDATA;
        }
    }
    

    How the Backing Array Grows

    java.util.ArrayList#ensureCapacityInternal is the private method dedicated to growing the array:

    private void ensureCapacityInternal(int minCapacity) {
        ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
    }
    

    There are two steps: computing the required capacity, and growing the array.

    Computing the Required Capacity

    The source of calculateCapacity is:

    private static int calculateCapacity(Object[] elementData, int minCapacity) {
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            return Math.max(DEFAULT_CAPACITY, minCapacity); // for a list created by the no-arg constructor, the first growth allocates max(10, required minimum capacity)
        }
        return minCapacity;
    }
    

    Growing

    private void ensureExplicitCapacity(int minCapacity) {
        modCount++; // counts structural modifications of the list (used for fail-fast iteration), not just array growths
    
        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }
    

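    Why use subtraction here instead of a direct comparison?
    Because minCapacity is computed as the current size plus the number of elements to insert, and that addition may overflow into a negative number. A direct comparison minCapacity > elementData.length would then be false and the growth would be silently skipped, whereas the subtraction wraps around to a positive value, so grow is still called and hugeCapacity can report the overflow with an OutOfMemoryError.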
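
    The difference between the two checks can be seen with plain int arithmetic. This is a minimal sketch, not JDK code: the class OverflowConsciousDemo and both helper methods are hypothetical, and the sizes are chosen only to force the overflow.

```java
public class OverflowConsciousDemo {
    // Direct comparison: what a naive check would do.
    static boolean needsGrowNaive(int minCapacity, int currentLength) {
        return minCapacity > currentLength;
    }

    // Subtraction-based check, the same shape as in ensureExplicitCapacity.
    static boolean needsGrowOverflowConscious(int minCapacity, int currentLength) {
        return minCapacity - currentLength > 0;
    }

    public static void main(String[] args) {
        int size = Integer.MAX_VALUE - 2;  // hypothetical, very large list
        int numNew = 10;                   // elements about to be added
        int minCapacity = size + numNew;   // overflows and becomes negative

        System.out.println(minCapacity < 0);
        // The naive check misses the need to grow ...
        System.out.println(needsGrowNaive(minCapacity, size));
        // ... while the subtraction wraps back to numNew, which is positive.
        System.out.println(needsGrowOverflowConscious(minCapacity, size));
    }
}
```

    For ordinary, non-overflowing values the two checks agree; they differ only once the sum has wrapped around.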
    grow is as follows:

    /**
     * Increases the capacity to ensure that it can hold at least the
     * number of elements specified by the minimum capacity argument.
     *
     * @param minCapacity the desired minimum capacity
     */
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0) // subtraction is used here as well because the new array length may have overflowed
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
    

    The source of hugeCapacity is:

    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }
    

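    The growth factor is 1.5, and the soft limit on the array length is MAX_ARRAY_SIZE, i.e. Integer.MAX_VALUE - 8. According to the comment in the JDK source, this value is chosen because some VMs reserve some header words in an array, and attempting to allocate a larger array may result in an OutOfMemoryError (requested array size exceeds VM limit).
    In other words, if the current array cannot hold the new elements, it grows to 1.5 times its length (or to minCapacity if that is larger). The capacity is normally capped at Integer.MAX_VALUE - 8; hugeCapacity returns Integer.MAX_VALUE only when the required minimum itself exceeds that limit.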
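
    To make the growth rule concrete, the following sketch replays the arithmetic of calculateCapacity, grow, and hugeCapacity on plain ints, without touching a real ArrayList. The class GrowthSequenceDemo and the methods nextCapacity and capacitiesWhileAppending are hypothetical names, and treating a capacity of 0 as "created by the no-arg constructor" is a simplifying assumption of this model.

```java
import java.util.ArrayList;
import java.util.List;

public class GrowthSequenceDemo {
    static final int DEFAULT_CAPACITY = 10;
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Simplified model of grow() plus hugeCapacity(): returns the new array length.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);   // 1.5x
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;                        // 1.5x is not enough
        }
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (minCapacity < 0) {
                throw new OutOfMemoryError();                 // minCapacity overflowed
            }
            newCapacity = (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
        }
        return newCapacity;
    }

    // Records every capacity the model allocates while appending `elements` items one by one.
    static List<Integer> capacitiesWhileAppending(int elements) {
        List<Integer> capacities = new ArrayList<>();
        int capacity = 0; // assumption: 0 stands for DEFAULTCAPACITY_EMPTY_ELEMENTDATA
        for (int size = 0; size < elements; size++) {
            int minCapacity = size + 1;
            if (capacity == 0) {
                minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
            }
            if (minCapacity - capacity > 0) {
                capacity = nextCapacity(capacity, minCapacity);
                capacities.add(capacity);
            }
        }
        return capacities;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesWhileAppending(100));
    }
}
```

    In this model, appending 100 elements triggers seven allocations, which is why passing an initial capacity to the constructor is worthwhile when the final size is known in advance.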

    Shrinking

    /**
     * Trims the capacity of this <tt>ArrayList</tt> instance to be the
     * list's current size.  An application can use this operation to minimize
     * the storage of an <tt>ArrayList</tt> instance.
     */
    public void trimToSize() {
        modCount++;
        if (size < elementData.length) {
            elementData = (size == 0)
              ? EMPTY_ELEMENTDATA
              : Arrays.copyOf(elementData, size);
        }
    }
    

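    Shrinking is not free: it allocates a new array whose length equals size and copies the elements of the old array into it, which costs O(n) time and, temporarily, extra memory.

    Inserting a Single Element

    There are two ways:
    First, append at the end (the default):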

    /**
     * Appends the specified element to the end of this list.
     *
     * @param e element to be appended to this list
     * @return <tt>true</tt> (as specified by {@link Collection#add})
     */
    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }
    

    Second, insert an element at a specified position:

    /**
     * Inserts the specified element at the specified position in this
     * list. Shifts the element currently at that position (if any) and
     * any subsequent elements to the right (adds one to their indices).
     *
     * @param index index at which the specified element is to be inserted
     * @param element element to be inserted
     * @throws IndexOutOfBoundsException {@inheritDoc}
     */
    public void add(int index, E element) {
      rangeCheckForAdd(index); // note: the index is validated against the list's size, not against the capacity of the backing array!
    
      ensureCapacityInternal(size + 1);  // Increments modCount!!
      // shift the elements from index onward (inclusive) one position to the right
      System.arraycopy(elementData, index, elementData, index + 1,
                       size - index);
      elementData[index] = element;
      size++;
    }
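
    The point that add(int, E) checks the index against size rather than capacity is easy to trip over, so here is a small usage sketch. The class AddAtIndexDemo and its two helper methods are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AddAtIndexDemo {
    // A list with capacity 10 but size 0 still rejects index 5.
    static boolean addBeyondSizeFails() {
        List<String> list = new ArrayList<>(10);
        try {
            list.add(5, "x");
            return false;
        } catch (IndexOutOfBoundsException expected) {
            return true;
        }
    }

    // Valid indices for add(int, E) are 0..size inclusive.
    static List<String> insertInMiddle() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "d"));
        list.add(2, "c");            // shifts "d" one slot to the right
        list.add(list.size(), "e");  // index == size behaves like an append
        return list;
    }

    public static void main(String[] args) {
        System.out.println(addBeyondSizeFails());
        System.out.println(insertInMiddle());
    }
}
```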
    

    Inserting Multiple Elements

    There are again two ways.
    First, append at the end:

    /**
     * Appends all of the elements in the specified collection to the end of
     * this list, in the order that they are returned by the
     * specified collection's Iterator.  The behavior of this operation is
     * undefined if the specified collection is modified while the operation
     * is in progress.  (This implies that the behavior of this call is
     * undefined if the specified collection is this list, and this
     * list is nonempty.)
     *
     * @param c collection containing elements to be added to this list
     * @return <tt>true</tt> if this list changed as a result of the call
     * @throws NullPointerException if the specified collection is null
     */
    public boolean addAll(Collection<? extends E> c) {
        Object[] a = c.toArray();
        int numNew = a.length;
        ensureCapacityInternal(size + numNew);  // Increments modCount
        System.arraycopy(a, 0, elementData, size, numNew);
        size += numNew;
        return numNew != 0;
    }
    

    Second, insert at a specified position:

    /**
     * Inserts all of the elements in the specified collection into this
     * list, starting at the specified position.  Shifts the element
     * currently at that position (if any) and any subsequent elements to
     * the right (increases their indices).  The new elements will appear
     * in the list in the order that they are returned by the
     * specified collection's iterator.
     *
     * @param index index at which to insert the first element from the
     *              specified collection
     * @param c collection containing elements to be added to this list
     * @return <tt>true</tt> if this list changed as a result of the call
     * @throws IndexOutOfBoundsException {@inheritDoc}
     * @throws NullPointerException if the specified collection is null
     */
    public boolean addAll(int index, Collection<? extends E> c) {
        rangeCheckForAdd(index);
    
        Object[] a = c.toArray();
        int numNew = a.length;
        ensureCapacityInternal(size + numNew);  // Increments modCount
    
        int numMoved = size - index; // number of existing elements at or after index that must be shifted right
        if (numMoved > 0)
            System.arraycopy(elementData, index, elementData, index + numNew,
                             numMoved);
    
        System.arraycopy(a, 0, elementData, index, numNew);
        size += numNew;
        return numNew != 0;
    }
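
    The two System.arraycopy calls in addAll(int, Collection) can be replayed on a bare array. The sketch below is a simplified model under stated assumptions: the class AddAllDemo and the method insertAll are hypothetical, and Arrays.copyOf merely stands in for ensureCapacityInternal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class AddAllDemo {
    // Model of addAll(index, c) on a raw array; returns the resulting array.
    static Object[] insertAll(Object[] data, int size, int index, Object[] extra) {
        // Stand-in for ensureCapacityInternal: make room for the new elements.
        Object[] result = Arrays.copyOf(data, size + extra.length);
        int numMoved = size - index;
        if (numMoved > 0) {
            // Open a gap of extra.length slots starting at index.
            System.arraycopy(result, index, result, index + extra.length, numMoved);
        }
        // Fill the gap with the new elements.
        System.arraycopy(extra, 0, result, index, extra.length);
        return result;
    }

    // The same operation through the public API.
    static List<Integer> insertBlock() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 6));
        list.addAll(2, Arrays.asList(3, 4, 5));
        return list;
    }

    // addAll returns false when nothing was added.
    static boolean appendEmpty() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2));
        return list.addAll(Collections.<Integer>emptyList());
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
                insertAll(new Object[] {1, 2, 6}, 3, 2, new Object[] {3, 4, 5})));
        System.out.println(insertBlock());
        System.out.println(appendEmpty());
    }
}
```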
    

    Removing a Single Element

    There are two main ways:
    First, remove the element at a given index:

    /**
     * Removes the element at the specified position in this list.
     * Shifts any subsequent elements to the left (subtracts one from their
     * indices).
     *
     * @param index the index of the element to be removed
     * @return the element that was removed from the list
     * @throws IndexOutOfBoundsException {@inheritDoc}
     */
    public E remove(int index) {
        rangeCheck(index); // validate index against size, the number of elements in the list
    
        modCount++;
        E oldValue = elementData(index); // fetch the element at the given index
    
        int numMoved = size - index - 1; // number of elements that must be shifted
        if (numMoved > 0) // shift every element after index one position to the left
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // clear to let GC do its work: null the vacated slot so the object can be collected, and decrement size
    
        return oldValue;
    }
    

    Second, remove the first occurrence (from the left) of a given element:

    /**
     * Removes the first occurrence of the specified element from this list,
     * if it is present.  If the list does not contain the element, it is
     * unchanged.  More formally, removes the element with the lowest index
     * <tt>i</tt> such that
     * <tt>(o==null&nbsp;?&nbsp;get(i)==null&nbsp;:&nbsp;o.equals(get(i)))</tt>
     * (if such an element exists).  Returns <tt>true</tt> if this list
     * contained the specified element (or equivalently, if this list
     * changed as a result of the call).
     *
     * @param o element to be removed from this list, if present
     * @return <tt>true</tt> if this list contained the specified element
     */
    public boolean remove(Object o) {
        if (o == null) {
            for (int index = 0; index < size; index++)
                if (elementData[index] == null) {
                    fastRemove(index);
                    return true;
                }
        } else {
            for (int index = 0; index < size; index++)
                if (o.equals(elementData[index])) {
                    fastRemove(index);
                    return true;
                }
        }
        return false;
    }
    

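    Note two things. First, equality is decided by the equals method, so custom classes must override equals according to their own needs. Second, the list is scanned from left to right, and only the first element that equals the given object is removed.

    The fastRemove method is: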

    /*
     * Private remove method that skips bounds checking and does not
     * return the value removed.
     */
    private void fastRemove(int index) {
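        modCount++; // increment the modification count
        int numMoved = size - index - 1; // number of elements that must be shifted left
        if (numMoved > 0) // if needed, shift the elements after index one position to the left; the last slot is then nulled below so GC can reclaim the object, and size is decremented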
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // clear to let GC do its work
    }
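
    The two remove overloads discussed above are easy to confuse on a List<Integer>, because an int argument selects remove(int index) while an Integer argument selects remove(Object). A short usage sketch follows; the class RemoveOverloadDemo and its helpers are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemoveOverloadDemo {
    // int argument: removes the element AT index 1.
    static List<Integer> removeByIndex() {
        List<Integer> list = new ArrayList<>(Arrays.asList(3, 2, 1, 0));
        list.remove(1);
        return list;
    }

    // Integer argument: removes the first element EQUAL to 1.
    static List<Integer> removeByValue() {
        List<Integer> list = new ArrayList<>(Arrays.asList(3, 2, 1, 0));
        list.remove(Integer.valueOf(1));
        return list;
    }

    // Only the first (leftmost) match is removed.
    static List<String> removeFirstOccurrence() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "a"));
        list.remove("a");
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeByIndex());
        System.out.println(removeByValue());
        System.out.println(removeFirstOccurrence());
    }
}
```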
    

    Removing Multiple Elements

    First, remove all elements:

    /**
     * Removes all of the elements from this list.  The list will
     * be empty after this call returns.
     */
    public void clear() {
        modCount++; // increment the modification count
    
        // clear to let GC do its work
        for (int i = 0; i < size; i++) // set the element reference at every index to null
            elementData[i] = null;
    
        size = 0; // reset the list size to 0
    }
    

    Second, remove the elements in a given range, including the start index and excluding the end index:

    /**
     * Removes from this list all of the elements whose index is between
     * {@code fromIndex}, inclusive, and {@code toIndex}, exclusive.
     * Shifts any succeeding elements to the left (reduces their index).
     * This call shortens the list by {@code (toIndex - fromIndex)} elements.
     * (If {@code toIndex==fromIndex}, this operation has no effect.)
     *
     * @throws IndexOutOfBoundsException if {@code fromIndex} or
     *         {@code toIndex} is out of range
     *         ({@code fromIndex < 0 ||
     *          fromIndex >= size() ||
     *          toIndex > size() ||
     *          toIndex < fromIndex})
     */
    protected void removeRange(int fromIndex, int toIndex) {
        modCount++; // increment the modification count
        int numMoved = size - toIndex; // number of elements that must be shifted left
        System.arraycopy(elementData, toIndex, elementData, fromIndex,
                         numMoved);
    
        // clear to let GC do its work
        int newSize = size - (toIndex-fromIndex); // compute the new size of the list
        for (int i = newSize; i < size; i++) { // null out the vacated tail slots, from newSize up to the old size
            elementData[i] = null;
        }
        size = newSize; // update the list size
    }
    

    Note that this method is protected, so it can be called only from ArrayList itself, from its subclasses, and from classes in the same package.
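
    Since removeRange is not public, ordinary code usually gets the same effect by clearing a subList view; a subclass can also expose the method directly. Both options are sketched below. The class RemoveRangeDemo, the subclass RangeList, and its method removeBetween are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class RemoveRangeDemo {
    // Hypothetical subclass that exposes the protected removeRange method.
    static class RangeList<E> extends ArrayList<E> {
        RangeList(Collection<? extends E> c) {
            super(c);
        }

        void removeBetween(int fromIndex, int toIndex) {
            removeRange(fromIndex, toIndex); // legal here: we are a subclass
        }
    }

    // Public-API route: clear a view of the range [1, 4).
    static List<Integer> viaSubList() {
        List<Integer> list = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4, 5));
        list.subList(1, 4).clear();
        return list;
    }

    // Subclass route: call removeRange through the wrapper method.
    static List<Integer> viaSubclass() {
        RangeList<Integer> list = new RangeList<>(Arrays.asList(0, 1, 2, 3, 4, 5));
        list.removeBetween(1, 4);
        return list;
    }

    public static void main(String[] args) {
        System.out.println(viaSubList());
        System.out.println(viaSubclass());
    }
}
```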

    Third, remove in bulk either the elements contained in a given collection or the elements not contained in it:

    private boolean batchRemove(Collection<?> c, boolean complement) {
        final Object[] elementData = this.elementData;
        int r = 0, w = 0;
        boolean modified = false;
        try {
            for (; r < size; r++) // scan from front to back
                if (c.contains(elementData[r]) == complement)
                    elementData[w++] = elementData[r];
        } finally {
            // Preserve behavioral compatibility with AbstractCollection,
            // even if c.contains() throws.
            if (r != size) { // contains() threw before the scan finished: move the unscanned remainder forward as a whole
                System.arraycopy(elementData, r,
                                 elementData, w,
                                 size - r);
                w += size - r;
            }
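            if (w != size) { // at least one element was removed
                // clear to let GC do its work
                for (int i = w; i < size; i++) // null out the slots after the kept elements
                    elementData[i] = null;
                modCount += size - w; // add the number of removed elements to the modification count
                size = w; // update the list size
                modified = true; // mark the list as modified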
            }
        }
        return modified;
    }
    

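    Removal uses two pointers: w marks the end of the new list being built in place, and r scans the old list. A single pass yields the new list, whose size is w. The pass makes n calls to c.contains, so it is O(n) overall only when contains is O(1) (for example with a HashSet); with a list of m elements as the argument it degrades to O(n·m).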
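
    The two-pointer idea is independent of ArrayList and can be sketched on a plain array. This is a stripped-down illustration, not the JDK method: the class TwoPointerCompaction and the method compact are hypothetical, and the exception-safety logic of the finally block is deliberately left out.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;

public class TwoPointerCompaction {
    // Keeps the elements for which c.contains(x) == complement, compacting in place.
    // Returns the new logical size; the slots beyond it are nulled.
    static int compact(Object[] data, int size, Collection<?> c, boolean complement) {
        int w = 0;                              // write pointer: end of the kept prefix
        for (int r = 0; r < size; r++) {        // read pointer: scans every element once
            if (c.contains(data[r]) == complement) {
                data[w++] = data[r];
            }
        }
        for (int i = w; i < size; i++) {
            data[i] = null;                     // drop stale references
        }
        return w;
    }

    public static void main(String[] args) {
        Object[] data = {1, 2, 3, 4, 5};
        Collection<Object> toRemove = new HashSet<Object>(Arrays.asList(2, 4));
        int newSize = compact(data, data.length, toRemove, false); // removeAll semantics
        System.out.println(newSize + " " + Arrays.toString(data));
    }
}
```

    Passing complement = false gives removeAll semantics, and complement = true gives retainAll semantics.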

    Fourth, remove all elements contained in a given collection:

    public boolean removeAll(Collection<?> c) {
        Objects.requireNonNull(c);
        return batchRemove(c, false);
    }
    

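    It simply delegates to the method from the third way, so no further explanation is needed.
    Fifth, remove all elements not contained in a given collection: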

    public boolean retainAll(Collection<?> c) {
        Objects.requireNonNull(c);
        return batchRemove(c, true);
    }
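
    As a usage illustration, removeAll and retainAll behave like set difference and intersection while keeping the list's order and duplicates. The class BulkRemoveDemo and the helpers difference and intersection are hypothetical names; copying the input first is a choice of this sketch so that the caller's list stays untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

public class BulkRemoveDemo {
    // Elements of a that are NOT in b (order and duplicates of a are preserved).
    static List<Integer> difference(List<Integer> a, Collection<Integer> b) {
        List<Integer> copy = new ArrayList<>(a);
        copy.removeAll(b);
        return copy;
    }

    // Elements of a that ARE in b (order and duplicates of a are preserved).
    static List<Integer> intersection(List<Integer> a, Collection<Integer> b) {
        List<Integer> copy = new ArrayList<>(a);
        copy.retainAll(b);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3, 4, 2);
        // A HashSet keeps each contains() call cheap.
        Collection<Integer> b = new HashSet<>(Arrays.asList(2, 4));
        System.out.println(difference(a, b));
        System.out.println(intersection(a, b));
    }
}
```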
    

    Sixth, remove all elements that match a condition:

    @Override
    public boolean removeIf(Predicate<? super E> filter) {
        Objects.requireNonNull(filter);
        // figure out which elements are to be removed
        // any exception thrown from the filter predicate at this stage
        // will leave the collection unmodified
        int removeCount = 0;
        final BitSet removeSet = new BitSet(size);
        final int expectedModCount = modCount;
        final int size = this.size;
        for (int i=0; modCount == expectedModCount && i < size; i++) {
            @SuppressWarnings("unchecked")
            final E element = (E) elementData[i];
            if (filter.test(element)) {
                removeSet.set(i);
                removeCount++;
            }
        }
        if (modCount != expectedModCount) {
            throw new ConcurrentModificationException();
        }
    
        // shift surviving elements left over the spaces left by removed elements
        final boolean anyToRemove = removeCount > 0;
        if (anyToRemove) {
            final int newSize = size - removeCount;
            for (int i=0, j=0; (i < size) && (j < newSize); i++, j++) {
                i = removeSet.nextClearBit(i);
                elementData[j] = elementData[i];
            }
            for (int k=newSize; k < size; k++) {
                elementData[k] = null;  // Let gc do its work
            }
            this.size = newSize;
            if (modCount != expectedModCount) {
                throw new ConcurrentModificationException();
            }
            modCount++;
        }
    
        return anyToRemove;
    }
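
    The source comment above says that an exception thrown by the predicate during the first pass leaves the collection unmodified. The sketch below exercises both the normal case and that failure case; the class RemoveIfDemo and its helper methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemoveIfDemo {
    // Normal case: drop every even number from a copy of the input.
    static List<Integer> dropEven(List<Integer> source) {
        List<Integer> copy = new ArrayList<>(source);
        copy.removeIf(n -> n % 2 == 0);
        return copy;
    }

    // Failure case: the predicate throws part-way through the scan.
    static List<Integer> failingPredicateLeavesListIntact() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            list.removeIf(n -> {
                if (n == 3) {
                    throw new IllegalStateException("predicate failed");
                }
                return n % 2 == 0;
            });
        } catch (IllegalStateException expected) {
            // Removal candidates were only being marked, so nothing has been removed.
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(dropEven(Arrays.asList(1, 2, 3, 4, 5, 6)));
        System.out.println(failingPredicateLeavesListIntact());
    }
}
```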
    

    Serialization Support

    /**
     * Save the state of the <tt>ArrayList</tt> instance to a stream (that
     * is, serialize it).
     *
     * @serialData The length of the array backing the <tt>ArrayList</tt>
     *             instance is emitted (int), followed by all of its elements
     *             (each an <tt>Object</tt>) in the proper order.
     */
    private void writeObject(java.io.ObjectOutputStream s)
        throws java.io.IOException{
        // Write out element count, and any hidden stuff
        int expectedModCount = modCount;
        s.defaultWriteObject();
    
        // Write out size as capacity for behavioural compatibility with clone()
        s.writeInt(size);
    
        // Write out all elements in the proper order.
        for (int i=0; i<size; i++) {
            s.writeObject(elementData[i]);
        }
    
        if (modCount != expectedModCount) {
            throw new ConcurrentModificationException();
        }
    }
    

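    Note that the list must not be structurally modified while it is being serialized (otherwise a ConcurrentModificationException is thrown), and that the list's size is written to the stream as well.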

    /**
     * Reconstitute the <tt>ArrayList</tt> instance from a stream (that is,
     * deserialize it).
     */
    private void readObject(java.io.ObjectInputStream s)
        throws java.io.IOException, ClassNotFoundException {
        elementData = EMPTY_ELEMENTDATA;
    
        // Read in size, and any hidden stuff
        s.defaultReadObject();
    
        // Read in capacity
        s.readInt(); // ignored
    
        if (size > 0) {
            // be like clone(), allocate array based upon size not capacity
            int capacity = calculateCapacity(elementData, size);
            SharedSecrets.getJavaOISAccess().checkArray(s, Object[].class, capacity);
            ensureCapacityInternal(size);
    
            Object[] a = elementData;
            // Read in all elements in the proper order.
            for (int i=0; i<size; i++) {
                a[i] = s.readObject();
            }
        }
    }
    

    After deserialization, the list's capacity equals its size.
    Test code:

    package com.company;
    
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.lang.reflect.Field;
    import java.util.ArrayList;
    import java.util.List;
    
    public class Main {
        public static void main(String[] args) throws Exception {
            // Build a list of 12 elements, serialize it to bytes, and read it back.
            List<Integer> list = new ArrayList<>();
            for (int i = 0; i < 12; i++) {
                list.add(i);
            }
            System.out.println(list.size());
            System.out.println(list);
            ByteArrayOutputStream os = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(os);
            oos.writeObject(list);
            oos.flush();
            byte[] bytes = os.toByteArray();
            ObjectInputStream inputStream = new ObjectInputStream(new ByteArrayInputStream(bytes));
            List<Integer> o = (List<Integer>)inputStream.readObject();
            System.out.println(o.size());
            System.out.println(o);
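            // Read the private backing array via reflection to compare capacities.
            // This works on Java 8; on Java 16 and later it needs
            // --add-opens java.base/java.util=ALL-UNNAMED.
            Field elementDataField = o.getClass().getDeclaredField("elementData");
            elementDataField.setAccessible(true);
            Object[] elementData = (Object[]) elementDataField.get(list);
            System.out.println(elementData.length); // capacity of the original list
            elementData = (Object[]) elementDataField.get(o);
            System.out.println(elementData.length); // capacity of the deserialized list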
        }
    }
    

    Cloning

    /**
     * Returns a shallow copy of this <tt>ArrayList</tt> instance.  (The
     * elements themselves are not copied.)
     *
     * @return a clone of this <tt>ArrayList</tt> instance
     */
    public Object clone() {
        try {
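            ArrayList<?> v = (ArrayList<?>) super.clone(); // Object.clone is a shallow, field-by-field copy, so at this point v still shares the backing array
            v.elementData = Arrays.copyOf(elementData, size); // give the clone its own backing array; the element references are copied, not the elements themselves
            v.modCount = 0; // reset the clone's modification count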
            return v;
        } catch (CloneNotSupportedException e) {
            // this shouldn't happen, since we are Cloneable
            throw new InternalError(e);
        }
    }
    

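    Note that this is still a shallow copy, as the Javadoc states. The clone gets its own backing array, so add, remove, and set calls on one list do not affect the other, but both arrays reference the same element objects.

    Test code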

    ArrayList<Integer> list = new ArrayList<>();
    list.add(0);
    List<Integer> clone = (List<Integer>) list.clone();
    clone.set(0,1000);
    System.out.println(list.get(0)); // 0
    System.out.println(clone.get(0)); // 1000
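
    The test above uses immutable Integer elements, so it cannot reveal whether the elements themselves are shared. A variant with mutable elements makes the sharing visible. This is an illustrative sketch; the class CloneShallowDemo and the method mutateThroughClone are hypothetical names.

```java
import java.util.ArrayList;

public class CloneShallowDemo {
    // Returns "<first element of original>:<original size>:<clone size>".
    @SuppressWarnings("unchecked")
    static String mutateThroughClone() {
        ArrayList<StringBuilder> original = new ArrayList<>();
        original.add(new StringBuilder("a"));

        ArrayList<StringBuilder> copy = (ArrayList<StringBuilder>) original.clone();
        copy.get(0).append("b");            // mutates the element object both lists share
        copy.add(new StringBuilder("c"));   // structural change: affects only the clone

        return original.get(0) + ":" + original.size() + ":" + copy.size();
    }

    public static void main(String[] args) {
        System.out.println(mutateThroughClone());
    }
}
```

    The original list sees the appended "b" because both lists point at the same StringBuilder, while its size stays 1 because the backing arrays are separate.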
    

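    Replacing / Transforming

    Replacement is essentially a transformation, except that it overwrites the old values in the original array, so the original data is lost. If you still need the original data, copy it beforehand.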

    @Override
    @SuppressWarnings("unchecked")
    public void replaceAll(UnaryOperator<E> operator) {
        Objects.requireNonNull(operator);
        final int expectedModCount = modCount;
        final int size = this.size;
        for (int i=0; modCount == expectedModCount && i < size; i++) {
            elementData[i] = operator.apply((E) elementData[i]);
        }
        if (modCount != expectedModCount) {
            throw new ConcurrentModificationException();
        }
        modCount++;
    }
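
    A small usage sketch of the "copy first, then transform" advice follows. The class ReplaceAllDemo and the method upperCased are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReplaceAllDemo {
    // Transforms a copy so that the caller's list keeps its original values.
    static List<String> upperCased(List<String> source) {
        List<String> copy = new ArrayList<>(source);
        copy.replaceAll(String::toUpperCase); // overwrites each slot of the copy in place
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("ada", "linus"));
        System.out.println(upperCased(names));
        System.out.println(names); // the source list is untouched
    }
}
```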
    

    Sorting

    Sorting delegates to a general-purpose sort (it calls Arrays.sort), and the ordering rule is supplied by the caller through a Comparator.

    @Override
    @SuppressWarnings("unchecked")
    public void sort(Comparator<? super E> c) {
        final int expectedModCount = modCount;
        Arrays.sort((E[]) elementData, 0, size, c);
        if (modCount != expectedModCount) {
            throw new ConcurrentModificationException();
        }
        modCount++;
    }
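
    As a usage sketch, the comparator can be composed from several keys, and passing null falls back to the elements' natural ordering. The class SortDemo and its helper methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SortDemo {
    // Sorts by length first, then alphabetically among equal lengths.
    static List<String> byLengthThenAlphabet(List<String> source) {
        List<String> copy = new ArrayList<>(source);
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        copy.sort(byLength.thenComparing(Comparator.<String>naturalOrder()));
        return copy;
    }

    // A null comparator means "use the natural ordering of the elements".
    static List<String> naturalOrder(List<String> source) {
        List<String> copy = new ArrayList<>(source);
        copy.sort(null);
        return copy;
    }

    public static void main(String[] args) {
        List<String> fruit = Arrays.asList("pear", "fig", "banana", "kiwi");
        System.out.println(byLengthThenAlphabet(fruit));
        System.out.println(naturalOrder(fruit));
    }
}
```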
    

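    Iteration

    • Itr implements an iterator that supports forward traversal and the remove operation; it is returned by the iterator method.
    • ListItr implements an iterator that supports both forward and backward traversal as well as adding, removing, and modifying elements; it is returned by the listIterator method.

    No discussion of iteration is complete without a very well-known exception, ConcurrentModificationException. In most cases it is thrown because the list was structurally modified (elements added or removed) after the iterator was created, so that modCount != expectedModCount; the iterator also throws it when its cursor no longer fits inside the backing array. The rule to follow: while using an iterator, do not modify the list through the list's own methods; use the corresponding methods provided by the iterator instead.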
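
    The rule can be demonstrated with a short sketch that removes elements both the wrong way and the right way. The class IteratorRemoveDemo and its two helper methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class IteratorRemoveDemo {
    // Wrong: modifying the list directly inside a for-each loop.
    static boolean removeDuringForEachThrows() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer value : list) {
                if (value == 1) {
                    list.remove(value); // remove(Object): bumps modCount behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    // Right: removing through the iterator keeps expectedModCount in sync.
    static List<Integer> removeEvenViaIterator() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeDuringForEachThrows());
        System.out.println(removeEvenViaIterator());
    }
}
```

    The fail-fast check is best-effort: it only runs when the iterator is advanced, so some direct modifications (for instance removing the second-to-last element) can go undetected.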

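    Strengths and Weaknesses of ArrayList

    Strengths

    • Sequential storage with random access: each element is tied to its position, so access by index is O(1) and index-based traversal is fast
    • Insertion and removal at the tail are fast

    Weaknesses

    • Not thread-safe
    • Insertion or removal anywhere but the tail requires shifting the subsequent elements, which is relatively slow
    • Growing and shrinking are supported, but the existing data must be copied into a new array each time, which is relatively costly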
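
    One common way to work around the lack of thread safety is to wrap the list with Collections.synchronizedList; this is a general JDK facility rather than something taken from the article. The class SynchronizedListDemo and the method concurrentAppend are hypothetical names, and the thread and element counts are arbitrary.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SynchronizedListDemo {
    // Appends from several threads through a synchronized wrapper and returns the final size.
    static int concurrentAppend(int threads, int perThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<Integer>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i); // each call is guarded by the wrapper's lock
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(concurrentAppend(4, 1000));
    }
}
```

    The wrapper only makes individual calls atomic; iterating over the wrapped list still requires synchronizing on it manually.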

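    Summary

    This article is relatively simple. Ultimately, every ArrayList operation is an operation on the underlying array, so a solid understanding of this very simple data structure goes a long way toward understanding each of ArrayList's operations.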

  • Original article: https://www.cnblogs.com/johnny666888/p/12868310.html