【java容器的刻意练习】【八】ArrayList与LinkedList的遍历

我们使用容器经常会用到遍历，而之前几篇文章都没有提到这一点。所以，今天把这块内容补一下。

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, 
        RandomAccess, 
        Cloneable, 
        java.io.Serializable

ArrayList 集成 AbstractList 抽象类。AbstractList 中提供了两个迭代器的实现类，默认实现了迭代器接口，实现了对元素的遍历，它们就是Itr 和其子类 ListItr。
ArrayList 实现 List 接口，能对它进行队列操作。
ArrayList 实现 RandomAccess 接口，
ArrayList 实现了 Cloneable 接口，即覆盖了函数clone()，能克隆。
ArrayList 实现 java.io.Serializable 接口，这意味着 LinkedList 支持序列化，能通过序列化去传输。

先看看 ArrayList 的三种遍历

        // 先创建一个100w元素的ArrayList
        for (int i=0; i < 1000000; i++){
            arrayList.add(i);
        }

        long start, end;

        // foreach循环
        start = System.currentTimeMillis();
        for (Integer item : arrayList) {
            //System.out.println(item);
        }
        end = System.currentTimeMillis();
        System.out.println("foreach cost time = " + (end-start) + "ms");

        // 普通for循环，通过get函数获取元素
        start = System.currentTimeMillis();
        for (int i=0; i < arrayList.size(); i++) {
            arrayList.get(i);
        }
        end = System.currentTimeMillis();
        System.out.println("for cost time = " + (end-start) + "ms");

        // 迭代器循环
        start = System.currentTimeMillis();
        Iterator<Integer> it = arrayList.iterator();
        while (it.hasNext()) {
            it.next();
        }
        end = System.currentTimeMillis();
        System.out.println("Iterator cost time = " + (end-start) + "ms");

foreach cost time = 15ms
for cost time = 7ms
Iterator cost time = 29ms

对同一个100w元素的ArrayList进行三种不同的循环，我们发现，最快的居然是通过get函数读取元素。

那么，foreach循环是什么东西，怎么会比迭代器快的呢？

使用foreach结构的类对象必须实现了Iterable接口，Java的Collection继承自此接口，List实现了Collection，Collection 这个接口仅包含一个函数，源码如下：

public interface Iterable<T> {
 
    /**
     * Returns an iterator over a set of elements of type T.
     *
     * @return an Iterator.
     */
    Iterator<T> iterator();

}

iterator()用于返回一个Iterator，原来，JAVA中的增强for循环底层是通过迭代器模式来实现的。

那为什么遍历ArrayList使用迭代器就这么慢呢？

我们看下public interface Iterator<E>接口的实现：

   public Iterator<E> iterator() {
        return new Itr();
    }

    /**
     * An optimized version of AbstractList.Itr // 这还是优化过的版本呢
     */
    private class Itr implements Iterator<E> {
        // 数组的游标
        int cursor;       // index of next element to return 
        int lastRet = -1; // index of last element returned; -1 if no such
        int expectedModCount = modCount;

        // prevent creating a synthetic constructor
        Itr() {}

        public boolean hasNext() {
            return cursor != size;
        }

        @SuppressWarnings("unchecked")
        public E next() {
            checkForComodification();
            int i = cursor;
            if (i >= size)
                throw new NoSuchElementException();
            
            // 临时引用指向数组
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            // 通过数组索引访问元素
            cursor = i + 1;
            return (E) elementData[lastRet = i];
        }

原来，ArrayList的迭代器的next()函数也是通过索引快速访问到元素的。如果数组比较小，那么不管用哪种遍历性能都差不多。如果数组比较大，那么，我们现在知道，用普通循环+get读取元素来遍历ArrayList，会比迭代器快一倍。这是因为get做了内联优化处理，节省大量的函数调用进出时间。

所以，如果有面试官问你，ArrayLsit哪种遍历最快？为什么？你会回答了吗？

接下来，我们来看看LinedList的定义。

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, 
    Deque<E>, 
    Cloneable, 
    java.io.Serializable

LinkedList 集成AbstractSequentialList抽象类，内部使用listIterator迭代器来实现重要的方法
LinkedList 实现 List 接口，能对它进行队列操作。
LinkedList 实现 Deque 接口，即能将LinkedList当作双端队列使用。
LinkedList 实现了Cloneable接口，即覆盖了函数clone()，能克隆。
LinkedList 实现java.io.Serializable接口，这意味着LinkedList支持序列化，能通过序列化去传输。

先看 LinkedList 的遍历。先插入100w个元素，然后用foreach、迭代器、removeFirst 这三种方法遍历（循环get不需要测试了，简直慢到难以忍受）。

        LinkedList linkedList = new LinkedList();
        for (int i=0; i<1000000; i++) {
            linkedList.add(i);
        }

        long start, end;

        start = System.currentTimeMillis();
        for (Object item : linkedList) {
            //System.out.println(item);
        }
        end = System.currentTimeMillis();

        System.out.println("foreach cost time = " + (end-start) + "ms");

        start = System.currentTimeMillis();
        Iterator<Integer> it = linkedList.iterator();
        while (it.hasNext()) {
            it.next();
        }
        end = System.currentTimeMillis();
        System.out.println("Iterator cost time = " + (end-start) + "ms");

        start = System.currentTimeMillis();
        while (linkedList.size() != 0) {
            linkedList.removeFirst();
        }
        end = System.currentTimeMillis();
        System.out.println("removeFirst cost time = " + (end-start) + "ms");

运行代码得到结果如下：

foreach cost time = 12ms
Iterator cost time = 11ms
removeFirst cost time = 24ms

印证了之前我们分析的结论，foreach就是Iterator，所以它们的效率都差不多。不过从代码精简方面来看，当然是foreach更简单。

而某些文章提到用removeFirst会遍历更快，但是在笔者的环境下不成立。

为什么呢？我们先看看next()的实现：

    private class ListItr implements ListIterator<E> {

        private Node<E> lastReturned;
        private Node<E> next;
        private int nextIndex;
        private int expectedModCount = modCount;

        public E next() {
            checkForComodification();
            if (!hasNext())
                throw new NoSuchElementException();

             
            lastReturned = next;
            next = next.next;
            nextIndex++;
            return lastReturned.item;
        }

真相大白！原来LinkedList为了用迭代器顺序遍历，用了lastReturned保存最近一次返回的节点，next下一次返回的节点，nextIndex保存下一个节点的索引。

再将元素增大一倍，就是200万个元素。

foreach cost time = 20ms
Iterator cost time = 21ms
removeFirst cost time = 37ms

而removeFirst还是比迭代器多10多毫秒。证明这个多10ms应该是多了节点删除操作的开销。

ok，那么今天我们的结论就是

ArrayList推荐用for循环get元素遍历最快，不仅是连续内存读取，而且还有内联优化节省函数调用开销。

LinkedList推荐用foreach遍历元素最快，因为内部实现是迭代器。

相同的数据量，ArrayList最快的遍历比LinkedList最快遍历快一倍。

【java容器的刻意练习】【八】ArrayList与LinkedList的遍历

推荐阅读更多精彩内容