Greetings!
After studying the collections, namely such List
implementations such as ArrayList
and LinkedList
, an idea emerged, why not combine these data structures into one and see what happens.
ArrayList
problem - it has the default initial size of DEFAULT_CAPACITY
or the specified initialCapacity
size, when this size is exceeded, a new larger array is created, while the data from the old array is copied there, which is very time consuming and this gives in the worst case algorithmic complexity O(n)
LinkedList
problem is the opposite here, adding a new item is just adding a new link (creating another Node
and adding a link to it), but the operation of getting an item by index is very expensive, since you will need to go through the entire list from the beginning, which is very expensive and gives O(n)
What if you create a data structure in which the insertion and receipt of any element will be in constant time. I will use the ArrayList
technology without re-creating the array
In order to link them together, I will use a doubly linked list:
Let's go directly to the source code:
public class Node<T> { Node prev; Node next; T[] value; public Node(Node prev, Node next, T[] value) { this.prev = prev; this.next = next; this.value = value; } }
To begin, create a standard doubly-linked list structure, where value
is an array of values.
Next, we turn to a specific implementation of the class, declare the necessary variables:
public static int INIT_CAPACITY = 100; private Object[] arr = new Object[INIT_CAPACITY]; private int index = 0; private int level = 0; private Node<T> first = new Node<>(null, null, (T[]) arr); private Node last = first; private Node current = null; private int size = 0;
Here INIT_CAPACITY
is the initial size of the array, it can be overridden in the corresponding constructor, arr
is the array itself, the index
variable is needed to calculate the index, level
is to calculate the level, then we will explain in detail what it is for, first
- link to the first element of the list ,last
is a link to the last element of the list, current
is a link to the current element of the list (the last selection), so you can speed up the selection of consecutive elements or close to them, size
is the size (or amount of data).
Set up 2 constructors - by default and to change the initial size:
public MyLinkedList() { first.next = last.next; } public MyLinkedList(int size) { INIT_CAPACITY = size; arr = new Object[INIT_CAPACITY]; first.next = last.next; }
Add item:
public void add(T i) { if (index == INIT_CAPACITY) { arr = new Object[INIT_CAPACITY]; last.next = new Node<>(last, null, (T[]) arr); index = 0; last = last.next; } arr[index] = i; index++; size++; }
Here we check the condition; if the array is full, then we create a new one and remember the link to it.
Item receipt:
public T get(int i) { T value; int level = i / INIT_CAPACITY; int index = i % INIT_CAPACITY; if (this.current == null) { this.level = 0; this.current = first; } if(this.level > level) for (int j = this.level; j < level; j++) { this.level = level; current = current.prev; } else for (int j = this.level; j < level; j++) { this.level = level; current = current.next; } value = (T) current.value[index]; return value; }
Levels are the number of arrays in the list, ie, at the 0th level, 1 array, at the 1st level 2 arrays, etc., index
is the index of the current level 0..INIT_CAPACITY
, we also have a link to the current element list of current
, which was obtained from the previous sample, i.e. if the new level is greater than the previous one, then go forward from the current element and if vice versa, then go back. Also added 2 quick operations - getting the first and last element:
public T getFirst(){ return first.value[0]; } public T getLast() { return (T) last.value[(size-1) % INIT_CAPACITY]; }
The first and last elements get as easy and fast as in an array.
The operation of deleting the last element - the fastest thing is to overwrite the value with null, if the entire array becomes filled with nulls, then we lose the reference to it and the garbage collector will clean everything:
public void removeLast(){ if (last.value[0] == null) { last = last.prev; index=INIT_CAPACITY-1; } last.value[(size-1) % INIT_CAPACITY]=null; size--; index--; }
Getting the size is obvious. An iterator has also been added, i.e. this class implements iterable and implements the iterator method
private Node<T> first; private int index = -1; public MyLinkedListIterator(Node<T> first) { this.first = first; } @Override public boolean hasNext() { index++; return first != null; } @Override public T next() { T value; int index = this.index % INIT_CAPACITY; value= first.value[index]; if(index==INIT_CAPACITY-1||this.first.value[index+1]==null) first=first.next; return value; }
Measured using the JMH benchmark, an example test:
public class TestGetMyDeque { private static class SetDeque { MyDeque<Integer> deque = new MyDeque<>(); SetDeque() { for (int i = 0; i <Constant.COUNT_ELEMENT; i++) { deque.add(i); } } } @State(Scope.Benchmark) public static class MyState{ SetDeque deque; @Setup(Level.Invocation) public void setUp() { deque = new SetDeque(); } } @Benchmark @BenchmarkMode(Mode.AverageTime) @OutputTimeUnit(TimeUnit.SECONDS) public void getFirst(MyState state, Blackhole blackhole) { for (int i = 0; i < Constant.COUNT_ELEMENT; i++) { blackhole.consume(state.deque.deque.getFirst()); } } }
Documentation can be viewed on the Internet.
I took a different number of elements, test results in seconds:
N = 100,000 | Insert at the end | Getting first | Getting average | Getting last |
---|---|---|---|---|
MyDeque | 0,002 | 0 | 0.001 | 0 |
ArrayDeque | 0,002 | 0 | - | 0.001 |
ArrayList | 0,002 | 0.001 | 0 | 0.001 |
Linkedlist | 0,003 | 0 | 28,465 | 0.001 |
Take a million items:
N = 1,000,000 | Insert at the end | Getting first | Getting average | Getting last |
---|---|---|---|---|
MyDeque | 0,019 | 0,005 | 0,010 | 0,007 |
ArrayDeque | 0.025 | 0,005 | - | 0,005 |
ArrayList | 0.036 | 0,005 | 0,005 | 0,005 |
Linkedlist | 0.051 | 0,005 | OutOfMemoryError: Java heap space | 0,006 |
Finally, take 10 million items:
N = 10,000,000 | Insert at the end | Getting first | Getting average | Getting last |
---|---|---|---|---|
MyDeque | 1.832 | 0.047 | 0.090 | 0.071 |
ArrayDeque | 2,065 | 0.046 | - | 0.058 |
ArrayList | 4,413 | 0.048 | 0.047 | 0.048 |
Linkedlist | OutOfMemoryError: Java heap space |
Measurement of memory done on the principle https://github.com/jheusser/core-java-performance-examples/
(x axis is a million elements, y axis is memory mb)
As can be seen from the graph, devouring memory with this structure is not much different from similar data structures.
In the end, we received a LIFO queue, which works faster than the usual Deque, and in terms of inserting elements, much faster than ArrayList, there are also no problems that can be encountered when using LinkedList. In the future, it is planned to implement such operations as insertion and deletion from any place without loss of performance, there are already some developments in this regard, at this stage I would like to receive feedback and see the interest of the community for further work on the new data structure.
Source: https://habr.com/ru/post/352930/
All Articles