**The basics.**PersistentVector stores its elements in arrays, each array having at most size 32. One can think of the arrays as forming a wide balanced tree with each node having at most 32 children. Here is an example of what a size 32

^{2}= 1024 PersistentVector tree might look like. The root of the tree is a pointer to a size-32 array ("blue") which has 32 children ("red"). The 32 red arrays each have 32 references to actual 1024 objects stored in the PersistentVector.

**Fetching elements (get).**The key to understanding PersistentVector is to look at the binary representation of the numbers that index the vector. Java uses 32 bits for ints with signed two's-complement representation. Examples: One can think of this binary representation as divided into chunks of 5 bits, represented in colors here. These "chunks" are mapped to levels in the tree: for example, when fetching the element at 16, one looks at the the "blue" bits (00000) -- this determines the index at the root child array (0); following that pointer, the "red" bits (10000) determines the index at the first level (16). When fetching index 49 the blue bits index to 1 and the red bits index to 17. In this case the depth of the tree is 2: the cost of lookup is following two pointers and some bit computation. In general the cost of lookup is proportional to the height of the tree, i.e., O(log

_{32}n) where n is the number of elements in the vector. In practice the height is no more than 7. Fetching the n'th element in PersistentVector is done with the

`nth`method.

Ignoring the 'tail'-part, the code uses an instance variable:public Object nth(int i){ if(i >= 0 && i < cnt) { if(i >= tailoff()) return tail[i & 0x01f]; Object[] arr = root; for(int level = shift; level > 0; level -= 5) arr = (Object[]) arr[(i >>> level) & 0x01f]; return arr[i & 0x01f]; } throw new IndexOutOfBoundsException(); }

`shift`which is determined by the height of the tree:

`shift = 5*(h+1)`where

`h`is the height of the tree (defined here as 0 for an empty tree). Another way to think of the

`shift`variable is that it determines how many bits to right-shift the bit representation of an integer to move for a certain block (color) of interest to be the rightmost block. For example shifting

`0000110001 (49)`with 5 gives

`00000000001`which is exactly the "blue" index. In general

`shift`determines the right-shift needed to access the root index, i.e., for the block of bits that index into the root array to become the right-most block. The tree is searched from the most significant blocks (root, and it's children) towards the least significant blocks (the "high levels" or leaves). Notice that for any integer

`i`, if we mask

`i`'s bit representation with

`00...011111`we get the five right-most bits of

`i`. The expression

`i & 0x1f`does this as

`0x1f`is

`011111`.

**Insertion (put).**Insertion is more tricky. Here we consider only the 'cons' method which appends an element to the end of the vector. The method which replaces the element at an index is easy once 'cons' and 'nth' is understood. Of course, since PersistentVector is immutable 'cons' really returns a new PersistentVector object which is equal to the old one except that it has the new element at the end. With cons, the tree grows only in a balanced way: elements are added at the deepest level at the rightmost array. If the PersistentVector contains a number of elements which is a power of 32 (e.g., 1024 = 32

^{2}), the height of the tree is increased by one and there is now "room" for 32 times more elements. In our example from before, suppose we add an element at the end of the vector. We obtain this tree: Notice that the entire old PersistentVector is shared with the new PersistentVector! No copying is involved in this particular case.

**The tail...**Actually (to be precise) this happens not exactly at powers of 32 but slightly delayed: at the insertion of the 33rd element after a power of 32. E.g., 1024 + 33 = 1057. As you can see from the figure above, an entire size-32 array is inserted at the lower right-most position (instead of a size 1 array with the new element). This is an optimization. There is a special instance variable

`tail`of size at most 32. The

`tail`array is a sort of buffer where elements are appended until a full size-32 array can be placed in the tree. This makes "batch" insertion faster: suppose you creating a new PersistentVector from another collection type, e.g., a LinkedList. Since PersistentVector is immutable this is done by starting with the empty vector and then repeatedly 'cons'ing elements from the LinkedList to obtain a new vector. E.g.,

This also explains the 'tail' code in thestatic public PersistentVector create(List items){ PersistentVector ret = EMPTY; for(Object item : items) ret = ret.cons(item); return ret; }

`nth`method above: If we a getting an element which happens to be located in the

`tail`buffer at present (there can be from 0 to 32 elements with this property), we simply access it directly. This is done by looking at the 5 least significant bits of the index. While the code for 'cons' is intricate, the idea is reasonably simple as illustrated by the above figure. I will end this post with the actual code. If anyone is interested, I can go into the details of the implementation.

public PersistentVector cons(Object val){ if(tail.length < 32) { Object[] newTail = new Object[tail.length + 1]; System.arraycopy(tail, 0, newTail, 0, tail.length); newTail[tail.length] = val; return new PersistentVector(meta(), cnt + 1, shift, root, newTail); } Box expansion = new Box(null); Object[] newroot = pushTail(shift - 5, root, tail, expansion); int newshift = shift; if(expansion.val != null) { newroot = new Object[]{newroot, expansion.val}; newshift += 5; } return new PersistentVector(meta(), cnt + 1, newshift, newroot, new Object[]{val}); }

private Object[] pushTail(int level, Object[] arr, Object[] tailNode, Box expansion){ Object newchild; if(level == 0) { newchild = tailNode; } else { newchild = pushTail(level - 5, (Object[]) arr[arr.length - 1], tailNode, expansion); if(expansion.val == null) { Object[] ret = arr.clone(); ret[arr.length - 1] = newchild; return ret; } else newchild = expansion.val; } //expansion if(arr.length == 32) { expansion.val = new Object[]{newchild}; return arr; } Object[] ret = new Object[arr.length + 1]; System.arraycopy(arr, 0, ret, 0, arr.length); ret[arr.length] = newchild; expansion.val = null; return ret; }