<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Higher-Order &#187; Clojure</title>
	<atom:link href="http://blog.higher-order.net/category/clojure/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.higher-order.net</link>
	<description>topics: functional programming, concurrency, web-development, REST, dynamic languages</description>
	<lastBuildDate>Fri, 29 Apr 2011 11:00:28 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.4</generator>
		<item>
		<title>vectormap and pvectormap</title>
		<link>http://blog.higher-order.net/2010/10/14/vectormap-and-pvectormap/</link>
		<comments>http://blog.higher-order.net/2010/10/14/vectormap-and-pvectormap/#comments</comments>
		<pubDate>Thu, 14 Oct 2010 18:59:23 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Clojure]]></category>
		<category><![CDATA[clj-ds]]></category>

		<guid isPermaLink="false">http://blog.higher-order.net/?p=690</guid>
		<description><![CDATA[So after attending Brian Goetz&#8217; talk and Rich Hickey&#8217;s talk at JAOO Aarhus (eer, I mean Goto Aarhus), I was thinking about how to construct Clojure data structures in parallel. To start with something that wasn&#8217;t too complex, I decided &#8230; <a href="http://blog.higher-order.net/2010/10/14/vectormap-and-pvectormap/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>So after attending Brian Goetz&#8217; talk and Rich Hickey&#8217;s talk at JAOO Aarhus (eer, I mean <a href="http://gotocon.com/aarhus-2010/tracks/">Goto Aarhus</a>), I was thinking about how to construct Clojure data structures in parallel. </p>
<p>To start with something that wasn&#8217;t too complex, I decided to try and create a parallel version of mapping a function for vectors, i.e., an eager function that would take a vector and a function as input and produce a mapped vector as output (instead of a seq). This would replace a pattern I&#8217;ve often used:</p>
<pre><tt><code>(into [] (map f vs))
</code></tt></pre>
<p>with <tt><code>(vectormap f vs)</code></tt>, which avoids the overhead of constructing a seq and then reconstructing a vector by conj&#8217;ing from the seq. </p>
<p>My initial goal was to produce a parallel version, e.g., <tt><code>(pvectormap f vs)</code></tt>. As Rich has pointed out, Clojure&#8217;s persistent data structures are excellent candidates for parallel processing using divide and conquer since they are trees which are already &#8220;sitting there divided!&#8221; Furthermore, immutability means no synchronization is needed. For example, <a href="http://blog.higher-order.net/2009/02/01/understanding-clojures-persistentvector-implementation/">remember that</a> PersistentVector is a balanced 32-way tree consisting of size-32 arrays of Objects (Nodes in the tree or the actual values stored in the vector). </p>
<p>I decided to warm up by implementing vectormap, i.e., the serial version, first. </p>
<h2>Remember this?</h2>
<p><a href="http://olabini.com/blog/2010/07/the-jvm-language-summit-2010/">Apparently Rich Hickey presented this piece of code</a> at JVM Lang summit:</p>
<pre><tt><code>static public Object ret1(Object ret, Object nil) {
    return ret;
}

public static int count(Object o){
    if(o instanceof Counted)
        return ((Counted) o).count();
    return countFrom(Util.ret1(o, o = null));
}
</code></tt></pre>
<p>Has Rich gone mad? A two-argument static method which simply returns the first argument?? When I first read Ola&#8217;s blog post I simply couldn&#8217;t figure out why he would use that code&#8230; However, when I was writing the vectormap code I was thinking: suppose the input vector is really large, in fact so large that we don&#8217;t have enough memory to hold both the input and the output vector. Then vectormap would produce an OutOfMemoryError. But suppose the calling code didn&#8217;t actually need the input vector afterwards: what if we released references to the elements of the input vector as we construct the corresponding mapped elements in the output vector (but without destroying the input vector)? We would only need to keep references to the Node vectors we hadn&#8217;t already processed, and null out our local variables referring to those we had. </p>
<p>This was when I realized that this is exactly what Rich&#8217;s function can help with: when calling the <tt>countFrom</tt> function, he passes the seq referenced by the local variable <tt>o</tt> as the argument, via the <tt>ret1</tt> function. The side effect of using <tt>ret1</tt> is that, since Java is strict, both argument expressions to <tt>ret1</tt> are evaluated (left-to-right), and consequently <tt>o</tt> is null&#8217;ed out before <tt>countFrom</tt> runs. No more holding on to the head <img src='http://blog.higher-order.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  I could use this to hold on to only the vector arrays I hadn&#8217;t processed. </p>
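<p>To see the evaluation-order trick in isolation, here is a minimal sketch. The class name <tt>Ret1Demo</tt> and the <tt>describe</tt> helper are made up for illustration; only the shape of <tt>ret1</tt> is taken from the snippet above.</p>

```java
final class Ret1Demo {
    // Same shape as the ret1 helper quoted above: returns its first argument.
    static Object ret1(Object ret, Object ignored) {
        return ret;
    }

    // Hypothetical callee standing in for countFrom.
    static String describe(Object value) {
        return String.valueOf(value);
    }

    // Returns what the callee saw, followed by what the local holds afterwards.
    static String[] demo() {
        Object o = "big seq";
        // Arguments are evaluated left to right: first the current value of o,
        // then the assignment o = null. ret1 hands back the original value.
        String passed = describe(ret1(o, o = null));
        String after = describe(o);
        return new String[] { passed, after };
    }
}
```

<p>The callee still receives the original object, while the caller&#8217;s local variable no longer references it by the time the callee runs.</p>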
<p>For example:</p>
<pre><tt><code>private static Node mapNode(IFn f, Node node, int level) {
	if (node == null) {return null;}
	if (level == 0) {
		return new Node(null,mapArray(f, Util.ret1(node.array, node=null)));
	}
	Object[] newArr = new Object[node.array.length];
	System.arraycopy(node.array, 0, newArr, 0, node.array.length);
	node=null;
	level -= 5;
	for (int i=0;i&lt;newArr.length;i++) {
		newArr[i] = mapNode(f,Util.ret1((Node) newArr[i], newArr[i]=null),level);
	}
	return new Node(null,newArr);
}
</code></tt></pre>
<p>This code is available in <a href="http://blog.higher-order.net/2010/06/11/clj-ds-clojures-persistent-data-structures-for-java/">my clj-ds project</a>.</p>
<h2>Parallelize with Fork/Join</h2>
<p>The pvectormap function uses <a href="http://gee.cs.oswego.edu/dl/concurrency-interest/">Fork/Join</a> to parallelize the mapping. I&#8217;m not sure about the granularity of the tasks, but I decided that processing a size 32 array was too small a task, and went with processing 32 size 32 arrays instead.</p>
<p>Starting at the root array of nodes, the code simply forks 32 tasks which recursively process each child of the root. This forking continues recursively until we hit the second lowest level in the tree &#8212; this is processed sequentially using the mapNode function from above. This is implemented as a RecursiveTask in the Fork/Join framework. </p>
<pre><tt><code>static final class PMapTask extends RecursiveTask&lt;Node&gt; {

	private IFn f;
	private int shift;
	private Node node;

	public PMapTask(IFn f, int shift, Node node) {
		this.f = f;
		this.shift = shift;
		this.node = node;
	}

	public Node compute() {
		if (node == null) {
			return null;
		}
		if (this.shift &lt;= 5) {
			return mapNode(f,node,shift);
		}

		PMapTask[] tasks = new PMapTask[node.array.length];
		shift -= 5;
		for (int i=0;i&lt;tasks.length;i++) {
			tasks[i] = new PMapTask(f,shift,(Node) node.array[i]);
		}
		invokeAll(tasks);
		Node[] nodes = new Node[node.array.length];
		try {
			for (int i=0;i&lt;tasks.length;i++) {
				nodes[i] = tasks[i].get();
			}
			return new Node(null,nodes);
		} catch (InterruptedException e) {
			Thread.currentThread().interrupt();
			throw new RuntimeException(e);
		} catch (ExecutionException e) {
			throw new RuntimeException(e);
		}
	}
}
</code></tt></pre>
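<p>For readers who want to try the pattern without clj-ds, the following is a rough, self-contained sketch of the same idea (fork near the root, map sequentially near the leaves) over plain nested <tt>Object[]</tt> arrays. It assumes Java 8 or later, where Fork/Join lives in <tt>java.util.concurrent</tt>. The names <tt>ParallelTreeMap</tt>, <tt>MapTask</tt>, <tt>mapSequential</tt> and <tt>mapParallel</tt> are stand-ins invented for this example, not the clj-ds classes.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.function.UnaryOperator;

final class ParallelTreeMap {
    // Sequentially map f over a tree of nested arrays; level-0 arrays hold the values.
    static Object[] mapSequential(UnaryOperator<Object> f, Object[] node, int level) {
        Object[] out = new Object[node.length];
        for (int i = 0; i < node.length; i++) {
            if (node[i] == null) continue; // unused slot stays null
            out[i] = (level == 0)
                    ? f.apply(node[i])
                    : mapSequential(f, (Object[]) node[i], level - 1);
        }
        return out;
    }

    static final class MapTask extends RecursiveTask<Object[]> {
        private final UnaryOperator<Object> f;
        private final Object[] node;
        private final int level;

        MapTask(UnaryOperator<Object> f, Object[] node, int level) {
            this.f = f;
            this.node = node;
            this.level = level;
        }

        @Override
        protected Object[] compute() {
            // Near the bottom the work is too small to be worth forking.
            if (level <= 1) {
                return mapSequential(f, node, level);
            }
            MapTask[] tasks = new MapTask[node.length];
            List<MapTask> forked = new ArrayList<>();
            for (int i = 0; i < node.length; i++) {
                if (node[i] != null) {
                    tasks[i] = new MapTask(f, (Object[]) node[i], level - 1);
                    forked.add(tasks[i]);
                }
            }
            invokeAll(forked); // runs all subtasks and waits for them
            Object[] out = new Object[node.length];
            for (int i = 0; i < node.length; i++) {
                if (tasks[i] != null) out[i] = tasks[i].join();
            }
            return out;
        }
    }

    static Object[] mapParallel(UnaryOperator<Object> f, Object[] root, int level) {
        return ForkJoinPool.commonPool().invoke(new MapTask(f, root, level));
    }
}
```

<p>Using <tt>join()</tt> instead of <tt>get()</tt> avoids the checked exceptions, which is a design choice of this sketch rather than something the original code does.</p>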
<p>On my dual-core system with a non-trivial function f that actually does some work, using pvectormap is about twice as fast as vectormap: <a href="http://github.com/krukow/clj-ds/blob/master/test/com/trifork/clj_ds/test/PersistentVectorTest.java#L90">see PersistentVectorTest of clj-ds</a>.</p>
<p>Next stop: optimize and add vectormap and pvectormap to Clojure core <img src='http://blog.higher-order.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.higher-order.net/2010/10/14/vectormap-and-pvectormap/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Assoc and Clojure&#8217;s PersistentHashMap: part ii</title>
		<link>http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashmap-part-ii/</link>
		<comments>http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashmap-part-ii/#comments</comments>
		<pubDate>Mon, 16 Aug 2010 08:15:40 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Clojure]]></category>
		<category><![CDATA[persistent data structures]]></category>

		<guid isPermaLink="false">http://blog.higher-order.net/?p=609</guid>
		<description><![CDATA[Some time ago I wrote introductory posts that gave high-level overviews of how Clojure&#8217;s PersistentVector and PersistentHashMap work. In the PersistentHashMap post I promised that &#8220;In part 2 we look at how assoc works…&#8221; &#8211; it seems I never got &#8230; <a href="http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashmap-part-ii/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Some time ago I wrote introductory posts that gave high-level overviews of how Clojure&#8217;s PersistentVector and PersistentHashMap work. In the <a href="http://blog.higher-order.net/2009/09/08/understanding-clojures-persistenthashmap-deftwice/">PersistentHashMap post</a> I promised that &#8220;In part 2 we look at how assoc works…&#8221; &#8211; it seems I never got around to that!</p>
<p>A lot of interesting things have happened with data structures in JVM-land since then: <a href="http://github.com/krestenkrab/erjang">Erjang</a> uses Clojure&#8217;s data structures, it looks like Scala <a href="https://lampsvn.epfl.ch/trac/scala/ticket/3724">is porting PersistentVector</a>, <a href="http://olabini.com/blog/2010/07/preannouncing-seph/">upcoming Seph</a> is using them too. My <a href="http://blog.higher-order.net/2010/06/11/clj-ds-clojures-persistent-data-structures-for-java/">clj-ds project</a> should help in this regard: I&#8217;ve extracted the data structures of Clojure from its compiler for use with JVM-based languages (providing some extra stuff like reverse and &#8220;positioned&#8221; iterators). There are already people interested in using this in Java land.</p>
<p>Some people have asked for the &#8220;part ii&#8221; post, and my son just fell asleep, so &#8230; <img src='http://blog.higher-order.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Most of what is described in the <a href="http://blog.higher-order.net/2009/09/08/understanding-clojures-persistenthashmap-deftwice/">PersistentHashMap post</a> is still true, however there have been optimizations and simplifications which I explain here. Note: before reading this post, you should read the previous post on PersistentHashMap.</p>
<p><strong>First, some changes</strong>. Previously there were five implementations of the <tt>INode</tt> interface; this has changed and there are now only three implementations: <tt>ArrayNode</tt>, <tt>BitmapIndexedNode</tt> and <tt>HashCollisionNode</tt>. This means that: <tt>EmptyNode, LeafNode, FullNode</tt> are out, with <tt>ArrayNode</tt> replacing <tt>FullNode</tt>. An array node is an array where the entries are null or instances of INode, i.e., it stores other nodes but not any key-value pairs. An empty persistent hash map is simply a persistent hash map where the root node is null &#8212; this removes the need for EmptyNode. Finally, leaf nodes used to store the actual entries stored in the map. Leaf nodes are out, and instead BitmapIndexedNodes directly embed the map entries in their arrays. </p>
<p>The idea is the following: if, in the old implementation, a BitmapIndexedNode would store a leaf node at an index, then instead, it now embeds the key and value directly in its array. This array used to be of type <tt>INode[]</tt>, storing only nodes, but is now a mixed object array in which each slot holds one of: a map key, a map value, null or an INode object. </p>
<p><img src="http://blog.higher-order.net/files/clj/persistenthashmap1.png" alt="Old persistent hashmap structure" /></p>
<p><em>The above drawing corresponds to the old structure.</em> In the new structure all the white circles (the leaf nodes) are embedded directly in their parents.</p>
<p><strong><em>There is an invariant: </em> </strong></p>
<p>for all the indices that are in the bitmap, the even indices store keys or null; the odd indices store values or INode objects. Key-value pairs are laid out in sequence, so that if index <tt>2*i</tt> is a key then index <tt>2*i + 1</tt> is a value. If an index <tt>2*i</tt> is null then <tt>2*i+1</tt> must be a non-null INode object.</p>
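<p>A lookup that relies on this invariant might look roughly like the sketch below. <tt>PairArrayLayout</tt>, <tt>SubNode</tt> and <tt>lookup</tt> are hypothetical names for illustration; <tt>SubNode</tt> merely stands in for INode and is not the real interface.</p>

```java
final class PairArrayLayout {
    // Hypothetical stand-in for INode; only lookup is modeled here.
    interface SubNode {
        Object find(Object key);
    }

    // idx is the logical slot (derived from the bitmap); returns the value or null.
    static Object lookup(Object[] array, int idx, Object key) {
        Object keyOrNull = array[2 * idx];
        Object valOrNode = array[2 * idx + 1];
        if (keyOrNull == null) {
            // Null key: by the invariant the odd slot holds a non-null sub-node.
            return ((SubNode) valOrNode).find(key);
        }
        // Otherwise the even slot is a key and the odd slot is its value.
        return keyOrNull.equals(key) ? valOrNode : null;
    }
}
```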
<p>Note: if you don&#8217;t remember how bitpos, bitmap and index work see <a href="http://blog.higher-order.net/2009/09/08/understanding-clojures-persistenthashmap-deftwice/">the part i post</a>.</p>
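<p>As a quick refresher, the three helpers can be sketched as follows, assuming 5 bits of the hash are consumed per level as described in the part i post. The class name <tt>BitmapHelpers</tt> is invented here, and <tt>bitmap</tt> is passed as a parameter only to keep the sketch static.</p>

```java
final class BitmapHelpers {
    // The 5 bits of the hash used at this level: a slot number in 0..31.
    static int mask(int hash, int shift) {
        return (hash >>> shift) & 0x1f;
    }

    // A one-bit mask identifying that slot in a 32-bit bitmap.
    static int bitpos(int hash, int shift) {
        return 1 << mask(hash, shift);
    }

    // Position in the compact array: the number of occupied slots below this bit.
    static int index(int bitmap, int bit) {
        return Integer.bitCount(bitmap & (bit - 1));
    }
}
```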
<p>The strategy used in BitmapIndexedNode is that it can store up to 16 entries in the bitmap: if the size grows above 16, it is converted to an <tt>ArrayNode</tt> (the replacement for the old FullNode). </p>
<p><strong>Assoc</strong>. The assoc method creates a persistent hash map which is like the current one, except that it additionally stores another map entry. As before assoc works using path copying: all that is changed in the new map is the path from the root node to the newly added map entry.</p>
<p><em>Beware</em>: the following drawing looks confusing: take the time and read the explanation. It is showing two persistent hash maps, where one is obtained by assoc&#8217;ing to the first. Again this is a modified version of one of Rich&#8217;s slides. </p>
<p><img style="position: relative; left:-20px;" src="http://blog.higher-order.net/files/clj/persistenthashmap-pathcopy.png" alt="Path copying in the old structure" /></p>
<p>The first map is rooted on the left (where the left-most box points to). The second is rooted where the right-most box points to. The nodes are grouped in three, each indicated by a colored circle that surrounds it. The new node is in the right-most, lower corner.</p>
<p>Purple is the path in the old tree to where the new map entry would be;<br />
green is the new nodes that are created in the new hash map.<br />
The dashed lines indicate that the nodes in the new tree share children with nodes in the old tree. The red circles show all the nodes that are shared in the tree: note that this is most of the nodes. </p>
<p>So how much work needs to be done to create the new tree? Suppose we are storing key k with value v. Using recursion, descend down the original tree as if looking for the key k. The key k isn&#8217;t found. In the worst case this takes time <tt>O(log<sub>32</sub> n)</tt>; in practice we could stop at any level in the tree, so 2-3 steps would be common (remember from <a href="http://blog.higher-order.net/2009/09/08/understanding-clojures-persistenthashmap-deftwice">the old post</a> that the work done at each level is constant and fast). At the bottom we create a new BitmapIndexedNode and store the new map entry in it. If the bottom node in the old tree was a BitmapIndexedNode with fewer than 16 elements, the new BitmapIndexedNode is a copy of the old one, except that the new map entry is added. This step takes constant time since the array to copy at the bitmap indexed node has at most 32 elements (because we store at most 16 map entries as: key, value, key, value, &#8230;). If the bottom node was an <tt>ArrayNode</tt> we simply copy the array node, and make the new BitmapIndexedNode a size-one child of this array node (still constant time). </p>
<p>On the drawing above, we have now recursively descended the purple path in the left/old tree, and we have created the new node which is a bitmap indexed node. What remains is to establish the path to this new node, in the new tree: this path is almost identical to the &#8220;purple&#8221; path in the old tree: the only difference is that we have created a new bitmap indexed node. The parent of the new node must of course reference it, so that is copied and modified to get a reference to the new node. This means that we must also copy the grandparent of the new node, modifying it to reference the parent. And so on&#8230; This copying takes place on our way &#8220;up&#8221; through the recursion, i.e., after the recursive calls complete at each node level.</p>
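<p>The copy-on-the-way-up idea can be shown on a much simpler structure. The sketch below path-copies through a tree of plain nested <tt>Object[]</tt> arrays; it is a simplification that assumes every node along the path already exists, and <tt>PathCopy</tt> and its <tt>assoc</tt> signature are made up for this example rather than taken from PersistentHashMap.</p>

```java
final class PathCopy {
    // Returns a new root in which the slot addressed by path holds value.
    // Only the nodes along the path are copied; all other children are shared.
    static Object[] assoc(Object[] node, int[] path, int depth, Object value) {
        Object[] copy = node.clone(); // shallow copy: children are shared
        int slot = path[depth];
        if (depth == path.length - 1) {
            copy[slot] = value; // bottom of the path: store the new entry
        } else {
            // Recurse first, then point the copied parent at the new child.
            copy[slot] = assoc((Object[]) node[slot], path, depth + 1, value);
        }
        return copy;
    }
}
```

<p>After an update the old root is untouched, and any subtree that is not on the path is the very same object in both trees.</p>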
<p>Let&#8217;s decompose the code. We only look at assoc for BitmapIndexedNode as it is the most interesting.</p>
<pre><tt><span class="keyword">public</span><span class="normal"> INode </span><span class="function">assoc</span><span class="symbol">(</span><span class="type">int</span><span class="normal"> shift</span><span class="symbol">,</span><span class="normal"> </span><span class="type">int</span><span class="normal"> hash</span><span class="symbol">,</span><span class="normal"> Object key</span><span class="symbol">,</span><span class="normal"> Object val</span><span class="symbol">,</span><span class="normal"> Box addedLeaf</span><span class="symbol">)</span><span class="cbracket">{</span>
<span class="normal">		</span><span class="type">int</span><span class="normal"> bit </span><span class="symbol">=</span><span class="normal"> </span><span class="function">bitpos</span><span class="symbol">(</span><span class="normal">hash</span><span class="symbol">,</span><span class="normal"> shift</span><span class="symbol">);</span>
<span class="normal">		</span><span class="type">int</span><span class="normal"> idx </span><span class="symbol">=</span><span class="normal"> </span><span class="function">index</span><span class="symbol">(</span><span class="normal">bit</span><span class="symbol">);</span>
<span class="normal">		</span><span class="keyword">if</span><span class="symbol">((</span><span class="normal">bitmap </span><span class="symbol">&amp;</span><span class="normal"> bit</span><span class="symbol">)</span><span class="normal"> </span><span class="symbol">!=</span><span class="normal"> </span><span class="number">0</span><span class="symbol">)</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			</span><span class="comment">//..</span>
<span class="normal">		</span><span class="cbracket">}</span><span class="normal"> </span><span class="keyword">else</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			</span><span class="comment">//</span>
<span class="normal">		</span><span class="cbracket">}</span>
<span class="cbracket">}</span></tt></pre>
<p>This should be familiar from the previous post: we check if the index of the hash (at the current level) is in the bitmap. The else-branch is the case where the index is not in the bitmap, corresponding to reaching the bottom of the path: here we must create a new node and path-copy as described above. In the &#8220;if&#8221;-branch there are two cases: if the index references an INode, we simply call assoc recursively on it; if it references a key-value pair, this is a &#8220;replace&#8221;, and we do path copying here too. Again we look only at the else-branch as it is the most interesting.</p>
<pre><tt><span class="keyword">public</span><span class="normal"> INode </span><span class="function">assoc</span><span class="symbol">(</span><span class="type">int</span><span class="normal"> shift</span><span class="symbol">,</span><span class="normal"> </span><span class="type">int</span><span class="normal"> hash</span><span class="symbol">,</span><span class="normal"> Object key</span><span class="symbol">,</span><span class="normal"> Object val</span><span class="symbol">,</span><span class="normal"> Box addedLeaf</span><span class="symbol">)</span><span class="cbracket">{</span>
<span class="normal">		</span><span class="type">int</span><span class="normal"> bit </span><span class="symbol">=</span><span class="normal"> </span><span class="function">bitpos</span><span class="symbol">(</span><span class="normal">hash</span><span class="symbol">,</span><span class="normal"> shift</span><span class="symbol">);</span>
<span class="normal">		</span><span class="type">int</span><span class="normal"> idx </span><span class="symbol">=</span><span class="normal"> </span><span class="function">index</span><span class="symbol">(</span><span class="normal">bit</span><span class="symbol">);</span>
<span class="normal">		</span><span class="keyword">if</span><span class="symbol">((</span><span class="normal">bitmap </span><span class="symbol">&amp;</span><span class="normal"> bit</span><span class="symbol">)</span><span class="normal"> </span><span class="symbol">!=</span><span class="normal"> </span><span class="number">0</span><span class="symbol">)</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			</span><span class="comment">//..</span>
<span class="normal">		</span><span class="cbracket">}</span><span class="normal"> </span><span class="keyword">else</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			</span><span class="type">int</span><span class="normal"> n </span><span class="symbol">=</span><span class="normal"> Integer</span><span class="symbol">.</span><span class="function">bitCount</span><span class="symbol">(</span><span class="normal">bitmap</span><span class="symbol">);</span>
<span class="normal">			</span><span class="keyword">if</span><span class="symbol">(</span><span class="normal">n </span><span class="symbol">&gt;=</span><span class="normal"> </span><span class="number">16</span><span class="symbol">)</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">				</span><span class="comment">//convert to ArrayNode</span>
<span class="normal">			</span><span class="cbracket">}</span><span class="normal"> </span><span class="keyword">else</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">				Object</span><span class="symbol">[]</span><span class="normal"> newArray </span><span class="symbol">=</span><span class="normal"> </span><span class="keyword">new</span><span class="normal"> Object</span><span class="symbol">[</span><span class="number">2</span><span class="symbol">*(</span><span class="normal">n</span><span class="symbol">+</span><span class="number">1</span><span class="symbol">)];</span>
<span class="normal">				System</span><span class="symbol">.</span><span class="function">arraycopy</span><span class="symbol">(</span><span class="normal">array</span><span class="symbol">,</span><span class="normal"> </span><span class="number">0</span><span class="symbol">,</span><span class="normal"> newArray</span><span class="symbol">,</span><span class="normal"> </span><span class="number">0</span><span class="symbol">,</span><span class="normal"> </span><span class="number">2</span><span class="symbol">*</span><span class="normal">idx</span><span class="symbol">);</span>
<span class="normal">				newArray</span><span class="symbol">[</span><span class="number">2</span><span class="symbol">*</span><span class="normal">idx</span><span class="symbol">]</span><span class="normal"> </span><span class="symbol">=</span><span class="normal"> key</span><span class="symbol">;</span>
<span class="normal">				addedLeaf</span><span class="symbol">.</span><span class="normal">val </span><span class="symbol">=</span><span class="normal"> addedLeaf</span><span class="symbol">;</span><span class="normal"> </span>
<span class="normal">				newArray</span><span class="symbol">[</span><span class="number">2</span><span class="symbol">*</span><span class="normal">idx</span><span class="symbol">+</span><span class="number">1</span><span class="symbol">]</span><span class="normal"> </span><span class="symbol">=</span><span class="normal"> val</span><span class="symbol">;</span>
<span class="normal">				System</span><span class="symbol">.</span><span class="function">arraycopy</span><span class="symbol">(</span><span class="normal">array</span><span class="symbol">,</span><span class="normal"> </span><span class="number">2</span><span class="symbol">*</span><span class="normal">idx</span><span class="symbol">,</span><span class="normal"> newArray</span><span class="symbol">,</span><span class="normal"> </span><span class="number">2</span><span class="symbol">*(</span><span class="normal">idx</span><span class="symbol">+</span><span class="number">1</span><span class="symbol">),</span><span class="normal"> </span><span class="number">2</span><span class="symbol">*(</span><span class="normal">n</span><span class="symbol">-</span><span class="normal">idx</span><span class="symbol">));</span>
<span class="normal">				</span><span class="keyword">return</span><span class="normal"> </span><span class="keyword">new</span><span class="normal"> </span><span class="function">BitmapIndexedNode</span><span class="symbol">(</span><span class="keyword">null</span><span class="symbol">,</span><span class="normal"> bitmap </span><span class="symbol">|</span><span class="normal"> bit</span><span class="symbol">,</span><span class="normal"> newArray</span><span class="symbol">);</span>
<span class="normal">			</span><span class="cbracket">}</span>
<span class="normal">		</span><span class="cbracket">}</span>
<span class="cbracket">}</span></tt></pre>
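<p>The two <tt>arraycopy</tt> calls in the inner else-branch can be tried in isolation. The sketch below lifts them into a standalone helper; <tt>PairInsert</tt> and <tt>insertPair</tt> are names invented for this example, and <tt>n</tt> is the number of pairs currently stored.</p>

```java
final class PairInsert {
    // Returns a copy of array (n key-value pairs) with a new pair inserted at pair-slot idx.
    static Object[] insertPair(Object[] array, int n, int idx, Object key, Object val) {
        Object[] newArray = new Object[2 * (n + 1)];
        // Pairs before the insertion point keep their positions.
        System.arraycopy(array, 0, newArray, 0, 2 * idx);
        newArray[2 * idx] = key;
        newArray[2 * idx + 1] = val;
        // Pairs at or after the insertion point shift right by one pair.
        System.arraycopy(array, 2 * idx, newArray, 2 * (idx + 1), 2 * (n - idx));
        return newArray;
    }
}
```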
<p>Remember (from the previous post) that <tt>Integer.bitCount(bitmap)</tt> counts the number of children of this node. If we are storing fewer than 16 elements, we have room for one more: to create the new bitmap indexed node, simply copy the object array and modify it to store the new key-value pair (using the invariant mentioned above). (Ignore the &#8220;box&#8221; part; it is used to communicate to higher levels in the recursion what happened.) Finally, if we have 16 elements stored already, convert to an ArrayNode:</p>
<pre><tt><span class="keyword">public</span><span class="normal"> INode </span><span class="function">assoc</span><span class="symbol">(</span><span class="type">int</span><span class="normal"> shift</span><span class="symbol">,</span><span class="normal"> </span><span class="type">int</span><span class="normal"> hash</span><span class="symbol">,</span><span class="normal"> Object key</span><span class="symbol">,</span><span class="normal"> Object val</span><span class="symbol">,</span><span class="normal"> Box addedLeaf</span><span class="symbol">)</span><span class="cbracket">{</span>
<span class="normal">		</span><span class="type">int</span><span class="normal"> bit </span><span class="symbol">=</span><span class="normal"> </span><span class="function">bitpos</span><span class="symbol">(</span><span class="normal">hash</span><span class="symbol">,</span><span class="normal"> shift</span><span class="symbol">);</span>
<span class="normal">		</span><span class="type">int</span><span class="normal"> idx </span><span class="symbol">=</span><span class="normal"> </span><span class="function">index</span><span class="symbol">(</span><span class="normal">bit</span><span class="symbol">);</span>
<span class="normal">		</span><span class="keyword">if</span><span class="symbol">((</span><span class="normal">bitmap </span><span class="symbol">&amp;</span><span class="normal"> bit</span><span class="symbol">)</span><span class="normal"> </span><span class="symbol">!=</span><span class="normal"> </span><span class="number">0</span><span class="symbol">)</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			</span><span class="comment">//..</span>
<span class="normal">		</span><span class="cbracket">}</span><span class="normal"> </span><span class="keyword">else</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			</span><span class="type">int</span><span class="normal"> n </span><span class="symbol">=</span><span class="normal"> Integer</span><span class="symbol">.</span><span class="function">bitCount</span><span class="symbol">(</span><span class="normal">bitmap</span><span class="symbol">);</span>
<span class="normal">			</span><span class="keyword">if</span><span class="symbol">(</span><span class="normal">n </span><span class="symbol">&gt;=</span><span class="normal"> </span><span class="number">16</span><span class="symbol">)</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">				INode</span><span class="symbol">[]</span><span class="normal"> nodes </span><span class="symbol">=</span><span class="normal"> </span><span class="keyword">new</span><span class="normal"> INode</span><span class="symbol">[</span><span class="number">32</span><span class="symbol">];</span>
<span class="normal">				</span><span class="type">int</span><span class="normal"> jdx </span><span class="symbol">=</span><span class="normal"> </span><span class="function">mask</span><span class="symbol">(</span><span class="normal">hash</span><span class="symbol">,</span><span class="normal"> shift</span><span class="symbol">);</span>
<span class="normal">				nodes</span><span class="symbol">[</span><span class="normal">jdx</span><span class="symbol">]</span><span class="normal"> </span><span class="symbol">=</span><span class="normal"> EMPTY</span><span class="symbol">.</span><span class="function">assoc</span><span class="symbol">(</span><span class="normal">shift </span><span class="symbol">+</span><span class="normal"> </span><span class="number">5</span><span class="symbol">,</span><span class="normal"> hash</span><span class="symbol">,</span><span class="normal"> key</span><span class="symbol">,</span><span class="normal"> val</span><span class="symbol">,</span><span class="normal"> addedLeaf</span><span class="symbol">);</span><span class="normal">  </span>
<span class="normal">				</span><span class="type">int</span><span class="normal"> j </span><span class="symbol">=</span><span class="normal"> </span><span class="number">0</span><span class="symbol">;</span>
<span class="normal">				</span><span class="keyword">for</span><span class="symbol">(</span><span class="type">int</span><span class="normal"> i </span><span class="symbol">=</span><span class="normal"> </span><span class="number">0</span><span class="symbol">;</span><span class="normal"> i </span><span class="symbol">&lt;</span><span class="normal"> </span><span class="number">32</span><span class="symbol">;</span><span class="normal"> i</span><span class="symbol">++)</span>
<span class="normal">					</span><span class="keyword">if</span><span class="symbol">(((</span><span class="normal">bitmap </span><span class="symbol">&gt;&gt;&gt;</span><span class="normal"> i</span><span class="symbol">)</span><span class="normal"> </span><span class="symbol">&amp;</span><span class="normal"> </span><span class="number">1</span><span class="symbol">)</span><span class="normal"> </span><span class="symbol">!=</span><span class="normal"> </span><span class="number">0</span><span class="symbol">)</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">						</span><span class="keyword">if</span><span class="normal"> </span><span class="symbol">(</span><span class="normal">array</span><span class="symbol">[</span><span class="normal">j</span><span class="symbol">]</span><span class="normal"> </span><span class="symbol">==</span><span class="normal"> </span><span class="keyword">null</span><span class="symbol">)</span>
<span class="normal">							nodes</span><span class="symbol">[</span><span class="normal">i</span><span class="symbol">]</span><span class="normal"> </span><span class="symbol">=</span><span class="normal"> </span><span class="symbol">(</span><span class="normal">INode</span><span class="symbol">)</span><span class="normal"> array</span><span class="symbol">[</span><span class="normal">j</span><span class="symbol">+</span><span class="number">1</span><span class="symbol">];</span>
<span class="normal">						</span><span class="keyword">else</span>
<span class="normal">							nodes</span><span class="symbol">[</span><span class="normal">i</span><span class="symbol">]</span><span class="normal"> </span><span class="symbol">=</span><span class="normal"> EMPTY</span><span class="symbol">.</span><span class="function">assoc</span><span class="symbol">(</span><span class="normal">shift </span><span class="symbol">+</span><span class="normal"> </span><span class="number">5</span><span class="symbol">,</span><span class="normal">  Util</span><span class="symbol">.</span><span class="function">hash</span><span class="symbol">(</span><span class="normal">array</span><span class="symbol">[</span><span class="normal">j</span><span class="symbol">]),</span><span class="normal"> array</span><span class="symbol">[</span><span class="normal">j</span><span class="symbol">],</span><span class="normal"> array</span><span class="symbol">[</span><span class="normal">j</span><span class="symbol">+</span><span class="number">1</span><span class="symbol">],</span><span class="normal"> addedLeaf</span><span class="symbol">);</span>
<span class="normal">						j </span><span class="symbol">+=</span><span class="normal"> </span><span class="number">2</span><span class="symbol">;</span>
<span class="normal">					</span><span class="cbracket">}</span>
<span class="normal">				</span><span class="keyword">return</span><span class="normal"> </span><span class="keyword">new</span><span class="normal"> </span><span class="function">ArrayNode</span><span class="symbol">(</span><span class="keyword">null</span><span class="symbol">,</span><span class="normal"> n </span><span class="symbol">+</span><span class="normal"> </span><span class="number">1</span><span class="symbol">,</span><span class="normal"> nodes</span><span class="symbol">);</span>
<span class="normal">			</span><span class="cbracket">}</span><span class="normal"> </span><span class="keyword">else</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">				</span><span class="comment">//we covered this...</span>
<span class="normal">			</span><span class="cbracket">}</span>
<span class="normal">		</span><span class="cbracket">}</span>
<span class="cbracket">}</span></tt></pre>
<p>The first block up to the &#8216;for&#8217; loop prepares the array of INodes that will be the children array of the new ArrayNode. We create a new BitmapIndexedNode by calling assoc on an empty persistent hash map (which is an easy way of creating a one-entry BitmapIndexedNode). Note that the indexing strategy for ArrayNodes is different from that of BitmapIndexedNode: we don&#8217;t need the bitmap <img src='http://blog.higher-order.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> We simply use the 5-bit block of the hash corresponding to the level (again see the older post).</p>
<p>Finally, the for-loop copies all the other nodes that are stored in this bitmap indexed node into the new array node. This entails mapping between the bitmap index scheme and the &#8220;5-bit hash-block&#8221; scheme. For each possible index in the bitmap node, we only do work if the index is present in the bitmap: <tt>if(((bitmap >>> i) &#038; 1) != 0)</tt>. The variable <tt>j</tt> runs through the even indexes of <tt>array</tt>, and according to the invariant, a null at <tt>array[j]</tt> means that <tt>array[j+1]</tt> is an INode, while a non-null key at <tt>array[j]</tt> means that <tt>array[j+1]</tt> is the value of a map entry.</p>
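<p>To make the two indexing schemes concrete, here is a minimal standalone sketch. The helper names <tt>mask</tt>, <tt>bitpos</tt> and <tt>entryIndex</tt> are chosen for illustration and the class is not part of Clojure; it only assumes that a level consumes 5 bits of the hash, as described above.</p>

```java
public class NodeIndexing {
    // ArrayNode scheme: the 5-bit block of the hash at this level is the slot itself (0..31).
    public static int mask(int hash, int shift) {
        return (hash >>> shift) & 0x1f;
    }

    // Bitmap scheme, step 1: the one-bit flag that stands for that 5-bit block.
    public static int bitpos(int hash, int shift) {
        return 1 << mask(hash, shift);
    }

    // Bitmap scheme, step 2: the entry number is the count of set bits below the flag.
    public static int entryIndex(int bitmap, int bit) {
        return Integer.bitCount(bitmap & (bit - 1));
    }

    public static void main(String[] args) {
        int bitmap = 0b10010100; // blocks 2, 4 and 7 are present
        int hash = 4;            // at shift 0 this hash falls into block 4
        int bit = bitpos(hash, 0);
        System.out.println("ArrayNode slot: " + mask(hash, 0));
        System.out.println("key position in the bitmap node's array: " + 2 * entryIndex(bitmap, bit));
    }
}
```

<p>In the bitmap scheme the key sits at twice the entry number and its value (or child node) right after it, which is why <tt>j</tt> advances by 2 in the loop above.</p>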
<p>I am wondering whether an optimization is possible here. We are looping through all 32 possible indices of the new ArrayNode, but we know that we only have to do something on the indices that correspond to a set bit in the bitmap, and there can be only 15 of those&#8230; Would it be possible to iterate over only those using some strategy? If I figure something out, I&#8217;ll let you know <img src='http://blog.higher-order.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
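<p>One candidate strategy, offered as an untested idea rather than something taken from the Clojure source: repeatedly read the position of the lowest set bit and then clear it, so the loop runs once per set bit instead of 32 times. The class and method names below are made up for the sketch, and whether this is actually faster than the plain 32-step loop would have to be measured.</p>

```java
import java.util.Arrays;

public class SetBitIteration {
    // Returns the positions of the set bits in 'bitmap', lowest first.
    public static int[] setBitIndices(int bitmap) {
        int[] indices = new int[Integer.bitCount(bitmap)];
        int n = 0;
        // bits & (bits - 1) clears the lowest set bit, so the loop ends when none are left
        for (int bits = bitmap; bits != 0; bits &= bits - 1) {
            indices[n++] = Integer.numberOfTrailingZeros(bits);
        }
        return indices;
    }

    public static void main(String[] args) {
        // blocks 2, 4 and 7 are present
        System.out.println(Arrays.toString(setBitIndices(0b10010100))); // prints [2, 4, 7]
    }
}
```

<p>Applied to the copying loop, each returned position would play the role of <tt>i</tt>, while <tt>j</tt> would still advance by 2 per visited bit.</p>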
]]></content:encoded>
			<wfw:commentRss>http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashmap-part-ii/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>clj-ds: Clojure&#8217;s persistent data structures for Java</title>
		<link>http://blog.higher-order.net/2010/06/11/clj-ds-clojures-persistent-data-structures-for-java/</link>
		<comments>http://blog.higher-order.net/2010/06/11/clj-ds-clojures-persistent-data-structures-for-java/#comments</comments>
		<pubDate>Fri, 11 Jun 2010 13:45:47 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Clojure]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[persistent data structures]]></category>

		<guid isPermaLink="false">http://blog.higher-order.net/?p=575</guid>
		<description><![CDATA[One of the appealing features of Clojure is the pervasive use of (efficient!) persistent data structures. (In previous posts I&#8217;ve shed some light on how PersistentHashMap and PersistentVector are implemented, although some of that information is slightly dated now). There &#8230; <a href="http://blog.higher-order.net/2010/06/11/clj-ds-clojures-persistent-data-structures-for-java/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>One of the appealing features of Clojure is the pervasive use of (efficient!) persistent data structures. (In previous posts I&#8217;ve shed some light on how <a href="http://blog.higher-order.net/2009/09/08/understanding-clojures-persistenthashmap-deftwice/">PersistentHashMap</a> and <a href="http://blog.higher-order.net/2009/02/01/understanding-clojures-persistentvector-implementation/">PersistentVector</a> are implemented, although some of that information is slightly dated now).</p>
<p>There are many advantages to programming with persistent data structures (which implies immutability) but that isn&#8217;t the topic of this post&#8230; Currently the Clojure data structures are implemented in Java, so in principle they should be usable also outside of Clojure, say from Java.  However, in practice it is inconvenient (see below). </p>
<p>I&#8217;ve created the project clj-ds to make Clojure&#8217;s data structures available in a more practical form to JVM languages other than Clojure. The <a href="http://github.com/krukow/clj-ds/raw/master/README">README</a> file from the <a href="http://github.com/krukow/clj-ds">clj-ds GitHub project</a> explains the motivation:</p>
<p><strong>Advantages of clj-ds when constrained to working with Java</strong> (as opposed to just including clojure.jar)</p>
<p>* Currently the Clojure data structures are implemented in Java. In the future,<br />
all of Clojure will be implemented in Clojure itself (known as &#8220;Clojure-in-Clojure&#8221;).<br />
This has many advantages for Clojure, but when it happens the data structures will<br />
probably be even more intertwined with the rest of the language,<br />
and may be even more inconvenient to use in a Java context.</p>
<p>The clj-ds project will maintain Java versions of the code, and where possible attempt<br />
to &#8220;port&#8221; improvements made in the Clojure versions back into clj-ds, thus keeping<br />
the Java data structures maintained.</p>
<p>* In the current Clojure version, calling certain methods on PersistentHashMap requires<br />
loading the entire Clojure runtime, including the bootstrap process. This takes about one second.<br />
This means that the first time one of these methods is called, a Java user will experience a<br />
slight delay (and a memory-usage increase). Further, many of the Clojure runtime<br />
Java classes are not needed when only support for persistent data structures<br />
is wanted (e.g., the compiler).</p>
<p>* The clj-ds library is not dependent on the Clojure runtime nor does it run any<br />
Clojure bootstrap process, e.g., the classes that deal with compilation have been removed.<br />
This results in a smaller library, and the mentioned delay does not occur.</p>
<p>* Clojure is a dynamically typed language. Java is statically typed, and supports<br />
&#8216;generics&#8217; from version 5. A Java user would expect generics support from a Java<br />
data structure library, and the Clojure version doesn&#8217;t have this.<br />
clj-ds will support generics.</p>
<p>* Finally, a slight improvement. Some of the Clojure data structures implement the Java &#8216;iterator&#8217; pattern on top of Clojure&#8217;s &#8216;seq&#8217; abstraction. It is possible to make<br />
slightly more efficient iterators using a tailor-made iterator. clj-ds does this.</p>
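<p>The difference between the two iterator styles can be sketched as follows. This is not code from Clojure or clj-ds: <tt>Seq</tt>, <tt>ArraySeq</tt>, <tt>seqIterator</tt> and <tt>directIterator</tt> are hypothetical stand-ins, assuming only that stepping through a seq creates a new seq object per element while a tailor-made iterator can keep a plain index.</p>

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class IteratorSketch {
    // Hypothetical stand-in for a seq: a first element plus the rest (null when exhausted).
    interface Seq {
        Object first();
        Seq next();
    }

    // A seq over an array; every step to the next element allocates a new ArraySeq.
    static final class ArraySeq implements Seq {
        private final Object[] items;
        private final int i;
        ArraySeq(Object[] items, int i) { this.items = items; this.i = i; }
        public Object first() { return items[i]; }
        public Seq next() { return i + 1 < items.length ? new ArraySeq(items, i + 1) : null; }
    }

    // Iterator layered on top of the seq abstraction.
    public static Iterator<Object> seqIterator(Object[] items) {
        return new Iterator<Object>() {
            private Seq seq = items.length > 0 ? new ArraySeq(items, 0) : null;
            public boolean hasNext() { return seq != null; }
            public Object next() {
                if (seq == null) throw new NoSuchElementException();
                Object value = seq.first();
                seq = seq.next();
                return value;
            }
        };
    }

    // Tailor-made iterator: walks the array with an index, no allocation per step.
    public static Iterator<Object> directIterator(Object[] items) {
        return new Iterator<Object>() {
            private int i = 0;
            public boolean hasNext() { return i < items.length; }
            public Object next() {
                if (i >= items.length) throw new NoSuchElementException();
                return items[i++];
            }
        };
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c"};
        Iterator<Object> viaSeq = seqIterator(data);
        Iterator<Object> direct = directIterator(data);
        while (viaSeq.hasNext()) {
            System.out.println(viaSeq.next() + " " + direct.next());
        }
    }
}
```

<p>Both iterators yield the same elements in the same order; the tailor-made one simply avoids the intermediate objects.</p>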
<p>Code: <a href="http://github.com/krukow/clj-ds">http://github.com/krukow/clj-ds</a> </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.higher-order.net/2010/06/11/clj-ds-clojures-persistent-data-structures-for-java/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Circuit Breaker: a small but real-life example of Clojure protocols and datatype</title>
		<link>http://blog.higher-order.net/2010/05/05/circuitbreaker-clojure-1-2/</link>
		<comments>http://blog.higher-order.net/2010/05/05/circuitbreaker-clojure-1-2/#comments</comments>
		<pubDate>Wed, 05 May 2010 08:59:50 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Clojure]]></category>
		<category><![CDATA[circuit breaker]]></category>
		<category><![CDATA[Michael Nygard]]></category>
		<category><![CDATA[stability pattern]]></category>

		<guid isPermaLink="false">http://blog.higher-order.net/?p=470</guid>
		<description><![CDATA[(Update July 2nd 2010: I&#8217;ve cleaned up the code and git repo. Inlined the protocol function definitions in the state records for native-platform speed. The policy can now specify which exceptions should be considered errors: this is really useful in &#8230; <a href="http://blog.higher-order.net/2010/05/05/circuitbreaker-clojure-1-2/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>(Update July 2nd 2010:<br />
I&#8217;ve cleaned up the code and git repo.<br />
Inlined the protocol function definitions in the state records for native-platform speed.<br />
The policy can now specify which exceptions should be considered errors: this is really useful in real life when you don&#8217;t want to trip the circuit breaker say on security exceptions.<br />
Now builds with lein 1.1.0.<br />
&#8230;]$ lein jar<br />
&#8230;<br />
&#8230;]$ javac -cp lib/clojure-1.2.0-master-20100623.220259-87.jar:circuit-breaker.jar src/C.java<br />
&#8230;]$ java -cp lib/clojure-1.2.0-master-20100623.220259-87.jar:circuit-breaker.jar:src C<br />
)</p>
<p>Michael Nygard&#8217;s <a href="http://www.pragprog.com/titles/mnee/release-it">stability pattern &#8220;Circuit Breaker&#8221;</a> is useful for failing fast when calling integration points that are unstable (which is every integration point I&#8217;ve ever dealt with!). This is done by detecting when integration points fail, and subsequently cutting off access for a period of time. Use the circuit breaker to</p>
<blockquote><p>&#8230; preserve request handling threads in the calling system. Very often, when you make a call to an external integration point that&#8217;s broken, it will tie up a thread in a blocking synchronous call for an indefinite period of time. [<a href="http://www.infoq.com/interviews/Building-Resilient-Systems-Michael-Nygard">Michael Nygard, QCon interview</a>]</p></blockquote>
<p>I&#8217;ve written a <a href="http://github.com/krukow/clojure-circuit-breaker">fast, non-blocking functional implementation of the Circuit Breaker</a> in the 1.2 branch of Clojure which will be released <a href="http://groups.google.com/group/clojure/browse_thread/thread/bdf0e8500ec11aa">shortly</a>. It uses the new Clojure constructs <a href="http://www.clojure.org/protocols"><strong>protocols</strong></a> and <a href="http://clojure.org/datatypes"><strong>datatypes</strong></a> for modeling states, for interop and to obtain platform-speed polymorphic calls. </p>
<p>The implementation exposes a simple Java interface which makes it usable from Java, Scala, JRuby et al. </p>
<p><strong>Why?</strong></p>
<blockquote><p>Design Patterns are a disease, and Clojure is the cure. <img src='http://blog.higher-order.net/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  (the smiley is mine!) <br/>[<a href="http://www.nofluffjuststuff.com/blog/stuart_halloway/2009/10/the_case_for_clojure">http://www.nofluffjuststuff.com/blog/stuart_halloway/2009/10/the_case_for_clojure</a>]</p></blockquote>
<p>I believe this implementation has a couple of advantages compared to these <a href="http://www.jroller.com/kenwdelong/entry/circuit_breaker_in_java">Java</a> and <a href="http://github.com/FaKod/Circuit-Breaker-for-Scala">Scala</a> implementations that use the GoF &#8220;State&#8221; pattern. </p>
<p>First, I find the functional version simpler (see examples below). Second, this version guarantees that only a single call is made to the integration point when the circuit breaker decides to re-test whether it is working again (the other versions seem to allow an unbounded number of calls if several threads concurrently try to access the integration point by calling &#8220;invoke&#8221;). Finally, this version encapsulates the (immutable) state in a single Clojure atom (corresponding to a Java AtomicReference), whereas the GoF implementations use at least three atomics for various counters. Why does that matter? Well, it gives you the ability to obtain a consistent (immutable) snapshot of the state of the circuit breaker at any given time, which can be used for, e.g., logging and analysis &#8211; this isn&#8217;t possible when you have several atomics in play.</p>
<p>These benefits come naturally from following Clojure&#8217;s programming model and concurrency constructs. Let me illustrate the Clojure features that I&#8217;ve found useful for this problem.</p>
<p><strong>Protocols and records</strong>.<br />
I&#8217;m using pure polymorphic functions <tt>on-success</tt>, <tt>on-error</tt>, <tt>on-before-call</tt> as transition functions mapping a state to the next state for an event (a successful call, a failed call, and the moment before a call is initiated). A pure function <tt>proceed</tt> is a predicate on states that decides whether or not the state allows calls to go through to the integration point. </p>
<p>Together these functions form <a href="http://clojure.org/protocols">a Clojure protocol</a> (which is similar to a Java interface, but has additional benefits).</p>
<p><script src="http://gist.github.com/390485.js?file=states-part-1.clj"></script> (Shown with JavaScript; for non-JS user agents, see <a href="http://gist.github.com/390485">http://gist.github.com/390485</a>)</p>
<p>Apart from the definition of the protocol, we define a default implementation of the protocol functions that our states can use. The default transition functions are simply the identity function and proceed defaults to false.</p>
<p>We now define datatypes corresponding to each type of state: closed (calls go through, failures are counted), open (calls fail fast, and a time-stamp of when the integration point failed is stored), initial-half-open (a single call goes through), pending-half-open (waiting for a probing call to return). The datatypes are parameterized by a &#8220;transition policy&#8221; defining how many failures are &#8220;needed&#8221; to transition to the open state, and how long to wait in the open state.</p>
<p><script src="http://gist.github.com/390492.js?file=states-datatypes.clj"></script>(For non-JS User agents, see <a href="http://gist.github.com/390492">http://gist.github.com/390492</a> )</p>
<p>This simply defines the states as datatypes. Note that we use <tt>defrecord</tt>, not <tt>deftype</tt>. This makes our datatypes work like persistent Clojure maps, which is extremely useful &#8211; for example, our states can be destructured (see, e.g., <tt>defrecord</tt> at <a href="http://clojure.org/datatypes">http://clojure.org/datatypes</a>).</p>
<p>We can make our new types participate in our <tt>CircuitBreakerTransitions</tt> protocol</p>
<p><script src="http://gist.github.com/390494.js?file=states-extend-closed.clj"></script> (<a href="http://gist.github.com/390494">http://gist.github.com/390494</a>)</p>
<p>A couple of notes: </p>
<p>- We use <tt>merge</tt> to take the default implementations and &#8220;override&#8221; with the implementations given (this would correspond to an abstract super-class in Java but is more flexible).<br />
- We use destructuring in the function definitions for easy access to the &#8220;ClosedState&#8221; data, e.g., in the body of <tt>(fn [{f :fail-count p :policy, :as s}] ... </tt> the state&#8217;s fail-count is available as <tt>f</tt> and similarly the policy as <tt>p</tt>. The <tt>:as s</tt> clause makes the state itself available as <tt>s</tt>.<br />
- Finally, we can construct new instances of the states since they are simple dynamically compiled classes, e.g., <tt>(ClosedState. p 0)</tt> creates the initial state with a policy <tt>p</tt>.</p>
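<p>For readers coming from Java, the shape of the protocol, its default implementation and a state that overrides parts of it can be approximated as below. This is a rough analogue under stated assumptions, not a translation of the gists: it needs Java 8 default methods, the names <tt>Transitions</tt>, <tt>Policy</tt>, <tt>Closed</tt> and <tt>Open</tt> are invented, the half-open states are left out, and the exact rules (opening once the failure count reaches the policy limit, resetting the count on success) are guesses made for illustration.</p>

```java
public class TransitionsSketch {
    // Counterpart of the protocol. The default methods play the role of the default
    // implementation: transitions are the identity and proceed is false.
    interface Transitions {
        default boolean proceed() { return false; }
        default Transitions onSuccess() { return this; }
        default Transitions onError() { return this; }
        default Transitions onBeforeCall() { return this; }
    }

    // Hypothetical transition policy: failures needed to open, and how long to stay open.
    static final class Policy {
        final int maxFails;
        final long timeoutMillis;
        Policy(int maxFails, long timeoutMillis) {
            this.maxFails = maxFails;
            this.timeoutMillis = timeoutMillis;
        }
    }

    // Closed: calls go through and failures are counted.
    static final class Closed implements Transitions {
        final Policy policy;
        final int failCount;
        Closed(Policy policy, int failCount) {
            this.policy = policy;
            this.failCount = failCount;
        }
        @Override public boolean proceed() { return true; }
        @Override public Transitions onSuccess() {
            return failCount == 0 ? this : new Closed(policy, 0); // assumption: success resets the count
        }
        @Override public Transitions onError() {
            int fails = failCount + 1;
            return fails >= policy.maxFails
                    ? new Open(policy, System.currentTimeMillis())
                    : new Closed(policy, fails);
        }
    }

    // Open: calls fail fast. Only the defaults apply in this simplified sketch,
    // so the timeout-driven move to a half-open state is not modelled.
    static final class Open implements Transitions {
        final Policy policy;
        final long openedAt;
        Open(Policy policy, long openedAt) {
            this.policy = policy;
            this.openedAt = openedAt;
        }
    }

    public static void main(String[] args) {
        Transitions s = new Closed(new Policy(2, 5000), 0);
        s = s.onError();
        System.out.println(s.proceed()); // true: one failure, still closed
        s = s.onError();
        System.out.println(s.proceed()); // false: second failure opened the breaker
    }
}
```

<p>Because every transition returns a new immutable value and touches no shared state, the states can be exercised directly, without threads or mocks.</p>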
<p><strong>Pure functions</strong>. A clear advantage of the functional approach is the ease of testing. Consider this simple test of some states and transition functions.</p>
<p><script src="http://gist.github.com/390505.js?file=states-test.clj"></script> (<a href="http://gist.github.com/390505">http://gist.github.com/390505</a>)</p>
<p>This is the core of the circuit breaker itself:</p>
<p><script src="http://gist.github.com/390529.js?file=circuit-breaker-wrap.clj"></script> (<a href="http://gist.github.com/390529">http://gist.github.com/390529</a>)</p>
<p>A circuit breaker is simply an atomic reference to a state. The function <tt>wrap-with</tt> takes a function <tt>f</tt> to wrap (representing a function that will call an integration point) and a circuit breaker, named <tt>state</tt>. It then returns a &#8220;wrapped&#8221; function which is guarded by the circuit breaker. It uses the &#8220;<tt>transition-by!</tt>&#8221; function, which makes a state transition from the current state. </p>
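<p>The same idea can be sketched in Java with an <tt>AtomicReference</tt>. Everything here is an assumption made for illustration: the <tt>State</tt> class is a deliberately reduced stand-in with only a failure count and a limit, and <tt>wrapWith</tt> and <tt>transitionBy</tt> merely mirror the names used above. In particular the sketch has no half-open states, so it does not reproduce the single-probe guarantee of the real implementation.</p>

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;
import java.util.function.UnaryOperator;

public class WrapSketch {
    // Hypothetical, reduced state: immutable failure count plus the limit from a policy.
    static final class State {
        final int failCount;
        final int maxFails;
        State(int failCount, int maxFails) {
            this.failCount = failCount;
            this.maxFails = maxFails;
        }
        boolean proceed() { return failCount < maxFails; }
        State onSuccess() { return new State(0, maxFails); }
        State onError() { return new State(failCount + 1, maxFails); }
    }

    // Atomically replaces the current state with transition(state) and returns the new state.
    static State transitionBy(AtomicReference<State> breaker, UnaryOperator<State> transition) {
        return breaker.updateAndGet(transition);
    }

    // Returns a function that only calls f while the breaker's state allows it.
    public static <A, R> Function<A, R> wrapWith(Function<A, R> f, AtomicReference<State> breaker) {
        return arg -> {
            if (!breaker.get().proceed()) {
                throw new IllegalStateException("circuit open"); // fail fast, f is not called
            }
            try {
                R result = f.apply(arg);
                transitionBy(breaker, State::onSuccess);
                return result;
            } catch (RuntimeException e) {
                transitionBy(breaker, State::onError);
                throw e;
            }
        };
    }

    public static void main(String[] args) {
        AtomicReference<State> breaker = new AtomicReference<>(new State(0, 2));
        Function<String, Integer> parse = Integer::parseInt;
        Function<String, Integer> guarded = wrapWith(parse, breaker);
        System.out.println(guarded.apply("42"));
        for (String bad : new String[] {"x", "y", "z"}) {
            try {
                guarded.apply(bad);
            } catch (RuntimeException e) {
                System.out.println(e.getClass().getSimpleName()
                        + ", failures so far: " + breaker.get().failCount);
            }
        }
    }
}
```

<p>In the example the first two bad inputs reach <tt>parse</tt> and are counted as failures; the third is rejected by the breaker without calling <tt>parse</tt> at all. A snapshot of the whole state is available at any time through <tt>breaker.get()</tt>.</p>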
<p>An example usage:</p>
<p><script src="http://gist.github.com/390535.js?file=circuit-breaker-test.clj"></script> (<a href="http://gist.github.com/390535">http://gist.github.com/390535</a>)</p>
<p>Notice that the snapshot of the state is available simply with a <tt>deref</tt>, e.g., as <tt>@cb</tt>.</p>
<h3>A Java-interface</h3>
<p>To expose the functionality to Java I have used Clojure&#8217;s <tt>gen-class</tt> to create a class that exposes two methods given by this protocol, which lets you wrap a function and look at the state:</p>
<p><code><tt><br />
(defprotocol CircuitBreaker<br />
  (#^clojure.lang.IFn wrap [this #^clojure.lang.IFn fn])<br />
  (#^net.higher_order.integration.circuit_breaker.states.CircuitBreakerTransitions current-state [this]))<br />
 </tt></code></p>
<p>Then using <tt>gen-class</tt> I generate a class that implements the interface corresponding to that protocol. This gives the possibility of using the circuit breaker from Java:</p>
<pre><tt><span class="preproc">import</span><span class="normal"> clojure</span><span class="symbol">.</span><span class="normal">lang</span><span class="symbol">.</span><span class="normal">IFn</span><span class="symbol">;</span>
<span class="preproc">import</span><span class="normal"> clojure</span><span class="symbol">.</span><span class="normal">lang</span><span class="symbol">.</span><span class="normal">RT</span><span class="symbol">;</span>
<span class="preproc">import</span><span class="normal"> net</span><span class="symbol">.</span><span class="normal">higher_order</span><span class="symbol">.</span><span class="normal">integration</span><span class="symbol">.</span><span class="normal">circuit_breaker</span><span class="symbol">.</span><span class="normal">AtomicCircuitBreaker</span><span class="symbol">;</span>
<span class="preproc">import</span><span class="normal"> net</span><span class="symbol">.</span><span class="normal">higher_order</span><span class="symbol">.</span><span class="normal">integration</span><span class="symbol">.</span><span class="normal">circuit_breaker</span><span class="symbol">.</span><span class="normal">CircuitBreaker</span><span class="symbol">;</span>

<span class="keyword">public</span><span class="normal"> </span><span class="keyword">class</span><span class="normal"> </span><span class="classname">C</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">	</span><span class="keyword">public</span><span class="normal"> </span><span class="keyword">static</span><span class="normal"> </span><span class="type">void</span><span class="normal"> </span><span class="function">main</span><span class="symbol">(</span><span class="normal">String</span><span class="symbol">[]</span><span class="normal"> args</span><span class="symbol">)</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">		CircuitBreaker atomicCircuitBreaker </span><span class="symbol">=</span><span class="normal"> </span><span class="keyword">new</span><span class="normal"> </span><span class="function">AtomicCircuitBreaker</span><span class="symbol">();</span>
<span class="normal">		IFn wrap </span><span class="symbol">=</span><span class="normal"> </span><span class="symbol">(</span><span class="normal">IFn</span><span class="symbol">)</span><span class="normal"> atomicCircuitBreaker</span><span class="symbol">.</span><span class="function">wrap</span><span class="symbol">(</span><span class="keyword">new</span><span class="normal"> clojure</span><span class="symbol">.</span><span class="normal">lang</span><span class="symbol">.</span><span class="function">AFn</span><span class="symbol">()</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			</span><span class="keyword">public</span><span class="normal"> Object </span><span class="function">invoke</span><span class="symbol">(</span><span class="normal">Object arg0</span><span class="symbol">)</span><span class="normal"> </span><span class="keyword">throws</span><span class="normal"> Exception </span><span class="cbracket">{</span>
<span class="normal">				</span><span class="keyword">if</span><span class="normal"> </span><span class="symbol">(</span><span class="normal">arg0 </span><span class="symbol">==</span><span class="normal"> </span><span class="keyword">null</span><span class="symbol">)</span><span class="normal"> </span><span class="keyword">throw</span><span class="normal"> </span><span class="keyword">new</span><span class="normal"> </span><span class="function">IllegalArgumentException</span><span class="symbol">(</span><span class="string">"null arg"</span><span class="symbol">);</span>
<span class="normal">				System</span><span class="symbol">.</span><span class="normal">out</span><span class="symbol">.</span><span class="function">println</span><span class="symbol">(</span><span class="string">"Invoked with: "</span><span class="symbol">+</span><span class="normal">arg0</span><span class="symbol">);</span>
<span class="normal">				</span><span class="keyword">return</span><span class="normal"> arg0</span><span class="symbol">;</span>
<span class="normal">			</span><span class="cbracket">}</span>
<span class="normal">		</span><span class="cbracket">}</span><span class="symbol">);</span>
<span class="normal">		</span><span class="function">succeed</span><span class="symbol">(</span><span class="normal">atomicCircuitBreaker</span><span class="symbol">,</span><span class="normal"> wrap</span><span class="symbol">);</span>
<span class="normal">		</span><span class="function">fail</span><span class="symbol">(</span><span class="normal">atomicCircuitBreaker</span><span class="symbol">,</span><span class="normal"> wrap</span><span class="symbol">);</span>
<span class="normal">		</span><span class="function">fail</span><span class="symbol">(</span><span class="normal">atomicCircuitBreaker</span><span class="symbol">,</span><span class="normal"> wrap</span><span class="symbol">);</span>
<span class="normal">		</span><span class="function">fail</span><span class="symbol">(</span><span class="normal">atomicCircuitBreaker</span><span class="symbol">,</span><span class="normal"> wrap</span><span class="symbol">);</span>
<span class="normal">		</span><span class="function">fail</span><span class="symbol">(</span><span class="normal">atomicCircuitBreaker</span><span class="symbol">,</span><span class="normal"> wrap</span><span class="symbol">);</span>
<span class="normal">		</span><span class="function">fail</span><span class="symbol">(</span><span class="normal">atomicCircuitBreaker</span><span class="symbol">,</span><span class="normal"> wrap</span><span class="symbol">);</span>
<span class="normal">		</span><span class="function">fail</span><span class="symbol">(</span><span class="normal">atomicCircuitBreaker</span><span class="symbol">,</span><span class="normal"> wrap</span><span class="symbol">);</span>
<span class="normal">		</span><span class="function">sleep</span><span class="symbol">(</span><span class="number">1000</span><span class="symbol">);</span>
<span class="normal">		</span><span class="function">status</span><span class="symbol">(</span><span class="normal">atomicCircuitBreaker</span><span class="symbol">);</span>
<span class="normal">		</span><span class="function">fail</span><span class="symbol">(</span><span class="normal">atomicCircuitBreaker</span><span class="symbol">,</span><span class="normal"> wrap</span><span class="symbol">);</span>
<span class="normal">		</span><span class="function">sleep</span><span class="symbol">(</span><span class="number">5000</span><span class="symbol">);</span>
<span class="normal">		</span><span class="function">succeed</span><span class="symbol">(</span><span class="normal">atomicCircuitBreaker</span><span class="symbol">,</span><span class="normal"> wrap</span><span class="symbol">);</span>
<span class="normal">	</span><span class="cbracket">}</span>

<span class="normal">	</span>
<span class="normal">	</span><span class="keyword">private</span><span class="normal"> </span><span class="keyword">static</span><span class="normal"> </span><span class="type">void</span><span class="normal"> </span><span class="function">sleep</span><span class="symbol">(</span><span class="type">long</span><span class="normal"> howlong</span><span class="symbol">)</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">		</span><span class="keyword">try</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			Thread</span><span class="symbol">.</span><span class="function">sleep</span><span class="symbol">(</span><span class="normal">howlong</span><span class="symbol">);</span>
<span class="normal">		</span><span class="cbracket">}</span><span class="normal"> </span><span class="keyword">catch</span><span class="normal"> </span><span class="symbol">(</span><span class="normal">InterruptedException e</span><span class="symbol">)</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			</span><span class="comment">// Restore the interrupt flag so that callers can observe the interruption</span>
<span class="normal">			Thread</span><span class="symbol">.</span><span class="function">currentThread</span><span class="symbol">().</span><span class="function">interrupt</span><span class="symbol">();</span>
<span class="normal">		</span><span class="cbracket">}</span>
<span class="normal">	</span><span class="cbracket">}</span>

<span class="normal">	</span><span class="keyword">private</span><span class="normal"> </span><span class="keyword">static</span><span class="normal"> </span><span class="type">void</span><span class="normal"> </span><span class="function">succeed</span><span class="symbol">(</span><span class="normal">CircuitBreaker atomicCircuitBreaker</span><span class="symbol">,</span><span class="normal"> IFn wrap</span><span class="symbol">)</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">		</span><span class="keyword">try</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			System</span><span class="symbol">.</span><span class="normal">out</span><span class="symbol">.</span><span class="function">println</span><span class="symbol">(</span><span class="normal">wrap</span><span class="symbol">.</span><span class="function">invoke</span><span class="symbol">(</span><span class="string">"KARL"</span><span class="symbol">));</span>
<span class="normal">			System</span><span class="symbol">.</span><span class="normal">out</span><span class="symbol">.</span><span class="function">println</span><span class="symbol">(</span><span class="normal">wrap</span><span class="symbol">.</span><span class="function">invoke</span><span class="symbol">(</span><span class="number">42</span><span class="symbol">));</span>
<span class="normal">		</span><span class="cbracket">}</span><span class="normal"> </span><span class="keyword">catch</span><span class="normal"> </span><span class="symbol">(</span><span class="normal">Exception e</span><span class="symbol">)</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			System</span><span class="symbol">.</span><span class="normal">out</span><span class="symbol">.</span><span class="function">println</span><span class="symbol">(</span><span class="normal">e</span><span class="symbol">.</span><span class="function">getMessage</span><span class="symbol">());</span>
<span class="normal">		</span><span class="cbracket">}</span><span class="normal"> </span><span class="keyword">finally</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			</span><span class="function">status</span><span class="symbol">(</span><span class="normal">atomicCircuitBreaker</span><span class="symbol">);</span>
<span class="normal">		</span><span class="cbracket">}</span>
<span class="normal">	</span><span class="cbracket">}</span>

<span class="normal">	</span><span class="keyword">private</span><span class="normal"> </span><span class="keyword">static</span><span class="normal"> </span><span class="type">void</span><span class="normal"> </span><span class="function">status</span><span class="symbol">(</span><span class="normal">CircuitBreaker atomicCircuitBreaker</span><span class="symbol">)</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">		System</span><span class="symbol">.</span><span class="normal">out</span><span class="symbol">.</span><span class="function">println</span><span class="symbol">(</span><span class="normal">RT</span><span class="symbol">.</span><span class="function">printString</span><span class="symbol">(</span><span class="normal">atomicCircuitBreaker</span><span class="symbol">.</span><span class="function">current_state</span><span class="symbol">()));</span>
<span class="normal">	</span><span class="cbracket">}</span>

<span class="normal">	</span><span class="keyword">private</span><span class="normal"> </span><span class="keyword">static</span><span class="normal"> </span><span class="type">void</span><span class="normal"> </span><span class="function">fail</span><span class="symbol">(</span><span class="normal">CircuitBreaker atomicCircuitBreaker</span><span class="symbol">,</span><span class="normal"> IFn wrap</span><span class="symbol">)</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">		</span><span class="keyword">try</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			System</span><span class="symbol">.</span><span class="normal">out</span><span class="symbol">.</span><span class="function">println</span><span class="symbol">(</span><span class="normal">wrap</span><span class="symbol">.</span><span class="function">invoke</span><span class="symbol">(</span><span class="keyword">null</span><span class="symbol">));</span>
<span class="normal">			System</span><span class="symbol">.</span><span class="normal">out</span><span class="symbol">.</span><span class="function">println</span><span class="symbol">(</span><span class="normal">wrap</span><span class="symbol">.</span><span class="function">invoke</span><span class="symbol">(</span><span class="number">42</span><span class="symbol">));</span>
<span class="normal">		</span><span class="cbracket">}</span><span class="normal"> </span><span class="keyword">catch</span><span class="normal"> </span><span class="symbol">(</span><span class="normal">Exception e</span><span class="symbol">)</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			System</span><span class="symbol">.</span><span class="normal">out</span><span class="symbol">.</span><span class="function">println</span><span class="symbol">(</span><span class="normal">e</span><span class="symbol">.</span><span class="function">getMessage</span><span class="symbol">());</span>
<span class="normal">		</span><span class="cbracket">}</span><span class="normal"> </span><span class="keyword">finally</span><span class="normal"> </span><span class="cbracket">{</span>
<span class="normal">			</span><span class="function">status</span><span class="symbol">(</span><span class="normal">atomicCircuitBreaker</span><span class="symbol">);</span>
<span class="normal">		</span><span class="cbracket">}</span>
<span class="normal">	</span><span class="cbracket">}</span>
<span class="cbracket">}</span>
</tt></pre>
<p>This is getting long. I&#8217;ll save the comparison for the next post <img src='http://blog.higher-order.net/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Github: <a href="http://github.com/krukow/clojure-circuit-breaker">http://github.com/krukow/clojure-circuit-breaker</a> </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.higher-order.net/2010/05/05/circuitbreaker-clojure-1-2/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>The Joy of Clojure</title>
		<link>http://blog.higher-order.net/2010/01/14/the-joy-of-clojure/</link>
		<comments>http://blog.higher-order.net/2010/01/14/the-joy-of-clojure/#comments</comments>
		<pubDate>Thu, 14 Jan 2010 10:54:08 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Clojure]]></category>
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blog.higher-order.net/?p=444</guid>
		<description><![CDATA[In case you haven&#8217;t noticed there is a very interesting Clojure book coming out, titled &#8220;The Joy of Clojure,&#8221; written by two very interesting authors that anyone hanging out in the Clojure community should know: Chris Houser and Michael Fogus. &#8230; <a href="http://blog.higher-order.net/2010/01/14/the-joy-of-clojure/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>In case you haven&#8217;t noticed there is a very interesting Clojure book coming out, titled &#8220;<a href="http://www.manning.com/fogus/">The Joy of Clojure</a>,&#8221;  written by two very interesting authors that anyone hanging out in the Clojure community should know: <a href="http://twitter.com/chrishouser">Chris Houser</a> and <a href="http://twitter.com/fogus">Michael Fogus</a>. </p>
<p>As an appetizer, the first chapter is available for free:</p>
<p><a href="http://www.manning.com/fogus/Fogus_MEAP_Ch1.pdf">Clojure—A Lisp for the Java Virtual Machine</a></p>
<p>I&#8217;ve read the first chapter and the book looks very promising! To quote the last paragraph of chapter one:</p>
<blockquote><p>We&#8217;ve talked a little about how this book will go beyond what Clojure is to why it&#8217;s designed the way it is and how that design can be exploited through idioms that will help you think in Clojure. So lets stop talking about what this book will do and get on with the doing.<br />
Fasten your seat belts.</p></blockquote>
<p>I&#8217;ve fastened my seat belt and ordered my copy <img src='http://blog.higher-order.net/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.higher-order.net/2010/01/14/the-joy-of-clojure/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Understanding Clojure&#8217;s PersistentHashMap (deftwice&#8230;)</title>
		<link>http://blog.higher-order.net/2009/09/08/understanding-clojures-persistenthashmap-deftwice/</link>
		<comments>http://blog.higher-order.net/2009/09/08/understanding-clojures-persistenthashmap-deftwice/#comments</comments>
		<pubDate>Tue, 08 Sep 2009 15:29:55 +0000</pubDate>
		<dc:creator>krukow</dc:creator>
				<category><![CDATA[Clojure]]></category>
		<category><![CDATA[persistent data structures]]></category>
		<category><![CDATA[PersistentHashMap]]></category>

		<guid isPermaLink="false">http://blog.higher-order.net/?p=386</guid>
		<description><![CDATA[[sept. 8th, 21:22: fixed a +/- 1 error] In a previous post, I gave a high-level description of how Clojure&#8217;s PersistentVector is implemented. While the code has changed, the description was high-level enough that the explanations still hold (although some &#8230; <a href="http://blog.higher-order.net/2009/09/08/understanding-clojures-persistenthashmap-deftwice/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>[sept. 8th, 21:22: fixed a +/- 1 error]</p>
<p>In a previous post, I gave a high-level description of <a href="http://blog.higher-order.net/?p=233">how Clojure&#8217;s PersistentVector is implemented.</a> While the code has changed, the description was high-level enough that the explanations still hold (although some code snippets no longer correspond to what&#8217;s in <a href="http://github.com/richhickey/clojure/tree/master">Git master</a>). </p>
<p>In this post, I&#8217;ll try to explain (also at a high level) how <code>clojure.lang.PersistentHashMap</code> works internally. Reading the mentioned post on PersistentVector is helpful as some of the concepts are the same (e.g., bit-partitioning). </p>
<p><strong>Persistent</strong><br />
PersistentHashMap is a persistent version of the classical hash table data structure. Persistent means that the data structure is immutable, yet has efficient non-destructive operations that correspond to the operations on the classical hash table. E.g., put(K,V) on a hash table corresponds to a side-effect free function assoc(P, K, V), which computes from P a new PersistentHashMap P&#8217; that is like P except that it maps key K to value V. The word &#8220;efficient&#8221; means that the operations are &#8220;on par&#8221; with their mutating counterparts. For Clojure data structures, Rich tries to keep them within a factor of 1-4 of the corresponding Java data structure operations, and read-only operations can even be faster than Java&#8217;s. Later I will cover &#8216;transients&#8217;, a new optimization that makes &#8220;batch&#8221; operations faster.</p>
<p><strong>Array-mapped hash trie</strong><br />
In his paper <a href="http://lampwww.epfl.ch/papers/idealhashtrees.pdf">Ideal Hash Trees</a>, Phil Bagwell describes a data structure called the &#8220;Hash Array Mapped Trie&#8221;, which is an efficient implementation of a hash tree based on a combination of hashing and the <a href="http://en.wikipedia.org/wiki/Trie">trie data structure.</a> Hash Array Mapped Tries are not persistent or immutable. What Rich did was create a persistent version of Bagwell&#8217;s data structure: <tt>clojure.lang.PersistentHashMap</tt>.</p>
<p><strong>PersistentHashMap basic idea</strong><br />
PersistentHashMap (PHM) maintains a very wide tree, each node having up to 32 children. Each node is a concrete implementation of a static inner interface, <tt>INode</tt>, and there are five implementations of this interface: EmptyNode, LeafNode, FullNode, HashCollisionNode, BitmapIndexedNode. I&#8217;ll only cover EmptyNode, LeafNode and BitmapIndexedNode; the latter being where most of the interesting stuff happens. </p>
<p>The <tt>INode</tt> interface looks like this:</p>
<pre><tt><span class="keyword">static</span><span class="normal"> </span><span class="keyword">interface</span><span class="normal"> </span><span class="classname">INode</span><span class="cbracket">{</span>
<span class="normal">    INode </span><span class="function">assoc</span><span class="symbol">(</span><span class="type">int</span><span class="normal"> shift</span><span class="symbol">,</span><span class="normal"> </span><span class="type">int</span><span class="normal"> hash</span><span class="symbol">,</span><span class="normal"> Object key</span><span class="symbol">,</span><span class="normal"> Object val</span><span class="symbol">,</span><span class="normal"> Box addedLeaf</span><span class="symbol">);</span>
<span class="normal">    LeafNode </span><span class="function">find</span><span class="symbol">(</span><span class="type">int</span><span class="normal"> hash</span><span class="symbol">,</span><span class="normal"> Object key</span><span class="symbol">);</span>
<span class="normal">    </span><span class="comment">//I've left out a few methods</span>
<span class="cbracket">}</span>
</tt></pre>
<p>The <tt>assoc</tt> method &#8220;adds&#8221; a new key-value pair to the map. The <tt>find</tt> method searches for the Leaf-node holding a key.</p>
<p>An EmptyNode simply represents the empty hash map. LeafNodes are also pretty simple; they hold the actual entries stored in the map. </p>
<p>The root node of the tree is initially an EmptyNode. When assoc is called on EmptyNode, it returns a new LeafNode, which holds the key-value pair. So EmptyNode &#8220;becomes&#8221; a LeafNode with assoc. In turn, a LeafNode typically &#8220;becomes&#8221; a BitmapIndexedNode with assoc.  We will go into details with BitmapIndexedNode, but first we need to understand&#8230;</p>
<p><strong>Bit-partitioning of hash-codes</strong><br />
When PHM assocs a key object K with value object V, it first computes the hashCode of K, just as a hash-table would. The hash code of K yields an int, which has a 32-bit representation in Java (as I explained in <a href="http://blog.higher-order.net/?p=233">the post on PersistentVector</a>). Here are some example bit representations of numbers:</p>
<p><img src="http://blog.higher-order.net/files/clj/bitpartitioning1.png" alt="PersistentHashMap illustration 1" /></p>
<p>The trick that PHM uses is to partition this bit representation into blocks of 5 bits, represented with colors in the above example. Each block corresponds to a &#8220;level&#8221; in the tree structure; for example, the right-most green block corresponds to the root level, and the orange block corresponds to the children of the root. Exactly what &#8220;corresponds&#8221; means is described below.  Levels are multiples of 5. I.e., the root level is level 0, the children of the root are level 5, the grand-children of the root are level 10, etc. Note that a block of five bits corresponds to a number in the range 0-31.  </p>
<p>The reason that levels are multiples of five is the following: You have a bit-representation of a hash-code and you are interested in a particular block corresponding to a level <tt>n</tt>. You obtain this number in two steps: first move the block of bits to the right, until it is the right-most block. Then null-out all other bits except this right-most block. This is done with two bit-operations: you simply right-shift the bits by the level <tt>n</tt> and then do a bit-wise &#8216;and&#8217; (<tt>&amp;</tt>) with the pattern <tt>00..11111</tt>. For example, suppose you want the block corresponding to level 5 (the orange block) of the number 1258 (binary: <tt>[00001][00111][01010]</tt>). You right-shift by the level, 5, which gives <tt>[00000][00001][00111]</tt>; then do the null&#8217;ing, yielding <tt>[00111]</tt>, which is exactly the orange block of 1258.</p>
<p>The following function does this.</p>
<pre><tt><span class="keyword">static</span><span class="normal"> </span><span class="type">int</span><span class="normal"> </span><span class="function">mask</span><span class="symbol">(</span><span class="type">int</span><span class="normal"> hash</span><span class="symbol">,</span><span class="normal"> </span><span class="type">int</span><span class="normal"> shift</span><span class="symbol">)</span><span class="cbracket">{</span>
<span class="normal">	</span><span class="keyword">return</span><span class="normal"> </span><span class="symbol">(</span><span class="normal">hash </span><span class="symbol">&gt;&gt;&gt;</span><span class="normal"> shift</span><span class="symbol">)</span><span class="normal"> </span><span class="symbol">&amp;</span><span class="normal"> </span><span class="number">0x01f</span><span class="symbol">;</span>
<span class="cbracket">}</span></tt></pre>
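<p>To make the worked example concrete, here is a small self-contained sketch that applies the same <tt>mask</tt> function to 1258 at levels 0, 5 and 10. The class name <tt>MaskDemo</tt> is only for illustration and is not part of Clojure. The last line hints at why the unsigned shift <tt>&gt;&gt;&gt;</tt> is used: hash codes can be negative, and a signed shift would drag the sign bit into the top block.</p>

```java
// Illustrative sketch; the class name MaskDemo is hypothetical.
final class MaskDemo {
    // Same bit-trick as the mask function above: shift the wanted
    // 5-bit block to the right-most position, then zero everything else.
    static int mask(int hash, int shift) {
        return (hash >>> shift) & 0x01f;
    }

    public static void main(String[] args) {
        int hash = 1258; // binary [00001][00111][01010]
        System.out.println(mask(hash, 0));  // prints 10 (block 01010)
        System.out.println(mask(hash, 5));  // prints 7  (block 00111)
        System.out.println(mask(hash, 10)); // prints 1  (block 00001)

        // Only two bits remain at level 30, so the result is at most 3,
        // even for a negative hash code such as -1 (all bits set).
        System.out.println(mask(-1, 30));   // prints 3
    }
}
```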
<p><strong>Illustrating the tree structure</strong><br />
I&#8217;ll use the following picture (adapted from one of Rich&#8217;s slides).<br />
<center><br />
<img src="http://blog.higher-order.net/files/clj/persistenthashmap1.png" alt="PersistentHashMap illustration 2" /><br />
</center></p>
<p>The colored nodes are <tt>BitmapIndexedNode</tt>s and have between 2 and 31 children (should they get a 32nd child, they become <tt>FullNode</tt>s). A naive implementation of <tt>BitmapIndexedNode</tt> might be the following: use an int variable, <tt>level</tt>, to denote the level that this node lives in, and allocate a full 32 element array of INode references for the children. To add a new child: lookup the index via the bit-block corresponding to the level, i.e. given a hashCode <tt>hash</tt> for the child, and given the level, call <tt>mask(hash, level)</tt> to get the index in range [0, 31]. But this strategy wastes a lot of memory: each node has a full 32 element array where most entries are simply <tt>null</tt>, i.e., if there are 4 children there are 28 null references which are just wasting space.</p>
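<p>As a rough sketch of that naive strategy (this is not code from Clojure; <tt>NaiveNode</tt> and its methods are made up for illustration, and the node is mutable only to keep the example short), the waste is easy to see once you count the empty slots:</p>

```java
// Hypothetical naive node, for contrast only; not part of clojure.lang.
final class NaiveNodeDemo {
    static int mask(int hash, int shift) {
        return (hash >>> shift) & 0x01f;
    }

    static final class NaiveNode {
        final int level;
        // Always allocates all 32 slots, however few children exist.
        final Object[] children = new Object[32];

        NaiveNode(int level) {
            this.level = level;
        }

        void put(int hash, Object child) {
            children[mask(hash, level)] = child;
        }

        Object get(int hash) {
            return children[mask(hash, level)];
        }

        int wastedSlots() {
            int empty = 0;
            for (Object c : children) {
                if (c == null) empty++;
            }
            return empty;
        }
    }

    public static void main(String[] args) {
        NaiveNode node = new NaiveNode(5);
        // Four hash codes whose level-5 blocks are 0, 2, 7 and 16.
        node.put(0 << 5, "a");
        node.put(2 << 5, "b");
        node.put(1258, "c");
        node.put(16 << 5, "d");
        System.out.println(node.get(1258));     // prints c
        System.out.println(node.wastedSlots()); // prints 28
    }
}
```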
<p>The hard part is to only use as much space as is needed for each <tt>BitmapIndexedNode</tt>, i.e., if a <tt>BitmapIndexedNode</tt> has <tt>N</tt> children it maintains an array of size <tt>N</tt>. But then we can&#8217;t use <tt>mask(hash,shift)</tt> as the index into the array since it returns a number in the range [0,31] and we need a number only in range [0, <tt>N</tt>). </p>
<p><strong>bitpos</strong><br />
So we need a function to map numbers in range [0, 31] to indexes in range [0, <tt>N</tt>). The function has to be fast, i.e., constant time, since we are using it to find the child of a node from a hash code, which we will do at each level in the tree. The function is a composition of two functions: <tt>bitpos</tt> and <tt>index</tt>. Function <tt>bitpos</tt> maps numbers [0, 31] to powers of two, i.e., numbers that have a binary representation of the form:<br />
<center><tt>{10<sup>n</sup> | n &gt;= 0}</tt>.</center><br />
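For example, <tt>bitpos(7)</tt> (writing <tt>bitpos(x)</tt> as shorthand for applying <tt>bitpos</tt> to a bit-block with value <tt>x</tt>) in binary is 10000000. We always look at <tt>bitpos(x)</tt> in binary form. We return to the function <tt>index</tt> shortly.</p>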
<pre><tt><span class="keyword">static</span><span class="normal"> </span><span class="type">int</span><span class="normal"> </span><span class="function">bitpos</span><span class="symbol">(</span><span class="type">int</span><span class="normal"> hash</span><span class="symbol">,</span><span class="normal"> </span><span class="type">int</span><span class="normal"> shift</span><span class="symbol">)</span><span class="cbracket">{</span>
<span class="normal">    </span><span class="keyword">return</span><span class="normal"> </span><span class="number">1</span><span class="normal"> </span><span class="symbol">&lt;&lt;</span><span class="normal"> </span><span class="function">mask</span><span class="symbol">(</span><span class="normal">hash</span><span class="symbol">,</span><span class="normal"> shift</span><span class="symbol">);</span>
<span class="cbracket">}</span></tt></pre>
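<p>A quick way to convince yourself of the &#8220;one followed by zeros&#8221; shape is to print a few values in binary. This is a sketch only; <tt>BitposDemo</tt> is a made-up class name, and <tt>mask</tt> is repeated so that the example stands on its own:</p>

```java
// Illustrative sketch; the class name BitposDemo is hypothetical.
final class BitposDemo {
    static int mask(int hash, int shift) {
        return (hash >>> shift) & 0x01f;
    }

    static int bitpos(int hash, int shift) {
        return 1 << mask(hash, shift);
    }

    public static void main(String[] args) {
        // Level 5 block of 1258 is 7, so exactly bit 7 is set.
        System.out.println(Integer.toBinaryString(bitpos(1258, 5))); // prints 10000000
        // Level 0 block of 1258 is 10, so exactly bit 10 is set.
        System.out.println(Integer.toBinaryString(bitpos(1258, 0))); // prints 10000000000
        // Block 31 sets the sign bit: the int is negative, but it is
        // still a single 1-bit, which is all the bit tricks rely on.
        System.out.println(bitpos(31, 0) == Integer.MIN_VALUE);      // prints true
        System.out.println(Integer.bitCount(bitpos(31, 0)));         // prints 1
    }
}
```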
<p><strong>bitmap</strong><br />
Each <tt>BitmapIndexedNode</tt> also maintains an int variable <tt>bitmap</tt> which we also look at in binary form. The <tt>bitmap</tt> tells us how many children this node has, and also what their indexes are in the child array. All this is encoded into one <tt>int</tt> variable! How? The bit-map has a binary representation, e.g.,<br />
<center><tt>00000000000000010000000010000101</tt></center><br />
The number of children is the number of <tt>1</tt>&#8217;s in the binary representation. If the <tt>n</tt>th bit in <tt>bitmap</tt> is <tt>1</tt> (counting right-to-left, starting with position 0) then there is a child for bit-block <tt>n</tt>. So to check if a child exists for a certain hash-code: first compute <tt>mask(hash,shift)</tt> to get the bit-block, a number in range [0, 31]. Then compute <tt>bitpos</tt> of this. You then have a number of the form <tt>10<sup>n</sup></tt>. Now match that with the <tt>bitmap</tt> to check if there is a <tt>1</tt> in the <tt>n</tt>&#8217;th position; this match is simply a bit-wise and, &#8216;&amp;&#8217;, of <tt>bitmap</tt> with <tt>bitpos</tt>, and the child exists exactly when the result is non-zero. We&#8217;d better take an example.  </p>
<p>Suppose we are at level 5, and looking up an element with hash-code 1258. Suppose also the bitmap-indexed node has four children, with <tt>bitmap</tt><br />
<tt>bitmap =<sub>binary rep</sub> 00000000000000010000000010000101</tt></p>
<p>Now check if there is a child for hashCode 1258 at this level:</p>
<p><tt>mask(1258,5) = 7</tt> (i.e., binary <tt>00111</tt> as we saw before).</p>
<p><tt>bitpos(7) =<sub>binary rep</sub> 10000000</tt></p>
<p><tt>00000000000000010000000010000101 &amp;</tt><br />
<tt>00000000000000000000000010000000 = 00000000000000000000000010000000</tt></p>
<p>The result is non-zero, which means that the child exists. Now what is its index? This is where the <tt>index</tt> function comes into play&#8230;</p>
<p><strong>index</strong><br />
This is the final piece of bit-trickery, I promise <img src='http://blog.higher-order.net/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' />  The index of a child is the number of <tt>1</tt>&#8217;s to the right of the child&#8217;s <tt>bitpos</tt> in the bit map. In our example above, the index corresponding to hash code 1258 would be 2, since there are two <tt>1</tt>&#8217;s to the right of <tt>10000000</tt> in the <tt>bitmap</tt>.  Now the trick is that on many processors there is an efficient instruction called CTPOP which counts the number of ones in the bit representation of an integer (CTPOP is &#8220;count (bit) population&#8221;). Note that if we subtract 1 from the bitpos, <tt>10<sup>n</sup></tt>, we get <tt>01<sup>n</sup></tt>; a binary &#8216;and&#8217; of this with the bit map gives us the same bit map, but where only the <tt>1</tt>&#8217;s to the right of <tt>bitpos</tt> are present. If we do a CTPOP on this, we get the index. Hence, </p>
<pre><tt><span class="keyword">final</span><span class="normal"> </span><span class="type">int</span><span class="normal"> </span><span class="function">index</span><span class="symbol">(</span><span class="type">int</span><span class="normal"> bit</span><span class="symbol">)</span><span class="cbracket">{</span>
<span class="normal">    </span><span class="keyword">return</span><span class="normal"> Integer</span><span class="symbol">.</span><span class="function">bitCount</span><span class="symbol">(</span><span class="normal">bitmap </span><span class="symbol">&amp;</span><span class="normal"> </span><span class="symbol">(</span><span class="normal">bit </span><span class="symbol">-</span><span class="normal"> </span><span class="number">1</span><span class="symbol">));</span>
<span class="cbracket">}</span></tt></pre>
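<p>Putting the three pieces together for the running example (hash code 1258, level 5, and the four-child <tt>bitmap</tt> from before) might look like the sketch below. <tt>IndexDemo</tt> is a hypothetical class, and <tt>index</tt> takes the bitmap as a parameter here only so that the example needs no surrounding node object:</p>

```java
// Illustrative sketch; IndexDemo and its static index(bitmap, bit)
// signature are assumptions made to keep the example self-contained.
final class IndexDemo {
    // Bits 0, 2, 7 and 16 are set: the node has four children.
    static final int BITMAP = 0b0000_0000_0000_0001_0000_0000_1000_0101;

    static int mask(int hash, int shift) {
        return (hash >>> shift) & 0x01f;
    }

    static int bitpos(int hash, int shift) {
        return 1 << mask(hash, shift);
    }

    // Number of 1-bits strictly to the right of 'bit' in the bitmap.
    static int index(int bitmap, int bit) {
        return Integer.bitCount(bitmap & (bit - 1));
    }

    public static void main(String[] args) {
        int bit = bitpos(1258, 5);                       // binary 10000000
        System.out.println(Integer.bitCount(BITMAP));    // prints 4 (children)
        System.out.println((BITMAP & bit) != 0);         // prints true (child exists)
        System.out.println(index(BITMAP, bit));          // prints 2

        // A hash whose level-5 block is 16 maps to the last slot.
        System.out.println(index(BITMAP, bitpos(16 << 5, 5))); // prints 3
        // A hash whose level-5 block is 3 has no child at all.
        System.out.println((BITMAP & bitpos(3 << 5, 5)) != 0); // prints false
    }
}
```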
<p><strong>To be continued&#8230;</strong><br />
From this you should be able to understand how find works. It combines all this:</p>
<pre><tt><span class="keyword">final</span><span class="normal"> </span><span class="keyword">static</span><span class="normal"> </span><span class="keyword">class</span><span class="normal"> </span><span class="classname">BitmapIndexedNode</span><span class="normal"> </span><span class="keyword">implements</span><span class="normal"> INode</span><span class="cbracket">{</span>
<span class="normal">    </span><span class="keyword">final</span><span class="normal"> </span><span class="type">int</span><span class="normal"> bitmap</span><span class="symbol">;</span>
<span class="normal">    </span><span class="keyword">final</span><span class="normal"> INode</span><span class="symbol">[]</span><span class="normal"> nodes</span><span class="symbol">;</span>
<span class="normal">    </span><span class="keyword">final</span><span class="normal"> </span><span class="type">int</span><span class="normal"> shift</span><span class="symbol">;</span>
<span class="normal">    </span><span class="comment">//some stuff left out</span>
<span class="normal">    </span><span class="keyword">static</span><span class="normal"> </span><span class="type">int</span><span class="normal"> </span><span class="function">bitpos</span><span class="symbol">(</span><span class="type">int</span><span class="normal"> hash</span><span class="symbol">,</span><span class="normal"> </span><span class="type">int</span><span class="normal"> shift</span><span class="symbol">)</span><span class="cbracket">{</span>
<span class="normal">	</span><span class="keyword">return</span><span class="normal"> </span><span class="number">1</span><span class="normal"> </span><span class="symbol">&lt;&lt;</span><span class="normal"> </span><span class="function">mask</span><span class="symbol">(</span><span class="normal">hash</span><span class="symbol">,</span><span class="normal"> shift</span><span class="symbol">);</span>
<span class="normal">    </span><span class="cbracket">}</span>
<span class="normal">    </span>
<span class="normal">    </span><span class="keyword">final</span><span class="normal"> </span><span class="type">int</span><span class="normal"> </span><span class="function">index</span><span class="symbol">(</span><span class="type">int</span><span class="normal"> bit</span><span class="symbol">)</span><span class="cbracket">{</span>
<span class="normal">	</span><span class="keyword">return</span><span class="normal"> Integer</span><span class="symbol">.</span><span class="function">bitCount</span><span class="symbol">(</span><span class="normal">bitmap </span><span class="symbol">&amp;</span><span class="normal"> </span><span class="symbol">(</span><span class="normal">bit </span><span class="symbol">-</span><span class="normal"> </span><span class="number">1</span><span class="symbol">));</span>
<span class="normal">    </span><span class="cbracket">}</span>
<span class="normal">    </span><span class="comment">//...some methods left out</span>
<span class="normal">    </span>
<span class="normal">   </span><span class="keyword">public</span><span class="normal"> LeafNode </span><span class="function">find</span><span class="symbol">(</span><span class="type">int</span><span class="normal"> hash</span><span class="symbol">,</span><span class="normal"> Object key</span><span class="symbol">)</span><span class="cbracket">{</span>
<span class="normal">       </span><span class="type">int</span><span class="normal"> bit </span><span class="symbol">=</span><span class="normal"> </span><span class="function">bitpos</span><span class="symbol">(</span><span class="normal">hash</span><span class="symbol">,</span><span class="normal"> shift</span><span class="symbol">);</span>
<span class="normal">       </span><span class="keyword">if</span><span class="symbol">((</span><span class="normal">bitmap </span><span class="symbol">&amp;</span><span class="normal"> bit</span><span class="symbol">)</span><span class="normal"> </span><span class="symbol">!=</span><span class="normal"> </span><span class="number">0</span><span class="symbol">)</span>
<span class="normal">	   </span><span class="cbracket">{</span>
<span class="normal">	       </span><span class="keyword">return</span><span class="normal"> nodes</span><span class="symbol">[</span><span class="function">index</span><span class="symbol">(</span><span class="normal">bit</span><span class="symbol">)].</span><span class="function">find</span><span class="symbol">(</span><span class="normal">hash</span><span class="symbol">,</span><span class="normal"> key</span><span class="symbol">);</span>
<span class="normal">	   </span><span class="cbracket">}</span>
<span class="normal">       </span><span class="keyword">else</span>
<span class="normal">	   </span><span class="keyword">return</span><span class="normal"> </span><span class="keyword">null</span><span class="symbol">;</span>
<span class="normal">   </span><span class="cbracket">}</span>
<span class="cbracket">}</span>
</tt></pre>
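<p>If you want to step through <tt>find</tt> yourself, the following is a stripped-down, runnable sketch. It is not the real <tt>clojure.lang</tt> code: <tt>Node</tt>, <tt>Leaf</tt>, <tt>BitmapNode</tt>, <tt>sampleRoot</tt> and <tt>lookup</tt> are simplified stand-ins, and the little tree is wired up by hand (since assoc is not covered yet). The keys 5 and 37 share the level-0 block 5, so they live one level down, while 1258 sits directly under the root.</p>

```java
// Simplified, hypothetical sketch; not the actual clojure.lang classes.
final class FindDemo {
    interface Node {
        Leaf find(int hash, Object key);
    }

    // A leaf holds one entry and answers find only for its own key.
    static final class Leaf implements Node {
        final int hash;
        final Object key;
        final Object val;

        Leaf(Object key, Object val) {
            this.hash = key.hashCode();
            this.key = key;
            this.val = val;
        }

        public Leaf find(int hash, Object key) {
            return (hash == this.hash && this.key.equals(key)) ? this : null;
        }
    }

    static final class BitmapNode implements Node {
        final int bitmap;
        final Node[] nodes; // one slot per 1-bit in bitmap, in bit order
        final int shift;

        BitmapNode(int bitmap, Node[] nodes, int shift) {
            this.bitmap = bitmap;
            this.nodes = nodes;
            this.shift = shift;
        }

        static int mask(int hash, int shift) {
            return (hash >>> shift) & 0x01f;
        }

        static int bitpos(int hash, int shift) {
            return 1 << mask(hash, shift);
        }

        int index(int bit) {
            return Integer.bitCount(bitmap & (bit - 1));
        }

        public Leaf find(int hash, Object key) {
            int bit = bitpos(hash, shift);
            if ((bitmap & bit) == 0) {
                return null; // no child for this bit-block
            }
            return nodes[index(bit)].find(hash, key);
        }
    }

    // Hand-built tree: Integer keys hash to their own value.
    static Node sampleRoot() {
        Node level5 = new BitmapNode((1 << 0) | (1 << 1),
                new Node[] { new Leaf(5, "five"), new Leaf(37, "thirty-seven") }, 5);
        return new BitmapNode((1 << 5) | (1 << 10),
                new Node[] { level5, new Leaf(1258, "twelve-fifty-eight") }, 0);
    }

    static Object lookup(Node root, Object key) {
        Leaf leaf = root.find(key.hashCode(), key);
        return leaf == null ? null : leaf.val;
    }

    public static void main(String[] args) {
        Node root = sampleRoot();
        System.out.println(lookup(root, 1258)); // prints twelve-fifty-eight
        System.out.println(lookup(root, 37));   // prints thirty-seven
        System.out.println(lookup(root, 5));    // prints five
        System.out.println(lookup(root, 69));   // prints null (same level-0 block as 5, absent below)
        System.out.println(lookup(root, 7));    // prints null (no child at the root)
    }
}
```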
<p>In part 2 we look at how assoc works&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.higher-order.net/2009/09/08/understanding-clojures-persistenthashmap-deftwice/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Clojure talks in Copenhagen and Aarhus &#8211; now with Azul Systems Clojure demo</title>
		<link>http://blog.higher-order.net/2009/08/17/clojure-talks-in-copenhagen-and-aarhus-now-with-azul-systems-clojure-demo/</link>
		<comments>http://blog.higher-order.net/2009/08/17/clojure-talks-in-copenhagen-and-aarhus-now-with-azul-systems-clojure-demo/#comments</comments>
		<pubDate>Mon, 17 Aug 2009 10:45:30 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Clojure]]></category>
		<category><![CDATA[free events]]></category>
		<category><![CDATA[JAOO]]></category>

		<guid isPermaLink="false">http://blog.higher-order.net/?p=343</guid>
		<description><![CDATA[Upcoming events: I am giving a talk on Clojure in Copenhagen and Aarhus &#8211; this is the first chance for a dcug meetup (though also non-dcug members are invited). The events are after work and free &#8211; there will even &#8230; <a href="http://blog.higher-order.net/2009/08/17/clojure-talks-in-copenhagen-and-aarhus-now-with-azul-systems-clojure-demo/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><strong>Upcoming events</strong>: I am giving a talk on Clojure in Copenhagen and Aarhus &#8211; this is the first chance for a <a href="http://www.clojure.dk">dcug</a> meetup (though non-dcug members are also invited). The events are after work and free &#8211; there will even be free sandwiches, compliments of Trifork <img src='http://blog.higher-order.net/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<ul>
<li>Monday, Sept. 7th in Aarhus, location: Trifork. Registration: <a href="https://secure.trifork.com/aarhus-2009/freeevent/register.m?eventOID=2129">https://secure.trifork.com/aarhus-2009/freeevent/register.m?eventOID=2129</a></li>
<li>Wednesday, Sept. 9th in Copenhagen, location: Trifork Cph. Registration: <a href="https://secure.trifork.com/aarhus-2009/freeevent/register.m?eventOID=2130">https://secure.trifork.com/aarhus-2009/freeevent/register.m?eventOID=2130</a>.</li>
<li>Free Clojure workshop at JAOO &#8211; featuring Rich Hickey. October 6, 2009, 17.30 &#8211; 19.30. Registration: <a href="https://secure.trifork.com/aarhus-2009/freeevent/register.m?eventOID=2093">https://secure.trifork.com/aarhus-2009/freeevent/register.m?eventOID=2093</a></li>
</ul>
<p><strong>Abstract for Aarhus/Cph talks.</strong><br />
Clojure is..</p>
<p>&#8230; a new functional, dynamic programming language for Java Virtual Machines. The primary novelty of Clojure is its strong focus on and support for in-process concurrency: a unique concurrency model, combining a notion of persistent (i.e., immutable, fast) data structures, with a lock-free concurrency model. This simplifies concurrent programming greatly and has good scalability properties.<br />
Influenced by LISP and Haskell, Clojure supports pure, lazy functional programming and has a powerful macro system which makes extending the language to support DSLs easy and powerful.</p>
<p>This talk..</p>
<p>&#8230; is split in three parts. In the first part, Clojure is introduced for those who don&#8217;t know the language. There is so much to cover that this will be a fast tour with pointers to more information, but we will emphasize the unique aspects of the language.</p>
<p>In the second part we go into more depth regarding the implementation of persistent (and transient) data structures &#8211; &#8220;the secret sauce of Clojure&#8221; <img src='http://blog.higher-order.net/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>In the third part we get to see Clojure in action running on some very cool technology &#8211; a unique opportunity! Azul Systems (www.azulsystems.com) has promised to make available one of their large Vega 3 compute appliances (864 cores, 368 GB memory, let&#8217;s go concurrent). We will explore how the Clojure concurrency model fares in practice, scaling a demo of a parallel Traveling Salesman Problem algorithm. We will also push the implementation to its limits in a high-contention demo. Great fun!</p>
<hr />
Remember: Active until August 31st:<br />
DCUG members can now get a 15% discount on JAOO tickets.<br />
Simply click the banner below, choose &#8220;register here&#8221; and use the promotion code: dcug</p>
<p><a href="http://trifork-affiliate-program.com/scripts/click.php?a_aid=dcug&amp;a_bid=93ef8336&amp;desturl=https%3A%2F%2Fsecure.trifork.com%2Faarhus-2009%2Fregistration%2F"><img title="JAOO Aarhus 2009 - The Conference for the 360 Degree software developer" src="http://trifork-affiliate-program.com/accounts/default1/banners/Jaoo_webbanner234X60_49okt.jpg" alt="JAOO Aarhus 2009 - The Conference for the 360 Degree software developer" width="234" height="60" /></a><img style="border:0" src="http://trifork-affiliate-program.com/scripts/imp.php?a_aid=dcug&amp;a_bid=93ef8336" alt="" width="1" height="1" /></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.higher-order.net/2009/08/17/clojure-talks-in-copenhagen-and-aarhus-now-with-azul-systems-clojure-demo/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>JAOO 2009 discount (for Clojure users ;-)</title>
		<link>http://blog.higher-order.net/2009/07/13/jaoo-discount/</link>
		<comments>http://blog.higher-order.net/2009/07/13/jaoo-discount/#comments</comments>
		<pubDate>Mon, 13 Jul 2009 05:25:56 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Clojure]]></category>

		<guid isPermaLink="false">http://blog.higher-order.net/?p=331</guid>
		<description><![CDATA[I suggested to the JAOO program committee that the JAOO Aarhus 2009 conference should have a concurrency track. Their reply was &#8220;that&#8217;s a good idea &#8211; you are hosting it!&#8221; &#8211; this is how Trifork works The good thing is &#8230; <a href="http://blog.higher-order.net/2009/07/13/jaoo-discount/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I suggested to the JAOO program committee that the <a href="http://trifork-affiliate-program.com/scripts/click.php?a_aid=dcug&amp;a_bid=7441e314&amp;desturl=http%3A%2F%2Fjaoo.dk%2Faarhus-2009%2F">JAOO Aarhus 2009 conference<img style="border:0" src="http://trifork-affiliate-program.com/scripts/imp.php?a_aid=dcug&amp;a_bid=7441e314" alt="" width="1" height="1" /></a> should have a concurrency track. Their reply was &#8220;that&#8217;s a good idea &#8211; you are hosting it!&#8221; &#8211; this is how Trifork works <img src='http://blog.higher-order.net/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>The good thing is that the track host gets to pick (or at least propose) speakers for the track (the bad thing is that it entails some work!). Given my recent interest in Clojure and since he is such a great speaker, I immediately suggested we invite Rich Hickey. I&#8217;ll write a bit more on the program in an upcoming post, but right now I just want to mention that via the Danish Clojure Users&#8217; Group I am now a JAOO affiliate, and anyone signing up via dcug gets a 15% discount on JAOO tickets.</p>
<p>To sign up, follow this procedure:</p>
<ol>
<li>To join dcug, simply register as a dcug user at <a href="http://www.clojure.dk">http://www.clojure.dk</a>.</li>
<li>Read about the discount here: <a href="http://clojure.higher-order.net/?p=32">http://clojure.higher-order.net/?p=32</a></li>
</ol>
<p>As an appetizer, check out <a href="http://clojure.higher-order.net/?p=28">this post</a> at dcug. </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.higher-order.net/2009/07/13/jaoo-discount/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>One aspect of lazy computation</title>
		<link>http://blog.higher-order.net/2009/04/23/one-aspect-of-lazy-computation/</link>
		<comments>http://blog.higher-order.net/2009/04/23/one-aspect-of-lazy-computation/#comments</comments>
		<pubDate>Thu, 23 Apr 2009 20:38:07 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Clojure]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[Laziness]]></category>

		<guid isPermaLink="false">http://blog.higher-order.net/?p=298</guid>
		<description><![CDATA[I&#8217;ve often needed to do a combination of filtering and mapping on arrays. E.g. in a Ruby on Rails app, I might have a list of &#8220;RecurringActivation&#8221; model objects which have a product_id and an integer period. Now I would &#8230; <a href="http://blog.higher-order.net/2009/04/23/one-aspect-of-lazy-computation/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve often needed to do a combination of filtering and mapping on arrays. E.g. in a Ruby on Rails app, I might have a list of &#8220;RecurringActivation&#8221; model objects which have a product_id and an integer period. Now I would like a list of product_ids where the current time &#8220;matches&#8221; the period; in code this might be:</p>
<pre><tt><span class="comment">#version 1 - 'functional'</span>
<span class="normal">cur_period </span><span class="symbol">=</span><span class="normal"> Time</span><span class="symbol">.</span><span class="normal">now</span><span class="symbol">.</span><span class="normal">month </span><span class="symbol">-</span><span class="normal"> </span><span class="number">1</span>
<span class="normal">RecurringActivation</span><span class="symbol">.</span><span class="normal">find</span><span class="symbol">(:</span><span class="normal">all</span><span class="symbol">,</span><span class="normal"> </span><span class="symbol">:</span><span class="normal">select </span><span class="symbol">=&gt;</span><span class="normal"> </span><span class="string">"product_id, period"</span><span class="symbol">).</span><span class="normal">select </span><span class="cbracket">{</span><span class="normal"> </span><span class="symbol">|</span><span class="normal">r</span><span class="symbol">|</span>
<span class="normal">  cur_period </span><span class="symbol">%</span><span class="normal"> r</span><span class="symbol">.</span><span class="normal">period </span><span class="symbol">==</span><span class="normal"> </span><span class="number">0</span>
<span class="cbracket">}</span><span class="symbol">.</span><span class="normal">map </span><span class="symbol">&amp;:</span><span class="normal">product_id</span>

<span class="comment">#version 2 - 'imperative'</span>
<span class="normal">cur_period </span><span class="symbol">=</span><span class="normal"> Time</span><span class="symbol">.</span><span class="normal">now</span><span class="symbol">.</span><span class="normal">month </span><span class="symbol">-</span><span class="normal"> </span><span class="number">1</span>
<span class="normal">result </span><span class="symbol">=</span><span class="normal"> </span><span class="symbol">[]</span>
<span class="normal">RecurringActivation</span><span class="symbol">.</span><span class="normal">find</span><span class="symbol">(:</span><span class="normal">all</span><span class="symbol">,</span><span class="normal"> </span><span class="symbol">:</span><span class="normal">select </span><span class="symbol">=&gt;</span><span class="normal"> </span><span class="string">"product_id, period"</span><span class="symbol">).</span><span class="normal">each </span><span class="cbracket">{</span><span class="normal"> </span><span class="symbol">|</span><span class="normal">r</span><span class="symbol">|</span>
<span class="normal">  result </span><span class="symbol">&lt;&lt;</span><span class="normal"> r</span><span class="symbol">.</span><span class="normal">product_id  </span><span class="keyword">if</span><span class="normal"> cur_period </span><span class="symbol">%</span><span class="normal"> r</span><span class="symbol">.</span><span class="normal">period </span><span class="symbol">==</span><span class="normal"> </span><span class="number">0</span>
<span class="cbracket">}</span></tt></pre>
<p>In the first version I compose the functions (methods) &#8216;select&#8217; and &#8216;map&#8217;: the blocks have no side effects, which is why I call it &#8216;functional&#8217;. It is shorter and clearer. Of course, the library implements this using iteration and imperative assignment, but at least my code feels functional.</p>
<p>In the second version I use &#8216;each&#8217; with a block that does both filtering and &#8216;mapping&#8217; in one step: I explicitly add the product_id to the result array for each object that satisfies my criterion. </p>
<p>Notice that the first version first produces a complete filtered list which is then passed on to the map method which produces a complete mapped and filtered list: the list is iterated twice, and an intermediate array containing only the filtered objects exists in memory and is later collected as garbage. In the second version the list is only iterated once, and such an intermediate array of filtered objects is never allocated. Hence one might choose the latter for performance reasons. </p>
<p>Now consider the same in Clojure or any lazy functional language. We might have (no active record here):</p>
<pre><tt>
krukow:~/Projects/private/okooko-prod/tmp$ cl
Clojure
user=> (<span class="keyword">def</span> recurring_activations '({:product_id 1 :period 2}
                                    {:product_id 2 :period 3}
                                    {:product_id 3 :period 2}))
#'user/recurring_activations
user=> (<span class="keyword">def</span> res
         (map <span class="symbol">#</span>(<span class="keyword">do</span> (println <span class="string">"map"</span>) (<span class="symbol">:product_id</span> <span class="symbol">%</span>))
             (filter <span class="symbol">#</span>(<span class="keyword">do</span> (println <span class="string">"filter"</span>) (= 0 (mod (:period <span class="symbol">%</span>) <span class="number">2</span>)))
                 recurring_activations)))
#'user/res
user=>
</tt></pre>
<p>Nothing has been mapped or filtered yet, since Clojure&#8217;s sequence functions such as map and filter are lazy. So res is now a lazy seq which will compute exactly what is needed on demand. Now think about what will happen if I just peek at the first element of res… In a non-lazy language, the entire recurring_activations list would already have been filtered (printing &#8220;filter&#8221; three times), then that result would have been mapped (printing &#8220;map&#8221; twice), and finally the first element would be looked up. So the output would be<br />
&#8220;filter&#8221;<br />
&#8220;filter&#8221;<br />
&#8220;filter&#8221;<br />
&#8220;map&#8221;<br />
&#8220;map&#8221;<br />
(return first element of list).</p>
<p>What happens in Clojure? It prints &#8220;filter&#8221; &#8220;map&#8221; and gives the first element:</p>
<pre><tt>
user=> (first res)
filter
map
1
user=>
</tt></pre>
<p>In effect we don&#8217;t need an intermediate list containing the filtered elements which is then garbage collected. Notice also that if we just look at the entire list (recomputing res first) we get:</p>
<pre><tt>
user=> res
(filter
map
filter
filter
map
1 3)
user=>
</tt></pre>
<p>So the side effects are evaluated in a different order than if we had realized the entire filtered list first. This can be counterintuitive when one is used to eager languages (which all mainstream languages are); however, once understood, laziness can be extremely powerful and elegant.</p>
<p>In effect, <em>in lazy languages we can compose functions that work on entire sequences, e.g., map and filter, to obtain a lazy sequence which evaluates all of the composed functions on each element in sequence.</em> </p>
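<p>A minimal sketch of the same point, recording calls in a vector instead of printing. The names <tt>calls</tt>, <tt>logged</tt> and <tt>squares-of-evens</tt> are illustrative and not part of the example above; the input is a quoted list so that elements are processed one at a time.</p>
<pre><tt>
;; Sketch only: `calls` and `logged` are illustrative names.
(def calls (atom []))

(defn logged
  "Wraps f so that every call is recorded in `calls` under the given label."
  [label f]
  (fn [x]
    (swap! calls conj [label x])
    (f x)))

;; Nothing is computed here: map and filter only build a lazy seq.
(def squares-of-evens
  (map (logged :map #(* % %))
       (filter (logged :filter even?) '(1 2 3 4 5 6))))

;; Asking for the first element runs just enough of both functions.
(first squares-of-evens)
;; returns 4

(deref calls)
;; returns [[:filter 1] [:filter 2] [:map 2]]

;; Since nothing is computed ahead of demand, the same composition
;; also works on an infinite sequence.
(take 3 (map #(* % %) (filter even? (range))))
;; returns (0 4 16)
</tt></pre>
<p>Only two elements are filtered and one is mapped before the first result is available; the remaining elements are not touched until someone asks for them.</p>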
<p>If that sounded abstract and poorly phrased, I can try to say it more concretely: in Ruby (or Java) I would need to write a new function, filter_and_map, taking two &#8220;blocks&#8221;/&#8220;procs&#8221; and a list, then using e.g. &#8220;each&#8221; on the list and applying first the filter &#8220;proc&#8221; and then the map &#8220;proc&#8221;; the existing functions &#8220;map&#8221; and &#8220;select&#8221; can&#8217;t help me. In Clojure (or Haskell) I can simply compose the existing library functions filter and map to obtain the same thing:</p>
<pre><tt>
(def filter_map
	(comp (partial map :product_id)
	          (partial filter #(= 0 (mod (:period %) 2)))))
</tt></pre>
<p>or just use it inline</p>
<pre><tt>
user=> (def res
	(map :product_id
	 (filter #(= 0 (mod (:period %) 2)) recurring_activations)))
#'user/res
user=> res
(1 3)
user=>
</tt></pre>
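<p>For completeness, here is a rough sketch of the composed function generalised over the predicate and the mapping function, and then applied to the sample data. The names <tt>filter-and-map</tt>, <tt>activations</tt> and <tt>product-ids-with-even-period</tt> are hypothetical helpers introduced only for this illustration.</p>
<pre><tt>
;; Sketch only: `filter-and-map` is a hypothetical helper name.
(defn filter-and-map
  "Returns a function that lazily keeps the elements satisfying pred
   and applies f to each of them."
  [pred f]
  (comp (partial map f) (partial filter pred)))

(def activations '({:product_id 1 :period 2}
                   {:product_id 2 :period 3}
                   {:product_id 3 :period 2}))

(def product-ids-with-even-period
  (filter-and-map #(zero? (mod (:period %) 2)) :product_id))

(product-ids-with-even-period activations)
;; returns (1 3)
</tt></pre>
<p>The helper contains no iteration of its own: it only composes the library functions filter and map, and the result is still a lazy seq.</p>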
]]></content:encoded>
			<wfw:commentRss>http://blog.higher-order.net/2009/04/23/one-aspect-of-lazy-computation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Clojure event sourcing</title>
		<link>http://blog.higher-order.net/2009/02/22/clojure-event-sourcing/</link>
		<comments>http://blog.higher-order.net/2009/02/22/clojure-event-sourcing/#comments</comments>
		<pubDate>Sun, 22 Feb 2009 19:56:46 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Clojure]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[instanceof]]></category>

		<guid isPermaLink="false">http://blog.higher-order.net/?p=270</guid>
		<description><![CDATA[Event sourcing: Event Sourcing ensures that all changes to application state are stored as a sequence of events. Not just can we query these events, we can also use the event log to reconstruct past states, and as a foundation &#8230; <a href="http://blog.higher-order.net/2009/02/22/clojure-event-sourcing/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://martinfowler.com/eaaDev/EventSourcing.html">Event sourcing</a>:</p>
<blockquote><p>Event Sourcing ensures that all changes to application state are stored as a sequence of events. Not just can we query these events, we can also use the event log to reconstruct past states, and as a foundation to automatically adjust the state to cope with retroactive changes.</p></blockquote>
<p>I recently watched <a href="http://www.infoq.com/news/2009/01/greg-young-ddd">an InfoQ interview with Greg Young</a> where he discusses an interesting architecture that uses event stream processing. I thought their architecture sounded very interesting, but I did not go into &#8216;research mode&#8217; to dig into it. Later I came by Jonas Bonér&#8217;s blog on &#8220;Real world scala&#8221; and saw an <a href="http://jonasboner.com/2009/02/12/event-sourcing-using-actors.html">example of event sourcing using actors in Scala</a>. I thought it would be a good exercise to implement a similar example in Clojure, and perhaps it would also be interesting to compare the solutions and see the effect of the languages on the implementation.</p>
<p>As Clojure focuses on concurrency it was natural to start thinking about what &#8216;concurrent&#8217; event sourcing would mean and how to implement it. </p>
<p><strong>Sequential Clojure implementation.</strong> For context, read Fowler&#8217;s article on <a href="http://martinfowler.com/eaaDev/EventSourcing.html">event sourcing</a>, and for comparison Jonas Bonér&#8217;s <a href="http://jonasboner.com/2009/02/12/event-sourcing-using-actors.html">Scala implementation</a>.</p>
<p><strong>Modeling.</strong> Scala uses classes and actors for modeling ships and their state. A ship is an actor that receives events (messages) and reacts to those by updating its internal state, i.e., its location (e.g., departure event, arrival event). </p>
<p>Clojure has <em>agents</em> which are similar to actors. Agents are simpler and there are more functions that act on agents than on Actors; for example, in Clojure it is possible to directly read the state of an agent (using <tt>(<span class="keyword">deref</span> x)</tt> or <tt>@x</tt>) — this is not possible with actors since they are distributed (at least programmed as if they were distributed).</p>
<p>I&#8217;ve used two additional Clojure features in implementing event sourcing: watchers and metadata. <a href="http://clojure.org/api#toc50">Watchers</a> are a form of callback attached to agents. The callback is called synchronously with the agent actions and &#8220;derefs of the agent in the callback will see the value set during that action.&#8221; I am using watchers to record the events that are sent to each ship. This decouples the code dealing with event storage from the state-changing functions sent to agents. The state of an agent is the location of the ship it represents, so we can get the location of a ship at any time simply by deref&#8217;ing the agent representing the ship (with actors this is more complex, as one has to send a message to the actor and receive a message with the answer). I decided to use metadata attached to the location to represent the event that caused a move to that location. For example:</p>
<pre><tt>
user> @(first agents)
{:country "At sea", :city "At sea"}
user> ^@(first agents)
{:agent 0, :time #&lt;Date Sun Feb 22 19:49:59 CET 2009&gt;, :type :depart_for,
:loc {:country "Sweden", :city "Malmö"}}
user></tt></pre>
<p>In general, I think it is possible to implement sequential event sourcing reasonably elegantly in Clojure using agents, watchers and metadata. The agents store the state that can change; the watchers record the events that cause the changes by looking at the state&#8217;s metadata. The invariant should be that if an agent moves from state <tt>s1</tt> to state <tt>s2</tt>, then <tt>(meta s2)</tt> stores the event that caused the transition from <tt>s1</tt> to <tt>s2</tt>. Then, if one knows the initial state and all the events, it is possible to reconstruct the entire sequence of states.</p>
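<p>The reconstruction step can be sketched roughly as a fold over the event log. This is an illustration under assumptions, not the code from the source file: <tt>apply-event</tt>, <tt>at-sea</tt>, <tt>initial-state</tt>, <tt>replay-log</tt> and the <tt>:arrive_at</tt> event type are hypothetical, and only the <tt>:depart_for</tt> event shape is taken from the REPL output above.</p>
<pre><tt>
;; Sketch only: every name here is hypothetical; :arrive_at is an assumed event type.
(def at-sea {:country "At sea", :city "At sea"})

(defn apply-event
  "Returns the state that follows `state` after `event`, with the event
   attached as metadata so that (meta s2) is the event that led to s2."
  [state event]
  (case (:type event)
    :depart_for (with-meta at-sea event)
    :arrive_at  (with-meta (:loc event) event)
    state))

(def initial-state {:country "Denmark", :city "Aarhus"})

(def replay-log
  [{:type :depart_for, :loc {:country "Sweden", :city "Malmö"}}
   {:type :arrive_at,  :loc {:country "Sweden", :city "Malmö"}}])

;; reductions yields the initial state followed by every intermediate state.
(def states (reductions apply-event initial-state replay-log))

(map :city states)
;; returns ("Aarhus" "At sea" "Malmö")
</tt></pre>
<p>Because each state carries its causing event as metadata, <tt>(map meta (rest states))</tt> gives back the original event log.</p>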
<p>Our watchers are quite simple. (Notice the cool syntax <tt>#(meta @%)</tt>. A shame I couldn&#8217;t write <tt>#(^@%)</tt> instead: it looks like I&#8217;m swearing <img src='http://blog.higher-order.net/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' />, which would be cool.)</p>
<pre><tt>
;;Record event history
(def events (atom (map #(meta @%) agents))) 

(defn event_logger
  [idx a changed]
  (if changed
    (swap! events conj (meta @a))))

(defn add-watchers
  "associate a watcher with each agent
   the watcher logs the events for that agent"
  []
  (dotimes [i NUM_SHIPS]
    (add-watch (agents i) i event_logger)))</tt></pre>
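<p>The logger above takes three arguments (key, agent and a changed flag). In later Clojure releases the callback passed to <tt>add-watch</tt> takes four arguments instead: the key, the reference, the old state and the new state. A minimal sketch under that assumption, with a single hypothetical ship; <tt>recorded-events</tt>, <tt>log-event</tt>, <tt>ship</tt> and <tt>depart</tt> are illustrative names.</p>
<pre><tt>
;; Sketch only: assumes the four-argument watch callback of later Clojure releases.
(def recorded-events (atom []))

(defn log-event
  "Watch callback: records the metadata of the new state whenever
   an action produced a new state object."
  [_key _ref old-state new-state]
  (when-not (identical? old-state new-state)
    (swap! recorded-events conj (meta new-state))))

(def ship (agent {:country "Denmark", :city "Aarhus"}))
(add-watch ship :event-logger log-event)

(defn depart
  "Action sent to the agent: the ship is now at sea, and the event
   that caused this is attached as metadata."
  [_state event]
  (with-meta {:country "At sea", :city "At sea"} event))

(send ship depart {:type :depart_for, :loc {:country "Sweden", :city "Malmö"}})
(await ship)        ; block until the action (and its watchers) have run
(shutdown-agents)   ; only needed so that a script can exit

(deref recorded-events)
;; returns [{:type :depart_for, :loc {:country "Sweden", :city "Malmö"}}]
</tt></pre>
<p>As in the original, the state-changing function knows nothing about event storage; the watcher picks the event up from the metadata of the new state.</p>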
<p><strong>Concurrent event sourcing.</strong> A natural question to ask for Clojure is &#8220;what if there are multiple concurrent event sources?&#8221; Is it still possible to do event sourcing? I don&#8217;t think so; at least not in a manner as powerful as with sequential events. Since Fowler defines event sourcing as ensuring &#8220;(…) that all changes to application state are stored as a sequence of events&#8221;, we already have a problem: when events can occur concurrently, one must serialize them in order to store them as a sequence. One could try to timestamp all events and store them as a set, but this is sensitive to timing and scheduler issues. Even if stored as a set of timestamped events, how would one replay these events in a way that ensures that the global program states are the same in the replayed program? This would depend on thread scheduler timings. </p>
<p>It is of course possible to do the less advanced use cases of event sourcing, e.g. event querying and analysis (I guess that is concurrent event stream processing).</p>
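<p>As a rough sketch of the serialization problem (the names <tt>shared-log</tt>, <tt>record!</tt> and <tt>workers</tt> are illustrative): several threads append events to a single atom. The atom forces the events into one sequence, but how the ships interleave in that sequence depends on the scheduler and may differ from run to run.</p>
<pre><tt>
;; Sketch only: four hypothetical ships emit 25 events each, concurrently.
(def shared-log (atom []))

(defn record!
  "Appends the event to the shared log, numbering it by its position."
  [event]
  (swap! shared-log
         (fn [events] (conj events (assoc event :seq (count events))))))

(def workers
  (doall
    (for [ship (range 4)]
      (future
        (dotimes [n 25]
          (record! {:ship ship, :n n}))))))

;; Wait for every worker to finish.
(doseq [w workers] (deref w))
(shutdown-agents)   ; only needed so that a script can exit

(count (deref shared-log))
;; returns 100

;; The order of ships below is whatever the scheduler happened to produce.
(map :ship (take 10 (deref shared-log)))
</tt></pre>
<p>The log is a well-defined sequence, and each ship&#8217;s own events stay in order within it, but replaying it reproduces only one of the many possible interleavings.</p>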
<p>Anyway, if anyone is still reading <img src='http://blog.higher-order.net/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' />, here is the <a href="http://blog.higher-order.net/files/clj/es.clj">source file</a>. </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.higher-order.net/2009/02/22/clojure-event-sourcing/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

