<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Harder, Better, Faster, Stronger</title>
	<atom:link href="http://hbfs.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://hbfs.wordpress.com</link>
	<description>Explorations in better, faster, stronger code.</description>
	<lastBuildDate>Tue, 14 May 2013 18:52:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='hbfs.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/c22d23f7ac31436d57af00e0027dc364?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>Harder, Better, Faster, Stronger</title>
		<link>http://hbfs.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://hbfs.wordpress.com/osd.xml" title="Harder, Better, Faster, Stronger" />
	<atom:link rel='hub' href='http://hbfs.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Average node depth in a Full Tree</title>
		<link>http://hbfs.wordpress.com/2013/05/14/average-node-depth-in-a-full-tree/</link>
		<comments>http://hbfs.wordpress.com/2013/05/14/average-node-depth-in-a-full-tree/#comments</comments>
		<pubDate>Tue, 14 May 2013 18:50:36 +0000</pubDate>
		<dc:creator>Steven Pigeon</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[data structures]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[average depth]]></category>
		<category><![CDATA[Binary Tree]]></category>
		<category><![CDATA[full tree]]></category>
		<category><![CDATA[path]]></category>
		<category><![CDATA[path depth]]></category>
		<category><![CDATA[Tree]]></category>

		<guid isPermaLink="false">http://hbfs.wordpress.com/?p=4557</guid>
		<description><![CDATA[While doing something else I stumbled upon the interesting problem of computing the average depth of nodes in a tree. The depth of a node is the distance that separates that node from the root. You can either decide that the root is at depth 1, or you can decide that it is at depth [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hbfs.wordpress.com&#038;blog=4426521&#038;post=4557&#038;subd=hbfs&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>While doing something else I stumbled upon the interesting problem of computing the average depth of nodes in a tree. The depth of a node is the distance that separates that node from the root. You can either decide that the root is at depth 1, or you can decide that it is at depth zero, but let&#8217;s decide on depth 1. So an immediate child of the root is at depth two, and its children at depth 3, and so on until you reach leaves, nodes with no children.</p>
<p><a href="http://hbfs.files.wordpress.com/2011/12/tree-diagram7.png"><img src="http://hbfs.files.wordpress.com/2011/12/tree-diagram7.png?w=150&#038;h=109" alt="tree-diagram7" width="150" height="109" class="aligncenter size-thumbnail wp-image-3862" /></a></p>
<p>So the calculation of the average node depth (including leaves) in a tree becomes interesting when we want to know how far a constructed tree is from the ideal <a href="http://en.wikipedia.org/wiki/Binary_tree#Types_of_binary_trees" target="_blank">full tree</a>, as a measure of (application-specific) performance. After searching a bit on the web, I found only incomplete or incorrect formulas, or formulas stated without proof. This week, let us see how we can derive the result without (too much) pain.</p>
<p><span id="more-4557"></span></p>
<p>Let us start with a tree with <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> nodes (in what follows, the distinction between internal nodes and leaves is noted where relevant, so we&#8217;ll just say &#8216;nodes&#8217; when it does not matter). First, we find integers <img src='http://s0.wp.com/latex.php?latex=k%5Cgeqslant%7B0%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k&#92;geqslant{0}' title='k&#92;geqslant{0}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n%5Cgeqslant%7B0%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geqslant{0}' title='n&#92;geqslant{0}' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=m+%3D+2%5En%2Bk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m = 2^n+k' title='m = 2^n+k' class='latex' />.</p>
<p>Then we reflect on what <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> imposes on the shape of the tree. If <img src='http://s0.wp.com/latex.php?latex=m%3D2%5En-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m=2^n-1' title='m=2^n-1' class='latex' /> (or, from the decomposition above, <img src='http://s0.wp.com/latex.php?latex=m%3D+2%5E%7Bn-1%7D%2B%282%5E%7Bn-1%7D-1%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m= 2^{n-1}+(2^{n-1}-1)' title='m= 2^{n-1}+(2^{n-1}-1)' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=k%3D2%5E%7Bn-1%7D-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=2^{n-1}-1' title='k=2^{n-1}-1' class='latex' />) the tree can be balanced in such a way that it has all leaves on the same level. The tree is perfectly triangular:</p>
<p><a href="http://hbfs.files.wordpress.com/2013/03/triangular-tree.png"><img src="http://hbfs.files.wordpress.com/2013/03/triangular-tree.png?w=150&#038;h=143" alt="triangular-tree" width="150" height="143" class="aligncenter size-thumbnail wp-image-4565" /></a></p>
<p>In that case, the average depth is given by</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cbar%7Bd%7D%3D%5Cfrac%7B1%7D%7B2%5En-1%7D%5Csum_%7Bd%3D1%7D%5En+d+2%5E%7Bd-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle &#92;bar{d}=&#92;frac{1}{2^n-1}&#92;sum_{d=1}^n d 2^{d-1}' title='&#92;displaystyle &#92;bar{d}=&#92;frac{1}{2^n-1}&#92;sum_{d=1}^n d 2^{d-1}' class='latex' /></p>
<p>because there are <img src='http://s0.wp.com/latex.php?latex=2%5E%7Bd-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2^{d-1}' title='2^{d-1}' class='latex' /> nodes at depth <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> (indeed: at depth <img src='http://s0.wp.com/latex.php?latex=d%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d=1' title='d=1' class='latex' />, the root, there is <img src='http://s0.wp.com/latex.php?latex=2%5E%7Bd-1%7D%3D2%5E0%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2^{d-1}=2^0=1' title='2^{d-1}=2^0=1' class='latex' /> node. At depth <img src='http://s0.wp.com/latex.php?latex=d%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d=2' title='d=2' class='latex' />, we have <img src='http://s0.wp.com/latex.php?latex=2%5E%7Bd-1%7D%3D2%5E1%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2^{d-1}=2^1=2' title='2^{d-1}=2^1=2' class='latex' /> nodes, etc.) At depth <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' />, all <img src='http://s0.wp.com/latex.php?latex=2%5E%7Bd-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2^{d-1}' title='2^{d-1}' class='latex' /> nodes have depth <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' />, contributing <img src='http://s0.wp.com/latex.php?latex=d+2%5E%7Bd-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d 2^{d-1}' title='d 2^{d-1}' class='latex' /> to the total. Since there are <img src='http://s0.wp.com/latex.php?latex=m%3D2%5En-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m=2^n-1' title='m=2^n-1' class='latex' /> nodes, the average is given by the formula above.</p>
<p>The sum <img src='http://s0.wp.com/latex.php?latex=%5Csum+d+2%5E%7Bd%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sum d 2^{d}' title='&#92;sum d 2^{d}' class='latex' /> should look <a href="http://hbfs.wordpress.com/2009/10/27/of-staircases-and-textbooks/" target="_blank">familiar</a> to you. Here, however, we have <img src='http://s0.wp.com/latex.php?latex=%5Csum+d+2%5E%7Bd-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sum d 2^{d-1}' title='&#92;sum d 2^{d-1}' class='latex' />, so we need to transform it a bit so that we fall back on the form <img src='http://s0.wp.com/latex.php?latex=%5Csum+d+2%5Ed&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sum d 2^d' title='&#92;sum d 2^d' class='latex' />. First:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Csum_%7Bd%3D1%7D%5En+d+2%5E%7Bd-1%7D%3D%5Csum_%7Bd%3D1%7D%5En+d+%5Cleft%28%5Cfrac%7B1%7D%7B2%7D2%5Ed%5Cright%29%3D%5Cfrac%7B1%7D%7B2%7D%5Csum_%7Bd%3D1%7D%5En+d+2%5Ed&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle &#92;sum_{d=1}^n d 2^{d-1}=&#92;sum_{d=1}^n d &#92;left(&#92;frac{1}{2}2^d&#92;right)=&#92;frac{1}{2}&#92;sum_{d=1}^n d 2^d' title='&#92;displaystyle &#92;sum_{d=1}^n d 2^{d-1}=&#92;sum_{d=1}^n d &#92;left(&#92;frac{1}{2}2^d&#92;right)=&#92;frac{1}{2}&#92;sum_{d=1}^n d 2^d' class='latex' /></p>
<p>and we know that
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Csum_%7Bd%3D1%7D%5En+d+2%5Ed+%3D+2%5E%7Bn%2B1%7D%28n-1%29%2B2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle &#92;sum_{d=1}^n d 2^d = 2^{n+1}(n-1)+2' title='&#92;displaystyle &#92;sum_{d=1}^n d 2^d = 2^{n+1}(n-1)+2' class='latex' /></p>
<p>Finally,</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cfrac%7B1%7D%7B2%5En-1%7D%5Csum_%7Bd%3D1%7D%5En+d+2%5E%7Bd-1%7D%3D%5Cleft%28%5Cfrac%7B1%7D%7B2%5En-1%7D%5Cright%29%5Cleft%28%5Cfrac%7B1%7D%7B2%7D%5Csum_%7Bd%3D1%7D%5En+d+2%5Ed%5Cright%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle &#92;frac{1}{2^n-1}&#92;sum_{d=1}^n d 2^{d-1}=&#92;left(&#92;frac{1}{2^n-1}&#92;right)&#92;left(&#92;frac{1}{2}&#92;sum_{d=1}^n d 2^d&#92;right)' title='&#92;displaystyle &#92;frac{1}{2^n-1}&#92;sum_{d=1}^n d 2^{d-1}=&#92;left(&#92;frac{1}{2^n-1}&#92;right)&#92;left(&#92;frac{1}{2}&#92;sum_{d=1}^n d 2^d&#92;right)' class='latex' />
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%3D%5Cleft%28%5Cfrac%7B1%7D%7B2%5En-1%7D%5Cright%29%5Cleft%28%5Cfrac%7B1%7D%7B2%7D%5Cleft%282%5E%7Bn%2B1%7D%28n-1%29%2B2%5Cright%29%5Cright%29%3D%5Cfrac%7B2%5En%28n-1%29%2B1%7D%7B2%5En-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle =&#92;left(&#92;frac{1}{2^n-1}&#92;right)&#92;left(&#92;frac{1}{2}&#92;left(2^{n+1}(n-1)+2&#92;right)&#92;right)=&#92;frac{2^n(n-1)+1}{2^n-1}' title='&#92;displaystyle =&#92;left(&#92;frac{1}{2^n-1}&#92;right)&#92;left(&#92;frac{1}{2}&#92;left(2^{n+1}(n-1)+2&#92;right)&#92;right)=&#92;frac{2^n(n-1)+1}{2^n-1}' class='latex' /></p>
<p>and <img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cfrac%7B2%5En%28n-1%29%2B1%7D%7B2%5En-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle &#92;frac{2^n(n-1)+1}{2^n-1}' title='&#92;displaystyle &#92;frac{2^n(n-1)+1}{2^n-1}' class='latex' /> is approximately <img src='http://s0.wp.com/latex.php?latex=n-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n-1' title='n-1' class='latex' /> when <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is large.</p>
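<p>As a sanity check on the algebra, the closed form for the total depth can be compared against the sum computed term by term. The following is a minimal Bash sketch, not part of the derivation itself; the function names <tt>depth_sum</tt> and <tt>closed_form</tt> are made up for the illustration, and only Bash integer arithmetic is assumed.</p>

```shell
# Hypothetical helper: total depth of a perfect tree with n levels,
# computed term by term as the sum over d of d * 2^(d-1)
depth_sum() {
    local n=$1 d=1 total=0
    while [ "$d" -le "$n" ]
    do
        total=$(( total + d * 2**(d - 1) ))
        d=$(( d + 1 ))
    done
    echo "$total"
}

# Hypothetical helper: the closed form 2^n (n-1) + 1 for the same sum
closed_form() {
    local n=$1
    echo $(( 2**n * (n - 1) + 1 ))
}

# both columns should agree for every n
for n in 1 2 3 10
do
    echo "n=$n sum=$(depth_sum $n) closed=$(closed_form $n)"
done
```

<p>Dividing either value by the number of nodes, 2^n - 1, gives the average depth; for three levels that is 17/7.</p>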
<p align="center">*<br />*&emsp;*</p>
<p>However, if <img src='http://s0.wp.com/latex.php?latex=m%5Cneq+2%5En-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m&#92;neq 2^n-1' title='m&#92;neq 2^n-1' class='latex' />, then we have a tree with <img src='http://s0.wp.com/latex.php?latex=k%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k+1' title='k+1' class='latex' /> leaves one level deeper than the main tree, and the tree looks something like:</p>
<p><a href="http://hbfs.files.wordpress.com/2013/03/regular-tree.png"><img src="http://hbfs.files.wordpress.com/2013/03/regular-tree.png?w=150&#038;h=147" alt="regular-tree" width="150" height="147" class="aligncenter size-thumbnail wp-image-4566" /></a></p>
<p>The depth equation now becomes</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cfrac%7B1%7D%7B2%5En%2Bk%7D%5Cleft%28%5Csum_%7Bd%3D1%7D%5En+d+2%5E%7Bd-1%7D%2B%28k%2B1%29%28n%2B1%29%5Cright%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;frac{1}{2^n+k}&#92;left(&#92;sum_{d=1}^n d 2^{d-1}+(k+1)(n+1)&#92;right)' title='&#92;frac{1}{2^n+k}&#92;left(&#92;sum_{d=1}^n d 2^{d-1}+(k+1)(n+1)&#92;right)' class='latex' /></p>
<p>which simplifies to</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cbar%7Bd%7D%3D%5Cfrac%7B2%5En%28n-1%29%2B1%2B%28k%2B1%29%28n%2B1%29%7D%7B2%5En%2Bk%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle &#92;bar{d}=&#92;frac{2^n(n-1)+1+(k+1)(n+1)}{2^n+k}' title='&#92;displaystyle &#92;bar{d}=&#92;frac{2^n(n-1)+1+(k+1)(n+1)}{2^n+k}' class='latex' /></p>
<p align="center">*<br />*&emsp;*</p>
<p>The hard&mdash;or rather inconvenient&mdash;part is to find the decomposition <img src='http://s0.wp.com/latex.php?latex=m%3D2%5En%2Bk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m=2^n+k' title='m=2^n+k' class='latex' /> which needs either a call to log-base-2 (gasp!) or <img src='http://s0.wp.com/latex.php?latex=O%28%5Clg+m%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O(&#92;lg m)' title='O(&#92;lg m)' class='latex' /> steps using an integer method. Other than that, now that you know how the formula is derived (with the root starting at depth <i>one</i>, not zero, because you still need 1 operation to look at it), there shouldn&#8217;t be any more problems.</p>
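<p>The integer method can be sketched in a few lines of Bash. This is only an illustration under stated assumptions: the names <tt>decompose</tt> and <tt>average_depth</tt> are hypothetical, <tt>m</tt> is assumed to be at least 1, <tt>n</tt> is taken as the largest exponent such that 2^n does not exceed <tt>m</tt>, and the result is printed as an exact fraction because Bash has no floating point.</p>

```shell
# Hypothetical helper: prints "n k" such that m = 2^n + k, where
# 2^n is the largest power of two not exceeding m (assumes m is 1 or more)
decompose() {
    local m=$1 n=0 p=1
    # keep doubling p = 2^n while the next power of two still fits in m
    while [ $(( p * 2 )) -le "$m" ]
    do
        p=$(( p * 2 ))
        n=$(( n + 1 ))
    done
    echo "$n $(( m - p ))"
}

# Hypothetical helper: average depth of a full tree with m nodes,
# printed as the fraction numerator/denominator
average_depth() {
    local m=$1 n k
    set -- $(decompose "$m")
    n=$1
    k=$2
    echo "$(( 2**n * (n - 1) + 1 + (k + 1) * (n + 1) ))/$m"
}

average_depth 7     # perfect tree with 3 levels: prints 17/7
average_depth 10    # n=3, k=2: prints 29/10
```

<p>The loop runs once per bit of <tt>m</tt>, which is where the logarithmic number of steps comes from.</p>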
<p>Also, the formula applies to <i>full trees</i>, not any kind of trees. If you have a Fibonacci tree, or some other kind of tree, then you will have to find another formula for the average depth, and it may not be trivial.</p>
<br />Filed under: <a href='http://hbfs.wordpress.com/category/algorithms/'>algorithms</a>, <a href='http://hbfs.wordpress.com/category/data-structures/'>data structures</a>, <a href='http://hbfs.wordpress.com/category/mathematics/'>Mathematics</a> Tagged: <a href='http://hbfs.wordpress.com/tag/average-depth/'>average depth</a>, <a href='http://hbfs.wordpress.com/tag/binary-tree/'>Binary Tree</a>, <a href='http://hbfs.wordpress.com/tag/full-tree/'>full tree</a>, <a href='http://hbfs.wordpress.com/tag/path/'>path</a>, <a href='http://hbfs.wordpress.com/tag/path-depth/'>path depth</a>, <a href='http://hbfs.wordpress.com/tag/tree/'>Tree</a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/hbfs.wordpress.com/4557/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/hbfs.wordpress.com/4557/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hbfs.wordpress.com&#038;blog=4426521&#038;post=4557&#038;subd=hbfs&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://hbfs.wordpress.com/2013/05/14/average-node-depth-in-a-full-tree/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d3d9050d6870dcfaf7f207cd5ca2b50b?s=96&#38;d=identicon" medium="image">
			<media:title type="html">stevenpigeon</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2011/12/tree-diagram7.png?w=150" medium="image">
			<media:title type="html">tree-diagram7</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2013/03/triangular-tree.png?w=150" medium="image">
			<media:title type="html">triangular-tree</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2013/03/regular-tree.png?w=150" medium="image">
			<media:title type="html">regular-tree</media:title>
		</media:content>
	</item>
		<item>
		<title>Parsing GPS data with Bash</title>
		<link>http://hbfs.wordpress.com/2013/05/07/parsing-gps-data-with-bash/</link>
		<comments>http://hbfs.wordpress.com/2013/05/07/parsing-gps-data-with-bash/#comments</comments>
		<pubDate>Tue, 07 May 2013 17:19:41 +0000</pubDate>
		<dc:creator>Steven Pigeon</dc:creator>
				<category><![CDATA[Bash (Shell)]]></category>
		<category><![CDATA[hacks]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[cut]]></category>
		<category><![CDATA[GPS]]></category>
		<category><![CDATA[grep]]></category>
		<category><![CDATA[NMEA]]></category>
		<category><![CDATA[sed]]></category>
		<category><![CDATA[tr]]></category>

		<guid isPermaLink="false">http://hbfs.wordpress.com/?p=4578</guid>
		<description><![CDATA[Last time we looked at how to get the data from the GPS and now we will have a look at how to parse the data. Turns out that except for the check-sum, everything is pretty straightforward, even in Bash. So, why bash in the first place? Well, there&#8217;s no real reason except that [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hbfs.wordpress.com&#038;blog=4426521&#038;post=4578&#038;subd=hbfs&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a href="http://hbfs.wordpress.com/2013/04/30/reading-gps-data-with-bash/" target="_blank">Last time</a> we looked at how to get the data from the GPS and now we will have a look at how to parse the data. Turns out that except for the check-sum, everything is pretty straightforward, even in Bash.</p>
<p><a href="http://hbfs.files.wordpress.com/2011/03/map-detail.png"><img src="http://hbfs.files.wordpress.com/2011/03/map-detail.png?w=150&#038;h=150" alt="map-detail" width="150" height="150" class="aligncenter size-thumbnail wp-image-3216" /></a></p>
<p>So, why Bash in the first place? Well, there&#8217;s no real reason except that for the other thing I&#8217;m working on, it&#8217;s the ideal glue-code language, allowing me to simply invoke other programs that I do not want to re-code (or take parts of) to do what I want. I must say that I even have a C# version of the GPS data grabber, but while fancier, it does not bring much more than the Bash version.</p>
<p><span id="more-4578"></span></p>
<p>A typical GPS message looks like</p>
<pre class="brush: plain; title: ; notranslate">
$GPGGA,041050.000,xxxx.xxxx,N,yyyy.yyyy,W,1,06,1.1,3.9,M,,,,0000*1D
</pre>
<p>Here, we have a series of comma-separated fields, ended by <tt>*</tt> and a check-sum&mdash;let us ignore those for now. The message type, <tt>$GPGGA</tt>, gives the <a href="http://www.gpsinformation.org/dale/nmea.htm#GGA" target="_blank">3D location and accuracy data</a>, containing things like the UTC time of capture, the number of satellites being tracked, and the position (here <tt>xxxx.xxxx</tt> and <tt>yyyy.yyyy</tt>; let&#8217;s keep some privacy).</p>
<p>Splitting the message into fields is trivial in Bash (as it would be in C# where it would suffice to use <tt>string.Split(...)</tt> to get essentially the same result):</p>
<pre class="brush: bash; title: ; notranslate">
old_IFS=$IFS
IFS=, # set the Internal Field Separator

# expands the string $message unquoted, so
# that it is split into a list at each $IFS
#
message_fields=( $message )

IFS=$old_IFS
</pre>
<p>It is now possible to use the variable <tt>message_fields</tt> as a list. For example, <tt>${message_fields[0]}</tt> yields the message type <tt>"$GPGGA"</tt>, and <tt>${message_fields[2]}</tt> contains the latitude. (To remove the check-sum, one could do something like <tt>$(echo $message | cut -d'*' -f 1)</tt> to recover the part of the message before <tt>*</tt>.)</p>
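<p>Put together, splitting the sample <tt>$GPGGA</tt> message could look like the sketch below. It is only one way of doing it: the check-sum is dropped with a parameter expansion instead of <tt>cut</tt>, the variable <tt>body</tt> is introduced for the illustration, and <tt>set -f</tt> is added as a precaution so that no field is ever expanded as a file name pattern.</p>

```shell
message='$GPGGA,041050.000,xxxx.xxxx,N,yyyy.yyyy,W,1,06,1.1,3.9,M,,,,0000*1D'

# drop everything from the first '*' on, that is, the check-sum
body=${message%%\**}

old_IFS=$IFS
IFS=,                         # split on commas only
set -f                        # no file name expansion while splitting
message_fields=( $body )
set +f
IFS=$old_IFS

echo "type:       ${message_fields[0]}"
echo "UTC time:   ${message_fields[1]}"
echo "latitude:   ${message_fields[2]} ${message_fields[3]}"
echo "satellites: ${message_fields[7]}"
```

<p>Empty fields between two commas are kept as empty list items, so the field positions stay stable even when the GPS leaves some of them blank.</p>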
<p align="center">*<br />*&emsp;*</p>
<p>Using <tt>$IFS</tt> is not the only way of processing the data. Good ol&#8217; friends <tt>cut</tt>, <tt>tr</tt>, and <tt>sed</tt> help just as much if you&#8217;re not planning to do extensive (pre)processing. Here, I just grab the data from the file/device #4:</p>
<pre class="brush: bash; title: ; notranslate">
# stop cleanly when the stream ends instead of looping forever
while read -r this_line
do

    # if cr/lf bothers you, make it lf only
    # (os-specific concern)
    #
    this_line=$( echo $this_line | sed s/$'\r'//g )

    # get a precise time stamp
    # %N = nanoseconds
    #
    ts=$(date +&quot;%Y/%m/%d %H:%M:%S.%N&quot;)

    echo $ts $this_line &gt;&gt; full-log.txt

    # let us filter the current position
    #
    if [[ &quot;$this_line&quot; =~ &quot;GPRMC&quot; ]]
    then
        # ok, it looks like a GPS reading (may be void)
        # if field 3 is V, the reading is void (or maybe
        # only untrusted?), if it is A, then the position
        # is Active (and therefore given with confidence?)
        #
        if [[ $(echo $this_line | cut -d, -f 4-7) != &quot;,,,&quot; ]]
        then
            # get latitude and longitude
            gps_pos=($(echo $this_line | \
                cut -d, -f 3-7 | \
                tr , ' ' | \
                sed 's/\(^\| \)0*\([0-9]\)/\1\2/g'))
            # show
            echo $ts ${gps_pos[@]}
        fi
    fi
done &lt;&amp;4
</pre>
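<p>The comments in the loop mention the status field (<tt>A</tt> or <tt>V</tt>), while the test itself only looks at whether the position fields are empty. Checking the status explicitly could be sketched as follows; <tt>rmc_status</tt> is a hypothetical helper name, and the sample sentence is the <tt>$GPRMC</tt> line from the output shown in the previous entry.</p>

```shell
# Hypothetical helper: prints the status field of a $GPRMC sentence
# (A = active, V = void) and prints nothing for other sentence types
rmc_status() {
    case $1 in
        '$GPRMC,'*) echo "$1" | cut -d, -f 3 ;;
    esac
}

line='$GPRMC,132822.000,A,4932.1147,N,06723.2320,W,0.00,181.2,230313,,*26'

if [ "$(rmc_status "$line")" = "A" ]
then
    # same field range as in the loop: status, latitude, N/S, longitude, E/W
    echo "$line" | cut -d, -f 3-7 | tr , ' '
fi
```

<p>For the sample line this prints <tt>A 4932.1147 N 06723.2320 W</tt>; a void reading or any other sentence type prints nothing.</p>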
<p>And this yields something like</p>
<pre class="brush: plain; title: ; notranslate">
2013/03/23 00:10:55.381921982 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:55.416345523 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:55.451376341 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:55.486083583 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:55.521025127 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:56.334357097 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:57.323532397 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:58.325286590 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:59.321872511 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:11:00.328458852 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:11:01.328374601 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:11:02.328710428 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:11:03.327115325 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:11:04.328974192 A xxxx.1229 N yyyy.2298 W
</pre>
<p align="center">*<br />*&emsp;*</p>
<p>I am not saying that you should do all your data processing using Bash, merely that for some things it should not be frowned upon; it may still do the job quite nicely.</p>
<p>Further post-processing of the data is application specific, and could be done in just about anything. As long as you capture as much of the available data as possible (the above snippet time-stamps the data and stores it in another file, <tt>full-log.txt</tt>) you should be just fine. And since text isn&#8217;t that heavy (and is highly compressible in this case), you should really feel at ease to grab <i>just everything</i>.</p>
<p>Tools like Gnumeric (or Excel) can load the log file by treating it as a space-or-comma separated file, and you can use that to look at the data. Let&#8217;s do just that.</p>
<p>But in a next entry. <i>To be continued&#8230;</i></p>
<br />Filed under: <a href='http://hbfs.wordpress.com/category/bash-shell/'>Bash (Shell)</a>, <a href='http://hbfs.wordpress.com/category/hacks/'>hacks</a>, <a href='http://hbfs.wordpress.com/category/programming/'>programming</a> Tagged: <a href='http://hbfs.wordpress.com/tag/cut/'>cut</a>, <a href='http://hbfs.wordpress.com/tag/gps/'>GPS</a>, <a href='http://hbfs.wordpress.com/tag/grep/'>grep</a>, <a href='http://hbfs.wordpress.com/tag/nmea/'>NMEA</a>, <a href='http://hbfs.wordpress.com/tag/sed/'>sed</a>, <a href='http://hbfs.wordpress.com/tag/tr/'>tr</a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/hbfs.wordpress.com/4578/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/hbfs.wordpress.com/4578/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hbfs.wordpress.com&#038;blog=4426521&#038;post=4578&#038;subd=hbfs&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://hbfs.wordpress.com/2013/05/07/parsing-gps-data-with-bash/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d3d9050d6870dcfaf7f207cd5ca2b50b?s=96&#38;d=identicon" medium="image">
			<media:title type="html">stevenpigeon</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2011/03/map-detail.png?w=150" medium="image">
			<media:title type="html">map-detail</media:title>
		</media:content>
	</item>
		<item>
		<title>Reading GPS data with Bash</title>
		<link>http://hbfs.wordpress.com/2013/04/30/reading-gps-data-with-bash/</link>
		<comments>http://hbfs.wordpress.com/2013/04/30/reading-gps-data-with-bash/#comments</comments>
		<pubDate>Tue, 30 Apr 2013 15:22:44 +0000</pubDate>
		<dc:creator>Steven Pigeon</dc:creator>
				<category><![CDATA[Bash (Shell)]]></category>
		<category><![CDATA[hacks]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[GPS]]></category>
		<category><![CDATA[serial port]]></category>
		<category><![CDATA[usb gps]]></category>

		<guid isPermaLink="false">http://hbfs.wordpress.com/?p=4572</guid>
		<description><![CDATA[I am presently working on something that requires geolocation. Not knowing much about GPSes and related topics, I decided to get a USB GPS. This week, let&#8217;s have a look at how we can extract information from the USB GPS using Bash. The first step is to locate your USB GPS as a device. If [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hbfs.wordpress.com&#038;blog=4426521&#038;post=4572&#038;subd=hbfs&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>I am presently working on something that requires geolocation. Not knowing much about GPSes and related topics, I decided to get a USB GPS. This week, let&#8217;s have a look at how we can extract information from the USB GPS using Bash.</p>
<p><a href="http://hbfs.files.wordpress.com/2011/03/map-detail.png"><img src="http://hbfs.files.wordpress.com/2011/03/map-detail.png?w=150&#038;h=150" alt="map-detail" width="150" height="150" class="aligncenter size-thumbnail wp-image-3216" /></a></p>
<p>The first step is to locate your USB GPS as a device. If it&#8217;s <a href="http://en.wikipedia.org/wiki/NMEA_0183" target="_blank">NMEA</a> compliant, it should mount automagically as a USB serial port. You would think that <tt>lsusb -v</tt> would show you the device, but it does not always. Sometimes it shows as &#8220;Brand X GPS&#8221;, sometimes it only shows as a generic device, say &#8220;MediaTek Inc.&#8221;, or even as a <i>modem</i>. It will typically show up as <tt>/dev/ttyUSB0</tt> or <tt>/dev/ttyACM0</tt>.</p>
<p><span id="more-4572"></span></p>
<p>Once you&#8217;ve ascertained the device name, you must add your user to the <tt>dialout</tt> group (on Linux, anyway) to access the devices. You do so with</p>
<pre class="brush: bash; title: ; notranslate">
sudo adduser $USER dialout
</pre>
<p>and by logging out and in again (to make the group addition propagate correctly to your environment).</p>
<p>The next step (now that you know the device name and that you&#8217;ve added yourself to group <tt>dialout</tt>) is to configure the baud-rate and other port parameters. To do that, we&#8217;ll use <tt>stty</tt>:</p>
<pre class="brush: bash; title: ; notranslate">
stty \
    -F $gps_device \
    raw \
    38400 \
    cs8 \
    clocal \
    -parenb \
    -crtscts \
    -cstopb
</pre>
<p>This command sets the device pointed to by the variable <tt>$gps_device</tt> to 38400 baud, no handshake, and 8n1 data (8 bits, no parity, one stop bit).</p>
<p>We can now open the device as files for reading and writing:</p>
<pre class="brush: bash; title: ; notranslate">
exec 4&lt;$gps_device # gps read-stream
exec 5&gt;$gps_device # gps write-stream
</pre>
<p>&#8230;creating stream #4 for reading and stream #5 for writing. While one may not think of a GPS as a read/write device, it does accept configuration commands such as the frequency at which you desire the updates and what NMEA information you want. The simplest command is probably the update frequency that lets you specify the interval, in ms, at which the GPS spews out its data. It would look very much like:</p>
<pre class="brush: bash; title: ; notranslate">
# configure GPS options
#
# with the help of : http://www.hhhh.org/wiml/proj/nmeaxor.html
#
# 2000 ms = $PMTK220,2000*1C
# 1500 ms = $PMTK220,1500*1A
# 1000 ms = $PMTK220,1000*1F
#  750 ms = $PMTK220,750*2C
#  500 ms = $PMTK220,500*2B
#  250 ms = $PMTK220,250*29
#  200 ms = $PMTK220,200*2C
#  100 ms = $PMTK220,100*2F

echo \$PMTK220,1000\*1F$'\r' &gt;&amp;5
</pre>
<p>The PMTK commands are reminiscent of the old AT modem commands, but with the addition of a (rather crude) check-sum at the end of each command, separated by <tt>*</tt>. The GPS will answer with a possibly negative acknowledge message to let you know if the command succeeded. The above would result in something like:</p>
<pre class="brush: plain; title: ; notranslate">
$PMTK001,220,3*30
</pre>
<p>Where <tt>001</tt> is the type of the message, here an acknowledge message, <tt>220</tt> is the type of the message it answers to, and the <tt>2</tt> or <tt>3</tt> before the check-sum tells whether the command failed or succeeded.</p>
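<p>The check-sum itself is simple enough to compute in Bash: in NMEA sentences it is the XOR of every character between <tt>$</tt> and <tt>*</tt>, written as two hexadecimal digits. The sketch below is a minimal illustration; the function names <tt>nmea_checksum</tt> and <tt>nmea_sentence</tt> are made up here and are not part of any tool.</p>

```shell
# Hypothetical helper: XOR of all the characters of the sentence body
# (the part between '$' and '*'), printed as two upper-case hex digits
nmea_checksum() {
    local body=$1 sum=0 i=0 code
    while [ "$i" -lt "${#body}" ]
    do
        # an argument starting with a quote makes printf yield the
        # numeric code of the character that follows it
        printf -v code '%d' "'${body:i:1}"
        sum=$(( sum ^ code ))
        i=$(( i + 1 ))
    done
    printf '%02X\n' "$sum"
}

# Hypothetical helper: wrap a body into a complete sentence, cr/lf included
nmea_sentence() {
    printf '$%s*%s\r\n' "$1" "$(nmea_checksum "$1")"
}

nmea_checksum 'PMTK220,1000'     # prints 1F, as in the table of commands
nmea_checksum 'PMTK001,220,3'    # prints 30, as in the acknowledge message
```

<p>With stream #5 open as above, the output of <tt>nmea_sentence 'PMTK220,1000'</tt> could be redirected to it in place of the hand-written <tt>echo</tt>.</p>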
<p>Pumping the data out of the GPS is rather straightforward:</p>
<pre class="brush: bash; title: ; notranslate">
while read -r this_line
do
    # ...do more stuff with $this_line...

done &lt;&amp;4
</pre>
<p>The output should look something like</p>
<pre class="brush: plain; title: ; notranslate">
$GPGSV,3,1,12,15,69,197,26,26,63,068,25,21,46,300,23,05,38,085,23*71
$GPGSV,3,2,12,09,34,119,23,29,26,224,18,51,23,226,,18,21,275,20*74
$GPGSV,3,3,12,08,20,043,16,06,08,329,17,24,06,168,,07,02,021,*79
$GPRMC,132822.000,A,4932.1147,N,06723.2320,W,0.00,181.2,230313,,*26
$GPVTG,181.2,T,,M,0.00,N,0.0,K*5A
</pre>
<p>where the <tt>$GPGSV</tt> data tells you what satellites you are receiving signals from, and the <tt>$GPRMC</tt> is the minimal positioning data the GPS is supposed to give you to be NMEA-compliant.</p>
<p align="center">*<br />*&emsp;*</p>
<p>In the next entry, we&#8217;ll have a look at how to parse this data with Bash.</p>
<p><i><a href="http://hbfs.wordpress.com/2013/05/07/parsing-gps-data-with-bash/" target="_blank">To be continued&#8230;</a></i></p>
<br />Filed under: <a href='http://hbfs.wordpress.com/category/bash-shell/'>Bash (Shell)</a>, <a href='http://hbfs.wordpress.com/category/hacks/'>hacks</a>, <a href='http://hbfs.wordpress.com/category/programming/'>programming</a> Tagged: <a href='http://hbfs.wordpress.com/tag/gps/'>GPS</a>, <a href='http://hbfs.wordpress.com/tag/serial-port/'>serial port</a>, <a href='http://hbfs.wordpress.com/tag/usb-gps/'>usb gps</a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/hbfs.wordpress.com/4572/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/hbfs.wordpress.com/4572/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hbfs.wordpress.com&#038;blog=4426521&#038;post=4572&#038;subd=hbfs&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://hbfs.wordpress.com/2013/04/30/reading-gps-data-with-bash/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d3d9050d6870dcfaf7f207cd5ca2b50b?s=96&#38;d=identicon" medium="image">
			<media:title type="html">stevenpigeon</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2011/03/map-detail.png?w=150" medium="image">
			<media:title type="html">map-detail</media:title>
		</media:content>
	</item>
		<item>
		<title>Breaking Caesar’s Cipher (part III)</title>
		<link>http://hbfs.wordpress.com/2013/04/23/breaking-caesars-cipher-part-iii/</link>
		<comments>http://hbfs.wordpress.com/2013/04/23/breaking-caesars-cipher-part-iii/#comments</comments>
		<pubDate>Tue, 23 Apr 2013 14:39:17 +0000</pubDate>
		<dc:creator>Steven Pigeon</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[Bash (Shell)]]></category>
		<category><![CDATA[Cryptography]]></category>
		<category><![CDATA[Caesar Cipher]]></category>
		<category><![CDATA[Markov chains]]></category>

		<guid isPermaLink="false">http://hbfs.wordpress.com/?p=4540</guid>
		<description><![CDATA[In the last installment of this series, we looked at Markov chains as a means of estimating the likelihood that a given piece of text is actually a message, written in English, rather than mere gibberish. This week, we finally piece everything together to obtain a program to crack Caesar&#8217;s cipher without (much) human [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>In the <a href="http://hbfs.wordpress.com/2013/04/16/breaking-caesars-cipher-caesars-cipher-part-ii/" target="_blank">last installment</a> of this series, we looked at <a href="http://en.wikipedia.org/wiki/Markov_chain" target="_blank">Markov chains</a> as a means of estimating the likelihood that a given piece of text is actually a message, written in English, rather than mere gibberish.</p>
<p>This week, we finally piece everything together to obtain a program to crack Caesar&#8217;s cipher without (much) human intervention.</p>
<p><span id="more-4540"></span></p>
<p>So the Markov chain model gives us a way of assigning a score to a piece of text as a probability:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+P%28sentence%7Es%7Eis%7EEnglish%7Etext%29+%5Cpropto+%5Cprod_%7Bi%3D1%7D%5En+P%28s_i%7Cs_%7Bi-1%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle P(sentence~s~is~English~text) &#92;propto &#92;prod_{i=1}^n P(s_i|s_{i-1})' title='&#92;displaystyle P(sentence~s~is~English~text) &#92;propto &#92;prod_{i=1}^n P(s_i|s_{i-1})' class='latex' />,</p>
<p>according to a model that estimates the likelihood of one symbol following another. We saw how to estimate this probability <a href="http://hbfs.wordpress.com/2013/04/16/breaking-caesars-cipher-caesars-cipher-part-ii/" target="_blank">last week</a>. It basically reduces to scanning a large body of text (the larger the better), counting how often each symbol follows each other symbol, then converting these frequencies into probabilities. Once the scoring engine is ready, we can finally assemble everything and solve the Caesar code automagically.</p>
<p>Cycling through keys and getting a score:</p>
<pre class="brush: bash; title: ; notranslate">
#!/usr/bin/env bash

key=9
message=&quot;attack the gauls at dawn&quot;

for ((k=0;k&lt;25;k++))
do
    cesar $k $(cesar $key &quot;$message&quot;) 
done | compute-score matrix.txt
</pre>
<p>Producing:</p>
<pre class="brush: plain; title: ; notranslate">
product=7.7064e-69  log=-156.836 jccjltcqnpjdubjcmjfw
product=1.10597e-54 log=-124.239 kddkmudroqkevckdnkgx
product=1.08628e-38 log=-87.4155 leelnvesprlfwdleolhy
product=2.74988e-54 log=-123.328 mffmowftqsmgxemfpmiz
product=5.62482e-54 log=-122.612 nggnpxgurtnhyfngqnja
product=1.46081e-50 log=-114.75  ohhoqyhvsuoizgohrokb
product=4.27383e-51 log=-115.979 piiprziwtvpjahpisplc
product=6.22185e-75 log=-170.866 qjjqsajxuwqkbiqjtqmd
product=2.27773e-56 log=-128.122 rkkrtbkyvxrlcjrkurne
product=8.98141e-43 log=-96.816  sllsuclzwysmdkslvsof
product=7.42025e-57 log=-129.243 tmmtvdmaxztneltmwtpg
product=6.0041e-45  log=-101.824 unnuwenbyauofmunxuqh
product=7.90349e-53 log=-119.97  voovxfoczbvpgnvoyvri
product=5.10408e-62 log=-141.13  wppwygpdacwqhowpzwsj
product=3.1834e-69  log=-157.72  xqqxzhqebdxripxqaxtk
product=5.26779e-49 log=-111.165 yrryairfceysjqyrbyul
product=3.82049e-61 log=-139.117 zsszbjsgdfztkrzsczvm
product=4.59541e-32 log=-72.1577 attackthegaulsatdawn
product=3.90715e-53 log=-120.674 buubdluifhbvmtbuebxo
product=6.96003e-65 log=-147.728 cvvcemvjgicwnucvfcyp
product=2.09647e-66 log=-151.23  dwwdfnwkhjdxovdwgdzq
product=1.66271e-34 log=-77.7794 exxegoxlikeypwexhear
product=7.40958e-65 log=-147.665 fyyfhpymjlfzqxfyifbs
product=2.26381e-56 log=-128.128 gzzgiqznkmgarygzjgct
product=1.10932e-50 log=-115.026 haahjraolnhbszhakhdu
</pre>
<p>The first column shows the score yielded by the product formula, the second the log-sum, and the third the corresponding tentative decryption. In this form, we must scan for the maximum probability (or maximum log-sum) to discover the best candidate. The correctly decrypted text appears at <img src='http://s0.wp.com/latex.php?latex=k%3D17&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=17' title='k=17' class='latex' /> (and indeed <img src='http://s0.wp.com/latex.php?latex=17%2B9+%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='17+9 &#92;equiv 0' title='17+9 &#92;equiv 0' class='latex' /> (mod 26), so it&#8217;s the correct inverse of the key, not that the key in itself has any value). It seems to be working correctly.</p>
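<p>To make the key arithmetic concrete, here is a minimal C++ sketch of the shift and of its inverse. The names <tt>caesar_shift</tt> and <tt>inverse_key</tt> are hypothetical (they are not the <tt>cesar</tt> command used in the script), and the sketch assumes the message uses lowercase letters only, leaving every other character untouched.</p>

```cpp
#include <cassert>
#include <string>

// Shift every lowercase letter of `text` by `key` places (mod 26);
// any other character is copied unchanged.
std::string caesar_shift(const std::string & text, int key)
 {
  const int shift=((key%26)+26)%26; // normalize, also for negative keys
  assert(shift>=0 && shift<26);

  std::string out=text;
  for (char & c : out)
   if (c>='a' && c<='z')
    c=static_cast<char>('a'+(c-'a'+shift)%26);
  return out;
 }

// The key that undoes `key`: key+inverse_key(key) is 0 (mod 26).
int inverse_key(int key)
 {
  return (26-((key%26)+26)%26)%26;
 }
```

<p>With these definitions, <tt>inverse_key(9)</tt> is 17, and shifting the encrypted message by 17 gives back the original text, which agrees with the listing above.</p>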
<p>It&#8217;s easier if we sort it out:</p>
<pre class="brush: bash; title: ; notranslate">
#!/usr/bin/env bash

key=9
message=&quot;attack the gauls at dawn&quot;

for ((k=0;k&lt;25;k++))
do
    cesar $k $(cesar $key &quot;$message&quot;) 
done | compute-score matrix.txt \
     | sort --numeric-sort \
            --reverse \
            --field-separator '=' \
            --key 3
</pre>
<p>and now:</p>
<pre class="brush: plain; title: ; notranslate">
product=4.59541e-32 log=-72.1577 attackthegaulsatdawn
product=1.66271e-34 log=-77.7794 exxegoxlikeypwexhear
product=1.08628e-38 log=-87.4155 leelnvesprlfwdleolhy
product=8.98141e-43 log=-96.816  sllsuclzwysmdkslvsof
product=6.0041e-45  log=-101.824 unnuwenbyauofmunxuqh
product=5.26779e-49 log=-111.165 yrryairfceysjqyrbyul
product=1.46081e-50 log=-114.75  ohhoqyhvsuoizgohrokb
product=1.10932e-50 log=-115.026 haahjraolnhbszhakhdu
product=4.27383e-51 log=-115.979 piiprziwtvpjahpisplc
product=7.90349e-53 log=-119.97  voovxfoczbvpgnvoyvri
product=3.90715e-53 log=-120.674 buubdluifhbvmtbuebxo
product=5.62482e-54 log=-122.612 nggnpxgurtnhyfngqnja
product=2.74988e-54 log=-123.328 mffmowftqsmgxemfpmiz
product=1.10597e-54 log=-124.239 kddkmudroqkevckdnkgx
product=2.27773e-56 log=-128.122 rkkrtbkyvxrlcjrkurne
product=2.26381e-56 log=-128.128 gzzgiqznkmgarygzjgct
product=7.42025e-57 log=-129.243 tmmtvdmaxztneltmwtpg
product=3.82049e-61 log=-139.117 zsszbjsgdfztkrzsczvm
product=5.10408e-62 log=-141.13  wppwygpdacwqhowpzwsj
product=7.40958e-65 log=-147.665 fyyfhpymjlfzqxfyifbs
product=6.96003e-65 log=-147.728 cvvcemvjgicwnucvfcyp
product=2.09647e-66 log=-151.23  dwwdfnwkhjdxovdwgdzq
product=7.7064e-69  log=-156.836 jccjltcqnpjdubjcmjfw
product=3.1834e-69  log=-157.72  xqqxzhqebdxripxqaxtk
product=6.22185e-75 log=-170.866 qjjqsajxuwqkbiqjtqmd
</pre>
<p>As we go down the sorted list, the candidates become exponentially less likely, at least as estimated by the language model. This means that we only have to look at the first few to ascertain that the decoding has succeeded.</p>
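<p>If only the top candidate is wanted, sorting is not strictly needed: a single pass that keeps the largest log-score is enough. The sketch below assumes the log-scores have already been collected in a vector, in the order the keys were tried (key 0 first); the name <tt>best_key</tt> is hypothetical.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Index of the largest log-score; with keys tried in order 0,1,2,...
// that index is also the decryption key of the best candidate.
std::size_t best_key(const std::vector<double> & log_scores)
 {
  assert(!log_scores.empty());
  return static_cast<std::size_t>(
   std::distance(log_scores.begin(),
                 std::max_element(log_scores.begin(),log_scores.end())));
 }
```

<p>A human still has to confirm that the winner reads as English, so keeping the few best rather than only the best remains the safer choice.</p>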
<p>And this means we can use the language model (despite its needing refinement, no doubt) for other ciphers/algorithms. And maybe we should.</p>
<p><i>To be continued&#8230;</i></p>
<br />Filed under: <a href='http://hbfs.wordpress.com/category/algorithms/'>algorithms</a>, <a href='http://hbfs.wordpress.com/category/bash-shell/'>Bash (Shell)</a>, <a href='http://hbfs.wordpress.com/category/cryptography/'>Cryptography</a> Tagged: <a href='http://hbfs.wordpress.com/tag/caesar-cipher/'>Caesar Cipher</a>, <a href='http://hbfs.wordpress.com/tag/markov-chains/'>Markov chains</a>]]></content:encoded>
			<wfw:commentRss>http://hbfs.wordpress.com/2013/04/23/breaking-caesars-cipher-part-iii/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d3d9050d6870dcfaf7f207cd5ca2b50b?s=96&#38;d=identicon" medium="image">
			<media:title type="html">stevenpigeon</media:title>
		</media:content>
	</item>
		<item>
		<title>Suggested Reading: Lost Cat</title>
		<link>http://hbfs.wordpress.com/2013/04/21/suggested-reading-lost-cat/</link>
		<comments>http://hbfs.wordpress.com/2013/04/21/suggested-reading-lost-cat/#comments</comments>
		<pubDate>Sun, 21 Apr 2013 05:28:55 +0000</pubDate>
		<dc:creator>Steven Pigeon</dc:creator>
				<category><![CDATA[Suggested Reading]]></category>
		<category><![CDATA[catcam]]></category>
		<category><![CDATA[Cats]]></category>
		<category><![CDATA[GPS]]></category>

		<guid isPermaLink="false">http://hbfs.wordpress.com/?p=4676</guid>
		<description><![CDATA[Caroline Paul, Wendy MacNaughton — Lost Cat: A True Story of Love, Desperation, and GPS Technology — Bloomsbury, 2013, 176 pp. ISBN 1608199770 This is the story of Tibia the cat, who suddenly vanishes for five weeks before coming back. In full health, shiny coat, happy. Then vanishes again. Quickly, Caroline Paul, cat lover, couch-ridden following [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>Caroline Paul, Wendy MacNaughton — <a href="http://www.amazon.com/gp/product/1608199770?ie=UTF8&amp;camp=1789&amp;creativeASIN=1608199770&amp;linkCode=xm2&amp;tag=hardbettfasts-20" target="_blank"><i>Lost Cat: A True Story of Love, Desperation, and GPS Technology</i></a> — Bloomsbury, 2013, 176 pp. ISBN 1608199770</p>
<div id="attachment_4678" class="wp-caption aligncenter" style="width: 150px"><a href="http://www.amazon.com/gp/product/1608199770?ie=UTF8&amp;camp=1789&amp;creativeASIN=1608199770&amp;linkCode=xm2&amp;tag=hardbettfasts-20"><img class="size-full wp-image-4678" alt="(Buy at Amazon.com)" src="http://hbfs.files.wordpress.com/2013/04/lost-cat.jpg?w=450"   /></a><p class="wp-caption-text">(Buy at Amazon.com)</p></div>
<p>This is the story of Tibia the cat, who suddenly vanishes for five weeks before coming back. In full health, shiny coat, happy. Then vanishes again. Quickly, Caroline Paul, cat lover, couch-ridden following a plane crash, worries about Tibby, wondering where he was and why he didn&#8217;t come back sooner. Then begins a detective story: will a GPS and a camera attached to the cat reveal its whereabouts?</p>
<p><span id="more-4676"></span></p>
<p>The book isn&#8217;t about a somewhat banal event, a cat that does just what it wants, but about the anguish, betrayal, and desperation felt by a true cat lover. The story (which involves mystery, death, and a few crackpots) is told in a lively and quirky style, with Caroline always on the verge of breaking down and at times hilarious (in a sad way, I suppose), all of which makes the book very hard to put down (I went through it in a little more than an hour!).</p>
<p>(I must say that originally, it was the GPS part that drew my attention, as I have forthcoming projects involving GPSes, not especially the cat part. But it turns out that, despite the title, the book has little to do with technology, and everything to do with the love of pets.)</p>
<p>Read more on the <a href="http://lostcatbook.com/" target="_blank">book&#8217;s website</a>.</p>
<br />Filed under: <a href='http://hbfs.wordpress.com/category/suggested-reading/'>Suggested Reading</a> Tagged: <a href='http://hbfs.wordpress.com/tag/catcam/'>catcam</a>, <a href='http://hbfs.wordpress.com/tag/cats/'>Cats</a>, <a href='http://hbfs.wordpress.com/tag/gps/'>GPS</a>]]></content:encoded>
			<wfw:commentRss>http://hbfs.wordpress.com/2013/04/21/suggested-reading-lost-cat/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d3d9050d6870dcfaf7f207cd5ca2b50b?s=96&#38;d=identicon" medium="image">
			<media:title type="html">stevenpigeon</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2013/04/lost-cat.jpg" medium="image">
			<media:title type="html">(Buy at Amazon.com)</media:title>
		</media:content>
	</item>
		<item>
		<title>Breaking Caesar&#8217;s Cipher (Caesar&#8217;s Cipher, part II)</title>
		<link>http://hbfs.wordpress.com/2013/04/16/breaking-caesars-cipher-caesars-cipher-part-ii/</link>
		<comments>http://hbfs.wordpress.com/2013/04/16/breaking-caesars-cipher-caesars-cipher-part-ii/#comments</comments>
		<pubDate>Tue, 16 Apr 2013 20:04:58 +0000</pubDate>
		<dc:creator>Steven Pigeon</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[Cryptography]]></category>
		<category><![CDATA[data structures]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Caesar Cipher]]></category>
		<category><![CDATA[Markov chains]]></category>
		<category><![CDATA[Probability]]></category>
		<category><![CDATA[Transition Matrix]]></category>

		<guid isPermaLink="false">http://hbfs.wordpress.com/?p=4524</guid>
		<description><![CDATA[In the last installment of this series, we had a look at Caesar&#8217;s cipher, an absurdly simple encryption technique where the symmetric encryption only consists in shifting symbols k places. While it&#8217;s ridiculously easy to break the cipher, even with pen-and-paper techniques, we ended up, last time, surmising that we should be able to crack the [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>In the <a href="http://hbfs.wordpress.com/2013/04/02/caesars-cipher/" target="_blank">last installment</a> of this series, we had a look at <a href="http://en.wikipedia.org/wiki/Caesar_cipher" target="_blank">Caesar&#8217;s cipher</a>, an absurdly simple encryption technique where the symmetric encryption only consists in shifting symbols <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> places.</p>
<p><a href="http://hbfs.files.wordpress.com/2011/03/markov-chains.jpg"><img src="http://hbfs.files.wordpress.com/2011/03/markov-chains.jpg?w=220&#038;h=300" alt="markov-chains" width="220" height="300" class="aligncenter size-thumbnail wp-image-3291" /></a></p>
<p>While it&#8217;s ridiculously easy to break the cipher, even with pen-and-paper techniques, we ended up, last time, surmising that we should be able to crack the cipher automatically, without human intervention, if only we had a reasonable language model. This week, let us have a look at how we could build a very simple language model that does just that.</p>
<p><span id="more-4524"></span></p>
<p>There are a lot of ways of building language models. In our case, we&#8217;re only interested in having something that gives a higher score to text that looks like reasonable text and very low scores to text that looks like gibberish. One way to do so is to use a somewhat classical framework where we estimate the probability of observing a given text fragment, given that it&#8217;s supposed to be, say, English.</p>
<p>That is, we want an estimate of</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=P%28sentence%7CEnglish%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(sentence|English)' title='P(sentence|English)' class='latex' />.</p>
<p>While we cannot really estimate the above directly, we can use, as a proxy, the following:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+P%28sentence%7CEnglish%29+%5Cpropto+P%28+%5C%7Bs_i%5C%7D_%7Bi%3D1%7D%5En+%7C++English%29+%3D+%5Cprod_%7Bi%3D1%7D%5En+P%28s_i%7Cs_%7Bi-1%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle P(sentence|English) &#92;propto P( &#92;{s_i&#92;}_{i=1}^n |  English) = &#92;prod_{i=1}^n P(s_i|s_{i-1})' title='&#92;displaystyle P(sentence|English) &#92;propto P( &#92;{s_i&#92;}_{i=1}^n |  English) = &#92;prod_{i=1}^n P(s_i|s_{i-1})' class='latex' /></p>
<p>with <img src='http://s0.wp.com/latex.php?latex=s_0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='s_0' title='s_0' class='latex' /> having some default value, maybe &#8220;space.&#8221; That is, the model we will use is a <i>first order <a href="http://en.wikipedia.org/wiki/Markov_chain" target="_blank">Markov chain</a></i>, where the probability of the entire sentence is estimated as the product of the conditional probabilities <img src='http://s0.wp.com/latex.php?latex=P%28s_i%7Cs_%7Bi-1%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(s_i|s_{i-1})' title='P(s_i|s_{i-1})' class='latex' />, the probability that <img src='http://s0.wp.com/latex.php?latex=s_i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='s_i' title='s_i' class='latex' /> is observed after <img src='http://s0.wp.com/latex.php?latex=s_%7Bi-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='s_{i-1}' title='s_{i-1}' class='latex' />; the conditional probability that a given letter is observed after another. This leaves us with the much simpler task of estimating only <img src='http://s0.wp.com/latex.php?latex=P%28s_i%7Cs_%7Bi-1%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(s_i|s_{i-1})' title='P(s_i|s_{i-1})' class='latex' />.</p>
<p align="center">*<br />*&emsp;*</p>
<p>To estimate <img src='http://s0.wp.com/latex.php?latex=P%28s_i%7Cs_%7Bi-1%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(s_i|s_{i-1})' title='P(s_i|s_{i-1})' class='latex' />, we will simply construct a <img src='http://s0.wp.com/latex.php?latex=n%5Ctimes%7Bn%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;times{n}' title='n&#92;times{n}' class='latex' /> matrix <img src='http://s0.wp.com/latex.php?latex=M&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M' title='M' class='latex' />, where <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is the number of distinct symbols in the language&mdash;let us say <img src='http://s0.wp.com/latex.php?latex=n%3D256&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=256' title='n=256' class='latex' />, and treat text as composed of bytes. Initially, all entries of <img src='http://s0.wp.com/latex.php?latex=M&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M' title='M' class='latex' /> are set to 1 (not zero, in order to deal with rare, but not impossible, observations, as we will make clearer in a moment). Then we scan a <em>large</em> amount of text, reading characters one by one. The pseudo-code looks something like</p>
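<p>
<img src='http://s0.wp.com/latex.php?latex=l+%5Cleftarrow+&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='l &#92;leftarrow ' title='l &#92;leftarrow ' class='latex' />&#8220;&nbsp;&#8221;<br />
while read s:<br />
&nbsp;&nbsp;&nbsp;&nbsp;<img src='http://s0.wp.com/latex.php?latex=M_%7Bl%2Cs%7D+%5Cleftarrow+M_%7Bl%2Cs%7D%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M_{l,s} &#92;leftarrow M_{l,s}+1' title='M_{l,s} &#92;leftarrow M_{l,s}+1' class='latex' /><br />
&nbsp;&nbsp;&nbsp;&nbsp;<img src='http://s0.wp.com/latex.php?latex=l+%5Cleftarrow+s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='l &#92;leftarrow s' title='l &#92;leftarrow s' class='latex' />
</p>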
<p>We then normalize the matrix <img src='http://s0.wp.com/latex.php?latex=M&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M' title='M' class='latex' /> row-wise. Why row-wise? Because we transform each row <img src='http://s0.wp.com/latex.php?latex=i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='i' title='i' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=M&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M' title='M' class='latex' /> into a probability distribution <img src='http://s0.wp.com/latex.php?latex=P%28successor%7Csymbol_i%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(successor|symbol_i)' title='P(successor|symbol_i)' class='latex' />. That is, we want:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Csum_%7Bs%5Cin%7B%7Dsuccessors%7DP%28s%7Csymbol_i%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle &#92;sum_{s&#92;in{}successors}P(s|symbol_i)=1' title='&#92;displaystyle &#92;sum_{s&#92;in{}successors}P(s|symbol_i)=1' class='latex' />,</p>
<p>where we account for all possible successors, so that the probabilities sum to exactly one. Which brings us back to the initial value of <img src='http://s0.wp.com/latex.php?latex=1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1' title='1' class='latex' /> for the entries of the un-normalized <img src='http://s0.wp.com/latex.php?latex=M&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M' title='M' class='latex' />. Had we set <img src='http://s0.wp.com/latex.php?latex=M&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M' title='M' class='latex' /> to zero initially, the entries still at zero after we fill it (by scanning the text) would correspond not to impossible combinations, but only to <em>combinations that have not been observed</em> while filling the matrix. So it is quite possible later on, when estimating the score of a new piece of text, to encounter combinations not seen while &#8220;learning&#8221; the matrix. This would yield a very, very bad score for such a text. Well, a score of zero. We just want it to be unlikely, so we must settle for a small but non-zero probability to prevent the product from being zero whenever we observe a combination not seen during the learning phase.</p>
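<p>As a small illustration of this add-one trick on a single row, here is a hedged C++ sketch. The helper <tt>smoothed_row</tt> is a hypothetical name; it takes the raw counts observed for one symbol and adds 1 to each before normalizing, which has the same effect as initializing the matrix to 1.</p>

```cpp
#include <cassert>
#include <vector>

// Turn one row of observed successor counts into probabilities,
// adding 1 to every count first (same effect as starting M at 1).
std::vector<double> smoothed_row(const std::vector<int> & observed)
 {
  assert(!observed.empty());

  double norm=0;
  for (int c : observed) norm+=c+1;

  std::vector<double> row;
  row.reserve(observed.size());
  for (int c : observed) row.push_back((c+1)/norm);
  return row;
 }
```

<p>For observed counts 3, 0 and 1, the row becomes 4/7, 1/7 and 2/7: the successor that was never seen gets a small probability rather than zero, and the row still sums to one.</p>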
<p align="center">*<br />*&emsp;*</p>
<p>Let us now have a look at the implementation of the estimation of <img src='http://s0.wp.com/latex.php?latex=M&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M' title='M' class='latex' />.</p>
<pre class="brush: cpp; title: ; notranslate">
// compile with --std=c++11

#include &lt;iostream&gt;
#include &lt;vector&gt;
#include &lt;algorithm&gt;

#include &lt;configuration&gt;

int main()
 {
  std::vector&lt;std::vector&lt;int&gt;&gt; 
   matrix(charset_size, std::vector&lt;int&gt;(charset_size, 1));

  size_t nb_read;
  int last=0; // previous symbol, kept across pages so that no transition is lost
  do
   {
    unsigned char page[page_size];
    std::cin.read((char*)page,page_size);
    nb_read=std::cin.gcount();

    for (size_t i=0;i&lt;nb_read;i++)
     {
      unsigned char next=page[i];
      if ( (next&gt;=charset_min) &amp;&amp;
           (next&lt;=charset_max) )
       {
        next-=charset_min;
        matrix[last][next]++;
        last=next;
       }
      else
       ; // skip, out of charset we're
      // interested in
     }
   } while (nb_read==page_size);

  for (int i=0;i&lt;charset_size;i++)
   {
    size_t norm=0;
    for (int j=0;j&lt;charset_size;j++) norm+=matrix[i][j];

    for (int j=0;j&lt;charset_size-1;j++)
     std::cout &lt;&lt; matrix[i][j]/(double)norm &lt;&lt; ' ';
    std::cout &lt;&lt; matrix[i][charset_size-1]/(double)norm &lt;&lt; std::endl;
   }

  return 0;
 }
</pre>
<p>where the <tt>configuration</tt> file contains</p>
<pre class="brush: cpp; title: ; notranslate">
#ifndef __MODULE_CONFIGURATION__
#define __MODULE_CONFIGURATION__

const size_t page_size=1000000; // sure, why not?
const size_t charset_min=0;
const size_t charset_max=255;
const size_t charset_size=charset_max-charset_min+1;

#endif
 // __MODULE_CONFIGURATION__
</pre>
<p>&#8230;various details on how you want to deal with files and charsets. You can then invoke</p>
<pre class="brush: bash; title: ; notranslate">
bzcat blob.txt.bz2 | make-matrix &gt; matrix.txt
</pre>
<p>where <tt>blob.txt.bz2</tt> is a large text file assembled from files from <a href="http://www.gutenberg.org/" target="_blank">Project Gutenberg</a>.</p>
<p>It&#8217;s interesting to look at the matrix <img src='http://s0.wp.com/latex.php?latex=M&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M' title='M' class='latex' /> graphically. If we look at it as a histogram, we get something like</p>
<p><a href="http://hbfs.files.wordpress.com/2013/03/matrix_m_3d.png"><img src="http://hbfs.files.wordpress.com/2013/03/matrix_m_3d.png?w=300&#038;h=259" alt="matrix_m_3d" width="300" height="259" class="aligncenter size-medium wp-image-4532" /></a></p>
<p>If we look at it as a (log) density plot, we get something like</p>
<p><a href="http://hbfs.files.wordpress.com/2013/03/matrix_m_countours.png"><img src="http://hbfs.files.wordpress.com/2013/03/matrix_m_countours.png?w=300&#038;h=259" alt="matrix_m_countours" width="300" height="259" class="aligncenter size-medium wp-image-4534" /></a></p>
<p align="center">*<br />*&emsp;*</p>
<p>Evaluating scores, once the matrix <img src='http://s0.wp.com/latex.php?latex=M&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M' title='M' class='latex' /> is computed (and normalized), is rather straightforward. The implementation of</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cprod_%7Bi%3D1%7D%5En+P%28s_i%7Cs_%7Bi-1%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle &#92;prod_{i=1}^n P(s_i|s_{i-1})' title='&#92;displaystyle &#92;prod_{i=1}^n P(s_i|s_{i-1})' class='latex' /></p>
<p>translates into</p>
<pre class="brush: cpp; title: ; notranslate">
double product_score( std::string text,
                      const std::vector&lt;std::vector&lt;double&gt;&gt; &amp; matrix)
 {
  double product=1.0;
  int last=0;
  for (int i=0;i&lt;text.size();i++)
   {
    unsigned char next=text[i];
    if ( (next&gt;=charset_min) &amp;&amp;
         (next&lt;=charset_max) )
     {
      next-=charset_min;
      product*=matrix[last][next];
      last=next;
     }
   }

  return product;
 }
</pre>
<p>However, if the string is rather long, the product of so many small probabilities underflows, and the above implementation will just give <i>zero</i>. To avoid the problem, we observe that</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Clog+%5Cleft%28%5Cprod_%7Bi%3D1%7D%5En+P%28s_i%7Cs_%7Bi-1%7D%29%5Cright%29+%3D+%5Csum_%7Bi%3D1%7D%5En+%5Clog+P%28s_i%7Cs_%7Bi-1%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle &#92;log &#92;left(&#92;prod_{i=1}^n P(s_i|s_{i-1})&#92;right) = &#92;sum_{i=1}^n &#92;log P(s_i|s_{i-1})' title='&#92;displaystyle &#92;log &#92;left(&#92;prod_{i=1}^n P(s_i|s_{i-1})&#92;right) = &#92;sum_{i=1}^n &#92;log P(s_i|s_{i-1})' class='latex' />,</p>
<p> an expression that should be more stable, numerically. We get:</p>
<pre class="brush: cpp; title: ; notranslate">
double log_score( std::string text,
                  const std::vector&lt;std::vector&lt;double&gt;&gt; &amp; matrix)
 {
  double sum_log=0;
  
  int last=0;
  for (int i=0;i&lt;text.size();i++)
   {
    unsigned char next=text[i];
    if ( (next&gt;=charset_min) &amp;&amp;
         (next&lt;=charset_max) )
     {
      next-=charset_min;
      sum_log+=std::log(matrix[last][next]);
      last=next;
     }
   }
  return sum_log;
 }
</pre>
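<p>To see the underflow on its own, here is a small C++ sketch that multiplies the same probability many times, next to the equivalent sum of logarithms. The function names are hypothetical, and using one fixed probability instead of the entries of the matrix is a simplification made only for the demonstration.</p>

```cpp
#include <cassert>
#include <cmath>

// Multiply the same probability p together n times.
double repeated_product(double p, int n)
 {
  assert(n>=0);
  double product=1.0;
  for (int i=0;i<n;i++) product*=p;
  return product;
 }

// Add log(p) n times: the logarithm of the product above.
double repeated_log_sum(double p, int n)
 {
  assert(n>=0);
  double sum_log=0.0;
  for (int i=0;i<n;i++) sum_log+=std::log(p);
  return sum_log;
 }
```

<p>With a probability of 0.001 and 2000 symbols, the true product is far smaller than the smallest positive <tt>double</tt>, so <tt>repeated_product</tt> returns exactly zero, while <tt>repeated_log_sum</tt> returns about -13815.5, which can still be compared with other scores.</p>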
<p>And we&#8217;re ready to assemble everything to form an estimator of how likely the current piece of text is to be actual, readable English. Let us now apply this to the automatic breaking of Caesar&#8217;s cipher&#8230; next week.</p>
<p><i><a href="http://hbfs.wordpress.com/2013/04/23/breaking-caesars-cipher-part-iii/">To be continued</a>&#8230;</i></p>
<br />Filed under: <a href='http://hbfs.wordpress.com/category/algorithms/'>algorithms</a>, <a href='http://hbfs.wordpress.com/category/cryptography/'>Cryptography</a>, <a href='http://hbfs.wordpress.com/category/data-structures/'>data structures</a>, <a href='http://hbfs.wordpress.com/category/machine-learning/'>machine learning</a> Tagged: <a href='http://hbfs.wordpress.com/tag/caesar-cipher/'>Caesar Cipher</a>, <a href='http://hbfs.wordpress.com/tag/markov-chains/'>Markov chains</a>, <a href='http://hbfs.wordpress.com/tag/probability/'>Probability</a>, <a href='http://hbfs.wordpress.com/tag/transition-matrix/'>Transition Matrix</a>]]></content:encoded>
			<wfw:commentRss>http://hbfs.wordpress.com/2013/04/16/breaking-caesars-cipher-caesars-cipher-part-ii/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d3d9050d6870dcfaf7f207cd5ca2b50b?s=96&#38;d=identicon" medium="image">
			<media:title type="html">stevenpigeon</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2011/03/markov-chains.jpg?w=220" medium="image">
			<media:title type="html">markov-chains</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2013/03/matrix_m_3d.png?w=300" medium="image">
			<media:title type="html">matrix_m_3d</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2013/03/matrix_m_countours.png?w=300" medium="image">
			<media:title type="html">matrix_m_countours</media:title>
		</media:content>
	</item>
		<item>
		<title>Building a Tree from a List in Linear Time (II)</title>
		<link>http://hbfs.wordpress.com/2013/04/09/building-a-tree-from-a-list-in-linear-time-ii/</link>
		<comments>http://hbfs.wordpress.com/2013/04/09/building-a-tree-from-a-list-in-linear-time-ii/#comments</comments>
		<pubDate>Tue, 09 Apr 2013 15:44:28 +0000</pubDate>
		<dc:creator>Steven Pigeon</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[C-plus-plus]]></category>
		<category><![CDATA[data structures]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[balanced tree]]></category>
		<category><![CDATA[integer decomposition]]></category>
		<category><![CDATA[Tree]]></category>

		<guid isPermaLink="false">http://hbfs.wordpress.com/?p=4649</guid>
		<description><![CDATA[Quite a while ago, I proposed a linear time algorithm to construct trees from sorted lists. The algorithm relied on the segregation of data and internal nodes. This meant that for a list of n data items, 2n-1 nodes were allocated (but only n contained data; the n-1 others just contained pointers). While segregating structure and data makes sense [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>Quite a while ago, I proposed a <a href="http://hbfs.wordpress.com/2012/01/03/building-a-balanced-tree-from-a-list-in-linear-time/" target="_blank">linear time algorithm</a> to construct trees from sorted lists. The algorithm relied on the segregation of data and internal nodes. This meant that for a list of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> data items, <img src='http://s0.wp.com/latex.php?latex=2n-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2n-1' title='2n-1' class='latex' /> nodes were allocated (but only <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> contained data; the <img src='http://s0.wp.com/latex.php?latex=n-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n-1' title='n-1' class='latex' /> others just contained pointers).</p>
<p><a href="http://commons.wikimedia.org/wiki/File:Taxus_wood.jpg"><img src="http://hbfs.files.wordpress.com/2013/04/wood.jpg?w=145&#038;h=150" alt="wood" width="145" height="150" class="aligncenter size-thumbnail wp-image-4652" /></a></p>
<p>While segregating structure and data makes sense in some cases (say, the index resides in memory but the leaves/data reside on disk), I found the solution somewhat unsatisfactory (but not <a href="https://www.youtube.com/watch?v=07So_lJQyqw" target="_blank">unacceptable</a>). So I gave the problem a little more thinking and I arrived at an algorithm that produces a tree with optimal average depth, with data in every node, in linear time and using at most <img src='http://s0.wp.com/latex.php?latex=%5CTheta%28%5Clg+n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;Theta(&#92;lg n)' title='&#92;Theta(&#92;lg n)' class='latex' /> extra memory.</p>
<p><span id="more-4649"></span></p>
<p>The hard part to figure out with this algorithm is that you must create the tree starting at the root, do an in-order scan of the generated tree, but still scan the list from left to right. Turns out, however, that it&#8217;s not as complicated as it sounds.</p>
<p>The root (if the list is sorted) will contain the list element at index <img src='http://s0.wp.com/latex.php?latex=%5Clfloor+n%2F2+%5Crfloor&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lfloor n/2 &#92;rfloor' title='&#92;lfloor n/2 &#92;rfloor' class='latex' /> (C++: <tt>n/2</tt>, with indexing starting at zero), thus creating two sub-lists: one with elements from 0 to <tt>n/2-1</tt>, and the other with elements <tt>n/2+1</tt> to <tt>n-1</tt>. The root of the left sub-list will be at its middle: again rounding, excluding the root, and creating two sub-lists. We split each list recursively until we have a degenerate list of one item, where we stop.</p>
<p>The best way of convincing oneself that the algorithm works properly is to implement it. Here I will abstract the copying from the list (or array-list, which would be more convenient). The minimal code would look like this:</p>
<pre class="brush: cpp; title: ; notranslate">
////////////////////////////////////////
typedef int (*pivot_function_t)(int,int);

////////////////////////////////////////
class tree_node
 {
  public:
    int x;
    tree_node *left, *right;

  tree_node( tree_node *_left, int _x, tree_node *_right)
   : x(_x),
     left(_left),
     right(_right) {}
};

////////////////////////////////////////
template &lt;pivot_function_t pivot&gt;
tree_node * make_tree(int l, int h, int &amp; src)
 {
  if (l==h)
   // degenerate list, a leaf!
   return new tree_node(nullptr,l,nullptr);
  else
   {
    int p=pivot(l,h);

    return
     new tree_node( (p-1&gt;=l) ? make_tree&lt;pivot&gt;(l,p-1,src) : nullptr,
                    p, // here we would copy from list or array[src++]
                    (p+1&lt;=h) ? make_tree&lt;pivot&gt;(p+1,h,src) : nullptr);
   }
 }
</pre>
<p>where the pivot function returns an integer that tells where to split the list. Possible functions:</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cstdlib&gt; // std::rand

//////////////////////////////
int pivot_round(int l, int h) { return (l+h+1)/2; }

//////////////////////////////
int pivot_random(int l, int h) { return l+(std::rand() % (h-l+1)); }

//////////////////////////////
int pivot_fibonacci(int l, int h)
 {
  static const int fibo[]=
   { 1,1,2,3,5,8,13,21,34,55,89,144,
     233,377,610,987,1597,2584,4181,
     6765,10946,17711,28657,46368,
     75025,121393,196418,317811,};

  int d=h-l;
  int i=1; // start at 1: fibo[i-1] must stay in bounds when d==1
  while (fibo[i]&lt;d) i++;

  return l+fibo[i-1];
 }
</pre>
<p>The <tt>pivot_round</tt> computes what we want. The others are given for comparison, with <tt>pivot_random</tt> being maybe indicative of a tree constructed from random insertions over time&mdash;so it might not be as stupid as it seems at first glance. The <tt>pivot_fibonacci</tt> is rather fanciful here, but, eh, why not.</p>
<p>To benchmark the quality of the tree, we will use the average node depth. First, we must find the <em>optimal</em> average depth for a tree with <img src='http://s0.wp.com/latex.php?latex=n+%5Csim+2%5Em%2Bk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n &#92;sim 2^m+k' title='n &#92;sim 2^m+k' class='latex' /> nodes. With a bit of algebra, we find that the average depth <img src='http://s0.wp.com/latex.php?latex=%5Cbar%7Bd%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;bar{d}' title='&#92;bar{d}' class='latex' /> is given by:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cbar%7Bd%7D%3D%5Cfrac%7B2%5Em%28m-1%29%2B1%2B%28k%2B1%29%28m%2B1%29%7D%7B2%5Em%2Bk%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle &#92;bar{d}=&#92;frac{2^m(m-1)+1+(k+1)(m+1)}{2^m+k}' title='&#92;displaystyle &#92;bar{d}=&#92;frac{2^m(m-1)+1+(k+1)(m+1)}{2^m+k}' class='latex' />.</p>
<p>(Don&#8217;t worry, I have an upcoming post that explains how to derive this expression.) With a formula for <img src='http://s0.wp.com/latex.php?latex=%5Cbar%7Bd%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;bar{d}' title='&#92;bar{d}' class='latex' />, we can launch experiments creating trees with an arbitrary number of nodes and compare performance in terms of average depth vs the optimal average depth.</p>
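<p>As a sanity check on the formula, here is a minimal sketch (not the code used for the experiments) that sums the node depths produced by the center split without building the tree, and compares the result with the numerator of the expression above. The names <tt>depth_sum</tt> and <tt>optimal_depth_sum</tt> are illustrative assumptions, as is the reading that the root sits at depth 1 and that n = 2^m + k with m = floor(lg n) and 0 &lt;= k &lt; 2^m:</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cstdint&gt;

////////////////////////////////////////
//
// Sum of the depths of all nodes (root at
// depth 1) in the tree the center split
// builds over indices l..h. Nothing is
// allocated.
//
std::int64_t depth_sum(int l, int h, int depth=1)
 {
  if (l&gt;h) return 0; // empty sub-list
  int p=(l+h+1)/2;   // same split as pivot_round
  return depth
         + depth_sum(l,p-1,depth+1)
         + depth_sum(p+1,h,depth+1);
 }

////////////////////////////////////////
//
// Numerator of the optimal average depth
// for n&gt;=1 nodes, assuming n=2^m+k with
// m=floor(lg n) and 0&lt;=k&lt;2^m
//
std::int64_t optimal_depth_sum(int n)
 {
  int m=0;
  while ((std::int64_t(2)&lt;&lt;m)&lt;=n) m++; // m=floor(lg n)

  std::int64_t two_m=std::int64_t(1)&lt;&lt;m;
  std::int64_t k=n-two_m;

  return two_m*(m-1)+1+(k+1)*(m+1);
 }
</pre>
<p>Dividing either sum by <tt>n</tt> gives the average depth. For <tt>n=12</tt>, both functions return 37, that is, an average depth of 37/12, or about 3.08.</p>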
<p>With <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> running from 1 to 1023 (a tree with zero nodes isn&#8217;t very interesting!), we obtain the following graph:</p>
<p><a href="http://hbfs.files.wordpress.com/2013/04/compared.png"><img src="http://hbfs.files.wordpress.com/2013/04/compared.png?w=300&#038;h=217" alt="compared" width="300" height="217" class="aligncenter size-medium wp-image-4653" /></a></p>
<p>The black (optimal) and green (center, or rounded split) overlap perfectly because the center split yields optimal average depth. The Fibonacci split isn&#8217;t as bad as I expected: it yields trees with deeper branches than strictly needed, but not by much (maybe 1.6 more?). The random pivot, shown in red, fluctuates wildly, and so is displayed with a moving average (with a window of 100). It does a lot worse than the other methods, but, again, not as much as I would have thought at first.</p>
<p align="center">*<br />*&emsp;*</p>
<p>Let us now have a look at what shape the trees actually are. If you know the average depth is optimal you kind of expect the trees to look all nifty like this one:</p>
<p><a href="http://hbfs.files.wordpress.com/2013/04/tree-15.jpg"><img src="http://hbfs.files.wordpress.com/2013/04/tree-15.jpg?w=300&#038;h=184" alt="tree-15" width="300" height="184" class="aligncenter size-medium wp-image-4654" /></a></p>
<p>(The numbers in the nodes represent the index of the element in the original sorted list.) That&#8217;s the base case tree with <img src='http://s0.wp.com/latex.php?latex=n%5Csim+2%5Em-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;sim 2^m-1' title='n&#92;sim 2^m-1' class='latex' /> (and that indeed we find with the center split: all trees shown here are svg made from the GraphViz representation of the trees generated by the code above). What if we have an &#8220;inconvenient&#8221; <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />? Say <img src='http://s0.wp.com/latex.php?latex=n%3D12&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=12' title='n=12' class='latex' />? Well, it&#8217;s not what you expect:</p>
<p><a href="http://hbfs.files.wordpress.com/2013/04/tree-12.jpg"><img src="http://hbfs.files.wordpress.com/2013/04/tree-12.jpg?w=300&#038;h=300" alt="tree-12" width="300" height="300" class="aligncenter size-medium wp-image-4655" /></a></p>
<p>but the average depth is still optimal.</p>
<p align="center">*<br />*&emsp;*</p>
<p>So, what&#8217;s the <img src='http://s0.wp.com/latex.php?latex=%5CTheta%28%5Clg+n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;Theta(&#92;lg n)' title='&#92;Theta(&#92;lg n)' class='latex' /> extra memory I was speaking of at the beginning? Simply the stack. Unlike other recursive algorithms such as Quicksort, where you can split the current list into two very uneven sub-lists, picking the pivot in the center creates the most even sub-lists possible, and that, in turn, guarantees that the recursion depth is <img src='http://s0.wp.com/latex.php?latex=%5CTheta%28%5Clg+n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;Theta(&#92;lg n)' title='&#92;Theta(&#92;lg n)' class='latex' /> (or, &#8220;about <img src='http://s0.wp.com/latex.php?latex=%5Clg+n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lg n' title='&#92;lg n' class='latex' />&#8221;). There&#8217;s a hidden constant (for whatever the storage of one stack frame actually is), but the &#8220;big O&#8221; notation lets us get away with it.</p>
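<p>The recursion depth can be checked without allocating anything. The following is only a sketch, and <tt>recursion_depth</tt> is a hypothetical helper that mirrors the center split rather than part of the code above:</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;algorithm&gt;

////////////////////////////////////////
//
// Number of nested calls needed by a
// center split over indices l..h, that
// is, the height of the resulting tree
// (0 for an empty list)
//
int recursion_depth(int l, int h)
 {
  if (l&gt;h) return 0; // empty sub-list
  int p=(l+h+1)/2;   // same split as pivot_round
  return 1+std::max(recursion_depth(l,p-1),
                    recursion_depth(p+1,h));
 }
</pre>
<p>For a list of 15 items, <tt>recursion_depth(0,14)</tt> gives 4, and for 1023 items it gives 10, that is, floor(lg n)+1 stack frames.</p>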
<p align="center">*<br />*&emsp;*</p>
<p>And there you have it: a linear-time algorithm (each node/list item is visited exactly once) that needs only <img src='http://s0.wp.com/latex.php?latex=%5CTheta%28%5Clg+n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;Theta(&#92;lg n)' title='&#92;Theta(&#92;lg n)' class='latex' /> extra storage.</p>
<br />Filed under: <a href='http://hbfs.wordpress.com/category/algorithms/'>algorithms</a>, <a href='http://hbfs.wordpress.com/category/c-plus-plus/'>C-plus-plus</a>, <a href='http://hbfs.wordpress.com/category/data-structures/'>data structures</a>, <a href='http://hbfs.wordpress.com/category/programming/'>programming</a> Tagged: <a href='http://hbfs.wordpress.com/tag/balanced-tree/'>balanced tree</a>, <a href='http://hbfs.wordpress.com/tag/integer-decomposition/'>integer decomposition</a>, <a href='http://hbfs.wordpress.com/tag/tree/'>Tree</a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/hbfs.wordpress.com/4649/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/hbfs.wordpress.com/4649/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hbfs.wordpress.com&#038;blog=4426521&#038;post=4649&#038;subd=hbfs&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://hbfs.wordpress.com/2013/04/09/building-a-tree-from-a-list-in-linear-time-ii/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d3d9050d6870dcfaf7f207cd5ca2b50b?s=96&#38;d=identicon" medium="image">
			<media:title type="html">stevenpigeon</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2013/04/wood.jpg?w=145" medium="image">
			<media:title type="html">wood</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2013/04/compared.png?w=300" medium="image">
			<media:title type="html">compared</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2013/04/tree-15.jpg?w=300" medium="image">
			<media:title type="html">tree-15</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2013/04/tree-12.jpg?w=300" medium="image">
			<media:title type="html">tree-12</media:title>
		</media:content>
	</item>
		<item>
		<title>Caesar&#8217;s Cipher</title>
		<link>http://hbfs.wordpress.com/2013/04/02/caesars-cipher/</link>
		<comments>http://hbfs.wordpress.com/2013/04/02/caesars-cipher/#comments</comments>
		<pubDate>Tue, 02 Apr 2013 14:04:06 +0000</pubDate>
		<dc:creator>Steven Pigeon</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[Cryptography]]></category>
		<category><![CDATA[Breaking Ciphers]]></category>
		<category><![CDATA[Caesar]]></category>
		<category><![CDATA[Caesar Cipher]]></category>
		<category><![CDATA[Cipher]]></category>

		<guid isPermaLink="false">http://hbfs.wordpress.com/?p=4509</guid>
		<description><![CDATA[Julius Caesar, presumably sometime during the war in Gaul, according to Suetonius, used a simple cipher to ensure the privacy of his communications. Caesar&#8217;s method can hardly be considered anything close to secure, but it&#8217;s still worthwhile to have a look at how you can implement it, and break it, mostly because it&#8217;s one of [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hbfs.wordpress.com&#038;blog=4426521&#038;post=4509&#038;subd=hbfs&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a href="http://en.wikipedia.org/wiki/Julius_Caesar" target="_blank">Julius Caesar</a>, presumably sometime during the war in Gaul, according to <a href="http://www.gutenberg.org/files/6400/6400-h/6400-h.htm" target="_blank">Suetonius</a>, used a <i>simple</i> cipher to ensure the privacy of his communications.</p>
<p><a href="http://www.amazon.com/gp/product/B004J43DWK/ref=as_li_qf_sp_asin_il_tl?ie=UTF8&amp;camp=1789&amp;creative=9325&amp;creativeASIN=B004J43DWK&amp;linkCode=as2&amp;tag=hardbettfasts-20"><img src="http://hbfs.files.wordpress.com/2013/02/cipher-coin.jpg?w=148&#038;h=150" alt="cipher-coin" width="148" height="150" class="aligncenter size-thumbnail wp-image-4512" /></a></p>
<p>Caesar&#8217;s method can hardly be considered anything close to secure, but it&#8217;s still worthwhile to have a look at how you can implement it, and break it, mostly because it&#8217;s one of the simplest <a href="http://en.wikipedia.org/wiki/Substitution_cipher" target="_blank">substitution ciphers</a>.</p>
<p><span id="more-4509"></span></p>
<p>Caesar&#8217;s cipher is a symmetric cipher, where the encryption and decryption keys are the same, and is very simple. Let <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> be the key, an integer. Let <img src='http://s0.wp.com/latex.php?latex=%5C%7Bx_i%5C%7D_%7Bi%3D1%7D%5En&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{x_i&#92;}_{i=1}^n' title='&#92;{x_i&#92;}_{i=1}^n' class='latex' /> be the sequence of letters to encode. Let us suppose for now that the alphabet is limited to A to Z, with no punctuation, nor diacritics, nor numbers.</p>
<p>The encryption is given by:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=c_i+%3D+%28x_i+%2B+k%29%7E%5Cmathrm%7Bmod%7D%7E26&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c_i = (x_i + k)~&#92;mathrm{mod}~26' title='c_i = (x_i + k)~&#92;mathrm{mod}~26' class='latex' />.</p>
<p>That is, each letter is &#8220;shifted&#8221; <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> positions: A shifted 3 positions becomes D. The decryption is the inverse operation:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=x_i+%3D+%28c_i+-+k%29%7E%5Cmathrm%7Bmod%7D%7E26&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x_i = (c_i - k)~&#92;mathrm{mod}~26' title='x_i = (c_i - k)~&#92;mathrm{mod}~26' class='latex' />,</p>
<p>where <img src='http://s0.wp.com/latex.php?latex=%5Cmathrm%7Bmod%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathrm{mod}' title='&#92;mathrm{mod}' class='latex' /> always returns a non-negative number.</p>
<p>Suppose you encrypt a piece of text with <img src='http://s0.wp.com/latex.php?latex=k%3D3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=3' title='k=3' class='latex' />. The message</p>
<p align="center">Tomorrow, we attack the Gauls</p>
<p>becomes:</p>
<p align="center"><tt>tomorrowweattackthegauls</tt></p>
<p>before getting fed to the algorithm, which transforms it into:</p>
<p align="center"><tt>wrpru urzzh dwwdf nwkhj dxov</tt></p>
<p>which may superficially look completely and securely encrypted.</p>
<p>But when you stumble upon a cryptogram, say</p>
<p align="center"><tt>igttu zgzzg iqmga ryutc kyzhg tq</tt>,</p>
<p>it suffices to try (at most) 25 keys:</p>
<p align="center">
 <img src='http://s0.wp.com/latex.php?latex=k%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=1' title='k=1' class='latex' /> : <tt>hfsst yfyyf hplfz qxtsb jxygf sp</tt><br />
 <img src='http://s0.wp.com/latex.php?latex=k%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=2' title='k=2' class='latex' /> : <tt>gerrs xexxe gokey pwsra iwxfe ro</tt><br />
 <img src='http://s0.wp.com/latex.php?latex=k%3D3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=3' title='k=3' class='latex' /> : <tt>fdqqr wdwwd fnjdx ovrqz hvwed qn</tt><br />
 <img src='http://s0.wp.com/latex.php?latex=k%3D4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=4' title='k=4' class='latex' /> : <tt>ecppq vcvvc emicw nuqpy guvdc pm</tt><br />
 <img src='http://s0.wp.com/latex.php?latex=k%3D5&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=5' title='k=5' class='latex' /> : <tt>dboop ubuub dlhbv mtpox ftucb ol</tt><br />
 <img src='http://s0.wp.com/latex.php?latex=k%3D6&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=6' title='k=6' class='latex' /> : <tt>canno tatta ckgau lsonw estba nk</tt><br />
 <img src='http://s0.wp.com/latex.php?latex=k%3D7&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=7' title='k=7' class='latex' /> : <tt>bzmmn szssz bjfzt krnmv drsaz mj</tt>
</p>
<p>and <img src='http://s0.wp.com/latex.php?latex=k%3D6&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=6' title='k=6' class='latex' /> yields <tt>cannot attack Gauls on west bank</tt>, which solves the cryptogram unambiguously.</p>
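<p>The exhaustive search above is easy to sketch in code. This is only an illustration: <tt>shift_back</tt> and <tt>candidates</tt> are hypothetical names, and the sketch assumes lowercase letters, leaving spaces and anything else untouched:</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cstddef&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

////////////////////////////////////////
//
// Shifts lowercase letters back by key
// positions (key in 0..26); other
// characters are copied unchanged
//
std::string shift_back( const std::string &amp; cipher,
                        int key )
 {
  std::string clear(cipher);
  for (std::size_t i=0;i&lt;clear.size();i++)
   if (clear[i]&gt;='a' &amp;&amp; clear[i]&lt;='z')
    clear[i]='a'+((clear[i]-'a')+(26-key)) % 26;
  return clear;
 }

////////////////////////////////////////
//
// All 25 candidate clear-texts, for keys
// 1 to 25 (key k is found at index k-1)
//
std::vector&lt;std::string&gt; candidates( const std::string &amp; cipher )
 {
  std::vector&lt;std::string&gt; all;
  for (int key=1;key&lt;=25;key++)
   all.push_back(shift_back(cipher,key));
  return all;
 }
</pre>
<p>With the cryptogram above, the entry at index 5 (that is, key 6) is <tt>canno tatta ckgau lsonw estba nk</tt>. Picking it out of the 25 candidates is still left to a human reader.</p>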
<p align="center">*<br />*&emsp;*</p>
<p>Caesar&#8217;s cipher is weak, and even with a pen-and-paper solving strategy, it won&#8217;t take very long before one cracks the code. For one thing, you don&#8217;t have to decipher the whole message before figuring out that the key you&#8217;re trying yields only gibberish, and you don&#8217;t even have to start at the beginning of the message.</p>
<p>Still, if one wants to automate the process (let&#8217;s suppose we want to), the only missing piece is a language model that gives a score to each clear-text candidate, and allows us to pick the most probable one. Maybe we could do just that in a follow-up entry?</p>
<p align="center">*<br />*&emsp;*</p>
<p>So let us now look at the implementation. It assumes that the letters are lowercase. A direct implementation would give something like:</p>
<pre class="brush: cpp; title: ; notranslate">
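#include &lt;string&gt;

////////////////////////////////////////
//
// Maps 'a' to 0, ..., 'z' to 25
// (assumes a lowercase letter)
//
int ord(char c) { return c-'a'; }

////////////////////////////////////////
//
// Encrypts string (in-place) using
// Caesar's method; key in 0..25
//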
void encrypt( int key,
              std::string &amp; message )
 {
  for (size_t i=0;i&lt;message.size();i++)
   message[i]='a'+(ord(message[i])+key) % 26;
 }


////////////////////////////////////////
//
// Decrypts string (in-place) using
// Caesar's method; key in 0..25
//
void decrypt( int key,
              std::string &amp; message )
 {
  for (size_t i=0;i&lt;message.size();i++)
   message[i]='a'+(ord(message[i])+(26-key)) % 26;
 }
</pre>
<p>where <tt>ord</tt> is a helper function that returns 0 for <tt>a</tt> and 25 for <tt>z</tt>. There&#8217;s just a small complication: as C++&#8217;s modulo <tt>%</tt> returns a negative remainder for a negative left operand, it is necessary to compensate for this by adding 26 in the decryption, so that the arithmetic is safely performed on non-negative numbers.</p>
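<p>If keys outside of 0 to 25 (negative ones included) must be supported, one possible way is to normalize the remainder explicitly. The sketch below is an assumption-laden variant, not the implementation above; <tt>mod26</tt> and <tt>shift</tt> are hypothetical helpers:</p>
<pre class="brush: cpp; title: ; notranslate">
////////////////////////////////////////
//
// Remainder of x modulo 26, always in
// 0..25, even when x is negative
//
int mod26(int x) { return ((x % 26)+26) % 26; }

////////////////////////////////////////
//
// Shifts one lowercase letter by key
// positions (key may be negative)
//
char shift(char c, int key)
 {
  return char('a'+mod26((c-'a')+key));
 }
</pre>
<p>With this, decrypting is just shifting by <tt>-key</tt>: <tt>shift('d',-3)</tt> gives back <tt>'a'</tt>, and <tt>shift('a',-3)</tt> wraps around to <tt>'x'</tt>.</p>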
<p align="center">*<br />*&emsp;*</p>
<p>We will come back to (simple) language models that will help us automate the process of cracking Caesar&#8217;s cipher.</p>
<p><i><a href="http://hbfs.wordpress.com/2013/04/16/breaking-caesars-cipher-caesars-cipher-part-ii/" target="_blank">To be continued&#8230;</a></i></p>
<br />Filed under: <a href='http://hbfs.wordpress.com/category/algorithms/'>algorithms</a>, <a href='http://hbfs.wordpress.com/category/cryptography/'>Cryptography</a> Tagged: <a href='http://hbfs.wordpress.com/tag/breaking-ciphers/'>Breaking Ciphers</a>, <a href='http://hbfs.wordpress.com/tag/caesar/'>Caesar</a>, <a href='http://hbfs.wordpress.com/tag/caesar-cipher/'>Caesar Cipher</a>, <a href='http://hbfs.wordpress.com/tag/cipher/'>Cipher</a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/hbfs.wordpress.com/4509/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/hbfs.wordpress.com/4509/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hbfs.wordpress.com&#038;blog=4426521&#038;post=4509&#038;subd=hbfs&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://hbfs.wordpress.com/2013/04/02/caesars-cipher/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d3d9050d6870dcfaf7f207cd5ca2b50b?s=96&#38;d=identicon" medium="image">
			<media:title type="html">stevenpigeon</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2013/02/cipher-coin.jpg?w=148" medium="image">
			<media:title type="html">cipher-coin</media:title>
		</media:content>
	</item>
		<item>
		<title>A Special Case&#8230;</title>
		<link>http://hbfs.wordpress.com/2013/03/26/a-special-case/</link>
		<comments>http://hbfs.wordpress.com/2013/03/26/a-special-case/#comments</comments>
		<pubDate>Tue, 26 Mar 2013 14:24:48 +0000</pubDate>
		<dc:creator>Steven Pigeon</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[algebra]]></category>
		<category><![CDATA[Ceiling]]></category>
		<category><![CDATA[Floor]]></category>

		<guid isPermaLink="false">http://hbfs.wordpress.com/?p=4441</guid>
		<description><![CDATA[Expressions with floors and ceilings (⌊x⌋ and ⌈y⌉) are usually troublesome to work with. There are cases where you can essentially remove them by a change of variables. Turns out, one form that regularly comes up in my calculations is ⌊lg x⌋, and it bugged me a while before I figured out the right way of [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hbfs.wordpress.com&#038;blog=4426521&#038;post=4441&#038;subd=hbfs&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Expressions with floors and ceilings (<img src='http://s0.wp.com/latex.php?latex=%5Clfloor+x+%5Crfloor&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lfloor x &#92;rfloor' title='&#92;lfloor x &#92;rfloor' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Clceil+y+%5Crceil&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lceil y &#92;rceil' title='&#92;lceil y &#92;rceil' class='latex' />) are usually troublesome to work with. There are cases where you can essentially remove them by a change of variables.</p>
<p><a href="http://commons.wikimedia.org/wiki/File:Illustration_of_revolving_stairs_%28U.S._Patent_25,076_issued_to_Nathan_Ames,_9_August_1859%29.jpg" target="_blank"><img src="http://hbfs.files.wordpress.com/2012/12/stairs.jpg?w=450" alt="stairs"   class="aligncenter size-full wp-image-4445" /></a></p>
<p>Turns out, one form that regularly comes up in my calculations is <img src='http://s0.wp.com/latex.php?latex=%5Clfloor+%5Clg+x+%5Crfloor&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lfloor &#92;lg x &#92;rfloor' title='&#92;lfloor &#92;lg x &#92;rfloor' class='latex' />, and it bugged me a while before I figured out the right way of getting rid of them (sometimes).</p>
<p><span id="more-4441"></span></p>
<p>Let us start with an abstract case:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+h%28m%29%3D%5Csum_%7Bi%3D1%7D%5Em+%5Clfloor+%5Clg+i+%5Crfloor+f%28i%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle h(m)=&#92;sum_{i=1}^m &#92;lfloor &#92;lg i &#92;rfloor f(i)' title='&#92;displaystyle h(m)=&#92;sum_{i=1}^m &#92;lfloor &#92;lg i &#92;rfloor f(i)' class='latex' /></p>
<p>where <img src='http://s0.wp.com/latex.php?latex=m%5Csim+2%5Ek&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m&#92;sim 2^k' title='m&#92;sim 2^k' class='latex' /> (for now), and where <img src='http://s0.wp.com/latex.php?latex=f%28i%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(i)' title='f(i)' class='latex' /> is some function that depends only on <img src='http://s0.wp.com/latex.php?latex=i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='i' title='i' class='latex' />. <img src='http://s0.wp.com/latex.php?latex=%5Clfloor+%5Clg+i+%5Crfloor&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lfloor &#92;lg i &#92;rfloor' title='&#92;lfloor &#92;lg i &#92;rfloor' class='latex' /> grows by one each time <img src='http://s0.wp.com/latex.php?latex=i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='i' title='i' class='latex' /> doubles in magnitude. We can therefore write the preceding as</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+h%28m%29%3D%5Csum_%7Bj%3D0%7D%5Ek+j+%5Csum_%7Bi%3D2%5Ej%7D%5E%7B2%5E%7Bj%2B1%7D-1%7D+f%28i%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle h(m)=&#92;sum_{j=0}^k j &#92;sum_{i=2^j}^{2^{j+1}-1} f(i)' title='&#92;displaystyle h(m)=&#92;sum_{j=0}^k j &#92;sum_{i=2^j}^{2^{j+1}-1} f(i)' class='latex' /></p>
<p>where you can verify that <img src='http://s0.wp.com/latex.php?latex=i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='i' title='i' class='latex' /> still progresses from <img src='http://s0.wp.com/latex.php?latex=1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1' title='1' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> as before (provided that <img src='http://s0.wp.com/latex.php?latex=m%3D2%5E%7Bk%2B1%7D-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m=2^{k+1}-1' title='m=2^{k+1}-1' class='latex' />, so that the last group is complete), but we now have a different grouping that allows us to replace <img src='http://s0.wp.com/latex.php?latex=%5Clfloor+%5Clg+i+%5Crfloor&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lfloor &#92;lg i &#92;rfloor' title='&#92;lfloor &#92;lg i &#92;rfloor' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j' title='j' class='latex' />, and factor it out. We can now simplify the sums depending on <img src='http://s0.wp.com/latex.php?latex=i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='i' title='i' class='latex' /> as a function of <img src='http://s0.wp.com/latex.php?latex=j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j' title='j' class='latex' /> as</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+h%28m%29%3D%5Csum_%7Bj%3D0%7D%5Ek+j+g%28j%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle h(m)=&#92;sum_{j=0}^k j g(j)' title='&#92;displaystyle h(m)=&#92;sum_{j=0}^k j g(j)' class='latex' /></p>
<p>where <img src='http://s0.wp.com/latex.php?latex=g%28j%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(j)' title='g(j)' class='latex' /> is an expression depending only on <img src='http://s0.wp.com/latex.php?latex=j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j' title='j' class='latex' />. Finally, you can obtain a simplified expression for <img src='http://s0.wp.com/latex.php?latex=h%28m%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h(m)' title='h(m)' class='latex' />, the initial goal.</p>
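<p>The regrouping can be checked numerically. The sketch below compares the direct sum with the grouped one; it assumes that the upper bound is of the form 2^(k+1)-1 so that the last group is complete, and it picks f(i)=i as an arbitrary example. All names are illustrative:</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cstdint&gt;

// floor(lg i), for i&gt;=1
int floor_lg(std::uint64_t i)
 {
  int j=0;
  while (i&gt;&gt;=1) j++;
  return j;
 }

// arbitrary example function (an assumption)
std::uint64_t f(std::uint64_t i) { return i; }

// direct sum of floor(lg i)*f(i), for i=1 to 2^(k+1)-1
std::uint64_t h_direct(int k)
 {
  const std::uint64_t m=(std::uint64_t(1)&lt;&lt;(k+1))-1;
  std::uint64_t s=0;
  for (std::uint64_t i=1;i&lt;=m;i++)
   s+=floor_lg(i)*f(i);
  return s;
 }

// same sum, grouped by j=floor(lg i)
std::uint64_t h_grouped(int k)
 {
  std::uint64_t s=0;
  for (int j=0;j&lt;=k;j++)
   {
    const std::uint64_t lo=std::uint64_t(1)&lt;&lt;j;
    const std::uint64_t hi=std::uint64_t(1)&lt;&lt;(j+1);

    std::uint64_t g=0; // inner sum, depends only on j
    for (std::uint64_t i=lo;i&lt;hi;i++)
     g+=f(i);

    s+=j*g;
   }
  return s;
 }
</pre>
<p>For <tt>k=2</tt>, both functions return 49: the terms for i=2 and 3 contribute 1*(2+3), and those for i=4 to 7 contribute 2*(4+5+6+7).</p>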
<p align="center">*<br />*&emsp;*</p>
<p>So, where do the bounds come from? As I said earlier, <img src='http://s0.wp.com/latex.php?latex=%5Clfloor+%5Clg+i+%5Crfloor&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lfloor &#92;lg i &#92;rfloor' title='&#92;lfloor &#92;lg i &#92;rfloor' class='latex' /> has natural boundaries of the form <img src='http://s0.wp.com/latex.php?latex=2%5Ej&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2^j' title='2^j' class='latex' />, with <img src='http://s0.wp.com/latex.php?latex=j%3D0%2C1%2C2%2C%5Cldots&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j=0,1,2,&#92;ldots' title='j=0,1,2,&#92;ldots' class='latex' />. Indeed:</p>
<table align="center" border="1px">
<tr>
<td><img src='http://s0.wp.com/latex.php?latex=j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j' title='j' class='latex' /></td>
<td><img src='http://s0.wp.com/latex.php?latex=2%5Ej&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2^j' title='2^j' class='latex' /></td>
<td><img src='http://s0.wp.com/latex.php?latex=%5Csum_%7Bl%3D0%7D%5Ej+2%5El&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sum_{l=0}^j 2^l' title='&#92;sum_{l=0}^j 2^l' class='latex' /></td>
</tr>
<tr>
<td align="center">0</td>
<td align="center">1</td>
<td align="center">1</td>
</tr>
<tr>
<td align="center">1</td>
<td align="center">2</td>
<td align="center">3</td>
</tr>
<tr>
<td align="center">2</td>
<td align="center">4</td>
<td align="center">7</td>
</tr>
<tr>
<td align="center">3</td>
<td align="center">8</td>
<td align="center">15</td>
</tr>
<tr>
<td align="center">4</td>
<td align="center">16</td>
<td align="center">31</td>
</tr>
<tr>
<td align="center">&#8230;</td>
<td align="center">&#8230;</td>
<td align="center">&#8230;</td>
</tr>
</table>
<p>The generator of the series at the far right of the table isn&#8217;t very hard to guess: <img src='http://s0.wp.com/latex.php?latex=2%5E%7Bj%2B1%7D-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2^{j+1}-1' title='2^{j+1}-1' class='latex' />.</p>
<p>It is a special case of a <a href="http://en.wikipedia.org/wiki/Geometric_series" target="_blank">geometric series</a>,</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Csum_%7Bk%3D0%7D%5E%7Bn-1%7Da+r%5Ek%3Da%5Cfrac%7B1-r%5En%7D%7B1-r%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle &#92;sum_{k=0}^{n-1}a r^k=a&#92;frac{1-r^n}{1-r}' title='&#92;displaystyle &#92;sum_{k=0}^{n-1}a r^k=a&#92;frac{1-r^n}{1-r}' class='latex' /></p>
<p>With <img src='http://s0.wp.com/latex.php?latex=a%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a=1' title='a=1' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=r%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r=2' title='r=2' class='latex' />, and <img src='http://s0.wp.com/latex.php?latex=n%3Dj%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=j+1' title='n=j+1' class='latex' /> (the sum in the table has <img src='http://s0.wp.com/latex.php?latex=j%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j+1' title='j+1' class='latex' /> terms), it does simplify to the expected:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+a%5Cfrac%7B1-r%5E%7Bj%2B1%7D%7D%7B1-r%7D%3D%281%29%5Cfrac%7B1-2%5E%7Bj%2B1%7D%7D%7B1-2%7D%3D%5Cfrac%7B-2%5E%7Bj%2B1%7D%2B1%7D%7B-1%7D%3D2%5E%7Bj%2B1%7D-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle a&#92;frac{1-r^{j+1}}{1-r}=(1)&#92;frac{1-2^{j+1}}{1-2}=&#92;frac{-2^{j+1}+1}{-1}=2^{j+1}-1' title='&#92;displaystyle a&#92;frac{1-r^{j+1}}{1-r}=(1)&#92;frac{1-2^{j+1}}{1-2}=&#92;frac{-2^{j+1}+1}{-1}=2^{j+1}-1' class='latex' /></p>
<p align="center">*<br />*&emsp;*</p>
<p>So by knowing how to sum powers of two (or any other constant, for that matter), we could remove <img src='http://s0.wp.com/latex.php?latex=%5Clfloor+%5Clg+i+%5Crfloor&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lfloor &#92;lg i &#92;rfloor' title='&#92;lfloor &#92;lg i &#92;rfloor' class='latex' /> and get a simpler expression. While the change of variable and adapting the sums&#8217; boundaries help a great deal, it is the fact that the base of the logarithm is an integer that helps the most. Indeed, can you see what the problem is if the base of the logarithm is now <img src='http://s0.wp.com/latex.php?latex=e&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e' title='e' class='latex' />?</p>
<p>You guessed it right. Now, instead of having boundaries at integers of the form <img src='http://s0.wp.com/latex.php?latex=2%5Ej&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2^j' title='2^j' class='latex' />, we must find where the function changes between some <img src='http://s0.wp.com/latex.php?latex=e%5Ej&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^j' title='e^j' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=e%5E%7Bj%2B1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^{j+1}' title='e^{j+1}' class='latex' />, which can be quite a bit messier.</p>
<br />Filed under: <a href='http://hbfs.wordpress.com/category/algorithms/'>algorithms</a>, <a href='http://hbfs.wordpress.com/category/mathematics/'>Mathematics</a> Tagged: <a href='http://hbfs.wordpress.com/tag/algebra/'>algebra</a>, <a href='http://hbfs.wordpress.com/tag/ceiling/'>Ceiling</a>, <a href='http://hbfs.wordpress.com/tag/floor/'>Floor</a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/hbfs.wordpress.com/4441/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/hbfs.wordpress.com/4441/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hbfs.wordpress.com&#038;blog=4426521&#038;post=4441&#038;subd=hbfs&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://hbfs.wordpress.com/2013/03/26/a-special-case/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d3d9050d6870dcfaf7f207cd5ca2b50b?s=96&#38;d=identicon" medium="image">
			<media:title type="html">stevenpigeon</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2012/12/stairs.jpg" medium="image">
			<media:title type="html">stairs</media:title>
		</media:content>
	</item>
		<item>
		<title>Shallow Constitude</title>
		<link>http://hbfs.wordpress.com/2013/03/19/shallow-constitude/</link>
		<comments>http://hbfs.wordpress.com/2013/03/19/shallow-constitude/#comments</comments>
		<pubDate>Tue, 19 Mar 2013 13:44:19 +0000</pubDate>
		<dc:creator>Steven Pigeon</dc:creator>
				<category><![CDATA[C]]></category>
		<category><![CDATA[C-plus-plus]]></category>
		<category><![CDATA[C99]]></category>
		<category><![CDATA[hacks]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[const]]></category>

		<guid isPermaLink="false">http://hbfs.wordpress.com/?p=4497</guid>
		<description><![CDATA[In programming languages, there are constructs that are of little pragmatic importance (that is, they do not really affect how code behaves or what code is generated by the compiler) but are of great &#8220;social&#8221; importance as they instruct the programmer as to what contract the code complies with. One of those constructs in C++ [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>In programming languages, there are constructs that are of little pragmatic importance (that is, they do not really affect how code behaves or what code is generated by the compiler) but are of great &#8220;social&#8221; importance as they instruct the programmer as to what contract the code complies with.</p>
<p><a href="http://commons.wikimedia.org/wiki/File:Padlock-light-silver.svg"><img src="http://hbfs.files.wordpress.com/2013/02/200px-padlock-light-silver-svg.png?w=150&#038;h=150" alt="200px-Padlock-light-silver.svg" width="150" height="150" class="aligncenter size-thumbnail wp-image-4500" /></a></p>
<p>One of those constructs in C++ is the <tt>const</tt> qualifier, which explicitly tells the programmer that a function argument will be treated as read-only: it&#8217;s safe to pass your data to it, it won&#8217;t be modified. But is it anything more than <a href="http://en.wikipedia.org/wiki/Security_theater" target="_blank"><i>security theater</i></a>?</p>
<p><span id="more-4497"></span></p>
<p>Let us start with a C-style program (that may be C++):</p>
<pre class="brush: cpp; title: ; notranslate">
void le_troll_func(const char * buffer)
 {
  for (int i=0;buffer[i];i++)
   buffer[0]=0;
 }
</pre>
<p>This normally results in a compile-time error:</p>
<pre class="brush: plain; title: ; notranslate">
const-test.cpp: In function 'void le_troll_func(const char*)':
const-test.cpp:24:14: error: assignment of read-only location '* buffer'
</pre>
<p>The first observation is that despite <tt>const</tt>, one can discard constness using either C-style or C++-style casts. In C, we can easily re-write it as:</p>
<pre class="brush: cpp; title: ; notranslate">
void le_troll_func(const char * buffer)
 {
  for (int i=0;buffer[i];i++)
   ((char*)buffer)[0]=0;
 }
</pre>
<p>&#8230;which compiles without so much as a warning. In C++, we can replace the cast by a <tt>const_cast</tt>:</p>
<pre class="brush: cpp; title: ; notranslate">
void le_troll_func(const char * buffer)
 {
  for (int i=0;buffer[i];i++)
   const_cast&lt;char*&gt;(buffer)[0]=0;
 }
</pre>
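<p>A word of caution, with a minimal sketch (the names <tt>clear_first</tt> and <tt>demo_writable_buffer</tt> are hypothetical and not part of the code above). Casting away <tt>const</tt> and writing is only well-defined when the object pointed to was not itself defined <tt>const</tt>. Passing a string literal or a <tt>const</tt> array to such a function still compiles, but the write is undefined behavior:</p>
<pre class="brush: cpp; title: ; notranslate">
// Hypothetical helper: writes through a pointer-to-const
// after discarding the qualifier.
void clear_first(const char * buffer)
 {
  const_cast&lt;char*&gt;(buffer)[0]=0;
 }

// Well-defined use: the array itself is writable, only
// the pointer seen by clear_first() claims otherwise.
bool demo_writable_buffer()
 {
  char text[]="hello";
  clear_first(text);
  return text[0]==0;
 }

// Not shown running on purpose: clear_first("hello")
// compiles too, but modifies a string literal, which is
// undefined behavior.
</pre>
<p>So the cast silences the compiler, but it does not make read-only storage writable.</p>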
<p>Even the C++ method modifier <tt>const</tt> is weak. Consider:</p>
<pre class="brush: cpp; title: ; notranslate">
class inner_thingie
{
 public: int zoidberg; // why not?
};

class thingie
 {
 private:

  int x;
  inner_thingie * z;

  void le_troll_function() const { z-&gt;zoidberg=3; }

  void on_something_else() const { x=3; }
  

  thingie()
   : x(0), z(new inner_thingie)
  { }
 };
</pre>
<p>This causes a compilation error (yes, without an &#8216;s&#8217;: only one of the two functions is rejected):</p>
<pre class="brush: plain; title: ; notranslate">
const-test.cpp: In member function 'void thingie::on_something_else() const':
const-test.cpp:15:38: error: assignment of member 'thingie::x' in read-only object
</pre>
<p>So, while <tt>on_something_else() const</tt> behaves as expected, <tt>le_troll_function() const</tt> clearly <i>does not</i>. Inside a <tt>const</tt> member function, <tt>z</tt> itself is <tt>const</tt> (the pointer cannot be reseated, as if it were declared <tt>inner_thingie * const</tt>), but the constness isn&#8217;t transitive and does not apply to what <tt>z</tt> points to. ISO 14882:2003 &sect;9.3.2.2 states that <tt>const</tt> on a member function forces <tt>*this</tt> to be <tt>const</tt>, but nothing implies that it propagates through pointers: it is limited to the object itself. tl;dr: <i>const is shallow</i>.</p>
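<p>The shallowness is easy to observe at run time. In the sketch below (<tt>inner</tt>, <tt>holder</tt>, <tt>poke</tt>, <tt>peek</tt> and <tt>poke_through_const</tt> are illustrative names, not part of the example above, and the code assumes a C++11 compiler for <tt>= delete</tt>), a <tt>const</tt> object changes what it points to without a single cast:</p>
<pre class="brush: cpp; title: ; notranslate">
struct inner { int value; };

class holder
 {
 public:

  holder() : z(new inner()) { z-&gt;value=0; }
  ~holder() { delete z; }

  // copying is disabled to keep the sketch short
  holder(const holder &amp;) = delete;
  holder &amp; operator=(const holder &amp;) = delete;

  // compiles: in a const member, z is an inner * const,
  // so the pointer is frozen but the pointee is not
  void poke() const { z-&gt;value=3; }

  int peek() const { return z-&gt;value; }

 private:

  inner * z;
 };

// Modifies the pointee of a const object, then reads it back.
int poke_through_const()
 {
  const holder h;
  h.poke();
  return h.peek();
 }
</pre>
<p>Calling <tt>poke_through_const()</tt> returns 3, even though <tt>h</tt> is declared <tt>const</tt>. The reverse does not compile: assigning to <tt>z</tt> itself inside <tt>poke()</tt> is rejected, since the pointer is <tt>const</tt> there.</p>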
<p align="center">*<br />*&emsp;*</p>
<p>If <tt>const</tt> doesn&#8217;t seem to enforce a great deal of constraints, is it still worth using? It is of little pragmatic importance (I have yet to find a compiler optimization that takes real advantage of <tt>const</tt>), but if it is honored as it should be, <tt>const</tt> signs a contract between the function (class, library) and the user (the programmer). It helps distinguish functions that only read data from those that rewrite it. For this reason, C (and C++) libraries were extensively <tt>const</tt>ed when the keyword was introduced to the language (for it hasn&#8217;t been there forever). So, yes, I think it&#8217;s still worth using <tt>const</tt>, even with its limitations.</p>
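<p>As a small illustration of that contract (<tt>count_spaces</tt> and <tt>make_upper</tt> are hypothetical functions written for this example), the signatures alone tell the caller which function may touch the data:</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cctype&gt;
#include &lt;cstddef&gt;

// Reads only: the const in the signature promises that
// the caller's string is left alone.
std::size_t count_spaces(const char * s)
 {
  std::size_t n=0;
  for (std::size_t i=0;s[i];i++)
   if (s[i]==' ') n++;
  return n;
 }

// Rewrites in place: the absence of const warns the
// caller that the string will change.
void make_upper(char * s)
 {
  for (std::size_t i=0;s[i];i++)
   s[i]=static_cast&lt;char&gt;(std::toupper(static_cast&lt;unsigned char&gt;(s[i])));
 }
</pre>
<p>Nothing stops <tt>count_spaces</tt> from cheating with a cast, but a reader of the header can plan around the two declarations without opening the implementation.</p>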
<p align="center">*<br />*&emsp;*</p>
<p>The &#8220;social contract&#8221; of <tt>const</tt> isn&#8217;t very strong, and a programmer can easily de<tt>const</tt> data. But if a programmer does it deliberately, you&#8217;re to find him and taunt him. Even a second time.</p>
<br />Filed under: <a href='http://hbfs.wordpress.com/category/c/'>C</a>, <a href='http://hbfs.wordpress.com/category/c-plus-plus/'>C-plus-plus</a>, <a href='http://hbfs.wordpress.com/category/c99/'>C99</a>, <a href='http://hbfs.wordpress.com/category/hacks/'>hacks</a>, <a href='http://hbfs.wordpress.com/category/programming/'>programming</a> Tagged: <a href='http://hbfs.wordpress.com/tag/const/'>const</a>]]></content:encoded>
			<wfw:commentRss>http://hbfs.wordpress.com/2013/03/19/shallow-constitude/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d3d9050d6870dcfaf7f207cd5ca2b50b?s=96&#38;d=identicon" medium="image">
			<media:title type="html">stevenpigeon</media:title>
		</media:content>

		<media:content url="http://hbfs.files.wordpress.com/2013/02/200px-padlock-light-silver-svg.png?w=150" medium="image">
			<media:title type="html">200px-Padlock-light-silver.svg</media:title>
		</media:content>
	</item>
	</channel>
</rss>
