23/04/2013
In the last installment of this series, we looked at Markov chains as a means of estimating the likelihood that a given piece of text is actually a message written in English, rather than mere gibberish.
This week, we finally piece everything together to obtain a program to crack Caesar’s cipher without (much) human intervention.
algorithms, Bash (Shell), Cryptography | Tagged: Caesar Cipher, Markov chains |
Posted by Steven Pigeon
16/04/2013
In the last installment of this series, we had a look at Caesar’s cipher, an absurdly simple symmetric encryption technique that consists only of shifting symbols $k$ places.

While it’s ridiculously easy to break the cipher, even with pen-and-paper techniques, we ended up, last time, surmising that we should be able to crack the cipher automatically, without human intervention, if only we had a reasonable language model. This week, let us have a look at how we could build a very simple language model that does just that.
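As a rough illustration of what such a language model could look like, here is a minimal sketch of a first-order Markov model over letters. The symbol set (lowercase letters plus space), the add-one smoothing, and the names `train_bigram_model` and `log_likelihood` are assumptions made for this example, not necessarily what the full post uses.

```python
import math
from collections import defaultdict

# Assumed symbol set: lowercase letters and the space character.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def normalize(text):
    """Lowercase the text and drop every symbol outside ALPHABET."""
    return "".join(c for c in text.lower() if c in ALPHABET)

def train_bigram_model(corpus):
    """Estimate the transition matrix P(next | current) from a corpus.

    Add-one smoothing keeps unseen transitions from having probability 0.
    """
    counts = defaultdict(lambda: defaultdict(int))
    text = normalize(corpus)
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    model = {}
    for a in ALPHABET:
        total = sum(counts[a].values()) + len(ALPHABET)
        model[a] = {b: (counts[a][b] + 1) / total for b in ALPHABET}
    return model

def log_likelihood(model, text):
    """Sum of log transition probabilities along the text."""
    text = normalize(text)
    return sum(math.log(model[a][b]) for a, b in zip(text, text[1:]))
```

Trained on a reasonably large English corpus, such a model should give English-looking text a higher log-likelihood than gibberish of the same length, which is the property an automatic cracker would rely on.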
algorithms, Cryptography, data structures, machine learning | Tagged: Caesar Cipher, Markov chains, Probability, Transition Matrix |
Posted by Steven Pigeon
09/04/2013
Quite a while ago, I proposed a linear-time algorithm to construct trees from sorted lists. The algorithm relied on the segregation of data and internal nodes. This meant that for a list of $n$ data items, $2n-1$ nodes were allocated (but only $n$ contained data; the $n-1$ others just contained pointers).

While segregating structure and data makes sense in some cases (say, the index resides in memory but the leaves/data reside on disk), I found the solution somewhat unsatisfactory (but not unacceptable). So I gave the problem a little more thinking and I arrived at an algorithm that produces a tree with optimal average depth, with data in every node, in linear time and using at most
extra memory.
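For comparison, here is a sketch of a classic way to get a balanced tree with data in every node from a sorted list in linear time. It is not the algorithm of this post (which, going by its tags, relies on an integer decomposition); it is the well-known recursive construction that consumes the list in order, and the names `Node` and `build_balanced` are made up for the example. The only extra memory is the recursion stack, whose depth grows with the logarithm of the list length.

```python
class Node:
    """Binary tree node that always carries a data item."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def _build(it, n):
    """Build a subtree from the next n items of the iterator, in order."""
    if n == 0:
        return None
    left = _build(it, n // 2)           # first half goes to the left
    value = next(it)                    # the middle item becomes the root
    right = _build(it, n - n // 2 - 1)  # the remainder goes to the right
    return Node(value, left, right)

def build_balanced(sorted_items):
    """Each item is visited exactly once, so the construction is linear."""
    return _build(iter(sorted_items), len(sorted_items))

def height(node):
    """Number of levels in the tree (0 for an empty tree)."""
    return 0 if node is None else 1 + max(height(node.left), height(node.right))

def in_order(node):
    """In-order traversal, which should give back the sorted list."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)
```

Because the items are only read sequentially, the same idea applies to a linked list as long as its length is known beforehand.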
algorithms, C-plus-plus, data structures, programming | Tagged: balanced tree, integer decomposition, Tree |
Posted by Steven Pigeon
02/04/2013
Julius Caesar, presumably sometime during the war in Gaul, according to Suetonius, used a simple cipher to ensure the privacy of his communications.

Caesar’s method can hardly be considered anything close to secure, but it’s still worthwhile to have a look at how you can implement it, and break it, mostly because it’s one of the simplest substitution ciphers.
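To make the idea concrete, here is a minimal sketch of the cipher and of the brute-force way to break it. The function names `caesar` and `all_shifts` are invented for this example, and the sketch assumes a 26-letter Latin alphabet with every other character left untouched.

```python
import string

def caesar(text, shift):
    """Shift each letter by `shift` places, wrapping around the alphabet.

    Non-letters (spaces, punctuation, digits) are copied unchanged.
    """
    result = []
    for c in text:
        if c in string.ascii_lowercase:
            result.append(chr((ord(c) - ord("a") + shift) % 26 + ord("a")))
        elif c in string.ascii_uppercase:
            result.append(chr((ord(c) - ord("A") + shift) % 26 + ord("A")))
        else:
            result.append(c)
    return "".join(result)

def all_shifts(ciphertext):
    """Breaking the cipher by hand: list all 26 candidate decryptions."""
    return [caesar(ciphertext, -k) for k in range(26)]

secret = caesar("Hello, World!", 3)
print(secret)              # encrypted with a shift of 3
print(caesar(secret, -3))  # decryption is the opposite shift
```

Since there are only 26 possible keys, a human reader can simply scan the output of `all_shifts` for the one line that reads as language.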
algorithms, Cryptography | Tagged: Breaking Ciphers, Caesar, Caesar Cipher, Cipher |
Posted by Steven Pigeon
12/03/2013
Last week we looked at an alternative series to compute $e$, and this week we will have a look at the computation of $e^x$. The usual series we learn in calculus textbooks is given by

$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}.$$

We can factorize the expression as
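As a sketch, and assuming the factorization meant here is the nested (Horner-like) form $e^x \approx 1 + x\left(1 + \frac{x}{2}\left(1 + \frac{x}{3}\left(\cdots\right)\right)\right)$, the truncated series can be evaluated from the inside out with one division and one multiplication per term. The function names and the default of 20 terms are choices made for this example only.

```python
import math

def exp_series(x, terms=20):
    """Direct sum of x**n / n! for n = 0 .. terms-1, using a running term."""
    total = 0.0
    term = 1.0  # x**0 / 0!
    for n in range(terms):
        total += term
        term *= x / (n + 1)
    return total

def exp_nested(x, terms=20):
    """Same truncated series in nested form, evaluated inside out.

    Neither powers of x nor factorials are ever computed explicitly.
    """
    result = 1.0
    for n in range(terms - 1, 0, -1):
        result = 1.0 + (x / n) * result
    return result

for x in (0.5, 1.0, 2.5):
    print(x, exp_series(x), exp_nested(x), math.exp(x))
```

Both functions compute the same truncated sum; the nested form just reorders the arithmetic, which can change how rounding errors accumulate.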
algorithms, C, C-plus-plus, C99, Mathematics | Tagged: convergence, exp, series |
Posted by Steven Pigeon
05/03/2013
Numerical methods are generally rather hard to get right because of error propagation due to the limited precision of floats (or even doubles). This seems to be especially true of methods involving series, where a usually large number of ever-diminishing terms must be added to attain something that looks like convergence. Even fairly simple series such as

$$e = \sum_{n=0}^{\infty} \frac{1}{n!}$$

may be complex to evaluate. First, $n!$ is cumbersome, and $\frac{1}{n!}$ becomes small exceedingly rapidly.
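Both difficulties show up in a naive implementation. The sketch below (an illustration only, with a made-up function name and an arbitrary cap of 30 terms) sidesteps the factorial by updating the term incrementally, and stops as soon as a term is too small to change the running sum in double precision.

```python
import math

def euler_e(max_terms=30):
    """Sum 1/n! with a running term; return the sum and the terms used."""
    total = 0.0
    term = 1.0  # 1/0!
    used = 0
    for n in range(1, max_terms + 1):
        new_total = total + term
        if new_total == total:
            break          # the term vanished below the sum's precision
        total = new_total
        used += 1
        term /= n          # 1/n! obtained from 1/(n-1)!, no factorial needed
    return total, used

value, used = euler_e()
print(value, math.e, used)
```

The early exit illustrates the second point: the terms fall below what a double can resolve next to the running sum long before the cap is reached, so adding more of them in this order changes nothing.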
algorithms, bit twiddling, hacks, Mathematics | Tagged: convergence, e, Euler, series |
Posted by Steven Pigeon
15/01/2013
Briggs’ logarithm (often mistaken for Napier’s logarithm) is such a useful function that most of us don’t really think about it, until we have to. Everyone’s familiar with its properties:


(1)

but suddenly,

What can we do with this last one?
algorithms, Mathematics | Tagged: Briggs, Logarithm, Napier, Numerical Approximation |
Posted by Steven Pigeon
10/07/2012
A few weeks ago, I went to Québec Ouvert Hackathon 3.3, and I was most interested in Michael Mulley’s Open Parliament. One possible addition to the project is to cross-reference entries based not only on the parliament-supplied subject tags but also on the content of the text itself.

One possibility is to learn embeddings on bags of words, using stemmed words rather than raw words to reduce the dimensionality of the one-hot vector (essentially a bitmap where the bit corresponding to a word is set to 1 if the word appears in the bag). So, let us start at the beginning: stemming.
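To illustrate how stemming shrinks the bitmap, here is a small sketch. The suffix-stripping `toy_stem` is a deliberately crude stand-in, not a real stemmer and not the trie-based approach the tags hint at; the suffix list and all function names are assumptions for this example.

```python
import re

# Toy suffix list, assumed for illustration; a real stemmer has many more rules.
SUFFIXES = ("ing", "ed", "es", "s")

def toy_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word

def tokenize(text):
    """Lowercase the text and extract runs of letters."""
    return re.findall(r"[a-z]+", text.lower())

def build_vocabulary(documents):
    """Map each distinct stem to a bit position."""
    stems = sorted({toy_stem(w) for doc in documents for w in tokenize(doc)})
    return {stem: i for i, stem in enumerate(stems)}

def bag_bitmap(text, vocabulary):
    """Bitmap with a 1 wherever the corresponding stem occurs in the text."""
    bits = [0] * len(vocabulary)
    for w in tokenize(text):
        i = vocabulary.get(toy_stem(w))
        if i is not None:
            bits[i] = 1
    return bits

docs = ["The member votes", "Members voted and debated"]
vocab = build_vocabulary(docs)
print(vocab)
print(bag_bitmap("member voting", vocab))
```

The two sample sentences contain seven distinct words but only five distinct stems, so the bitmap is shorter, and "votes", "voted" and "voting" all set the same bit.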
algorithms, C-plus-plus, data structures, machine learning, programming | Tagged: embedding, embeddings, one hot, Open Parliament, stem, stemming, trie |
Posted by Steven Pigeon