LaTeXify C/C++ code snippets

So I’m still writing lecture notes. This time, I need to include somewhat larger pieces of C or C++ code, and \LaTeX environments do not really help me a lot. Some are better than others, but you still have to escape and fancify text yourself. This is laborious and error-prone, and is an obvious target for automation. A script of some sort. The task isn’t overly complicated: highlight keywords, and escape symbols like { } _ and & that make \LaTeX unhappy. This looks like a job for sed.

sed is arguably one of the most perverse Unix commands there is. The documentation is bad, it is mostly counter-intuitive, and it interacts more or less nicely with the shell. So here’s the script:

#!/usr/bin/env bash

# LaTeX command used to render keywords
keyword_typeface=pmb

keywords=( bool break case catch char class const continue delete do double
else enum float for if int long new nullptr private protected public return
short sizeof static struct switch template typedef typeid typename union
unsigned using virtual void volatile while )

while IFS= read -r line
do
    # escape the symbols that interfere with LaTeX: # & _ { }
    # (& is also special to sed: in the replacement it stands for the match)
    line=$(echo "$line" | sed 's/[#&_{}]/\\&/g')

    # wrap each keyword in the typeface command; \b is a GNU sed word boundary
    for key in "${keywords[@]}"
    do
        line=$(echo "$line" | sed "s/\b${key}\b/\\\\${keyword_typeface}{${key}}/g")
    done
    echo "$line"
done < "$1"

Let’s break down the script. First, we have a keyword_typeface that defines the \LaTeX command to use to render keywords. The list of keywords is visibly incomplete, but as I won’t be using every possible C/C++ keyword in my notes, I just put those that I am likely to use.

The first sed command escapes any of the troublesome characters by prefixing them with a backslash. The & in the second part of the substitution command stands for “the match”, while \1 would have stood for the first parenthesized group, and… of course… “the match” and the first group are not the same thing.
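To see the difference between & and \1 in isolation, here is a small sketch. The two helper names, escape_latex and strip_suffix, are made up for this illustration and are not part of the script; the sample strings are borrowed from the C++ snippet further down.

```shell
# escape_latex (hypothetical helper) wraps the first sed command of the script.
# & in the replacement is the whole match, so each special character
# is re-emitted with a backslash in front of it.
escape_latex() {
    sed 's/[#&_{}]/\\&/g'
}

# strip_suffix (hypothetical helper) uses \1 instead: only the first
# parenthesized group is re-emitted, so the "_t" that was matched is dropped.
strip_suffix() {
    sed 's/\([a-z0-9]*\)_t/\1/'
}

printf '%s\n' 'uint64_t & steps' | escape_latex   # prints: uint64\_t \& steps
printf '%s\n' 'uint64_t' | strip_suffix            # prints: uint64
```

The single quotes keep the shell out of the way, so sed receives the expression exactly as written.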

The second sed, in a for loop, merely wraps any of the keywords in the list with the keyword typeface command.
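The keyword loop can be tried on a single line with a much shorter list. This is only a sketch: wrap_keywords is a hypothetical helper, the four keywords are an arbitrary sample, \pmb is hard-coded as the typeface, and GNU sed is assumed because \b is a GNU extension.

```shell
# wrap_keywords (hypothetical helper): applies one substitution per keyword
# to the line given as first argument, then prints the result.
wrap_keywords() {
    line=$1
    for key in const int return while
    do
        # Inside double quotes, \\\\ reaches sed as \\, which sed turns into
        # one literal backslash in front of pmb.
        line=$(printf '%s\n' "$line" | sed "s/\b${key}\b/\\\\pmb{${key}}/g")
    done
    printf '%s\n' "$line"
}

wrap_keywords 'const int x = interp(l);'
# prints: \pmb{const} \pmb{int} x = interp(l);
```

Note that interp is left alone even though it starts with int: the \b anchors only let the pattern match whole words.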

Let’s see how it works. Here’s a bit of C++ code:

size_t binary_search( const std::vector<int> & v,
                      int value,
                      uint64_t & steps,
                      interpolator interp)
  size_t l=0, h=v.size()-1;
  while (l<h)
    size_t p=interp(l,v[l],h,v[h],value);
    if (v[p]<value)
     if (v[p]==value)
       return p;

  return l;

Invoking the script with the file name as argument will produce:

size\_t binary\_search( \pmb{const} std::vector<\pmb{int}> \& v,
                      \pmb{int} value,
                      uint64\_t \& steps,
                      interpolator interp)
  size\_t l=0, h=v.size()-1;
  \pmb{while} (l<h)
    size\_t p=interp(l,v[l],h,v[h],value);
    \pmb{if} (v[p]<value)
     \pmb{if} (v[p]==value)
       \pmb{return} p;

  \pmb{return} l;

Which is rendered as:

[Image: the snippet as typeset by \LaTeX, with the keywords in bold]

* * *

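The script has one potential pitfall: keywords that are parts of other keywords, like const is a part of constexpr. With the \b word-boundary anchors (a GNU sed extension) in the pattern, this is mostly harmless: \bconst\b does not match inside constexpr, since there is no word boundary between the t and the e, so constexpr is simply left alone unless it is itself in the list. Without the anchors, though, the order of the list matters. If const comes first in the list of keywords, then the script will output something like \pmb{const}expr and will later fail to match constexpr. If, however, constexpr comes first in the list, then the script will generate \pmb{\pmb{const}expr} because it will match the whole word constexpr and then match within that word const later on. While \pmb{\pmb{const}expr} is not very beautiful, it renders correctly.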
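A quick way to check how const and constexpr interact is to run the substitution with and without the \b anchors. Both helpers below are hypothetical and exist only for this comparison; GNU sed is assumed.

```shell
# With \b anchors, as in the script: const is only matched as a whole word.
wrap_anchored() {
    sed 's/\bconst\b/\\pmb{&}/g'
}

# Without anchors: const is matched anywhere, including inside constexpr.
wrap_unanchored() {
    sed 's/const/\\pmb{&}/g'
}

printf '%s\n' 'constexpr const int n=4;' | wrap_anchored
# prints: constexpr \pmb{const} int n=4;
printf '%s\n' 'constexpr const int n=4;' | wrap_unanchored
# prints: \pmb{const}expr \pmb{const} int n=4;
```

So, at least with GNU sed and the anchors in place, adding constexpr to the keyword list should be enough to have it highlighted as a whole word.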

2 Responses to LaTeXify C/C++ code snippets

  1. justusc says:

    When including code snippets in my papers, I have found minted to be a wonderful package. See the following link for an overview.

    It avoids the need for scripts and other preprocessing steps, and has several nice syntax coloring schemes.
