YoU CanT MaKE BuBBleSorT FaSTER With ASseMbLY

14/01/2020

In one of the classes I teach, we end up writing assembly language programs. And while I explain the (sometimes very relative) benefits of writing assembly language, I use bubble sort as an example where even carefully crafted assembly language doesn’t mean much: it’s a bad algorithm to start with.

YoU CanT MaKE BuBBleSorT FaSTER With ASseMbLY

Except that it’s not quite true.

Read the rest of this entry »


Fast Exponentiation, revisited

12/11/2019

Quite a while ago, I presented a fast exponentiation algorithm that uses the binary decomposition of the exponent n to perform O(\log_2 n) products to compute x^n.

While discussing this algorithm in class, a student asked a very interesting question: what’s special about base 2? Couldn’t we use another base? Well, yes, yes we can.

Read the rest of this entry »


rdrand

04/11/2019

Last week (at the time of writing, anyway), Ars Technica reported a serious bug in AMD’s implementation of rdrand, an instruction that helps you generate random numbers. Apparently, on (some) Ryzen 3000, 0xfff…ff is “random”.

I recently got an AMD Ryzen 9 3900x, and I wondered if I have the bug as well.

Read the rest of this entry »


(Sub)bit-fields (Coding with fractions of bits, Part II)

13/08/2019

Last week, we used the 6×7×6 palette as an example of very simple fraction-of-a-bit coding1. However, we can generalize still a bit more to allow single field extraction and modification

Read the rest of this entry »


The 6×7×6 palette (Coding with fractions of bits, Part I)

06/08/2019

Remember ye olde dayes when we had to be mindful of the so-called “web safe palette“? Once upon a time, screens could display 24-bits colors, but only 256 at a time in some “hi-res” modes. But that’s not what I’m going to tell you about: I’d rather tell you about the encoding of the palette, and about a somewhat better palette. And also about using fractions of bits for more efficient encodings.

Read the rest of this entry »


Discrete Inversion (Generating Random Sequences XII)

30/07/2019

While this sounds something like a shameful family secret, discrete inversion is only the finite-valued variation on the method of inversion for the generation of random numbers with a given distribution (as I’ve discussed quite a while ago here). The case we’ll consider here is a random variable with few possible outcomes, each with different odds

Read the rest of this entry »


Of small copies and templates.

23/07/2019

Sometimes, it’s the little things that make a big difference. For example, moving small, not machine-sized, pieces of data may be a problem. If it happens to be machine-sized (that is, the size of something like uint32_t), then the copy is readily done by a single (integer) copy. But what if it’s, say, 5 bytes?

The first solution that comes to mind is either a for-loop or a call to a library function like memcpy. The for-loop would give something like:

char * dest = ...
char * src = ...
for (std::size_t i=0;i<5;i++) dest[i]=src[i];

If you’re lucky, the compiler understands the small copy and optimizes. More likely, it will merely unroll the loop and generate something like:

14d4:   0f b6 07    movzx  eax,BYTE PTR [rdi]
14df:   88 04 24    mov    BYTE PTR [rsp],al
14e2:   0f b6 47 01 movzx  eax,BYTE PTR [rdi+0x1]
14e6:   88 44 24 01 mov    BYTE PTR [rsp+0x1],al
14ea:   0f b6 47 02 movzx  eax,BYTE PTR [rdi+0x2]
14ee:   88 44 24 02 mov    BYTE PTR [rsp+0x2],al
14f2:   0f b6 47 03 movzx  eax,BYTE PTR [rdi+0x3]
14f6:   88 44 24 03 mov    BYTE PTR [rsp+0x3],al
14fa:   0f b6 47 04 movzx  eax,BYTE PTR [rdi+0x4]
14fe:   88 44 24 04 mov    BYTE PTR [rsp+0x4],al

Let’s see how we can fix that!

Read the rest of this entry »


Finding dependencies for Make (part II)

12/02/2019

Quite a while ago, I presented my own simple, sed, grep and bash-based make-depend script. It was simple, and somewhat effective, except for the fact that it did not see include-only dependencies nor quote-includes. If you changed a header affecting a lot of things, it would not rebuild, because the make rules were incomplete. Let’s fix this.

Read the rest of this entry »


The very long summer

12/02/2019

The original plan was to come back in september 2018, but it turns out, I was rather busy with everything else. I missed blogging, so I’ve decided to continue, but it will be at a much softer rate of about a post of month, rather than once a week.


Summer Hiatus

14/08/2018

This is my tenth anniversary of blogging. I started back in 2008, and since then there has been 586 posts (including this one), about 1300 comments, 170 followers, 325000 visitors, and 825000 views, with a few posts well over 9000.

Well, that’s not bad, but for now, I must reconsider the time I invested, and will continue to invest in this blog. Posting once a week, even with the summer hiatuses, is becoming a strenuous pace, especially that I also have—many—other things to do. So I was tempted to just end it, and say, well, after ten years, it’s much better than your average two-post blog, let’s call it quits. But the thing is, I enjoy writing about the things I do, the math I work out, and other ideas. But I will likely stop for this month, then be back with once-a-month posting.

‘Till then, it’s summer. Let’s enjoy it.