### dfa.io | about | archive

This is a story about a compiler optimization that didn’t optimize.

My compilers class last semester had a lab focused on optimizations. Our compiler targeted x86_64 assembly, and the optimization in question removed register-register moves. That is, if you have an instruction of the form `mov rbx, rax` (copying rax into rbx), the compiler will modify the generated code, renaming occurrences of rbx after this instruction to rax. This is called register coalescing.
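The idea can be sketched in a few lines of Python. This is a toy, not our actual compiler's IR: instructions are hypothetical `(op, dst, src)` tuples, and it assumes each register is written at most once so the rename is always safe (a real pass would consult the interference graph first).

```python
# Toy register coalescing: drop "mov dst, src" instructions and
# rename later uses of dst to src. Assumes SSA-like code where
# every rename is safe; real compilers must check interference.

def coalesce(instrs):
    out = []
    rename = {}  # dst -> src substitutions discovered so far

    def sub(reg):
        # Follow the rename chain to its root register.
        while reg in rename:
            reg = rename[reg]
        return reg

    for op, *args in instrs:
        args = [sub(a) for a in args]
        if op == "mov" and args[0] != args[1]:
            rename[args[0]] = args[1]   # delete the mov, remember the rename
        elif op == "mov":
            pass                        # mov r, r is a no-op
        else:
            out.append((op, *args))
    return out

prog = [("mov", "rbx", "rax"),   # rbx <- rax
        ("add", "rbx", "rcx"),
        ("mov", "rdx", "rbx")]
print(coalesce(prog))            # [('add', 'rax', 'rcx')]
```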

We had a good bit of trouble implementing it (mostly because of a poorly-thought-out interference graph implementation). After a couple frustrating days, we got it working.

The result? No statistically significant improvement. Dang.

At first glance, maybe this isn’t all that surprising. Movs are supposed to be cheap, right?

But there’s a little bit more to the story. Turns out (I didn’t know this at the time), modern Intel processors have way more than 16 registers. They often have over 100. There’s something called a register file, and the processor maintains a mapping from the architectural x86_64 registers to physical registers. In fact, a register-register mov can become nothing more than an update to that mapping - a pointer rename.
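As a loose software analogy for what the rename hardware is doing (the real mechanism is far more involved, and the names and sizes here are made up):

```python
# Toy model of hardware register renaming. Architectural registers
# are just names; a rename table maps each name to one of many
# physical registers. A reg-reg mov then needs no execution unit:
# the destination name simply points at the source's physical register.

class RenameTable:
    def __init__(self, arch_regs, num_physical=128):
        self.free = list(range(num_physical))
        self.table = {r: self.free.pop() for r in arch_regs}

    def write(self, reg):
        # A genuine write allocates a fresh physical register.
        self.table[reg] = self.free.pop()

    def mov(self, dst, src):
        # "Eliminated" mov: dst now aliases src's physical register.
        self.table[dst] = self.table[src]

rt = RenameTable(["rax", "rbx"])
rt.write("rax")
rt.mov("rbx", "rax")
assert rt.table["rbx"] == rt.table["rax"]  # same physical register
```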

So the optimization that we spent a couple days implementing was already taken care of for us at the hardware level.

Being the impetuous 20-year-old I am, I quickly developed the opinion that register coalescing is dumb. An antiquated optimization, made obsolete by modern hardware.

But a conversation with the OS professor here changed my mind.

In general, he said, simply because you can do something at the hardware level doesn’t mean you should. With the rise of mobile devices and the #internetOfThings, a lot of widely-used processors don’t have much power to spare. If you can compile a binary once and take care of all the register-register moves ahead of time, sometimes that’s worthwhile.

It’s an interesting topic - very cool to think of the compiler as just a small part of a larger whole. The compiler and the hardware together share the responsibility of making code run efficiently - how the work is divided is a complex question.

And more generally, it’s a reminder to think critically about the hardware your code is running on prior to jumping into a nontrivial task. Premature optimization strikes again!

# Arranging keys on my key holder is NP-hard

This past summer, I bought this key-holding gadget. I’m a little paranoid about losing my keys, so I figured it’d be a good way to make sure I don’t lose any of them.

But god forbid you ever have to change what keys you carry with you.

I just bought a USB drive to put on it (I recently had a snafu with Arch and had to chroot in. But I had to go a couple hours without my laptop because I didn’t have a boot USB on me. Never again).

Anyway.

30 minutes into the process of rearranging my keys so that the USB drive would fit on it, I thought to myself: “this has to be NP-hard.”

And indeed it is.

We reduce from PARTITION. (Really, they’re nearly the same problem). Take a second and gaze at the key holder on the linked Amazon page to get a sense for the problem. We want to arrange the keys so that all keys fit on the keyring and there’s no “tangling” of keys in the middle.

Call the length of the key holder $l$.

Suppose I give you a bunch of keys to put on the key holder. Each key is very short - maybe $\frac{l}{3}$. So the length of the keys won’t be an issue - it’s mostly an issue of getting the heights to match. The keys are of varying heights.

Certainly, the key holder needs to have equal height on each end. Additionally, every key must be present (like, why would you even buy one of these things if you’re still going to have keys jangling around?).

So then, if the sum of the heights of the keys is $S$, each side of the key holder must have height $\frac{S}{2}$.

But this is exactly PARTITION, which we know to be NP-complete.

Sigh. Wikipedia says that there’s a pseudo-poly time dynamic programming solution for PARTITION, which has led PARTITION to be called “The Easiest Hard Problem”.
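For what it’s worth, that pseudo-polynomial DP is short. Here’s a sketch with made-up key heights, running in O(n·S) time, where S is the total height - polynomial in the value S, not in the input’s bit length, hence only pseudo-polynomial:

```python
# Pseudo-polynomial DP for PARTITION: can the multiset `heights`
# be split into two halves of equal sum? We build up the set of
# reachable subset sums, capped at S/2, and check whether S/2 is hit.

def can_partition(heights):
    total = sum(heights)
    if total % 2:
        return False
    target = total // 2
    reachable = {0}                      # subset sums seen so far
    for h in heights:
        reachable |= {s + h for s in reachable if s + h <= target}
    return target in reachable

print(can_partition([3, 1, 4, 2, 2]))   # True: {3, 1, 2} vs {4, 2}
print(can_partition([5, 1, 1]))         # False: total is odd
```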

Though to be frank, I don’t know if I consider that much of a consolation.

I stumbled across the following tidbit while reading Gödel, Escher, Bach last night: heterological adjectives! It’s a formulation of Russell’s Paradox, but in the English language.

Suppose I present you with the following two adjectives:

autological: an adjective that describes itself

heterological: an adjective that does not describe itself

For example, ‘short’ is an autological word, but ‘long’ is not. I think my favorite autological word (and these are by far the rarer of the two) is ‘attributive’ - the word itself attributes meaning to its modified noun.

And here’s the brain-teaser of the day. What if we ask the following question: “Is ‘heterological’ heterological?”

If ‘heterological’ is itself heterological, then it certainly describes itself. But that would make ‘heterological’ autological. But we just said it was heterological!

If ‘heterological’ is not heterological, then it is autological (they’re antonyms, after all). So then ‘heterological’ describes itself. So then ‘heterological’ is heterological. Contradiction!

So we wind up with the following: ‘heterological’ is heterological if and only if ‘heterological’ is not heterological.

Just a Wednesday morning’s musings.