2018-05-23

fanf: (Default)

(Monday's notes) (Wednesday's notes) (Thursday's notes) (Epilogue)

Today was a bit weird since I managed to perform some unusual productivity judo on myself.

I started the day feeling very unsure about what direction to take, and worried that the project would be a bust before it had even got properly started.

After I had some coffee and killed a bit of email, I found myself looking through Knot's DNS update code, worrying about how to COWify it. Usually if I am feeling unsure about a project I will put it on the back burner to think about while I do something else. That isn't going to work when I have given myself only one week to see what I can achieve.

Eventually I realised that my only chance of success would be to strictly limit the scope of the project to the qp trie code itself, aiming to make it possible to COWify the DNS code but not touching the DNS code until the trie is able to COW.

(COW = copy-on-write)

I still didn't know how COW would work: the new invariants, the fence posts, the edge cases. But it seemed clear after last night's thoughts that I would need to add some kind of COW flag to the leaf nodes.

Digression: misuse of C

I made two horrible mistakes in my original qp trie code.

First, I used a union to describe how branch and leaf nodes both occupy two words, but with different layouts. It is really hard to use a union in C and not lose a fight with the compiler over the strict aliasing rules.

Second, I used bitfields to describe how one of the branch words is split up into subfields. Bitfields have always been a portability nightmare. I was using them to describe the detailed specifics of a memory layout, which does not work for portable code.

I was aware at the time that these were really bad ideas, but I wanted a reasonably explicit and lightweight notation for accessing the parts of the data structure while I was getting it working. And I never got round to correcting this short-term hack.

Refactoring

So today I spent a few hours starting to replace this dubious C with something more explicit and less likely to confuse the compiler.

As a side-effect, it will make it possible to stash an extra COW flag into leaf nodes.

Where to bung a COW?

When I started refactoring I didn't know how I would use the COW flag.

It takes me about 25 minutes to walk between home and the beer festival, and today that was really useful thinking time.

My thoughts oscillated between different possible semantics for the COW flag (most of them clearly broken) and I wasn't sure I could make it work. (At worst I might finish the week with a bit of code cleanup...)

This evening I think I came up with something that will work, and that will justify the refactoring. The problem I have been struggling with is that the existing qp trie structure puts the metadata about a structure next to the pointer to the structure, but the COW flag is all about whether a structure is shared or not, so it really demands to be put in the structure itself.

When you have trie A and its COW-clone trie B, if you put the COW flags in the pointers you end up with different copies of the pointers and flags in A and B. Then you modify B some more, which means you need to update a flag, but the flag you need to update is in A and you don't have a quick way to locate it. Gah!

Tomorrow

My aim is to hammer through the refactoring, and think about the details of how to use this COW flag, and what the API might look like for COWing the application data structure - mainly the DNS stuff in the case of Knot, but the hooks have to be general-purpose. (Knot uses qp tries for a lot more than just DNS zones.)

fanf: (Default)

(Monday's notes) (Tuesday's notes) (Thursday's notes) (Epilogue)

Today was productive, and I feel I'm over the hump of the project - though I fear I won't get to a good stopping point this week.

Refactoring

As I hoped, I managed to finish the refactoring. Or, to be precise, I got it to the point of compiling cleanly and crashing messily.

My refactoring approach this week has been to hack in haste and debug at leisure. Hopefully not too much leisure :-)

A lot of the stripping out of unions and bitfields was fairly mechanical, but I also took the opportunity to simplify some of the internal interfaces. I also changed some of the other data representations. I hope this doesn't turn out to be foolishly lacking in refactoring discipline!

Nibbles

The qp trie code selects a child of a branch based on a nibble somewhere in the key string. A good representation of "somewhere" is pretty important.

My original qp trie code represented indexes into keys as a pair of a byte index and some flags that selected the nibble in that byte. This turned out to be pretty good when I did the tricky expansion from 4 bit to 5 bit nibbles. However, knowledge of this detail was smeared all through the code.

In this week's refactoring I've tried unifying the byte+flags into a single nibble index. Overall I think this has turned out to be simpler, and my vague handwavy feeling is that the code should compile down to about the same instructions. (If you set NDEBUG to switch all the new asserts off, that is!)

COW pushdown

I'm fairly confident now that I have a good idea of how copy-on-write will work. This afternoon I wrote a 700 word summary of the COW states and invariants - the task now (well, after the debugging) is to turn the prose into code.

Page generated 2019-04-20 00:49
Powered by Dreamwidth Studios