2006-11-01


So last week I wrote about some simple "desugaring" transformations that turn a fairly conventional block-structured programming language into the call-by-value lambda calculus with some small extensions. The main problem is that I relied on first-class continuations. This has at least two undesirable consequences:

  • First-class continuations are the goto of functional programming. Actually, that understates the amount of unconstrained tangle that they can unleash - they are far more powerful and dangerous than goto. So one can argue that programmers are better off avoiding first-class continuations and using more constrained control flow, much like we are advised to shun goto in favour of break, return, and exceptions.
  • Continuations can't easily be implemented with a conventional linear stack. When you return a normal value up the stack, the top stack frame(s) can be freed, but if you return a continuation up the stack, they cannot: calling the continuation re-activates these frames. The stack becomes a cactus. A common solution to this problem is to allocate activation frames on the heap, but this seriously stresses the garbage collector and harms locality: whereas a stack works up and down over the same small area of memory, the heap blasts forever upwards.

Fortunately there's a nice solution: multi-return function calls. The idea is to separate continuation arguments from normal arguments, and restrict the continuations so that activation frames are still used in a normal stack-like LIFO manner. What's really neat is that it preserves a lot of the power of first-class continuations.
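As a rough illustration (my own sketch, not from the multi-return papers), here is the flavour of the idea in Python, with the possible return points passed as explicit function arguments. The names `find`, `found`, and `missing` are invented for the example:

```python
# Hypothetical sketch: a multi-return function takes its possible
# return points as explicit arguments instead of capturing them
# with call/cc. `find` returns through `found` on success and
# through `missing` otherwise; the caller supplies a different
# code path for each outcome.

def find(items, target, found, missing):
    for i, item in enumerate(items):
        if item == target:
            return found(i)       # first return point
    return missing()              # second return point

result = find(["a", "b", "c"], "b",
              found=lambda i: f"index {i}",
              missing=lambda: "not found")
print(result)  # index 1
```

In the real multi-return discipline these return points are restricted so they can only be invoked, not stored or returned, which is what keeps activation frames strictly LIFO.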

However, the desugaring transforms become more complicated. The scope of the multi-return continuations is restricted so that you cannot (for example) return from a function by invoking its continuation from an inner block, because the continuation is not available there. Instead you have to add plumbing to pass the early-return continuation into the block along with the block's normal sequential continuation. This plumbing ends up being rather like the standard continuation-passing-style transformation described in the lambda-the-ultimate papers linked to above, but with an extra continuation for each early exit.
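The extra plumbing can be sketched like this (a hand-written illustration, not the output of any real desugaring pass; `body`, `ret`, and `k` are invented names):

```python
# Hypothetical illustration of the plumbing: an inner block is desugared
# into a function taking both its normal sequential continuation `k` and
# the enclosing function's early-return continuation `ret`. Calling `ret`
# abandons the rest of the block; calling `k` continues in sequence.

def body(x, ret, k):
    if x < 0:
        return ret("negative")   # early return from the whole function
    return k(x * 2)              # fall through to the next statement

def f(x):
    # The block cannot see f's continuation directly, so f must pass
    # an early-return continuation in alongside the sequential one.
    return body(x, ret=lambda v: v, k=lambda v: f"doubled to {v}")

print(f(-1))  # negative
print(f(3))   # doubled to 6
```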

This extra-continuation plumbing is also similar to a standard implementation technique for exceptions (which I called "non-trivial" last week, but isn't actually hard to understand). As well as the sequential continuation that is passed around in CPS, you pass around an exception handler continuation which is invoked in order to throw an exception. This is very similar to implementing exceptions in C with setjmp and longjmp, except in C you can keep the current exception handler in a global variable instead of passing it around.
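A minimal sketch of that exception technique, again with invented names (`div`, `h` for the handler continuation, `k` for the sequential one):

```python
# Hypothetical sketch of exceptions in CPS: every call threads a handler
# continuation `h` alongside the normal continuation `k`. "Throwing" is
# just a call to `h`, and installing a new handler means passing a
# different `h` down into the callee.

def div(a, b, h, k):
    if b == 0:
        return h("division by zero")  # throw: invoke the handler
    return k(a / b)                   # normal return

def compute(a, b):
    return div(a, b,
               h=lambda msg: f"caught: {msg}",
               k=lambda v: f"result: {v}")

print(compute(6, 3))  # result: 2.0
print(compute(6, 0))  # caught: division by zero
```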

The downside of these techniques is that the plumbing is an overhead. However, it is "pay-as-you-go", in that you only need the plumbing if you use early exits from blocks. (It's less easy to avoid the exception plumbing since exceptions can usually be thrown anywhere.) By contrast, the more complicated infrastructure needed for first-class continuations affects the performance of the whole run-time system.

(The mathematics in the typing of these constructions relates directly to their balance of practicality and power. Constrained continuations like multi-return functions and exceptions are usually used linearly and have an intuitionistic (constructive) typing. Call-with-current-continuation, however, has a classical typing which is not expressible in most practical type systems. This kind of direct relationship between theory and practice is one of the beauties of computer science.)
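To make the classical-typing remark concrete: in a typed language with call/cc (e.g. SML/NJ's `callcc`), its type read through the Curry-Howard correspondence is Peirce's law, a classically valid formula that is not intuitionistically provable:

```latex
% callcc : ((a -> b) -> a) -> a, which corresponds to Peirce's law:
\[
  ((A \to B) \to A) \to A
\]
```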


At the end of http://fanf.livejournal.com/65911.html I mentioned that it might be beneficial to have multiple queues, in order to reduce the density of garbage in the older parts of the queue. There are at least a couple of other reasons why one might want multiple queues.

Even faster

In the absence of any other bottlenecks, the MTA is going to be limited by the rate that it can fsync the main queue. You can raise this limit if you have two parallel queues on different disks, and spread the load of incoming messages between them. I got this idea from the Sendmail X design document which describes a similar (but not quite so neat) queue structure to the one I have been describing.

TURN

SMTP's TURN feature allows a client to ask the server to deliver any email queued on the server for a particular domain. This might be used by business dial-up customers who call in to their ISP every so often to collect email.

There are two variants of TURN in the current specifications, ETRN and ATRN, because the original form was insecure. With ETRN the server delivers the queued messages as if from a normal queue run; the only security concern is that the server must have some throttling to prevent clients from starting unlimited numbers of queue runners. With ATRN the existing connection is used to deliver the messages from the server to the client: the two sides switch roles after the ATRN command. The client must be authenticated so that the server knows the client is permitted to receive the email it is asking for. ATRN is used on the "on-demand mail relay" port 366 instead of the usual SMTP port.

The basic implementation of ETRN (common to sendmail, Exim, and older Postfix) is for the server to fire off sendmail -qR, which scans the entire queue for messages with recipient addresses containing the domain given by the client. This is horribly inefficient if you have lots of messages on the queue, and the more clients you have using ETRN the more your queues get clogged with undeliverable messages.

The solution to this problem is to get messages for your ETRN domains off the queue; with Exim this is typically done by delivering them to a batch-SMTP file per domain, which can then be re-injected for delivery fairly efficiently when the client says ETRN. This kind of setup is a must for ATRN: whereas ETRN uses normal SMTP routing and delivery (which works if the clients have static IP addresses), ATRN does not, so there is generally no way to deliver the messages except by ATRN. ETRN is an optimisation to allow clients to tell the server not to wait for a retry timeout, whereas ATRN is purely on-demand so it is actually wrong to leave the messages on the retry queue.

With my log-structured queues we can use this idea but do it more efficiently. When a message is addressed to an ATRN domain, or cannot be delivered to an ETRN domain, its envelope is written to that domain's queue instead of appended to the main queue. One thing that makes this slightly more interesting is that the envelope may have to be split if it has recipients at multiple domains. This introduces the requirement for some kind of reference counting of spool files. The per-domain queue files are not routinely scanned, and when a client requests a delivery the server can simply and efficiently work through its queue.
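The envelope split and the reference counting it forces can be sketched as follows (my own illustration; `ETRN_DOMAINS`, `split_envelope`, and the in-memory `refcounts` dict are invented stand-ins for the real per-domain configuration and on-disk bookkeeping):

```python
# Hypothetical sketch of envelope splitting: recipients are grouped by
# destination queue, one envelope is emitted per queue, and the shared
# spool file's reference count records how many envelopes still point
# at it (so the file is only deleted when the count drops to zero).

from collections import defaultdict

ETRN_DOMAINS = {"example.net"}          # assumed per-domain queue config

def queue_for(rcpt):
    domain = rcpt.rsplit("@", 1)[1]
    return domain if domain in ETRN_DOMAINS else "main"

def split_envelope(spool_id, rcpts, refcounts):
    by_queue = defaultdict(list)
    for r in rcpts:
        by_queue[queue_for(r)].append(r)
    # one envelope per queue, all sharing the same spool file
    refcounts[spool_id] = len(by_queue)
    return {q: (spool_id, rs) for q, rs in by_queue.items()}

refs = {}
envs = split_envelope("msg-001",
                      ["a@example.com", "b@example.net"], refs)
print(sorted(envs))      # ['example.net', 'main']
print(refs["msg-001"])   # 2
```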

Postfix leaves ETRN messages on its "deferred" queue, but optimises ETRN by keeping per-domain indexes of messages. This has the advantage of avoiding the need to split messages per domain, and means that ETRN domains are still retried even if the client doesn't ask. However it means that the deferred queue can get large and normal queue runs can get expensive. We should also periodically retry ETRN domains, but this won't happen unless they have messages on the main queue. To deal with that we should periodically probe these domains, which does not need any disk activity if the domain remains unreachable.

