fanf: (Default)
[personal profile] fanf
Shortly before my wedding there was a discussion on the Exim-users mailing list about Exim's handling of its hints databases, which cache information about the retry status of remote hosts and messages in the queue, callout results, ratelimit state, etc. At the moment Exim just uses a standard Unix DB library for this, e.g. GDBM, with whole-file locking to protect against concurrent access from multiple Exim processes.

There are two disadvantages with this: firstly, the performance isn't great because Exim tends to serialize on the locks, and the DB lock/open/fsync/close/unlock cycle takes a while, limiting the throughput of the whole system; secondly, if you have a cluster of mail servers the information isn't shared between them, so each machine has to populate its own database which implies poor information (e.g. an incomplete view of clients' sending rates) and duplicated effort (e.g. repeated callouts, wasted retries).

The first problem is a bit silly, because the databases are just caches and can be safely deleted, so they don't need to be on persistent storage. In fact some admins mount a ram disk on Exim's hints db directory which avoids the fsync cost and thereby ups the maximum throughput. If you go a step further, you can take the view that Exim is to some extent using the DB as an IPC mechanism.

The traditional solution to the second problem is to slap a standard SQL DB on the back-end, but if you do this the SQL DB becomes a single point of failure. This is bad given that a system like ppswitch which is a cluster of identical machines that relay email does not currently have a SPOF. It also compounds the excessive persistence silliness.

It occurs to me that what I want is something like Splash, which is a distributed masterless database, which uses the Spread toolkit to reliably multicast messages around the cluster. Wonderful! The hard work has already been done for me, so all I need to do is overhaul Exim's hints DB layer for the first time in 10 years - oh, and get a load of other stuff off the top of my to-do list first.

If it's done properly it should greatly improve the ratelimit feature on clustered systems, and make it much easier to write a high-quality greylisting implementation. (A BALGE implementation is liable to cause too many operational problems to be acceptable to us.) It should also be good for Exim even in single-host setups, by avoiding the hints lock bottleneck. The overhaul of the hints DB layer will also allow Exim to make use of other more sophisticated databases as well as Splash, e.g. standard SQL databases or Unix-style libraries that support multiple-reader / single-writer locking.

Date: 2005-07-26 09:49 (UTC)
From: [identity profile] lusercop.livejournal.com
OK, as I understand it, when a machine comes back, the machine's splash db gets populated from other things on the spread network, so the data replication you mention above. I think there is also some overhead from the spread ring reconfiguring, but as far as I recall, from what Ben said, the problem is the rate of data you inject into spread. I don't know how reliable the data replication is, but you might as well keep the hints data around if you can. Certainly it would be nice if splash got fixed, because it would mean that splashcache actually became useful for SSL session-key cacheing in clusters again, and I think Ben might possibly be doing his "it's open source, someone else can fix it" kind of attitude :-/

December 2025

S M T W T F S
 123456
78910111213
14151617181920
21222324 252627
28293031   

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags
Page generated 2025-12-30 21:52
Powered by Dreamwidth Studios