[personal profile] fanf
Five years ago, when I was working with adns, one of the things I played with a bit was a Perl wrapper. adns is absolutely fantastic for bulk log processing: being able to do more than 10,000 concurrent queries, so that you're using all your CPU and not blocking on the network, is a godsend. However, C makes this more painful than it ought to be.

I never finished the Perl wrapper because other things became more important, and when I next had the time and the inclination to look at it, Net::DNS existed, so I thought there would be little point.

I've been paying gradually more and more attention to SpamAssassin recently, and it uses Net::DNS's background query feature to run all its DNS queries concurrently with its pattern matching. As a result of this I've found out that Net::DNS's background query handling is utterly stupid: it uses a separate socket for each query, rather than stuffing them all down the same socket and using the DNS protocol's query ID field to tie responses to queries.
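To make the single-socket idea concrete, here is a minimal Python sketch of tying responses to queries by the 16-bit ID in the DNS header. It is not how adns or Net::DNS is implemented; the names `build_query`, `QueryMux`, `new_query` and `match_response` are invented for illustration, and the sequential ID allocation is a simplification (a real resolver should pick unpredictable IDs to resist spoofing).

```python
import struct


def build_query(query_id, name, qtype=1):
    """Encode a minimal DNS query: RD flag set, one question, class IN."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


class QueryMux:
    """Track many outstanding queries that share one socket (hypothetical helper)."""

    def __init__(self):
        self.pending = {}  # query ID -> name we asked about
        self.next_id = 0

    def new_query(self, name):
        """Allocate a free ID, remember the query, and return the packet to send."""
        if len(self.pending) >= 0x10000:
            raise RuntimeError("all 65536 query IDs are in flight")
        # Sequential allocation keeps the sketch simple; it is not spoof-resistant.
        while self.next_id in self.pending:
            self.next_id = (self.next_id + 1) & 0xFFFF
        query_id = self.next_id
        self.next_id = (query_id + 1) & 0xFFFF
        self.pending[query_id] = name
        return build_query(query_id, name)

    def match_response(self, packet):
        """Return the name a response belongs to, or None if it is unexpected."""
        if len(packet) < 12:  # shorter than a DNS header
            return None
        (query_id,) = struct.unpack("!H", packet[:2])
        return self.pending.pop(query_id, None)
```

In use, every packet returned by `new_query` would go out through one `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` with `sendto`, and every datagram read back from that same socket would be passed to `match_response`. The cost per outstanding query is then one dictionary entry rather than one file descriptor, with the 16-bit ID space as the ceiling.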

This causes excessive resource usage, which greatly restricts the number of concurrent queries it can handle, even on a sensible OS. On Windows it dies if the concurrency goes above about 350, which occasionally happens with SpamAssassin: see http://bugzilla.spamassassin.org/show_bug.cgi?id=3924

So now I have the bit between my teeth. Must f1xx0r!

Probably because of Bind braindamage

Date: 2004-12-07 16:05 (UTC)
From: [identity profile] illiterat.livejournal.com

I presume you are talking about TCP sockets? I've done some experiments toward having a decent LGPL resolver and DNSD, so I can probably save you some hair pulling. The reason it uses one socket per query is that bind starts the TCP connection timeout when you connect, and again for each query it parses. And it does them synchronously (of course).

This means that if you send four queries down the TCP socket, and bind takes longer than the TCP connection timeout to process the second one (the easiest way to simulate this is a delegation to a single server that isn't pingable), then it will "time out" the entire TCP connection, so you then need to re-send the last two queries (and all the other queries will have had to wait for the timeout).

I have a couple of ideas for how to do this sanely in my code, but the obvious one relies on Vstr so you can copy in O(1) time. Another idea is to make sure I only do large numbers of TCP requests against a half-decent dnsd. Neither of these is really open to SA :).
