Task #19 — OpenDBX 1.4.0 + PowerDNS issue
Attached to Project — PDNS OpenDBXLast edited by Norbert Sendetzky (nose) - Friday, 7 Aug 2009, 1:23pm
Bug Report | Low | ||
Library | Normal | ||
Unconfirmed | 1.0.x | ||
Norbert Sendetzky (nose) | Undecided | ||
BSD | ![]() |
||
I'm running powerdns + opendbx + pgsql(backend). After upgrading OpenDBX from 1.2.6 to 1.4.0 the following error occurs: Mar 31 19:56:52 lan pdns[63064]: AXFR of domain 'no-substitute.net' initiated by 213.173.243.11 Mar 31 19:56:52 lan pdns[63064]: [OpendbxBackend] execStmt: Unable to execute query - could not send data to server: Operation not permitted Mar 31 19:56:52 lan pdns[63064]: AXFR of domain 'no-substitute.net' failed: not authoritative This never happens with 1.2.6. It appears the connection to the db is lost or something cause a 'dig -t axfr domain.tld @name.server' doesn't give me anything most of the time, but after repeating the command a dozen or so times, suddenly performs normally. I've noticed it only happens when I trigger an AXFR query. Whenever I query for NS, SOA or any other types it doesn't trigger it. Since I only have records in the pgsql database I haven't tried any other OpenDBX backend. I hope this helps in tracing this one down. Cheers, Ralf |
Comments (13) | Attachments (0) | Related Tasks (0/0) | Notifications (1) | Reminders (0) | History |

Hi Ralf
Could you please enable query-logging=yes in your pdns.conf to see what are the last 2-3 statements before the error occurs? Also, I would like to ask you to execute these statements using psql from the machine running the PowerDNS daemon to eliminate the possibility of other errors.
Thanks
Norbert

query-logging is on, but for some reason it doesn't show anything in the logs with the opendbx backend. Whenever I switch to the pgsql backend my log fills up with statements though. SO for some reason the opendbx backend doesn't log the queries.
I did run the queries in question directly using psql from the command line, and that worked fine.
A note: it doesn't always fail. Sometimes AXFR works without any issues, but if it doesn't it takes a lot of retries to make it work again.

The OpenDBX backend logs queries using the INFO loglevel which should relate to loglevel=8 in pdns.conf. To be sure, you can use loglevel=9 (DEBUG) to log anything. Could you please retry with this setting?
When it fails, it's important to know what happend before.

Is there something to the PostgreSQL server logs written when the error occurs?

This shows sometimes, but not always when the problem occurs:
<powerdns%postgres 2009-04-04 20:18:39 CEST> LOG: could not receive data from client: Operation not permitted
<powerdns%postgres 2009-04-04 20:18:39 CEST> LOG: unexpected EOF on client connection

I've set log-level=9 in pdns.conf, but no statements to be found anywhere in my log. query-logging=yes is also set.
I do see additional stuff like this now:
Apr 5 12:35:27 lan pdns[32231]: [OpendbxBackend] Database connection (read) to '127.0.0.1' succeeded
Apr 5 12:35:27 lan pdns[32231]: [OpendbxBackend] Database connection (write) to '127.0.0.1' succeeded

Here's what happens after a couple of hours:
[14:26][root@lan ~]# dig -t axfr cainites.net @192.168.1.2
; <<>> DiG 9.4.2-P2 <<>> -t axfr cainites.net @192.168.1.2
;; global options: printcmd
; Transfer failed.
[14:26][root@lan ~]# dig -t aaaa cainites.net @192.168.1.2
; <<>> DiG 9.4.2-P2 <<>> -t aaaa cainites.net @192.168.1.2
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62954
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;cainites.net. IN AAAA
;; ANSWER SECTION:
cainites.net. 86400 IN AAAA 2001:888:1deb::7
;; Query time: 6 msec
;; SERVER: 192.168.1.2#53(192.168.1.2)
;; WHEN: Sun Apr 5 14:26:39 2009
;; MSG SIZE rcvd: 58
[14:26][root@lan ~]# dig -t axfr cainites.net @192.168.1.2
; <<>> DiG 9.4.2-P2 <<>> -t axfr cainites.net @192.168.1.2
;; global options: printcmd
; Transfer failed.
From the /var/log/messages file:
Apr 5 14:26:35 lan pdns[32231]: AXFR of domain 'cainites.net' failed: not authoritative
Apr 5 14:26:35 lan pdns[32231]: AXFR of domain 'cainites.net' initiated by 192.168.1.2
Apr 5 14:26:35 lan pdns[32231]: [OpendbxBackend] execStmt: Unable to execute query - could not send data to server: Operation
not permitted
Apr 5 14:26:35 lan pdns[32231]: AXFR of domain 'cainites.net' failed: not authoritative
Apr 5 14:26:45 lan pdns[32231]: AXFR of domain 'cainites.net' initiated by 192.168.1.2
Apr 5 14:26:45 lan pdns[32231]: [OpendbxBackend] execStmt: Unable to execute query - could not send data to server: Operation
not permitted
Apr 5 14:26:45 lan pdns[32231]: AXFR of domain 'cainites.net' failed: not authoritative
OpenDBX 1.2.6 never showed this behaviour to my knowledge.

Which version of PowerDNS do you use?
The only change in pgsql_odbx_result was
- switch( select( fd + 1, &fds, NULL, NULL, timeout ) )
+ while( ( err = select( fd + 1, &fds, NULL, NULL, timeout ) ) < 0
+ switch( err )
to retry on interrupted select() system call.

2.9.22
I've also added some patches from svn, but none of those affect the backends (except maybe sqlite2/3).

I've tried PowerDNS 2.9.22 with OpenDBX backend myself and query-logging works as expected. My system (Debian testing) logs to /var/log/daemon.log. Could you send me your complete configuration files (pdns.conf, pdns.d/* if available)?
Which version of PostgreSQL are you using? I guess 8.3

I'm using PostgreSQL 8.3.7 on FreeBSD. Will send you my pdns.conf ASAP.
(Sorry for the long delay... I got the keys to my new home and a lot of work is needs to be done before I can move my stuff and myself in there).

Norbert,
I've changed my pdns.conf to use /tmp instead of 127.0.0.1 for the read and write connection and this has solved my issue. Apparently it's all caused by the connection timing out (at least that's what I'm guessing) and when a query is sent reconnecting to the db doesn't work as it's supposed to. Whether this is an OpenDBX or PostgreSQL issue I do not know since I'm not familiar with either API's or specs.

That sounds like a reasonable explanation but shouldn't happen according to the OpenDBX and PDNS opendbxbackend code. Are you willing to debug this issue further? I would send you a patch with would add log output to OpenDBX and PDNS so we could track this down.
When using the PDNS gpgsql backend, do you notice sometimes informational messages in your log file that a backend thread is started again? This must be the case if connections ran into timeouts.
Could you replace OpenDBX 1.4.0 by 1.2.6 and leave your current installation as is to find out if the problem is really related to OpenDBX?
The error reported by the OpenDBX library to PDNS opendbxbackend could be a warning instead of an error, so the opendbxbackend doesn't reestablish the connection.