PDNS OpenDBX

for Project:

PowerDNS OpenDBX backend

Register as a new user

Task #19 — OpenDBX 1.4.0 + PowerDNS issue

Attached to Project — PDNS OpenDBX
Opened by Ralf van der Enden (RvdE) - Thursday, 2 Apr 2009, 8:46pm
Last edited by Norbert Sendetzky (nose) - Friday, 7 Aug 2009, 1:23pm
Bug Report Low
Library Normal
Unconfirmed 1.0.x
Norbert Sendetzky (nose) Undecided
BSD 0% complete
I'm running powerdns + opendbx + pgsql(backend). After upgrading OpenDBX from 1.2.6 to 1.4.0 the following error occurs:

Mar 31 19:56:52 lan pdns[63064]: AXFR of domain 'no-substitute.net' initiated by 213.173.243.11
Mar 31 19:56:52 lan pdns[63064]: [OpendbxBackend] execStmt: Unable to execute query - could not send data to server: Operation not permitted
Mar 31 19:56:52 lan pdns[63064]: AXFR of domain 'no-substitute.net' failed: not authoritative

This never happens with 1.2.6.

It appears the connection to the db is lost or something cause a 'dig -t axfr domain.tld @name.server' doesn't give me anything most of the time, but after repeating the command a dozen or so times, suddenly performs normally. I've noticed it only happens when I trigger an AXFR query. Whenever I query for NS, SOA or any other types it doesn't trigger it.

Since I only have records in the pgsql database I haven't tried any other OpenDBX backend.

I hope this helps in tracing this one down.

Cheers,
Ralf

Comments (13) | Attachments (0) | Related Tasks (0/0) | Notifications (1) | Reminders (0) | History |

Comment by Norbert Sendetzky - Thursday, 2 Apr 2009, 10:02pm

Hi Ralf

Could you please enable query-logging=yes in your pdns.conf to see what are the last 2-3 statements before the error occurs? Also, I would like to ask you to execute these statements using psql from the machine running the PowerDNS daemon to eliminate the possibility of other errors.

Thanks


Norbert

Comment by Ralf van der Enden - Friday, 3 Apr 2009, 12:10pm

query-logging is on, but for some reason it doesn't show anything in the logs with the opendbx backend. Whenever I switch to the pgsql backend my log fills up with statements though. SO for some reason the opendbx backend doesn't log the queries.

I did run the queries in question directly using psql from the command line, and that worked fine.

A note: it doesn't always fail. Sometimes AXFR works without any issues, but if it doesn't it takes a lot of retries to make it work again.

Comment by Norbert Sendetzky - Saturday, 4 Apr 2009, 11:50am

The OpenDBX backend logs queries using the INFO loglevel which should relate to loglevel=8 in pdns.conf. To be sure, you can use loglevel=9 (DEBUG) to log anything. Could you please retry with this setting?

When it fails, it's important to know what happend before.

Comment by Norbert Sendetzky - Saturday, 4 Apr 2009, 12:03pm

Is there something to the PostgreSQL server logs written when the error occurs?

Comment by Ralf van der Enden - Sunday, 5 Apr 2009, 12:04pm

This shows sometimes, but not always when the problem occurs:

<powerdns%postgres 2009-04-04 20:18:39 CEST> LOG: could not receive data from client: Operation not permitted
<powerdns%postgres 2009-04-04 20:18:39 CEST> LOG: unexpected EOF on client connection

Comment by Ralf van der Enden - Sunday, 5 Apr 2009, 12:37pm

I've set log-level=9 in pdns.conf, but no statements to be found anywhere in my log. query-logging=yes is also set.

I do see additional stuff like this now:

Apr 5 12:35:27 lan pdns[32231]: [OpendbxBackend] Database connection (read) to '127.0.0.1' succeeded
Apr 5 12:35:27 lan pdns[32231]: [OpendbxBackend] Database connection (write) to '127.0.0.1' succeeded

Comment by Ralf van der Enden - Sunday, 5 Apr 2009, 2:28pm

Here's what happens after a couple of hours:

[14:26][root@lan ~]# dig -t axfr cainites.net @192.168.1.2

; <<>> DiG 9.4.2-P2 <<>> -t axfr cainites.net @192.168.1.2
;; global options: printcmd
; Transfer failed.
[14:26][root@lan ~]# dig -t aaaa cainites.net @192.168.1.2

; <<>> DiG 9.4.2-P2 <<>> -t aaaa cainites.net @192.168.1.2
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62954
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;cainites.net. IN AAAA

;; ANSWER SECTION:
cainites.net. 86400 IN AAAA 2001:888:1deb::7

;; Query time: 6 msec
;; SERVER: 192.168.1.2#53(192.168.1.2)
;; WHEN: Sun Apr 5 14:26:39 2009
;; MSG SIZE rcvd: 58

[14:26][root@lan ~]# dig -t axfr cainites.net @192.168.1.2

; <<>> DiG 9.4.2-P2 <<>> -t axfr cainites.net @192.168.1.2
;; global options: printcmd
; Transfer failed.


From the /var/log/messages file:
Apr 5 14:26:35 lan pdns[32231]: AXFR of domain 'cainites.net' failed: not authoritative
Apr 5 14:26:35 lan pdns[32231]: AXFR of domain 'cainites.net' initiated by 192.168.1.2
Apr 5 14:26:35 lan pdns[32231]: [OpendbxBackend] execStmt: Unable to execute query - could not send data to server: Operation
not permitted
Apr 5 14:26:35 lan pdns[32231]: AXFR of domain 'cainites.net' failed: not authoritative
Apr 5 14:26:45 lan pdns[32231]: AXFR of domain 'cainites.net' initiated by 192.168.1.2
Apr 5 14:26:45 lan pdns[32231]: [OpendbxBackend] execStmt: Unable to execute query - could not send data to server: Operation
not permitted
Apr 5 14:26:45 lan pdns[32231]: AXFR of domain 'cainites.net' failed: not authoritative

OpenDBX 1.2.6 never showed this behaviour to my knowledge.

Comment by Norbert Sendetzky - Sunday, 5 Apr 2009, 3:14pm

Which version of PowerDNS do you use?

The only change in pgsql_odbx_result was
- switch( select( fd + 1, &fds, NULL, NULL, timeout ) )
+ while( ( err = select( fd + 1, &fds, NULL, NULL, timeout ) ) < 0
+ switch( err )
to retry on interrupted select() system call.

Comment by Ralf van der Enden - Sunday, 5 Apr 2009, 3:16pm

2.9.22

I've also added some patches from svn, but none of those affect the backends (except maybe sqlite2/3).

Comment by Norbert Sendetzky - Monday, 6 Apr 2009, 11:11pm

I've tried PowerDNS 2.9.22 with OpenDBX backend myself and query-logging works as expected. My system (Debian testing) logs to /var/log/daemon.log. Could you send me your complete configuration files (pdns.conf, pdns.d/* if available)?

Which version of PostgreSQL are you using? I guess 8.3

Comment by Ralf van der Enden - Wednesday, 8 Apr 2009, 6:53pm

I'm using PostgreSQL 8.3.7 on FreeBSD. Will send you my pdns.conf ASAP.

(Sorry for the long delay... I got the keys to my new home and a lot of work is needs to be done before I can move my stuff and myself in there).

Comment by Ralf van der Enden - Tuesday, 21 Apr 2009, 9:52pm

Norbert,

I've changed my pdns.conf to use /tmp instead of 127.0.0.1 for the read and write connection and this has solved my issue. Apparently it's all caused by the connection timing out (at least that's what I'm guessing) and when a query is sent reconnecting to the db doesn't work as it's supposed to. Whether this is an OpenDBX or PostgreSQL issue I do not know since I'm not familiar with either API's or specs.

Comment by Norbert Sendetzky - Wednesday, 22 Apr 2009, 12:29am

That sounds like a reasonable explanation but shouldn't happen according to the OpenDBX and PDNS opendbxbackend code. Are you willing to debug this issue further? I would send you a patch with would add log output to OpenDBX and PDNS so we could track this down.

When using the PDNS gpgsql backend, do you notice sometimes informational messages in your log file that a backend thread is started again? This must be the case if connections ran into timeouts.

Could you replace OpenDBX 1.4.0 by 1.2.6 and leave your current installation as is to find out if the problem is really related to OpenDBX?

The error reported by the OpenDBX library to PDNS opendbxbackend could be a warning instead of an error, so the opendbxbackend doesn't reestablish the connection.