Task #19 — OpenDBX 1.4.0 + PowerDNS issue
Attached to Project — PDNS OpenDBXLast edited by Norbert Sendetzky (nose) - Friday, 7 Aug 2009, 1:23pm
| Bug Report | Low | ||
| Library | Normal | ||
| Unconfirmed | 1.0.x | ||
| Norbert Sendetzky (nose) | Undecided | ||
| BSD | |||
|
I'm running powerdns + opendbx + pgsql(backend). After upgrading OpenDBX from 1.2.6 to 1.4.0 the following error occurs: Mar 31 19:56:52 lan pdns[63064]: AXFR of domain 'no-substitute.net' initiated by 213.173.243.11 Mar 31 19:56:52 lan pdns[63064]: [OpendbxBackend] execStmt: Unable to execute query - could not send data to server: Operation not permitted Mar 31 19:56:52 lan pdns[63064]: AXFR of domain 'no-substitute.net' failed: not authoritative This never happens with 1.2.6. It appears the connection to the db is lost or something cause a 'dig -t axfr domain.tld @name.server' doesn't give me anything most of the time, but after repeating the command a dozen or so times, suddenly performs normally. I've noticed it only happens when I trigger an AXFR query. Whenever I query for NS, SOA or any other types it doesn't trigger it. Since I only have records in the pgsql database I haven't tried any other OpenDBX backend. I hope this helps in tracing this one down. Cheers, Ralf |
|||
Comments (13) | Attachments (0) | Related Tasks (0/0) | Notifications (1) | Reminders (0) | History |
Comment by Norbert Sendetzky - Thursday, 2 Apr 2009, 10:02pm
Hi Ralf
Could you please enable query-logging=yes in your pdns.conf to see what are the last 2-3 statements before the error occurs? Also, I would like to ask you to execute these statements using psql from the machine running the PowerDNS daemon to eliminate the possibility of other errors.
Thanks
Norbert
Comment by Ralf van der Enden - Friday, 3 Apr 2009, 12:10pm
query-logging is on, but for some reason it doesn't show anything in the logs with the opendbx backend. Whenever I switch to the pgsql backend my log fills up with statements though. SO for some reason the opendbx backend doesn't log the queries.
I did run the queries in question directly using psql from the command line, and that worked fine.
A note: it doesn't always fail. Sometimes AXFR works without any issues, but if it doesn't it takes a lot of retries to make it work again.
Comment by Norbert Sendetzky - Saturday, 4 Apr 2009, 11:50am
The OpenDBX backend logs queries using the INFO loglevel which should relate to loglevel=8 in pdns.conf. To be sure, you can use loglevel=9 (DEBUG) to log anything. Could you please retry with this setting?
When it fails, it's important to know what happend before.
Comment by Norbert Sendetzky - Saturday, 4 Apr 2009, 12:03pm
Is there something to the PostgreSQL server logs written when the error occurs?
Comment by Ralf van der Enden - Sunday, 5 Apr 2009, 12:04pm
This shows sometimes, but not always when the problem occurs:
<powerdns%postgres 2009-04-04 20:18:39 CEST> LOG: could not receive data from client: Operation not permitted
<powerdns%postgres 2009-04-04 20:18:39 CEST> LOG: unexpected EOF on client connection
Comment by Ralf van der Enden - Sunday, 5 Apr 2009, 12:37pm
I've set log-level=9 in pdns.conf, but no statements to be found anywhere in my log. query-logging=yes is also set.
I do see additional stuff like this now:
Apr 5 12:35:27 lan pdns[32231]: [OpendbxBackend] Database connection (read) to '127.0.0.1' succeeded
Apr 5 12:35:27 lan pdns[32231]: [OpendbxBackend] Database connection (write) to '127.0.0.1' succeeded
Comment by Ralf van der Enden - Sunday, 5 Apr 2009, 2:28pm
Here's what happens after a couple of hours:
[14:26][root@lan ~]# dig -t axfr cainites.net @192.168.1.2
; <<>> DiG 9.4.2-P2 <<>> -t axfr cainites.net @192.168.1.2
;; global options: printcmd
; Transfer failed.
[14:26][root@lan ~]# dig -t aaaa cainites.net @192.168.1.2
; <<>> DiG 9.4.2-P2 <<>> -t aaaa cainites.net @192.168.1.2
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62954
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;cainites.net. IN AAAA
;; ANSWER SECTION:
cainites.net. 86400 IN AAAA 2001:888:1deb::7
;; Query time: 6 msec
;; SERVER: 192.168.1.2#53(192.168.1.2)
;; WHEN: Sun Apr 5 14:26:39 2009
;; MSG SIZE rcvd: 58
[14:26][root@lan ~]# dig -t axfr cainites.net @192.168.1.2
; <<>> DiG 9.4.2-P2 <<>> -t axfr cainites.net @192.168.1.2
;; global options: printcmd
; Transfer failed.
From the /var/log/messages file:
Apr 5 14:26:35 lan pdns[32231]: AXFR of domain 'cainites.net' failed: not authoritative
Apr 5 14:26:35 lan pdns[32231]: AXFR of domain 'cainites.net' initiated by 192.168.1.2
Apr 5 14:26:35 lan pdns[32231]: [OpendbxBackend] execStmt: Unable to execute query - could not send data to server: Operation
not permitted
Apr 5 14:26:35 lan pdns[32231]: AXFR of domain 'cainites.net' failed: not authoritative
Apr 5 14:26:45 lan pdns[32231]: AXFR of domain 'cainites.net' initiated by 192.168.1.2
Apr 5 14:26:45 lan pdns[32231]: [OpendbxBackend] execStmt: Unable to execute query - could not send data to server: Operation
not permitted
Apr 5 14:26:45 lan pdns[32231]: AXFR of domain 'cainites.net' failed: not authoritative
OpenDBX 1.2.6 never showed this behaviour to my knowledge.
Comment by Norbert Sendetzky - Sunday, 5 Apr 2009, 3:14pm
Which version of PowerDNS do you use?
The only change in pgsql_odbx_result was
- switch( select( fd + 1, &fds, NULL, NULL, timeout ) )
+ while( ( err = select( fd + 1, &fds, NULL, NULL, timeout ) ) < 0
+ switch( err )
to retry on interrupted select() system call.
Comment by Ralf van der Enden - Sunday, 5 Apr 2009, 3:16pm
2.9.22
I've also added some patches from svn, but none of those affect the backends (except maybe sqlite2/3).
Comment by Norbert Sendetzky - Monday, 6 Apr 2009, 11:11pm
I've tried PowerDNS 2.9.22 with OpenDBX backend myself and query-logging works as expected. My system (Debian testing) logs to /var/log/daemon.log. Could you send me your complete configuration files (pdns.conf, pdns.d/* if available)?
Which version of PostgreSQL are you using? I guess 8.3
Comment by Ralf van der Enden - Wednesday, 8 Apr 2009, 6:53pm
I'm using PostgreSQL 8.3.7 on FreeBSD. Will send you my pdns.conf ASAP.
(Sorry for the long delay... I got the keys to my new home and a lot of work is needs to be done before I can move my stuff and myself in there).
Comment by Ralf van der Enden - Tuesday, 21 Apr 2009, 9:52pm
Norbert,
I've changed my pdns.conf to use /tmp instead of 127.0.0.1 for the read and write connection and this has solved my issue. Apparently it's all caused by the connection timing out (at least that's what I'm guessing) and when a query is sent reconnecting to the db doesn't work as it's supposed to. Whether this is an OpenDBX or PostgreSQL issue I do not know since I'm not familiar with either API's or specs.
Comment by Norbert Sendetzky - Wednesday, 22 Apr 2009, 12:29am
That sounds like a reasonable explanation but shouldn't happen according to the OpenDBX and PDNS opendbxbackend code. Are you willing to debug this issue further? I would send you a patch with would add log output to OpenDBX and PDNS so we could track this down.
When using the PDNS gpgsql backend, do you notice sometimes informational messages in your log file that a backend thread is started again? This must be the case if connections ran into timeouts.
Could you replace OpenDBX 1.4.0 by 1.2.6 and leave your current installation as is to find out if the problem is really related to OpenDBX?
The error reported by the OpenDBX library to PDNS opendbxbackend could be a warning instead of an error, so the opendbxbackend doesn't reestablish the connection.