[pmg-devel] code adjustment for PMG in /usr/share/perl5/PMG/DBTools.pm

Stoiko Ivanov s.ivanov at proxmox.com
Mon Apr 15 12:16:56 CEST 2019


Hi,


On Sat, 13 Apr 2019 21:09:55 -0400
Patrick Fogarty <patrick.fogarty at patanne.com> wrote:

> Thomas
> 
> This would suggest there is some other issue further upstream in the 
> code that is not handling an outcome when processing goes through
> nscd, or nscd is not behaving as cleanly as it should. It sounds as
> if better error handling at the point I am having issues would allow
> code to flow but still mask a problem.

I'd say it's the other way around, having nscd installed exposes a bug
in PMG's code - exactly where you pointed out:
errno/$! does not get set to 0 by any library or syscall, and PMG does
not set it to 0 before executing the setresuid call. Since it does not
check (yet) whether the call itself had an error, just checking for
errno != 0 is problematic (as experience by you and the other affected
users).

Setting $! to zero and checking if the setresuid call was successful
as Thomas suggested is the correct fix here.


> 
> What if we temporarily removed nscd as a factor?

In my tests this (uninstalling nscd) let pmg-api install cleanly - so
as a temporary workaround until the fix is released this would work (if
possible in your environment).


> 
> We tested two scenarios: one where we invalidate the nscd passwd
> cache just before upgrade; the other where we stop the nscd service
> during the upgrade process. The thinking was that invalidating the
> cache would force a fresh read. It is risky, I know, because a read
> could happen at any time and re-cache before your code reached the
> point it breaks. Ultimately it did not work.Stopping nscd during the
> upgrade process, an ugly step to be sure, worked.

Invalidating the cache of nscd does not change it's behavior - with
nscd installed the getpwnam call (run before the setresuid, to find the
UID of postgres) does open and read '/etc/passwd', and somewhere in
those calls errno gets set to 0, thus not causing the error to occur.
If nscd is installed '/etc/passwd' is not read (instead a socket is
opened and nscd is asked to provide the answer) - and errno still
contains the value from the unrelated and failed ioctl above.


> 
> What do you think?
> 
> 
> 
> Thomas Lamprecht wrote on 4/13/19 4:44 PM:
> > On 4/13/19 10:32 PM, Stoiko Ivanov wrote:  
> >> Hi,
> >>
> >> Patrick, thanks for picking this up and bringing it to our
> >> attention! On Fri, Apr 12, 2019 at 12:35:01PM -0400, Patrick
> >> Fogarty wrote:  
> >>> Thomas
> >>>
> >>> I agree that my change works, but introduces the potential for
> >>> problems, and therefore may not be the "correct" answer. In fact
> >>> I have no confirmation that everything is okay. Everything
> >>> appears to be okay after upgrade, except that tracking center
> >>> loops forever with a "connection error". However I did not press
> >>> matters to determine if there truly is an issue.  
> >> Hmm - this should not happen - but is probably not directly
> >> related to the postgres_admin_cmd issue. However I could imagine
> >> that some services did not get setup correctly, because of the
> >> failing postinst script - You could try to run `dpkg --configure
> >> -a` (with your patch in place) and restart all services (or the
> >> whole PMG-instance).
> >>
> >> If this does not fix the issue - I would suggest to open a
> >> bug-report in https://bugzilla.proxmox.com (or a forum thread) for
> >> further analysis. 
> >>> The environment...
> >>> PVE: 5.3-12
> >>> PMG: 5.1-3
> >>>
> >>> The is a test environment, so we are using your community repo,
> >>> not subscription. However it is happening in subscription as well.
> >>>
> >>> PMG is running inside a container based on Debian Stretch. It is
> >>> about as unmodified as you can get.
> >>>
> >>> We did make modifications to the PMG install. They are limited to
> >>> /var/lib/pmg/templates/main.cf.in
> >>> and /var/lib/pmg/templates/init.pre.in (geoip) and some custom
> >>> spamassassin rules. Beyond that, nothing has been touched.  
> >> If you modify the configuration templates - please put them in
> >> '/etc/pmg/templates' - that way they are preserved across upgrades
> >> (see section 4.3 of the admin guide
> >> https://www.proxmox.com/images/download/pmg/docs/pmg-admin-guide.pdf).
> >>  
> >>> The upgrade failure happens at the very end during the
> >>> configuration phase. As the post mentioned, I traced the issue to
> >>> PMG:: DBTools:: postgres_admin_cmd. But you have already reviewed
> >>> the post and the code, so you already know that.
> >>>  
> >> Your analysis is correct - and as the strace output of the OP
> >> shows the setresuid call is successful, but $! (perl's variable
> >> for errno still is set from a previous syscall (the TCGETS ioctl).
> >> I managed to reproduce the issue by installing 'nscd' (without it
> >> the getpwnam call opens and reads '/etc/passwd' directly and this
> >> resets errno to 0 somewhere, with it installed it asks the nscd,
> >> and errno still is set).
> >>
> >> We would gladly welcome a patch by you! (I would probably use the
> >> approach from [0] for readability reasons). Although setting $! to
> >> 0 before `$> = $uid;` should work as well.
> >> If you don't find the time, I can also send the patch.  
> > FWIW, both options are ok to me, but I'd like to have the $! = 0;
> > with a comment here as a reminder for people reading and/or copying
> > this, could be combined with the "is the euid now the one we
> > wanted" check too (or just that _and_ a comment would work too for
> > me, but such things need to be recorded). 
> >> BTW. this line is perl's way of 'su postgres' (what Thomas refers
> >> to as Perl_magic_set :)  
> > Perl_magic_set is actually the perl's interpreter (or whatever it
> > really is) method to handle the single letter short hand variables,
> > like $@, $_, $!, $; ... which it calls "magic variables", it is
> > surprisingly readable, even though multi platform support is
> > handled there too, in a single big switch/case... So it's not only
> > responsible for "su" but actually for a /bit/ more ;-) 
> >> I quickly skimmed through the pmg-api repos (with `git grep
> >> '\$\!') - and this seems the only problematic use of relying on $!
> >> being 0 instead of explictly checking for an error-condition.
> >>
> >> cheers,
> >> stoiko
> >>
> >> [0] https://www.perlmonks.org/bare/?node_id=117878  
> >>>
> >>> Let me know how I can help further.
> >>>
> >>>
> >>>
> >>> Thomas Lamprecht wrote on 4/12/19 11:23 AM:  
> >>>> Hi,
> >>>>
> >>>> On 4/12/19 4:17 PM, Patrick Fogarty wrote:  
> >>>>> If you follow this thread...
> >>>>> https://forum.proxmox.com/threads/setuid-postgres-114-failed-inappropriate-ioctl-for-device.41414/
> >>>>>
> >>>>> ...you will find that I had an issue, as did others, performing
> >>>>> upgrades/updates. I adjusted my code manually and things went
> >>>>> back to normal. Last month you released PMG 5.2. The upgrade
> >>>>> downloads a package that includes the original code, so the
> >>>>> upgrade overwrites my changes, then fails the upgrade process
> >>>>> "pmg-api --configure", after which I believe the process is in
> >>>>> an incomplete state.
> >>>>>
> >>>>> How can I assist, beyond my post in the above thread, showing
> >>>>> the code change I made?  
> >>>> Either, you could sent a patch here, for this see:
> >>>> https://pve.proxmox.com/wiki/Developer_Documentation
> >>>> (while written for PVE, almost all is valid for PMG too)
> >>>>
> >>>> Else you normally also can make a Bug report, which does not
> >>>> gets as easily forgotten as some Forum/Mailing list thread:
> >>>> https://bugzilla.proxmox.com/enter_bug.cgi
> >>>>
> >>>> But posting here may be enough this time, Stoiko, could you
> >>>> please take a look at the proposed change:
> >>>> https://forum.proxmox.com/threads/setuid-postgres-114-failed-inappropriate-ioctl-for-device.41414/#post-233016
> >>>>
> >>>> Looking at perl's source code at the Perl_magic_set case for
> >>>> EUID (case '>') we just see that there's no special error
> >>>> handling, thus the recommendation to check $! (errno or
> >>>> sys_errlist[errno] depending on context) after the call.
> >>>>
> >>>> We do that, and it seems correct, so either we need to reset it
> >>>> to 0 befor the seteuid call to ensure we do not check an old
> >>>> value or this is correct an the seteuid really fails in the
> >>>> posted case..
> >>>>
> >>>> Patrick, can you post a bit more details about your setup, you
> >>>> said you run it in PVE, inside a Virtual Machine (VM) or
> >>>> Container (CT)?
> >>>>
> >>>> Because your change could just omit error checking completely,
> >>>> and thus not be the "correct" solution here..
> >>>>
> >>>> cheers,
> >>>> Thomas
> >>>>
> >>>>  
> >  
> 




More information about the pmg-devel mailing list