2003-08-07 08:39:38

by Frank Cusack

Subject: NPTL v userland v LT (RH9+custom kernel problem)

Hi,

The RH9 kernels have NPTL patches. Standard 2.4.21 does not.
I am running a custom kernel without the NPTL stuff.

At least one piece of the RH9 userland does not work correctly with my
custom kernel. If I use pam_ldap, the root user cannot log in on
the console.

PAM prompts for the username and password, then pam_ldap appears to
get stuck in a syslog call. It doesn't actually call syslog(), but if
I compare to a functional system, the working one opens /dev/log etc
whereas the broken one does an rt_sigsuspend() and hangs until a SIGALRM
is delivered (login having set this up before prompting for the password).
That's from looking at strace; I haven't looked at ltrace or tried to
run under the debugger yet.

Logging in as a normal user, then sudo'ing to root does work though.

A notable difference between these two cases is that in the former the
real uid of the 'login' process is root, and in the latter the real uid
of the 'sudo' process is that of the user. (PAM config for login and sudo
are identical.)

I think I've seen a case where normal users couldn't log in, but I
may be misremembering.

So, finally getting to my question, should I even *expect* a non-NPTL
kernel to work with the RH9 userland? If not, is there a simple fix
without going to NPTL, say just rebuilding glibc? hmm... now that I
ask it I feel dumb, I do think I would need to rebuild glibc so it
knows the kernel has LinuxThreads, not NPTL. OK, if that's true
are there any other libs I should need to rebuild?

thanks
/fc


2003-08-07 08:49:47

by Arjan van de Ven

Subject: Re: NPTL v userland v LT (RH9+custom kernel problem)


> So, finally getting to my question, should I even *expect* a non-NPTL
> kernel to work with the RH9 userland?

Yes, absolutely. It's a design goal of RHL to work with unpatched
kernel.org kernels.

>
> If not, is there a simple fix
> without going to NPTL, say just rebuilding glibc? hmm... now that I
> ask it I feel dumb, I do think I would need to rebuild glibc so it
> knows the kernel has LinuxThreads, not NPTL.

no need. RHL9 ships with BOTH LinuxThreads and NPTL and will switch
dynamically at runtime depending on your kernel capabilities.
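
A quick way to see which of the two thread libraries was actually picked
on a given kernel is to ask glibc itself. The sketch below assumes a glibc
new enough to answer the GNU_LIBPTHREAD_VERSION query through getconf; the
function name thread_impl is made up for illustration, and it falls back to
"unknown" where the query is not supported.

```shell
# Report which POSIX threads implementation glibc selected at runtime.
# Assumption: getconf understands GNU_LIBPTHREAD_VERSION on this system.
thread_impl() {
    getconf GNU_LIBPTHREAD_VERSION 2>/dev/null || echo "unknown"
}

# Prints a string naming the implementation (NPTL or LinuxThreads),
# or "unknown" if the query is unavailable.
thread_impl
```

Running it once under the Red Hat kernel and once under the custom kernel
should show whether the switch happens as described.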



2003-08-07 13:06:04

by Mathias Fröhlich

Subject: Re: NPTL v userland v LT (RH9+custom kernel problem)


Hi,

I think you can try out the nss_ldap rpm at

http://na.uni-tuebingen.de/~frohlich/nss_ldap/

It is built on RedHat 9.
The main goal of this rpm is to eliminate the recursive entry bug in
nss_ldap that occurs if the LDAP host has not been resolved before entering
the LDAP NSS module's gethostbyname. But there is also some threading
cleanup included.

I don't know what the real cause of the reported problem is, but it has not
occurred since I started using my patched nss_ldap module.
I use both kinds of kernels with Red Hat 9: vanilla kernel.org ones and
Red Hat-modified ones.

Hope this helps

Mathias Fröhlich

--
Mathias Fröhlich, email: [email protected]


2003-08-07 13:20:37

by Luke Howard

Subject: Re: NPTL v userland v LT (RH9+custom kernel problem)


Integrating this patch is on my todo list.

-- Luke

>From: Mathias Fröhlich <[email protected]>
>Subject: Re: NPTL v userland v LT (RH9+custom kernel problem)
>To: [email protected]
>Cc: [email protected], Luke Howard <[email protected]>
>Date: Thu, 7 Aug 2003 15:06:04 +0200
>
>
>Hi,
>
>I think you can try out the nss_ldap rpm at
>
>http://na.uni-tuebingen.de/~frohlich/nss_ldap/
>
>It is built on RedHat 9.
>The main target of this rpm is to eliminate the recursive entrance bug of
>nss_ldap if the ldap host is not resolved before entering the ldap nss
>modules gethostbyname. But there is also some threading cleanup included.
>
>I don't know what the real reason of the reported problem is, but it does not
>occur since the time i use my patched nss_ldap module.
>I use both kinds of kernels with redhat 9 vanilla kernel.org and redhat
>modified ones.
>
> Hope this helps
>
> Mathias Fröhlich
>
>--
>Mathias Fröhlich, email: [email protected]
>
>

2003-08-07 14:44:52

by Jerry Cooperstein

Subject: Re: NPTL v userland v LT (RH9+custom kernel problem)

If you read the release notes for RH9 you'll see you can adjust which
thread library gets used with the environment variable
LD_ASSUME_KERNEL. So for instance you can do:

LD_ASSUME_KERNEL=2.2.5 rpm ....
LD_ASSUME_KERNEL=2.2.5 up2date

(I've mentioned these two because I've noted these fail when you are
root...)
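
The override only needs to reach the one command that misbehaves, so it can
be scoped to that command rather than exported for the whole session. A
minimal sketch, assuming a POSIX shell; the helper name with_assumed_kernel
is hypothetical, and 2.2.5 is simply the value from the examples above.

```shell
# Run a single command with LD_ASSUME_KERNEL set, leaving the calling
# shell's environment untouched.
# Usage: with_assumed_kernel VERSION COMMAND [ARGS...]
with_assumed_kernel() {
    version=$1
    shift
    # env applies the variable only to the command it executes.
    env LD_ASSUME_KERNEL="$version" "$@"
}

# Example (not run here):
#   with_assumed_kernel 2.2.5 up2date
```

Because the variable is never exported, later commands in the same shell
still get whatever thread library the loader would normally choose.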


======================================================================
Jerry Cooperstein, Senior Consultant, <[email protected]>
Axian, Inc., Software Consulting and Training
4800 SW Griffith Dr., Ste. 202, Beaverton, OR 97005 USA
http://www.axian.com/
====================================================================


On Thu, 2003-08-07 at 03:39, Frank Cusack wrote:
> Hi,
>
> The RH9 kernels have NPTL patches. Standard 2.4.21 does not.
> I am running a custom kernel without the NPTL stuff.
>
.....
>
> So, finally getting to my question, should I even *expect* a non-NPTL
> kernel to work with the RH9 userland? If not, is there a simple fix
> without going to NPTL, say just rebuilding glibc? hmm... now that I
> ask it I feel dumb, I do think I would need to rebuild glibc so it
> knows the kernel has LinuxThreads, not NPTL. OK, if that's true
> are there any other libs I should need to rebuild?
>
> thanks


2003-08-08 05:46:33

by Frank Cusack

Subject: Re: NPTL v userland v LT (RH9+custom kernel problem)

On Thu, Aug 07, 2003 at 03:06:04PM +0200, Mathias Fröhlich wrote:
> I think you can try out the nss_ldap rpm at
>
> http://na.uni-tuebingen.de/~frohlich/nss_ldap/

Didn't fix my problem. I'll dig in and narrow this down further.

On Thu, Aug 07, 2003 at 09:42:36AM -0500, Jerry Cooperstein wrote:
> If you read the release notes for RH9 you'll see you can adjust what
> thread library gets used with the environmental variable
> LD_ASSUME_KERNEL. So for instance you can do:
>
> LD_ASSUME_KERNEL=2.2.5 rpm ....
> LD_ASSUME_KERNEL=2.2.5 up2date
>
> (I've mentioned these two because I've noted these fail when you are
> root...)

Interesting. Something these have in common is that they all use
Berkeley db4 (up2date by virtue of using rpm). I don't understand why
nss_ldap or pam_ldap would, but it's one of the sources in the srpm.

But, rpm works for me (both RH and unpatched kernels).

/fc

2003-08-08 06:03:08

by Valdis Klētnieks

Subject: Re: NPTL v userland v LT (RH9+custom kernel problem)

On Thu, 07 Aug 2003 22:45:45 PDT, Frank Cusack said:

> Interesting. Something these have in common is that they all use
> Berkeley db4 (up2date by virtue of using rpm). I don't understand why
> nss_ldap or pam_ldap would, but it's one of the sources in the srpm.

ISTR that db4 was buggy in its handling of O_DIRECT - the changed 2.6
semantics gave it indigestion - from the RedHat RPM's changelog:

* Tue Jun 24 2003 Jeff Johnson <[email protected]> 4.1.25-4

- hack out O_DIRECT support in db4 for now.



2003-08-08 06:53:46

by Peter Svensson

Subject: Re: NPTL v userland v LT (RH9+custom kernel problem)

On Thu, 7 Aug 2003, Frank Cusack wrote:

> On Thu, Aug 07, 2003 at 09:42:36AM -0500, Jerry Cooperstein wrote:
> > If you read the release notes for RH9 you'll see you can adjust what
> > thread library gets used with the environmental variable
> > LD_ASSUME_KERNEL. So for instance you can do:
> >
> > LD_ASSUME_KERNEL=2.2.5 rpm ....
> > LD_ASSUME_KERNEL=2.2.5 up2date
> >
> > (I've mentioned these two because I've noted these fail when you are
> > root...)
>
> Interesting. Something these have in common is that they all use
> Berkeley db4 (up2date by virtue of using rpm). I don't understand why
> nss_ldap or pam_ldap would, but it's one of the sources in the srpm.

I have had rpm lock up on me a few times. I think it was waiting on a
semaphore or some other synchronization event. After killing the process
(after several hours) no rpm transactions could be completed; they all
hung at the same point. The only way to get rpm to work again was to
reboot the system.

Not sure if it is related or not though. I never thought to try it as
non-root.

Peter
--
Peter Svensson ! Pgp key available by finger, fingerprint:
<[email protected]> ! 8A E9 20 98 C1 FF 43 E3 07 FD B9 0A 80 72 70 AF
------------------------------------------------------------------------
Remember, Luke, your source will be with you... always...


2003-08-08 07:10:28

by Pau Aliagas

Subject: Re: NPTL v userland v LT (RH9+custom kernel problem)

On Fri, 8 Aug 2003, Peter Svensson wrote:

> I have had rpm lock up on me a few times. I think it was waiting on a
> semaphore or some other synchronization event. After killing the process
> (after several hours) no rpm transactions could be completed; they all
> hung at the same point. The only way to get rpm to work again was to
> reboot the system.
>
> Not sure if it is related or not though. I never thought to try it as
> non-root.

Some stale temporary files in the rpm database directory are blocking
normal operation.
$ rm /var/lib/rpm/__db*
will do the trick.

Pau

Subject: Re: NPTL v userland v LT (RH9+custom kernel problem)

Peter Svensson <[email protected]> writes:

>I have had rpm lock up on me a few times. I think it was waiting on a
>semaphore or some other synchronization event. After killing the process
>(after several hours) no rpm transactions could be completed; they all
>hung at the same point. The only way to get rpm to work again was to
>reboot the system.

rm -f /var/lib/rpm/__db*

This is a FAQ. And a Bug in RH9.
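
Put together as a small helper, the cleanup might look like the sketch
below. The function name clear_rpm_locks and its optional directory
argument are hypothetical; it assumes no rpm process is still running, and
it removes only the __db* files, not the package database itself.

```shell
# Remove stale Berkeley DB region files left behind by a killed rpm.
# Usage: clear_rpm_locks [DBDIR]   (DBDIR defaults to /var/lib/rpm)
# Assumption: no rpm process is currently using the database.
clear_rpm_locks() {
    dbdir=${1:-/var/lib/rpm}
    # Only files whose names start with __db are touched.
    rm -f "$dbdir"/__db*
}

# Example (not run here; needs root on a real system):
#   clear_rpm_locks
```

Taking the directory as an argument makes it possible to try the helper on
a scratch copy of the database before pointing it at the live one.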

Regards
Henning

--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen INTERMETA GmbH
[email protected] +49 9131 50 654 0 http://www.intermeta.de/

Java, perl, Solaris, Linux, xSP Consulting, Web Services
freelance consultant -- Jakarta Turbine Development -- hero for hire

"You are being far too rational for this discussion."
--- Scott Robert Ladd in <[email protected]>

2003-08-08 17:37:55

by Frank Cusack

Subject: Re: NPTL v userland v LT (RH9+custom kernel problem)

On Thu, Aug 07, 2003 at 01:39:30AM -0700, Frank Cusack wrote:
> PAM prompts for the username and password, then pam_ldap appears to
> get stuck in a syslog call. It doesn't actually call syslog(), but if
> I compare to a functional system, the working one opens /dev/log etc
> whereas the broken one does an rt_sigsuspend() and hangs until a SIGALRM
> is delivered (login having set this up before prompting for the password).
> That's from looking at strace; I haven't looked at ltrace or tried to
> run under the debugger yet.

Even without pam_ldap, I see it getting stuck. 'group: files ldap' in
nsswitch.conf sets it off. Here's an sshd that's hung; does this light
the a-ha bulb for anyone?

(gdb) bt
#0  0x40564845 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
#1  0x40564318 in __pthread_wait_for_restart_signal ()
   from /lib/i686/libpthread.so.0
#2  0x40565d30 in __pthread_alt_lock () from /lib/i686/libpthread.so.0
#3  0x40562d37 in pthread_mutex_lock () from /lib/i686/libpthread.so.0
#4  0x401df2fc in vsyslog () from /lib/i686/libc.so.6
#5  0x4024b4b7 in _log_err () from /lib/security/pam_unix.so
#6  0x40249039 in pam_sm_open_session () from /lib/security/pam_unix.so
#7  0x4003ac09 in pam_fail_delay () from /lib/libpam.so.0
#8  0x4003ad93 in _pam_dispatch () from /lib/libpam.so.0
#9  0x4003c978 in pam_open_session () from /lib/libpam.so.0
...

/fc

2003-08-08 19:06:38

by Ulrich Drepper

Subject: Re: NPTL v userland v LT (RH9+custom kernel problem)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Frank Cusack wrote:

> Even without pam_ldap, I see it getting stuck. 'group: files ldap' in
> nsswitch.conf sets it off. Here's an sshd that's hung, does this light
> the a-ha bulb for anyone?
>
> (gdb) bt
> #0 0x40564845 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1 0x40564318 in __pthread_wait_for_restart_signal ()
> from /lib/i686/libpthread.so.0
> #2 0x40565d30 in __pthread_alt_lock () from /lib/i686/libpthread.so.0

This has nothing to do with NPTL as you can clearly see from the file
names and the functions used to implement locking.

- --
- --------------. ,-. 444 Castro Street
Ulrich Drepper \ ,-----------------' \ Mountain View, CA 94041 USA
Red Hat `--' drepper at redhat.com `---------------------------
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE/M/Re2ijCOnn/RHQRAinJAJ98U1aBMe6CNNS92MQhv+Y8Qcs01wCdG3H2
M2n00ZHMbZpFnrDNyezDYcg=
=T8u5
-----END PGP SIGNATURE-----