2019-02-01 21:55:33

by Alan Mackenzie

Subject: 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix division by zero bug). Problems with glibc.

Hello, Thomas, Hello Linux.

0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94
posix-timers: Fix division by zero bug
Committed: 2018-12-17 17:35:45 +0100

With this patch in place I am seeing problems with glibc's function
timer_create. I am an Emacs maintainer, and saw these problems whilst
investigating Emacs bug #34235 "27.0.50; lisp profiler does not work".
Full details of this bug are at
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=34235.

Emacs's profiler fails in kernel 4.19.13, but works in a version of
4.19.13 with the patch reversed, otherwise unchanged. My current version
of glibc is 2.27-r6 (I think the "-r6" comes from Gentoo, my distro).

The Emacs profiler works by a signal handler being repeatedly triggered
by the SIGPROF signal every 1 millisecond. In the bug scenario, this
signal gets triggered precisely once each time the Emacs profiler is
started, rather than continually.

The core of the code in Emacs which initialises the glibc timer is:

      int i;
      struct sigevent sigev;
      sigev.sigev_value.sival_ptr = &profiler_timer;
      sigev.sigev_signo = SIGPROF;
      sigev.sigev_notify = SIGEV_SIGNAL;

      for (i = 0; i < ARRAYELTS (system_clock); i++)
        if (timer_create (system_clock[i], &sigev, &profiler_timer) == 0)
          {
            profiler_timer_ok = 1;
            break;
          }
    }  /* Closes an enclosing block that is not part of this excerpt.  */

  if (profiler_timer_ok)
    {
      struct itimerspec ispec;
      ispec.it_value = ispec.it_interval = interval;
      if (timer_settime (profiler_timer, 0, &ispec, 0) == 0)
        return TIMER_SETTIME_RUNNING;
    }

I have checked that the variable `interval' is non-zero. This code is in
.../emacs/src/profiler.c.
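
For anyone wanting to look at the symptom outside Emacs, the following is
a minimal standalone sketch, not the Emacs code. It assumes that
CLOCK_THREAD_CPUTIME_ID is the clock timer_create ends up using (the
excerpt above only shows a system_clock array), and it hard-codes the
1 millisecond interval. The names ticks, on_sigprof, cpu_ns and
count_sigprof_ticks are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>

/* Number of SIGPROF deliveries seen so far; only touched via the handler
   and the function below.  */
static volatile sig_atomic_t ticks;

static void on_sigprof(int signo)
{
    (void) signo;
    ticks++;
}

/* CPU time consumed by the calling thread, in nanoseconds.  */
static long long cpu_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts);
    return (long long) ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Arm a periodic 1 ms timer on the thread CPU clock (an assumption, see
   the text above), burn roughly busy_ms milliseconds of CPU time, and
   return the number of SIGPROF signals received, or -1 on setup error.  */
int count_sigprof_ticks(long busy_ms)
{
    struct sigaction sa;
    struct sigevent sigev;
    struct itimerspec ispec;
    timer_t timer;
    long long start;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigprof;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGPROF, &sa, NULL) != 0)
        return -1;

    memset(&sigev, 0, sizeof sigev);
    sigev.sigev_notify = SIGEV_SIGNAL;
    sigev.sigev_signo = SIGPROF;
    if (timer_create(CLOCK_THREAD_CPUTIME_ID, &sigev, &timer) != 0)
        return -1;

    /* First expiry after 1 ms, then again every 1 ms.  */
    ispec.it_value.tv_sec = 0;
    ispec.it_value.tv_nsec = 1000000;
    ispec.it_interval = ispec.it_value;

    ticks = 0;
    if (timer_settime(timer, 0, &ispec, NULL) != 0) {
        timer_delete(timer);
        return -1;
    }

    /* Spin until the thread has used about busy_ms of CPU time.  */
    start = cpu_ns();
    while (cpu_ns() - start < busy_ms * 1000000LL)
        ;

    timer_delete(timer);
    return (int) ticks;
}
```

Calling count_sigprof_ticks (100) from a small main and printing the
result should give a count roughly on the order of 100 when the timer
rearms properly; the bug scenario described above would correspond to a
result of 1. With older glibc releases it may be necessary to link
with -lrt.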

It seems either that the patch has uncovered some invalid call between
Emacs and glibc, or between glibc and Linux, or that there is some
intrinsic problem with the patch.

I have very little familiarity with glibc and Linux source code, so I
would be grateful if you could help me investigate the bug scenario.
Naturally, I will help as I can in this process.

Thanks in advance!

--
Alan Mackenzie (Nuremberg, Germany).


2019-02-01 22:06:31

by Thomas Gleixner

Subject: Re: 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix division by zero bug). Problems with glibc.

Hello Alan,

On Fri, 1 Feb 2019, Alan Mackenzie wrote:
> 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94
> posix-timers: Fix division by zero bug
> Committed: 2018-12-17 17:35:45 +0100
>
> With this patch in place I am seeing problems with glibc's function
> timer_create. I am an Emacs maintainer, and saw these problems whilst
> investigating Emacs bug #34235 "27.0.50; lisp profiler does not work".

> Emacs's profiler fails in kernel 4.19.13, but works in a version of
> 4.19.13 with the patch reversed, otherwise unchanged. My current version
> of glibc is 2.27-r6 (I think the "-r6" comes from Gentoo, my distro).

Please upgrade to 4.19.19. The issue should be fixed there with the
backported variant of

93ad0fc088c5 ("posix-cpu-timers: Unbreak timer rearming")

Commit 21c0d1621b8d4b in 4.19.19

Thanks,

tglx

2019-02-02 10:46:13

by Alan Mackenzie

Subject: Re: 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix division by zero bug). Problems with glibc.

Hello, Thomas.

Thanks for such a rapid reply!

On Fri, Feb 01, 2019 at 23:04:48 +0100, Thomas Gleixner wrote:
> Hello Alan,

> On Fri, 1 Feb 2019, Alan Mackenzie wrote:
> > 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94
> > posix-timers: Fix division by zero bug
> > Committed: 2018-12-17 17:35:45 +0100

> > With this patch in place I am seeing problems with glibc's function
> > timer_create. I am an Emacs maintainer, and saw these problems whilst
> > investigating Emacs bug #34235 "27.0.50; lisp profiler does not work".

> > Emacs's profiler fails in kernel 4.19.13, but works in a version of
> > 4.19.13 with the patch reversed, otherwise unchanged. My current version
> > of glibc is 2.27-r6 (I think the "-r6" comes from Gentoo, my distro).

> Please upgrade to 4.19.19. The issue should be fixed there with the
> backported variant of

> 93ad0fc088c5 ("posix-cpu-timers: Unbreak timer rearming")

> Commit 21c0d1621b8d4b in 4.19.19

I've just built and installed Linux 4.19.19, and it does indeed solve
the Emacs profiler bug, #34235. :-)

I see that the patch has been installed in 4.20.6, 4.19.19, and 4.14.97.
Are there any plans to install it into 4.9.x, the other live long term
support branch? The reason I ask is to make an entry into Emacs's
PROBLEMS file, telling users and distributions which kernel versions to
upgrade to.

> Thanks,

> tglx

--
Alan Mackenzie (Nuremberg, Germany).

2019-02-04 17:31:54

by Thomas Gleixner

Subject: Re: 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix division by zero bug). Problems with glibc.

On Sat, 2 Feb 2019, Alan Mackenzie wrote:
> Hello, Thomas.
>
> Thanks for such a rapid reply!
>
> On Fri, Feb 01, 2019 at 23:04:48 +0100, Thomas Gleixner wrote:
> > Hello Alan,
>
> > On Fri, 1 Feb 2019, Alan Mackenzie wrote:
> > > 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94
> > > posix-timers: Fix division by zero bug
> > > Committed: 2018-12-17 17:35:45 +0100
>
> > > With this patch in place I am seeing problems with glibc's function
> > > timer_create. I am an Emacs maintainer, and saw these problems whilst
> > > investigating Emacs bug #34235 "27.0.50; lisp profiler does not work".
>
> > > Emacs's profiler fails in kernel 4.19.13, but works in a version of
> > > 4.19.13 with the patch reversed, otherwise unchanged. My current version
> > > of glibc is 2.27-r6 (I think the "-r6" comes from Gentoo, my distro).
>
> > Please upgrade to 4.19.19. The issue should be fixed there with the
> > backported variant of
>
> > 93ad0fc088c5 ("posix-cpu-timers: Unbreak timer rearming")
>
> > Commit 21c0d1621b8d4b in 4.19.19
>
> I've just built and installed Linux 4.19.19, and it does indeed solve
> the Emacs profiler bug, #34235. :-)
>
> I see that the patch has been installed in 4.20.6, 4.19.19, and 4.14.97.
> Are there any plans to install it into 4.9.x, the other live long term
> support branch? The reason I ask is to make an entry into Emacs's
> PROBLEMS file, telling users and distributions which kernel versions to
> upgrade to.

4.9 doesn't have the offending commit AFAICT.

Thanks,

tglx

2019-02-05 13:57:08

by Alan Mackenzie

Subject: Re: 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix division by zero bug). Problems with glibc.

Hello, Thomas.

On Mon, Feb 04, 2019 at 17:25:11 +0000, Thomas Gleixner wrote:
> On Sat, 2 Feb 2019, Alan Mackenzie wrote:

[ .... ]

> > I've just built and installed Linux 4.19.19, and it does indeed solve
> > the Emacs profiler bug, #34235. :-)

> > I see that the patch has been installed in 4.20.6, 4.19.19, and 4.14.97.
> > Are there any plans to install it into 4.9.x, the other live long term
> > support branch? The reason I ask is to make an entry into Emacs's
> > PROBLEMS file, telling users and distributions which kernel versions to
> > upgrade to.

> 4.9 doesn't have the offending commit AFAICT.

OK, thanks very much! I've put these three version numbers into the
message in Emacs's PROBLEMS file.
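
The version information in this thread could be turned into a small
check along the following lines. This is only a sketch: has_rearm_fix
is a hypothetical helper, and the table contains nothing beyond the
three releases named above. A result of 0 means the release predates the
fix in its series, not that it is definitely affected, since the thread
only confirms 4.19.13 as a failing kernel.

```c
#include <stdio.h>

/* Returns 1 if RELEASE (e.g. "4.19.19-gentoo") is at or beyond the first
   release of its stable series that carries the rearming fix, 0 if it is
   older than that release, and -1 if the series is not in the table or
   the string cannot be parsed.  */
int has_rearm_fix(const char *release)
{
    /* First fixed release per series, as reported in this thread.  */
    static const struct { int major, minor, patch; } fixed[] = {
        { 4, 14, 97 },
        { 4, 19, 19 },
        { 4, 20, 6 },
    };
    int major, minor, patch;
    size_t i;

    if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) != 3)
        return -1;

    for (i = 0; i < sizeof fixed / sizeof fixed[0]; i++)
        if (major == fixed[i].major && minor == fixed[i].minor)
            return patch >= fixed[i].patch;

    return -1;  /* e.g. 4.9.x, which does not have the offending commit */
}
```

The release string would typically come from the release field filled in
by uname () from <sys/utsname.h>.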

I think we're finished, now.

> Thanks,

> tglx

--
Alan Mackenzie (Nuremberg, Germany).