2011-08-28 19:04:40

by Rafael J. Wysocki

Subject: 3.1-rc3-git6: Reported regressions from 3.0

This message contains a list of some regressions from 3.0,
for which there are no fixes in the mainline known to the tracking team.
If any of them have been fixed already, please let us know.

If you know of any other unresolved regressions from 3.0, please let us
know as well and we'll add them to the list. Also, please let us know
if any of the entries below are invalid.

Each entry from the list will be sent additionally in an automatic reply
to this message with CCs to the people involved in reporting and handling
the issue.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2011-08-28        8        4           4


Unresolved regressions
----------------------

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=41742
Subject : duplicate filename for intel_backlight with the i915 driver
Submitter : François Valenduc <[email protected]>
Date : 2011-08-25 18:51 (4 days old)
First-Bad-Commit: http://git.kernel.org/linus/aaa6fd2a004147bf32fce05720938236de3361d9


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=41512
Subject : 3.1-rc2 failed s2ram: Freezing of tasks failed after 20.00 seconds
Submitter : Carlos R. Mafra <[email protected]>
Date : 2011-08-16 9:42 (13 days old)
Message-ID : <[email protected]>
References : http://marc.info/?l=linux-kernel&m=131348782017435&w=2


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=41502
Subject : cfq-iosched: a regression
Submitter : Shaohua Li <[email protected]>
Date : 2011-08-16 2:47 (13 days old)
Message-ID : <1313462826.27321.22.camel@sli10-conroe>
References : http://marc.info/?l=linux-kernel&m=131346279329911&w=2


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=41442
Subject : rcu_sched_state detected stall on CPU 0, when booting on Xen
Submitter : Witold Baryluk <[email protected]>
Date : 2011-08-21 04:06 (8 days old)


Regressions with patches
------------------------

For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions from 3.0,
unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=40982

Please let the tracking team know if there are any Bugzilla entries that
should be added to the list in there.

Thanks!



2011-08-28 19:35:38

by Dave Jones

Subject: Re: 3.1-rc3-git6: Reported regressions from 3.0

On Sun, Aug 28, 2011 at 08:22:05PM +0200, Rafael J. Wysocki wrote:

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=41742
> Subject : duplicate filename for intel_backlight with the i915 driver
> Submitter : François Valenduc <[email protected]>
> Date : 2011-08-25 18:51 (4 days old)
> First-Bad-Commit: http://git.kernel.org/linus/aaa6fd2a004147bf32fce05720938236de3361d9

this should be fixed by b727d20269e8ef1de002bfea8099f5e9db9e9f23

Dave


2011-08-28 19:50:17

by Linus Torvalds

Subject: Re: 3.1-rc3-git6: Reported regressions from 3.0

On Sun, Aug 28, 2011 at 12:35 PM, Dave Jones <[email protected]> wrote:
> On Sun, Aug 28, 2011 at 08:22:05PM +0200, Rafael J. Wysocki wrote:
>
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=41742
> > Subject : duplicate filename for intel_backlight with the i915 driver
> > Submitter : François Valenduc <[email protected]>
> > Date : 2011-08-25 18:51 (4 days old)
> > First-Bad-Commit: http://git.kernel.org/linus/aaa6fd2a004147bf32fce05720938236de3361d9
>
> this should be fixed by b727d20269e8ef1de002bfea8099f5e9db9e9f23

Actually, by a2cc797d2d1a ("i915: do not setup intel_backlight ").

That b727d20269e is just my merge commit that brings in the fix (and
the bogus initialization of 'locked') into mainline.

Linus

2011-08-28 19:37:39

by Rafael J. Wysocki

Subject: Re: 3.1-rc3-git6: Reported regressions from 3.0

On Sunday, August 28, 2011, Dave Jones wrote:
> On Sun, Aug 28, 2011 at 08:22:05PM +0200, Rafael J. Wysocki wrote:
>
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=41742
> > Subject : duplicate filename for intel_backlight with the i915 driver
> > Submitter : François Valenduc <[email protected]>
> > Date : 2011-08-25 18:51 (4 days old)
> > First-Bad-Commit: http://git.kernel.org/linus/aaa6fd2a004147bf32fce05720938236de3361d9
>
> this should be fixed by b727d20269e8ef1de002bfea8099f5e9db9e9f23

Thanks, closing.

Rafael

2011-11-22 05:27:06

by Ari Savolainen

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

2011/11/22 Linus Torvalds <[email protected]>:
> On Mon, Nov 21, 2011 at 1:49 PM, Rafael J. Wysocki <[email protected]> wrote:
>>
>> Subject : lockdep warning after aa6afca5bcab: "proc: fix races against execve() of /proc/PID/fd**"
>> Submitter : Ari Savolainen <[email protected]>
>> Date : 2011-11-08 3:47
>> Message-ID : CAEbykaXYZEFhTgWMm2AfaWQ2SaXYuO_ypTnw+6AVWScOYSCuuw@mail.gmail.com
>> References : http://marc.info/?l=linux-kernel&m=132072413125099&w=2
>
> Commit aa6afca5bcab was reverted by commit 5e442a493fc5, so this one
> is presumably stale.
>
> Linus

Yes, this went away after the reversion.

Ari

2011-11-22 13:54:45

by Konrad Rzeszutek Wilk

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

> Subject : Regression in 3.1 causes Xen to use wrong idle routine
> Submitter : Stefan Bader <[email protected]>
> Date : 2011-10-26 10:24
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-acpi&m=131962467924564&w=2

The patch mentioned in http://mid.gmane.org/[email protected]
should do it. But the patch needs an Ack from ACPI/x86 folks.

2011-11-22 14:15:30

by John W. Linville

Subject: wireless regressions Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

The ones that look specific to wireless (or bluetooth)...

On Mon, Nov 21, 2011 at 10:49:30PM +0100, Rafael J. Wysocki wrote:

> Subject : iwlagn is getting very shaky
> Submitter : Norbert Preining <[email protected]>
> Date : 2011-10-19 6:01
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=131914553920614&w=2

> Subject : 3.1+ iwlwifi lockup
> Submitter : Dave Jones <[email protected]>
> Date : 2011-10-31 14:34
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=132007169420160&w=2

> Subject : Oops on suspend with libertas SDIO (Linux 3.2-rc2)
> Submitter : Sven Neumann <[email protected]>
> Date : 2011-11-17 15:36
> Message-ID : 1321544210.31090.6.camel@sven
> References : http://marc.info/?l=linux-kernel&m=132154527715807&w=2

> Subject : [REGRESSION] resume takes 10s longer due to e1b6eb3 (Bluetooth: Increase HCI reset timeout ...)
> Submitter : Tomáš Janoušek <[email protected]>
> Date : 2011-11-18 18:40
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=132164169511416&w=2

--
John W. Linville		Someday the world will need a hero, and you
[email protected]			might be all we have.  Be ready.

2011-11-21 22:20:58

by Linus Torvalds

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Mon, Nov 21, 2011 at 1:49 PM, Rafael J. Wysocki <[email protected]> wrote:
>
> Subject : lockdep warning after aa6afca5bcab: "proc: fix races against execve() of /proc/PID/fd**"
> Submitter : Ari Savolainen <[email protected]>
> Date : 2011-11-08 3:47
> Message-ID : CAEbykaXYZEFhTgWMm2AfaWQ2SaXYuO_ypTnw+6AVWScOYSCuuw@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=132072413125099&w=2

Commit aa6afca5bcab was reverted by commit 5e442a493fc5, so this one
is presumably stale.

Linus

2011-11-21 22:34:39

by Andy Lutomirski

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Mon, Nov 21, 2011 at 2:11 PM, Linus Torvalds
<[email protected]> wrote:
> On Mon, Nov 21, 2011 at 1:49 PM, Rafael J. Wysocki <[email protected]> wrote:
>>
>> Subject : [3.1 REGRESSION] Commit 5cec93c216db77c45f7ce970d46283bcb1933884 breaks the Chromium seccomp sandbox
>> Submitter : Nix <[email protected]>
>> Date : 2011-11-14 0:40
>> Message-ID : [email protected]
>> References : http://marc.info/?l=linux-kernel&m=132123396226377&w=2
>
> So this should be fixed by commit 2b666859ec32 ("x86: Default to
> vsyscall=native for now"), since we disabled the vsyscall emulation
> because it broke UML too.

I don't think so. I think the issue is that the chromium sandbox is
trying to use getcpu, time, or gettimeofday from seccomp mode and the
kernel is (IMO correctly) sending it SIGKILL. Nix can trigger the bug
in vsyscall=native mode, so it's not the emulation. (If it's
gettimeofday, then it's definitely not a regression. vgettimeofday
would SIGKILL in seccomp mode with any timing source other than rdtsc
or hpet even on old kernels.)

I sent a patch to show which syscall is causing SIGKILL and haven't
heard back. Meanwhile, I'm downloading the 1.1GB (!) tarball to see
if I can reproduce it here. Fedora's build didn't trigger it for me,
probably because the sandbox was disabled.

To try to reduce the incidence of this stuff in the future, and to
make vsyscall=none and UML more useful, I filed this bug:

http://sourceware.org/bugzilla/show_bug.cgi?id=13425

--Andy

2011-11-21 22:12:17

by Linus Torvalds

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Mon, Nov 21, 2011 at 1:49 PM, Rafael J. Wysocki <[email protected]> wrote:
>
> Subject : [3.1 REGRESSION] Commit 5cec93c216db77c45f7ce970d46283bcb1933884 breaks the Chromium seccomp sandbox
> Submitter : Nix <[email protected]>
> Date : 2011-11-14 0:40
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=132123396226377&w=2

So this should be fixed by commit 2b666859ec32 ("x86: Default to
vsyscall=native for now"), since we disabled the vsyscall emulation
because it broke UML too.

Of course, the chromium seccomp thing might re-surface with the
patches that enable the emulation (with better emulation), which Andy
is still working on, and that I was planning on merging for 3.3.

Andy, it might be worth contacting Nix and having him test whether
your fixed emulation works for chromium too.

Linus

2011-11-21 22:18:53

by Linus Torvalds

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Mon, Nov 21, 2011 at 1:49 PM, Rafael J. Wysocki <[email protected]> wrote:
>
> Subject : hugetlb oops on 3.1.0-rc8-devel
> Submitter : Andy Lutomirski <[email protected]>
> Date : 2011-11-01 22:20
> Message-ID : CALCETrW1mpVCz2tO5roaz1r6vnno+srHR-dHA6_pkRi2qiCfdw@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=132018604426692&w=2

Despite the subject line, that's not an oops, it's a BUG_ON().

And it *should* be fixed by commit ea4039a34c4c ("hugetlb: release
pages in the error path of hugetlb_cow()") although I don't think Andy
ever confirmed that (since it was hard to trigger).

Linus

2011-11-29 18:06:19

by Konrad Rzeszutek Wilk

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Tue, Nov 22, 2011 at 08:54:12AM -0500, Konrad Rzeszutek Wilk wrote:
> > Subject : Regression in 3.1 causes Xen to use wrong idle routine
> > Submitter : Stefan Bader <[email protected]>
> > Date : 2011-10-26 10:24
> > Message-ID : [email protected]
> > References : http://marc.info/?l=linux-acpi&m=131962467924564&w=2
>
> The patch mentioned in http://mid.gmane.org/[email protected]
> should do it. But the patch needs an Ack from ACPI/x86 folks.

This patch (mentioned in the URL above) fixes the issue. Could it be
applied to the x86 tree for 3.2 or get an Ack, please?

From 4f10ec7a7b9ff24657696aa98f25bcecde247373 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <[email protected]>
Date: Mon, 21 Nov 2011 18:02:02 -0500
Subject: [PATCH] xen/pm_idle: Make pm_idle be default_idle under Xen.

This patch:

commit d91ee5863b71e8c90eaf6035bff3078a85e2e7b5
Author: Len Brown <[email protected]>
Date: Fri Apr 1 18:28:35 2011 -0400

cpuidle: replace xen access to x86 pm_idle and default_idle

..scribble on pm_idle and access default_idle,
have it simply disable_cpuidle() so acpi_idle will not load and
architecture default HLT will be used.

The idea was to have one call, disable_cpuidle(), which would keep other
code from touching pm_idle. It prevents cpuidle_idle_call and
acpi_idle_call from setting pm_idle (which is excellent). But
amd_e400_idle and mwait_idle can still set up pm_idle, which we really
do not want. In the case of mwait_idle we can hit instances like this:

Brought up 2 CPUs
invalid opcode: 0000 [#1] SMP
CPU 1
Modules linked in:

Pid: 0, comm: swapper Not tainted 3.1.0-0.rc6.git0.3.fc16.x86_64 #1
RIP: e030:[<ffffffff81015d1d>] [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
RSP: e02b:ffff8801d28ddf10 EFLAGS: 00010082
RAX: ffff8801d28dc010 RBX: ffff8801d28ddfd8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
RBP: ffff8801d28ddf10 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000001 R11: ffff8801d28ddfd8 R12: ffffffff81b590d0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8801dff81000(0000) knlGS:0000000000000000
CS: e033 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001a05000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
Process swapper (pid: 0, threadinfo ffff8801d28dc000, task ffff8801d28cae60)
Stack:
ffff8801d28ddf40 ffffffff8100e2ed ffff8801dff8e390 c136dfe72feab515
0000000000000000 0000000000000000 ffff8801d28ddf50 ffffffff8149ee78
0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
[<ffffffff8100e2ed>] cpu_idle+0xae/0xe8
[<ffffffff8149ee78>] cpu_bringup_and_idle+0xe/0x10
RIP [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
RSP <ffff8801d28ddf10>

RH BZ #739499 and Ubuntu #881076

In the case of amd_e400_idle we don't get such spectacular crashes, but
we do end up making an MSR access which is trapped in the hypervisor,
and then follow it up with a yield hypercall. That means we end up
going to the hypervisor twice instead of just once.

Let's make pm_idle be default_idle to take care of that.

Reported-by: Stefan Bader <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/include/asm/system.h | 1 +
arch/x86/kernel/process.c | 8 ++++++++
arch/x86/xen/setup.c | 2 +-
3 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/system.h b/arch/x86/include/asm/system.h
index c2ff2a1..2d2f01c 100644
--- a/arch/x86/include/asm/system.h
+++ b/arch/x86/include/asm/system.h
@@ -401,6 +401,7 @@ extern unsigned long arch_align_stack(unsigned long sp);
 extern void free_init_pages(char *what, unsigned long begin, unsigned long end);
 
 void default_idle(void);
+bool set_pm_idle_to_default(void);
 
 void stop_this_cpu(void *dummy);

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 1f7f8c8..336b299 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -404,6 +404,14 @@ void default_idle(void)
 EXPORT_SYMBOL(default_idle);
 #endif
 
+bool set_pm_idle_to_default()
+{
+	if (!pm_idle) {
+		pm_idle = default_idle;
+		return true;
+	}
+	return false;
+}
 void stop_this_cpu(void *dummy)
 {
 	local_irq_disable();
diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 46d6d21..7506181 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -448,6 +448,6 @@ void __init xen_arch_setup(void)
 #endif
 	disable_cpuidle();
 	boot_option_idle_override = IDLE_HALT;
-
+	WARN_ON(!set_pm_idle_to_default());
 	fiddle_vdso();
 }
--
1.7.7.3


2011-11-23 07:37:49

by Rafał Miłecki

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On 21 November 2011 23:22, Linus Torvalds
<[email protected]> wrote:
> On Mon, Nov 21, 2011 at 1:49 PM, Rafael J. Wysocki <[email protected]> wrote:
>>
>> Subject    : [3.1-rc8 REGRESSION] sky2 hangs machine on turning off or suspending
>> Submitter  : Rafał Miłecki <[email protected]>
>> Date       : 2011-11-09 11:46
>> Message-ID : CACna6ryTdLcWVYgHu=_mRFga1sFivpE_DyZOY-HMmKggkWAJAw@mail.gmail.com
>> References : http://marc.info/?l=linux-netdev&m=132083922228088&w=4
>
> This should be fixed by commit 1401a8008a09 ("sky2: fix hang on
> shutdown (and other irq issues)") in current -git.

This patch doesn't fix my hang.

However, git also contains:
sky2: fix hang in napi_disable
This is the one that fixes my case.

So the bug is resolved; however, I'm a little disappointed that no one
pinged me about those patches. I spent some time bisecting this
issue and expected to get some response :/

--
Rafał

2011-11-22 05:58:55

by Andrew Morton

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Tue, 22 Nov 2011 11:19:24 +0530 "Srivatsa S. Bhat" <[email protected]> wrote:

> > Subject : khugepaged blocks suspend2ram (3.2.0-rc1-00252-g8f042aa)
> > Submitter : Sergei Trofimovich <[email protected]>
> > Date : 2011-11-12 10:48
> > Message-ID : [email protected]
> > References : https://lkml.org/lkml/2011/11/12/11
> >
>
> Andrea's patch already fixes this issue, which was reported first by
> Jiri Slaby. https://lkml.org/lkml/2011/11/11/93
> I remember Andrew Morton taking this patch in his -mm tree. But it is
> not in mainline yet. So can we consider this closed or not?

grr, nothing in that patch's changelog indicates that it fixed a
regression nor that it fixed an infinite blockage of suspend.

I moved it to my 3.2 queue, thanks.

2011-11-21 22:29:19

by Alex Deucher

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Mon, Nov 21, 2011 at 4:49 PM, Rafael J. Wysocki <[email protected]> wrote:
> This message contains a list of some regressions from 3.0 and 3.1
> for which there are no fixes in the mainline known to the tracking team.
> If any of them have been fixed already, please let us know.
>
> If you know of any other unresolved regressions from 3.0 and 3.1, please let us
> know as well and we'll add them to the list. Also, please let us know if any of
> the entries below are invalid.
>
> The entries below are simplified and the statistics are not present due to the
> continuing Bugzilla outage.
>
> Subject : iwlagn is getting very shaky
> Submitter : Norbert Preining <[email protected]>
> Date : 2011-10-19 6:01
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=131914553920614&w=2
>
> Subject : Regression: "irqpoll" hasn't been working for me since March 16 IRQ
> Submitter : Edward Donovan <[email protected]>
> Date : 2011-10-19 22:09
> Message-ID : CADdbW+HXdCPfJu2RTF6zz+ujCmiu_dmZwL2iScuF53p=AaZ1Uw@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=131914554220679&w=2
>
> Subject : Regression in 3.1 causes Xen to use wrong idle routine
> Submitter : Stefan Bader <[email protected]>
> Date : 2011-10-26 10:24
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-acpi&m=131962467924564&w=2
>
> Subject : 3.1+ iwlwifi lockup
> Submitter : Dave Jones <[email protected]>
> Date : 2011-10-31 14:34
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=132007169420160&w=2
>
> Subject : hugetlb oops on 3.1.0-rc8-devel
> Submitter : Andy Lutomirski <[email protected]>
> Date : 2011-11-01 22:20
> Message-ID : CALCETrW1mpVCz2tO5roaz1r6vnno+srHR-dHA6_pkRi2qiCfdw@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=132018604426692&w=2
>
> Subject : Simultaneous cat and external keyboard input causing kernel panic
> Submitter : Timo Jyrinki <[email protected]>
> Date : 2011-11-03 12:14
> Message-ID : CAJtFfxmovJHspHHKbvBVc4pw+u5mjGmUejCXEzdV+GqE=jVSOQ@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=132032253903074&w=2
>
> Subject : Linus GIT - INFO: possible circular locking dependency detected
> Submitter : Miles Lane <[email protected]>
> Date : 2011-11-03 15:57
> Message-ID : CAHFgRy8S0xLfhZxTUOEH5A0PL_Fb79-0-gmbQ=9h2D-xMqt1hA@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=132033587908426&w=2
>
> Subject : lockdep warning after aa6afca5bcab: "proc: fix races against execve() of /proc/PID/fd**"
> Submitter : Ari Savolainen <[email protected]>
> Date : 2011-11-08 3:47
> Message-ID : CAEbykaXYZEFhTgWMm2AfaWQ2SaXYuO_ypTnw+6AVWScOYSCuuw@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=132072413125099&w=2
>
> Subject : DMA-API check_sync errors with 3.2
> Submitter : Josh Boyer <[email protected]>
> Date : 2011-11-08 17:31
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=132077357608797&w=2
>
> Subject : [3.1-rc8 REGRESSION] sky2 hangs machine on turning off or suspending
> Submitter : Rafał Miłecki <[email protected]>
> Date : 2011-11-09 11:46
> Message-ID : CACna6ryTdLcWVYgHu=_mRFga1sFivpE_DyZOY-HMmKggkWAJAw@mail.gmail.com
> References : http://marc.info/?l=linux-netdev&m=132083922228088&w=4
>
> Subject : 3.2-rc1 doesn't boot on dual socket opteron without swap
> Submitter : Niklas Schnelle <[email protected]>
> Date : 2011-11-10 20:09
> Message-ID : 1320955769.1718.8.camel@jupiter
> References : http://marc.info/?l=linux-kernel&m=132095583501767&w=2
>
> Subject : Sparc-32 doesn't work in 3.1.
> Submitter : Rob Landley <[email protected]>
> Date : 2011-11-12 11:22
> Message-ID : [email protected]
> References : http://www.spinics.net/lists/kernel/msg1260383.html
>
> Subject : khugepaged blocks suspend2ram (3.2.0-rc1-00252-g8f042aa)
> Submitter : Sergei Trofimovich <[email protected]>
> Date : 2011-11-12 10:48
> Message-ID : [email protected]
> References : https://lkml.org/lkml/2011/11/12/11
>
> Subject : WARNING: at fs/sysfs/sysfs.h:195 (during boot)
> Submitter : Markus Trippelsdorf <[email protected]>
> Date : 2011-11-13 19:24
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=132121231921932&w=2
>
> Subject : PROBLEM: Radeon display connector : No monitor connected or invalid EDID
> Submitter : Treeve Jelbert <[email protected]>
> Date : 2011-11-13 17:27
> Message-ID : 2407026.akcTO2Ggic@gemini-64
> References : http://marc.info/?l=linux-kernel&m=132120530920139&w=2

Treeve replied to me directly saying that it was a problem with his
config file and everything is working fine now.

Alex

>
> Subject : [3.1 REGRESSION] Commit 5cec93c216db77c45f7ce970d46283bcb1933884 breaks the Chromium seccomp sandbox
> Submitter : Nix <[email protected]>
> Date : 2011-11-14 0:40
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=132123396226377&w=2
>
> Subject : max PWM is zero
> Submitter : Marcos Paulo de Souza <[email protected]>
> Date : 2011-11-15 15:14
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=132137019330548&w=2
>
> Subject : Oops on suspend with libertas SDIO (Linux 3.2-rc2)
> Submitter : Sven Neumann <[email protected]>
> Date : 2011-11-17 15:36
> Message-ID : 1321544210.31090.6.camel@sven
> References : http://marc.info/?l=linux-kernel&m=132154527715807&w=2
>
> Subject : Impossible high cpu-time values for migration threads
> Submitter : Markus Trippelsdorf <[email protected]>
> Date : 2011-11-17 22:17
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=132156832124314&w=2
>
> Subject : WARNING: at mm/slub.c:3357, kernel BUG at mm/slub.c:3413
> Submitter : Markus Trippelsdorf <[email protected]>
> Date : 2011-11-18 7:25
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=132160119031794&w=2
>
> Subject : [REGRESSION] resume takes 10s longer due to e1b6eb3 (Bluetooth: Increase HCI reset timeout ...)
> Submitter : Tomáš Janoušek <[email protected]>
> Date : 2011-11-18 18:40
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=132164169511416&w=2
>
> Subject : [REGRESSION] sudden cpu hogging in kernel 3.2 rc-series
> Submitter : "Nicolas Kalkhof" <[email protected]>
> Date : 2011-11-18 20:33
> Message-ID : 506786689.810044.1321648395265.JavaMail.fmail@mwmweb010
> References : http://marc.info/?l=linux-kernel&m=132164909313594&w=2
>
> Thanks!

2011-11-28 08:33:17

by Michal Hocko

Subject: [PATCH] hugetlb: release pages in the error path of hugetlb_cow() (was: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1)

On Mon 21-11-11 14:18:29, Linus Torvalds wrote:
> On Mon, Nov 21, 2011 at 1:49 PM, Rafael J. Wysocki <[email protected]> wrote:
> >
> > Subject : hugetlb oops on 3.1.0-rc8-devel
> > Submitter : Andy Lutomirski <[email protected]>
> > Date : 2011-11-01 22:20
> > Message-ID : CALCETrW1mpVCz2tO5roaz1r6vnno+srHR-dHA6_pkRi2qiCfdw@mail.gmail.com
> > References : http://marc.info/?l=linux-kernel&m=132018604426692&w=2
>
> Despite the subject line, that's not an oops, it's a BUG_ON().
>
> And it *should* be fixed by commit ea4039a34c4c ("hugetlb: release
> pages in the error path of hugetlb_cow()") although I don't think Andy
> ever confirmed that (since it was hard to trigger).

AFAICS the issue was introduced by 0fe6e20b (hugetlb, rmap:
add reverse mapping for hugepage) in 2.6.36-rc1, so this is stable
material. I do not see the patch in any stable branch, so here we go.
The patch is on top of the 3.0.y branch and it applies as is to 3.1.y
as well.
---
From fdaa4aaa008cce149a5fd60934112acd8988e0b6 Mon Sep 17 00:00:00 2001
From: Hillf Danton <[email protected]>
Date: Tue, 15 Nov 2011 14:36:12 -0800
Subject: [PATCH] hugetlb: release pages in the error path of hugetlb_cow()

commit ea4039a34c4c206d015d34a49d0b00868e37db1d upstream.

If we fail to prepare an anon_vma, the {new, old}_page should be released,
or they will leak.

Signed-off-by: Hillf Danton <[email protected]>
Reviewed-by: Andrea Arcangeli <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
---
mm/hugetlb.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bfcf153..2b57cd9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2415,6 +2415,8 @@ retry_avoidcopy:
 	 * anon_vma prepared.
 	 */
 	if (unlikely(anon_vma_prepare(vma))) {
+		page_cache_release(new_page);
+		page_cache_release(old_page);
 		/* Caller expects lock to be held */
 		spin_lock(&mm->page_table_lock);
 		return VM_FAULT_OOM;
--
1.7.7.3
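
To illustrate why the two added page_cache_release() calls matter, here
is a minimal user-space sketch of the reference counting involved. It is
not the kernel code: struct page_model, page_get(), page_put() and
cow_model() are hypothetical names invented for this illustration, and
the success path is heavily simplified.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified page carrying only a reference count. */
struct page_model { int refcount; };

enum fault_result { FAULT_DONE, FAULT_OOM };

void page_get(struct page_model *p) { p->refcount++; }
void page_put(struct page_model *p) { assert(p->refcount > 0); p->refcount--; }

/*
 * Models only the reference handling around the failing step.
 * prepare_fails stands in for anon_vma_prepare() returning an error;
 * release_on_error selects the behaviour with (true) or without (false)
 * the fix from the patch above.
 */
enum fault_result cow_model(struct page_model *old_page,
			    struct page_model *new_page,
			    bool prepare_fails, bool release_on_error)
{
	page_get(old_page);	/* temporary reference held for the copy */
	page_get(new_page);	/* reference of the freshly allocated page */

	if (prepare_fails) {
		if (release_on_error) {
			page_put(new_page);
			page_put(old_page);
		}
		return FAULT_OOM;	/* without the puts, both references leak */
	}

	/* Simplified success path: drop the temporary reference on the old
	 * page; the new page keeps its reference because it is now mapped. */
	page_put(old_page);
	return FAULT_DONE;
}
```

Calling cow_model() with prepare_fails set and release_on_error cleared
leaves both counts one higher than before the call, which is the leak the
changelog describes. With release_on_error set, both counts return to
their starting values.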


--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

2011-11-22 12:22:17

by Andrea Arcangeli

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Mon, Nov 21, 2011 at 09:59:18PM -0800, Andrew Morton wrote:
> grr, nothing in that patch's changelog indicates that it fixed a
> regression nor that it fixed an infinite blockage of suspend.

Well it's not a recent thing so I didn't flag it as a regression. It
doesn't block it indefinitely: suspend works fine almost all the time (or
it would have been noticed before), and if you have bad luck and it
doesn't suspend the first time around, as someone experienced, trying
again a bit later will work.

> I moved it to my 3.2 queue, thanks.

Thanks!

2011-11-29 18:34:38

by Borislav Petkov

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Tue, Nov 29, 2011 at 01:04:14PM -0500, Konrad Rzeszutek Wilk wrote:
> This patch:
>
> commit d91ee5863b71e8c90eaf6035bff3078a85e2e7b5
> Author: Len Brown <[email protected]>
> Date: Fri Apr 1 18:28:35 2011 -0400
>
> cpuidle: replace xen access to x86 pm_idle and default_idle
>
> ..scribble on pm_idle and access default_idle,
> have it simply disable_cpuidle() so acpi_idle will not load and
> architecture default HLT will be used.
>
> idea was to have one call - disable_cpuidle() which would make
> pm_idle not be molested by other code. It disallows cpuidle_idle_call
> and acpi_idle_call to not set pm_idle (which is excellent). But the

what is acpi_idle_call, I can't find it anywhere.

> amd_e400_idle and mwait_idle can still setup pm_idle which we really
> do not want.

This is not the case: rather select_idle_routine()/idle_setup() sets
pm_idle.

[..]

> +bool set_pm_idle_to_default()
> +{
> +	if (!pm_idle) {
> +		pm_idle = default_idle;
> +		return true;
> +	}
> +	return false;
> +}

I don't understand what you're trying to achieve here? Do you want
default_idle to be always the pm_idle for xen or what is the deal here?

If yes, then simply do:

bool set_pm_idle_to_default(void) // remember to add "void" for no function args
{
	bool ret = !!pm_idle;

	pm_idle = default_idle;

	return ret;
}

...

> void stop_this_cpu(void *dummy)
> {
> local_irq_disable();
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index 46d6d21..7506181 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -448,6 +448,6 @@ void __init xen_arch_setup(void)
> #endif
> disable_cpuidle();
> boot_option_idle_override = IDLE_HALT;
> -
> + WARN_ON(!set_pm_idle_to_default());

and then do

WARN_ON(set_pm_idle_to_default());

instead of having arbitrary confusing logic. This way you can warn
whether something else set pm_idle already. Or?
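
The difference between the posted helper and the variant suggested above
can be sketched in user space. This is only a model under stated
assumptions: pm_idle is a plain static pointer, default_idle and
mwait_idle are dummy stand-ins for the kernel routines, and the names
set_pm_idle_if_unset, set_pm_idle_forced, compare_variants and the
model_* helpers are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*idle_fn)(void);

/* User-space stand-ins for the kernel symbols discussed above. */
static idle_fn pm_idle;
static unsigned long halt_calls, mwait_calls;

void default_idle(void) { halt_calls++; }
void mwait_idle(void)   { mwait_calls++; }

/* As posted: install default_idle only if pm_idle is unset.
 * Returns true when default_idle was installed. */
bool set_pm_idle_if_unset(void)
{
	if (!pm_idle) {
		pm_idle = default_idle;
		return true;
	}
	return false;
}

/* As suggested in review: always install default_idle.
 * Returns true when something else had been set before. */
bool set_pm_idle_forced(void)
{
	bool ret = !!pm_idle;

	pm_idle = default_idle;
	return ret;
}

/* Hypothetical helpers to preset and inspect the modelled pointer. */
void model_set_pm_idle(idle_fn fn) { pm_idle = fn; }
idle_fn model_get_pm_idle(void) { return pm_idle; }

/* Walk both variants through the "already set" case. */
void compare_variants(void)
{
	pm_idle = mwait_idle;
	assert(!set_pm_idle_if_unset());	/* refuses to override ... */
	assert(pm_idle == mwait_idle);		/* ... so mwait_idle stays */

	pm_idle = mwait_idle;
	assert(set_pm_idle_forced());		/* reports the earlier setting ... */
	assert(pm_idle == default_idle);	/* ... and replaces it */
}
```

In this model the posted variant leaves mwait_idle in place when it was
set earlier, so a WARN_ON(!...) around it would fire but the idle routine
would stay unchanged. The forced variant ends with default_idle installed
in both cases, and its return value only says whether an override
happened.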

Thanks.

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

2011-11-22 05:49:49

by Srivatsa S. Bhat

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On 11/22/2011 03:19 AM, Rafael J. Wysocki wrote:
> This message contains a list of some regressions from 3.0 and 3.1
> for which there are no fixes in the mainline known to the tracking team.
> If any of them have been fixed already, please let us know.
>
> If you know of any other unresolved regressions from 3.0 and 3.1, please let us
> know as well and we'll add them to the list. Also, please let us know if any of
> the entries below are invalid.
>
> The entries below are simplified and the statistics are not present due to the
> continuing Bugzilla outage.
>
> Subject : iwlagn is getting very shaky
> Submitter : Norbert Preining <[email protected]>
> Date : 2011-10-19 6:01
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=131914553920614&w=2
>
> Subject : Regression: "irqpoll" hasn't been working for me since March 16 IRQ
> Submitter : Edward Donovan <[email protected]>
> Date : 2011-10-19 22:09
> Message-ID : CADdbW+HXdCPfJu2RTF6zz+ujCmiu_dmZwL2iScuF53p=AaZ1Uw@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=131914554220679&w=2
>
> Subject : Regression in 3.1 causes Xen to use wrong idle routine
> Submitter : Stefan Bader <[email protected]>
> Date : 2011-10-26 10:24
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-acpi&m=131962467924564&w=2
>
> Subject : 3.1+ iwlwifi lockup
> Submitter : Dave Jones <[email protected]>
> Date : 2011-10-31 14:34
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=132007169420160&w=2
>
> Subject : hugetlb oops on 3.1.0-rc8-devel
> Submitter : Andy Lutomirski <[email protected]>
> Date : 2011-11-01 22:20
> Message-ID : CALCETrW1mpVCz2tO5roaz1r6vnno+srHR-dHA6_pkRi2qiCfdw@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=132018604426692&w=2
>
> Subject : Simultaneous cat and external keyboard input causing kernel panic
> Submitter : Timo Jyrinki <[email protected]>
> Date : 2011-11-03 12:14
> Message-ID : CAJtFfxmovJHspHHKbvBVc4pw+u5mjGmUejCXEzdV+GqE=jVSOQ@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=132032253903074&w=2
>
> Subject : Linus GIT - INFO: possible circular locking dependency detected
> Submitter : Miles Lane <[email protected]>
> Date : 2011-11-03 15:57
> Message-ID : CAHFgRy8S0xLfhZxTUOEH5A0PL_Fb79-0-gmbQ=9h2D-xMqt1hA@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=132033587908426&w=2
>
> Subject : lockdep warning after aa6afca5bcab: "proc: fix races against execve() of /proc/PID/fd**"
> Submitter : Ari Savolainen <[email protected]>
> Date : 2011-11-08 3:47
> Message-ID : CAEbykaXYZEFhTgWMm2AfaWQ2SaXYuO_ypTnw+6AVWScOYSCuuw@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=132072413125099&w=2
>
> Subject : DMA-API check_sync errors with 3.2
> Submitter : Josh Boyer <[email protected]>
> Date : 2011-11-08 17:31
> Message-ID : [email protected]
> References : http://marc.info/?l=linux-kernel&m=132077357608797&w=2
>
> Subject : [3.1-rc8 REGRESSION] sky2 hangs machine on turning off or suspending
> Submitter : Rafał Miłecki <[email protected]>
> Date : 2011-11-09 11:46
> Message-ID : CACna6ryTdLcWVYgHu=_mRFga1sFivpE_DyZOY-HMmKggkWAJAw@mail.gmail.com
> References : http://marc.info/?l=linux-netdev&m=132083922228088&w=4
>
> Subject : 3.2-rc1 doesn't boot on dual socket opteron without swap
> Submitter : Niklas Schnelle <[email protected]>
> Date : 2011-11-10 20:09
> Message-ID : 1320955769.1718.8.camel@jupiter
> References : http://marc.info/?l=linux-kernel&m=132095583501767&w=2
>
> Subject : Sparc-32 doesn't work in 3.1.
> Submitter : Rob Landley <[email protected]>
> Date : 2011-11-12 11:22
> Message-ID : [email protected]
> References : http://www.spinics.net/lists/kernel/msg1260383.html
>
> Subject : khugepaged blocks suspend2ram (3.2.0-rc1-00252-g8f042aa)
> Submitter : Sergei Trofimovich <[email protected]>
> Date : 2011-11-12 10:48
> Message-ID : [email protected]
> References : https://lkml.org/lkml/2011/11/12/11
>

Andrea's patch already fixes this issue, which was reported first by
Jiri Slaby. https://lkml.org/lkml/2011/11/11/93
I remember Andrew Morton taking this patch in his -mm tree. But it is
not in mainline yet. So can we consider this closed or not?

Thanks,
Srivatsa S. Bhat


2011-11-29 20:09:35

by Konrad Rzeszutek Wilk

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Tue, Nov 29, 2011 at 07:34:28PM +0100, Borislav Petkov wrote:
> On Tue, Nov 29, 2011 at 01:04:14PM -0500, Konrad Rzeszutek Wilk wrote:
> > This patch:
> >
> > commit d91ee5863b71e8c90eaf6035bff3078a85e2e7b5
> > Author: Len Brown <[email protected]>
> > Date: Fri Apr 1 18:28:35 2011 -0400
> >
> > cpuidle: replace xen access to x86 pm_idle and default_idle
> >
> > ..scribble on pm_idle and access default_idle,
> > have it simply disable_cpuidle() so acpi_idle will not load and
> > architecture default HLT will be used.
> >
> > idea was to have one call - disable_cpuidle() which would make
> > pm_idle not be molested by other code. It disallows cpuidle_idle_call
> > and acpi_idle_call to not set pm_idle (which is excellent). But the
>
> what is acpi_idle_call, I can't find it anywhere.

You are right. I had "acpi_idle_enter_*" and its friend in mind. Which
are called from the cpuidle_idle_call.

Let me fix that comment up.
>
> > amd_e400_idle and mwait_idle can still setup pm_idle which we really
> > do not want.
>
> This is not the case: rather select_idle_routine()/idle_setup() sets
> pm_idle.

Yes. Let me fix up the comment.
>
> [..]
>
> > +bool set_pm_idle_to_default()
> > +{
> > + if (!pm_idle) {
> > + pm_idle = default_idle;
> > + return true;
> > + }
> > + return false;
> > +}
>
> I don't understand what you're trying to achieve here? Do you want
> default_idle to be always the pm_idle for xen or what is the deal here?

Yes (always want default_idle).
>
> If yes, then simply do:
>
> bool set_pm_idle_to_default(void) // remember to add "void" for no function args
> {
> bool ret = !!pm_idle;
>
> pm_idle = default_idle;

That would work too.
>
> return ret;
>
> }
>
> ...
>
> > void stop_this_cpu(void *dummy)
> > {
> > local_irq_disable();
> > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> > index 46d6d21..7506181 100644
> > --- a/arch/x86/xen/setup.c
> > +++ b/arch/x86/xen/setup.c
> > @@ -448,6 +448,6 @@ void __init xen_arch_setup(void)
> > #endif
> > disable_cpuidle();
> > boot_option_idle_override = IDLE_HALT;
> > -
> > + WARN_ON(!set_pm_idle_to_default());
>
> and then do
>
> WARN_ON(set_pm_idle_to_default());
>
> instead of having arbitrary confusing logic. This way you can warn
> whether something else set pm_idle already. Or?

That would work as well.
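
For readers following the review, the difference between the two return
conventions can be shown with a small userspace model. This is only a
sketch: pm_idle, default_idle and mwait_idle here are stand-ins for the
kernel symbols, set_if_unset() is a made-up name for the variant from the
first posting, and run_examples() is a hypothetical self-check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel symbols; not the real definitions. */
typedef void (*idle_fn)(void);

static void default_idle(void) { /* the kernel version would HLT */ }
static void mwait_idle(void)   { /* the kernel version would MWAIT */ }

static idle_fn pm_idle;        /* NULL until an idle routine is chosen */

/* Variant from the first posting: only fill in pm_idle if it is unset. */
static bool set_if_unset(void)
{
	if (!pm_idle) {
		pm_idle = default_idle;
		return true;   /* default_idle was installed */
	}
	return false;          /* pm_idle was already set and is left alone */
}

/* Variant suggested in review: always install, report the prior state. */
static bool set_pm_idle_to_default(void)
{
	bool ret = !!pm_idle;  /* true if something had already set pm_idle */

	pm_idle = default_idle;
	return ret;
}

/* Hypothetical self-check exercising both variants. */
void run_examples(void)
{
	pm_idle = mwait_idle;
	assert(!set_if_unset());            /* reports that it did nothing */
	assert(pm_idle == mwait_idle);      /* and mwait_idle stays in place */

	pm_idle = mwait_idle;
	assert(set_pm_idle_to_default());   /* reports "was already set" */
	assert(pm_idle == default_idle);    /* but still forces default_idle */

	pm_idle = NULL;
	assert(!set_pm_idle_to_default());  /* nothing to warn about */
	assert(pm_idle == default_idle);
}
```

With the first variant the caller has to warn on !ret and may still be left
with the wrong idle routine. With the second, the return value feeds WARN_ON()
directly and pm_idle ends up as default_idle in every case.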

2011-11-21 21:46:46

by Rafael J. Wysocki

Subject: 3.2-rc2+: Reported regressions from 3.0 and 3.1

This message contains a list of some regressions from 3.0 and 3.1
for which there are no fixes in the mainline known to the tracking team.
If any of them have been fixed already, please let us know.

If you know of any other unresolved regressions from 3.0 and 3.1, please let us
know as well and we'll add them to the list. Also, please let us know if any of
the entries below are invalid.

The entries below are simplified and the statistics are not present due to the
continuing Bugzilla outage.

Subject : iwlagn is getting very shaky
Submitter : Norbert Preining <[email protected]>
Date : 2011-10-19 6:01
Message-ID : [email protected]
References : http://marc.info/?l=linux-kernel&m=131914553920614&w=2

Subject : Regression: "irqpoll" hasn't been working for me since March 16 IRQ
Submitter : Edward Donovan <[email protected]>
Date : 2011-10-19 22:09
Message-ID : CADdbW+HXdCPfJu2RTF6zz+ujCmiu_dmZwL2iScuF53p=AaZ1Uw@mail.gmail.com
References : http://marc.info/?l=linux-kernel&m=131914554220679&w=2

Subject : Regression in 3.1 causes Xen to use wrong idle routine
Submitter : Stefan Bader <[email protected]>
Date : 2011-10-26 10:24
Message-ID : [email protected]
References : http://marc.info/?l=linux-acpi&m=131962467924564&w=2

Subject : 3.1+ iwlwifi lockup
Submitter : Dave Jones <[email protected]>
Date : 2011-10-31 14:34
Message-ID : [email protected]
References : http://marc.info/?l=linux-kernel&m=132007169420160&w=2

Subject : hugetlb oops on 3.1.0-rc8-devel
Submitter : Andy Lutomirski <[email protected]>
Date : 2011-11-01 22:20
Message-ID : CALCETrW1mpVCz2tO5roaz1r6vnno+srHR-dHA6_pkRi2qiCfdw@mail.gmail.com
References : http://marc.info/?l=linux-kernel&m=132018604426692&w=2

Subject : Simultaneous cat and external keyboard input causing kernel panic
Submitter : Timo Jyrinki <[email protected]>
Date : 2011-11-03 12:14
Message-ID : CAJtFfxmovJHspHHKbvBVc4pw+u5mjGmUejCXEzdV+GqE=jVSOQ@mail.gmail.com
References : http://marc.info/?l=linux-kernel&m=132032253903074&w=2

Subject : Linus GIT - INFO: possible circular locking dependency detected
Submitter : Miles Lane <[email protected]>
Date : 2011-11-03 15:57
Message-ID : CAHFgRy8S0xLfhZxTUOEH5A0PL_Fb79-0-gmbQ=9h2D-xMqt1hA@mail.gmail.com
References : http://marc.info/?l=linux-kernel&m=132033587908426&w=2

Subject : lockdep warning after aa6afca5bcab: "proc: fix races against execve() of /proc/PID/fd**"
Submitter : Ari Savolainen <[email protected]>
Date : 2011-11-08 3:47
Message-ID : CAEbykaXYZEFhTgWMm2AfaWQ2SaXYuO_ypTnw+6AVWScOYSCuuw@mail.gmail.com
References : http://marc.info/?l=linux-kernel&m=132072413125099&w=2

Subject : DMA-API check_sync errors with 3.2
Submitter : Josh Boyer <[email protected]>
Date : 2011-11-08 17:31
Message-ID : [email protected]
References : http://marc.info/?l=linux-kernel&m=132077357608797&w=2

Subject : [3.1-rc8 REGRESSION] sky2 hangs machine on turning off or suspending
Submitter : Rafał Miłecki <[email protected]>
Date : 2011-11-09 11:46
Message-ID : CACna6ryTdLcWVYgHu=_mRFga1sFivpE_DyZOY-HMmKggkWAJAw@mail.gmail.com
References : http://marc.info/?l=linux-netdev&m=132083922228088&w=4

Subject : 3.2-rc1 doesn't boot on dual socket opteron without swap
Submitter : Niklas Schnelle <[email protected]>
Date : 2011-11-10 20:09
Message-ID : 1320955769.1718.8.camel@jupiter
References : http://marc.info/?l=linux-kernel&m=132095583501767&w=2

Subject : Sparc-32 doesn't work in 3.1.
Submitter : Rob Landley <[email protected]>
Date : 2011-11-12 11:22
Message-ID : [email protected]
References : http://www.spinics.net/lists/kernel/msg1260383.html

Subject : khugepaged blocks suspend2ram (3.2.0-rc1-00252-g8f042aa)
Submitter : Sergei Trofimovich <[email protected]>
Date : 2011-11-12 10:48
Message-ID : [email protected]
References : https://lkml.org/lkml/2011/11/12/11

Subject : WARNING: at fs/sysfs/sysfs.h:195 (during boot)
Submitter : Markus Trippelsdorf <[email protected]>
Date : 2011-11-13 19:24
Message-ID : [email protected]
References : http://marc.info/?l=linux-kernel&m=132121231921932&w=2

Subject : PROBLEM: Radeon display connector : No monitor connected or invalid EDID
Submitter : Treeve Jelbert <[email protected]>
Date : 2011-11-13 17:27
Message-ID : 2407026.akcTO2Ggic@gemini-64
References : http://marc.info/?l=linux-kernel&m=132120530920139&w=2

Subject : [3.1 REGRESSION] Commit 5cec93c216db77c45f7ce970d46283bcb1933884 breaks the Chromium seccomp sandbox
Submitter : Nix <[email protected]>
Date : 2011-11-14 0:40
Message-ID : [email protected]
References : http://marc.info/?l=linux-kernel&m=132123396226377&w=2

Subject : max PWM is zero
Submitter : Marcos Paulo de Souza <[email protected]>
Date : 2011-11-15 15:14
Message-ID : [email protected]
References : http://marc.info/?l=linux-kernel&m=132137019330548&w=2

Subject : Oops on suspend with libertas SDIO (Linux 3.2-rc2)
Submitter : Sven Neumann <[email protected]>
Date : 2011-11-17 15:36
Message-ID : 1321544210.31090.6.camel@sven
References : http://marc.info/?l=linux-kernel&m=132154527715807&w=2

Subject : Impossible high cpu-time values for migration threads
Submitter : Markus Trippelsdorf <[email protected]>
Date : 2011-11-17 22:17
Message-ID : [email protected]
References : http://marc.info/?l=linux-kernel&m=132156832124314&w=2

Subject : WARNING: at mm/slub.c:3357, kernel BUG at mm/slub.c:3413
Submitter : Markus Trippelsdorf <[email protected]>
Date : 2011-11-18 7:25
Message-ID : [email protected]
References : http://marc.info/?l=linux-kernel&m=132160119031794&w=2

Subject : [REGRESSION] resume takes 10s longer due to e1b6eb3 (Bluetooth: Increase HCI reset timeout ...)
Submitter : Tomáš Janoušek <[email protected]>
Date : 2011-11-18 18:40
Message-ID : [email protected]
References : http://marc.info/?l=linux-kernel&m=132164169511416&w=2

Subject : [REGRESSION] sudden cpu hogging in kernel 3.2 rc-series
Submitter : "Nicolas Kalkhof" <[email protected]>
Date : 2011-11-18 20:33
Message-ID : 506786689.810044.1321648395265.JavaMail.fmail@mwmweb010
References : http://marc.info/?l=linux-kernel&m=132164909313594&w=2

Thanks!

2011-11-21 22:29:25

by Andy Lutomirski

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Mon, Nov 21, 2011 at 2:18 PM, Linus Torvalds
<[email protected]> wrote:
> On Mon, Nov 21, 2011 at 1:49 PM, Rafael J. Wysocki <[email protected]> wrote:
>>
>> Subject : hugetlb oops on 3.1.0-rc8-devel
>> Submitter : Andy Lutomirski <[email protected]>
>> Date : 2011-11-01 22:20
>> Message-ID : CALCETrW1mpVCz2tO5roaz1r6vnno+srHR-dHA6_pkRi2qiCfdw@mail.gmail.com
>> References : http://marc.info/?l=linux-kernel&m=132018604426692&w=2
>
> Despite the subject line, that's not an oops, it's a BUG_ON().
>
> And it *should* be fixed by commit ea4039a34c4c ("hugetlb: release
> pages in the error path of hugetlb_cow()") although I don't think Andy
> ever confirmed that (since it was hard to trigger).

I haven't seen it again, but that probably doesn't mean anything.
I've also fixed a bug in some userspace software I was running, and
that fix means I'm probably not stressing that part of the kernel
anymore. (Even without the fix, it took two weeks to hit this.)

2011-11-22 07:16:47

by Andy Lutomirski

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Mon, Nov 21, 2011 at 2:34 PM, Andy Lutomirski <[email protected]> wrote:
> On Mon, Nov 21, 2011 at 2:11 PM, Linus Torvalds
> <[email protected]> wrote:
>> On Mon, Nov 21, 2011 at 1:49 PM, Rafael J. Wysocki <[email protected]> wrote:
>>>
>>> Subject : [3.1 REGRESSION] Commit 5cec93c216db77c45f7ce970d46283bcb1933884 breaks the Chromium seccomp sandbox
>>> Submitter : Nix <[email protected]>
>>> Date : 2011-11-14 0:40
>>> Message-ID : [email protected]
>>> References : http://marc.info/?l=linux-kernel&m=132123396226377&w=2
>>

This is apparently fixed in seccompsandbox. See:

https://code.google.com/p/seccompsandbox/issues/detail?id=17
https://code.google.com/p/seccompsandbox/source/detail?r=178

Unless someone objects, I'll consider this to not be a kernel
regression worth fixing.

--Andy

2011-11-21 22:22:56

by Linus Torvalds

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Mon, Nov 21, 2011 at 1:49 PM, Rafael J. Wysocki <[email protected]> wrote:
>
> Subject : [3.1-rc8 REGRESSION] sky2 hangs machine on turning off or suspending
> Submitter : Rafał Miłecki <[email protected]>
> Date : 2011-11-09 11:46
> Message-ID : CACna6ryTdLcWVYgHu=_mRFga1sFivpE_DyZOY-HMmKggkWAJAw@mail.gmail.com
> References : http://marc.info/?l=linux-netdev&m=132083922228088&w=4

This should be fixed by commit 1401a8008a09 ("sky2: fix hang on
shutdown (and other irq issues)") in current -git.

Linus

2011-11-30 18:10:59

by Konrad Rzeszutek Wilk

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Tue, Nov 29, 2011 at 07:34:28PM +0100, Borislav Petkov wrote:
> On Tue, Nov 29, 2011 at 01:04:14PM -0500, Konrad Rzeszutek Wilk wrote:
> > This patch:


Borislav,

Thanks for your review comments. How does this patch look? I believe
I touched upon all of the things you mentioned.

From eb6dbd80078312c428dde69e9313606b7513a2e6 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <[email protected]>
Date: Mon, 21 Nov 2011 18:02:02 -0500
Subject: [PATCH] xen/pm_idle: Make pm_idle be default_idle under Xen.

This patch:

commit d91ee5863b71e8c90eaf6035bff3078a85e2e7b5
Author: Len Brown <[email protected]>
Date: Fri Apr 1 18:28:35 2011 -0400

cpuidle: replace xen access to x86 pm_idle and default_idle

..scribble on pm_idle and access default_idle,
have it simply disable_cpuidle() so acpi_idle will not load and
architecture default HLT will be used.

idea was to have one call - disable_cpuidle() which would make
pm_idle not be molested by other code. It disallows cpuidle_idle_call
to be set to pm_idle (which is excellent). But in the select_idle_routine()
and idle_setup(), the pm_idle can still be set to either:
amd_e400_idle, mwait_idle or default_idle. This depends on some
CPU flags (MWAIT) and in AMD case on the type of CPU.

In case of mwait_idle we can hit some instances where the hypervisor
(Amazon EC2 specifically) sets the MWAIT and we get:

Brought up 2 CPUs
invalid opcode: 0000 [#1] SMP
CPU 1
Modules linked in:

Pid: 0, comm: swapper Not tainted 3.1.0-0.rc6.git0.3.fc16.x86_64 #1
RIP: e030:[<ffffffff81015d1d>] [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
RSP: e02b:ffff8801d28ddf10 EFLAGS: 00010082
RAX: ffff8801d28dc010 RBX: ffff8801d28ddfd8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
RBP: ffff8801d28ddf10 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000001 R11: ffff8801d28ddfd8 R12: ffffffff81b590d0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8801dff81000(0000) knlGS:0000000000000000
CS: e033 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001a05000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
Process swapper (pid: 0, threadinfo ffff8801d28dc000, task ffff8801d28cae60)
Stack:
ffff8801d28ddf40 ffffffff8100e2ed ffff8801dff8e390 c136dfe72feab515
0000000000000000 0000000000000000 ffff8801d28ddf50 ffffffff8149ee78
0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
[<ffffffff8100e2ed>] cpu_idle+0xae/0xe8
[<ffffffff8149ee78>] cpu_bringup_and_idle+0xe/0x10
RIP [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
RSP <ffff8801d28ddf10>

In case of amd_e400_idle we don't get such spectacular crashes, but
we do end up making an MSR access which is trapped in the hypervisor,
and then follow it up with a yield hypercall. Meaning we end up
going to the hypervisor twice instead of just once.

The previous behavior before v3.0 was that pm_idle was set
to default_idle regardless of select_idle_routine/idle_setup.

We want to do that, but only for one specific case: Xen.
This patch does that.

Fixes RH BZ #739499 and Ubuntu #881076
Reported-by: Stefan Bader <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/include/asm/system.h | 1 +
arch/x86/kernel/process.c | 8 ++++++++
arch/x86/xen/setup.c | 2 +-
3 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/system.h b/arch/x86/include/asm/system.h
index c2ff2a1..2d2f01c 100644
--- a/arch/x86/include/asm/system.h
+++ b/arch/x86/include/asm/system.h
@@ -401,6 +401,7 @@ extern unsigned long arch_align_stack(unsigned long sp);
extern void free_init_pages(char *what, unsigned long begin, unsigned long end);

void default_idle(void);
+bool set_pm_idle_to_default(void);

void stop_this_cpu(void *dummy);

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 1f7f8c8..31f47ba 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -404,6 +404,14 @@ void default_idle(void)
EXPORT_SYMBOL(default_idle);
#endif

+bool set_pm_idle_to_default(void)
+{
+ bool ret = !!pm_idle;
+
+ pm_idle = default_idle;
+
+ return ret;
+}
void stop_this_cpu(void *dummy)
{
local_irq_disable();
diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 46d6d21..79dfb57 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -448,6 +448,6 @@ void __init xen_arch_setup(void)
#endif
disable_cpuidle();
boot_option_idle_override = IDLE_HALT;
-
+ WARN_ON(set_pm_idle_to_default());
fiddle_vdso();
}
--
1.7.7.3


2011-11-21 22:07:43

by Linus Torvalds

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Mon, Nov 21, 2011 at 1:49 PM, Rafael J. Wysocki <[email protected]> wrote:
>
> Subject : Simultaneous cat and external keyboard input causing kernel panic
> Submitter : Timo Jyrinki <[email protected]>
> Date : 2011-11-03 12:14
> Message-ID : CAJtFfxmovJHspHHKbvBVc4pw+u5mjGmUejCXEzdV+GqE=jVSOQ@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=132032253903074&w=2

So while funny, I doubt this is actually a bug. It's a feature, as
pointed out by Clemens Ladisch in that thread.

It's simply sysrq-c: "perform a system crash by a NULL pointer dereference".

Now, I'm perfectly willing to consider that feature to be a
mis-feature, and that this should be considered a bug to be fixed. But
it is not a regression.

Keeping it on the regression list just because it is amusing is
understandable, though ;)

Linus

2011-12-01 11:39:23

by Borislav Petkov

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

On Wed, Nov 30, 2011 at 12:59:36PM -0500, Konrad Rzeszutek Wilk wrote:
> On Tue, Nov 29, 2011 at 07:34:28PM +0100, Borislav Petkov wrote:
> > On Tue, Nov 29, 2011 at 01:04:14PM -0500, Konrad Rzeszutek Wilk wrote:
> > > This patch:
>
>
> Borislav,
>
> Thanks for your review comments. How does this patch look? I believe
> I touched upon all of the things you mentioned.
>
> From eb6dbd80078312c428dde69e9313606b7513a2e6 Mon Sep 17 00:00:00 2001
> From: Konrad Rzeszutek Wilk <[email protected]>
> Date: Mon, 21 Nov 2011 18:02:02 -0500
> Subject: [PATCH] xen/pm_idle: Make pm_idle be default_idle under Xen.
>
> This patch:
>
> commit d91ee5863b71e8c90eaf6035bff3078a85e2e7b5
> Author: Len Brown <[email protected]>
> Date: Fri Apr 1 18:28:35 2011 -0400
>
> cpuidle: replace xen access to x86 pm_idle and default_idle
>
> ..scribble on pm_idle and access default_idle,
> have it simply disable_cpuidle() so acpi_idle will not load and
> architecture default HLT will be used.
>
> idea was to have one call - disable_cpuidle() which would make
> pm_idle not be molested by other code. It disallows cpuidle_idle_call
> to be set to pm_idle (which is excellent). But in the select_idle_routine()
> and idle_setup(), the pm_idle can still be set to either:
> amd_e400_idle, mwait_idle or default_idle. This depends on some
> CPU flags (MWAIT) and in AMD case on the type of CPU.
>
> In case of mwait_idle we can hit some instances where the hypervisor
> (Amazon EC2 specifically) sets the MWAIT and we get:
>
> Brought up 2 CPUs
> invalid opcode: 0000 [#1] SMP
> CPU 1
> Modules linked in:
>
> Pid: 0, comm: swapper Not tainted 3.1.0-0.rc6.git0.3.fc16.x86_64 #1
> RIP: e030:[<ffffffff81015d1d>] [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
> RSP: e02b:ffff8801d28ddf10 EFLAGS: 00010082
> RAX: ffff8801d28dc010 RBX: ffff8801d28ddfd8 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
> RBP: ffff8801d28ddf10 R08: 0000000000000000 R09: 0000000000000001
> R10: 0000000000000001 R11: ffff8801d28ddfd8 R12: ffffffff81b590d0
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff8801dff81000(0000) knlGS:0000000000000000
> CS: e033 DS: 002b ES: 002b CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000000001a05000 CR4: 0000000000002660
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
> Process swapper (pid: 0, threadinfo ffff8801d28dc000, task ffff8801d28cae60)
> Stack:
> ffff8801d28ddf40 ffffffff8100e2ed ffff8801dff8e390 c136dfe72feab515
> 0000000000000000 0000000000000000 ffff8801d28ddf50 ffffffff8149ee78
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> Call Trace:
> [<ffffffff8100e2ed>] cpu_idle+0xae/0xe8
> [<ffffffff8149ee78>] cpu_bringup_and_idle+0xe/0x10
> RIP [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
> RSP <ffff8801d28ddf10>
>
> In case of amd_e400_idle we don't get such spectacular crashes, but
> we do end up making an MSR access which is trapped in the hypervisor,
> and then follow it up with a yield hypercall. Meaning we end up
> going to the hypervisor twice instead of just once.
>
> The previous behavior before v3.0 was that pm_idle was set
> to default_idle regardless of select_idle_routine/idle_setup.
>
> We want to do that, but only for one specific case: Xen.
> This patch does that.
>
> Fixes RH BZ #739499 and Ubuntu #881076
> Reported-by: Stefan Bader <[email protected]>
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
> ---
> arch/x86/include/asm/system.h | 1 +
> arch/x86/kernel/process.c | 8 ++++++++
> arch/x86/xen/setup.c | 2 +-
> 3 files changed, 10 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/include/asm/system.h b/arch/x86/include/asm/system.h
> index c2ff2a1..2d2f01c 100644
> --- a/arch/x86/include/asm/system.h
> +++ b/arch/x86/include/asm/system.h
> @@ -401,6 +401,7 @@ extern unsigned long arch_align_stack(unsigned long sp);
> extern void free_init_pages(char *what, unsigned long begin, unsigned long end);
>
> void default_idle(void);
> +bool set_pm_idle_to_default(void);
>
> void stop_this_cpu(void *dummy);
>
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index 1f7f8c8..31f47ba 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -404,6 +404,14 @@ void default_idle(void)
> EXPORT_SYMBOL(default_idle);
> #endif
>
> +bool set_pm_idle_to_default(void)
> +{
> + bool ret = !!pm_idle;
> +
> + pm_idle = default_idle;
> +
> + return ret;
> +}
> void stop_this_cpu(void *dummy)
> {
> local_irq_disable();
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index 46d6d21..79dfb57 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -448,6 +448,6 @@ void __init xen_arch_setup(void)
> #endif
> disable_cpuidle();
> boot_option_idle_override = IDLE_HALT;
> -
> + WARN_ON(set_pm_idle_to_default());
> fiddle_vdso();
> }

From what I can see, you get the following callchain:

start_kernel
|-> setup_arch
|-> x86_init.oem.arch_setup = xen_arch_setup
....
|-> check_bugs
|-> identify_boot_cpu
|-> identify_cpu
|-> select_idle_routine


so xen_arch_setup will set pm_idle and select_idle_routine will honour
it. I'm being told this is run either in the dom0 or the paravirt guest
and if so, I don't see any issue with this for baremetal. So you can
have my ACK provided this is tested on baremetal too.

Thanks.

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

2011-12-03 14:42:10

by Konrad Rzeszutek Wilk

Subject: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

> From what I can see, you get the following callchain:
>
> start_kernel
> |-> setup_arch
> |-> x86_init.oem.arch_setup = xen_arch_setup
> ....
> |-> check_bugs
> |-> identify_boot_cpu
> |-> identify_cpu
> |-> select_idle_routine
>
>
> so xen_arch_setup will set pm_idle and select_idle_routine will honour
> it. I'm being told this is run either in the dom0 or the paravirt guest
> and if so, I don't see any issue with this for baremetal. So you can
> have my ACK provided this is tested on baremetal too.

Tested on baremetal and there were no abnormalities. Thanks!
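
The boot ordering in the callchain quoted above can be modelled in userspace
roughly as follows. This is a sketch under assumptions, not kernel code:
struct cpu_model, the model_* functions and check_boot_order() are
hypothetical names, and the model assumes select_idle_routine() leaves an
already chosen pm_idle alone, which is how "honour" is read here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*idle_fn)(void);

/* Empty stand-ins for the kernel idle routines. */
static void default_idle(void)  { }
static void mwait_idle(void)    { }
static void amd_e400_idle(void) { }

static idle_fn pm_idle;

/* Hypothetical, simplified CPU description used only by this model. */
struct cpu_model {
	bool has_mwait;             /* MWAIT flag is advertised */
	bool has_amd_e400_erratum;  /* AMD CPU affected by erratum 400 */
};

/* Models xen_arch_setup(): runs early, from setup_arch(). */
static void model_xen_arch_setup(void)
{
	pm_idle = default_idle;
}

/* Models select_idle_routine(): runs later, via check_bugs(). */
static void model_select_idle_routine(const struct cpu_model *c)
{
	if (pm_idle)
		return;             /* an earlier choice is kept */

	if (c->has_mwait)
		pm_idle = mwait_idle;
	else if (c->has_amd_e400_erratum)
		pm_idle = amd_e400_idle;
	else
		pm_idle = default_idle;
}

/* Replays the boot order and returns the routine left in pm_idle. */
idle_fn model_boot(bool under_xen, const struct cpu_model *c)
{
	pm_idle = NULL;
	if (under_xen)
		model_xen_arch_setup();
	model_select_idle_routine(c);
	return pm_idle;
}

/* Hypothetical self-check: Xen always ends up with default_idle. */
void check_boot_order(void)
{
	struct cpu_model mwait_cpu = { .has_mwait = true };
	struct cpu_model e400_cpu  = { .has_amd_e400_erratum = true };

	assert(model_boot(true, &mwait_cpu) == default_idle);
	assert(model_boot(true, &e400_cpu) == default_idle);
	assert(model_boot(false, &mwait_cpu) == mwait_idle);
	assert(model_boot(false, &e400_cpu) == amd_e400_idle);
}
```

In this model the baremetal path is unchanged because the Xen setup step is
simply never run there, which matches the baremetal test result reported above.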