Date: Wed, 25 Nov 2015 20:31:57 +0100 (CET)
From: Thomas Gleixner
To: Joe Lawrence
cc: LKML, Jiang Liu, x86@kernel.org
Subject: Re: irq_desc use-after-free in smp_irq_move_cleanup_interrupt
References: <5653B688.4050809@stratus.com>

On Wed, 25 Nov 2015, Thomas Gleixner wrote:
> The problem is actually in the vector assignment code.
>
> > [001] 22.936764: __assign_irq_vector : cpu 44 : vector=134 -> 0xffff88102a8196f8
>
> No interrupt happened so far. So nothing cleans up the vector on cpu 1
>
> > [044] 61.670267: __assign_irq_vector : cpu 34 : vector=123 -> 0xffff88102a8196f8
>
> Now that moves it from 44 to 34 and ignores that cpu 1 still has the
> vector assigned. __assign_irq_vector unconditionally overwrites
> data->old_domain, so the bit of cpu 1 is lost ....
>
> I'm staring into the code to figure out a fix ....

Just to figure out that my analysis was completely wrong.

__assign_irq_vector()
{
	if (d->move_in_progress)
		return -EBUSY;
	...

So that cannot happen.
Now the question is:

> [001] 22.936764: __assign_irq_vector : cpu 44 : vector=134 -> 0xffff88102a8196f8

So CPU1 still sees data->move_in_progress set:

[001] 54.636722: smp_irq_move_cleanup_interrupt : data->move_in_progress : vector=145 0xffff88102a8196f8

And why does __assign_irq_vector not see it, when no cleanup vector was
received by cpu1 with data->move_in_progress == 0?

> [044] 61.670267: __assign_irq_vector : cpu 34 : vector=123 -> 0xffff88102a8196f8

Ahhhhh.

__send_cleanup_vector()
{
	send_IPI();
	move_in_progress = 0;
}

So if CPU1 gets the IPI _BEFORE_ move_in_progress is set to 0, and
does not get another IPI before the next move ..... That has been that
way forever.

Duh. Working on a real fix this time.

Thanks,

	tglx
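
For illustration only: the ordering problem in __send_cleanup_vector()
described above can be modelled in plain userspace C. This is a toy
sketch, not kernel code. The struct, the toy_* function names, the
old_vector_in_use field and the handler_early switch are all made up
for this example; only the field name move_in_progress and the order
"send IPI, then clear the flag" come from the mail. The model assumes
the cleanup handler skips its work while move_in_progress is still set,
which is what the trace at 54.636722 suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-irq data; not the kernel's layout. */
struct toy_irq_data {
	bool move_in_progress;   /* a vector move is still pending */
	bool old_vector_in_use;  /* old cpu still has the vector assigned */
};

/*
 * Models the cleanup handler on the old cpu. Assumption: it bails out
 * while the move is still flagged as in progress.
 */
static void toy_cleanup_handler(struct toy_irq_data *d)
{
	assert(d != NULL);
	if (d->move_in_progress)
		return;                  /* stale flag: nothing is cleaned up */
	d->old_vector_in_use = false;
}

/*
 * Models the sender: the IPI goes out first, the flag is cleared second.
 * handler_early selects whether the old cpu handles the IPI before
 * (true) or after (false) the flag has been cleared.
 * Returns true if the old vector was left assigned, i.e. leaked.
 */
static bool toy_send_cleanup(struct toy_irq_data *d, bool handler_early)
{
	/* send_IPI() would happen here */
	if (handler_early)
		toy_cleanup_handler(d);  /* IPI handled before the store */

	d->move_in_progress = false;

	if (!handler_early)
		toy_cleanup_handler(d);  /* IPI handled after the store */

	return d->old_vector_in_use;
}
```

With handler_early set, toy_send_cleanup() returns true while
move_in_progress ends up 0: the next assignment is no longer refused
with -EBUSY, yet the old cpu never released its vector. With
handler_early clear, the handler sees the cleared flag and the function
returns false.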