Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754364AbcJUF4j (ORCPT ); Fri, 21 Oct 2016 01:56:39 -0400 Received: from merlin.infradead.org ([205.233.59.134]:43432 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754137AbcJUF4h (ORCPT ); Fri, 21 Oct 2016 01:56:37 -0400 Date: Fri, 21 Oct 2016 07:56:30 +0200 From: Peter Zijlstra To: Andy Lutomirski Cc: Roman Pen , Andy Lutomirski , Josh Poimboeuf , Borislav Petkov , Brian Gerst , Denys Vlasenko , "H . Peter Anvin" , Thomas Gleixner , Ingo Molnar , Tejun Heo , X86 ML , "linux-kernel@vger.kernel.org" Subject: Re: [PATCH 1/2] x86/dumpstack: on oops do not rewind stack for kthread Message-ID: <20161021055630.GA3102@twins.programming.kicks-ass.net> References: <20160921154350.13128-1-roman.penyaev@profitbricks.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23.1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1260 Lines: 30 On Thu, Oct 20, 2016 at 04:07:28PM -0700, Andy Lutomirski wrote: > On Wed, Sep 21, 2016 at 8:43 AM, Roman Pen > wrote: > > kthread uses stack and keeps completion structure on it to be woken up > > on vfork_done completion. > > > > In commit 2deb4be28 Andy Lutomirski rewinds the stack unconditionally > > and further completion of task->vfork_done for any kthread leads to stack > > corruption (or infinite spin on attempt to spin lock on garbage memory). > > This is sort of okay, but it will blow up pretty badly if a kthread > overflows its stack. Would it make more sense to change > rewind_stack_do_exit() to leave a big enough gap at the top of the > stack to avoid clobbering the completion? We need to preserve the entire struct kthread on the stack, kthread just abuses that pointer to stash an on-stack kthread descriptor. See kthread(): current->vfork_done = &self.exited; Its a horrible horrible thing kthread does. I suppose there might have been some intent by keeping that exited completion last in the structure, but *shudder*. But yes, leaving enough stack to not clobber that might keep this horror show working. ISTR talk about alternative schemes for this a long time ago, but I cannot recall :-(