Date: Wed, 23 Oct 2019 15:47:57 +0200 (CEST)
From: Thomas Gleixner
To: Cyrill Gorcunov
Cc: LKML, Ingo Molnar, Peter Zijlstra, linux-mm@kvack.org, Catalin Marinas
Subject: Re: [BUG -tip] kmemleak and stacktrace cause page faul
References: <20191019114421.GK9698@uranus.lan> <20191022142325.GD12121@uranus.lan> <20191022145619.GE12121@uranus.lan>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 23 Oct 2019, Thomas Gleixner wrote:
> On Tue, 22 Oct 2019, Cyrill Gorcunov wrote:
> Ergo ep must be a valid pointer pointing to the statically allocated and
> statically initialized estack_pages array.
>
> 	/* Guard page? */
> 	if (!ep->size)
>
> How on earth can dereferencing ep crash the machine?
>
> 		return false;
>
> That does not make any sense.
>
> Surely, we should not even try to decode exception stack when
> cea_exception_stacks is not yet initialized, but that does not explain
> anything what you are observing.

So looking at your actual crash:

  [    0.027246] BUG: unable to handle page fault for address: 0000000000001ff0

So this dereferences the stack pointer address.

  [    0.082275] stk 0x1010 k 1 begin 0x0 end 0xd000 estack_pages 0xffffffff82014880 ep 0xffffffff82014888

ep is pointing correctly to estack_pages[1], which is bogus because 0x1010
is not a valid stack value, but dereferencing ep does not make it crash.

The crash is farther down:

	end = begin + (unsigned long)ep->size;

==> end = 0x2000

	regs = (struct pt_regs *)end - 1;

==> regs = 0x2000 - sizeof(struct pt_regs) = 0x1f58

	info->type	= ep->type;
	info->begin	= (unsigned long *)begin;
	info->end	= (unsigned long *)end;

---->	info->next_sp	= (unsigned long *)regs->sp;

This is the crashing instruction. regs->sp sits at offset 0x98 in struct
pt_regs, so the load accesses 0x1f58 + 0x98 = 0x1ff0.

And you are right, this happens because cea_exception_stacks is not yet
initialized, which makes begin = 0, so the computed addresses point into
nirvana.

So the fix is trivial.

Thanks,

	tglx

8<------------
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -94,6 +94,13 @@ static bool in_exception_stack(unsigned
 	BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);
 
 	begin = (unsigned long)__this_cpu_read(cea_exception_stacks);
+	/*
+	 * Handle the case where stack trace is collected _before_
+	 * cea_exception_stacks had been initialized.
+	 */
+	if (!begin)
+		return false;
+
 	end = begin + sizeof(struct cea_exception_stacks);
 	/* Bail if @stack is outside the exception stack area. */
 	if (stk < begin || stk >= end)
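
For anyone who wants to double check the address arithmetic in the analysis above, here is a minimal userspace sketch. It is not kernel code. Everything prefixed with mock_ is a hypothetical stand-in: mock_pt_regs mirrors the x86-64 struct pt_regs slot order, mock_pages assumes only page 1 is a real stack page (offs 0x1000, size 0x1000, matching the log line), and the 13 page area size is taken from "end 0xd000". The function only computes the address that regs->sp would be loaded from and never dereferences it. The patched flag mimics the new !begin check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mock of the x86-64 struct pt_regs layout: 21 64-bit slots. */
struct mock_pt_regs {
	uint64_t r15, r14, r13, r12, bp, bx, r11, r10, r9, r8;
	uint64_t ax, cx, dx, si, di, orig_ax, ip, cs, flags, sp, ss;
};

static_assert(sizeof(struct mock_pt_regs) == 0xa8, "21 * 8 bytes");
static_assert(offsetof(struct mock_pt_regs, sp) == 0x98, "sp is slot 19");

/* Hypothetical stand-in for one estack_pages[] entry. */
struct mock_estack_page {
	uint64_t offs;	/* offset of the stack from the area start */
	uint64_t size;	/* 0 marks a guard page */
};

#define MOCK_PAGE_SHIFT	12
#define MOCK_AREA_PAGES	13	/* assumption: 0xd000 bytes, as in the log */

/* Assumption: only page 1 belongs to a stack, all others act as guards. */
static const struct mock_estack_page mock_pages[MOCK_AREA_PAGES] = {
	[1] = { .offs = 0x1000, .size = 0x1000 },
};

/*
 * Compute the address regs->sp would be read from for stack pointer @stk.
 * Returns false where the lookup would bail out. With @patched set, an
 * uninitialized area (@begin == 0) is rejected up front.
 */
static bool mock_next_sp_addr(uint64_t begin, uint64_t stk, bool patched,
			      uint64_t *addr)
{
	const struct mock_estack_page *ep;
	uint64_t end;

	if (patched && !begin)
		return false;

	end = begin + ((uint64_t)MOCK_AREA_PAGES << MOCK_PAGE_SHIFT);
	if (stk < begin || stk >= end)
		return false;

	ep = &mock_pages[(stk - begin) >> MOCK_PAGE_SHIFT];
	if (!ep->size)
		return false;

	begin += ep->offs;
	end = begin + ep->size;

	/* (struct pt_regs *)end - 1 steps back one whole struct, not 8 bytes. */
	*addr = end - sizeof(struct mock_pt_regs) +
		offsetof(struct mock_pt_regs, sp);
	return true;
}
```

With begin = 0 and stk = 0x1010, the unpatched path (patched = false) yields 0x2000 - 0xa8 + 0x98 = 0x1ff0, which is the faulting address from the BUG line. With patched = true the same call returns false before any address is computed.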