From: "H.J. Lu"
Date: Sat, 18 Jul 2020 16:04:46 -0700
Subject: Re: Random shadow stack pointer corruption
To: Dave Hansen
Cc: Yu-cheng Yu, Andy Lutomirski, LKML, X86 ML, Borislav Petkov, Dave Hansen, Ingo Molnar, "Ravi V. Shankar", Sebastian Andrzej Siewior, Tony Luck, Thomas Gleixner, Peter Zijlstra, Weijiang Yang
References: <7653c6c74a4eee18b8bdc8262e0c0b5b95f9d771.camel@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jul 18, 2020 at 3:41 PM Dave Hansen wrote:
>
> On 7/18/20 11:24 AM, Yu-cheng Yu wrote:
> > On Sat, 2020-07-18 at 11:00 -0700, Andy Lutomirski wrote:
> >> On Sat, Jul 18, 2020 at 10:58 AM Yu-cheng Yu wrote:
> >>> Hi,
> >>>
> >>> My shadow stack tests start to have random shadow stack pointer corruption after
> >>> v5.7 (excluding). The symptom looks like some locking issue or the kernel is
> >>> confused about which CPU a task is on. In later tip/master, this can be
> >>> triggered by creating two tasks and each does continuous
> >>> pthread_create()/pthread_join(). If the kernel has max_cpus=1, the issue goes
> >>> away. I also checked XSAVES/XRSTORS, but this does not seem to be an issue
> >>> coming from there.
> >>
> >> What do you mean "shadow stack pointer corruption"? Is SSP itself
> >> corrupt while running in the kernel? Is one of the MSRs getting
> >> corrupted? Is the memory to which the shadow stack points getting
> >> corrupted? Is the CPU rejecting an attempt to change SSP?
> >
> > What I see is, a new thread after ret_from_fork() and iret back to ring-3,
> > its shadow stack pointer (MSR_IA32_PL3_SSP) is corrupted.
>
> Does corrupt mean random? Or is it a valid stack address, just not for
> _this_ thread? Or NULL? Or is it a kernel address? Have you tried
> tracing *ALL* the WRMSR's and XRSTOR's that write to the MSR?

Another data point. When memory corruption happened, there was no core
dump at all. We verified that core dump was enabled and we did get core
dump for other programs.

-- 
H.J.
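
For reference, the trigger described in the quoted report (two tasks, each doing continuous pthread_create()/pthread_join()) could look roughly like the sketch below. This is not the actual test program. It assumes "two tasks" means two forked processes, and the names churn_threads() and run_two_tasks() and the iteration bound are made up for illustration (the report describes a continuous loop). It would only be expected to show the problem on a shadow-stack-enabled kernel and userspace.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Thread body: does nothing; the point is the create/join churn. */
static void *noop(void *arg)
{
    return arg;
}

/* Hypothetical helper: create and join one thread, `iterations` times.
 * Returns 0 on success, -1 if any create or join fails. */
int churn_threads(int iterations)
{
    for (int i = 0; i < iterations; i++) {
        pthread_t tid;

        if (pthread_create(&tid, NULL, noop, NULL) != 0)
            return -1;
        if (pthread_join(tid, NULL) != 0)
            return -1;
    }
    return 0;
}

/* Hypothetical helper: fork two child tasks that each churn threads,
 * then reap them. Returns 0 only if both children were started and
 * both exited normally with status 0. */
int run_two_tasks(int iterations)
{
    pid_t pids[2];
    int nforked = 0;
    int result = 0;

    for (int i = 0; i < 2; i++) {
        pid_t pid = fork();

        if (pid < 0) {
            result = -1;        /* fork failed; still reap earlier child */
            break;
        }
        if (pid == 0)           /* child: churn threads, then exit */
            _exit(churn_threads(iterations) == 0 ? 0 : 1);
        pids[nforked++] = pid;
    }

    for (int i = 0; i < nforked; i++) {
        int status;

        if (waitpid(pids[i], &status, 0) < 0)
            result = -1;
        else if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
            result = -1;        /* killed by a signal or reported failure */
    }
    return result;
}
```

Calling run_two_tasks() with a large iteration count from main() and building with "cc -pthread" gives a bounded run. A child killed by a signal shows up as a nonzero return, which is where one would then look for the missing core dump.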