From: "H.J. Lu"
Date: Tue, 28 Jul 2020 17:35:37 -0700
Subject: Re: Random shadow stack pointer corruption
To: Yu-cheng Yu
Cc: Dave Hansen, Andy Lutomirski, LKML, X86 ML, Borislav Petkov, Dave Hansen, Ingo Molnar, "Ravi V. Shankar", Sebastian Andrzej Siewior, Tony Luck, Thomas Gleixner, Peter Zijlstra, Weijiang Yang
References: <7653c6c74a4eee18b8bdc8262e0c0b5b95f9d771.camel@intel.com> <14f3bee7569f229541852f61f0a1a88fcdec7249.camel@intel.com>
In-Reply-To: <14f3bee7569f229541852f61f0a1a88fcdec7249.camel@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jul 18, 2020 at 4:35 PM Yu-cheng Yu wrote:
>
> On Sat, 2020-07-18 at 15:41 -0700, Dave Hansen wrote:
> > On 7/18/20 11:24 AM, Yu-cheng Yu wrote:
> > > On Sat, 2020-07-18 at 11:00 -0700, Andy Lutomirski wrote:
> > > > On Sat, Jul 18, 2020 at 10:58 AM Yu-cheng Yu wrote:
> > > > > Hi,
> > > > >
> > > > > My shadow stack tests start to have random shadow stack pointer corruption after
> > > > > v5.7 (excluding). The symptom looks like some locking issue or the kernel is
> > > > > confused about which CPU a task is on.
> > > > > In later tip/master, this can be
> > > > > triggered by creating two tasks and each does continuous
> > > > > pthread_create()/pthread_join(). If the kernel has max_cpus=1, the issue goes
> > > > > away. I also checked XSAVES/XRSTORS, but this does not seem to be an issue
> > > > > coming from there.
> > > >
> > > > What do you mean "shadow stack pointer corruption"? Is SSP itself
> > > > corrupt while running in the kernel? Is one of the MSRs getting
> > > > corrupted? Is the memory to which the shadow stack points getting
> > > > corrupted? Is the CPU rejecting an attempt to change SSP?
> > >
> > > What I see is, a new thread after ret_from_fork() and iret back to ring-3,
> > > its shadow stack pointer (MSR_IA32_PL3_SSP) is corrupted.
> >
> > Does corrupt mean random? Or is it a valid stack address, just not for
> > _this_ thread? Or NULL? Or is it a kernel address? Have you tried
> > tracing *ALL* the WRMSR's and XRSTOR's that write to the MSR?
>
> When a shadow stack address is changed, the address appears to be other task's.
> I traced all WRMSR's and XRSTOR's. I also verified there have not been any
> XRSTORS from a wrong buffer. When rc6 is tagged, I will re-base, test, and
> share current patches.
>

We have identified that

commit 91eeafea1e4b7c95cc4f38af186d7d48fceef89a
Author: Thomas Gleixner
Date:   Thu May 21 22:05:28 2020 +0200

    x86/entry: Switch page fault exception to IDTENTRY_RAW

    Convert page fault exceptions to IDTENTRY_RAW:
      - Implement the C entry point with DEFINE_IDTENTRY_RAW
      - Add the CR2 read into the exception handler
      - Add the idtentry_enter/exit_cond_rcu() invocations in
        in the regular page fault handler and in the async PF part.
      - Emit the ASM stub with DECLARE_IDTENTRY_RAW
      - Remove the ASM idtentry in 64-bit
      - Remove the CR2 read from 64-bit
      - Remove the open coded ASM entry code in 32-bit
      - Fix up the XEN/PV code
      - Remove the old prototypes

    No functional change.

triggered the shadow stack corruption when the process returned from a syscall.
The SSP MSR was somehow changed between setting the SSP MSR and IRET. Could there
be a page fault between setting the SSP MSR and IRET?

-- 
H.J.
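
For reference, the trigger described earlier in the thread (two tasks, each doing continuous pthread_create()/pthread_join()) might be sketched roughly as below. This is not the actual test used in the thread. The function names, the use of fork() to get the second task, and the bounded iteration count are all assumptions made for illustration. A task killed by a signal is reported as a failure.

```c
/* Hypothetical sketch of the trigger described in the thread: two tasks,
 * each doing repeated pthread_create()/pthread_join(). Not the real test. */
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Thread body: returns immediately; only thread setup/teardown matters. */
static void *noop_thread(void *arg)
{
    return arg;
}

/* Run `iterations` create/join cycles; return how many succeeded. */
long create_join_loop(long iterations)
{
    long ok = 0;
    for (long i = 0; i < iterations; i++) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, noop_thread, NULL) != 0)
            break;
        if (pthread_join(tid, NULL) != 0)
            break;
        ok++;
    }
    return ok;
}

/* Fork so that two tasks run the loop at the same time.
 * Returns 0 if both finished every cycle, -1 otherwise
 * (including the case where the child died from a signal). */
int run_two_tasks(long iterations)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
        _exit(create_join_loop(iterations) == iterations ? 0 : 1);

    int parent_ok = (create_join_loop(iterations) == iterations);
    int status = 0;
    if (waitpid(child, &status, 0) != child)
        return -1;
    int child_ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    return (parent_ok && child_ok) ? 0 : -1;
}
```

Calling run_two_tasks() with a large iteration count from main() (built with -pthread) approximates the "continuous" loop. Per the report above, the problem did not show up with max_cpus=1, so a sketch like this would only be meaningful on an SMP boot with shadow stack enabled.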