Date: Wed, 18 Oct 2023 19:55:31 +0200
From: Borislav Petkov
To: Josh Poimboeuf
Cc: Ingo Molnar, linux-kernel@vger.kernel.org,
    linux-tip-commits@vger.kernel.org, David Kaplan,
    "Peter Zijlstra (Intel)", x86@kernel.org, David Howells
Subject: Re: [tip: x86/bugs] x86/retpoline: Ensure default return thunk isn't used at runtime
Message-ID: <20231018175531.GEZTAcE2p92U1AuVp1@fat_crate.local>
In-Reply-To: <20231018155433.z4auwckr5s27wnig@treble>

On Wed, Oct 18, 2023 at 08:54:33AM -0700, Josh Poimboeuf wrote:
> On Wed, Oct 18, 2023 at 05:12:45PM +0200, Borislav Petkov wrote:
> > On Wed, Oct 18, 2023 at 03:38:56PM +0200, Ingo Molnar wrote:
> > > If then WARN_ONCE().
> >
> > WARN_ONCE() is not enough considering that if this fires, it means we're
> > not really properly protected against one of those RET-speculation
> > things.
> >
> > It needs to be warning constantly but then still allow booting. I.e,
> > a ratelimited warn of sorts but I don't think we have that... yet.
>
> I'm not sure a rate-limited WARN() would be a good thing. Either the
> user is regularly checking dmesg (most likely in some automated fashion)
> or they're not. If the latter, a rate-limited WARN() would wrap dmesg
> pretty quickly.

Well, freezing the box without any mention about why it happens is not
viable either. So for lack of a better solution, overflowing dmesg is
all we could do.

And, on a related note, I'm thinking I should revert:

  e92626af3234 ("x86/retpoline: Remove .text..__x86.return_thunk section")

after all because I'm debugging another similar issue reported by
dhowells. And I can reproduce it on linux-next with his config and
gcc-13.

The splat looks like this below - and mind you, that's in a VM. On
baremetal you get to see only the first warning and output stops.

And that happens because for whatever reason apply_returns() can't find
that last jmp __x86_return_thunk for %r15 and it barfs.
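
For readers following along, here is a minimal userspace sketch of the kind
of check that a "missing return thunk" warning implies: decode a recorded
site as "jmp rel32" and compare its target with the thunk address. It is
not the kernel's apply_returns(); the helper name site_jumps_to_thunk() and
the addresses in the note below are hypothetical, and a little-endian host
is assumed.

```c
#include <stdint.h>
#include <string.h>

#define JMP32_OPCODE 0xe9 /* x86 "jmp rel32" opcode */
#define JMP32_LEN    5    /* 1 opcode byte + 4 displacement bytes */

/*
 * Hypothetical helper, not the kernel's apply_returns(): return 1 if the
 * five bytes at insn, located at address addr, encode a "jmp rel32" whose
 * target equals thunk, else 0. Assumes a little-endian host.
 */
static int site_jumps_to_thunk(const uint8_t insn[JMP32_LEN],
                               uint64_t addr, uint64_t thunk)
{
    int32_t rel;

    /* Anything other than a 5-byte near jump is not a match. */
    if (insn[0] != JMP32_OPCODE)
        return 0;

    /* Read the signed 32-bit displacement that follows the opcode. */
    memcpy(&rel, insn + 1, sizeof(rel));

    /* The target is relative to the end of the instruction. */
    return addr + JMP32_LEN + (uint64_t)(int64_t)rel == thunk;
}
```

With made-up addresses, the bytes e9 fb 0f 00 00 placed at 0x1000 resolve to
0x2000, so the helper returns 1 for a thunk at 0x2000. Fed the five bytes
dumped in the first splat (eb 74 66 66 2e), it returns 0, because the first
byte is 0xeb (the short "jmp rel8" opcode) rather than 0xe9.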

When I revert e92626af3234, it is fixed. It fixes dhowells' box too.

Which means, IMHO, objtool is failing to add a return call site at the
end of that __x86_indirect_thunk_r15.

And considering how close we are to the merge window, I'd let that
.text..__x86.return_thunk section exist so that objtool can find the
return sites more reliably than what we currently have. We can always do
e92626af3234 later, when it has seen more testing.

Now, to the UD2 case - look below at "* first splat". Stack protector
fires but there's no #UD exception. Well, there is, it is well hidden:

(gdb) bt
#0  delay_tsc (cycles=3670543) at arch/x86/lib/delay.c:90
#1  0xffffffff810c706e in panic (fmt=fmt@entry=0xffffffff82504fe4 "stack-protector: Kernel stack is corrupted in: %pB") at kernel/panic.c:456
#2  0xffffffff81d64afb in __stack_chk_fail () at kernel/panic.c:763
#3  0xffffffff810e9333 in notify_die (val=val@entry=DIE_TRAP, str=str@entry=0xffffffff824f49b8 "invalid opcode", regs=regs@entry=0xffff8880794000a8, err=err@entry=0, trap=trap@entry=6, sig=sig@entry=4) at kernel/notifier.c:597
#4  0xffffffff8101d4fb in do_error_trap (regs=regs@entry=0xffff8880794000a8, error_code=error_code@entry=0, str=str@entry=0xffffffff824f49b8 "invalid opcode", trapnr=trapnr@entry=6, signr=signr@entry=4, sicode=sicode@entry=2, addr=0xffffffff81d712a0 <__x86_return_thunk>) at arch/x86/kernel/traps.c:170
#5  0xffffffff81d62355 in handle_invalid_op (regs=0xffff8880794000a8) at ./arch/x86/include/asm/ptrace.h:209a
                          ^^^^^^^^^^^^^^^^
#6  exc_invalid_op (regs=0xffff8880794000a8) at arch/x86/kernel/traps.c:263
#7  0xffffffff81e00a96 in asm_exc_invalid_op () at ./arch/x86/include/asm/idtentry.h:568
#8  0xffffffff81d49ff4 in cmp_ex_sort (a=0x38020f, b=0x761d61d8b) at lib/extable.c:61
#9  0x000000000000000c in fixed_percpu_data ()
#10 0xffff888079400198 in ?? ()
#11 0xffffffff826cdc70 in __modver_attr ()
#12 0x0000000000000270 in ?? ()
#13 0xffffffff826ceb10 in __start___ex_table ()
#14 0x0000000000000000 in ?? ()
(gdb)

*while* it runs the #UD exception handler, stackprotector determines
that the stack has been corrupted, leading to that panic. And nothing in
dmesg tells the user what's really going on.

And with the warn, you can actually see it:

------------[ cut here ]------------
Unconverted return thunk
WARNING: CPU: 0 PID: 1 at arch/x86/kernel/cpu/bugs.c:2855 check_thunks+0x11/0x1a
Modules linked in:
...

----

I still need to figure out, though, how to make check_thunks *not* have
a "jmp __x86_return_thunk" at the end itself because it gets loopy. :)

* first splat
-------------

...
x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
------------[ cut here ]------------
missing return thunk: __x86_indirect_thunk_r15+0xa/0x5f-0x0: eb 74 66 66 2e
WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:755 apply_returns+0xca/0x247
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.6.0-rc6-next-20231018-build3 #4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:apply_returns+0xca/0x247
Code: 80 3d bd e1 aa 01 00 75 b4 49 89 d8 48 89 ea 48 89 de c6 05 ab e1 aa 01 01 b9 05 00 00 00 48 c7 c7 45 65 4f 82 e8 0b 10 0a 00 <0f> 0b eb 8f f6 05 36 2e 1f 02 02 74 26 0f b6 54 24 52 48 89 de 48
RSP: 0000:ffffffff82803e30 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffffff81d7122a RCX: 0000000000000003
RDX: 0000000000000086 RSI: 00000000fff7ffff RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: ffffffff82803969 R12: ffffffff831fed00
R13: 0000000000000005 R14: ffffffff831fed18 R15: 0000000000013af0
FS: 0000000000000000(0000) GS:ffff888079400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff888003401000 CR3: 0000000002854000 CR4: 00000000001506f0
Call Trace:
 ? __warn+0x8c/0xf6
 ? report_bug+0xbf/0x11f
 ? apply_returns+0xca/0x247
 ? handle_bug+0x3c/0x63
 ? exc_invalid_op+0x13/0x5d
 ? asm_exc_invalid_op+0x16/0x20
 ? __x86_indirect_thunk_r15+0xa/0x5f
 ? apply_returns+0xca/0x247
 ? __x86_indirect_thunk_r15+0xa/0x5f
 ? __x86_indirect_thunk_r15+0x19/0x5f
 ? __x86_indirect_thunk_r15+0xc/0x5f
 alternative_instructions+0x35/0xe2
 arch_cpu_finalize_init+0xba/0xdb
 start_kernel+0x4a1/0x524
 x86_64_start_reservations+0x25/0x25
 x86_64_start_kernel+0x73/0x73
 secondary_startup_64_no_verify+0x166/0x16b
---[ end trace 0000000000000000 ]---
Freeing SMP alternatives memory: 36K
pid_max: default: 32768 minimum: 301
LSM: initializing lsm=capability,yama,selinux
Yama: becoming mindful.
SELinux: Initializing.
Mount-cache hash table entries: 4096 (order: 3, 32768 bytes, linear)
Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes, linear)
smpboot: CPU0: Intel Core Processor (Haswell, no TSX) (family: 0x6, model: 0x3c, stepping: 0x1)
RCU Tasks Rude: Setting shift to 2 and lim to 1 rcu_task_cb_adjust=1.
RCU Tasks Trace: Setting shift to 2 and lim to 1 rcu_task_cb_adjust=1.
Performance Events: unsupported p6 CPU model 60 no PMU driver, software events only.
signal: max sigframe size: 1776
rcu: Hierarchical SRCU implementation.
rcu: Max phase no-delay instances is 1000.
NMI watchdog: Perf NMI watchdog permanently disabled
smp: Bringing up secondary CPUs ...
BUG: spinlock bad magic on CPU#0, swapper/0/1
Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: notify_die+0x52/0x5b
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.6.0-rc6-next-20231018-build3 #4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: notify_die+0x52/0x5b ]---

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette