From: Andy Lutomirski
Date: Fri, 5 Jul 2019 13:39:54 -0700
Subject: Re: [patch V2 04/25] x86/apic: Make apic_pending_intr_clear() more robust
To: Thomas Gleixner
Cc: Andy Lutomirski, Andrew Cooper, Josh Poimboeuf, Peter Zijlstra, LKML, X86 ML, Nadav Amit, Ricardo Neri, Stephane Eranian, Feng Tang
References: <20190704155145.617706117@linutronix.de> <20190704155608.636478018@linutronix.de> <958a67c2-4dc0-52e6-43b2-1ebd25a59232@citrix.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 5, 2019 at 1:36 PM Thomas Gleixner wrote:
>
> On Fri, 5 Jul 2019, Andy Lutomirski wrote:
> > On Fri, Jul 5, 2019 at 8:47 AM Andrew Cooper wrote:
> > > Because TPR is 0, an incoming IPI can trigger #AC, #CP, #VC or #SX
> > > without an error code on the stack, which results in a corrupt pt_regs
> > > in the exception handler, and a stack underflow on the way back out,
> > > most likely with a fault on IRET.
> > >
> > > These can be addressed by setting TPR to 0x10, which will inhibit
> > > delivery of any errant IPIs in this range, but some extra sanity logic
> > > may not go amiss. An error code on a 64bit stack can be spotted with
> > > `testb $8, %spl` due to %rsp being aligned before pushing the exception
> > > frame.
> >
> > Several years ago, I remember having a discussion with someone (Jan
> > Beulich, maybe?) about how to efficiently make the entry code figure
> > out the error code situation automatically. I suspect it was on IRC
> > and I can't find the logs. I'm thinking that maybe we should just
> > make Linux's idtentry code do something like this.
> >
> > If nothing else, we could make idtentry do:
> >
> >     testl $8, %esp   /* shorter than testb IIRC */
> >     jz 1f            /* or jnz -- too lazy to figure it out */
> >     pushq $-1
> > 1:
>
> Errm, no. We should not silently paper over it. If we detect that this came
> in with a wrong stack frame, i.e. not from a CPU originated exception, then
> we truly should yell loud. Also in that case you want to check the APIC:ISR
> and issue an EOI to clear it.

It gives us the option to replace idtentry with something
table-driven. I don't think I love it, but it's not an awful idea.

>
> > > Another interesting problem is an IPI whose vector is 0x80. A cunning
> > > attacker can use this to simulate system calls from unsuspecting
> > > positions in userspace, or for interrupting kernel context. At the very
> > > least the int0x80 path does an unconditional swapgs, so will try to run
> > > with the user gs, and I expect things will explode quickly from there.
> >
> > At least SMAP helps here on non-FSGSBASE systems. With FSGSBASE, I
>
> How does it help? It still crashes the kernel.
>
> > suppose we could harden this by adding a special check to int $0x80 to
> > validate GSBASE.
>
> > > One option here is to look at ISR and complain if it is found to be set.
> >
> > Barring some real hackery, we're toast long before we get far enough to
> > do that.
>
> No. We can map the APIC into the user space visible page tables for PTI
> without compromising the PTI isolation and it can be read very early on
> before SWAPGS. All you need is a register to clobber, nothing more. If the
> ISR is set, then go into an error path, yell loudly, issue EOI and return.
> The only issue I can see is: It's slow :)
>
>

I think this will be really extremely slow. If we can restrict this
to x2apic machines, then maybe it's not so awful.

FWIW, if we just patch up the GS thing, then we are still vulnerable:
the bad guy can arrange for a privileged process to have register
state corresponding to a dangerous syscall and then send an int $0x80
via the APIC.
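
For readers following the `testb $8, %spl` argument quoted earlier in this thread, here is a minimal user-space sketch of the arithmetic behind it. It assumes only what the quoted text states (in 64-bit mode the CPU aligns %rsp to 16 bytes before pushing the exception frame) plus the standard five-quadword hardware frame (SS, RSP, RFLAGS, CS, RIP). The function names `entry_rsp` and `frame_has_error_code` are hypothetical and exist only for illustration; this is a model of the stack pointer, not kernel entry code.

```python
FRAME_QWORDS = 5  # SS, RSP, RFLAGS, CS, RIP pushed by the CPU


def entry_rsp(rsp_before: int, has_error_code: bool) -> int:
    """Model the stack pointer seen at handler entry (hypothetical helper)."""
    rsp = rsp_before & ~0xF      # CPU aligns RSP down to a 16-byte boundary
    rsp -= FRAME_QWORDS * 8      # 40-byte hardware frame
    if has_error_code:
        rsp -= 8                 # one extra quadword for the error code
    return rsp


def frame_has_error_code(rsp_at_entry: int) -> bool:
    """Model of `testb $8, %spl`: bit 3 clear means an error code was pushed."""
    return (rsp_at_entry & 8) == 0


# Whatever the interrupted stack pointer was, bit 3 tells the two frames apart.
for rsp in (0x7FFE0000, 0x7FFE0008, 0x7FFE0FF3):
    with_code = entry_rsp(rsp, True)
    without_code = entry_rsp(rsp, False)
    print(hex(rsp), frame_has_error_code(with_code), frame_has_error_code(without_code))
```

Under this model, a frame without an error code leaves bit 3 of the stack pointer set (40 bytes below a 16-byte boundary), while a frame with one leaves it clear (48 bytes below). In the quoted idtentry snippet, that corresponds to `jz` skipping the `pushq $-1` when an error code is already present.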