From: Jann Horn
Date: Thu, 1 Nov 2018 19:33:33 +0100
Subject: Re: RFC: userspace exception fixups
To: Andy Lutomirski
Cc: Dave Hansen, sean.j.christopherson@intel.com, jethro@fortanix.com,
    jarkko.sakkinen@linux.intel.com, Florian Weimer, Linux API,
    Linus Torvalds, "the arch/x86 maintainers", linux-arch, kernel list,
    Peter Zijlstra, dalias@libc.org, nhorman@redhat.com,
    npmccallum@redhat.com, serge.ayoun@intel.com,
    shay.katz-zamir@intel.com, linux-sgx@vger.kernel.org,
    andriy.shevchenko@linux.intel.com, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, carlos@redhat.com, adhemerval.zanella@linaro.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 1, 2018 at 6:53 PM Andy Lutomirski wrote:
> The people working on SGX enablement are grappling with a somewhat
> annoying issue: the x86 EENTER instruction is used from user code and
> can, as part of its normal-ish operation, raise an exception. It is
> also highly likely to be used from a library, and signal handling in
> libraries is unpleasant at best.
>
> There's been some discussion of adding a vDSO entry point to wrap
> EENTER and do something sensible with the exceptions,

This sounds reasonable to me.

> but I'm
> wondering if a more general mechanism would be helpful.
>
> The basic idea would be to allow libc, or maybe even any library, to
> register a handler that gets a chance to act on an exception caused by
> a user instruction before a signal is delivered. As a straw-man
> example for how this could work, there could be a new syscall:
>
> long register_exception_handler(void (*handler)(int, siginfo_t *, void *));
>
> If a handler is registered, then, if a synchronous exception happens
> (page fault, etc), the kernel would set up an exception frame as usual
> but, rather than checking for signal handlers, it would just call the
> registered handler.
> That handler is expected to either handle the
> exception entirely on its own or to call one of two new syscalls to
> ask for normal signal delivery

If you do it this way, these exception handlers would have to chain,
with an API convention that you're obligated to always ask for
resumption of signal delivery if you don't recognize the address,
right? Kind of like a notifier chain. (Except that, unless this is
implemented in the vDSO, each notifier invocation would cross the
kernel-user boundary twice.)

> or to ask to retry the faulting instruction.

Why would that have to be a syscall? For signal handlers registered
with SA_NODEFER, you can basically leave the signal handler with a
longjmp, right?

> Alternatively, we could do something a lot more like the kernel's
> internal fixups where there's a table in user memory that maps
> potentially faulting instructions to landing pads that handle
> exceptions.

I like this direction more, although I'm not sure whether the table
the kernel sees should be at instruction-level granularity. Perhaps
you could associate an exception handler with a VMA? Any instruction
that faults in the VMA triggers the fault handler?

> Do you think this would be useful? Here are some use cases that I
> think are valid:
>
> (a) Enter an SGX enclave and handle errors. There would be two
> instructions that would need special handling: EENTER and ERESUME.
>
> (b) Do some math and catch division by zero. I think it would be a
> bad idea to have user code call a function and say that it wants to
> handle *any* division by zero, but having certain specified division
> instructions have special handling seems entirely reasonable.
>
> (c) Ditto for floating point errors.
>
> (d) Try an instruction and see if it gets #UD.
>
> (e) Run a bunch of code and handle page faults to a given address
> range by faulting something in. This is not like the others, in that
> a handler wants to handle a range of target addresses, not
> instructions. And userfaultfd is plausibly a better solution anyway.
>
> (f) Run NaCl-like sandboxed code where the code can cause page faults
> to certain mapped-but-intentionally-not-present ranges and those need
> to be handled.
>
> On Windows, you can use SEH to do crazy things like running
> known-buggy code and eating the page faults. I don't think we want to
> go there.
>
> All of this makes me think that the right solution is to have a way to
> register fault handlers for instructions to cover (a) - (d) and to
> treat (e) and (f) as something else entirely if there's enough demand.
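
For illustration, the SA_NODEFER-plus-longjmp escape mentioned above
can already be sketched with existing POSIX interfaces, with no new
syscall involved. This is only a rough sketch under assumptions: the
names probe_install(), probe_read() and probe_selftest() are made up
for this example and are not an API anyone in this thread has
proposed, the process is assumed to be single-threaded, and nothing
else is assumed to own SIGSEGV/SIGBUS. It shows a "try an access and
see whether it faults" probe that leaves the handler with
siglongjmp() rather than returning.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* MAP_ANONYMOUS, sigsetjmp under strict -std= modes */
#endif
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* All names below are hypothetical and exist only for this sketch. */
static sigjmp_buf probe_env;               /* landing pad for the current probe */
static volatile sig_atomic_t probe_active; /* nonzero while a probe is running */

static void probe_handler(int sig, siginfo_t *info, void *ucontext)
{
    (void)info;
    (void)ucontext;
    if (probe_active)
        siglongjmp(probe_env, 1); /* leave the handler without returning */
    /*
     * Not our fault: restore the default action and return, so that the
     * faulting instruction is retried and the default action applies.
     */
    signal(sig, SIG_DFL);
}

/* Installs the handler; returns 0 on success, -1 on failure. */
static int probe_install(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = probe_handler;
    /*
     * SA_NODEFER: the signal is not blocked while the handler runs, so
     * jumping out with a non-mask-restoring siglongjmp() leaves the
     * signal mask as it was before the fault.
     */
    sa.sa_flags = SA_SIGINFO | SA_NODEFER;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGBUS, &sa, NULL);
}

/* Returns 1 if one byte at addr could be read, 0 if the read faulted. */
static int probe_read(const volatile unsigned char *addr)
{
    if (sigsetjmp(probe_env, 0) != 0) { /* 0: do not save the signal mask */
        probe_active = 0;
        return 0; /* arrived here via siglongjmp() from the handler */
    }
    probe_active = 1;
    unsigned char byte = *addr; /* may fault */
    (void)byte;
    probe_active = 0;
    return 1;
}

/* Exercises the probe on a private page; returns 0 if it behaved as expected. */
static int probe_selftest(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *page = mmap(NULL, len, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    /* Two faults in a row: the second one must still reach the handler. */
    int ok = probe_read(page) == 0 && probe_read(page) == 0;

    /* Once the page is readable, the same probe should succeed. */
    if (mprotect(page, len, PROT_READ) != 0)
        ok = 0;
    ok = ok && probe_read(page) == 1;

    munmap(page, len);
    return ok ? 0 : -1;
}
```

A caller would run probe_install() once and then use probe_read() on
addresses it is unsure about. Note that this still has the properties
that make signal handling in libraries unpleasant: the handler is
process-wide, and the single global jump buffer is neither thread-safe
nor composable with other users of the same signals.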