From: Keno Fischer
Date: Tue, 14 Apr 2020 15:55:07 -0400
Subject: Re: [RFC PATCH v2] x86/arch_prctl: Add ARCH_SET_XCR0 to set XCR0 per-thread
To: Dave Hansen
Cc: Linux Kernel Mailing List, Thomas Gleixner, Ingo Molnar,
    "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)", "H. Peter Anvin",
    Borislav Petkov, Dave Hansen, Andi Kleen, Kyle Huey,
    "Robert O'Callahan", Andy Lutomirski, Peter Zijlstra
References: <20200407011259.GA72735@juliacomputing.com>
    <8f95e8b4-415f-1652-bb02-0a7c631c72ac@intel.com>
    <5208ad1e-cd9b-d57e-15b0-0ca935fccacd@intel.com>
    <9921cb2e-a7cb-c1d0-b120-c08f06be7c7f@intel.com>

Hi everyone,

I'd like to continue this discussion along two directions:

1) In this patch, what should happen to signal frames?

I continue to think that it would be good for these to observe the
process' XCR0, but I understand the argument that we should not let the
XCR0 setting modify any kernel behavior whatsoever. Andy, I would in
particular appreciate your views on this since I believe you thought it
should do the latter.

2) What would a solution based on the raw KVM API look like?

I'm still afraid that going down the KVM route would just end up back in
the same situation as we're in right now, but I'd like to explore this
further, so here's my current thinking:

Particularly for recording, the process does need to look very much like
a regular Linux process, so we can get recording of syscalls and signal
state right. I don't have enough of an intuition for the performance
implications of this. For example, suppose we added a way for the kernel
to directly take syscalls from guest CPL3 - what would the cost of
incurring a vmexit for every syscall be?

I suppose another idea would be to build a minimal Linux kernel that sits
in guest CPL0 and emulates at least the process state and other high
frequency syscalls, but forwards the rest to the host kernel. Seems
potentially doable, but a bit brittle - is there prior art here I should
be aware of, e.g. from people looking at securing containers?

As I mentioned, I had looked at Project Dune before
(http://dune.scs.stanford.edu/), which does seem to do a lot of the
things I would need, though it doesn't appear to currently be handling
signals at all, and of course it's also not really KVM based, but rather
KVM-but-copy-pasted-and-manually-hacked-up-in-a-separate.ko based.

I may also be missing a completely obvious way to do this - my apologies
if so. I would certainly appreciate any insight on how to achieve the set
of requirements here (multiple tracees with potentially differing XCR0
values, faithful and performant provision of syscalls/signals to the
tracees) on top of KVM.

If we can figure out a good way forward with KVM, I'd be quite interested
in it, since I think there may be additional performance games that could
be played by having part of rr be in guest CPL0. I'm just unsure that KVM
is really the right abstraction here, so I'd like to think through it a
bit.

Keno
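
For reference, whichever mechanism ends up giving tracees differing XCR0
values, a requested value has to satisfy the architectural consistency
rules that XSETBV enforces and cannot enable state components the host
lacks. The following is only an illustrative sketch of that check, not
code from the patch; the function names and the subset policy are
hypothetical, and only the bit positions come from the Intel SDM.

```python
# XCR0 state-component bits (Intel SDM). Everything else here is an
# illustrative assumption, not part of the proposed ARCH_SET_XCR0 patch.
X87 = 1 << 0
SSE = 1 << 1
AVX = 1 << 2
BNDREGS = 1 << 3
BNDCSR = 1 << 4
OPMASK = 1 << 5
ZMM_HI256 = 1 << 6
HI16_ZMM = 1 << 7

MPX = BNDREGS | BNDCSR
AVX512 = OPMASK | ZMM_HI256 | HI16_ZMM


def xcr0_is_consistent(xcr0: int) -> bool:
    """Return True if XSETBV would accept this combination of the bits above."""
    if not (xcr0 & X87):
        return False  # the x87 bit must always be set
    if (xcr0 & AVX) and not (xcr0 & SSE):
        return False  # AVX state requires SSE state
    if (xcr0 & MPX) not in (0, MPX):
        return False  # the two MPX bits must be set or cleared together
    avx512 = xcr0 & AVX512
    if avx512 not in (0, AVX512):
        return False  # the three AVX-512 bits must be set or cleared together
    if avx512 and not (xcr0 & AVX):
        return False  # AVX-512 state requires AVX state
    return True


def tracee_xcr0_allowed(requested: int, host: int) -> bool:
    """Hypothetical policy: a tracee may only narrow the host's XCR0."""
    return (requested & ~host) == 0 and xcr0_is_consistent(requested)
```

With these definitions, a tracee recorded on an AVX-only machine
(XCR0 = 0x7) could be replayed on an AVX-512 host (XCR0 = 0xe7) under the
narrower value, but not the other way around.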