Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp3195830imu; Mon, 17 Dec 2018 15:16:20 -0800 (PST) X-Google-Smtp-Source: AFSGD/WjMongW17BcKE9a2PVVb74UNfczvrCqh5GqoHjE00S4g2LYxczRiJ6z1cJnKMYuBhtYEXw X-Received: by 2002:a63:dc54:: with SMTP id f20mr13931459pgj.410.1545088580142; Mon, 17 Dec 2018 15:16:20 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1545088580; cv=none; d=google.com; s=arc-20160816; b=ZNkHihhEYoFfF2R0EFT4cyGwaEBFocLipggs/yt3qpSRv+i0vr1z/L7FFrNiR2E8hC l9QwmgqR5qIxh6WkAAaxkGJLR+BY3gQabW/2H5E4CJwiednHXZnrdmlFqyDcOhKOaLz5 kzqI8piEq02MoVW0ONTMefjZ06eMs8ghl8EfsIdRkm/Kmm/coOgk1e7rmDIOK7CVt9ki 9A7GC6rYVWUjXmlhhcx5gQEHZyrZyOFl73QChqKeDueeJzwjWQyUTMnF/Im55OkByG9d 0wLlh3NV6UrRSx+KG50JmaOwQ+GQ/0t3dCTIy++qxsoLtqc7lpsGdpP5EulSj4jV1+/S 7uOg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=a02yNf5XVfcSFjGh+n7PKr81Vb2QXs/FOGpdcMnqzVc=; b=YSn+MBDi3BuE9jra9YcTF2ruUPpNYKbzFiF1o129c2Xq3GQPWeZwJlrrAf3Ckun9vp 1w+iQFU4xM3JHoyLk0CwS7abEPT2NxqVC8fy63mGhx9cQ63nBpewDGsUEj7IYi/nOYNi pnI/qGBB4R5lOXq9ACCJcMG2pJSjW8/+21S+DuLwvRCKnMxJok9PVkTerl07sG/Gd+y9 rdeEjjho2wB9SO4EfCiQM1GIH6IDqpe+2oMSNyv9dVvamB83T6uhpQiQa+XPYhtEhh/5 nqVFPZbwHj+ZfTDOqy7f0/Bs/O0fxhABNF4+k+vkqTsrof4P04YiZgfcZ+u9LK334VNQ AzTQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=iAP5eK2u; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id u123si12291964pgb.516.2018.12.17.15.16.05; Mon, 17 Dec 2018 15:16:20 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=iAP5eK2u; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387715AbeLQTMi (ORCPT + 99 others); Mon, 17 Dec 2018 14:12:38 -0500 Received: from mail.kernel.org ([198.145.29.99]:59324 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387451AbeLQTMi (ORCPT ); Mon, 17 Dec 2018 14:12:38 -0500 Received: from mail-wm1-f53.google.com (mail-wm1-f53.google.com [209.85.128.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 93164214D8 for ; Mon, 17 Dec 2018 19:12:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1545073957; bh=G9zX9KVe7E5PaZcyj8gw8Uifhsyto9qYhWoeYUwDfiM=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=iAP5eK2uM3Zdg3sSh0U1PIpPEpOkvzXeTA0LvbYyrmH7w0xA6lWsfDfVqb+remI6f qHcuwUmXd/R1oh4TyiM81/qeyVaNUpiTu3AUt1Zq3DcK+sFhMzpMC4OwNH5i3+8ZRK QASd1qhemoCLdfqxq+/ptZok/rxNEtSM2WJeJqeg= Received: by mail-wm1-f53.google.com with SMTP id f81so332635wmd.4 for ; Mon, 17 Dec 2018 11:12:37 -0800 (PST) X-Gm-Message-State: AA+aEWZVeBm/FLPG5WbF+JFJFR3s2L8VCHD7UAU0DtuCg0XV4YKumwZ4 U9m9JFB1sD7rJhNhs5R2Ta2CRqEr2oKAVLqwj6orAw== X-Received: by 2002:a1c:f112:: with SMTP id p18mr289692wmh.83.1545073954108; Mon, 17 Dec 2018 11:12:34 -0800 (PST) MIME-Version: 1.0 References: <20181116010412.23967-1-jarkko.sakkinen@linux.intel.com> <20181116010412.23967-19-jarkko.sakkinen@linux.intel.com> <7d5cde02-4649-546b-0f03-2d6414bb80b5@intel.com> <20181217180102.GA12560@linux.intel.com> <20181217183613.GD12491@linux.intel.com> <20181217184333.GA26920@linux.intel.com> In-Reply-To: From: Andy Lutomirski Date: Mon, 17 Dec 2018 11:12:21 -0800 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH v17 18/23] platform/x86: Intel SGX driver To: Dave Hansen Cc: Jarkko Sakkinen , Sean Christopherson , X86 ML , Platform Driver , linux-sgx@vger.kernel.org, nhorman@redhat.com, npmccallum@redhat.com, "Ayoun, Serge" , shay.katz-zamir@intel.com, Haitao Huang , Andy Shevchenko , Thomas Gleixner , "Svahn, Kai" , mark.shanahan@intel.com, Suresh Siddha , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , Darren Hart , Andy Shevchenko , "open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)" Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Dec 17, 2018 at 10:47 AM Dave Hansen wrote: > > On 12/17/18 10:43 AM, Jarkko Sakkinen wrote: > > On Mon, Dec 17, 2018 at 10:36:13AM -0800, Sean Christopherson wrote: > >> I'm pretty sure doing mmget() would result in circular dependencies and > >> a zombie enclave. In the do_exit() case where a task is abruptly killed: > >> > >> - __mmput() is never called because the enclave holds a ref > >> - sgx_encl_release() is never be called because its VMAs hold refs > >> - sgx_vma_close() is never called because __mmput()->exit_mmap() is > >> blocked and the process itself is dead, i.e. won't unmap anything. > > Right, it does, you are absolutely right. Tried it and removed the > > commit already. > > > > Well, what we came up from your suggestion i.e. setting mm to NULL > > and checking that is very subtle change and does not have any such > > circular dependencies. We'll go with that. > > This all screams that you need to break out this code from the massive > "18" patch and get the mm interactions reviewed more thoroughly. > > Also, no matter what method you go with, you have a bunch of commenting > and changelogging to do here. I'm going to ask an obnoxious high-level question: why does an enclave even refer to a specific mm? If I were designing this thing, and if I hadn't started trying to implement it, my first thought would be that an enclave tracks its linear address range, which is just a pair of numbers, and also keeps track of a whole bunch of physical EPC pages, data structures, etc. And that mmap() gets rejected unless the requested virtual address matches the linear address range that the enclave wants and, aside from that, just creates a VMA that keeps a reference to the enclave. (And, for convenience, I suppose that the first mmap() call done before any actual enclave setup happens could choose any address and then cause the enclave to lock itself to that address, although a regular anonymous PROT_NONE MAP_NORESERVE mapping would do just fine, too.) And the driver would explicitly allow multiple different mms to have the same enclave mapped. More importantly, a daemon could set up an enclave without necessarily mapping it at all and then SCM_RIGHTS the enclave over to the process that plans to run it. Now I'm sure this has all kinds of problems, such as the ISA possibly making it rather obnoxious to add pages to the enclave without having it mapped. But these operations could, in principle, be done by having the enclave own a private mm that's used just for setup. While this would be vaguely annoying, Nadav's still-pending-but-nearly-done text_poke series adds all the infrastructure that's needed for the kernel to manage little private mms. But some things get simpler -- invalidating the enclave can presumably use the regular rmap APIs to zap all the PTEs in all VMAs pointing into the enclave. So I'm not saying that you shouldn't do it the way you are now, but I do think that the changelog or at least some emails should explain *why* the enclave needs to keep a pointer to the creating process's mm. And, if you do keep the current model, it would be nice to understand what happens if you do something awful like mremap()ing an enclave, or calling madvise on it, or otherwise abusing the vma. Or doing fork(), for that matter. I also find it suspicious that the various ioctl handlers systematically ignore their "filep" parameters and instead use find_vma() to find the relevant mm data structures. That seems backwards.