From: Andy Lutomirski
Subject: Re: [PATCH bpf-next] bpf: Make trampolines W^X
Date: Mon, 6 Jan 2020 15:36:49 -1000
To: "Edgecombe, Rick P"
Cc: "kpsingh@chromium.org", "songliubraving@fb.com",
 "linux-kernel@vger.kernel.org", "bpf@vger.kernel.org",
 "keescook@chromium.org", "ast@kernel.org", "daniel@iogearbox.net",
 "kuznet@ms2.inr.ac.ru", "jannh@google.com", "mjg59@google.com",
 "thgarnie@chromium.org", "linux-security-module@vger.kernel.org",
 "x86@kernel.org", "revest@chromium.org", "jackmanb@chromium.org",
 "kafai@fb.com", "yhs@fb.com", "davem@davemloft.net",
 "yoshfuji@linux-ipv6.org", "mhalcrow@google.com", "andriin@fb.com"
In-Reply-To: <21bf6bb46544eab79e792980f82520f8fbdae9b5.camel@intel.com>
References: <21bf6bb46544eab79e792980f82520f8fbdae9b5.camel@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

> On Jan 6, 2020, at 12:25 PM, Edgecombe, Rick P wrote:
>
> On Sat,
> 2020-01-04 at 09:49 +0900, Andy Lutomirski wrote:
>>>> On Jan 4, 2020, at 8:47 AM, KP Singh wrote:
>>>
>>> From: KP Singh
>>>
>>> The image for the BPF trampolines is allocated with
>>> bpf_jit_alloc_exe_page which marks this allocated page executable. This
>>> means that the allocated memory is W and X at the same time making it
>>> susceptible to WX based attacks.
>>>
>>> Since the allocated memory is shared between two trampolines (the
>>> current and the next), 2 pages must be allocated to adhere to W^X and
>>> the following sequence is obeyed where trampolines are modified:
>>
>> Can we please do better rather than piling garbage on top of garbage?
>>
>>>
>>> - Mark memory as non executable (set_memory_nx). While module_alloc for
>>> x86 allocates the memory as PAGE_KERNEL and not PAGE_KERNEL_EXEC, not
>>> all implementations of module_alloc do so
>>
>> How about fixing this instead?
>>
>>> - Mark the memory as read/write (set_memory_rw)
>>
>> Probably harmless, but see above about fixing it.
>>
>>> - Modify the trampoline
>>
>> Seems reasonable. It’s worth noting that this whole approach is suboptimal:
>> the “module” allocator should really be returning a list of pages to be
>> written (not at the final address!) with the actual executable mapping to be
>> materialized later, but that’s a bigger project that you’re welcome to ignore
>> for now. (Concretely, it should produce a vmap address with backing pages but
>> with the vmap alias either entirely unmapped or read-only. A subsequent helper
>> would, all at once, make the direct map pages RO or not-present and make the
>> vmap alias RX.)
>>> - Mark the memory as read-only (set_memory_ro)
>>> - Mark the memory as executable (set_memory_x)
>>
>> No, thanks. There’s very little excuse for doing two IPI flushes when one
>> would suffice.
>>
>> As far as I know, all architectures can do this with a single flush without
>> races; x86 certainly can. The module freeing code gets this sequence right.
>> Please reuse its mechanism or, if needed, export the relevant interfaces.
>
> So if I understand this right, some trampolines have been added that are
> currently set as RWX at modification time AND left that way during runtime? The
> discussion on the order of set_memory_() calls in the commit message made me
> think that this was just a modification time thing at first.

I’m not sure what the status quo is.

We really ought to have a genuinely good API for allocation and initialization of text. We can do so much better than set_memory_blahblah.

FWIW, I have some ideas about making kernel flushes cheaper. It’s currently blocked on finding some time and on tglx’s irqtrace work.

>
> Also, is there a reason you couldn't use text_poke() to modify the trampoline
> with a single flush?
>

Does text_poke do an IPI these days?
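
For anyone following the thread, the "one transition instead of two" point can be illustrated with a rough userspace analogy. This is only a sketch under stated assumptions: it is not kernel code and not what the patch or the module freeing code does. publish_code() is a hypothetical name, and mmap()/mprotect() merely stand in for the kernel allocator and the set_memory_*() calls. The page is filled while it is writable and non-executable, then switched to read+execute with a single permission change rather than separate read-only and executable steps.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (userspace analogy only): copy `len` bytes into a
 * fresh anonymous mapping while it is RW, then flip it to RX with ONE
 * mprotect() call. The mapping is never writable and executable at once.
 * Returns the mapping, or NULL on failure.
 */
static void *publish_code(const void *code, size_t len)
{
    assert(code != NULL && len > 0);

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t size = (len + page - 1) / page * page; /* round up to whole pages */

    /* Step 1: writable, not executable. */
    void *buf = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    /* Step 2: write the contents while the mapping is still RW. */
    memcpy(buf, code, len);

    /* Step 3: a single RW -> RX transition (not RO first, then X). */
    if (mprotect(buf, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, size);
        return NULL;
    }
    return buf;
}
```

The analogy stops at the permission ordering: userspace has no direct-map alias to worry about, and the cost being discussed in the thread (cross-CPU flushes in the kernel) is not modeled here.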