Subject: Re: [PATCH v2 05/20] x86/alternative: initializing temporary mm for patching
From: Andy Lutomirski
Date: Sun, 10 Feb 2019 21:18:34 -0800
To: Nadav Amit
Cc: Rick Edgecombe, Andy Lutomirski, Ingo Molnar, LKML, X86 ML,
    "H. Peter Anvin", Thomas Gleixner, Borislav Petkov, Dave Hansen,
    Peter Zijlstra, Damian Tometzki, linux-integrity, LSM List,
    Andrew Morton, Kernel Hardening, Linux-MM, Will Deacon,
    Ard Biesheuvel, Kristen Carlson Accardi, "Dock, Deneen T",
    Kees Cook, Dave Hansen
Message-Id: <00649AE8-69C0-4CD2-A916-B8C8F0F5DAC3@amacapital.net>
In-Reply-To: <162C6C29-CD81-46FE-9A54-6ED05A93A9CB@gmail.com>
References: <20190129003422.9328-1-rick.p.edgecombe@intel.com>
    <20190129003422.9328-6-rick.p.edgecombe@intel.com>
    <162C6C29-CD81-46FE-9A54-6ED05A93A9CB@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Feb 10, 2019, at 4:39 PM, Nadav Amit wrote:

>> On Jan 28, 2019, at 4:34 PM, Rick Edgecombe wrote:
>>
>> From: Nadav Amit
>>
>> To prevent improper use of the PTEs that are used for text patching, we
>> want to use a temporary mm struct. We initialize it by copying the init
>> mm.
>>
>> The address that will be used for patching is taken from the lower area
>> that is usually used for the task memory. Doing so prevents the need to
>> frequently synchronize the temporary-mm (e.g., when BPF programs are
>> installed), since different PGDs are used for the task memory.
>>
>> Finally, we randomize the address of the PTEs to harden against exploits
>> that use these PTEs.
>>
>> Cc: Kees Cook
>> Cc: Dave Hansen
>> Acked-by: Peter Zijlstra (Intel)
>> Reviewed-by: Masami Hiramatsu
>> Tested-by: Masami Hiramatsu
>> Suggested-by: Andy Lutomirski
>> Signed-off-by: Nadav Amit
>> Signed-off-by: Rick Edgecombe
>> ---
>>  arch/x86/include/asm/pgtable.h       |  3 +++
>>  arch/x86/include/asm/text-patching.h |  2 ++
>>  arch/x86/kernel/alternative.c        |  3 +++
>>  arch/x86/mm/init_64.c                | 36 ++++++++++++++++++++++++++++
>>  init/main.c                          |  3 +++
>>  5 files changed, 47 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
>> index 40616e805292..e8f630d9a2ed 100644
>> --- a/arch/x86/include/asm/pgtable.h
>> +++ b/arch/x86/include/asm/pgtable.h
>> @@ -1021,6 +1021,9 @@ static inline void __meminit init_trampoline_default(void)
>>  	/* Default trampoline pgd value */
>>  	trampoline_pgd_entry = init_top_pgt[pgd_index(__PAGE_OFFSET)];
>>  }
>> +
>> +void __init poking_init(void);
>> +
>>  # ifdef CONFIG_RANDOMIZE_MEMORY
>>  void __meminit init_trampoline(void);
>>  # else
>> diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
>> index f8fc8e86cf01..a75eed841eed 100644
>> --- a/arch/x86/include/asm/text-patching.h
>> +++ b/arch/x86/include/asm/text-patching.h
>> @@ -39,5 +39,7 @@ extern void *text_poke_kgdb(void *addr, const void *opcode, size_t len);
>>  extern int poke_int3_handler(struct pt_regs *regs);
>>  extern void *text_poke_bp(void *addr, const void *opcode, size_t len, void *handler);
>>  extern int after_bootmem;
>> +extern __ro_after_init struct mm_struct *poking_mm;
>> +extern __ro_after_init unsigned long poking_addr;
>>
>>  #endif /* _ASM_X86_TEXT_PATCHING_H */
>> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
>> index 12fddbc8c55b..ae05fbb50171 100644
>> --- a/arch/x86/kernel/alternative.c
>> +++ b/arch/x86/kernel/alternative.c
>> @@ -678,6 +678,9 @@ void *__init_or_module text_poke_early(void *addr, const void *opcode,
>>  	return addr;
>>  }
>>
>> +__ro_after_init struct mm_struct *poking_mm;
>> +__ro_after_init unsigned long poking_addr;
>> +
>>  static void *__text_poke(void *addr, const void *opcode, size_t len)
>>  {
>>  	unsigned long flags;
>> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
>> index bccff68e3267..125c8c48aa24 100644
>> --- a/arch/x86/mm/init_64.c
>> +++ b/arch/x86/mm/init_64.c
>> @@ -53,6 +53,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>
>>  #include "mm_internal.h"
>>
>> @@ -1383,6 +1384,41 @@ unsigned long memory_block_size_bytes(void)
>>  	return memory_block_size_probed;
>>  }
>>
>> +/*
>> + * Initialize an mm_struct to be used during poking and a pointer to be used
>> + * during patching.
>> + */
>> +void __init poking_init(void)
>> +{
>> +	spinlock_t *ptl;
>> +	pte_t *ptep;
>> +
>> +	poking_mm = copy_init_mm();
>> +	BUG_ON(!poking_mm);
>> +
>> +	/*
>> +	 * Randomize the poking address, but make sure that the following page
>> +	 * will be mapped at the same PMD. We need 2 pages, so find space for 3,
>> +	 * and adjust the address if the PMD ends after the first one.
>> +	 */
>> +	poking_addr = TASK_UNMAPPED_BASE;
>> +	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
>> +		poking_addr += (kaslr_get_random_long("Poking") & PAGE_MASK) %
>> +			(TASK_SIZE - TASK_UNMAPPED_BASE - 3 * PAGE_SIZE);
>> +
>> +	if (((poking_addr + PAGE_SIZE) & ~PMD_MASK) == 0)
>> +		poking_addr += PAGE_SIZE;
>
> Further thinking about it, I think that allocating the virtual address for
> poking from user address-range is problematic. The user can set watchpoints
> on different addresses, cause some static-keys to be enabled/disabled, and
> monitor the signals to derandomize the poking address.
>

Hmm, I hadn’t thought about watchpoints. I’m not sure how much we care about possible derandomization like this, but we certainly don’t want to send signals or otherwise malfunction.

> Andy, I think you were pushing this change.
> Can I go back to use a vmalloc’d
> address instead, or do you have a better solution?

Hmm. If we use a vmalloc address, we have to make sure it’s not actually allocated. I suppose we could allocate one once at boot and use that. We also have the problem that the usual APIs for handling “user” addresses might assume they’re actually in the user range, although this seems unlikely to be a problem in practice. More seriously, though, the code that manipulates per-mm paging structures assumes that *all* of the structures up to the top level are per-mm, and, if we use anything less than a private pgd, this isn’t the case.

> I prefer not to
> save/restore DR7, of course.
>

I suspect we may want to use the temporary mm concept for EFI, too, so we may want to just suck it up and save/restore DR7. But only if a watchpoint is in use, of course. I have an old patch I could dust off that tracks DR7 to make things like this efficient.
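
To make the "only if a watchpoint is in use" idea concrete, here is a rough user-space model. It is a sketch under stated assumptions, not the old patch mentioned above and not kernel code: the hardware register is replaced by a plain variable, and every name in it (model_dr7, shadow_dr7, temp_mm_enter() and so on) is made up for illustration. The only hardware fact it relies on is that bits 0-7 of DR7 are the per-breakpoint enable bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DR7 bits 0-7 are the L0/G0..L3/G3 enable bits for breakpoints 0-3. */
#define DR7_ENABLE_MASK 0xffULL

/* Everything below is a hypothetical model, not a kernel interface. */
static uint64_t model_dr7;      /* stands in for the hardware DR7 */
static uint64_t shadow_dr7;     /* software copy, updated on every write */
static unsigned int dr7_writes; /* counts the (expensive) register writes */

/* All DR7 writes go through here so the shadow copy stays accurate. */
void model_write_dr7(uint64_t val)
{
	model_dr7 = val;
	shadow_dr7 = val;
	dr7_writes++;
}

/* Answered from the shadow copy, so no register read is needed. */
bool watchpoint_in_use(void)
{
	return (shadow_dr7 & DR7_ENABLE_MASK) != 0;
}

struct temp_mm_state {
	uint64_t saved_dr7;
	bool dr7_saved;
};

/* Disable breakpoints before switching to the temporary mm, if needed. */
struct temp_mm_state temp_mm_enter(void)
{
	struct temp_mm_state st = { .saved_dr7 = 0, .dr7_saved = false };

	if (watchpoint_in_use()) {
		st.saved_dr7 = shadow_dr7;
		st.dr7_saved = true;
		model_write_dr7(shadow_dr7 & ~DR7_ENABLE_MASK);
	}
	/* The switch to the temporary mm would happen here. */
	return st;
}

/* Restore DR7 after switching back, but only if it was touched. */
void temp_mm_exit(struct temp_mm_state st)
{
	/* The switch back to the previous mm would happen here. */
	assert((model_dr7 & DR7_ENABLE_MASK) == 0);
	if (st.dr7_saved)
		model_write_dr7(st.saved_dr7);
}
```

In this model the common case (no watchpoint armed) costs one test of the shadow value and no register access at all; the two extra writes are paid only when a breakpoint is actually enabled.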