Date: Thu, 10 Jan 2019 11:20:04 -0600
From: Josh Poimboeuf
To: Nadav Amit
Cc: X86 ML, LKML, Ard Biesheuvel, Andy Lutomirski, Steven Rostedt,
    Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Linus Torvalds,
    Masami Hiramatsu, Jason Baron, Jiri Kosina, David Laight,
    Borislav Petkov, Julia Cartwright, Jessica Yu, "H. Peter Anvin",
    Rasmus Villemoes, Edward Cree, Daniel Bristot de Oliveira
Subject: Re: [PATCH v3 5/6] x86/alternative: Use a single access in text_poke() where possible
Message-ID: <20190110172004.wuh45xoafynfm2df@treble>
References: <279b8003f7f0a6831d090ab822d37bc958f974de.1547073843.git.jpoimboe@redhat.com>
    <8138A1EE-359D-4CD2-8E96-5BF00313AB3B@vmware.com>
In-Reply-To: <8138A1EE-359D-4CD2-8E96-5BF00313AB3B@vmware.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jan 10, 2019 at 09:32:23AM +0000, Nadav Amit wrote:
> > @@ -714,14 +714,39 @@ void *text_poke(void *addr, const void *opcode, size_t len)
> >  	}
> >  	BUG_ON(!pages[0]);
> >  	local_irq_save(flags);
> > +
> >  	set_fixmap(FIX_TEXT_POKE0, page_to_phys(pages[0]));
> >  	if (pages[1])
> >  		set_fixmap(FIX_TEXT_POKE1, page_to_phys(pages[1]));
> > -	vaddr = (char *)fix_to_virt(FIX_TEXT_POKE0);
> > -	memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
> > +
> > +	vaddr = fix_to_virt(FIX_TEXT_POKE0) + ((unsigned long)addr & ~PAGE_MASK);
> > +
> > +	/*
> > +	 * Use a single access where possible. Note that a single unaligned
> > +	 * multi-byte write will not necessarily be atomic on x86-32, or if the
> > +	 * address crosses a cache line boundary.
> > +	 */
> > +	switch (len) {
> > +	case 1:
> > +		WRITE_ONCE(*(u8 *)vaddr, *(u8 *)opcode);
> > +		break;
> > +	case 2:
> > +		WRITE_ONCE(*(u16 *)vaddr, *(u16 *)opcode);
> > +		break;
> > +	case 4:
> > +		WRITE_ONCE(*(u32 *)vaddr, *(u32 *)opcode);
> > +		break;
> > +	case 8:
> > +		WRITE_ONCE(*(u64 *)vaddr, *(u64 *)opcode);
> > +		break;
> > +	default:
> > +		memcpy((void *)vaddr, opcode, len);
> > +	}
> > +
>
> Even if Intel and AMD CPUs are guaranteed to run instructions from L1
> atomically, this may break instruction emulators, such as those that
> hypervisors use. They might not read instructions atomically if on SMP VMs
> when the VM's text_poke() races with the emulated instruction fetch.
>
> While I can't find a reason for hypervisors to emulate this instruction,
> smarter people might find ways to turn it into a security exploit.

Interesting point... but I wonder if it's a realistic concern. BTW,
text_poke_bp() also relies on undocumented behavior.

The entire instruction doesn't need to be read atomically; just the
32-bit call destination. Assuming the hypervisor is x86-64, and it uses
a 32-bit access to read the call destination (which seems logical), the
intra-cacheline reads will be atomic, as stated in the SDM.

If the above assumptions are not true, and the hypervisor reads the call
destination non-atomically (which seems unlikely IMO), even then I don't
see how it could be realistically exploitable. It would just oops from
calling a corrupt address.

--
Josh
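
For illustration only, the "single access within one cache line" condition discussed above could be checked roughly as in the sketch below. This is not part of the patch: the helper names are hypothetical, and the 64-byte cache line size is an assumption about the CPUs in question.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; not stated anywhere in the patch. */
#define CACHE_LINE_SIZE 64u

/* True if the byte range [addr, addr + len) lies inside one cache line. */
static bool within_one_cache_line(uintptr_t addr, size_t len)
{
	if (len == 0)
		return true;
	return (addr / CACHE_LINE_SIZE) == ((addr + len - 1) / CACHE_LINE_SIZE);
}

/*
 * Hypothetical helper: true if a poke of len bytes at addr maps to one of
 * the single-access cases in the switch (1, 2, 4 or 8 bytes) and does not
 * cross a cache line boundary.
 */
static bool poke_is_single_access(uintptr_t addr, size_t len)
{
	switch (len) {
	case 1:
	case 2:
	case 4:
	case 8:
		return within_one_cache_line(addr, len);
	default:
		/* Other lengths take the memcpy() path in the patch. */
		return false;
	}
}
```

Under these assumptions, a 4-byte call destination at offset 60 of a cache line still fits in that line, while one at offset 61 straddles two lines and falls outside the intra-cacheline case.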