Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp919348ybl; Wed, 29 Jan 2020 12:00:21 -0800 (PST) X-Google-Smtp-Source: APXvYqxeAqGPhc6YveGVAwx89ajW6QtGeXHETdEl8mP+4aisYYUPkVCEZc5NBKcOkc5zbJqc530M X-Received: by 2002:a05:6830:184:: with SMTP id q4mr776174ota.232.1580328021511; Wed, 29 Jan 2020 12:00:21 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1580328021; cv=none; d=google.com; s=arc-20160816; b=BTMQ30XBeZNMAo2roCCxNj+6+4hpov27++/ahrNSTIzo1FigJLj5tI1ZTlqmRjp+J7 bn6UHP/eGlRkZQLe/briAmDf1T6gif8f8Tc8k21VmDjYCxQyIM0lksfts5T9KGXrpJaB Xvhwrl9ofBt61pJJ8nVcxEB5csnrWsBQoOLcx5Q/5cCOWUHJCauLSUrfh6xnD6ZuqlPO 3ZPEUSnBDBl8b+IELE5OL6tpkDd7sYlCX+ZGLnTXpWJcKMfS70gQzTIEpvrFc17Tf+p9 Re3PNWY7p69y5+jdtLODoGtegianq7aV8X7+U5p36Qed0Jr22ZN/intID1Hl5i/n6sSb glXg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-transfer-encoding:content-disposition:mime-version :references:message-id:subject:cc:to:from:date; bh=Z9BlSVBnWMRUugpORj0B03NwD6HSM3bQdEqgGY5lYC4=; b=jNjMZr2D2KDqouL6yUuhtWOfpU+FAji66X62q5ZzroQG9pO85gdtD3c0UcQw8N/m8X DN+93Gj/mqUBOM+YHo5Cx1jIk7WfWeKHBpXeo1nk7hKLB5H9Y/CQILxqYVlI6SHwbReF VXkQKTNrEGQP5OdlSpYKLHtxFujJ7QpBGxYuLKncHAt+RC8XHR1zMoIRg9Z8t31J1WfG kXX+eIRMVH6lbn5c8kKGvzVx7iNNbpb/UUhjY/1MdE21X2k/+/vdjAgPnWC4Y3f4erxf x+jQNOJivRATIdUn1FxyRrQmzCVs7ECdNmaEgOpz7vg5N7ESLxQFp0+5k8P3TIkLMGTt AxOg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l26si1776943otd.143.2020.01.29.12.00.08; Wed, 29 Jan 2020 12:00:21 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727469AbgA2SeR (ORCPT + 99 others); Wed, 29 Jan 2020 13:34:17 -0500 Received: from mx2.suse.de ([195.135.220.15]:49390 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726332AbgA2SeR (ORCPT ); Wed, 29 Jan 2020 13:34:17 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 9A407AC77; Wed, 29 Jan 2020 18:34:15 +0000 (UTC) Date: Wed, 29 Jan 2020 19:34:04 +0100 From: Borislav Petkov To: Linus Torvalds Cc: "Luck, Tony" , Ingo Molnar , Linux Kernel Mailing List , Thomas Gleixner , Borislav Petkov , Peter Zijlstra , Andrew Morton Subject: Re: [GIT PULL] x86/asm changes for v5.6 Message-ID: <20200129183404.GB30979@zn.tnic> References: <20200128165906.GA67781@gmail.com> <20200129132618.GA30979@zn.tnic> <20200129170725.GA21265@agluck-desk2.amr.corp.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jan 29, 2020 at 09:40:58AM -0800, Linus Torvalds wrote: > On Wed, Jan 29, 2020 at 9:07 AM Luck, Tony wrote: > > > > This returns "3" ... not what we want. But swapping the ERMS/FSRM order > > gets the correct version. > > That actually makes sense, and is what I suspected (after I wrote the > patch) what would happen. It would just be good to have it explicitly > documented at the macro. Like this? --- diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index 13adca37c99a..d94bad03bcb4 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -164,6 +164,11 @@ static inline int alternatives_text_reserved(void *start, void *end) ALTINSTR_REPLACEMENT(newinstr, feature, 1) \ ".popsection\n" +/* + * The patching happens in the natural order of first newinstr1 and then + * newinstr2, iff the respective feature bits are set. See apply_alternatives() + * for details. + */ #define ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2)\ OLDINSTR_2(oldinstr, 1, 2) \ ".pushsection .altinstructions,\"a\"\n" \ > That would be bad indeed, but I don't think it should matter > particularly for this case - it would have been bad before too. > > I suspect there is some other issue going on, like the NOP padding > logic being confused. Or the cmp $0x20 test missing in the default case, see below. > In particular, look here, this is the alt_instruction entries: > > altinstruction_entry 140b,143f,\feature1,142b-140b,144f-143f,142b-141b > altinstruction_entry 140b,144f,\feature2,142b-140b,145f-144f,142b-141b > > where the arguments are "orig alt feature orig_len alt_len pad_len" in > that order. > > Notice how "pad_len" in both cases is the padding from the _original_ > instruction (142b-141b). > So I'm just hand-waving. Maybe there was some simpler explanation > (like me just picking the wrong instructions when I did the rough > conversion and simply breaking things with some stupid bug). Looks like it. So I did this: --- diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S index 7ff00ea64e4f..a670d01570df 100644 --- a/arch/x86/lib/memmove_64.S +++ b/arch/x86/lib/memmove_64.S @@ -39,23 +39,19 @@ SYM_FUNC_START(__memmove) cmp %rdi, %r8 jg 2f - /* FSRM implies ERMS => no length checks, do the copy directly */ -.Lmemmove_begin_forward: - ALTERNATIVE "cmp $0x20, %rdx; jb 1f", "", X86_FEATURE_FSRM - ALTERNATIVE "", "movq %rdx, %rcx; rep movsb; retq", X86_FEATURE_ERMS - /* - * movsq instruction have many startup latency - * so we handle small size by general register. - */ - cmp $680, %rdx - jb 3f - /* - * movsq instruction is only good for aligned case. + * Three rep-string alternatives: + * - go to "movsq" for large regions where source and dest are + * mutually aligned (same in low 8 bits). "label 4" + * - plain rep-movsb for FSRM + * - rep-movs for > 32 byte for ERMS. */ +.Lmemmove_begin_forward: + ALTERNATIVE_2 \ + "cmp $0x20, %rdx; jb 1f; cmp $680, %rdx ; jb 3f ; cmpb %dil, %sil; je 4f", \ + "cmp $0x20, %rdx; jb 1f; movq %rdx, %rcx; rep movsb; retq", X86_FEATURE_ERMS, \ + "movq %rdx, %rcx ; rep movsb; retq", X86_FEATURE_FSRM - cmpb %dil, %sil - je 4f 3: sub $0x20, %rdx /* --- Notice how the *first* option of the alternative, which is the default one, has that gotten that additional "cmp $0x20, %rdx; jb 1f" test which sends us down to the less than 32 bytes copy length. Your original version didn't have it and here's what I saw: So I stopped the guest just before that code and the call trace looked like this: #0 memmove () at arch/x86/lib/memmove_64.S:43 #1 0xffffffff824448c2 in memblock_insert_region (type=0xffffffff824a8298 , idx=, base=0, size=4096, nid=2, flags=MEMBLOCK_NONE) at mm/memblock.c:553 #2 0xffffffff824454f0 in memblock_add_range (type=0xffffffff824a8298 , base=0, size=, nid=73400320, flags=) at mm/memblock.c:641 #3 0xffffffff82445627 in memblock_reserve (base=0, size=4096) at mm/memblock.c:830 #4 0xffffffff823ff399 in setup_arch (cmdline_p=0xffffffff82003f28) at arch/x86/kernel/setup.c:798 #5 0xffffffff823f9ae1 in start_kernel () at init/main.c:598 #6 0xffffffff810000d4 in secondary_startup_64 () at arch/x86/kernel/head_64.S:242 #7 0x0000000000000000 in ?? () and count in rdx was: rdx 0x18 24 Without that "cmp $0x20" test above, we do the "cmp $680, %rdx; jb 3f;" test and we run into the following asm at label 3: 3: sub $0x20, %rdx /* * We gobble 32 bytes forward in each loop. */ <--- right here %rdx is: rdx 0xfffffffffffffff8 -8 and yeeehaaw, we're in the weeds and then end up triplefaulting at some unmapped source address in %rsi or so. So now I'm going to play all three variants with pen and paper to make sure we're still sane. Thx. -- Regards/Gruss, Boris. SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg