From: Linus Torvalds
Date: Fri, 26 May 2023 11:33:25 -0700
Subject: Re: x86 copy performance regression
To: Eric Dumazet
Cc: LKML, netdev
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, May 26, 2023 at 10:51 AM Eric Dumazet wrote:
>
> Hmmm
>
> [ 25.532236] RIP: 0010:0xffffffffa5a85134
> [ 25.536173] Code: Unable to access opcode bytes at 0xffffffffa5a8510a.

This was the other reason I really didn't want to use alternatives on
the conditional branch instructions. The relocations are really not
very natural, and we have odd rules for those things.

So I suspect our instruction rewriting simply gets this wrong, because
that's such a nasty pattern. I really wanted my "just hardcode the
instruction bytes" to work. Not only did it get me the small 2-byte
conditional jump, it meant that there was no relocation on it.

But objtool really hates not understanding what the alternatives code
does. Which is fair enough, but it's frustrating here when it only
results in more problems.

Anyway, I guess *this* avoids all issues. It creates an extra jump to
a jump for the case where the CPU doesn't have ERMS, but I guess we
don't really care about those CPUs anyway.

And it avoids all the "alternative instructions have relocations"
issues. And it creates all small two-byte jumps, and the "rep movsb"
fits exactly on that same 2 bytes too. Which I guess all argues for
this being what I should have started with.

This time it *really* works. Famous last words.

Linus

Attachment: patch.diff (text/x-patch)

 arch/x86/lib/copy_user_64.S | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
index 4fc5c2de2de4..01c5de4c279b 100644
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -7,6 +7,8 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/cpufeatures.h>
+#include <asm/alternative.h>
 #include <asm/asm.h>
 #include <asm/export.h>
 
@@ -29,7 +31,7 @@
  */
 SYM_FUNC_START(rep_movs_alternative)
 	cmpq $64,%rcx
-	jae .Lunrolled
+	jae .Llarge
 
 	cmp $8,%ecx
 	jae .Lword
@@ -65,6 +67,12 @@ SYM_FUNC_START(rep_movs_alternative)
 	_ASM_EXTABLE_UA( 2b, .Lcopy_user_tail)
 	_ASM_EXTABLE_UA( 3b, .Lcopy_user_tail)
 
+.Llarge:
+0:	ALTERNATIVE "jmp .Lunrolled", "rep movsb", X86_FEATURE_ERMS
+1:	RET
+
+        _ASM_EXTABLE_UA( 0b, 1b)
+
 	.p2align 4
 .Lunrolled:
 10:	movq (%rsi),%r8
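
Editorial illustration, not part of the original message: a minimal user-space sketch of what the ALTERNATIVE line at .Llarge amounts to. The original instruction is a short "jmp .Lunrolled" (opcode EB plus an 8-bit displacement), and the replacement is "rep movsb" (bytes F3 A4). Both are two bytes, and "rep movsb" has no displacement operand, so the replacement carries no relocation. The names struct alt_site, apply_alt, make_large_copy_site and the cpu_has_erms flag are hypothetical, and the displacement byte is a placeholder. The kernel's real patching code is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x86-64 encodings; both are exactly two bytes long. */
static const uint8_t JMP_REL8[2]  = { 0xEB, 0x00 }; /* jmp rel8; 0x00 is a placeholder displacement */
static const uint8_t REP_MOVSB[2] = { 0xF3, 0xA4 }; /* rep movsb; no displacement, no relocation */

/* Hypothetical model of one patch site; this is NOT the kernel's struct alt_instr. */
struct alt_site {
	uint8_t text[2];      /* instruction bytes currently in place */
	const uint8_t *repl;  /* replacement bytes */
	size_t repl_len;      /* replacement length in bytes */
};

/*
 * Overwrite the original bytes with the replacement when the (assumed)
 * feature flag is set. Returns true if the site was patched.
 */
static bool apply_alt(struct alt_site *site, bool cpu_has_erms)
{
	if (!cpu_has_erms)
		return false; /* keep the original "jmp .Lunrolled" */

	/* The replacement must fit in the space the original occupies. */
	assert(site->repl_len <= sizeof(site->text));
	memcpy(site->text, site->repl, site->repl_len);
	return true;
}

/* Build a site that starts out holding the original short jump. */
static struct alt_site make_large_copy_site(void)
{
	struct alt_site site = { .repl = REP_MOVSB, .repl_len = sizeof(REP_MOVSB) };

	memcpy(site.text, JMP_REL8, sizeof(JMP_REL8));
	return site;
}
```

With cpu_has_erms false the site keeps its EB byte, which is the "extra jump to a jump" case from the message; with it true the two bytes become F3 A4 and execution falls through to the RET at label 1.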