Received: by 2002:a25:4158:0:0:0:0:0 with SMTP id o85csp771650yba; Mon, 1 Apr 2019 16:56:31 -0700 (PDT) X-Google-Smtp-Source: APXvYqxZvDCfJy//NAktmx5G+JopIqdSnHYEV0+cpj2pgQD+xY7cq9lCeH1HuyiI8pXiRwqhX4kZ X-Received: by 2002:a17:902:526:: with SMTP id 35mr63495589plf.276.1554162990963; Mon, 01 Apr 2019 16:56:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1554162990; cv=none; d=google.com; s=arc-20160816; b=WXtutf7B89NnqEo2+qEUZdLYX2Y/X1+4/R1oUv4H4WTuvTcnLrXUg8nlLC0IsV6VyE FP/PanXxLGROUfyMPFbq1OQb6NRWl/2uyVG7yQSIDCFgHhuw617RzuKwQUsJiSN1IaFF goPEjFrP95Qg0HQv4izLq3rYQDAh70oPYxZONdDJVnPaZOXV3qpuRQo5Lgc4tBq4olJa SKD8zc3y/q3aA3m1nCTfQouSVN6sLL92NL3vrJn+ODq+ZzjcCiqd7QM6fTdHsPwyvB/g A3Il3roPPV9yqQxoNX+cbfRCD+G1ornXT+xx8EbaH3jYwMvGQernjhGhFyFUpXUqGtte 4/dg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=53C+UAYBW94xvPX0bSE+X4pak3bywZ/H7OEveLLLrVc=; b=SaPvsXfHa/pTeqIjAiH8z4SWa/Q9uRaW10dJZzT3gJHVH1WEg1iLjtAgGnPBnOYiDT wJba6W1iIKm/Zw4cFd+sTO2noCMP3eBIkg4M/I1sRp9lKR4DNl53rQ8U1YZjrTV2GNlU 1T2kVXua2MS+4rxC0NKzINkAy0dmF+W51Pi7yVv9zJ5tBEPCgPyCeQnvkeZZyvoA2ZFC natqAhi4UXt1G8Z/53n9gG7B0w3FLgFmJb04kiHu89SFdQxC5BcjVt785II+jw/aMgBP 0Sy59qHlEDXv55KCOwxpNPMG5U6Xyw+eTMX/D7PihUX/DVjmZ64ebtwYNzgu4VGcxkZA NgiA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d12si10156186pgt.59.2019.04.01.16.56.15; Mon, 01 Apr 2019 16:56:30 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726534AbfDAXzk (ORCPT + 99 others); Mon, 1 Apr 2019 19:55:40 -0400 Received: from mail-pl1-f196.google.com ([209.85.214.196]:43073 "EHLO mail-pl1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725893AbfDAXzk (ORCPT ); Mon, 1 Apr 2019 19:55:40 -0400 Received: by mail-pl1-f196.google.com with SMTP id m10so5269317plt.10 for ; Mon, 01 Apr 2019 16:55:40 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=53C+UAYBW94xvPX0bSE+X4pak3bywZ/H7OEveLLLrVc=; b=nQbgstZCU+rjmjMdiU5T4393+SoGi1lKKeWERVBM0Vw0IlkdzW8wZBlCSLPC7UwLTC ud8oTTfIY6gzv9h0pGu3tRjuEyFi0u44m1rSd6pKAlT75K1Cntb96oN6qRQ+iQY+pKAD Amhc4WCFiZ3ozXXVnaSEy5dnJ2rC7NDezf3i81DR7EXgahBYPYzhcujpMVV4OMY46mn1 99wlS6EbACQYr3198KmMkfTAgTpX10FCaMuV1Ivr0OSgWILrqpppxepnFCoOhtcGhFoN sy25/hR1O1CMligTybnx6EYkHQENtj8eQ6cu8fCwkJ/0FZHeH4w4dRPJ/sKQkf3ZNKeS MPWg== X-Gm-Message-State: APjAAAUN+S2qw1ODVsZoJe/le/CG8ODSyQIfdAIOfA2BA04XdGNFLhNg ly5votsOR/wuZh6FlkBDPGAfezd6 X-Received: by 2002:a17:902:2e83:: with SMTP id r3mr49558255plb.153.1554162939659; Mon, 01 Apr 2019 16:55:39 -0700 (PDT) Received: from sultan-box.localdomain ([2601:200:c001:5f40:f27:f07b:c72c:45e3]) by smtp.gmail.com with ESMTPSA id i79sm22625237pfj.28.2019.04.01.16.55.38 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 01 Apr 2019 16:55:38 -0700 (PDT) Date: Mon, 1 Apr 2019 16:55:36 -0700 From: Sultan Alsawaf To: Rasmus Villemoes Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, Nathan Chancellor Subject: Re: [RFCv2] string: Use faster alternatives when constant arguments are used Message-ID: <20190401235536.GA874@sultan-box.localdomain> References: <20190324014445.28688-1-sultan@kerneltoast.com> <20190324022406.GA18988@sultan-box.localdomain> <2293c54f-40b1-1e59-665a-bd8f2cb957d2@rasmusvillemoes.dk> <20190324223202.GA875@sultan-box.localdomain> <8672b98c-bf71-7b5b-625e-2f241807d46c@rasmusvillemoes.dk> <20190330225941.GA7456@sultan-box.localdomain> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.11.4 (2019-03-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Apr 01, 2019 at 10:43:13PM +0200, Rasmus Villemoes wrote: > Consider your patch replacing !strcmp(buf, "123") by !memcmp(buf, "123", > 4). buf is known to point to a nul-terminated string. But it may point > at, say, the second-last byte in a page, with the last byte in that page > being a nul byte, and the following page being MMIO or unmapped or all > kinds of bad things. On e.g. x86 where unaligned accesses are cheap, and > seeing that you're only comparing for equality, gcc is likely to compile > the memcmp version into > > *(u32*)buf == 0x00333231 > > because you've told the compiler that there's no problem accessing four > bytes starting at buf. Boom. Even without unaligned access being cheap > this can happen; suppose the length is 8 instead, and gcc somehow knows > that buf is four-byte aligned (and in this case it happens to point four > bytes before a page boundary), so it could compile the memcmp(,,8) into > > *(u32*)(buf+4) == secondword && *(u32*)buf == firstword > > (or do the comparisons in the "natural" order, but it might still do > both loads first). Ok, this is the first solid counter I've seen against my patch. I hadn't considered unaligned word-sized accesses. Thanks. > > And if there are concerns for some arches but not others, then couldn't this be > > a feasible optimization for those which would work well with it? > > No. First, these are concerns for all arches. Second, if you can find > some particular place where string parsing/matching is in any way > performance relevant and not just done once during driver init or > whatnot, maybe the maintainers of that file would take a patch > hand-optimizing some strcmps to memcmps, or, depending on what the code > does, perhaps replacing the whole *cmp logic with a custom hash table. FWIW, I didn't have a specific place in the kernel in mind that heavily relied on str* operations and needed optimization. I just thought it could be a "free" optimization, but apparently not. > But a patch implicitly and silently touching thousands of lines of code, > without an analysis of why none of the above is a problem for any of > those lines, for any .config, arch, compiler version? No. Well, I thought it could be proven to be correct regardless of the context, which would imply that making such a global change touching thousands lof lines of code would be safe. But sadly it isn't. This does bring into question a greater problem though: usage of memcmp throughout the kernel where the size provided is not the size of the smaller container being compared. This usage of memcmp (which is quite easy to find all across the kernel) could run into the unaligned memory access across a page boundary that you gave as an example, no? Sultan