From: Eric Dumazet
Date: Tue, 28 Feb 2023 16:17:34 +0100
Subject: Re: [patch 1/3] net: dst: Prevent false sharing vs. dst_entry::__refcnt
To: Thomas Gleixner
Cc: LKML, Linus Torvalds, x86@kernel.org, Wangyang Guo, Arjan van De Ven,
    "David S. Miller", Jakub Kicinski, Paolo Abeni, netdev@vger.kernel.org,
    Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland, Marc Zyngier
References: <20230228132118.978145284@linutronix.de> <20230228132910.934296889@linutronix.de>
In-Reply-To: <20230228132910.934296889@linutronix.de>

On Tue, Feb 28, 2023 at 3:33 PM Thomas Gleixner wrote:
>
> From: Wangyang Guo
>
> dst_entry::__refcnt is highly contended in scenarios where many connections
> happen from and to the same IP. The reference count is an atomic_t, so the
> reference count operations have to take the cache-line exclusive.
>
> Aside from the unavoidable reference count contention there is another
> significant problem which is caused by that: False sharing.
>
> perf top identified two affected read accesses. dst_entry::lwtstate and
> rtable::rt_genid.
>
> dst_entry::__refcnt is located at offset 64 of dst_entry, which puts it into
> a separate cacheline vs. the read mostly members located at the beginning
> of the struct.

This will probably increase struct rt6_info past the 4 cache line size, right?

It would be nice to allow sharing the 'hot' cache line with seldom used fields.

Instead of mere pads, add some unions, and let rt6i_uncached/rt6i_uncached_list
use them.