References: <20210113133523.39205-1-alobakin@pm.me> <20210113133635.39402-1-alobakin@pm.me>
 <20210113133635.39402-2-alobakin@pm.me> <20210114114046.7272-1-alobakin@pm.me>
 <20210114124406.9049-1-alobakin@pm.me> <20210114125932.9594-1-alobakin@pm.me>
In-Reply-To: <20210114125932.9594-1-alobakin@pm.me>
From: Dmitry Vyukov
Date: Thu, 14 Jan 2021 14:01:59 +0100
Subject: Re: [PATCH v2 net-next 2/3] skbuff: (re)use NAPI skb cache on allocation path
To: Alexander Lobakin
Cc: Eric Dumazet, "David S. Miller", Jakub Kicinski, Edward Cree,
 Jonathan Lemon, Willem de Bruijn, Miaohe Lin, Steffen Klassert,
 Guillaume Nault, Yadu Kishore, Al Viro, netdev, LKML
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jan 14, 2021 at 2:00 PM Alexander Lobakin wrote:
> >>>>>> Instead of calling kmem_cache_alloc() every time when building a NAPI
> >>>>>> skb, (re)use skbuff_heads from napi_alloc_cache.skb_cache. Previously
> >>>>>> this cache was only used for bulk-freeing skbuff_heads consumed via
> >>>>>> napi_consume_skb() or __kfree_skb_defer().
> >>>>>>
> >>>>>> Typical path is:
> >>>>>> - skb is queued for freeing from driver or stack, its skbuff_head
> >>>>>>   goes into the cache instead of immediate freeing;
> >>>>>> - driver or stack requests NAPI skb allocation, an skbuff_head is
> >>>>>>   taken from the cache instead of allocation.
> >>>>>>
> >>>>>> Corner cases:
> >>>>>> - if it's empty on skb allocation, bulk-allocate the first half;
> >>>>>> - if it's full on skb consuming, bulk-wipe the second half.
> >>>>>>
> >>>>>> Also try to balance its size after completing network softirqs
> >>>>>> (__kfree_skb_flush()).
> >>>>>
> >>>>> I do not see the point of doing this rebalance (especially if we do not change
> >>>>> its name describing its purpose more accurately).
> >>>>>
> >>>>> For moderate load, we will have a reduced bulk size (typically one or two).
> >>>>> Number of skbs in the cache is in [0, 64[, there is really no risk of
> >>>>> letting skbs there for a long period of time.
> >>>>> (32 * sizeof(sk_buff) = 8192)
> >>>>> I would personally get rid of this function completely.
> >>>>
> >>>> When I had a cache of 128 entries, I had worse results without this
> >>>> function. But seems like I forgot to retest when I switched to the
> >>>> original size of 64.
> >>>> I also thought about removing this function entirely, will test.
> >>>>
> >>>>> Also it seems you missed my KASAN support request?
> >>>> I guess this is a matter of using kasan_unpoison_range(), we can ask for help.
> >>>>
> >>>> I saw your request, but don't see a reason for doing this.
> >>>> We are not caching already freed skbuff_heads. They don't get
> >>>> kmem_cache_freed before getting into local cache. KASAN poisons
> >>>> them no earlier than at kmem_cache_free() (or did I miss something?).
> >>>> heads being cached just get rid of all references and at the moment
> >>>> of dropping to the cache they are pretty much the same as if they were
> >>>> allocated.
> >>>
> >>> KASAN should not report false positives in this case.
> >>> But I think Eric meant preventing false negatives. If we kmalloc 17
> >>> bytes, KASAN will detect out-of-bounds accesses beyond these 17 bytes.
> >>> But if we put that data into 128-byte blocks, KASAN will miss
> >>> out-of-bounds accesses beyond 17 bytes up to 128 bytes.
> >>> The same holds for "logical" use-after-frees when object is free, but
> >>> not freed into slab.
> >>>
> >>> An important custom cache should use annotations like
> >>> kasan_poison_object_data/kasan_unpoison_range.
> >>
> >> As I understand, I should
> >> kasan_poison_object_data(skbuff_head_cache, skb) and then
> >> kasan_unpoison_range(skb, sizeof(*skb)) when putting it into the
> >> cache?
> >
> > I think it's the other way around. It should be _un_poisoned when used.
> > If it's fixed size, then unpoison_object_data should be a better fit:
> > https://elixir.bootlin.com/linux/v5.11-rc3/source/mm/kasan/common.c#L253
>
> Ah, I thought of this too. But wouldn't there be a false-positive if
> a poisoned skb hits kmem_cache_free_bulk(), not the allocation path?
> We plan to use skb_cache for both reusing and bulk-freeing, and SLUB,
> for example, might do writes into objects before freeing.
> If it also should get unpoisoned before kmem_cache_free_bulk(), we'll
> lose bulking as unpoisoning is performed per-object.

Yes, it needs to be unpoisoned before free.
Unpoison one-by-one, free in bulk. Unpoisoning is debug-only code anyway.
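
The ordering discussed above (poison an object while it sits in the
cache, unpoison it when it is handed out, and unpoison each object
individually before a bulk free) can be sketched in user-space C.
This is only an illustration under stated assumptions: struct obj,
struct obj_cache, poison(), unpoison() and free_bulk() are hypothetical
stand-ins, a boolean flag models the KASAN shadow state, and none of
this is the kernel's actual API or the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_SIZE 64			/* cache capacity mentioned in the thread */
#define CACHE_HALF (CACHE_SIZE / 2)

/* Hypothetical stand-in for an skbuff_head; "poisoned" models shadow state. */
struct obj {
	bool poisoned;
	char payload[32];
};

/* Hypothetical stand-in for the per-CPU NAPI cache. */
struct obj_cache {
	struct obj *slots[CACHE_SIZE];
	size_t count;
};

/* Stand-ins for the KASAN annotations (assumed, not the real helpers). */
static void poison(struct obj *o)   { o->poisoned = true; }
static void unpoison(struct obj *o) { o->poisoned = false; }

/*
 * Stand-in for kmem_cache_free_bulk(): the allocator may write into
 * objects, so each one must already be unpoisoned when it arrives here.
 */
static void free_bulk(struct obj **objs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(!objs[i]->poisoned);
		free(objs[i]);
	}
}

/* Park an unused object; poison it so stray accesses would be caught. */
static void cache_put(struct obj_cache *c, struct obj *o)
{
	if (c->count == CACHE_SIZE) {
		/* Full: unpoison the second half one by one, free in bulk. */
		for (size_t i = CACHE_HALF; i < CACHE_SIZE; i++)
			unpoison(c->slots[i]);
		free_bulk(&c->slots[CACHE_HALF], CACHE_HALF);
		c->count = CACHE_HALF;
	}
	poison(o);
	c->slots[c->count++] = o;
}

/* Hand an object out for use; it is unpoisoned only at this point. */
static struct obj *cache_get(struct obj_cache *c)
{
	struct obj *o;

	if (c->count == 0) {
		/* Empty: bulk-allocate the first half, keeping it poisoned. */
		for (size_t i = 0; i < CACHE_HALF; i++) {
			struct obj *n = calloc(1, sizeof(*n));

			if (!n)
				break;
			poison(n);
			c->slots[c->count++] = n;
		}
		if (c->count == 0)
			return NULL;
	}
	o = c->slots[--c->count];
	unpoison(o);
	return o;
}
```

The per-object unpoison loop runs before the single free_bulk() call,
so bulking of the free itself is preserved; in this sketch the extra
loop corresponds to the debug-only cost mentioned above.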