Received: by 2002:a05:6a10:f3d0:0:0:0:0 with SMTP id a16csp2587063pxv; Sun, 11 Jul 2021 19:08:57 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzBzmJgjtFXiuaKmf6uKOfG2c6//7n56ftADfhWDYuvrhptc00Av7ln+7OGgH6G6/N8/YAu X-Received: by 2002:a17:906:a0ce:: with SMTP id bh14mr43024371ejb.434.1626055737184; Sun, 11 Jul 2021 19:08:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1626055737; cv=none; d=google.com; s=arc-20160816; b=vgs/vonLuv0ATveTTwrdhd+tddNJy7ypi66A71ihK68GCaqOykmP+gre7lU1BrZG0v +SO4ayuc+yt1DIqJCwuSar3s4TVxCjzenWHAZgQBHV5M7Uf3X6DpE2MYf8+ryUsRaP4F OxEWLEp1G7NCo+oRVmlbkP5gtRhO+/anrpYkmmKBPkAdn2AW/Ljeh4JbdFxWDDaFloSg NnS+bmENRgtPQb38S4qINGOUsU8nyd/KB3xz/lJ5BwkuAYyzvMGSmo3xR/GOECbBwoDE 8sFU9HghWi4maFidClW2dtSy9CroeW7ldwuw029w1XtZiDs9r5ebrMaUWoXrTyBJechP QeTQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject; bh=aChcZK1yLiC2f73IPI8sUdxMYGfRaGpMdUkwbKrCF9k=; b=Xo5JPhruwg0AqMCCpXWO4H0ecikKASDIhb2tMgyyJde7c6FjJKEYLbkXU5Ju9c6UG2 mrhKmzWWzqnYpCdUalVFueGEHJXSWBmYiicsPEgiY6MzUGZF3TEltvrKRpXseQJtsTil uDzk/EXnRT0CnL2bzpI6wGVYswhwkGOaCf204rY0OboFW0yqYX5WahuYCcqWXjuo5B4e Q+ASzcQaljpj4s/WNYleHniYF+Q6vlhAna4ADm+OGRb86zyH63qce6GH2q7ZmoQ1VujB sYy4v0nuN91F3CHXqkd/fff3Yjuq02lKFBYvjLjJXtiW+KK3/7nbMVFOSOmv1IrAgUKS pqZw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=huawei.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id o18si14426076edc.14.2021.07.11.19.08.33; Sun, 11 Jul 2021 19:08:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=huawei.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229891AbhGLCJO (ORCPT + 99 others); Sun, 11 Jul 2021 22:09:14 -0400 Received: from szxga02-in.huawei.com ([45.249.212.188]:6911 "EHLO szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229660AbhGLCJO (ORCPT ); Sun, 11 Jul 2021 22:09:14 -0400 Received: from dggemv704-chm.china.huawei.com (unknown [172.30.72.57]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4GNRqc4rdBz7BQ6; Mon, 12 Jul 2021 10:02:52 +0800 (CST) Received: from dggpemm500005.china.huawei.com (7.185.36.74) by dggemv704-chm.china.huawei.com (10.3.19.47) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Mon, 12 Jul 2021 10:06:02 +0800 Received: from [10.69.30.204] (10.69.30.204) by dggpemm500005.china.huawei.com (7.185.36.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.2176.2; Mon, 12 Jul 2021 10:06:02 +0800 Subject: Re: [PATCH rfc v2 3/5] page_pool: add page recycling support based on elevated refcnt To: Alexander Duyck CC: David Miller , Jakub Kicinski , Russell King - ARM Linux , Marcin Wojtas , , , "Salil Mehta" , , , Ilias Apalodimas , "Alexei Starovoitov" , Daniel Borkmann , "John Fastabend" , Andrew Morton , Peter Zijlstra , "Will Deacon" , Matthew Wilcox , "Vlastimil Babka" , , , Peter Xu , Feng Tang , Jason Gunthorpe , Matteo Croce , Hugh Dickins , Jonathan Lemon , "Alexander Lobakin" , Willem de Bruijn , , Cong Wang , Kevin Hao , , Marco Elver , Netdev , LKML , bpf References: <1625903002-31619-1-git-send-email-linyunsheng@huawei.com> <1625903002-31619-4-git-send-email-linyunsheng@huawei.com> From: Yunsheng Lin Message-ID: Date: Mon, 12 Jul 2021 10:06:01 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.69.30.204] X-ClientProxiedBy: dggeme719-chm.china.huawei.com (10.1.199.115) To dggpemm500005.china.huawei.com (7.185.36.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2021/7/11 1:31, Alexander Duyck wrote: > On Sat, Jul 10, 2021 at 12:44 AM Yunsheng Lin wrote: > >> @@ -419,6 +471,20 @@ static __always_inline struct page * >> __page_pool_put_page(struct page_pool *pool, struct page *page, >> unsigned int dma_sync_size, bool allow_direct) >> { >> + int bias = page_pool_get_pagecnt_bias(page); >> + >> + /* Handle the elevated refcnt case first */ >> + if (bias) { >> + /* It is not the last user yet */ >> + if (!page_pool_bias_page_recyclable(page, bias)) >> + return NULL; >> + >> + if (likely(!page_is_pfmemalloc(page))) >> + goto recyclable; >> + else >> + goto unrecyclable; >> + } >> + > > So this part is still broken. Anything that takes a reference to the > page and holds it while this is called will cause it to break. For > example with the recent fixes we put in place all it would take is a > skb_clone followed by pskb_expand_head and this starts leaking memory. Ok, it seems the fix is confilcting with the expectation this patch is based, which is "the last user will always call page_pool_put_full_page() in order to do the recycling or do the resource cleanup(dma unmaping..etc) and freeing.". As the user of the new skb after skb_clone() and pskb_expand_head() is not aware of that their frag page may still be in the page pool after the fix? > > One of the key bits in order for pagecnt_bias to work is that you have > to deduct the bias once there are no more parties using it. Otherwise > you leave the reference count artificially inflated and the page will > never be freed. It works fine for the single producer single consumer > case but once you introduce multiple consumers this is going to fall > apart. It seems we have diffferent understanding about consumer, I treat the above user of new skb after skb_clone() and pskb_expand_head() as the consumer of the page pool too, so that new skb should keep the recycle bit in order for that to happen. If the semantic is "the new user of a page should not be handled by page pool if page pool is not aware of the new user(the new user is added by calling page allocator API instead of calling the page pool API, like the skb_clone() and pskb_expand_head() above) ", I suppose I am ok with that semantic too as long as the above semantic is aligned with the people involved. Also, it seems _refcount and dma_addr in "struct page" is in the same cache line, which means there is already cache line bouncing already between _refcount and dma_addr updating, so it may makes senses to only use bias to indicate number of the page pool user for a page, instead of using "bias - page_ref_count", as the page_ref_count is not reliable if somebody is using the page allocator API directly. And the trick part seems to be how to make the bias atomic for allocating and freeing. Any better idea? > . >