From: Mina Almasry
Date: Sun, 2 Jul 2023 23:22:33 -0700
Subject: Re: Memory providers multiplexing (Was: [PATCH net-next v4 4/5] page_pool: remove PP_FLAG_PAGE_FRAG flag)
To: David Ahern
Cc: Jakub Kicinski, Jesper Dangaard Brouer, brouer@redhat.com,
    Alexander Duyck, Yunsheng Lin, davem@davemloft.net, pabeni@redhat.com,
    netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Lorenzo Bianconi,
    Yisen Zhuang, Salil Mehta, Eric Dumazet, Sunil Goutham, Geetha sowjanya,
    Subbaraya Sundeep, hariprasad, Saeed Mahameed, Leon Romanovsky,
    Felix Fietkau, Ryder Lee, Shayne Chen, Sean Wang, Kalle Valo,
    Matthias Brugger, AngeloGioacchino Del Regno, Jesper Dangaard Brouer,
    Ilias Apalodimas, linux-rdma@vger.kernel.org,
    linux-wireless@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-mediatek@lists.infradead.org, Jonathan Lemon
X-Mailing-List: linux-wireless@vger.kernel.org

On Sun, Jul 2, 2023 at 9:20 PM David Ahern wrote:
>
> On 6/29/23 8:27 PM, Mina Almasry wrote:
> >
> > Hello Jakub, I'm looking into device memory (peer-to-peer) networking
> > actually, and I plan to pursue using the page pool as a front end.
> >
> > Quick description of what I have so far:
> > current implementation uses device memory with struct pages; I am
> > putting all those pages in a gen_pool, and we have written an
> > allocator that allocates pages from the gen_pool. In the driver, we
> > use this allocator instead of alloc_page() (the driver in question is
> > gve which currently doesn't use the page pool). When the driver is
> > done with the p2p page, it simply decrements the refcount on it and
> > the page is freed back to the gen_pool.

Quick update here: I was able to get my implementation working with
the page pool as a front end, using the memory provider API Jakub
wrote here:

https://github.com/kuba-moo/linux/tree/pp-providers

The main complication indeed was the fact that my device memory pages
are ZONE_DEVICE pages, which are incompatible with the page_pool due
to the union in struct page. I thought of a couple of approaches to
resolve that.

1. Make my device memory pages non-ZONE_DEVICE pages. The issue with
that is that if the page is not ZONE_DEVICE, put_page(page) will
attempt to free it to the buddy allocator, I think, which is not
correct. The only places where the mm stack currently allows a custom
freeing callback (AFAIK) are ZONE_DEVICE pages, where
free_zone_device_page() will call the provided callback in
page->pgmap->ops->page_free, and compound pages, where the
compound_dtor is specified. My device memory pages aren't compound
pages, so only ZONE_DEVICE pages do what I want.

2. Convert the pages from ZONE_DEVICE pages to page_pool pages and
vice versa as they're being inserted into and removed from the page
pool. This, I think, works elegantly without any issue, and is the
option I went with. The info from ZONE_DEVICE that I care about for
device memory TCP is the page->zone_device_data, which holds the
dma_addr, and the page->pgmap, which holds the page_free op. I'm able
to store both in my memory provider, so I can swap pages between
ZONE_DEVICE and page_pool back and forth.

So far I have needed pretty much no modifications to Jakub's memory
provider implementation, and my functionality tests are passing. If
there are no major objections I'll look into cleaning up the
interface a bit and propose it for merge. This is a prerequisite of
device memory TCP via the page_pool.

>
> I take it these are typical Linux networking applications using standard
> socket APIs (not dpdk or xdp sockets or such)? If so, what does tcpdump
> show for those skbs with pages for the device memory?
>

Yes, these are using (mostly) standard socket APIs. We have small
extensions to sendmsg() and recvmsg() to pass a reference to the
device memory in both these cases, but that's about it.

tcpdump is able to access the header of these skbs, which is in host
memory, but not the payload in device memory. Here is an example
session with my netcat-like test for device memory TCP:

https://pastebin.com/raw/FRjKf0kv

tcpdump seems to work, and the length of the packets above is correct.
tcpdump -A, however, isn't able to print the payload of the packets:

https://pastebin.com/raw/2PcNxaZV

-- 
Thanks,
Mina
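
For readers following option 2 above, here is a rough userspace model
of the back-and-forth conversion. It is only a sketch in Python, not
kernel code and not the API from the pp-providers tree: the names
Page, PgMap, DeviceMemoryProvider, to_page_pool(), to_zone_device()
and demo() are all made up for illustration. The single "fields" dict
stands in for the union in struct page, so the ZONE_DEVICE view is
lost when the page_pool view is written, and the provider keeps the
dma_addr and pgmap on the side until the page leaves the pool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class PgMap:
    """Stand-in for page->pgmap; page_free models pgmap->ops->page_free."""
    page_free: Callable[["Page"], None]


@dataclass
class Page:
    """Toy struct page: 'fields' models the union, so it holds either
    the ZONE_DEVICE fields or the page_pool fields, never both."""
    pfn: int
    kind: str      # "zone_device" or "page_pool"
    fields: dict


class DeviceMemoryProvider:
    """Hypothetical memory provider (illustration only)."""

    def __init__(self) -> None:
        # Side table: pfn -> (dma_addr, pgmap), kept while the page is
        # in its page_pool form and the union cannot hold them.
        self._saved: Dict[int, Tuple[int, PgMap]] = {}

    def to_page_pool(self, page: Page) -> None:
        """Called when the page is inserted into the page pool."""
        assert page.kind == "zone_device"
        dma_addr = page.fields["zone_device_data"]
        self._saved[page.pfn] = (dma_addr, page.fields["pgmap"])
        page.kind = "page_pool"
        page.fields = {"pp": self, "dma_addr": dma_addr}  # overwrites union

    def to_zone_device(self, page: Page) -> None:
        """Called when the page is removed from the page pool."""
        assert page.kind == "page_pool"
        dma_addr, pgmap = self._saved.pop(page.pfn)
        page.kind = "zone_device"
        page.fields = {"zone_device_data": dma_addr, "pgmap": pgmap}


def put_page(page: Page) -> None:
    """Models the final put of a ZONE_DEVICE page: the custom
    page_free callback runs instead of the buddy allocator."""
    assert page.kind == "zone_device"
    page.fields["pgmap"].page_free(page)


def demo():
    freed = []  # pfns handed back to the (imaginary) gen_pool
    pgmap = PgMap(page_free=lambda p: freed.append(p.pfn))
    page = Page(pfn=7, kind="zone_device",
                fields={"zone_device_data": 0x1000, "pgmap": pgmap})
    provider = DeviceMemoryProvider()
    provider.to_page_pool(page)     # page enters the pool
    provider.to_zone_device(page)   # page leaves the pool
    put_page(page)                  # last reference dropped
    return page, freed
```

Calling demo() walks one page through the pool and back; the point
of the side table is that the page_free callback and dma_addr survive
the round trip even though the page_pool view overwrote them.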