References: <20231113130041.58124-1-linyunsheng@huawei.com>
 <20231113130041.58124-4-linyunsheng@huawei.com>
 <20231113180554.1d1c6b1a@kernel.org>
 <0c39bd57-5d67-3255-9da2-3f3194ee5a66@huawei.com>
From: Mina Almasry
Date: Wed, 15 Nov 2023 09:44:37 -0800
Subject: Re: [PATCH RFC 3/8] memory-provider: dmabuf devmem memory provider
To: Yunsheng Lin
Cc: Jason Gunthorpe, Jakub Kicinski, davem@davemloft.net,
 pabeni@redhat.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 Willem de Bruijn, Kaiyuan Zhang, Jesper Dangaard Brouer,
 Ilias Apalodimas, Eric Dumazet, Christian König, Matthew Wilcox,
 Linux-MM
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Nov 15, 2023 at 1:21 AM Yunsheng Lin wrote:
>
> On 2023/11/14 21:16, Jason Gunthorpe wrote:
> > On Tue, Nov 14, 2023 at 04:21:26AM -0800, Mina Almasry wrote:
> >
> >> Actually because you put the 'strtuct page for devmem' in
> >> skb->bv_frag, the net stack will grab the 'struct page' for devmem
> >> using skb_frag_page() then call things like page_address(), kmap,
> >> get_page, put_page, etc, etc, etc.
> >
> > Yikes, please no. If net has its own struct page look alike it has to
> > stay entirely inside net. A non-mm owned struct page should not be
> > passed into mm calls. It is just way too hacky to be seriously
> > considered :(
>
> Yes, that is something this patchset is trying to do, defining its own
> struct page look alike for page pool to support devmem.
>
> struct page for devmem will not be called into the mm subsystem, so most
> of the mm calls is avoided by calling into the devmem memory provider'
> ops instead of calling mm calls.
>
> As far as I see for now, only page_ref_count(), page_is_pfmemalloc() and
> PageTail() is called for devmem page, which should be easy to ensure that
> those call for devmem page is consistent with the struct page owned by mm.

I'm not sure this is true. These 3 calls are just the calls you're aware
of.
In your proposal you're casting mirror pages into page* and releasing
them into the net stack. You need to scrub the entire net stack for mm
calls, i.e. all driver code and all skb_frag_page() call sites. Off the
top of my head, the driver is probably calling page_address() and
illegal_highdma() is calling PageHighMem(). TCP zerocopy receive is
calling vm_insert_pages().

> I am not sure if we can use some kind of compile/runtime checking to ensure
> those kinds of consistency?
>
> >
> >>> I would expect net stack, page pool, driver still see the 'struct page',
> >>> only memory provider see the specific struct for itself, for the above,
> >>> devmem memory provider sees the 'struct page_pool_iov'.
> >>>
> >>> The reason I still expect driver to see the 'struct page' is that driver
> >>> will still need to support normal memory besides devmem.
> >
> > I wouldn't say this approach is unreasonable, but it does have to be
> > done carefully to isolate the mm. Keeping the struct page in the API
> > is going to make this very hard.
>
> I would expect that most of the isolation is done in page pool, as far as
> I can see:
>
> 1. For control part: the driver may need to tell the page pool which memory
>    provider it want to use. Or the administrator specifies
>    which memory provider to use by some netlink-based cmd.
>
> 2. For data part: I am thinking that driver should only call page_pool_alloc(),
>    page_pool_free() and page_pool_get_dma_addr related function.
>
> Of course the driver may need to be aware of that if it can call kmap() or
> page_address() on the page returned from page_pool_alloc(), and maybe tell
> net stack that those pages is not kmap()'able and page_address()'able.
>
> >
> > Jason
> > .
> >

-- 
Thanks,
Mina
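
A minimal sketch of the compile/runtime checking idea raised in the
quoted text, under stated assumptions: this is user-space C, and
buf_ref, devmem_iov, and every helper name below are hypothetical; they
are not kernel APIs and not the patchset's code. It also assumes both
structs are at least 2-byte aligned, so the low pointer bit is free to
use as a tag. The point it illustrates is that a distinct handle type
cannot be passed to a function taking 'struct page *' without going
through one explicit unwrap helper, which is where the runtime check
lives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; NOT the kernel definitions. */
struct page       { int refcount; };  /* models an mm-owned page */
struct devmem_iov { int refcount; };  /* models a non-mm devmem buffer */

/* Opaque handle. It is a distinct struct type, so passing it to code that
 * expects 'struct page *' is a compile error rather than a silent cast. */
typedef struct { uintptr_t bits; } buf_ref;

#define BUF_REF_DEVMEM ((uintptr_t)1)  /* low pointer bit marks devmem */

static inline buf_ref buf_ref_from_page(struct page *p)
{
	return (buf_ref){ .bits = (uintptr_t)p };
}

static inline buf_ref buf_ref_from_devmem(struct devmem_iov *iov)
{
	/* Relies on the assumed alignment: the low bit must be unused. */
	assert(((uintptr_t)iov & BUF_REF_DEVMEM) == 0);
	return (buf_ref){ .bits = (uintptr_t)iov | BUF_REF_DEVMEM };
}

static inline bool buf_ref_is_devmem(buf_ref r)
{
	return (r.bits & BUF_REF_DEVMEM) != 0;
}

/* The only route to a 'struct page *'. Returns NULL for devmem, so a
 * caller cannot reach page_address()-style code with a devmem buffer
 * unless it ignores the NULL. */
static inline struct page *buf_ref_to_page(buf_ref r)
{
	return buf_ref_is_devmem(r) ? NULL : (struct page *)r.bits;
}

static inline struct devmem_iov *buf_ref_to_devmem(buf_ref r)
{
	return buf_ref_is_devmem(r)
		? (struct devmem_iov *)(r.bits & ~BUF_REF_DEVMEM) : NULL;
}

/* Example of an operation that is valid for both kinds of buffer. */
static inline int buf_ref_count(buf_ref r)
{
	if (buf_ref_is_devmem(r))
		return buf_ref_to_devmem(r)->refcount;
	return buf_ref_to_page(r)->refcount;
}
```

With a handle like this, every call site that needs a real page has to
call buf_ref_to_page() and handle the NULL case, which makes those call
sites easy to find; it does not remove the need to audit them.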