Date: Wed, 9 Sep 2020 10:05:18 -0300
From: Jason Gunthorpe
To: Christoph Hellwig
Cc: Ming Mao, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    linux-mm@kvack.org, alex.williamson@redhat.com,
    akpm@linux-foundation.org, cohuck@redhat.com,
    jianjay.zhou@huawei.com, weidong.huang@huawei.com, peterx@redhat.com,
    aarcange@redhat.com, wangyunjian@huawei.com, willy@infradead.org,
    jhubbard@nvidia.com
Subject: Re: [PATCH V4 1/2] vfio dma_map/unmap: optimized for hugetlbfs pages
Message-ID: <20200909130518.GE87483@ziepe.ca>
References: <20200908133204.1338-1-maoming.maoming@huawei.com>
    <20200908133204.1338-2-maoming.maoming@huawei.com>
    <20200909080114.GA8321@infradead.org>
In-Reply-To: <20200909080114.GA8321@infradead.org>

On Wed, Sep 09, 2020 at 09:01:14AM +0100,
Christoph Hellwig wrote:
> I really don't think this approach is any good.  You workaround
> a deficiency in the pin_user_pages API in one particular caller for
> one particular use case.

RDMA has the same basic issue. This should not be solved with
workarounds in VFIO; a common API would be good.

> I think you'd rather want either:
>
>  (1) a FOLL_HUGEPAGE flag for the pin_user_pages API family that returns
>      a single struct page for any kind of huge page, which would also
>      benefit all kinds of other users rather than adding these kinds of
>      hacks to vfio.

How would this be used? The VMAs can have mixed page sizes, so the
caller would have to somehow switch and call twice? I'm not sure this
is faster.

>  (2) add a bvec version of the API that returns a variable size
>      "extent"

This is the best one, I think. The IOMMU setup can have multiple page
sizes, so having the largest contiguous blocks pre-computed should
speed that up.

Wouldn't it be a win for vfio to use an sgl rather than a page list?
Especially if we can also reduce the number of pages pinned by only
pinning head pages.

> I had started on (2) a while ago, and here is branch with my code (which
> is broken and fails test, but might be a start):
>
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/gup-bvec
>
> But for now I wonder if (1) is the better start, which could still be
> reused to (2) later.

What about some 'pin_user_page_sgl' as a stepping stone? Switching
from that point to bvec seems like a smaller step?

Jason
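
To make the "largest contiguous blocks pre-computed" point concrete,
here is a rough userspace sketch of the coalescing step that an
extent-returning API (bvec or sgl flavoured) would do for its callers.
It is only an illustration: struct pfn_extent and coalesce_pfns() are
made-up names, this is not kernel code, and it is not taken from the
gup-bvec branch. It assumes the input is the per-page PFN list that a
page-array style pin would have produced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical extent type: one run of physically contiguous pages. */
struct pfn_extent {
	unsigned long start_pfn;	/* first page frame number of the run */
	unsigned long nr_pages;		/* number of contiguous pages in the run */
};

/*
 * Hypothetical helper: merge a per-page PFN list into contiguous extents.
 * A PFN that directly follows the end of the previous extent grows that
 * extent; anything else starts a new one.
 *
 * 'out' must have room for n entries (the worst case, nothing contiguous).
 * Returns the number of extents written.
 */
size_t coalesce_pfns(const unsigned long *pfns, size_t n,
		     struct pfn_extent *out)
{
	size_t nr_ext = 0;

	assert(out != NULL);

	for (size_t i = 0; i < n; i++) {
		struct pfn_extent *last = nr_ext ? &out[nr_ext - 1] : NULL;

		if (last && last->start_pfn + last->nr_pages == pfns[i]) {
			last->nr_pages++;	/* extends the current run */
			continue;
		}

		/* Discontiguous (or first) page: open a new extent. */
		out[nr_ext].start_pfn = pfns[i];
		out[nr_ext].nr_pages = 1;
		nr_ext++;
	}

	return nr_ext;
}
```

With a hugetlbfs-backed range most of the input collapses into a few
large extents, so the IOMMU mapping code could pick its page size per
extent instead of rediscovering contiguity page by page.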