Date: Wed, 4 Jul 2018 12:43:18 +0200
From: Jan Kara
To: John Hubbard
Cc: Christopher Lameter, Jan Kara, john.hubbard@gmail.com, Matthew Wilcox,
 Michal Hocko, Jason Gunthorpe, Dan Williams, linux-mm@kvack.org, LKML,
 linux-rdma, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v2 5/6] mm: track gup pages with page->dma_pinned_* fields
Message-ID: <20180704104318.f5pnqtnn3unkwauw@quack2.suse.cz>
References: <20180702005654.20369-1-jhubbard@nvidia.com>
 <20180702005654.20369-6-jhubbard@nvidia.com>
 <20180702095331.n5zfz35d3invl5al@quack2.suse.cz>
 <010001645d77ee2c-de7fedbd-f52d-4b74-9388-e6435973792b-000000@email.amazonses.com>
 <01000164611dacae-5ac25e48-b845-43ef-9992-fc1047d8e0a0-000000@email.amazonses.com>
 <3c71556f-1d71-873a-6f74-121865568bf7@nvidia.com>
In-Reply-To: <3c71556f-1d71-873a-6f74-121865568bf7@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 03-07-18 10:36:05, John Hubbard wrote:
> On 07/03/2018 10:08 AM, Christopher Lameter wrote:
> > On Mon, 2 Jul 2018, John Hubbard wrote:
> >
> >>> If you establish a reference to a page then increase the page count. If
> >>> the reference is a dma pin action also then increase the pinned count.
> >>>
> >>> That way you know how many of the references to the page are dma
> >>> pins and you can correctly manage the state of the page if the dma pins go
> >>> away.
> >>>
> >>
> >> I think this sounds like what this patch already does, right? See:
> >> __put_page_for_pinned_dma(), __get_page_for_pinned_dma(), and
> >> pin_page_for_dma(). The locking seems correct to me, but I suspect it's
> >> too heavyweight for such a hot path. But without adding a new put_user_page()
> >> call, that was the best I could come up with.
> >
> > When I saw the patch it looked like you were avoiding to increment the
> > page->count field.
>
> Looking at it again, this patch is definitely susceptible to Jan's "page gets
> dma-unpinned too soon" problem. That leaves a window in which the original
> problem can occur.
>
> The page->_refcount field is used normally, in addition to the dma_pinned_count.
> But the problem is that, unless the caller knows what kind of page it is,
> the page->dma_pinned_count cannot be looked at, because it is unioned with
> page->lru.prev. page->dma_pinned_flags, at least starting at bit 1, are
> safe to look at due to pointer alignment, but now you cannot atomically
> count...
>
> So this seems unsolvable without having the caller specify that it knows the
> page type, and that it is therefore safe to decrement page->dma_pinned_count.
> I was hoping I'd found a way, but clearly I haven't. :)

Well, I think the misconception is that "pinned" is a fundamental property
of a page. It is not. "pinned" is a property of a page reference (i.e., a
kind of reference that can be used for DMA access), and a page gets into
the "pinned" state if it has any reference of the "pinned" type. And when
you realize this, it is obvious that you just have to have a special API
for getting and dropping references of this "pinned" type. For getting we
already have get_user_pages(), for putting we have to create the API...

								Honza
-- 
Jan Kara
SUSE Labs, CR
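
The point about "pinned" being a property of a reference rather than of
the page can be illustrated with a small userspace model. This is only a
sketch under stated assumptions, not kernel code: struct toy_page and all
toy_*() functions are hypothetical names made up for illustration, plain
ints stand in for atomic counters, and the struct page union problem
discussed in the quoted text is ignored entirely. toy_get_user_page()
stands in for get_user_pages() and toy_put_user_page() for the
put_user_page() call that does not exist yet.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy userspace model (hypothetical, NOT the kernel's struct page).
 * Every reference bumps refcount; references taken for DMA
 * additionally bump pinned_count.
 */
struct toy_page {
	int refcount;     /* all references, pinned or not */
	int pinned_count; /* the subset of references that are DMA pins */
};

/* Ordinary reference: models get_page() / put_page(). */
void toy_get_page(struct toy_page *p)
{
	p->refcount++;
}

void toy_put_page(struct toy_page *p)
{
	/* A plain put must never consume a reference owned by a pin. */
	assert(p->refcount > p->pinned_count);
	p->refcount--;
}

/* "Pinned" reference: models get_user_pages() on a single page. */
void toy_get_user_page(struct toy_page *p)
{
	p->refcount++;
	p->pinned_count++;
}

/* Dropping a "pinned" reference: models the proposed put_user_page(). */
void toy_put_user_page(struct toy_page *p)
{
	assert(p->pinned_count > 0);
	p->pinned_count--;
	p->refcount--;
}

/* The page is "pinned" exactly while at least one pinned reference exists. */
bool toy_page_is_pinned(const struct toy_page *p)
{
	return p->pinned_count > 0;
}
```

Because the caller states which kind of reference it is dropping, the
put side never has to guess the page type: dropping an ordinary reference
with toy_put_page() leaves the page pinned, and only the matching
toy_put_user_page() clears the pinned state.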