Received: by 2002:a05:6a10:7420:0:0:0:0 with SMTP id hk32csp1258150pxb; Wed, 16 Feb 2022 14:56:15 -0800 (PST) X-Google-Smtp-Source: ABdhPJywuqs7mls8aaXU19w0RGmIMyS127a7mPRkHenXZpKkZORjrPepKvgcgeM1eyJwJIXCkamm X-Received: by 2002:a17:907:12d5:b0:6cf:bb0d:9b2f with SMTP id vp21-20020a17090712d500b006cfbb0d9b2fmr205827ejb.138.1645052175031; Wed, 16 Feb 2022 14:56:15 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1645052175; cv=pass; d=google.com; s=arc-20160816; b=uImbC3ECoNZE6GecXVC3+CnkVtJrkY1BWEbqXC6LzDwH5FCYJX+ffN5s4DUtK/R9u7 ugdjCXwJKDjwgj2DwenOTVgzSO4LWVORW6qvMLsYkBBRVgmgYXwYuJrzJOG1af1VhWro o20j4ecfEussWoMdtI5+KiOn7/hMYEt2sZ48d7iPUb+Aoco4L7nLB6rOutrtEJ87YCjQ 92RWRV7m2vChseH79aYpMAOkhbhw/9AJpXnO7fTn+7wwMFj2twLjjrkdT6b+gmEE+jMI fLHxRD8V8cBlkzFAe4nuziperDdHdn2xcFN45XbU3zqEk6XlP8AAdkfxRebQVpM318pL NqYw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:content-transfer-encoding :in-reply-to:from:references:cc:to:content-language:subject :user-agent:date:message-id:dkim-signature; bh=3hWFbG3vbl8TgBT8Z4DbUpAOIIH2ow8JRMteTkSIXSg=; b=ObUghd/Q5SeGG6SA5KveKnamStlHapIhCRrjTPs9Z+fysvFh4NLa4cQZhanc6IF3Wo 0SS8e8PrJuTlmCwVpKtlLv9zkJWlC8oaLOVB+JRi692v8wMfi3CCKJqZQRtJAuMvfUe5 N6ghF5bxInKFLxssKVzjV0gA2nR7afexBG+CzCSvh8P8GlK82NjTQThTHrgW12dgD3aG JQzSWlFlIHu9sv5vyMMbxrJNQuViR3WiCFFIUK80sE3MUoLQMs6uarAnWWoVSbst6gPf e91ww5Xdrdso/YH2czWeH0OcwdU/S5G4gDZZHPZ999YQ4Z66pv1JttUGBVcd6SUv2Uo4 J7ww== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@amd.com header.s=selector1 header.b=TuXLgo7w; arc=pass (i=1); spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amd.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id sa15si822969ejc.797.2022.02.16.14.55.42; Wed, 16 Feb 2022 14:56:15 -0800 (PST) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@amd.com header.s=selector1 header.b=TuXLgo7w; arc=pass (i=1); spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amd.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236761AbiBPQ5N (ORCPT + 99 others); Wed, 16 Feb 2022 11:57:13 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:43536 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236786AbiBPQ5L (ORCPT ); Wed, 16 Feb 2022 11:57:11 -0500 Received: from NAM11-BN8-obe.outbound.protection.outlook.com (mail-bn8nam11on2063.outbound.protection.outlook.com [40.107.236.63]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 90A9225D24D; Wed, 16 Feb 2022 08:56:56 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=PARwtzBHpw71gd88b7eMhRRRMl7EFEgLBiaUghQNQ+CZHDvHTB3MjJxJ/25gtQNAN62SLVBZCn2w20P8HYjK6kwFo0qtr7PoMZ+1OSNyObATl1Mn4GbCoMfnxUKwA3rMyBQKa/rlildU2TGuP3wp7gTlsN3RimjHue4+ZQDqf8aixaoUbhRztk+BvoOmqWlDPqFe+9P2EkZ2JTziRw3ifBBBTXFCWYkZPW4AcR2wgaTPKAoyELnqQS+0tRoyaKtfJzCBDlAd42ObB4+xlTs/RhFYGHUFsiyORKWOVXvtbAfYDOqnnxoGBk3piAbD80bry3JB64UN02PRbKLINheYpA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=3hWFbG3vbl8TgBT8Z4DbUpAOIIH2ow8JRMteTkSIXSg=; b=m3526gzBI55Ki/JABXUbZCWa01gq8GiYh3/0ca/Kdbvx4IlvMgIOC++1LbpV7rAleI1PGAQK1HtcL3jZEITzY7Ikly6OLitqbq+1cjl3UveIkB1Zb+gRn1MzLXfBWvJuu2ex2K8bVqNqd0pfKZcEd63zzLPC6B8/J4LcuKAJ1fGYqaZv4bXGghLNB3gNpxeDYTeY3q0leAwTtVobXC0X2CP/h92kDI7IG78/Y/QfFuW/TJUCQkKGPcnmG3h05QldTaved7zWMyD7L/6red9vUKyLdmKg2B43nkUfk1T3d4HFIsPlIzHJ9qo8Hyp6dVsompLm3kcQmU6pnJpwrW5DWA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=none; dmarc=none; dkim=none; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=3hWFbG3vbl8TgBT8Z4DbUpAOIIH2ow8JRMteTkSIXSg=; b=TuXLgo7w6QQNMubsP96eB0evI6TQ0mugi1Kmnk3xXbMtdtyjNKOfZQvdRPHg/rAPu2rMGpM8R5o7VObvtdX1vkKYIHH/1dOEXCG/mcgOJPX8sgiE8ld8wzp4bhXSlXEZ9BADAnvRVO+PLSnmvGlXKgXSoBudP0AQ7xNb/qDuPf0= Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=amd.com; Received: from BN9PR12MB5115.namprd12.prod.outlook.com (2603:10b6:408:118::14) by BN8PR12MB3459.namprd12.prod.outlook.com (2603:10b6:408:65::11) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4975.18; Wed, 16 Feb 2022 16:56:54 +0000 Received: from BN9PR12MB5115.namprd12.prod.outlook.com ([fe80::38ec:3a46:f85e:6cfa]) by BN9PR12MB5115.namprd12.prod.outlook.com ([fe80::38ec:3a46:f85e:6cfa%4]) with mapi id 15.20.4995.016; Wed, 16 Feb 2022 16:56:54 +0000 Message-ID: <4362b4ec-ceb5-a712-bd03-24b749d1d004@amd.com> Date: Wed, 16 Feb 2022 11:56:51 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0 Subject: Re: [PATCH v6 01/10] mm: add zone device coherent type memory support Content-Language: en-US To: Jason Gunthorpe Cc: Christoph Hellwig , David Hildenbrand , Alex Sierra , akpm@linux-foundation.org, linux-mm@kvack.org, rcampbell@nvidia.com, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, jglisse@redhat.com, apopple@nvidia.com, willy@infradead.org References: <078dd84e-ebbc-5c89-0407-f5ecc2ca3ebf@redhat.com> <20220215144524.GR4160@nvidia.com> <20220215183209.GA24409@lst.de> <20220215194107.GZ4160@nvidia.com> <20220215214749.GA4160@nvidia.com> <002ad572-4d32-7133-06f3-aa680c297be2@amd.com> <20220216020100.GC4160@nvidia.com> From: Felix Kuehling In-Reply-To: <20220216020100.GC4160@nvidia.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-ClientProxiedBy: YT3PR01CA0043.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:b01:82::10) To BN9PR12MB5115.namprd12.prod.outlook.com (2603:10b6:408:118::14) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 9df28f19-e447-4e13-4ddd-08d9f16d5582 X-MS-TrafficTypeDiagnostic: BN8PR12MB3459:EE_ X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:10000; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: jR3roYlZWZZIq8J05wUpFz2VdfUkOrLeLUcrqxIJpKglnr1XAGc0STlVmzvLdLtR9R8CkHT8meYzsSsrACSztqxlwYcdBje2Fwgl1v7iPKEwCxxEqdxWG51NewDhHDt9yiVFK/vfDie8/v9zf938nh0y+CTo18ZlBp1pD5CCywnSFKwbjDgLksHyAbutXa2zfso/4jdODQyIpoUFh3Fo7AzQqu4idmnM+kwlSfJloZ4XsFbTdvKcCdbw8W5GlDn3wffQyzkZRw7pigzb3IW6EA7zixM2o4mbyj+mX/aMVcYX6U1y+1a4rDeoH0OJtCvg1U5yaRNHBjf//yiPS2ZL0kx4uSnOD3KgJNj/7UaARjHYs13Bm0rOr74U6OSu3UzdEHORvd6sabSbSmnCCMrVTfZxHZZQ8E9ZMASGYJFdiigcg200sYQxxgxBThbG0ABDtjljvvH1Rw19/Kn8hwgxSa4Fwjkr8rmd+yNNH0MJObtQ7krePUP6iNeWqzLCQemyrwyarbOKTDzTa3zBRESZRWFrjII8fKVKNT8Fm5jwxLl4YRcPxi620nLerlCMZvlcLP0Lfy72wFmZ/ptYHwBxQ3jMbS2dJg1xEenQg5oKmcKp/UMyi4kAbQzYUile6tFN2KUkrhHzSrZNLR9B7EhmlBs9JBSU6Ps+2PgV7vFkn8TWvGk1xnPKDJGmkLCyjrGejLurCswtdvAagQnAs631kIIRP5rTeATzPn+MzpUStqM= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:BN9PR12MB5115.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230001)(4636009)(366004)(44832011)(2906002)(36756003)(2616005)(508600001)(6506007)(83380400001)(31696002)(26005)(186003)(7416002)(66556008)(4326008)(8676002)(8936002)(66476007)(66946007)(5660300002)(38100700002)(316002)(6916009)(54906003)(6666004)(6512007)(6486002)(86362001)(31686004)(43740500002)(45980500001);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?B?TjFKeWhSRk11cnFIK2F0MXN4ek1hZWZxaWN0VHdPS1ZPLzF1R2U5QTFGMS84?= =?utf-8?B?WFp5Sm9mQ21LaG9uN1luM1AwY0RHcUYyTitaZzRKb3doWjl6clU3alJPdXNK?= =?utf-8?B?akY2cXRUTXNacitiaFhQTWNxaklZZ1doNzNoYnNDYmlSbXJYWHF2QXMvMmRv?= =?utf-8?B?bHhqRm1tRlIyMkVWMFJvdjFXQldKQlM4b3JzcWx1VnhBTHVvN1c1cVN4dEM3?= =?utf-8?B?OVJrd0FpcG9YSVBMbHZuZjN5ZDFVTHJKMnh0NytwRXlraTgwTEdhUmd4WG5j?= =?utf-8?B?VHhHVmcwOWdvQ25VckhyK0FMVWNYVlNEcEtEb3orZ3lMT2M3bUw3NXp1QUdw?= =?utf-8?B?T21nQWM3YkgwUWFhRkN0cnFDV2tCSkZ1eStzSDFMSGRCWG93VnlsUFdkQ2hl?= =?utf-8?B?ZUpPZE1JNC9DRUZmUVREcTM5Nm12TXBNelZZQ3dhQStYWU1pTklpd0cyeEdU?= =?utf-8?B?a2FjZmorY0tVd2JpZ3k1NGF1ZndsLzNDcm5zaTd3VSswSmNkdWVxcEtPNktE?= =?utf-8?B?MGdhNzhkMXlFRkJFSkkwMzNGTU9sT0UxSTNhVUJwVmQwREVHNmExRW5SdHNy?= =?utf-8?B?MnFMUGQycnhWTlhLTUxma2NSUTNjblRyYjFueTByb0p1TWQ5VkJSYzZQaERs?= =?utf-8?B?cVZSenpHcjN1aWJxamh0eklxWWhDWllHVFpwODYxMittaHJMcE5VODFhWFYy?= =?utf-8?B?UGk3Y3EzQWRlemk5cVNETjBGaS9SajVqTnhNWk45YjNUd0tpYnlnZkpnM1Z1?= =?utf-8?B?LzhSK1NqNHMxUVIzMVNJWm40ek5JaXRaK1lsSGpHbkFhM3VvdUV4bEdvRmNi?= =?utf-8?B?TDNBWmNUOFY2TTRvRGMxbHl3QlM5S1FrZ1dEVFRDNzBiSnNZSk9DdjkxZ0Js?= =?utf-8?B?ZXJ6eU4rbjgzaGhreXUrOWFMeHV2b2hLc3ZUVEF4RjhFa0JyQUdOejY4Vm5R?= =?utf-8?B?U1UrckIyUy8rVWxmbHJVV1l5NjJQU0hIQ2dZVEhGNU1hQmo4ZERHQm1XcXls?= =?utf-8?B?SnpGWG1aOE5aRWVHU1BMMVRLV0FHSVYzOVU2ZmlZdlpCTUxpNkRRT2tHSGJ1?= =?utf-8?B?ZG4zcVJHMFptbG9PQ0JKdzdiOG81bnVRNGhnSVh0cG5sYjVSZDdKMTh2b3pP?= =?utf-8?B?bjZmQ040SDhZVy94SDVvd3NjQzYyeERuRW4wUGVYRTNTR3ZuWHEvbWo3ZTMv?= =?utf-8?B?eWFMWXhXSlVwc1g0c0V0RDVUaWNvY2NBQmZ3TlVhMGJNWEhJenRSMGduU1Jq?= =?utf-8?B?YUdyMm5DV1g2S1BaWmpra3RDQjYwOUZRallZTjhBMGFXamU4Q3k0cEdTaGo4?= =?utf-8?B?QTc3ek9SaTEzWmltTmpmTGtUektKZENnREdiTU84UkJMQXByUEkzRTBFckNt?= =?utf-8?B?QVhwV1UxVW5HYXNLWExvamVYdEc4eVpoc0tYc2F4WjM2dklGa1dTeHBmQ0VM?= =?utf-8?B?TCtBQnYyVHBuTG55Sm00ajVLNTVYeDI3RnAvb3EydDA2em9NSTY0RWwvaUU0?= =?utf-8?B?dlI4OVlLemVTQ0VnenBhRFpvRWszRzVpMERvRHBORnFOZEhXY21lWVV2bmx5?= =?utf-8?B?QldKTG5VS21BTW9yQlBBSWkxVEYyc1ZYSFVzbDhvS1VCbFdHU1BNeFltY0Q3?= =?utf-8?B?N1dVWERkdXpDMGg1Z2FTVzcxVE5Pa29GZ0lVMFpKYmRGQmxNQkNiWTU4UlRZ?= =?utf-8?B?L1FPRVN1YWNNT1gvZ21PRm8zSUsvUFlHUFJjTkVQRjRCSklCaElJY1hjMlhQ?= =?utf-8?B?YmUzRUpUeXhBUXFCbVFNNmJiMjA3em1zRDhiNkZSZ2dIL0F1c3BEWXA0SG54?= =?utf-8?B?WUxJNitmVmV1a21QdnJVTlVBUHhGYWFiSVkvN3A3N3ZxdlplUWh6MFB1c1RO?= =?utf-8?B?NDQ1VXhuTkkwMXQzTDEzTjNyVnZSRVBBOThZZEdVY3I5M05jTURBZXBWQytJ?= =?utf-8?B?aExPZ3h4QU5rdFd2SUQwSjF5Z1ZhVlRzeFNsSCtMRXgzUHVoa0l2SzEvL2hm?= =?utf-8?B?K0RYcTJRb2ZzRll6VVk4UGYvVXRLNXhrZU1GSEFzdFBoVVBod05QL0tsQy9r?= =?utf-8?B?OVE1N3VYeXhwb3JmREt6UE9iTWpNOHlESWxFRWNEZTkrT1U1RUdUUUlzZkpT?= =?utf-8?B?QnVXb3lsbHZLcUNoUmhCVlR5U1NJazU4SzRIZG1CRjd6QldHY0J4S1h2RXA4?= =?utf-8?Q?HeGB2Y/GJ2MlGh5B9+InzHM=3D?= X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-Network-Message-Id: 9df28f19-e447-4e13-4ddd-08d9f16d5582 X-MS-Exchange-CrossTenant-AuthSource: BN9PR12MB5115.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 16 Feb 2022 16:56:53.9842 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: EtGAavjWFVbD9gQ3oqaiATJSGUFogjJ3mZrQF9t5MQfO0nwoTX4oAVXLHZyTHW3p41NgRWipTpogtm+K06yagw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: BN8PR12MB3459 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org Am 2022-02-15 um 21:01 schrieb Jason Gunthorpe: > On Tue, Feb 15, 2022 at 05:49:07PM -0500, Felix Kuehling wrote: > >>> Userspace does >>> 1) mmap(MAP_PRIVATE) to allocate anon memory >>> 2) something to trigger migration to install a ZONE_DEVICE page >>> 3) munmap() >>> >>> Who decrements the refcout on the munmap? >>> >>> When a ZONE_DEVICE page is installed in the PTE is supposed to be >>> marked as pte_devmap and that disables all the normal page refcounting >>> during munmap(). >>> >>> fsdax makes this work by working the refcounts backwards, the page is >>> refcounted while it exists in the driver, when the driver decides to >>> remove it then unmap_mapping_range() is called to purge it from all >>> PTEs and then refcount is decrd. munmap/fork/etc don't change the >>> refcount. >> Hmm, that just means, whether or not there are PTEs doesn't really >> matter. > Yes, that is the FSDAX model > >> It should work the same as it does for DEVICE_PRIVATE pages. I'm not sure >> where DEVICE_PRIVATE page's refcounts are decremented on unmap, TBH. But I >> can't find it in our driver, or in the test_hmm driver for that matter. > It is not the same as DEVICE_PRIVATE because DEVICE_PRIVATE uses swap > entries. The put_page for that case is here: > > static unsigned long zap_pte_range(struct mmu_gather *tlb, > struct vm_area_struct *vma, pmd_t *pmd, > unsigned long addr, unsigned long end, > struct zap_details *details) > { > [..] > if (is_device_private_entry(entry) || > is_device_exclusive_entry(entry)) { > struct page *page = pfn_swap_entry_to_page(entry); > > if (unlikely(zap_skip_check_mapping(details, page))) > continue; > pte_clear_not_present_full(mm, addr, pte, tlb->fullmm); > rss[mm_counter(page)]--; > > if (is_device_private_entry(entry)) > page_remove_rmap(page, false); > > put_page(page); > > However the devmap case will return NULL from vm_normal_page() and won't > do the put_page() embedded inside the __tlb_remove_page() in the > pte_present() block in the same function. > > After reflecting for awhile, I think Christoph's idea is quite > good. Just make it so you don't set pte_devmap() on your pages and > then lets avoid pte_devmap for all refcount correct ZONE_DEVICE pages. I'm not sure if pte_devmap is actually set for our DEVICE_COHERENT pages. As far as I can tell, this comes from a bit in the pfn: #define PFN_DEV (1ULL << (BITS_PER_LONG_LONG - 3)) #define PFN_MAP (1ULL << (BITS_PER_LONG_LONG - 4)) ... static inline bool pfn_t_devmap(pfn_t pfn) { const u64 flags = PFN_DEV|PFN_MAP; return (pfn.val & flags) == flags; } In the case of DEVICE_COHERENT memory, the pfns correspond to real physical memory addresses. I don't think they have those PFN_DEV|PFN_MAP bits set. Regards,   Felix > > Jason