From: Matthew Rosato
Date: Tue, 2 May 2023 20:31:45 -0400
Subject: Re: [PATCH v8 0/3] mm/gup: disallow GUP writing to file-backed mappings by default
Message-ID: <20d078c5-4ee6-18dc-d3a5-d76b6a68f64e@linux.ibm.com>
To: Lorenzo Stoakes, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Andrew Morton
Cc: Jason Gunthorpe, Jens Axboe, Matthew Wilcox, Dennis Dalessandro,
    Leon Romanovsky, Christian Benvenuti, Nelson Escobar, Bernard Metzler,
    Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
    Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers, Adrian Hunter,
    Bjorn Topel, Magnus Karlsson, Maciej Fijalkowski, Jonathan Lemon,
    "David S . Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    Christian Brauner, Richard Cochran, Alexei Starovoitov, Daniel Borkmann,
    Jesper Dangaard Brouer, John Fastabend, linux-fsdevel@vger.kernel.org,
    linux-perf-users@vger.kernel.org, netdev@vger.kernel.org,
    bpf@vger.kernel.org, Oleg Nesterov, Jason Gunthorpe, John Hubbard,
    Jan Kara, "Kirill A . Shutemov", Pavel Begunkov, Mika Penttila,
    David Hildenbrand, Dave Chinner, "Theodore Ts'o", Peter Xu,
    "Paul E . McKenney", Christian Borntraeger
X-Mailing-List: linux-kernel@vger.kernel.org

On 5/2/23 6:51 PM, Lorenzo Stoakes wrote:
> Writing to file-backed mappings which require folio dirty tracking using
> GUP is a fundamentally broken operation, as kernel write access to GUP
> mappings do not adhere to the semantics expected by a file system.
>
> A GUP caller uses the direct mapping to access the folio, which does not
> cause write notify to trigger, nor does it enforce that the caller marks
> the folio dirty.
>
> The problem arises when, after an initial write to the folio, writeback
> results in the folio being cleaned and then the caller, via the GUP
> interface, writes to the folio again.
>
> As a result of the use of this secondary, direct, mapping to the folio no
> write notify will occur, and if the caller does mark the folio dirty, this
> will be done so unexpectedly.
>
> For example, consider the following scenario:-
>
> 1. A folio is written to via GUP which write-faults the memory, notifying
>    the file system and dirtying the folio.
> 2. Later, writeback is triggered, resulting in the folio being cleaned and
>    the PTE being marked read-only.
> 3. The GUP caller writes to the folio, as it is mapped read/write via the
>    direct mapping.
> 4. The GUP caller, now done with the page, unpins it and sets it dirty
>    (though it does not have to).
>
> This change updates both the PUP FOLL_LONGTERM slow and fast APIs. As
> pin_user_pages_fast_only() does not exist, we can rely on a slightly
> imperfect whitelisting in the PUP-fast case and fall back to the slow case
> should this fail.
>
> v8:
> - Fixed typo writeable -> writable.
> - Fixed bug in writable_file_mapping_allowed() - must check combination of
>   FOLL_PIN AND FOLL_LONGTERM not either/or.
> - Updated vma_needs_dirty_tracking() to include write/shared to account for
>   MAP_PRIVATE mappings.
> - Move to open-coding the checks in folio_pin_allowed() so we can
>   READ_ONCE() the mapping and avoid unexpected compiler loads. Rename to
>   account for fact we now check flags here.
> - Disallow mapping == NULL or mapping & PAGE_MAPPING_FLAGS other than
>   anon. Defer to slow path.
> - Perform GUP-fast check _after_ the lowest page table level is confirmed to
>   be stable.
> - Updated comments and commit message for final patch as per Jason's
>   suggestions.

Tested again on s390 using QEMU with a memory backend file (on ext4) and
vfio-pci. This time both vfio_pin_pages_remote() (which will call
pin_user_pages_remote(flags | FOLL_LONGTERM)) and the
pin_user_pages_fast(FOLL_WRITE | FOLL_LONGTERM) call in
kvm_s390_pci_aif_enable() are being allowed (i.e. they return a positive
pin count).
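
For anyone reading along, the flag check mentioned in the v8 notes (FOLL_PIN
AND FOLL_LONGTERM must both be set, not either/or) can be illustrated with a
small userspace sketch. This is not the code from the series. The flag values
below are made up for illustration, the helper names is_longterm_pin() and
write_pin_allowed() are hypothetical, and the needs_dirty_tracking argument
only stands in for whatever vma_needs_dirty_tracking() would report for the
mapping.

```c
#include <stdbool.h>

/* Illustrative values only; the kernel's real FOLL_* values may differ. */
#define FOLL_WRITE    0x01u
#define FOLL_PIN      0x02u
#define FOLL_LONGTERM 0x04u

/*
 * Hypothetical helper: true only when BOTH FOLL_PIN and FOLL_LONGTERM are
 * set. Testing (flags & mask) != 0 instead would also match requests that
 * set just one of the two bits, which is the either/or mistake.
 */
static bool is_longterm_pin(unsigned int gup_flags)
{
	const unsigned int mask = FOLL_PIN | FOLL_LONGTERM;

	return (gup_flags & mask) == mask;
}

/*
 * Hypothetical helper: decide whether a request may proceed.
 * needs_dirty_tracking is an assumption standing in for the result of a
 * per-mapping check such as vma_needs_dirty_tracking().
 */
static bool write_pin_allowed(unsigned int gup_flags, bool needs_dirty_tracking)
{
	/* Read-only requests cannot produce the problematic write. */
	if (!(gup_flags & FOLL_WRITE))
		return true;

	/* Only the long-term pin case is restricted in this sketch. */
	if (!is_longterm_pin(gup_flags))
		return true;

	/* A long-term write pin is refused when dirty tracking is required. */
	return !needs_dirty_tracking;
}
```

In this sketch, a FOLL_WRITE | FOLL_PIN | FOLL_LONGTERM request is refused
only when the mapping is reported as needing dirty tracking. Every other
combination is let through.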