Subject: Re: [PATCH v2 1/1] fs/splice: add missing callback for inaccessible pages
To: Dave Hansen , Claudio Imbrenda , viro@zeniv.linux.org.uk
Cc: david@redhat.com, akpm@linux-foundation.org, aarcange@redhat.com,
    linux-mm@kvack.org, frankja@linux.ibm.com, sfr@canb.auug.org.au,
    jhubbard@nvidia.com, linux-kernel@vger.kernel.org,
    linux-s390@vger.kernel.org, jack@suse.cz, kirill@shutemov.name,
    peterz@infradead.org, sean.j.christopherson@intel.com,
    Ulrich.Weigand@de.ibm.com
References: <20200430143825.3534128-1-imbrenda@linux.ibm.com>
    <1a3f5107-9847-73d4-5059-c6ef9d293551@de.ibm.com>
From: Christian Borntraeger
Message-ID: <3d379d9e-241c-ef3b-dcef-20fdd3b8740d@de.ibm.com>
Date: Fri, 1 May 2020 09:18:34 +0200
X-Mailing-List: linux-kernel@vger.kernel.org

On 01.05.20 00:06, Dave Hansen wrote:
> I was also wondering if Claudio was right about the debug patch having
> races. I went to go look how the s390 code avoids races when pages go
> from accessible->inaccessible.
>
> Because, if all of the traps are in place to transform pages from
> inaccessible->accessible, the code *after* those traps is still
> vulnerable. What *keeps* pages accessible?
>
> The race avoidance is this, basically:
>
> 	down_read(&gmap->mm->mmap_sem);
> 	lock_page(page);
> 	ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
> 	...
>>	expected = expected_page_refs(page);
>>	if (!page_ref_freeze(page, expected))
>>		return -EBUSY;
>>	set_bit(PG_arch_1, &page->flags);
>>	rc = uv_call(0, (u64)uvcb);
>>	page_ref_unfreeze(page, expected);
>
> 	... up_read(mmap_sem) / unlock_page() / unlock pte
>
> I'm assuming that after the uv_call(), the page is inaccessible and I/O
> devices will go boom if they touch the page.
>
> The page_ref_freeze() ensures that references that come between the
> freeze/unfreeze are noticed, but it doesn't actually *stop* new ones for
> users that hold references already. For the page cache, especially,
> someone could do:
>
> 	page = find_get_page();
> 	arch_make_page_accessible();
> 	lock_page();
> 	... make_secure_pte();

Not sure if I got your point here, but this make_secure_pte() should bail
out, because we do check against a calculated refcount value and return
-EBUSY on a mismatch. The find_get_page() should have raised the refcount
beyond the expected value, no?

> 	unlock_page();
> 	get_page();
> 	// ^ OK because I have a ref
> 	// do DMA on inaccessible page
>
> Because the make_secure_pte() code isn't looking for a *specific*
> 'expected' value, it has no way of noticing that the extra ref snuck in
> there.

I think the expected calculation is actually doing that, giving back the
minimum value when no one else has any references that are valid for I/O.

But I might not have understood what you are trying to tell me?

>
> I _think_ expected actually needs to be checked for having a specific
> (low) value so that if there's a *possibility* of a reference holder
> acquiring additional references, the page is known to be off-limits.
> mm/migrate.c has a few examples of this, but I'm not quite sure how
> bulletproof they are. Some of it appears to just be optimizations.
>
>
> b