References: <20230706225037.1164380-1-axelrasmussen@google.com> <20230706225037.1164380-5-axelrasmussen@google.com>
From: Axel Rasmussen
Date: Fri, 7 Jul 2023 12:56:04 -0700
Subject: Re: [PATCH v3 4/8] mm: userfaultfd: add new UFFDIO_POISON ioctl
To: Peter Xu
Cc: Alexander Viro, Andrew Morton, Brian Geffon, Christian Brauner,
    David Hildenbrand, Gaosheng Cui, Huang Ying, Hugh Dickins,
    James Houghton, "Jan Alexander Steffens (heftig)", Jiaqi Yan,
    Jonathan Corbet, Kefeng Wang, "Liam R. Howlett", Miaohe Lin,
    Mike Kravetz, "Mike Rapoport (IBM)", Muchun Song, Nadav Amit,
    Naoya Horiguchi, Ryan Roberts, Shuah Khan, Suleiman Souhlal,
    Suren Baghdasaryan, "T.J. Alumbaugh", Yu Zhao, ZhangPeng,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    linux-kselftest@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 7, 2023 at 6:37 AM Peter Xu wrote:
>
> On Thu, Jul 06, 2023 at 03:50:32PM -0700, Axel Rasmussen wrote:
> > The basic idea here is to "simulate" memory poisoning for VMs. A VM
> > running on some host might encounter a memory error, after which some
> > page(s) are poisoned (i.e., future accesses SIGBUS). They expect that
> > once poisoned, pages can never become "un-poisoned". So, when we live
> > migrate the VM, we need to preserve the poisoned status of these pages.
> >
> > When live migrating, we try to get the guest running on its new host as
> > quickly as possible. So, we start it running before all memory has been
> > copied, and before we're certain which pages should be poisoned or not.
> >
> > So the basic way to use this new feature is:
> >
> > - On the new host, the guest's memory is registered with userfaultfd, in
> >   either MISSING or MINOR mode (doesn't really matter for this purpose).
> > - On any first access, we get a userfaultfd event. At this point we can
> >   communicate with the old host to find out if the page was poisoned.
> > - If so, we can respond with a UFFDIO_POISON - this places a swap marker
> >   so any future accesses will SIGBUS.
> >   Because the pte is now "present",
> >   future accesses won't generate more userfaultfd events, they'll just
> >   SIGBUS directly.
> >
> > UFFDIO_POISON does not handle unmapping previously-present PTEs. This
> > isn't needed, because during live migration we want to intercept
> > all accesses with userfaultfd (not just writes, so WP mode isn't useful
> > for this). So whether minor or missing mode is being used (or both), the
> > PTE won't be present in any case, so handling that case isn't needed.
> >
> > Similarly, UFFDIO_POISON won't replace existing PTE markers. This might
> > be okay to do, but it seems to be safer to just refuse to overwrite any
> > existing entry (like a UFFD_WP PTE marker).
> >
> > Signed-off-by: Axel Rasmussen
>
> I agree the current behavior is not as clear, especially after hwpoison
> introduced.
>
> uffdio-copy is special right now that it can overwrite a marker, so a buggy
> userapp can also overwrite a poisoned entry, but it also means the userapp
> is broken already, so may not really matter much.
>
> While zeropage wasn't doing that. I think that was just overlooked - i
> assume it has the same reasoning as uffdio-copy otherwise.. and no one just
> used zeropage over a wp marker yet, or just got it work-arounded by
> unprotect+zeropage.
>
> Not yet sure whether it'll make sense to unify this a bit, but making the
> new poison api to be strict look fine. If you have any thoughts after
> reading feel free to keep the discussion going, I can ack this one I think
> (besides my rename request in 1st patch):

Agreed, it would be nice to unify things. In my v2 I had anon/shmem
and hugetlbfs behaving differently in this respect, for the same
reason - it was just overlooked / cargo culted from existing code. If
nothing else I think a single ioctl should be consistent across memory
types! Heh.

But I also think you're right and it's not exactly intentional that
copy / zeropage / etc are different in this respect.
Some unification would be nice, although I'm not 100% sure what that
looks like concretely.

My rule of thumb is, in cases where we can't imagine a real use case,
it's better to be too strict rather than too loose. And in the future,
it's less disruptive to loosen restrictions rather than tighten them
(potentially breaking something which used to work).

I'll leave untangling this to some future series, though.

>
> Acked-by: Peter Xu
>
> --
> Peter Xu
>
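
For illustration only, here is a rough sketch of the "strict" rule and the per-fault decision from the quoted commit message, written as a toy model in plain C rather than as kernel code. Everything in it is an assumption made up for the example: the `slot_state` enum, `install_poison_strict()`, `handle_first_access()`, and the choice of `-EEXIST` as the refusal code. The real ioctl works on page tables and is not shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one page-table slot; NOT the kernel's representation. */
enum slot_state {
    SLOT_NONE,          /* nothing installed yet: a first access would fault */
    SLOT_MAPPED,        /* a real page is mapped */
    SLOT_WP_MARKER,     /* some other marker, e.g. a write-protect marker */
    SLOT_POISON_MARKER  /* simulated poison: accesses would SIGBUS */
};

/*
 * Strict policy: only an empty slot may receive a poison marker.
 * Returns 0 on success, or -EEXIST (assumed error code) if anything
 * is already installed, leaving the slot unchanged.
 */
static int install_poison_strict(enum slot_state *slot)
{
    assert(slot != NULL);
    if (*slot != SLOT_NONE)
        return -EEXIST;
    *slot = SLOT_POISON_MARKER;
    return 0;
}

/*
 * Hypothetical handler step on the new host. `was_poisoned` is the
 * answer obtained from the old host for the faulting page.
 */
static int handle_first_access(enum slot_state *slot, bool was_poisoned)
{
    assert(slot != NULL);
    if (was_poisoned)
        return install_poison_strict(slot);

    /* Not poisoned: stand-in for resolving the fault with real data. */
    *slot = SLOT_MAPPED;
    return 0;
}
```

In this model a second poison request on the same slot, or one aimed at a slot holding a different marker, is refused and leaves the slot untouched. Loosening that later would only mean relaxing the check in `install_poison_strict()`, which is the "strict first, loosen later" direction argued for above.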