From: Peter Xu
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: David Hildenbrand, "Dr .
 David Alan Gilbert", peterx@redhat.com, John Hubbard, Sean Christopherson, Linux MM Mailing List, Andrew Morton, Paolo Bonzini, Andrea Arcangeli
Subject: [PATCH v2 1/3] mm/gup: Add FOLL_INTERRUPTIBLE
Date: Wed, 20 Jul 2022 20:03:16 -0400
Message-Id: <20220721000318.93522-2-peterx@redhat.com>
In-Reply-To: <20220721000318.93522-1-peterx@redhat.com>
References: <20220721000318.93522-1-peterx@redhat.com>

We have had FAULT_FLAG_INTERRUPTIBLE but it was never applied to GUPs. One
issue with it is that not all GUP paths are able to handle signal delivery
for signals other than SIGKILL.

That's not ideal for the GUP users who are actually able to handle these
cases, like KVM.

KVM uses GUP extensively on faulting guest pages, during which we've got
existing infrastructure to retry a page fault at a later time. Allowing
the GUP to be interrupted by generic signals can make KVM-related threads
more responsive. For example:

  (1) SIGUSR1: which QEMU/KVM uses to deliver an inter-process IPI,
      e.g. when the admin issues a vm_stop QMP command, SIGUSR1 can be
      generated to kick the vcpus out of kernel context immediately,

  (2) SIGINT: which can be used by interactive hypervisor users to stop a
      virtual machine with Ctrl-C without any delays/hangs,

  (3) SIGTRAP: which grants GDB capability even during page faults that
      are stuck for a long time.

Normally the hypervisor will be able to receive these signals properly, but
not if we're stuck in a GUP for a long time for whatever reason.
It happens easily with a stuck postcopy migration when e.g. a temporary
network failure happens; some vcpu threads can then hang to death waiting
for the pages. With the new FOLL_INTERRUPTIBLE, we can allow GUP users like
KVM to selectively enable the ability to trap these signals.

Reviewed-by: John Hubbard
Signed-off-by: Peter Xu
---
 include/linux/mm.h |  1 +
 mm/gup.c           | 33 +++++++++++++++++++++++++++++----
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index cf3d0d673f6b..c09eccd5d553 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2941,6 +2941,7 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 #define FOLL_SPLIT_PMD 0x20000 /* split huge pmd before returning */
 #define FOLL_PIN 0x40000 /* pages must be released via unpin_user_page */
 #define FOLL_FAST_ONLY 0x80000 /* gup_fast: prevent fall-back to slow gup */
+#define FOLL_INTERRUPTIBLE 0x100000 /* allow interrupts from generic signals */
 
 /*
  * FOLL_PIN and FOLL_LONGTERM may be used in various combinations with each
diff --git a/mm/gup.c b/mm/gup.c
index 551264407624..f39cbe011cf1 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -933,8 +933,17 @@ static int faultin_page(struct vm_area_struct *vma,
 		fault_flags |= FAULT_FLAG_WRITE;
 	if (*flags & FOLL_REMOTE)
 		fault_flags |= FAULT_FLAG_REMOTE;
-	if (locked)
+	if (locked) {
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
+		/*
+		 * FAULT_FLAG_INTERRUPTIBLE is opt-in. GUP callers must set
+		 * FOLL_INTERRUPTIBLE to enable FAULT_FLAG_INTERRUPTIBLE.
+		 * That's because some callers may not be prepared to
+		 * handle early exits caused by non-fatal signals.
+		 */
+		if (*flags & FOLL_INTERRUPTIBLE)
+			fault_flags |= FAULT_FLAG_INTERRUPTIBLE;
+	}
 	if (*flags & FOLL_NOWAIT)
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT;
 	if (*flags & FOLL_TRIED) {
@@ -1322,6 +1331,22 @@ int fixup_user_fault(struct mm_struct *mm,
 }
 EXPORT_SYMBOL_GPL(fixup_user_fault);
 
+/*
+ * GUP always responds to fatal signals. When FOLL_INTERRUPTIBLE is
+ * specified, it'll also respond to generic signals. The caller of GUP
+ * that has FOLL_INTERRUPTIBLE should take care of the GUP interruption.
+ */
+static bool gup_signal_pending(unsigned int flags)
+{
+	if (fatal_signal_pending(current))
+		return true;
+
+	if (!(flags & FOLL_INTERRUPTIBLE))
+		return false;
+
+	return signal_pending(current);
+}
+
 /*
  * Please note that this function, unlike __get_user_pages will not
  * return 0 for nr_pages > 0 without FOLL_NOWAIT
@@ -1403,11 +1428,11 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
 		 * Repeat on the address that fired VM_FAULT_RETRY
 		 * with both FAULT_FLAG_ALLOW_RETRY and
 		 * FAULT_FLAG_TRIED. Note that GUP can be interrupted
-		 * by fatal signals, so we need to check it before we
+		 * by fatal signals or even common signals, depending on
+		 * the caller's request. So we need to check it before we
 		 * start trying again otherwise it can loop forever.
 		 */
-
-		if (fatal_signal_pending(current)) {
+		if (gup_signal_pending(flags)) {
 			if (!pages_done)
 				pages_done = -EINTR;
 			break;
-- 
2.32.0
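
For readers who want to see the decision order of gup_signal_pending() in isolation, here is a minimal userspace sketch. It is not kernel code and not part of the patch: the two boolean parameters are hypothetical stand-ins for fatal_signal_pending(current) and signal_pending(current), and the model_* function names are made up for illustration. Only the FOLL_INTERRUPTIBLE value and the -EINTR handling are taken from the diff above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Value copied from the patch; everything else is a userspace stand-in. */
#define FOLL_INTERRUPTIBLE 0x100000

/*
 * Hypothetical model of gup_signal_pending(). The booleans replace
 * fatal_signal_pending(current) and signal_pending(current).
 */
bool model_gup_signal_pending(bool fatal_pending, bool any_pending,
			      unsigned int flags)
{
	/* Fatal signals always interrupt, regardless of flags. */
	if (fatal_pending)
		return true;

	/* Without the opt-in flag, non-fatal signals are ignored. */
	if (!(flags & FOLL_INTERRUPTIBLE))
		return false;

	/* Opted in: any pending signal interrupts. */
	return any_pending;
}

/*
 * Hypothetical model of the check done before retrying a fault in
 * __get_user_pages_locked(): report -EINTR only if no pages were
 * obtained yet, otherwise keep the partial count.
 */
long model_check_before_retry(bool fatal_pending, bool any_pending,
			      unsigned int flags, long pages_done)
{
	if (model_gup_signal_pending(fatal_pending, any_pending, flags) &&
	    !pages_done)
		return -EINTR;
	return pages_done;
}

/* Small self-check covering the three cases the commit message describes. */
void model_self_check(void)
{
	/* Non-fatal signal (e.g. SIGUSR1), caller did not opt in. */
	assert(!model_gup_signal_pending(false, true, 0));
	/* Same signal, caller passed FOLL_INTERRUPTIBLE. */
	assert(model_gup_signal_pending(false, true, FOLL_INTERRUPTIBLE));
	/* Fatal signal interrupts even without the flag. */
	assert(model_gup_signal_pending(true, true, 0));
}
```

The point of the sketch is that existing callers see no behaviour change: with flags lacking FOLL_INTERRUPTIBLE, only the fatal case returns true, exactly as before the patch.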