From: Peter Xu
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: David Hildenbrand, "Dr. David Alan Gilbert", peterx@redhat.com,
    John Hubbard, Sean Christopherson, Linux MM Mailing List,
    Andrew Morton, Paolo Bonzini, Andrea Arcangeli
Subject: [PATCH v2 0/3] kvm/mm: Allow GUP to respond to non fatal signals
Date: Wed, 20 Jul 2022 20:03:15 -0400
Message-Id: <20220721000318.93522-1-peterx@redhat.com>

v2:
- Added r-b
- Rewrote the comment in faultin_page() for FOLL_INTERRUPTIBLE [John]
- Dropped the controversial patch that introduced a flag for
  __gfn_to_pfn_memslot(); use a boolean for now instead [Sean]
- Renamed s/is_sigpending_pfn/KVM_PFN_ERR_SIGPENDING/ [Sean]
- Changed the comment in kvm_faultin_pfn() mentioning fatal signals [Sean]

rfc: https://lore.kernel.org/kvm/20220617014147.7299-1-peterx@redhat.com
v1:  https://lore.kernel.org/kvm/20220622213656.81546-1-peterx@redhat.com

An issue was reported that libvirt cannot stop the virtual machine with
the QMP command "stop" during a paused postcopy migration [1].

It does not work because the "stop the VM" operation requires the
hypervisor to kick all the vcpu threads out using SIG_IPI in QEMU (which
is translated to SIGUSR1). However, during a paused postcopy the vcpu
threads are hung in handle_userfault(), so they simply do not respond to
the kicks. Worse, the "stop" command then hangs the QMP channel as well.

The mm has a facility to process generic signals
(FAULT_FLAG_INTERRUPTIBLE); however, it is used only in the page fault
handlers, not in GUP.
Unluckily, KVM is a heavy GUP user on guest page faults. This means that
with what we have right now, we cannot interrupt a long page fault while
KVM is fetching guest pages.

I think it is reasonable for GUP to listen only to fatal signals, as most
GUP users are not really ready to handle such a case. KVM, however, is not
such a user: it has rich infrastructure to handle even generic signals and
to deliver the signal properly to userspace. The page fault can then be
retried in the next KVM_RUN.

This patchset adds FOLL_INTERRUPTIBLE to enable FAULT_FLAG_INTERRUPTIBLE,
and lets KVM be the first user. KVM and mm/gup have always been able to
respond to fatal signals, but not to non-fatal ones until this patchset.

One thing to mention is that this does not allow all KVM paths to respond
to non-fatal signals, only x86 slow page faults. In the future, when more
code is ready to handle signal interruptions, we can explore having more
GUP callers use FOLL_INTERRUPTIBLE.

Tests
=====

I created a postcopy environment and paused the migration by shutting down
the network to emulate a network failure (so that handle_userfault() is
stuck for a long time), then tried three things:

  (1) Sending the QMP command "stop" to the QEMU monitor,
  (2) Hitting Ctrl-C from the QEMU cmdline,
  (3) Attaching GDB to the destination QEMU process.

Before this patchset, all three use cases hang. After the patchset, all
of them work just as they do when there is no network failure at all.

Please have a look, thanks.
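
For readers who want the intended behaviour spelled out, here is a minimal
userspace sketch of the decision described above: fatal signals always
interrupt a waiting fault, while non-fatal signals interrupt it only when
the caller opted in. It is a model only, not the code in the patches.
Every name in it (MODEL_FOLL_INTERRUPTIBLE, model_signal,
model_should_abort_wait) is a hypothetical stand-in and not a kernel
symbol.

```c
#include <stdbool.h>

/*
 * All names below are hypothetical stand-ins for illustration;
 * none of them are kernel symbols.
 */
#define MODEL_FOLL_INTERRUPTIBLE 0x1u

enum model_signal { MODEL_SIG_NONE, MODEL_SIG_NONFATAL, MODEL_SIG_FATAL };

/*
 * Returns true when a fault that is still waiting for its page should
 * stop waiting and return to its caller.
 */
static bool model_should_abort_wait(unsigned int gup_flags,
                                    enum model_signal pending)
{
    /* Fatal signals always interrupt, with or without the flag. */
    if (pending == MODEL_SIG_FATAL)
        return true;

    /* Non-fatal signals interrupt only callers that opted in. */
    if (pending == MODEL_SIG_NONFATAL)
        return (gup_flags & MODEL_FOLL_INTERRUPTIBLE) != 0;

    /* No signal pending: keep waiting for the page. */
    return false;
}
```

With the flag clear, a non-fatal kick such as SIGUSR1 leaves the waiter
blocked, which matches the hang described above. With the flag set, the
waiter returns, so the caller can hand the signal to userspace and retry
the fault later.
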
[1] https://gitlab.com/qemu-project/qemu/-/issues/1052

Peter Xu (3):
  mm/gup: Add FOLL_INTERRUPTIBLE
  kvm: Add new pfn error KVM_PFN_ERR_SIGPENDING
  kvm/x86: Allow to respond to generic signals during slow page faults

 arch/arm64/kvm/mmu.c                   |  2 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c    |  2 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c |  2 +-
 arch/x86/kvm/mmu/mmu.c                 | 16 +++++++++++--
 include/linux/kvm_host.h               | 15 ++++++++++--
 include/linux/mm.h                     |  1 +
 mm/gup.c                               | 33 ++++++++++++++++++++++----
 virt/kvm/kvm_main.c                    | 30 ++++++++++++++---------
 virt/kvm/kvm_mm.h                      |  4 ++--
 virt/kvm/pfncache.c                    |  2 +-
 10 files changed, 82 insertions(+), 25 deletions(-)

-- 
2.32.0