Date: Wed, 1 Aug 2018 11:55:14 +1000
From: Nicholas Piggin
To: Laurent Dufour
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
 aneesh.kumar@linux.ibm.com, mpe@ellerman.id.au,
 benh@kernel.crashing.org, paulus@samba.org
Subject: Re: [resend] [PATCH 0/3] powerpc/pseries: use H_BLOCK_REMOVE
Message-ID: <20180801115514.441eecc8@roar.ozlabs.ibm.com>
In-Reply-To: <1532699493-10883-1-git-send-email-ldufour@linux.vnet.ibm.com>
References: <1532699493-10883-1-git-send-email-ldufour@linux.vnet.ibm.com>

On Fri, 27 Jul 2018 15:51:30 +0200
Laurent Dufour wrote:

> [Resending so everyone is getting the cover letter]
>
> On very large system we could see soft lockup fired when a process is exiting
>
> watchdog: BUG: soft lockup - CPU#851 stuck for 21s! [forkoff:215523]
> Modules linked in: pseries_rng rng_core xfs raid10 vmx_crypto btrfs libcrc32c xor zstd_decompress zstd_compress xxhash lzo_compress raid6_pq crc32c_vpmsum lpfc crc_t10dif crct10dif_generic crct10dif_common dm_multipath scsi_dh_rdac scsi_dh_alua autofs4
> CPU: 851 PID: 215523 Comm: forkoff Not tainted 4.17.0 #1
> NIP: c0000000000b995c LR: c0000000000b8f64 CTR: 000000000000aa18
> REGS: c00006b0645b7610 TRAP: 0901 Not tainted (4.17.0)
> MSR: 800000010280b033 CR: 22042082 XER: 00000000
> CFAR: 00000000006cf8f0 SOFTE: 0
> GPR00: 0010000000000000 c00006b0645b7890 c000000000f99200 0000000000000000
> GPR04: 8e000001a5a4de58 400249cf1bfd5480 8e000001a5a4de50 400249cf1bfd5480
> GPR08: 8e000001a5a4de48 400249cf1bfd5480 8e000001a5a4de40 400249cf1bfd5480
> GPR12: ffffffffffffffff c00000001e690800
> NIP [c0000000000b995c] plpar_hcall9+0x44/0x7c
> LR [c0000000000b8f64] pSeries_lpar_flush_hash_range+0x324/0x3d0
> Call Trace:
> [c00006b0645b7890] [8e000001a5a4dd20] 0x8e000001a5a4dd20 (unreliable)
> [c00006b0645b7a00] [c00000000006d5b0] flush_hash_range+0x60/0x110
> [c00006b0645b7a50] [c000000000072a2c] __flush_tlb_pending+0x4c/0xd0
> [c00006b0645b7a80] [c0000000002eaf44] unmap_page_range+0x984/0xbd0
> [c00006b0645b7bc0] [c0000000002eb594] unmap_vmas+0x84/0x100
> [c00006b0645b7c10] [c0000000002f8afc] exit_mmap+0xac/0x1f0
> [c00006b0645b7cd0] [c0000000000f2638] mmput+0x98/0x1b0
> [c00006b0645b7d00] [c0000000000fc9d0] do_exit+0x330/0xc00
> [c00006b0645b7dc0] [c0000000000fd384] do_group_exit+0x64/0x100
> [c00006b0645b7e00] [c0000000000fd44c] sys_exit_group+0x2c/0x30
> [c00006b0645b7e30] [c00000000000b960] system_call+0x58/0x6c
> Instruction dump:
> 60000000 f8810028 7ca42b78 7cc53378 7ce63b78 7d074378 7d284b78 7d495378
> e9410060 e9610068 e9810070 44000022 <7d806378> e9810028 f88c0000 f8ac0008
>
> This happens when removing the PTE by calling the hypervisor using the
> H_BULK_REMOVE call. This call is processing up to 4 PTEs but is doing a
> tlbie for each PTE it is processing. This could lead to long time spent in
> the hypervisor (sometimes up to 4s) and soft lockup being raised because
> the scheduler is not called in zap_pte_range().
>
> Since the Power7's time, the hypervisor is providing a new hcall
> H_BLOCK_REMOVE allowing processing up to 8 PTEs with one call to
> tlbie. By limiting the amount of tlbie generated, this reduces the time
> spent invalidating the PTEs.

Oh that's a nice feature. I must have an ancient PAPR because I don't
have it. It could be a good project for someone to implement it in KVM
too.

>
> This hcall requires that the pages are "all within the same naturally
> aligned 8 page virtual address block".
>
> With this patch series applied, I couldn't see any soft lockup raised on
> the victim LPAR I was running the test one.
>
> This series is covering both normal pages and huge pages.

Really nice, thanks for working on the problem.

Thanks,
Nick