From: "Aneesh Kumar K.V"
To: Erhard Furtner, "Matthew Wilcox (Oracle)"
Cc: Juergen Gross, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-sparc@vger.kernel.org, David Woodhouse
Subject: Re: [PATCH 0/2] Allow nesting of lazy MMU mode
In-Reply-To: <20231013154220.02fb2e6d@yea>
References: <20231012195415.282357-1-willy@infradead.org> <20231013154220.02fb2e6d@yea>
Date: Tue, 17 Oct 2023 11:34:23 +0530
Message-ID: <875y35zswo.fsf@linux.ibm.com>

Erhard Furtner writes:

> On Thu, 12 Oct 2023 20:54:13 +0100
> "Matthew Wilcox (Oracle)" wrote:
>
>> Dave Woodhouse reported that we now nest calls to
>> arch_enter_lazy_mmu_mode(). That was inadvertent, but in principle we
>> should allow it. On further investigation, Juergen already fixed it
>> for Xen, but didn't tell anyone. Fix it for Sparc & PowerPC too.
>> This may or may not help fix the problem that Erhard reported.
>>
>> Matthew Wilcox (Oracle) (2):
>>   powerpc: Allow nesting of lazy MMU mode
>>   sparc: Allow nesting of lazy MMU mode
>>
>>  arch/powerpc/include/asm/book3s/64/tlbflush-hash.h | 5 ++---
>>  arch/sparc/mm/tlb.c                                | 5 ++---
>>  2 files changed, 4 insertions(+), 6 deletions(-)
>>
>> --
>> 2.40.1
>
> Applied the patch on top of v6.6-rc5 but unfortunately it did not fix my reported issue.
>
> Regards,
> Erhard
>

With the problem reported, I guess we are finding page->compound_head
wrong, and hence the folio->flags PG_dcache_clean check is crashing. I
still don't know why we find page->compound_head wrong. Michael noted we
are using FLAT_MEM. That implies we are supposed to initialize struct
page correctly via init_unavailable_range(), because we are hitting this
on an ioremap address. We need to instrument the kernel to track the
initialization of the struct pages backing the pfns that we know are
crashing.

W.r.t. arch_enter_lazy_mmu_mode(), we can skip that completely on powerpc
because we don't allow the use of set_pte on valid pte entries. pte
updates are not done via the set_pte interface, and hence no TLB
invalidate is required while using set_pte(). I.e., we can do something
like the below. The change also makes sure we call set_pte_filter() on
all the ptes we are setting via set_ptes().

I haven't sent this as a proper patch because we still are not able to
fix the issue Erhard reported.

diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 3ba9fe411604..95ab20cca2da 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -191,28 +191,35 @@ void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
 		pte_t pte, unsigned int nr)
 {
 	/*
-	 * Make sure hardware valid bit is not set. We don't do
-	 * tlb flush for this update.
+	 * We don't need to call arch_enter/leave_lazy_mmu_mode()
+	 * because we expect set_ptes to be only be used on not present
+	 * and not hw_valid ptes. Hence there is not translation cache flush
+	 * involved that need to be batched.
 	 */
-	VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
+	for (;;) {
 
-	/* Note: mm->context.id might not yet have been assigned as
-	 * this context might not have been activated yet when this
-	 * is called.
-	 */
-	pte = set_pte_filter(pte);
+		/*
+		 * Make sure hardware valid bit is not set. We don't do
+		 * tlb flush for this update.
+		 */
+		VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
 
-	/* Perform the setting of the PTE */
-	arch_enter_lazy_mmu_mode();
-	for (;;) {
+		/* Note: mm->context.id might not yet have been assigned as
+		 * this context might not have been activated yet when this
+		 * is called.
+		 */
+		pte = set_pte_filter(pte);
+
+		/* Perform the setting of the PTE */
 		__set_pte_at(mm, addr, ptep, pte, 0);
 		if (--nr == 0)
 			break;
 		ptep++;
-		pte = __pte(pte_val(pte) + (1UL << PTE_RPN_SHIFT));
 		addr += PAGE_SIZE;
+		/* increment the pfn */
+		pte = __pte(pte_val(pte) + PAGE_SIZE);
+
+
 	}
-	arch_leave_lazy_mmu_mode();
 }
 
 void unmap_kernel_page(unsigned long va)
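
A note on the "increment the pfn" step in the diff above: adding PAGE_SIZE
to the raw pte value advances the pfn by one only if the pfn field starts
at bit PAGE_SHIFT, i.e. when PTE_RPN_SHIFT == PAGE_SHIFT. The following
is a minimal userspace sketch of that arithmetic, not kernel code. All
sketch_* names, the 12-bit page shift, and the toy pte layout are
assumptions made up for illustration; whether any powerpc configuration
actually has differing shifts is not something this thread states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for this sketch; real values are per-platform. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

typedef uint64_t sketch_pte_t;

/* Build a toy pte: pfn placed at bit rpn_shift, flags in the low bits. */
static sketch_pte_t sketch_mk_pte(uint64_t pfn, unsigned int rpn_shift,
				  uint64_t flags)
{
	return (pfn << rpn_shift) | flags;
}

/* Extract the pfn from a toy pte built with the same rpn_shift. */
static uint64_t sketch_pte_pfn(sketch_pte_t pte, unsigned int rpn_shift)
{
	return pte >> rpn_shift;
}

/*
 * Fill nr consecutive slots, adding 'step' to the raw pte value between
 * slots. The loop shape mirrors the for (;;) / --nr loop in the diff,
 * so nr must be at least 1.
 */
static void sketch_set_ptes(sketch_pte_t *ptep, sketch_pte_t pte,
			    unsigned int nr, uint64_t step)
{
	assert(nr > 0);
	for (;;) {
		*ptep = pte;
		if (--nr == 0)
			break;
		ptep++;
		pte += step;	/* intended to move to the next pfn */
	}
}
```

With a pfn shift of 12, a step of SKETCH_PAGE_SIZE moves each slot to the
next pfn and leaves the low flag bits alone. With a made-up pfn shift of
16, the same step lands below the pfn field and the pfn does not advance,
which is the case the original (1UL << PTE_RPN_SHIFT) step would handle.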