Date: Fri, 19 Apr 2019 15:33:07 +0200
From: Martin Schwidefsky
To: Linus Torvalds
Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev@lists.ozlabs.org, linux-s390
Subject: Re: Linux 5.1-rc5
In-Reply-To: <20190418204144.16adf2a0@mschwideX1>
Message-Id: <20190419153307.4f2911b5@mschwideX1>

On Thu, 18 Apr 2019 20:41:44 +0200
Martin Schwidefsky wrote:

> On Thu, 18 Apr 2019 08:49:32 -0700
> Linus Torvalds wrote:
>
> > On Thu, Apr 18, 2019 at 1:02 AM Martin Schwidefsky
> > wrote:
> > >
> > > The problematic lines in the generic gup code are these three:
> > >
> > > 1845:   pmdp = pmd_offset(&pud, addr);
> > > 1888:   pudp = pud_offset(&p4d, addr);
> > > 1916:   p4dp = p4d_offset(&pgd, addr);
> > >
> > > Passing the pointer of a *copy* of a page table entry to pxd_offset() does
> > > not work with the page table folding on s390.
> >
> > Hmm. I wonder why. x86 too does the folding thing for the p4d and pud case.
> >
> > The folding works with the local copy just the same way it works with
> > the original value.
>
> The difference is that with the static page table folding pgd_offset()
> does the index calculation of the actual hardware top-level table. With
> dynamic page table folding as s390 is doing it, if the task does not use
> a 5-level page table pgd_offset() will see a pgd_index() of 0, the indexing
> of the actual top-level table is done later with p4d_offset(), pud_offset()
> or pmd_offset().
>
> As an example, with a three level page table we have three indexes x/y/z.
> The common code "thinks" 5 indexing steps, with static folding the index
> sequence is x 0 0 y z. With dynamic folding the sequence is 0 0 x y z.
> By moving the first indexing operation to pgd_offset the static sequence
> does not add an index to a non-dereferenced pointer to a stack variable,
> the dynamic sequence does.

That problem got stuck in my head and I thought more about it.
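For reference, the two index sequences from the quoted example can be
spelled out in a small user-space sketch. This is not kernel code: the
names (static_sequence, dynamic_sequence, the SKETCH_* macros) are made up
for illustration, and the shift values assume the usual s390 layout of
4 KB pages, 8 index bits for the page table and 11 index bits for each
higher-level table.

```c
#include <stdio.h>

/* Assumed s390 layout, named here only for this sketch. */
#define SKETCH_PAGE_SHIFT	12	/* 4 KB pages */
#define SKETCH_SEGMENT_SHIFT	20	/* segment table index starts here */
#define SKETCH_REGION3_SHIFT	31	/* region-third table index starts here */

/* The five indexing steps the common code walks through. */
enum { PGD, P4D, PUD, PMD, PTE, NR_STEPS };

/* Hardware indexes x/y/z of a three-level table for one address. */
void hw_indexes(unsigned long long addr, unsigned long long *x,
		unsigned long long *y, unsigned long long *z)
{
	*x = (addr >> SKETCH_REGION3_SHIFT) & 0x7ff;
	*y = (addr >> SKETCH_SEGMENT_SHIFT) & 0x7ff;
	*z = (addr >> SKETCH_PAGE_SHIFT) & 0xff;
}

/* Static folding: the top-level index is applied at the pgd step. */
void static_sequence(unsigned long long addr, unsigned long long seq[NR_STEPS])
{
	unsigned long long x, y, z;

	hw_indexes(addr, &x, &y, &z);
	seq[PGD] = x;
	seq[P4D] = 0;
	seq[PUD] = 0;
	seq[PMD] = y;
	seq[PTE] = z;
}

/* Dynamic folding: the top-level index is applied at the pud step. */
void dynamic_sequence(unsigned long long addr, unsigned long long seq[NR_STEPS])
{
	unsigned long long x, y, z;

	hw_indexes(addr, &x, &y, &z);
	seq[PGD] = 0;
	seq[P4D] = 0;
	seq[PUD] = x;
	seq[PMD] = y;
	seq[PTE] = z;
}

/* Print one sequence in pgd/p4d/pud/pmd/pte order. */
void print_sequence(const char *name, const unsigned long long seq[NR_STEPS])
{
	printf("%s: %llu %llu %llu %llu %llu\n", name,
	       seq[PGD], seq[P4D], seq[PUD], seq[PMD], seq[PTE]);
}
```

With dynamic folding the non-zero top-level index shows up only at the
pud step, which is the step where the generic gup code hands in a pointer
to a stack copy of the entry.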
Why not emulate the static folding sequence in the s390 page table code?
As the table type is encoded in every entry for the region and segment
tables, pgd_offset() can look at the first entry to find the table type
and then do the correct index calculation for the given top-level table.
Like this:

static inline pgd_t *pgd_offset_raw(pgd_t *pgd, unsigned long address)
{
	unsigned long rste;
	unsigned int shift;

	/* Get the first entry of the top level table */
	rste = pgd_val(*pgd);
	/* Pick up the shift from the table type of the first entry */
	shift = ((rste & _REGION_ENTRY_TYPE_MASK) >> 2) * 11 + 20;
	return pgd + ((address >> shift) & (PTRS_PER_PGD - 1));
}

#define pgd_offset(mm, address) pgd_offset_raw((mm)->pgd, address)
#define pgd_offset_k(address) pgd_offset(&init_mm, address)

static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
{
	if ((pgd_val(*pgd) & _REGION_ENTRY_TYPE_MASK) != _REGION_ENTRY_TYPE_R1)
		return (p4d_t *) pgd;
	return (p4d_t *) pgd_deref(*pgd) + p4d_index(address);
}

static inline pud_t *pud_offset(p4d_t *p4d, unsigned long address)
{
	if ((p4d_val(*p4d) & _REGION_ENTRY_TYPE_MASK) != _REGION_ENTRY_TYPE_R2)
		return (pud_t *) p4d;
	return (pud_t *) p4d_deref(*p4d) + pud_index(address);
}

static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
	if ((pud_val(*pud) & _REGION_ENTRY_TYPE_MASK) != _REGION_ENTRY_TYPE_R3)
		return (pmd_t *) pud;
	return (pmd_t *) pud_deref(*pud) + pmd_index(address);
}

This needs more thorough testing but in principle it does work. The kernel
boots and survives a kernel compile. The only thing that is slightly off
is that pgd_offset() now has to look at the first table entry to do its job.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.