Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp1960018ybv; Fri, 14 Feb 2020 08:56:27 -0800 (PST) X-Google-Smtp-Source: APXvYqzoCkGYAdYEEVhTdyYiaQkmEn9pDOwQAe/WHopDXSlxfinEsRIET91K1UIg+HP62rdOXWTV X-Received: by 2002:aca:190b:: with SMTP id l11mr2592612oii.65.1581699387275; Fri, 14 Feb 2020 08:56:27 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1581699387; cv=none; d=google.com; s=arc-20160816; b=BdBib+mx4Qp4MgEEUZFjQaiW2THDaLWmVeGerrrKqbvF6wYX4N3hBiwj9PLJF5tJ2o iA5kVu2lcReMH3lYi53F15Yp5GTCOm4VDQHx6nRWYDuYq6BMWs3Lucj1GMxCkpqFznmt cgy5KniVvhBNjEEo6taiWruw264DQcq1zZOoSXrkS7i0NhspRgAAI14l9R/mMiMAiL03 cSuSPO/4vWqSK3poJ+EtyEyEwAs6hDW9bx3+XXSXMT6HHcHmgeqdUfIzzvge2Igvj5Pz LmtxJ/q/3jfHzDYSUlfphxjEqsruvT9Ml9DHHzfk67UTBZS5tXUFsusPebwFWjRT1S7f F2Vw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:message-id:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date:from :references:cc:to:subject; bh=BUSTYaTS9ABICGubVUjamq4CkwI3fABUWNR1a84RcUw=; b=UvXf3m0rCEQSppQM1gEQzKuI88kQwmWygJudEEA8PuVYkgyosavieAgXiPtPS6XekA qqjxI2hhfpLm7ZbXpYjIhwab6m8GnnCOcXJUaNzFLIXQ+GEkRkSM0E3D8Yl3KcnbDpYh GH/u7CMhyh6+4ztECRNRbr25hUlF1Jtshwj9xb6JLtziLMcKmMqPoAgEw67sa1TB1hNX brxug/bMf/T7RDG9KZrz0LLhIYuUH54ZEgJTP9ug9COIBthgUuVz/fiW/ARly8q81/Db 0ZGyeXCsX9lWfEpuZY3TfSkfiFNN1m2/Z5cpxBruTCktLvMlX3NyCk69s0HF1HMcNS+f PyuQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=ibm.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id r14si2942037oic.12.2020.02.14.08.56.15; Fri, 14 Feb 2020 08:56:27 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404681AbgBNQzj (ORCPT + 99 others); Fri, 14 Feb 2020 11:55:39 -0500 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:30174 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2394268AbgBNQzh (ORCPT ); Fri, 14 Feb 2020 11:55:37 -0500 Received: from pps.filterd (m0098393.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 01EGnPed052733 for ; Fri, 14 Feb 2020 11:55:37 -0500 Received: from e06smtp07.uk.ibm.com (e06smtp07.uk.ibm.com [195.75.94.103]) by mx0a-001b2d01.pphosted.com with ESMTP id 2y5x3hkc7x-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Fri, 14 Feb 2020 11:55:37 -0500 Received: from localhost by e06smtp07.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 14 Feb 2020 16:55:34 -0000 Received: from b06cxnps4074.portsmouth.uk.ibm.com (9.149.109.196) by e06smtp07.uk.ibm.com (192.168.101.137) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Fri, 14 Feb 2020 16:55:32 -0000 Received: from d06av25.portsmouth.uk.ibm.com (d06av25.portsmouth.uk.ibm.com [9.149.105.61]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 01EGtV7W27656442 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 14 Feb 2020 16:55:31 GMT Received: from d06av25.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 52E7711C05C; Fri, 14 Feb 2020 16:55:31 +0000 (GMT) Received: from d06av25.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C111711C04A; Fri, 14 Feb 2020 16:55:29 +0000 (GMT) Received: from [9.85.93.41] (unknown [9.85.93.41]) by d06av25.portsmouth.uk.ibm.com (Postfix) with ESMTP; Fri, 14 Feb 2020 16:55:29 +0000 (GMT) Subject: Re: [PATCH v2 2/4] libnvdimm/namespace: Enforce memremap_compat_align() To: Jeff Moyer , Dan Williams Cc: linux-nvdimm , Vishal L Verma , Linux Kernel Mailing List , linuxppc-dev References: <158155489850.3343782.2687127373754434980.stgit@dwillia2-desk3.amr.corp.intel.com> <158155490897.3343782.14216276134794923581.stgit@dwillia2-desk3.amr.corp.intel.com> From: "Aneesh Kumar K.V" Date: Fri, 14 Feb 2020 22:25:28 +0530 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 x-cbid: 20021416-0028-0000-0000-000003DB0464 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 20021416-0029-0000-0000-000024A0047E Message-Id: <0843d8bf-c9e4-37c9-d9c2-ba4407daae21@linux.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138,18.0.572 definitions=2020-02-14_05:2020-02-12,2020-02-14 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 phishscore=0 malwarescore=0 clxscore=1015 adultscore=0 mlxlogscore=999 priorityscore=1501 spamscore=0 bulkscore=0 mlxscore=0 impostorscore=0 suspectscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2001150001 definitions=main-2002140127 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2/14/20 10:14 PM, Jeff Moyer wrote: > Dan Williams writes: > >> On Thu, Feb 13, 2020 at 1:55 PM Jeff Moyer wrote: >>> >>> Dan Williams writes: >>> >>>> The pmem driver on PowerPC crashes with the following signature when >>>> instantiating misaligned namespaces that map their capacity via >>>> memremap_pages(). >>>> >>>> BUG: Unable to handle kernel data access at 0xc001000406000000 >>>> Faulting instruction address: 0xc000000000090790 >>>> NIP [c000000000090790] arch_add_memory+0xc0/0x130 >>>> LR [c000000000090744] arch_add_memory+0x74/0x130 >>>> Call Trace: >>>> arch_add_memory+0x74/0x130 (unreliable) >>>> memremap_pages+0x74c/0xa30 >>>> devm_memremap_pages+0x3c/0xa0 >>>> pmem_attach_disk+0x188/0x770 >>>> nvdimm_bus_probe+0xd8/0x470 >>>> >>>> With the assumption that only memremap_pages() has alignment >>>> constraints, enforce memremap_compat_align() for >>>> pmem_should_map_pages(), nd_pfn, or nd_dax cases. >>>> >>>> Reported-by: Aneesh Kumar K.V >>>> Cc: Jeff Moyer >>>> Reviewed-by: Aneesh Kumar K.V >>>> Link: https://lore.kernel.org/r/158041477336.3889308.4581652885008605170.stgit@dwillia2-desk3.amr.corp.intel.com >>>> Signed-off-by: Dan Williams >>>> --- >>>> drivers/nvdimm/namespace_devs.c | 10 ++++++++++ >>>> 1 file changed, 10 insertions(+) >>>> >>>> diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c >>>> index 032dc61725ff..aff1f32fdb4f 100644 >>>> --- a/drivers/nvdimm/namespace_devs.c >>>> +++ b/drivers/nvdimm/namespace_devs.c >>>> @@ -1739,6 +1739,16 @@ struct nd_namespace_common *nvdimm_namespace_common_probe(struct device *dev) >>>> return ERR_PTR(-ENODEV); >>>> } >>>> >>>> + if (pmem_should_map_pages(dev) || nd_pfn || nd_dax) { >>>> + struct nd_namespace_io *nsio = to_nd_namespace_io(&ndns->dev); >>>> + resource_size_t start = nsio->res.start; >>>> + >>>> + if (!IS_ALIGNED(start | size, memremap_compat_align())) { >>>> + dev_dbg(&ndns->dev, "misaligned, unable to map\n"); >>>> + return ERR_PTR(-EOPNOTSUPP); >>>> + } >>>> + } >>>> + >>>> if (is_namespace_pmem(&ndns->dev)) { >>>> struct nd_namespace_pmem *nspm; >>>> >>> >>> Actually, I take back my ack. :) This prevents a previously working >>> namespace from being successfully probed/setup. >> >> Do you have a test case handy? I can see a potential gap with a >> namespace that used internal padding to fix up the alignment. > > # ndctl list -v -n namespace0.0 > [ > { > "dev":"namespace0.0", > "mode":"fsdax", > "map":"dev", > "size":52846133248, > "uuid":"b99f6f6a-2909-4189-9bfa-6eeebd95d40e", > "raw_uuid":"aff43777-015b-493f-bbf9-7c7b0fe33519", > "sector_size":512, > "align":4096, > "blockdev":"pmem0", > "numa_node":0 > } > ] > > # cat /sys/bus/nd/devices/region0/mappings > 6 > > # grep namespace0.0 /proc/iomem > 1860000000-24e0003fff : namespace0.0 > >> The goal of this check is to catch cases that are just going to fail >> devm_memremap_pages(), and the expectation is that it could not have >> worked before unless it was ported from another platform, or someone >> flipped the page-size switch on PowerPC. > > On x86, creation and probing of the namespace worked fine before this > patch. What *doesn't* work is creating another fsdax namespace after > this one. sector mode namespaces can still be created, though: > > [ > { > "dev":"namespace0.1", > "mode":"sector", > "size":53270768640, > "uuid":"67ea2c74-d4b1-4fc9-9c1a-a7d2a6c2a4a7", > "sector_size":512, > "blockdev":"pmem0.1s" > }, > > # grep namespace0.1 /proc/iomem > 24e0004000-3160007fff : namespace0.1 > >>> I thought we were only going to enforce the alignment for a newly >>> created namespace? This should only check whether the alignment >>> works for the current platform. >> >> The model is a new default 16MB alignment is enforced at creation >> time, but if you need to support previously created namespaces then >> you can manually trim that alignment requirement to no less than >> memremap_compat_align() because that's the point at which >> devm_memremap_pages() will start failing or crashing. > > The problem is that older kernels did not enforce alignment to > SUBSECTION_SIZE. We shouldn't prevent those namespaces from being > accessed. The probe itself will not cause the WARN_ON to trigger. > Creating new namespaces at misaligned addresses could, but you've > altered the free space allocation such that we won't hit that anymore. > > If I drop this patch, the probe will still work, and allocating new > namespaces will also work: > > # ndctl list > [ > { > "dev":"namespace0.1", > "mode":"sector", > "size":53270768640, > "uuid":"67ea2c74-d4b1-4fc9-9c1a-a7d2a6c2a4a7", > "sector_size":512, > "blockdev":"pmem0.1s" > }, > { > "dev":"namespace0.0", > "mode":"fsdax", > "map":"dev", > "size":52846133248, > "uuid":"b99f6f6a-2909-4189-9bfa-6eeebd95d40e", > "sector_size":512, > "align":4096, > "blockdev":"pmem0" > } > ] > ndctl create-namespace -m fsdax -s 36g -r 0 > { > "dev":"namespace0.2", > "mode":"fsdax", > "map":"dev", > "size":"35.44 GiB (38.05 GB)", > "uuid":"7893264c-c7ef-4cbe-95e1-ccf2aff041fb", > "sector_size":512, > "align":2097152, > "blockdev":"pmem0.2" > } > > proc/iomem: > > 1860000000-d55fffffff : Persistent Memory > 1860000000-24e0003fff : namespace0.0 > 24e0004000-3160007fff : namespace0.1 > 3162000000-3a61ffffff : namespace0.2 > > So, maybe the right thing is to make memremap_compat_align return > PAGE_SIZE for x86 instead of SUBSECTION_SIZE? > I did that as part of https://lore.kernel.org/linux-nvdimm/20200120140749.69549-2-aneesh.kumar@linux.ibm.com and applied the subsection details only when creating new namespace https://lore.kernel.org/linux-nvdimm/20200120140749.69549-4-aneesh.kumar@linux.ibm.com But I do agree with the approach that in-order to create a compatible namespace we need enforce max possible align value across all supported architectures. On POWER we should still be able to enforce SUBSECTION_SIZE restrictions. We did put that as document w.r.t. distributions like Suse https://www.suse.com/support/kb/doc/?id=7024300 -aneesh