References: <20191120034451.30102-1-Zhiqiang.Hou@nxp.com>
 <20200110153347.GA29372@e121166-lin.cambridge.arm.com>
 <20200210152257.GD25745@shell.armlinux.org.uk>
 <20200210161553.GE25745@shell.armlinux.org.uk>
In-Reply-To: <20200210161553.GE25745@shell.armlinux.org.uk>
From: Olof Johansson
Date: Mon, 10 Feb 2020 19:33:19 +0100
Subject: Re: [PATCHv9 00/12] PCI: Recode Mobiveil driver and add PCIe Gen4 driver for NXP Layerscape SoCs
To: Russell King - ARM Linux admin
Cc: "mark.rutland@arm.com", "devicetree@vger.kernel.org",
 Lorenzo Pieralisi, "arnd@arndb.de", "m.karthikeyan@mobiveil.co.in",
 "linux-pci@vger.kernel.org", "Z.q. Hou", "l.subrahmanya@mobiveil.co.in",
 "will.deacon@arm.com", "linux-kernel@vger.kernel.org", Leo Li,
 "M.h. Lian", "robh+dt@kernel.org", Xiaowei Bao,
 "catalin.marinas@arm.com", "bhelgaas@google.com",
 "andrew.murray@arm.com", "shawnguo@kernel.org", Mingkai Hu,
 "linux-arm-kernel@lists.infradead.org",
 honeycomb-users@lists.infradead.org
X-Mailing-List: linux-kernel@vger.kernel.org

[cc:ing honeycomb-users, didn't think of that earlier]

On Mon, Feb 10, 2020 at 5:16 PM Russell King - ARM Linux admin wrote:
>
> On Mon, Feb 10, 2020 at 04:28:23PM +0100, Olof Johansson wrote:
> > On Mon, Feb 10, 2020 at 4:23 PM Russell King - ARM Linux admin wrote:
> > >
> > > On Mon, Feb 10, 2020 at 04:12:30PM +0100, Olof Johansson wrote:
> > > > On Thu, Feb 6, 2020 at 11:57 AM Z.q. Hou wrote:
> > > > >
> > > > > Hi Olof,
> > > > >
> > > > > Thanks a lot for your comments!
> > > > > And sorry for my delayed response!
> > > >
> > > > Actually, they apply with only minor conflicts on top of current -next.
> > > >
> > > > Bjorn, any chance we can get you to pick these up pretty soon? They
> > > > enable full use of a promising ARM developer system, the SolidRun
> > > > HoneyComb, and would be quite valuable for me and others to be able to
> > > > use with mainline or -next without any additional patches applied --
> > > > which this patchset achieves.
> > > >
> > > > I know there are pending revisions based on feedback. I'll leave it up
> > > > to you and others to determine if that can be done with incremental
> > > > patches on top, or if it should be fixed before the initial patchset
> > > > is applied. But all in all, it's holding up adoption by me and surely
> > > > others of a very interesting platform -- I'm looking to replace my
> > > > aging MacchiatoBin with one of these and would need PCIe/NVMe to work
> > > > before I do.
> > >
> > > If you're going to be using NVMe, make sure you use a power-fail safe
> > > version; I've already had one instance where ext4 failed to mount
> > > because of a corrupted journal using an XPG SX8200 after the Honeycomb
> > > SError'd, and then I powered it down after a few hours before later
> > > booting it back up.
> > >
> > > EXT4-fs (nvme0n1p2): INFO: recovery required on readonly filesystem
> > > EXT4-fs (nvme0n1p2): write access will be enabled during recovery
> > > JBD2: journal transaction 80849 on nvme0n1p2-8 is corrupt.
> > > EXT4-fs (nvme0n1p2): error loading journal
> >
> > Hmm, using btrfs on mine, not sure if the exposure is similar or not.
>
> As I understand the problem, it isn't a filesystem issue. It's a data
> integrity issue with the NVMe over power fail, how they cache the data,
> and ultimately write it to the NAND flash.
>
> Have a read of:
>
> https://www.kingston.com/en/solutions/servers-data-centers/ssd-power-loss-protection
>
> As NVMe and SSD are basically the same underlying technology (the host
> interface is different) and the issues I've heard, and now experienced
> with my NVMe, I think the above is a good pointer to the problems of
> flash mass storage.
>
> As I understand it, the problem occurs when the mapping table has not
> been written back to flash, power is lost without the Standby Immediate
> command being sent, and there is no way for the firmware to quickly
> save the table. On subsequent power up, the firmware has to
> reconstruct the mapping table, and depending on how that is done,
> incorrect (old?) data may be returned for some blocks.
>
> That can happen to any blocks on the drive, which means any data can
> be at risk from a power loss event, whether that is a power failure
> or after a crash.

Makes me wonder whether there's some board-level power/reset sequencing
issue, or a problem with one card going down and disabling others.
I haven't read the specs enough to know what's expected behavior but
I've seen similar issues on other platforms so take it with a grain of
salt.

> > Do you know if the SErr was due to a known issue and/or if it's
> > something that's fixed in production silicon?
>
> The SError is triggered by something on the PCIe side of things; if I
> leave the Mellanox PCIe card out, then I don't get them. The errata
> patches I have merged into my tree help a bit, turning the code from
> being unable to boot without a SError with the card plugged in, to
> being able to boot and last a while - but the SErrors still eventually
> come, maybe taking a few days... and that's without the Mellanox
> ethernet interface being up.
>
> > (I still can't enable SMMU since across a warm reboot it fails
> > *completely*, with nothing coming up and working. NXP folks, you
> > listening? :)
>
> Is it just a warm reboot? I thought I saw SMMU activity on a cold
> boot as well, implying that there were devices active that Linux
> did not know about.

Yeah, 100% reproducible on warm reboot -- every single time. Not on
cold boot though (100% success rate as far as I remember).

I boot with kernel on NVMe on PCIe, native 1GbE for networking. u-boot
from SD card. This is with the SolidRun u-boot from GitHub.


-Olof
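
For anyone following the thread, the mapping-table failure mode quoted
above can be illustrated with a toy model. This is only a sketch under
loud assumptions: ToyFTL and all of its methods are hypothetical names,
and the "reload the last saved map" recovery is just one naive way a
firmware might behave. It does not describe any real drive or the
XPG SX8200 in particular.

```python
class ToyFTL:
    """Toy flash translation layer; hypothetical, for illustration only."""

    def __init__(self):
        self.flash_pages = []   # append-only physical pages: (lba, data)
        self.ram_map = {}       # LBA -> physical page index, held in volatile RAM
        self.saved_map = {}     # last copy of the map written back to flash

    def write(self, lba, data):
        # New data always lands in a fresh page; only the RAM map is updated.
        self.flash_pages.append((lba, data))
        self.ram_map[lba] = len(self.flash_pages) - 1

    def flush_map(self):
        # What a clean shutdown gives the firmware time to do.
        self.saved_map = dict(self.ram_map)

    def power_loss(self):
        # Assumption: RAM is lost and a naive firmware reloads the stale map
        # instead of rebuilding it from the pages themselves.
        self.ram_map = dict(self.saved_map)

    def read(self, lba):
        idx = self.ram_map.get(lba)
        return None if idx is None else self.flash_pages[idx][1]


def demo():
    ftl = ToyFTL()
    ftl.write(7, "journal v1")
    ftl.flush_map()               # map persisted while block 7 held v1
    ftl.write(7, "journal v2")    # newer data, map change only in RAM
    before = ftl.read(7)
    ftl.power_loss()              # no clean shutdown, so no flush_map()
    after = ftl.read(7)
    return before, after


if __name__ == "__main__":
    print(demo())
```

In this model the block reads back as its older contents after the power
loss, which is the kind of stale data that would make a journal
transaction fail its consistency check regardless of the filesystem on
top.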