Date: Tue, 26 Feb 2019 21:06:48 +0800
From: Ming Lei
To: Faiz Abbas
Cc: Ming Lei, Naresh Kamboju, open list,
 "open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS",
 linux-mmc@vger.kernel.org, linux-omap@vger.kernel.org, Tero Kristo,
 nm@ti.com, Mark Rutland, Rob Herring, Omar Sandoval, Jens Axboe
Subject: Re: Linux-next 20190218: am57xx-evm: mmc1: ADMA error
Message-ID: <20190226130647.GA7150@ming.t460p>
References: <34c78d7f-cc3c-853d-b3fd-a70abdb66293@ti.com>
 <22ccca46-7390-1707-cfca-43b3577b79f2@ti.com>
 <236a3ac2-5988-9150-4a53-4a33a45d595e@ti.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Feb 26, 2019 at 05:04:40PM +0530, Faiz Abbas wrote:
> Hi,
>
> On 26/02/19 3:36 PM, Ming Lei wrote:
> > On Tue, Feb 26, 2019 at 2:47 PM Faiz Abbas wrote:
> >>
> >> Hi Ming Lei,
> >>
> >> On 26/02/19 7:11 AM, Ming Lei wrote:
> >>> On Mon, Feb 25, 2019 at 9:14 PM Faiz Abbas wrote:
> >>>>
> >>>> Hi Naresh,
> >>>>
> >>>> + Commit authors.
> >>>>
> >>>> On 19/02/19 6:38 PM, Faiz Abbas wrote:
> >>>>> Hi Naresh,
> >>>>>
> >>>>> On 18/02/19 6:57 PM, Naresh Kamboju wrote:
> >>>>>> Do you see this error on am57xx-evm running Linux next 20190218 ?
> >>>>>> I have tested on multiple devices and found this error.
> >>>>>> Please find the full boot log [1].
> >>>>>> Am i missing any pre required configs [2] ?
> >>>>>>
> >>>>>> [ 5.620263] mmc1: ADMA error
> >>>>>> [ 5.623266] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
> >>>>>> [ 5.629740] mmc1: sdhci: Sys addr: 0x00000000 | Version: 0x00003302
> >>>>>> [ 5.636215] mmc1: sdhci: Blk size: 0x00000200 | Blk cnt: 0x0000ffff
> >>>>>> [ 5.642690] mmc1: sdhci: Argument: 0x002cec70 | Trn mode: 0x00000033
> >>>>>> [ 5.649162] mmc1: sdhci: Present: 0x01f00000 | Host ctl: 0x00000010
> >>>>>> [ 5.655634] mmc1: sdhci: Power: 0x0000000f | Blk gap: 0x00000000
> >>>>>> [ 5.662108] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x00000107
> >>>>>> [ 5.668582] mmc1: sdhci: Timeout: 0x0000000c | Int stat: 0x00000000
> >>>>>> [ 5.675055] mmc1: sdhci: Int enab: 0x027f000b | Sig enab: 0x027f000b
> >>>>>> [ 5.681529] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
> >>>>>> [ 5.688002] mmc1: sdhci: Caps: 0x21e90080 | Caps_1: 0x00000f77
> >>>>>> [ 5.694474] mmc1: sdhci: Cmd: 0x0000123a | Max curr: 0x00000000
> >>>>>> [ 5.700949] mmc1: sdhci: Resp[0]: 0x00000900 | Resp[1]: 0xffffffef
> >>>>>> [ 5.707423] mmc1: sdhci: Resp[2]: 0x0f5903ff | Resp[3]: 0xd04f0132
> >>>>>> [ 5.713896] mmc1: sdhci: Host ctl2: 0x00000004
> >>>>>> [ 5.718364] mmc1: sdhci: ADMA Err: 0x00000007 | ADMA Ptr: 0xab868218
> >>>>>>
> >>>>>
> >>>>> I see this as well on my setup. Trying to bisect now. Will keep you posted.
> >>>>
> >>>>
> >>>> Reverting the following commit fixes this.
> >>>> commit 07173c3ec276cbb18dc0e0687d37d310e98a1480
> >>>> Author: Ming Lei
> >>>> Date: Fri Feb 15 19:13:20 2019 +0800
> >>>>
> >>>>     block: enable multipage bvecs
> >>>>
> >>>>     This patch pulls the trigger for multi-page bvecs.
> >>>>
> >>>>     Reviewed-by: Omar Sandoval
> >>>>     Signed-off-by: Ming Lei
> >>>>     Signed-off-by: Jens Axboe
> >>>
> >>> Hi,
> >>>
> >>> Thanks for your report & bisect.
> >>>
> >>> Could you test the following patch?
> >>>
> >>> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-5.1/block&id=8f4e80da764ec1ca44c83f3e17dbc9bf0209bccc
> >>>
> >>> Or simply run the latest -next?
> >>
> >> That didn't fix it for me. Still see ADMA error.
> >>
> >> [ 13.126186] mmc0: ADMA error
> >> [ 13.129084] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> >> [ 13.135552] mmc0: sdhci: Sys addr: 0x00000000 | Version: 0x00003302
> >> [ 13.142019] mmc0: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000000
> >> [ 13.148485] mmc0: sdhci: Argument: 0x00000089 | Trn mode: 0x00000033
> >> [ 13.154952] mmc0: sdhci: Present: 0x00000000 | Host ctl: 0x00000012
> >> [ 13.161418] mmc0: sdhci: Power: 0x0000000f | Blk gap: 0x00000000
> >> [ 13.167885] mmc0: sdhci: Wake-up: 0x00000000 | Clock: 0x00000107
> >> [ 13.174351] mmc0: sdhci: Timeout: 0x0000000a | Int stat: 0x00000000
> >> [ 13.180817] mmc0: sdhci: Int enab: 0x027f000b | Sig enab: 0x027f000b
> >> [ 13.187282] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
> >> [ 13.193748] mmc0: sdhci: Caps: 0x25e90080 | Caps_1: 0x00000f77
> >> [ 13.200215] mmc0: sdhci: Cmd: 0x0000123a | Max curr: 0x00000000
> >> [ 13.206682] mmc0: sdhci: Resp[0]: 0x00000900 | Resp[1]: 0x3b377f80
> >> [ 13.213148] mmc0: sdhci: Resp[2]: 0x5b590000 | Resp[3]: 0x400e0032
> >> [ 13.219613] mmc0: sdhci: Host ctl2: 0x00000000
> >> [ 13.224073] mmc0: sdhci: ADMA Err: 0x00000007 | ADMA Ptr: 0xae857288
> >> [ 13.230538] mmc0: sdhci: ============================================
> >
> > OK, I will write a debug patch to dump the sg data and see if it is
> > generated as wrong.
> >
> > BTW, which kind of failure can you find from the mmc dma error log?
> >
>
> It looks like it only happens for some requests. More verbose log with
> dma descriptor entries:
>
> [ 14.840865] mmc0: ADMA error
> [ 14.840869] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> [ 14.840874] mmc0: sdhci: Sys addr: 0x00000000 | Version: 0x00003302
> [ 14.840879] mmc0: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000000
> [ 14.840884] mmc0: sdhci: Argument: 0x00000200 | Trn mode: 0x00000033
> [ 14.840889] mmc0: sdhci: Present: 0x00000000 | Host ctl: 0x00000012
> [ 14.840893] mmc0: sdhci: Power: 0x0000000f | Blk gap: 0x00000000
> [ 14.840898] mmc0: sdhci: Wake-up: 0x00000000 | Clock: 0x00000107
> [ 14.840903] mmc0: sdhci: Timeout: 0x0000000a | Int stat: 0x00000000
> [ 14.840908] mmc0: sdhci: Int enab: 0x027f000b | Sig enab: 0x027f000b
> [ 14.840912] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
> [ 14.840917] mmc0: sdhci: Caps: 0x25e90080 | Caps_1: 0x00000f77
> [ 14.840922] mmc0: sdhci: Cmd: 0x0000123a | Max curr: 0x00000000
> [ 14.840926] mmc0: sdhci: Resp[0]: 0x00000900 | Resp[1]: 0x20050044
> [ 14.840931] mmc0: sdhci: Resp[2]: 0x53445531 | Resp[3]: 0x744a6055
> [ 14.840935] mmc0: sdhci: Host ctl2: 0x00000000
> [ 14.840939] mmc0: sdhci: ADMA Err: 0x00000007 | ADMA Ptr: 0xae857300
> [ 14.840943] mmc0: sdhci: ============================================
> [ 14.840950] mmc0: sdhci: be2c9004: DMA 0xab1bd000, LEN 0x1000, Attr=0x21
> [ 14.840956] mmc0: sdhci: 92173e21: DMA 0xab1bc000, LEN 0x1000, Attr=0x21
> [ 14.840962] mmc0: sdhci: c8a0cde4: DMA 0xab1bb000, LEN 0x1000, Attr=0x21
> [ 14.840967] mmc0: sdhci: 4bb03017: DMA 0xab1ba000, LEN 0x1000, Attr=0x21
> [ 14.840972] mmc0: sdhci: 2fb0d59e: DMA 0xab1b9000, LEN 0x1000, Attr=0x21
> [ 14.840978] mmc0: sdhci: c3024ff2: DMA 0xab1b8000, LEN 0x1000, Attr=0x21
> [ 14.840983] mmc0: sdhci: 0738188d: DMA 0xab179000, LEN 0x1000, Attr=0x21
> [ 14.840989] mmc0: sdhci: 78ecca83: DMA 0xab178000, LEN 0x1000, Attr=0x21
> [ 14.840994] mmc0: sdhci: 1432e5a9: DMA 0xab0d7000, LEN 0x1000, Attr=0x21
> [ 14.840999] mmc0: sdhci: 8a36c77c: DMA 0xab0d6000, LEN 0x1000, Attr=0x21
> [ 14.841005] mmc0: sdhci: b7196410: DMA 0xab0d5000, LEN 0x1000, Attr=0x21
> [ 14.841010] mmc0: sdhci: dcb25259: DMA 0xab0d4000, LEN 0x1000, Attr=0x21
> [ 14.841015] mmc0: sdhci: ef1e5d32: DMA 0xab0d3000, LEN 0x1000, Attr=0x21
> [ 14.841020] mmc0: sdhci: 0319c66c: DMA 0xab0d2000, LEN 0x1000, Attr=0x21
> [ 14.841026] mmc0: sdhci: 2e6b85d9: DMA 0xab0d1000, LEN 0x1000, Attr=0x21
> [ 14.841031] mmc0: sdhci: d4dd19da: DMA 0xab0d0000, LEN 0x1000, Attr=0x21
> [ 14.841036] mmc0: sdhci: 55cdc0f6: DMA 0xab27f000, LEN 0x1000, Attr=0x21
> [ 14.841041] mmc0: sdhci: a172f4f3: DMA 0xab27e000, LEN 0x1000, Attr=0x21
> [ 14.841046] mmc0: sdhci: ed27e53e: DMA 0xab27d000, LEN 0x1000, Attr=0x21
> [ 14.841051] mmc0: sdhci: c04971ce: DMA 0xab27c000, LEN 0x1000, Attr=0x21
> [ 14.841057] mmc0: sdhci: f43985d3: DMA 0xab27b000, LEN 0x1000, Attr=0x21
> [ 14.841062] mmc0: sdhci: b977bd17: DMA 0xab27a000, LEN 0x1000, Attr=0x21
> [ 14.841067] mmc0: sdhci: 8b74ee6f: DMA 0xab279000, LEN 0x1000, Attr=0x21
> [ 14.841072] mmc0: sdhci: 12e52bc8: DMA 0xab30d000, LEN 0xffff, Attr=0x21
> [ 14.841077] mmc0: sdhci: b39efa31: DMA 0xae857000, LEN 0x0001, Attr=0x21
> [ 14.841082] mmc0: sdhci: bc4b71f0: DMA 0xab31d000, LEN 0x3000, Attr=0x21
> [ 14.841087] mmc0: sdhci: 4cb5aa08: DMA 0xab2a8000, LEN 0x2000, Attr=0x21
> [ 14.841092] mmc0: sdhci: 5e717781: DMA 0xab12a000, LEN 0x2000, Attr=0x21
> [ 14.841098] mmc0: sdhci: 125d82b5: DMA 0xab2b4000, LEN 0x4000, Attr=0x21
> [ 14.841103] mmc0: sdhci: b33874b9: DMA 0xab148000, LEN 0x4000, Attr=0x21
> [ 14.841108] mmc0: sdhci: 9b0e47a5: DMA 0xab218000, LEN 0x8000, Attr=0x21
> [ 14.841113] mmc0: sdhci: 47ce17da: DMA 0xab2a0000, LEN 0x2000, Attr=0x21
> [ 14.841118] mmc0: sdhci: 97ea0d9f: DMA 0x00000000, LEN 0x0000, Attr=0x03
>
> There is a big transfer of 0xffff length followed by a smaller transfer
> of 0x1 (at address 0xae857000 above) and that is where it fails. This is
> the same signature every time it happens.
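
The signature described above (a descriptor whose length is not a multiple
of the 0x200-byte block size shown in the register dump) can be looked for
mechanically in such a dump. The following is only an illustrative
userspace sketch, not driver code: first_unaligned_desc() is a
hypothetical helper, and the lengths are assumed to have been copied by
hand from the "LEN" fields of the log.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of sdhci): scan descriptor lengths taken
 * from a dump and return the index of the first one that is not a
 * multiple of blk_size, or -1 if every length is block-aligned.
 */
static int first_unaligned_desc(const uint32_t *len, size_t n,
				uint32_t blk_size)
{
	for (size_t i = 0; i < n; i++) {
		if (len[i] % blk_size != 0)
			return (int)i;
	}
	return -1;
}
```

Fed with the tail of the dump above (0x1000, 0xffff, 0x0001, 0x3000) and a
block size of 512, the helper points at the 0xffff entry, i.e. the
descriptor right before the 1-byte one.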

Thanks for the investigation, that is very helpful!

Then I guess it is caused by a bad segment size, see sdhci_setup_host():

	if (host->flags & SDHCI_USE_ADMA) {
		if (host->quirks & SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC)
			mmc->max_seg_size = 65535;
		else
			mmc->max_seg_size = 65536;
	} else {
		mmc->max_seg_size = mmc->max_req_size;
	}

Could you confirm it by collecting the following log?

	(cd /sys/block/mmcblk0/queue && find . -type f -exec grep -aH . {} \;)

If 'max_segment_size' is 65535, we may need the following patch:

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 6375afaedcec..6fb7a312b4ea 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -309,7 +309,7 @@ void blk_queue_max_segment_size(struct request_queue *q, unsigned int max_size)
 		       __func__, max_size);
 	}

-	q->limits.max_segment_size = max_size;
+	q->limits.max_segment_size = round_down(max_size, 512);
 }
 EXPORT_SYMBOL(blk_queue_max_segment_size);

Thanks,
Ming
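
For reference, the arithmetic behind the proposed round_down() can be
sketched in userspace C. This is a simplified model under stated
assumptions, not the block layer's real splitting code:
round_down_pow2() and split_segments() are hypothetical helpers, and the
0x13000-byte contiguous range is only inferred from the descriptor dump
(0xffff bytes at 0xab30d000, then 0x1 + 0x3000 bytes), not confirmed in
this thread.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Round x down to a multiple of align (align must be a power of two).
 * Hypothetical stand-in for what round_down(max_size, 512) computes.
 */
static uint32_t round_down_pow2(uint32_t x, uint32_t align)
{
	return x & ~(align - 1);
}

/*
 * Simplified model: cut a contiguous buffer of `total` bytes into
 * segments of at most `max_seg` bytes. Segment lengths are written to
 * `out` (at most `cap` entries); the number of segments is returned.
 */
static size_t split_segments(uint32_t total, uint32_t max_seg,
			     uint32_t *out, size_t cap)
{
	size_t n = 0;

	while (total > 0 && n < cap) {
		uint32_t len = total < max_seg ? total : max_seg;

		out[n++] = len;
		total -= len;
	}
	return n;
}
```

With max_seg = 65535 the model cuts 0x13000 bytes into 0xffff + 0x3001,
neither of which is a multiple of 512. With max_seg =
round_down_pow2(65535, 512) = 65024 (0xfe00) it yields 0xfe00 + 0x3200,
and both segments stay sector-aligned.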