Date: Sun, 4 Mar 2018 19:44:05 +0100
From: Michal Suchánek
To: Stefan Wahren
Cc: Eric Anholt, bcm-kernel-feedback-list@broadcom.com,
 linux-kernel@vger.kernel.org, Ray Jui, Scott Branden, Florian Fainelli,
 linux-rpi-kernel@lists.infradead.org, Phil Elwell, Gerd Hoffmann,
 linux-mmc@vger.kernel.org, Ulf Hansson, Julia Lawall,
 "Gustavo A. R. Silva", linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 1/2] mmc: bcm2835: reset host on timeout
Message-ID: <20180304194405.442ef90d@naga.suse.cz>
In-Reply-To: <46261671.179357.1520187109844@email.1und1.de>

On Sun, 4 Mar 2018 19:11:49 +0100 (CET)
Stefan Wahren wrote:

> Hi Michal,
>
> > On 4 March 2018 at 16:57, Michal Suchánek wrote:
> >
> >
> > On Wed, 14 Feb 2018 21:30:16 +0100 (CET)
> > Stefan Wahren wrote:
> >
> > > Hi Michal,
> > >
> > > > On 14 February 2018 at 20:24, Michal Suchánek wrote:
> > > >
> > > >
> > > > On Wed, 14 Feb 2018 17:49:31 +0100
> > > > Stefan Wahren wrote:
> > > >
> > > > > Hi Michal,
> > > > >
> > > > > [add Phil]
> > > > >
> > > > > On 14.02.2018 at 17:13, Michal Suchánek wrote:
> > > > > > On Wed, 14 Feb 2018 16:36:49 +0100
> > > > > > Michal Suchánek wrote:
> > > > > >
> > > > > >> On Wed, 14 Feb 2018 15:58:31 +0100
> > > > > >> Stefan Wahren wrote:
> > > > > >>
> > > > > >>> Hi Michal,
> > > > > >>>
> > > > > >>> On 14.02.2018 at 15:38, Michal Suchanek wrote:
> > > > > >>>> The bcm2835 mmc host tends to lock up for unknown reason
> > > > > >>>> so reset it on timeout. The upper mmc block layer tries
> > > > > >>>> retransmitting with single blocks which tends to work
> > > > > >>>> out after a long wait.
> > > > > >>>>
> > > > > >>>> This is better than giving up and leaving the machine
> > > > > >>>> broken for no obvious reason.
> > > > > >>> could you please provide more information about this issue
> > > > > >>> (affected hardware, kernel config, version, dmesg,
> > > > > >>> reproducible scenario)?
> > > > > > It tends to reproduce when upgrading a few packages with
> > > > > > zypper and otherwise at random during system operation. It
> > > > > > seems that for my card it worsens with age to some degree
> > > > > > so perhaps it depends on the fragmentation of the internal
> > > > > > card flash.
> > > > > >
> > > > > > Attaching dmesg and kernel config.
> > > > >
> > > > > do you noticed this issue before 4.15-rc4?
> > > >
> > > > I initially noticed it with 4.4 kernel with some backports to
> > > > make it bootable on RPi.
> > >
> > > this confuses me. Gerd and i ported this driver from downstream
> > > and finally it's got merged in 4.12.
> > >
> > > So do you mean that you backported the mainline version to 4.4 or
> > > the downstream version of 4.4?
> >
> > I did not backport it but looking at the changelog it is backport of
> > the 4.12 driver. It does not look as the 4.15 driver though. Looks
> > like there was some reorganization of the bcm mmc since then.
> >
> > >
> > > > On a quick look they seems identical, but they aren't.
> > > >
> > >
> > > > >
> > > > > Could you please test with 4.15 final again?
> > > >
> >
> > I tried upgrading to the current master (4.16-rc3+) and the issue is
> > still reproducible although less frequent. I did full upgrade from
> > the install image which installs over 300 packages and the issue
> > triggered somewhere around 200th while before installing a half
> > dozen packages would usually trigger it.
> >
>
> this is the same what i did during my stress tests. The step
> installed 475 packages. The timeout never occurred.

First off, you did your testing with the Tumbleweed image, which
probably uses btrfs for /, while I use the Leap 42.3 image, which uses
ext4.

So far I have not been able to reproduce the issue just by installing
packages: I installed GNOME, which is over 700 packages, and the issue
did not trigger. However, upgrades also unlink the old files, as does
removing packages. Removing GNOME removed over 200 packages and the
issue triggered. Re-installing and removing GNOME again did not trigger
it, though.

So nothing so far reproduces the issue reliably. With the new kernel the
issue reproduces even less frequently than with the 4.4 and 4.15
kernels; probably some I/O scheduler change affects the disk I/O
patterns.

Thanks

Michal