Subject: Re: R8169: Network lockups in 4.18.{8,9,10} (and 4.19 dev)
To: Chris Clayton
Cc: Heiner Kallweit, "David S. Miller", Azat Khuzhin, Greg Kroah-Hartman,
 Realtek linux nic maintainers, linux-kernel
References: <54d8d7e9-a80d-dc2b-5628-22f9dc14e2ee@maciej.szmigiero.name>
 <535f42c7-6c3b-8e5a-49de-5dc975879b21@googlemail.com>
 <98680351-5123-761f-982a-726098da9716@gmail.com>
 <9980dcc1-f7fe-5de7-75be-99b1592c9206@googlemail.com>
 <6b1685ce-22ac-2c71-e1d4-b05748a7d977@googlemail.com>
 <7199b1e4-ce40-60ae-2a6a-ef7e95e563ea@googlemail.com>
 <0e206e6b-3d0c-de27-dedb-48c30e02649c@gmail.com>
From: "Maciej S. Szmigiero"
Message-ID: <9d99060a-db1d-7177-3041-e407b131548e@maciej.szmigiero.name>
Date: Wed, 10 Oct 2018 02:24:21 +0200
In-Reply-To: <0e206e6b-3d0c-de27-dedb-48c30e02649c@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 09.10.2018 22:36, Heiner Kallweit wrote:
> On 09.10.2018 16:40, Chris Clayton wrote:
>> Thanks to Maciej and Heiner for their replies.
>>
>> On 09/10/2018 13:32, Maciej S. Szmigiero wrote:
>>> On 07.10.2018 21:36, Chris Clayton wrote:
>>>> Hi again,
>>>>
>>>> I didn't think there was anything in 4.19-rc7 to fix this regression, but tried it anyway.
>>>> I can confirm that the
>>>> regression is still present and my network still fails when, after a resume from suspend (to ram or disk), I open my
>>>> browser or my mail client. In both those cases the failure is almost immediate - e.g. my home page doesn't get displayed
>>>> in the browser. Pinging one of my ISPs name servers doesn't fail quite so quickly but the reported time increases from
>>>> 14-15ms to more than 1000ms.
>>>
>>> You can try comparing chip registers (ethtool -d eth0) in the working
>>> state (before a suspend) and in the broken state (after a resume).
>>> Maybe there will be some obvious in the difference.
>>>
>>> The same goes for the PCI configuration (lspci -d :8168 -vv).
>>>
>> Maciej suggested comparing the output from lspci -vv for the ethernet device. They are identical.
>>
>> Both Maciej and Heiner suggested comparing the output from "ethtool -d" pre and post suspend. Again, they are identical.
>> Heiner specifically suggested looking at the RxConfig. The value of that is 0x0002870e both pre and post suspend.
>>
> Hmm, this is very weird, especially taking into account that in your original
> report you state that removing the call to rtl_init_rxcfg() from rtl_hw_start()
> fixes the issue. rtl_init_rxcfg() deals with the RxConfig register only and
> register values seem to be the same before and after resume. So how can the
> chip behave differently?
> So far my best guess is that some chip quirk causes it to accept writes to
> register RxConfig, but to misinterpret or ignore the written value.
> So far your report is the only one (affecting RTL8411), but we don't know
> whether other chip versions are affected too.

Also, it is interesting that even if one removes a call to
rtl_init_rxcfg() from rtl_hw_start() the RxConfig register will still
get written to moments later by rtl_set_rx_mode().
The only chip accesses in the meantime seem to be a write to TxConfig
by rtl_set_tx_config_registers() and then a read of RxConfig plus two
writes to MAR0 earlier in rtl_set_rx_mode().

My proposals are:
1) Try swapping "rtl_init_rxcfg(tp);" and "rtl_set_tx_config_registers(tp);"
   in rtl_hw_start().
   Maybe the chip sometimes does not like RxConfig being written
   before TxConfig.

2) Check the original value of RxConfig (after a resume) before
   rtl_init_rxcfg() overwrites it (compile tested only):
--- r8169.c.ori
+++ r8169.c
@@ -5155,6 +5155,9 @@
 	/* Initially a 10 us delay. Turned it into a PCI commit. - FR */
 	RTL_R8(tp, IntrMask);
 	RTL_W8(tp, ChipCmd, CmdTxEnb | CmdRxEnb);
+
+	pr_notice("RxConfig before init was %.8x\n",
+		  (unsigned int)RTL_R32(tp, RxConfig));
 	rtl_init_rxcfg(tp);
 	rtl_set_tx_config_registers(tp);

This should be the value that you got when you removed the call
to rtl_init_rxcfg() for testing.

Now, knowing the "right" value you can experiment with what
rtl_init_rxcfg() writes (under the "default:" label for your NIC model).

Hope this helps,
Maciej