Subject: Re: [PATCH] virtio_net: enable tx after resuming from suspend
From: ake
To: Jason Wang
Cc: "Michael S. Tsirkin", "David S. Miller", virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20181011075127.2608-1-ake@igel.co.jp> <7e87b140-79ae-c79e-40ed-dc76b38eeae4@igel.co.jp> <4918ed7c-4c63-6f19-530b-8e16b0c496d4@redhat.com> <1aff0ad2-9d63-6d38-6b25-5c681eafdfb2@igel.co.jp>
In-Reply-To: <1aff0ad2-9d63-6d38-6b25-5c681eafdfb2@igel.co.jp>
Date: Mon, 15 Oct 2018 19:08:06 +0900
X-Mailing-List: linux-kernel@vger.kernel.org

On 2018-10-12 18:18, ake wrote:
>
>
> On 2018-10-12 17:23, Jason Wang wrote:
>>
>>
>> On 2018-10-12 12:30, ake wrote:
>>>
>>> On 2018-10-11 22:06, Jason Wang wrote:
>>>>
>>>> On 2018-10-11 18:22, ake wrote:
>>>>> On 2018-10-11 18:44, Jason Wang wrote:
>>>>>> On 2018-10-11 15:51, Ake Koomsin wrote:
>>>>>>> commit 713a98d90c5e ("virtio-net: serialize tx routine during reset")
>>>>>>> disabled the virtio tx before going to suspend to avoid a use after
>>>>>>> free.
>>>>>>> However, after resuming, it causes the virtio_net device to lose its
>>>>>>> network connectivity.
>>>>>>>
>>>>>>> To solve the issue, we need to enable tx after resuming.
>>>>>>>
>>>>>>> Fixes commit 713a98d90c5e ("virtio-net: serialize tx routine during
>>>>>>> reset")
>>>>>>> Signed-off-by: Ake Koomsin
>>>>>>> ---
>>>>>>>     drivers/net/virtio_net.c | 1 +
>>>>>>>     1 file changed, 1 insertion(+)
>>>>>>>
>>>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>>>>> index dab504ec5e50..3453d80f5f81 100644
>>>>>>> --- a/drivers/net/virtio_net.c
>>>>>>> +++ b/drivers/net/virtio_net.c
>>>>>>> @@ -2256,6 +2256,7 @@ static int virtnet_restore_up(struct
>>>>>>> virtio_device *vdev)
>>>>>>>         }
>>>>>>>           netif_device_attach(vi->dev);
>>>>>>> +    netif_start_queue(vi->dev);
>>>>>> I believe this is duplicated with netif_tx_wake_all_queues() in
>>>>>> netif_device_attach() above?
>>>>> Thank you for your review.
>>>>>
>>>>> If both netif_tx_wake_all_queues() and netif_start_queue() result in
>>>>> clearing __QUEUE_STATE_DRV_XOFF, then is it possible that some
>>>>> conditions in netif_device_attach() are not satisfied?
>>>> Yes, maybe. One case I can see now is when the device is down. In this
>>>> case netif_device_attach() won't try to wake up the queue.
>>>>
>>>>>    Without
>>>>> netif_start_queue(), the virtio_net device does not resume properly
>>>>> after waking up.
>>>> How do you trigger the issue? Just do suspend/resume?
>>> Yes, simply suspend and resume.
>>>
>>> Here is how I trigger the issue:
>>>
>>> 1) Start the Virtual Machine Manager GUI program.
>>> 2) Create a guest Linux OS. Make sure that the guest OS kernel is
>>>     >= 4.12. Make sure that it uses virtio_net as its network device.
>>>     In addition, make sure that the video adapter is VGA. Otherwise,
>>>     waking up with the virtual power button does not work.
>>> 3) After installing the guest OS, log in, and test the network
>>>     connectivity by pinging the host machine.
>>> 4) Suspend. After this, the screen is blank.
>>> 5) Resume by hitting the virtual power button. The login screen
>>>     appears again.
>>> 6) Log in again. The guest loses its network connection.
>>>
>>> In my test:
>>> Guest: Ubuntu 16.04/18.04 with kernel 4.15.0-36-generic
>>> Host: Ubuntu 16.04 with kernel 4.15.0-36-generic/4.4.0-137-generic
>>
>> I cannot reproduce this issue if the virtio-net interface is up in the
>> guest before the suspend. I'm using net-next.git and qemu master. But I
>> do reproduce it when the virtio-net interface is down in the guest before
>> suspend: after resume, even if I bring it up, the network is still lost.
>>
>> I think the interface is up in your case, but please confirm this.
>
> If you mean the interface state before I hit the suspend button,
> the answer is yes. The interface is up before I suspend the guest
> machine.
>
> Note that my current QEMU version is QEMU emulator version 2.5.0
> (Debian 1:2.5+dfsg-5ubuntu10.32).
>
> I will try with net-next.git and qemu master later and see if I can
> reproduce the issue.

Update. I tried with net-next and qemu master. Interestingly, the
result is different from yours. The network is lost even if the
virtio_net interface is up before suspending.

Host: Ubuntu 16.04 with net-next kernel (default configuration)
Guest: Ubuntu 18.04 with net-next kernel (default configuration)
Qemu: master
Qemu command:
qemu-system-x86_64 -cpu host -m 2048 -enable-kvm \
    -bios /usr/share/OVMF/OVMF_CODE.fd \
    -drive file=/var/lib/libvirt/images/virtio_test.qcow2,if=virtio \
    -netdev user,id=hostnet0 \
    -device virtio-net-pci,netdev=hostnet0 \
    -device VGA,id=video0,vgamem_mb=16 \
    -global PIIX4_PM.disable_s3=1 \
    -global PIIX4_PM.disable_s4=1 -monitor stdio

>>>
>>>>> Is it better to report this as a bug first?
>>>> Nope, you're very welcome to post a patch directly.
>>>>
>>>>> If I am to do more
>>>>> investigation, what areas should I look into?
>>>> As you've figured out, you can start with why netif_tx_wake_all_queues()
>>>> was not executed.
>>>>
>>>> (Btw, does the issue disappear if you move netif_tx_disable() under the
>>>> check of netif_running() in virtnet_freeze_down()?)
>>> The issue disappears if I move netif_tx_disable() under the check of
>>> netif_running() in virtnet_freeze_down(). Moving netif_tx_disable()
>>> is probably better as its logic is consistent with the
>>> netif_device_attach() implementation. If you are OK with this idea,
>>> I will submit another patch.
>>
>> I think it helps for the case when the interface is down before suspend.
>> But it's still unclear why it helps even if the interface is up
>> (netif_running() is true).
>>
>> Please submit a patch, but we should figure out why it helps for an up
>> interface as well.
>>

I will think about the proper reason first.

>> Thanks
>>
>>>
>>>> Thanks
>>>>
>>>>> Best Regards
>>>>> Ake Koomsin
>>>>>
>>> Best Regards
>>

Best Regards