Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot
To: "Michael S. Tsirkin"
Cc: "virtualization@lists.linux-foundation.org", Stephen Rothwell, Linux Next Mailing List, "linux-kernel@vger.kernel.org", kvm list, Halil Pasic
References: <20191218100926-mutt-send-email-mst@kernel.org> <2ffdbd95-e375-a627-55a1-6990b0a0e37a@de.ibm.com> <20200106054041-mutt-send-email-mst@kernel.org> <08ae8d28-3d8c-04e8-bdeb-0117d06c6dc7@de.ibm.com> <20200107042401-mutt-send-email-mst@kernel.org>
From: Christian Borntraeger
Date: Tue, 7 Jan 2020 12:34:50 +0100
In-Reply-To: <20200107042401-mutt-send-email-mst@kernel.org>

On 07.01.20 10:39, Michael S. Tsirkin wrote:
> On Tue, Jan 07, 2020 at 09:59:16AM +0100, Christian Borntraeger wrote:
>>
>>
>> On 06.01.20 11:50, Michael S. Tsirkin wrote:
>>> On Wed, Dec 18, 2019 at 04:59:02PM +0100, Christian Borntraeger wrote:
>>>> On 18.12.19 16:10, Michael S. Tsirkin wrote:
>>>>> On Wed, Dec 18, 2019 at 03:43:43PM +0100, Christian Borntraeger wrote:
>>>>>> Michael,
>>>>>>
>>>>>> with
>>>>>> commit db7286b100b503ef80612884453bed53d74c9a16 (refs/bisect/skip-db7286b100b503ef80612884453bed53d74c9a16)
>>>>>>     vhost: use batched version by default
>>>>>> plus
>>>>>> commit 6bd262d5eafcdf8cdfae491e2e748e4e434dcda6 (HEAD, refs/bisect/bad)
>>>>>>     Revert "vhost/net: add an option to test new code"
>>>>>> to make things compile (your next tree is not easily bisectable, can you fix that as well?).
>>>>>
>>>>> I'll try.
>>>>>
>>>>>>
>>>>>> I get random crashes in my s390 KVM guests after reboot.
>>>>>> Reverting both patches together with commit decd9b8 "vhost: use vhost_desc instead of vhost_log" to
>>>>>> make it compile again) on top of linux-next-1218 makes the problem go away.
>>>>>>
>>>>>> Looks like the batched version is not yet ready for prime time. Can you drop these patches until
>>>>>> we have fixed the issues?
>>>>>>
>>>>>> Christian
>>>>>>
>>>>>
>>>>> Will do, thanks for letting me know.
>>>>
>>>> I have confirmed with the initial reporter (internal test team) that
>>>> with a known to be broken linux next kernel also fixes the problem, so it is really the
>>>> vhost changes.
>>>
>>> OK I'm back and trying to make it more bisectable.
>>>
>>> I pushed a new tag "batch-v2".
>>> It's same code but with this bisect should get more information.
>>
>> I get the following with this tag
>>
>> drivers/vhost/net.c: In function ‘vhost_net_tx_get_vq_desc’:
>> drivers/vhost/net.c:574:7: error: implicit declaration of function ‘vhost_get_vq_desc_batch’; did you mean ‘vhost_get_vq_desc’? [-Werror=implicit-function-declaration]
>>   574 |   r = vhost_get_vq_desc_batch(tvq, tvq->iov, ARRAY_SIZE(tvq->iov),
>>       |       ^~~~~~~~~~~~~~~~~~~~~~~
>>       |       vhost_get_vq_desc
>>
>
> Not sure why but I pushed a wrong commit. Sorry. Should be good now.
>

during bisect:

drivers/vhost/vhost.c: In function ‘vhost_get_vq_desc_batch’:
drivers/vhost/vhost.c:2634:8: error: ‘id’ undeclared (first use in this function); did you mean ‘i’?
 2634 |  ret = id;
      |        ^~
      |        i

I changed that to i

The last step then gave me (on commit 50297a8480b439efc5f3f23088cb2d90b799acef
vhost: use batched version by default)

net enc1: Unexpected TXQ (0) queue failure: -5

in the guest.

bisect log so far:
[cborntra@m83lp52 linux]$ git bisect log
git bisect start
# bad: [3131e79bb9e9892a5a6bd33513de9bc90b20e867] vhost: use vhost_desc instead of vhost_log
git bisect bad 3131e79bb9e9892a5a6bd33513de9bc90b20e867
# good: [d1281e3a562ec6a08f944a876481dd043ba739b9] virtio-blk: remove VIRTIO_BLK_F_SCSI support
git bisect good d1281e3a562ec6a08f944a876481dd043ba739b9
# good: [5b00aab5b6332a67e32dace1dcd3a198ab94ed56] vhost: option to fetch descriptors through an independent struct
git bisect good 5b00aab5b6332a67e32dace1dcd3a198ab94ed56
# good: [5b00aab5b6332a67e32dace1dcd3a198ab94ed56] vhost: option to fetch descriptors through an independent struct
git bisect good 5b00aab5b6332a67e32dace1dcd3a198ab94ed56
# bad: [1414d7ee3d10d2ec2bc4ee652d1d90ec91da1c79] vhost: batching fetches
git bisect bad 1414d7ee3d10d2ec2bc4ee652d1d90ec91da1c79