Date: Thu, 17 Oct 2019 09:36:34 -0700
From: Eric Biggers
To: "Richard W.M. Jones"
Cc: Mike Christie, syzbot, axboe@kernel.dk, josef@toxicpanda.com,
    linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    nbd@other.debian.org, syzkaller-bugs@googlegroups.com
Subject: Re: INFO: task hung in nbd_ioctl
Message-ID: <20191017163634.GD726@sol.localdomain>
References: <000000000000b1b1ee0593cce78f@google.com>
    <5D93C2DD.10103@redhat.com> <20191017140330.GB25667@redhat.com>
    <5DA88D2F.7080907@redhat.com> <20191017162829.GA3888@redhat.com>
In-Reply-To: <20191017162829.GA3888@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 17, 2019 at 05:28:29PM +0100, Richard W.M. Jones wrote:
> On Thu, Oct 17, 2019 at 10:47:59AM -0500, Mike Christie wrote:
> > On 10/17/2019 09:03 AM, Richard W.M. Jones wrote:
> > > On Tue, Oct 01, 2019 at 04:19:25PM -0500, Mike Christie wrote:
> > >> Hey Josef and nbd list,
> > >>
> > >> I had a question about if there are any socket family restrictions for nbd?
> > >
> > > In normal circumstances, in userspace, the NBD protocol would only be
> > > used over AF_UNIX or AF_INET/AF_INET6.
> > >
> > > There's a bit of confusion because netlink is used by nbd-client to
> > > configure the NBD device, setting things like block size and timeouts
> > > (instead of ioctl which is deprecated). I think you don't mean this
> > > use of netlink?
> >
> > I didn't. It looks like it is just a bad test.
> >
> > For the automated test in this thread the test created an AF_NETLINK
> > socket and passed it into the NBD_SET_SOCK ioctl. That is what got used
> > for the NBD_DO_IT ioctl.
> >
> > I was not sure if the test creator picked any old socket and it just
> > happened to pick one nbd never supported, or it was trying to simulate
> > sockets that did not support the shutdown method.
> >
> > I attached the automated test that got run (test.c).
>
> I'd say it sounds like a bad test, but I'm not familiar with syzkaller
> nor how / from where it generates these tests. Did someone report a
> bug and then syzkaller wrote this test?
>
> Rich.
>
> > >
> > >> The bug here is that some socket families do not support the
> > >> sock->ops->shutdown callout, and when nbd calls kernel_sock_shutdown
> > >> their callout returns -EOPNOTSUPP. That then leaves recv_work stuck in
> > >> nbd_read_stat -> sock_xmit -> sock_recvmsg. My patch added a
> > >> flush_workqueue call, so for socket families like AF_NETLINK in this bug
> > >> we hang like we see below.
> > >>
> > >> I can just remove the flush_workqueue call in that code path since it's
> > >> not needed there, but it leaves the original bug my patch was hitting
> > >> where we leave the recv_work running which can then result in leaked
> > >> resources, or possible use after free crashes and you still get the hang
> > >> if you remove the module.
> > >>
> > >> It looks like we have used kernel_sock_shutdown for a while so I thought
> > >> we might never have supported sockets that did not support the callout.
> > >> Is that correct? If so then I can just add a check for this in
> > >> nbd_add_socket and fix that bug too.
> > >
> > > Rich.
> > >

It's an automatically generated fuzz test.

There's rarely any such thing as a "bad" fuzz test. If userspace can do
something that causes the kernel to crash or hang, it's a kernel bug, with very
few exceptions (e.g. writing to /dev/mem).

If there are cases that aren't supported, like sockets that don't support a
certain function or whatever, then the code needs to check for those cases and
return an error, not hang the kernel.

- Eric
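
To make the "check and return an error" idea concrete, here is a minimal userspace sketch of a setup-time check like the one Mike proposes for nbd_add_socket. It is not kernel code and not the actual patch. Every model_* name is hypothetical: struct model_sock_ops and struct model_sock only imitate the shape of sock->ops->shutdown, and the choice of -EINVAL as the rejection code is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-family operations table (sock->ops). */
struct model_sock_ops {
    const char *family;
    int (*shutdown)(int how);
};

/* Hypothetical stand-in for a socket handed in from userspace. */
struct model_sock {
    const struct model_sock_ops *ops;
};

/* Models a family with no shutdown support: the callout always fails. */
static int model_no_shutdown(int how)
{
    (void)how;
    return -EOPNOTSUPP;
}

/* Models a family whose shutdown callout works. */
static int model_stream_shutdown(int how)
{
    (void)how;
    return 0;
}

static const struct model_sock_ops model_netlink_ops = {
    "AF_NETLINK", model_no_shutdown
};
static const struct model_sock_ops model_unix_ops = {
    "AF_UNIX", model_stream_shutdown
};

/*
 * Hypothetical setup-time check: reject a socket whose shutdown callout is
 * missing or is the "not supported" stub, so that later teardown never has
 * to wait on a receiver that cannot be woken up.
 * Returns 0 if the socket is acceptable, -EINVAL (assumed) otherwise.
 */
static int model_add_socket(const struct model_sock *sock)
{
    if (sock == NULL || sock->ops == NULL)
        return -EINVAL;
    if (sock->ops->shutdown == NULL ||
        sock->ops->shutdown == model_no_shutdown)
        return -EINVAL;
    return 0;
}
```

With a struct model_sock whose ops point at model_netlink_ops, model_add_socket() returns -EINVAL. With model_unix_ops it returns 0. The point of the design is that the unsupported case fails at configuration time, before any worker is started that would depend on shutdown to terminate.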