Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp3414228ybv; Tue, 25 Feb 2020 00:51:00 -0800 (PST) X-Google-Smtp-Source: APXvYqxubrcicLI3jdtIwOTPZV1Gv96v61lP/E6NPG3I5DmxgzA3kCuCCell3NEzmdbbJpjPr+Vz X-Received: by 2002:aca:5c46:: with SMTP id q67mr2657045oib.75.1582620659678; Tue, 25 Feb 2020 00:50:59 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582620659; cv=none; d=google.com; s=arc-20160816; b=Aw82WzFXG1d4e9iOUQ+V/M0Q6+gAKkTY/EPtWT2TxqaI9fxeviPZ6X2ntWXhHWAiwd s8lsdV9J5DNeULWJ5HdJtGw7m+rSuWBrQXwSTfmMZmJzr98Ltf3SeSib3XRKprIwd756 cAFZ4XG9N8zSshSyKpWWzTT6uShbqOBltvGpyxpKuJ9KVMbTbm9pqDS60Dmet7Pi/kUo lFnnjTo/JTUvWEzQVlreiSoudNC7rO0EFIkLTIJ34eP0gjDeGxdSxFYs6PZtJYMyHPWV wixddnonpW2ZKWDv+FgYx2cMpBOFU7hxDjK7i7jr+Dz7fjd0RRvPl4eezWCtgmlifNHX rPbQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=JmkjEEqq8m5BjNADnDl327dChnkc9rLHPLw9n41A+qY=; b=YAnybot0dvJB6/2n3O4yy60ixd8xfsuQvKHbOrqqzWcGnsFtvTGvSSaaJobiWlWSvm ZXbcTq5wGFzcgB9n0tEBzWmDHSfy+CX16gsSOpazQt9+kht/XqbEeLazDo74+J/TZkC7 yIoajwFJNcOtLH1eGnEPK3RgmdlSogLGgvioYokFxMEAGwscwIAF1BkcSWFxGO1vfE3u XWxgIZDMYg2s5zMFyDsUo2Yk2/4jQrvF958gayanMSGJAi8jR+x0aOZ1hcTtEcYf4PaG bCL1aXSZlXWdqkaiw8JxG1Itlv458XU4eWZPfZqHcdvBFRtLHLY6RhLqZmDotfmHHE2W pfjA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=J2Ft6uL3; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c76si4579396oig.149.2020.02.25.00.50.11; Tue, 25 Feb 2020 00:50:59 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=J2Ft6uL3; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729792AbgBYIbH (ORCPT + 99 others); Tue, 25 Feb 2020 03:31:07 -0500 Received: from us-smtp-1.mimecast.com ([205.139.110.61]:22847 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729622AbgBYIbH (ORCPT ); Tue, 25 Feb 2020 03:31:07 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1582619465; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=JmkjEEqq8m5BjNADnDl327dChnkc9rLHPLw9n41A+qY=; b=J2Ft6uL3qb2pUkrxSzHQSuCdB2OCostH1rV/uFJVI2B/TKrJmxoBV0yberxiiXKVlFyM7u JKS/xUMWIFdWvtFbJ2ye3E+rPUp/98hUpbCF0oEdytKtW/QOkTQgoDlQm9w+aPNnVzFO6L svGPKk94UfJIxUteZKeKnY/AtS+VR6A= Received: from mail-wm1-f71.google.com (mail-wm1-f71.google.com [209.85.128.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-432-w9QclnFEMXiaU5aR_jpDMA-1; Tue, 25 Feb 2020 03:31:04 -0500 X-MC-Unique: w9QclnFEMXiaU5aR_jpDMA-1 Received: by mail-wm1-f71.google.com with SMTP id k21so556790wmi.2 for ; Tue, 25 Feb 2020 00:31:04 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=JmkjEEqq8m5BjNADnDl327dChnkc9rLHPLw9n41A+qY=; b=kV+2IfvDBKT2F6n0jYfyDgwPGufDM7VeCbjKOFDYbVqxjrkdDsQxDgKvnpMx3oZo6k aRLvkR63z7+lUKTR52EtzIZNbvZRQfJufEOgsEgHLur7kPx1XKzcCbqdrTs9dphbhCqd y3JS7/NNqHnyhDYuDaby8xocIQ4tmfaWz/QoZjT51prvQpc4BXdsblSR4iWuO101UsxE 11snHgtzXlHOOHUFnJcJ1t99a6QquzUfSDze3jw2kpLmkt8Cqu68jLjQZQFG7ZLiv15r LbwSut6oIPJ2bDDAMBzO2KO3IsWaPL3A5w3fvA0pxjS84/JhaIEbc1MaCzj3skY5y2JS PBiw== X-Gm-Message-State: APjAAAW9RgVGUTUX/IN1XYB8h6ZMJzkZ5JaSuGgTBaKzL+TBawqEr6XM lcnYx1VdlIpKTT9zYqtWOdpdB5RIEuX72+7eQbgCv1JuNYNf4L7QzOj6IFxB9YLdGz2j2+nI5d1 BBW67eg9Ui8S3pGCM0aP8V0v4 X-Received: by 2002:adf:f586:: with SMTP id f6mr15471140wro.256.1582619462028; Tue, 25 Feb 2020 00:31:02 -0800 (PST) X-Received: by 2002:adf:f586:: with SMTP id f6mr15471097wro.256.1582619461690; Tue, 25 Feb 2020 00:31:01 -0800 (PST) Received: from steredhat (host209-4-dynamic.27-79-r.retail.telecomitalia.it. [79.27.4.209]) by smtp.gmail.com with ESMTPSA id a13sm5456577wrt.55.2020.02.25.00.31.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Feb 2020 00:31:01 -0800 (PST) Date: Tue, 25 Feb 2020 09:30:58 +0100 From: Stefano Garzarella To: Dexuan Cui Cc: Hillf Danton , Jorgen Hansen , Stefan Hajnoczi , syzbot , "davem@davemloft.net" , "kuba@kernel.org" , "kvm@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "netdev@vger.kernel.org" , "syzkaller-bugs@googlegroups.com" , "virtualization@lists.linux-foundation.org" Subject: Re: INFO: task hung in lock_sock_nested (2) Message-ID: <20200225083058.nfrrvsdgjxj3zcmw@steredhat> References: <0000000000004241ff059f2eb8a4@google.com> <20200223075025.9068-1-hdanton@sina.com> <20200224100853.wd67e7rqmtidfg7y@steredhat> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Feb 25, 2020 at 05:44:03AM +0000, Dexuan Cui wrote: > > From: Stefano Garzarella > > Sent: Monday, February 24, 2020 2:09 AM > > ... > > > > syz-executor280 D27912 9768 9766 0x00000000 > > > > Call Trace: > > > > context_switch kernel/sched/core.c:3386 [inline] > > > > __schedule+0x934/0x1f90 kernel/sched/core.c:4082 > > > > schedule+0xdc/0x2b0 kernel/sched/core.c:4156 > > > > __lock_sock+0x165/0x290 net/core/sock.c:2413 > > > > lock_sock_nested+0xfe/0x120 net/core/sock.c:2938 > > > > virtio_transport_release+0xc4/0xd60 > > net/vmw_vsock/virtio_transport_common.c:832 > > > > vsock_assign_transport+0xf3/0x3b0 net/vmw_vsock/af_vsock.c:454 > > > > vsock_stream_connect+0x2b3/0xc70 net/vmw_vsock/af_vsock.c:1288 > > > > __sys_connect_file+0x161/0x1c0 net/socket.c:1857 > > > > __sys_connect+0x174/0x1b0 net/socket.c:1874 > > > > __do_sys_connect net/socket.c:1885 [inline] > > > > __se_sys_connect net/socket.c:1882 [inline] > > > > __x64_sys_connect+0x73/0xb0 net/socket.c:1882 > > > > do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 > > My understanding about the call trace is: in vsock_stream_connect() > after we call lock_sock(sk), we call vsock_assign_transport(), which may > call vsk->transport->release(vsk), i.e. virtio_transport_release(), and in > virtio_transport_release() we try to get the same lock and hang. Yes, that's what I got. > > > > Seems like vsock needs a word to track lock owner in an attempt to > > > avoid trying to lock sock while the current is the lock owner. > > I'm unfamilar with the g2h/h2g :-) > Here I'm wondering if it's acceptable to add an 'already_locked' > parameter like this: > bool already_locked = true; > vsk->transport->release(vsk, already_locked) ? Could be acceptable, but I prefer to avoid. > > > Thanks for this possible solution. > > What about using sock_owned_by_user()? > > > We should fix also hyperv_transport, because it could suffer from the same > > problem. > > IIUC hyperv_transport doesn't supprot the h2g/g2h feature, so it should not > suffers from the deadlock issue here? The h2g/g2h is in the vsock core, and the hyperv_transport is one of the g2h transports available. If we have a L1 VM with hyperv_transport (G2H) to communicate with L0 and a nested KVM VM (L2) we need to load also vhost_transport (H2G). If the user creates a socket and it tries the following: connect(fd, <2,1234>) - socket assigned to hyperv_transport (because the user wants to reach the host using CID_HOST) fails connect(fd, <3,1234>) - socket must be reassigned to vhost_transport (because the user wants to reach a nested guest) So, I think in this case we can have the deadlock. > > > At this point, it might be better to call vsk->transport->release(vsk) > > always with the lock taken and remove it in the transports as in the > > following patch. > > > > What do you think? > > > > > > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c > > index 9c5b2a91baad..a073d8efca33 100644 > > --- a/net/vmw_vsock/af_vsock.c > > +++ b/net/vmw_vsock/af_vsock.c > > @@ -753,20 +753,18 @@ static void __vsock_release(struct sock *sk, int > > level) > > vsk = vsock_sk(sk); > > pending = NULL; /* Compiler warning. */ > > > > - /* The release call is supposed to use lock_sock_nested() > > - * rather than lock_sock(), if a sock lock should be acquired. > > - */ > > - if (vsk->transport) > > - vsk->transport->release(vsk); > > - else if (sk->sk_type == SOCK_STREAM) > > - vsock_remove_sock(vsk); > > - > > /* When "level" is SINGLE_DEPTH_NESTING, use the nested > > * version to avoid the warning "possible recursive locking > > * detected". When "level" is 0, lock_sock_nested(sk, level) > > * is the same as lock_sock(sk). > > */ > > lock_sock_nested(sk, level); > > + > > + if (vsk->transport) > > + vsk->transport->release(vsk); > > + else if (sk->sk_type == SOCK_STREAM) > > + vsock_remove_sock(vsk); > > + > > sock_orphan(sk); > > sk->sk_shutdown = SHUTDOWN_MASK; > > > > diff --git a/net/vmw_vsock/hyperv_transport.c > > b/net/vmw_vsock/hyperv_transport.c > > index 3492c021925f..510f25f4a856 100644 > > --- a/net/vmw_vsock/hyperv_transport.c > > +++ b/net/vmw_vsock/hyperv_transport.c > > @@ -529,9 +529,7 @@ static void hvs_release(struct vsock_sock *vsk) > > struct sock *sk = sk_vsock(vsk); > > bool remove_sock; > > > > - lock_sock_nested(sk, SINGLE_DEPTH_NESTING); > > remove_sock = hvs_close_lock_held(vsk); > > - release_sock(sk); > > if (remove_sock) > > vsock_remove_sock(vsk); > > } > > This looks good to me, but do we know why vsk->transport->release(vsk) > is called without holding the lock for 'sk' in the first place? Maybe because vmci_transport (the first transport implemented) doesn't require holding lock during the release. Okay, so I'll test this patch and then I'll send it out for reviews. Thanks for the feedback, Stefano