Subject: Re: [PATCH v2 11/13] xen/pvcalls: implement release command
To: Boris Ostrovsky, Stefano Stabellini
Cc: xen-devel@lists.xen.org, linux-kernel@vger.kernel.org, Stefano Stabellini
References: <1501017730-12797-1-git-send-email-sstabellini@kernel.org> <1501017730-12797-11-git-send-email-sstabellini@kernel.org> <81df7507-287b-ee06-89e4-463e82628d10@oracle.com>
From: Juergen Gross
Message-ID: <7ace9427-5215-6be7-907a-46dd15ea2a8f@suse.com>
Date: Tue, 1 Aug 2017 17:34:44 +0200

On 01/08/17 17:23, Boris Ostrovsky wrote:
> On 07/31/2017 06:34 PM, Stefano Stabellini wrote:
>> On Thu, 27 Jul 2017, Boris Ostrovsky wrote:
>>>> +int pvcalls_front_release(struct socket *sock)
>>>> +{
>>>> +	struct pvcalls_bedata *bedata;
>>>> +	struct sock_mapping *map;
>>>> +	int req_id, notify;
>>>> +	struct xen_pvcalls_request *req;
>>>> +
>>>> +	if (!pvcalls_front_dev)
>>>> +		return -EIO;
>>>> +	bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
>>>> +	if (!bedata)
>>>> +		return -EIO;
>>> Some (all?) other ops don't check bedata validity. Should they all do?
>> No, I don't think they should: dev_set_drvdata is called in the probe
>> function (pvcalls_front_probe). I'll remove it.
>>
>>
>>>> +
>>>> +	if (sock->sk == NULL)
>>>> +		return 0;
>>>> +
>>>> +	map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
>>>> +	if (map == NULL)
>>>> +		return 0;
>>>> +
>>>> +	spin_lock(&bedata->pvcallss_lock);
>>>> +	req_id = bedata->ring.req_prod_pvt & (RING_SIZE(&bedata->ring) - 1);
>>>> +	if (RING_FULL(&bedata->ring) ||
>>>> +	    READ_ONCE(bedata->rsp[req_id].req_id) != PVCALLS_INVALID_ID) {
>>>> +		spin_unlock(&bedata->pvcallss_lock);
>>>> +		return -EAGAIN;
>>>> +	}
>>>> +	WRITE_ONCE(sock->sk->sk_send_head, NULL);
>>>> +
>>>> +	req = RING_GET_REQUEST(&bedata->ring, req_id);
>>>> +	req->req_id = req_id;
>>>> +	req->cmd = PVCALLS_RELEASE;
>>>> +	req->u.release.id = (uint64_t)sock;
>>>> +
>>>> +	bedata->ring.req_prod_pvt++;
>>>> +	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&bedata->ring, notify);
>>>> +	spin_unlock(&bedata->pvcallss_lock);
>>>> +	if (notify)
>>>> +		notify_remote_via_irq(bedata->irq);
>>>> +
>>>> +	wait_event(bedata->inflight_req,
>>>> +		READ_ONCE(bedata->rsp[req_id].req_id) == req_id);
>>>> +
>>>> +	if (map->active_socket) {
>>>> +		/*
>>>> +		 * Set in_error and wake up inflight_conn_req to force
>>>> +		 * recvmsg waiters to exit.
>>>> +		 */
>>>> +		map->active.ring->in_error = -EBADF;
>>>> +		wake_up_interruptible(&map->active.inflight_conn_req);
>>>> +
>>>> +		mutex_lock(&map->active.in_mutex);
>>>> +		mutex_lock(&map->active.out_mutex);
>>>> +		pvcalls_front_free_map(bedata, map);
>>>> +		mutex_unlock(&map->active.out_mutex);
>>>> +		mutex_unlock(&map->active.in_mutex);
>>>> +		kfree(map);
>>> Since you are locking here I assume you expect that someone else might
>>> also be trying to lock the map. But you are freeing it immediately after
>>> unlocking. Wouldn't that mean that whoever is trying to grab the lock
>>> might then dereference freed memory?
>> The lock is to make sure there are no recvmsg or sendmsg in progress.
>> We are sure that no newer sendmsg or recvmsg are waiting for
>> pvcalls_front_release to release the lock because before send a message
>> to the backend we set sk_send_head to NULL.
>
> Is there a chance that whoever is potentially calling send/rcvmsg has
> checked that sk_send_head is non-NULL but hasn't grabbed the lock yet?
>
> Freeing a structure containing a lock right after releasing the lock
> looks weird (to me). Is there any other way to synchronize with
> sender/receiver? Any other lock?

Right. This looks fishy. Either you don't need the locks or you can't
just free the area right after releasing the lock.

> BTW, I also noticed that in rcvmsg you are calling
> wait_event_interruptible() while holding the lock. Have you tested with
> CONFIG_DEBUG_ATOMIC_SLEEP? (or maybe it's some other config option that
> would complain about those sorts of thing)

I believe sleeping while holding a mutex is allowed. Sleeping in
spinlocked paths is bad.

BTW: You are looking for CONFIG_DEBUG_MUTEXES (see
Documentation/locking/mutex-design.txt).


Juergen
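
To illustrate the lifetime concern discussed above (a user can see a
non-NULL pointer, then block on a lock inside memory that is about to be
freed), here is a minimal userspace sketch using pthreads. It is not the
pvcalls code and not a proposed patch. struct map, table_lock, published
and the map_*() helpers are all hypothetical names; "published" merely
stands in for sk_send_head. The idea is that lookup and reference
counting happen under one outer lock, and only the last reference frees
the object.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a mapping that embeds its own lock. */
struct map {
	pthread_mutex_t lock;	/* plays the role of in_mutex/out_mutex */
	int refcount;		/* protected by table_lock */
	int in_error;
};

/* Outer lock: covers the published pointer and every refcount. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct map *published;	/* plays the role of sk_send_head */

/* Look up the map and take a reference in one critical section. */
static struct map *map_get(void)
{
	struct map *m;

	pthread_mutex_lock(&table_lock);
	m = published;
	if (m)
		m->refcount++;
	pthread_mutex_unlock(&table_lock);
	return m;
}

/* Drop a reference; only the last holder destroys and frees. */
static void map_put(struct map *m)
{
	int last;

	pthread_mutex_lock(&table_lock);
	last = (--m->refcount == 0);
	pthread_mutex_unlock(&table_lock);
	if (last) {
		pthread_mutex_destroy(&m->lock);
		free(m);
	}
}

/* sendmsg/recvmsg analogue: returns -1 once the map is unpublished. */
static int map_use(void)
{
	struct map *m = map_get();
	int err;

	if (!m)
		return -1;
	pthread_mutex_lock(&m->lock);	/* safe: we hold a reference */
	err = m->in_error;
	pthread_mutex_unlock(&m->lock);
	map_put(m);
	return err;
}

/* release analogue: unpublish, then drop the publishing reference. */
static void map_release(void)
{
	struct map *m;

	pthread_mutex_lock(&table_lock);
	m = published;
	published = NULL;
	pthread_mutex_unlock(&table_lock);
	if (m)
		map_put(m);
}

/* Allocate a map and publish it with an initial reference of 1. */
static struct map *map_create(void)
{
	struct map *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	pthread_mutex_init(&m->lock, NULL);
	m->refcount = 1;
	pthread_mutex_lock(&table_lock);
	published = m;
	pthread_mutex_unlock(&table_lock);
	return m;
}
```

With this shape, a user that obtained the map before map_release() ran
keeps it alive until its own map_put(), so there is no window between
"pointer checked" and "lock taken" in which the memory can vanish.
Whether something like this fits the pvcalls frontend is a separate
question.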