Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751815AbdHAPYF (ORCPT ); Tue, 1 Aug 2017 11:24:05 -0400 Received: from aserp1040.oracle.com ([141.146.126.69]:16752 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751352AbdHAPYD (ORCPT ); Tue, 1 Aug 2017 11:24:03 -0400 Subject: Re: [PATCH v2 11/13] xen/pvcalls: implement release command To: Stefano Stabellini References: <1501017730-12797-1-git-send-email-sstabellini@kernel.org> <1501017730-12797-11-git-send-email-sstabellini@kernel.org> <81df7507-287b-ee06-89e4-463e82628d10@oracle.com> Cc: xen-devel@lists.xen.org, linux-kernel@vger.kernel.org, jgross@suse.com, Stefano Stabellini From: Boris Ostrovsky Message-ID: Date: Tue, 1 Aug 2017 11:23:52 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-Source-IP: userv0021.oracle.com [156.151.31.71] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3224 Lines: 88 On 07/31/2017 06:34 PM, Stefano Stabellini wrote: > On Thu, 27 Jul 2017, Boris Ostrovsky wrote: >>> +int pvcalls_front_release(struct socket *sock) >>> +{ >>> + struct pvcalls_bedata *bedata; >>> + struct sock_mapping *map; >>> + int req_id, notify; >>> + struct xen_pvcalls_request *req; >>> + >>> + if (!pvcalls_front_dev) >>> + return -EIO; >>> + bedata = dev_get_drvdata(&pvcalls_front_dev->dev); >>> + if (!bedata) >>> + return -EIO; >> Some (all?) other ops don't check bedata validity. Should they all do? > No, I don't think they should: dev_set_drvdata is called in the probe > function (pvcalls_front_probe). I'll remove it. > > >>> + >>> + if (sock->sk == NULL) >>> + return 0; >>> + >>> + map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head); >>> + if (map == NULL) >>> + return 0; >>> + >>> + spin_lock(&bedata->pvcallss_lock); >>> + req_id = bedata->ring.req_prod_pvt & (RING_SIZE(&bedata->ring) - 1); >>> + if (RING_FULL(&bedata->ring) || >>> + READ_ONCE(bedata->rsp[req_id].req_id) != PVCALLS_INVALID_ID) { >>> + spin_unlock(&bedata->pvcallss_lock); >>> + return -EAGAIN; >>> + } >>> + WRITE_ONCE(sock->sk->sk_send_head, NULL); >>> + >>> + req = RING_GET_REQUEST(&bedata->ring, req_id); >>> + req->req_id = req_id; >>> + req->cmd = PVCALLS_RELEASE; >>> + req->u.release.id = (uint64_t)sock; >>> + >>> + bedata->ring.req_prod_pvt++; >>> + RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&bedata->ring, notify); >>> + spin_unlock(&bedata->pvcallss_lock); >>> + if (notify) >>> + notify_remote_via_irq(bedata->irq); >>> + >>> + wait_event(bedata->inflight_req, >>> + READ_ONCE(bedata->rsp[req_id].req_id) == req_id); >>> + >>> + if (map->active_socket) { >>> + /* >>> + * Set in_error and wake up inflight_conn_req to force >>> + * recvmsg waiters to exit. >>> + */ >>> + map->active.ring->in_error = -EBADF; >>> + wake_up_interruptible(&map->active.inflight_conn_req); >>> + >>> + mutex_lock(&map->active.in_mutex); >>> + mutex_lock(&map->active.out_mutex); >>> + pvcalls_front_free_map(bedata, map); >>> + mutex_unlock(&map->active.out_mutex); >>> + mutex_unlock(&map->active.in_mutex); >>> + kfree(map); >> Since you are locking here I assume you expect that someone else might >> also be trying to lock the map. But you are freeing it immediately after >> unlocking. Wouldn't that mean that whoever is trying to grab the lock >> might then dereference freed memory? > The lock is to make sure there are no recvmsg or sendmsg in progress. We > are sure that no newer sendmsg or recvmsg are waiting for > pvcalls_front_release to release the lock because before send a message > to the backend we set sk_send_head to NULL. Is there a chance that whoever is potentially calling send/rcvmsg has checked that sk_send_head is non-NULL but hasn't grabbed the lock yet? Freeing a structure containing a lock right after releasing the lock looks weird (to me). Is there any other way to synchronize with sender/receiver? Any other lock? BTW, I also noticed that in rcvmsg you are calling wait_event_interruptible() while holding the lock. Have you tested with CONFIG_DEBUG_ATOMIC_SLEEP? (or maybe it's some other config option that would complain about those sorts of thing) -boris