Received: by 10.223.185.116 with SMTP id b49csp8794865wrg; Fri, 2 Mar 2018 08:05:05 -0800 (PST) X-Google-Smtp-Source: AG47ELuUmsPi+w8o0Xnb4aOsgq9E0fjIZ5aN1Fe1jzrcO9lvaHrmSel9b2I9MmPovPuew52Szwy2 X-Received: by 10.101.97.139 with SMTP id c11mr4884669pgv.435.1520006705563; Fri, 02 Mar 2018 08:05:05 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1520006705; cv=none; d=google.com; s=arc-20160816; b=nWE2Wn6lbPhWDuKS+lGScUepu2zOUSVYUVA8L0wUuzKPE9/BW9je1lO/iDf2pTJ2Tk yw8gSuhmExCVOAXUWfo4Dr4qjqbzIa39RAPZ6vIWPtzzP9y23VhwlV5UPe8ubcqT2e7v NnPOjvhmXLo/p6TcRHA91XASj5vzhA0M6l5LsJBptBjMbnvsJeH5efWdY2e6shimOMMb FcfcxP7lerWQBEK9FTIqVp2oiq9SuwXNdbq1h7zGh0D+89/C+KfyRkthnV1+zYHGsXet w24N+lQd8uLdCD8kKDIOjEUWVhudSu6YfIV7VUYuey2InzR9b9TkZAsBl+yWSCjWbJe4 TcMw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:arc-authentication-results; bh=ezEwD17qfMjR1Hc/Ng+qsqvrasG6YG5fT2BQBDiJzqA=; b=qa97+IBgVA8HBkqp7kES9ejq+tfcO/ciQzC58U8QUri70f+vVpc6BdPw4Hh/31L0NL +Kj2KOzdAMwock6jRye7XiUW8ODe7qM1r/8vVOy6QdsV9YHKgz7ndfLw0rpQYZ72bgZH NPwu/LZq+MpFBX/8IFiJJPveQPSFECBoJGz9kwKDZftVbxlTj2q3YTRNqOR2L6HYTrOH uLeZ+YWIZYkKZvXCAxID0mM9UJpBy0RCQokoa5wjzXRY2ps7Y/8Ca1OHfGKnGmafLiqt joX15gdADa5HeNBUh0zJqzPpBJBQZregxTKAxQtVUVTrtoDhYJag8BtcU5trdTOPK4+A khUw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b9si5055211pfm.327.2018.03.02.08.04.50; Fri, 02 Mar 2018 08:05:05 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1034324AbeCBPMU (ORCPT + 99 others); Fri, 2 Mar 2018 10:12:20 -0500 Received: from mx2.suse.de ([195.135.220.15]:37410 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1030456AbeCBPMS (ORCPT ); Fri, 2 Mar 2018 10:12:18 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 48CD6AC71; Fri, 2 Mar 2018 15:12:17 +0000 (UTC) Subject: Re: [PATCH 1/2] xen: xenbus_dev_frontend: Fix XS_TRANSACTION_END handling To: Simon Gaiser , xen-devel@lists.xenproject.org Cc: stable@vger.kernel.org, Boris Ostrovsky , linux-kernel@vger.kernel.org References: <20180207222236.7434-1-simon@invisiblethingslab.com> <1fbf69f9-f835-897e-144f-8c6f8b94cd26@suse.com> <1d10edc6-8ad6-bc58-432c-d1867f0ab57a@invisiblethingslab.com> <2263ddf3-54f6-d81e-7674-d0ae0802aa65@suse.com> From: Juergen Gross Message-ID: <23999f89-c2f0-0a3b-6262-1a77fa58490e@suse.com> Date: Fri, 2 Mar 2018 16:12:16 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: de-DE Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 02/03/18 15:58, Simon Gaiser wrote: > Juergen Gross: >> On 20/02/18 05:56, Simon Gaiser wrote: >>> Juergen Gross: >>>> On 07/02/18 23:22, Simon Gaiser wrote: >>>>> Commit fd8aa9095a95 ("xen: optimize xenbus driver for multiple >>>>> concurrent xenstore accesses") made a subtle change to the semantic of >>>>> xenbus_dev_request_and_reply() and xenbus_transaction_end(). >>>>> >>>>> Before on an error response to XS_TRANSACTION_END >>>>> xenbus_dev_request_and_reply() would not decrement the active >>>>> transaction counter. But xenbus_transaction_end() has always counted the >>>>> transaction as finished regardless of the response. >>>> >>>> Which is correct now. Xenstore will free all transaction related >>>> data regardless of the response. A once failed transaction can't >>>> be repaired, it has to be repeated completely. >>> >>> So if xenstore frees the transaction why should we keep it in the list >>> with pending transaction in xenbus_dev_frontend? That's exactly what >>> this patch fixes by always removing it from the list, not only on a >>> successful response (See below for the EINVAL case). >> >> Aah, sorry, I seem to have misread my own coding. :-( >> >> Yes, you are right. Sorry for not seeing it before. >> >>> >>> [...] >>>>> But xenbus_dev_frontend tries to end a transaction on closing of the >>>>> device if the XS_TRANSACTION_END failed before. Trying to close the >>>>> transaction twice corrupts the reference count. So fix this by also >>>>> considering a transaction closed if we have sent XS_TRANSACTION_END once >>>>> regardless of the return code. >>>> >>>> A transaction in the list of transactions should not considered to be >>>> finished. Either it is not on the list or it is still pending. >>> >>> With "considering a transaction closed" I mean "take the code path which >>> removes the transaction from the list with pending transactions". >>> >>> From the follow-up mail: >>>>>> The new behavior is that xenbus_dev_request_and_reply() and >>>>>> xenbus_transaction_end() will always count the transaction as finished >>>>>> regardless the response code (handled in xs_request_exit()). >>>>> >>>>> ENOENT should not decrement the transaction counter, while all >>>>> other responses to XS_TRANSACTION_END should still do so. >>>> >>>> Sorry, I stand corrected: the ENOENT case should never happen, as this >>>> case is tested in xenbus_write_transaction(). It doesn't hurt to test >>>> for ENOENT, though. >>>> >>>> What should be handled is EINVAL: this would happen if a user specified >>>> a string different from "T" and "F". >>> >>> Ok, I will handle those cases in xs_request_exit(). Although I don't >>> like that this depends on the internals of xenstore (At least to me it's >>> not obvious why it should only return ENOENT or EINVAL in this case). >>> >>> In the xenbus_write_transaction() case checking the string before >>> sending the transaction (like the transaction itself is verified) would >>> avoid this problem. >> >> Right. I'd prefer this solution. >> >> Remains the only problem you tried to tackle with your second patch: a >> kernel driver going crazy and ending transactions it never started (or >> ending them multiple times). The EINVAL case can't happen here, but >> ENOENT can. Either ENOENT has to be handled in xs_request_exit() or you >> need to keep track of the transactions like in the user interface and >> refuse ending an unknown transaction. Or you trust the kernel users. >> Trying to fix the usage counter seems to be the wrong approach IMO. > > The point of the second patch was to detect such bugs. This would have > saved quite some time to find this bug. I added the "fix" of the counter > I just because it was trivial after having the if there. > > Adding tracking seems to be a quite complex solution for a _potential_ > problem. I agree. > So I would go with checking ENOENT in xs_request_exit(). Should this be > WARN_ON_ONCE()? Since this normally should not happen I would say yes. Yes, having a WARN_ON_ONCE here will help. > Should I keep the reference counter sanity check? And if yes, with the > "fix" to the counter? I'd drop it. This really should not happen and blowing up kernel size with checks of impossible situations isn't the way to go. In case you really want to do something here you can add something like ASSERT(xs_state_users) before decrementing the counter. Juergen