Received: by 2002:a05:6a10:2726:0:0:0:0 with SMTP id ib38csp1242732pxb; Fri, 1 Apr 2022 08:14:09 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwP/yNEplnMMCGhwpAO6GOaVCZqzByuKsA3XvovuboU2KNHh+FcAQpOoqB3OT/xJNrI9Lb9 X-Received: by 2002:a17:907:7704:b0:6cf:48ac:b4a8 with SMTP id kw4-20020a170907770400b006cf48acb4a8mr221689ejc.305.1648826049618; Fri, 01 Apr 2022 08:14:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1648826049; cv=none; d=google.com; s=arc-20160816; b=ug/liuZJMgIdKLt6C6MpSo7G45JBhreIMKzdlFYuqitxgJU9EwHEUVW6XqC08f8UVe G5iaB1d/mBh9vijt4xz87CexPxjlj7yxdeqSGqc7SHYB3r4qKf4K9Gt/RxIkD33Q7Man AV7rflckiacSNcV9K1n6QmVf2SuECqZu79K5A3F84L7tGVDOabULLu9ixz/a2wUWQU9Q 0XmyA/2hzpM8w7Os+Tl9BBzxqYxwsgo+z/uug2Po2BZcj5roIPIqxrmkDzMZgu+GTxBg xyDjvVYFCmy7AM0oZJ6j6hB2NLE3v4Ef4jFLPd5w5/oBCzKIHKPnHvzXG8zwbhgB/rmK teuQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=/zo/Xafvz2Zf+nfPcp+HD8oBavKsTtRIlxp2JGEau/E=; b=CpG3uN5HbcnHe0u8BprlTRwQqCcEv/o0ymNOhx3upE1M84u85JqWUHrI55lAdTQ6w2 DNJx6ZH3eB16Snj/nSwRdDumcgnNuTRYCc7D8kz3Gh4yIsA9Abnl7AKcszcPmreGm8n9 /p4bpCs9gXkljkBbq9yMoiBEd8prDPsBDcRHWRInCm6UYfg3ZrU+ffBKwwtWaeYMGzYN xUdaQLBc3iQRaxVH3gcPbcCCGFz93b/8iImqcs78+gkyLSrIGFFvw/YHsWh5eCptjN2L X0n2QJSPpVIS7t5Ji2hgaZWrvC1PbbT2QywSoMEy8P4yTSPs/hyuE475dHfg3cw7eIS/ JQTA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=QegZZHbD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id if7-20020a170906df4700b006df7ddb6296si1610167ejc.601.2022.04.01.08.13.44; Fri, 01 Apr 2022 08:14:09 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=QegZZHbD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344078AbiDAH6O (ORCPT + 99 others); Fri, 1 Apr 2022 03:58:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55208 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344213AbiDAH6L (ORCPT ); Fri, 1 Apr 2022 03:58:11 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3A6581B696C for ; Fri, 1 Apr 2022 00:56:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1648799782; x=1680335782; h=date:from:to:cc:subject:message-id:references: mime-version:in-reply-to; bh=Nbq8rAm0bEqRM6a80J2T5nBs+ODG6WOWlxJUu2ezd8c=; b=QegZZHbDcamcWZe/WpO5gxduXk/D9b526PS5B8BZ7FK/B/wkOgJPQp+i ng7nbAxvSpatBbJP/2KoeLTeNJIaTzvGj56hyIG0rprNnvaGF3UV4bAKj CB1uHUse3QFjt8ll0uk5+fKFVJ84PC++H5PvjWCWd85H3LpApm1RjbfOl Isq34SYjUs+TShNulc4ZZMgCYWp/ds3K+LrecM+/foxZKa9WUW/ScUBe9 mU9ZhCcWcqeqVt1a+phHGS0HzpjSZTdWJM42Kr2AxEz6szzccDJyd/zw+ NiuEbfs5hhwK1tKekL+dOPvHxnpp3PtPMrEBQqn3Ozf/vgCmhSBb8/doZ A==; X-IronPort-AV: E=McAfee;i="6200,9189,10303"; a="346509799" X-IronPort-AV: E=Sophos;i="5.90,226,1643702400"; d="scan'208";a="346509799" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2022 00:56:21 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.90,226,1643702400"; d="scan'208";a="720791980" Received: from louislifei-optiplex-7050.sh.intel.com (HELO louislifei-OptiPlex-7050) ([10.239.81.43]) by orsmga005.jf.intel.com with ESMTP; 01 Apr 2022 00:56:19 -0700 Date: Fri, 1 Apr 2022 15:57:11 +0800 From: Li Fei1 To: Jakob Koschel Cc: linux-kernel@vger.kernel.org, rppt@kernel.org, bjohannesmeyer@gmail.com, c.giuffrida@vu.nl, h.j.bos@vu.nl, fei1.li@intel.com Subject: Re: [PATCH] virt: acrn: fix invalid check past list iterator Message-ID: <20220401075711.GA31912@louislifei-OptiPlex-7050> References: <20220319203819.2559993-1-jakobkoschel@gmail.com> <20220330075742.GA22544@louislifei-OptiPlex-7050> <6E68C33F-9CBB-418C-A11D-2AD863C0B19A@gmail.com> <20220401011533.GA29696@louislifei-OptiPlex-7050> <20220401035742.GA31162@louislifei-OptiPlex-7050> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.4 (2018-02-28) X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Apr 01, 2022 at 09:16:48AM +0200, Jakob Koschel wrote: > > > > On 1. Apr 2022, at 05:57, Li Fei1 wrote: > > > > On Fri, Apr 01, 2022 at 05:22:36AM +0200, Jakob Koschel wrote: > >> > >>> On 1. Apr 2022, at 03:15, Li Fei1 wrote: > >>> > >>> On Thu, Mar 31, 2022 at 01:20:50PM +0200, Jakob Koschel wrote: > >>>> > >>>>> On 30. Mar 2022, at 09:57, Li Fei1 wrote: > >>>>> > >>>>> On Sat, Mar 19, 2022 at 09:38:19PM +0100, Jakob Koschel wrote: > >>>>>> The condition retry == 0 is theoretically possible even if 'client' > >>>>>> does not point to a valid element because no break was hit. > >>>>>> > >>>>>> To only execute the dev_warn if actually a break within the loop was > >>>>>> hit, a separate variable is used that is only set if it is ensured to > >>>>>> point to a valid client struct. > >>>>>> > >>>>> Hi Koschel > >>>>> > >>>>> Thanks for you to help us to try to improve the code. Maybe you don't get the point. > >>>>> The dev_warn should only been called when has_pending = true && retry == 0 > >>>> > >>>> Maybe I don't understand but looking isolated at this function I could see a way to call > >>>> the dev_warn() with has_pending = false && retry == 0. > >>> Yes, even has_pending = true && retry == 0 at the beginning, we could not make sure > >>> has_pending is true after schedule_timeout_interruptible and we even didn't check > >>> there're other pending client on the ioreq_clients (because we can't make sure > >>> when we dev_warn this clent is still pending). So we just use dev_warn not higher log level. > >> > >> I'm sorry, I don't quite understand what you mean by that. > >> Do you agree that has_pending = false && retry == 0 is possible when calling the dev_warn() > >> or not? > >> > > Yes, so what ? It just a hint there may have pending request. > > if has_pending == false && retry == 0 when calling dev_warn() there are very clear > dependencies met: > > * has_pending == false means that the list_for_each_entry() macro it *not* exit early. > * since list_for_each_entry() did *not* exit early, client will not hold a valid list entry > * using client->name is not safe and will not point to a valid string (perhaps not even an address) So you'd better to check when the client in ioreq_clients would been destroyed, right ? > > I'm *only* talking about the case where has_pending == false, in case that's not clear. > > > > Even retry == 0 && has_pending = true, > > When we call dev_warn, the clent may not is pending. > > > > > >>>> > >>>>> list_for_each_entry(client, &vm->ioreq_clients, list) { > >>>>> has_pending = has_pending_request(client); > >>>>> if (has_pending) > >>>>> } > >>>>> spin_unlock_bh(&vm->ioreq_clients_lock); > >>>> > >>>> imagine has_pending == false && retry == 1 here, then client will not hold a valid list entry. > >>> What do you mean "client will not hold a valid list entry" ? > >> > >> Imagine a very simple example: > >> > >> struct acrn_ioreq_client *client; > >> list_for_each_entry(client, &vm->ioreq_clients, list) { > >> continue; > >> } > >> > >> dev_warn(acrn_dev.this_device, > >> "%s cannot flush pending request!\n", client->name); /* NOT GOOD */ > >> > > If there're pending request, we would call schedule to schedule out then schedule back > > to check the list from the beginning. If there's no pending client, we point to the last > > client and break out the while loop. > > > > The code doesn't want to find the pending client and break out the while loop and call > > dev_warn. Please see the function comment. > > > > > >> > >> Since there is no break for the list_for_each_entry() iterator to return early, > >> client *cannot* be a valid entry of the list. In fact if you look at the list_for_each_entry() > >> macro, it will be an 'bogus' pointer, pointing somewhere into 'vm'. > >> Essentially before the terminating condition of the list traversal the following code is called: > >> > >> list_entry(&vm->ioreq_clients, struct acrn_ioreq_client *, list); > >> > >> resulting in a: > >> > >> container_of(&vm->ioreq_clients, struct acrn_ioreq_client *, list); > >> > >> &vm->ioreq_clients however is not contained in a struct acrn_ioreq_client, making > >> this call compute an invalid pointer. > >> Therefore using 'client' as in the example above (e.g. client->name) after the loop is > >> not safe. Since the loop can never return early in the simple example above it will > >> always break. On other cases (the one we are discussing here) there might be a chance that > >> there is one code path (in theory) where the loop did not exit early and 'client' > >> holds that 'invalid entry'. > >> > >> This would be the case with has_pending = false && retry == 0. > >> > >> I hope this makes sense. > >> > >>> > >>>> > >>>>> > >>>>> if (has_pending) > >>>>> schedule_timeout_interruptible(HZ / 100); > >>>>> } while (has_pending && --retry > 0); > >>>> > >>>> since has_pending && --retry > 0 is no longer true the loop stops. > >>>> > >>>>> if (retry == 0) > >>>>> dev_warn(acrn_dev.this_device, > >>>>> "%s cannot flush pending request!\n", client->name); > >>>> client->name is accessed since retry == 0 now, but client is not a valid struct ending up > >>>> in a type confusion. > >>>> > >>>>> > >>>>> If retry > 0 and has_pending is true, we would call schedule_timeout_interruptible > >>>>> to schedule out to wait all the pending I/O requests would been completed. > >>>>> > >>>>> Thanks. > >>>> > >>>> Again, I'm not sure if this is realistically possible. I'm trying to remove > >>>> any use of the list iterator after the loop to make such potentially issues detectable > >>> You may think we still in the loop (could we ?), even that we didn't write the list iterator then. > >> > >> I'm not exactly sure which loop you are referring to but no, I don't think we are still in > >> the do while loop. > >> > >> The only thing we know after the do while loop is that: !has_pending || retry == 0 > >> And iff has_pending && retry == 0, then we shouldn't call the dev_warn(). > >> > >>>> at compile time instead of relying on certain (difficult to maintain) conditions to be met > >>>> to avoid the type confusion. > >>>> > >>>> Thanks, > >>>> Jakob > >> > >> Jakob > > Jakob >