Received: by 2002:a05:6a10:2726:0:0:0:0 with SMTP id ib38csp947481pxb; Thu, 31 Mar 2022 23:53:58 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxWSFbhKjKtEBLVYcfqieSsW8+qUi0XED2PJ7JE77JdwUEEI5+nsq/OS96AFqC8YJmOCgoI X-Received: by 2002:a17:902:f787:b0:14f:43ba:55fc with SMTP id q7-20020a170902f78700b0014f43ba55fcmr45192777pln.3.1648796038738; Thu, 31 Mar 2022 23:53:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1648796038; cv=none; d=google.com; s=arc-20160816; b=ncBMp2G35Hm6aULt6sf3YFrlUVfrLbO/o9f/zUAFIdD3B3c3qtBHrlN4hwqFGfOuv7 KOmF7WYqAN/YqbEC+vyhLa9Q/ldvsd0Xb91crhmGE81y/Y+Oy63EwO5xHR5h0/5sv0Wb bROZ3e6iBC8GzRgVfZoWSZB7m4w5A84uST+6h8JceeZjJbh2fgC7EO19TsT614tn4bjb ggMj3lD7zeAtPC288sH/X6gt9F46vgOUMggHEd74h9I2rz5VVTbsgf+AsvtvctB48ByE AI2H/G2OgkeWKceEDcA4YOFdjX7pF7ipmtQ/z12LhA1w+Bpk+XxKspC9GGHX4cz9RAye MziQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=sZlwQc4q6s2PeRCl4Iw0/VcIy0i7NzF9OE0Eg9Vgj4Y=; b=bDZd/iNqLD2nPy/b39VcGxJIypLCQVcYXCNVmA+igVuCOXSnkobutBDSeG9xDtckdX 7mn9FVzFyuEIqOh11erA6jUMCxsLyyrBNpsoHcBHL1Clua/VIEqvm2FBd+XqFmLlfpIS ky3EbvgySWGfGKz9RhgqsJ8c2h4+hh92c4aDnIrjp16tCB9FcTHbfPCWGX60sH+wZ54i J04NiMigI9F8YHHDOW8iuurueTx1sm2NWHgoMjQwn/o0/Py9vc5Ns9nqC+4OVeWmEfy9 bfB5q5kOMd7GMzKzIqhq+ijOMkLgRNJnrP/UzKyTtXkvYEQduXqPN2inz0uCe3BjeLJs TdfA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=aT280unJ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t29-20020a63b25d000000b003816043f14dsi1656713pgo.834.2022.03.31.23.53.46; Thu, 31 Mar 2022 23:53:58 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=aT280unJ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244768AbiDAD6q (ORCPT + 99 others); Thu, 31 Mar 2022 23:58:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59342 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238315AbiDAD6n (ORCPT ); Thu, 31 Mar 2022 23:58:43 -0400 Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6B3F51E5A58 for ; Thu, 31 Mar 2022 20:56:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1648785414; x=1680321414; h=date:from:to:cc:subject:message-id:references: mime-version:in-reply-to; bh=rltqYO8zfprKIvlGQqshwiwS+GG5pzSco8Z2iya0gw4=; b=aT280unJNyQVZ5npfQQ0A/+3XeLIqyT3/jcZCJMVqa4NCf7SYGLByLCv xscPkUSie4rtBfXPHDMhsRLJNGOF14iq2ROBpR7N4KM5uplC/EOJjFee+ NMAvP3ioEGlycG8+iUJGhR7vXjw3jdt//AMSQ2c3loxDEh0NdyKcqOESo m6xTxITqe8yY85jVe4J9F75jVfW0Ctq6FbrUY6FIz7J9MYTNC8O1hGuZW HKcV5/Ret9kBm6WYR+jSXWjZzWdatxbAbhEN9JCyau8LIPoUh/XLTmb2N VHmseA0+ZmNumbjSGKqY3rnP4pqzlZPRioKFgpMmcAILr+7ZpdCp9pLbs A==; X-IronPort-AV: E=McAfee;i="6200,9189,10303"; a="320724431" X-IronPort-AV: E=Sophos;i="5.90,226,1643702400"; d="scan'208";a="320724431" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Mar 2022 20:56:53 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.90,226,1643702400"; d="scan'208";a="522621436" Received: from louislifei-optiplex-7050.sh.intel.com (HELO louislifei-OptiPlex-7050) ([10.239.81.43]) by orsmga006.jf.intel.com with ESMTP; 31 Mar 2022 20:56:49 -0700 Date: Fri, 1 Apr 2022 11:57:42 +0800 From: Li Fei1 To: Jakob Koschel Cc: linux-kernel@vger.kernel.org, rppt@kernel.org, bjohannesmeyer@gmail.com, c.giuffrida@vu.nl, h.j.bos@vu.nl, fei1.li@intel.com Subject: Re: [PATCH] virt: acrn: fix invalid check past list iterator Message-ID: <20220401035742.GA31162@louislifei-OptiPlex-7050> References: <20220319203819.2559993-1-jakobkoschel@gmail.com> <20220330075742.GA22544@louislifei-OptiPlex-7050> <6E68C33F-9CBB-418C-A11D-2AD863C0B19A@gmail.com> <20220401011533.GA29696@louislifei-OptiPlex-7050> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.4 (2018-02-28) X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Apr 01, 2022 at 05:22:36AM +0200, Jakob Koschel wrote: > > > On 1. Apr 2022, at 03:15, Li Fei1 wrote: > > > > On Thu, Mar 31, 2022 at 01:20:50PM +0200, Jakob Koschel wrote: > >> > >>> On 30. Mar 2022, at 09:57, Li Fei1 wrote: > >>> > >>> On Sat, Mar 19, 2022 at 09:38:19PM +0100, Jakob Koschel wrote: > >>>> The condition retry == 0 is theoretically possible even if 'client' > >>>> does not point to a valid element because no break was hit. > >>>> > >>>> To only execute the dev_warn if actually a break within the loop was > >>>> hit, a separate variable is used that is only set if it is ensured to > >>>> point to a valid client struct. > >>>> > >>> Hi Koschel > >>> > >>> Thanks for you to help us to try to improve the code. Maybe you don't get the point. > >>> The dev_warn should only been called when has_pending = true && retry == 0 > >> > >> Maybe I don't understand but looking isolated at this function I could see a way to call > >> the dev_warn() with has_pending = false && retry == 0. > > Yes, even has_pending = true && retry == 0 at the beginning, we could not make sure > > has_pending is true after schedule_timeout_interruptible and we even didn't check > > there're other pending client on the ioreq_clients (because we can't make sure > > when we dev_warn this clent is still pending). So we just use dev_warn not higher log level. > > I'm sorry, I don't quite understand what you mean by that. > Do you agree that has_pending = false && retry == 0 is possible when calling the dev_warn() > or not? > Yes, so what ? It just a hint there may have pending request. Even retry == 0 && has_pending = true, When we call dev_warn, the clent may not is pending. > >> > >>> list_for_each_entry(client, &vm->ioreq_clients, list) { > >>> has_pending = has_pending_request(client); > >>> if (has_pending) > >>> } > >>> spin_unlock_bh(&vm->ioreq_clients_lock); > >> > >> imagine has_pending == false && retry == 1 here, then client will not hold a valid list entry. > > What do you mean "client will not hold a valid list entry" ? > > Imagine a very simple example: > > struct acrn_ioreq_client *client; > list_for_each_entry(client, &vm->ioreq_clients, list) { > continue; > } > > dev_warn(acrn_dev.this_device, > "%s cannot flush pending request!\n", client->name); /* NOT GOOD */ > If there're pending request, we would call schedule to schedule out then schedule back to check the list from the beginning. If there's no pending client, we point to the last client and break out the while loop. The code doesn't want to find the pending client and break out the while loop and call dev_warn. Please see the function comment. > > Since there is no break for the list_for_each_entry() iterator to return early, > client *cannot* be a valid entry of the list. In fact if you look at the list_for_each_entry() > macro, it will be an 'bogus' pointer, pointing somewhere into 'vm'. > Essentially before the terminating condition of the list traversal the following code is called: > > list_entry(&vm->ioreq_clients, struct acrn_ioreq_client *, list); > > resulting in a: > > container_of(&vm->ioreq_clients, struct acrn_ioreq_client *, list); > > &vm->ioreq_clients however is not contained in a struct acrn_ioreq_client, making > this call compute an invalid pointer. > Therefore using 'client' as in the example above (e.g. client->name) after the loop is > not safe. Since the loop can never return early in the simple example above it will > always break. On other cases (the one we are discussing here) there might be a chance that > there is one code path (in theory) where the loop did not exit early and 'client' > holds that 'invalid entry'. > > This would be the case with has_pending = false && retry == 0. > > I hope this makes sense. > > > > >> > >>> > >>> if (has_pending) > >>> schedule_timeout_interruptible(HZ / 100); > >>> } while (has_pending && --retry > 0); > >> > >> since has_pending && --retry > 0 is no longer true the loop stops. > >> > >>> if (retry == 0) > >>> dev_warn(acrn_dev.this_device, > >>> "%s cannot flush pending request!\n", client->name); > >> client->name is accessed since retry == 0 now, but client is not a valid struct ending up > >> in a type confusion. > >> > >>> > >>> If retry > 0 and has_pending is true, we would call schedule_timeout_interruptible > >>> to schedule out to wait all the pending I/O requests would been completed. > >>> > >>> Thanks. > >> > >> Again, I'm not sure if this is realistically possible. I'm trying to remove > >> any use of the list iterator after the loop to make such potentially issues detectable > > You may think we still in the loop (could we ?), even that we didn't write the list iterator then. > > I'm not exactly sure which loop you are referring to but no, I don't think we are still in > the do while loop. > > The only thing we know after the do while loop is that: !has_pending || retry == 0 > And iff has_pending && retry == 0, then we shouldn't call the dev_warn(). > > >> at compile time instead of relying on certain (difficult to maintain) conditions to be met > >> to avoid the type confusion. > >> > >> Thanks, > >> Jakob > > Jakob >