Date: Wed, 31 Aug 2022 04:58:55 +0300
From: Jarkko Sakkinen
To: Reinette Chatre
Cc: linux-sgx@vger.kernel.org, Haitao Huang, Vijay Dhanraj, Dave Hansen,
	Paul Menzel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)", "H. Peter Anvin",
	"open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
Subject: Re: [PATCH 1/6] x86/sgx: Do not consider unsanitized pages an error
References: <20220830031206.13449-1-jarkko@kernel.org>
	<20220830031206.13449-2-jarkko@kernel.org>
	<1f43e7b9-c101-3872-bd1b-add66933b285@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 31, 2022 at 04:55:24AM +0300, Jarkko Sakkinen wrote:
> On Tue, Aug 30, 2022 at 03:54:27PM -0700, Reinette Chatre wrote:
> > Hi Jarkko,
> >
> > On 8/29/2022 8:12 PM, Jarkko Sakkinen wrote:
> > > In sgx_init(), if misc_register() for the provision device fails, and
> > > neither sgx_drv_init() nor sgx_vepc_init() succeeds, then ksgxd will be
> > > prematurely stopped.
> >
> > I do not think misc_register() is required to fail for the scenario to
> > be triggered (rather use "or" than "and"?). Perhaps just
> > "In sgx_init(), if a failure is encountered after ksgxd is started
> > (via sgx_page_reclaimer_init()) ...".
>
> This would be the fixed version of the sentence:
>
> "
> In sgx_init(), if misc_register() fails or misc_register() succeeds but
> neither sgx_drv_init() nor sgx_vepc_init() succeeds, then ksgxd will be
> prematurely stopped. This may leave some unsanitized pages, which does
> not matter, because SGX will be disabled for the whole power cycle.
> "
>
> I want to keep the end states listed and not make it more abstract.
>
> The second sentence addresses the remark below.
>
> > To help the reader understand the subject of this patch it may help
> > to explain that prematurely stopping ksgxd may leave some
> > unsanitized pages, but that is not a problem since SGX cannot
> > be used on the platform anyway.
> >
> > > This triggers WARN_ON() because sgx_dirty_page_list ends up being
> > > non-empty, and dumps the call stack:
> > >
> >
> > Traces like below can be frowned upon. I recommend that you follow the
> > guidance in "Backtraces in commit mesages"(sic) in
> > Documentation/process/submitting-patches.rst.
> >
> > > [ 0.268592] WARNING: CPU: 6 PID: 83 at
> > > arch/x86/kernel/cpu/sgx/main.c:401 ksgxd+0x1b7/0x1d0
>
> Is this good enough? I had not actually spotted this section before but
> nice that it exists. Apparently has been added in 5.12.
>
> > >
> > > Ultimately this can crash the kernel, if the following is set:
> > >
> > >   /proc/sys/kernel/panic_on_warn
> > >
> > > Print a simple warning instead, and improve the output by printing the
> > > number of unsanitized pages, in order to provide debug informnation for
> > > future needs.
> >
> > informnation -> information
>
> +1
>
> >
> > ...
> >
> > > Link: https://lore.kernel.org/linux-sgx/20220825051827.246698-1-jarkko@kernel.org/T/#u
> > > Reported-by: Paul Menzel
> > > Tested-by: Paul Menzel
> > > Fixes: 51ab30eb2ad4 ("x86/sgx: Replace section->init_laundry_list with sgx_dirty_page_list")
> > > Signed-off-by: Jarkko Sakkinen
> >
> > Should this go to stable?
>
> I guess it should. The hard reason for this that it can panic
> the kernel.
>
> > >
> > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > > index 515e2a5f25bb..903100fcfce3 100644
> > > --- a/arch/x86/kernel/cpu/sgx/main.c
> > > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > > @@ -49,17 +49,20 @@ static LIST_HEAD(sgx_dirty_page_list);
> > >   * Reset post-kexec EPC pages to the uninitialized state. The pages are removed
> > >   * from the input list, and made available for the page allocator. SECS pages
> > >   * prepending their children in the input list are left intact.
> > > + *
> > > + * Contents of the @dirty_page_list must be thread-local, i.e.
> > > + * not shared by multiple threads.
> >
> > Did you intend to mention something about the needed locking here? It looks
> > like some information is lost during the move to the function description.
>
> Nothing about the locking that concerns the parameter, as the
> sentence defines clear constraints for the caller.
>
> >
> > >   */
> > > -static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
> > > +static int __sgx_sanitize_pages(struct list_head *dirty_page_list)
> > >  {
> > >  	struct sgx_epc_page *page;
> > > +	int left_dirty = 0;
> >
> > I do not know how many pages this code should be ready for but at least
> > this could handle more by being an unsigned int considering that it is
> > always positive ... maybe even unsigned long?
>
> I would go for 'long'. More information below.
>
> >
> > >  	LIST_HEAD(dirty);
> > >  	int ret;
> > >
> > > -	/* dirty_page_list is thread-local, no need for a lock: */
> > >  	while (!list_empty(dirty_page_list)) {
> > >  		if (kthread_should_stop())
> > > -			return;
> > > +			break;
> > >
> > >  		page = list_first_entry(dirty_page_list, struct sgx_epc_page, list);
> > >
> > > @@ -92,12 +95,14 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
> > >  		} else {
> > >  			/* The page is not yet clean - move to the dirty list. */
> > >  			list_move_tail(&page->list, &dirty);
> > > +			left_dirty++;
> > >  		}
> > >
> > >  		cond_resched();
> > >  	}
> > >
> > >  	list_splice(&dirty, dirty_page_list);
> > > +	return left_dirty;
> > >  }
> > >
> > >  static bool sgx_reclaimer_age(struct sgx_epc_page *epc_page)
> > > @@ -388,6 +393,8 @@ void sgx_reclaim_direct(void)
> > >
> > >  static int ksgxd(void *p)
> > >  {
> > > +	int left_dirty;
> > > +
> > >  	set_freezable();
> > >
> > >  	/*
> > > @@ -395,10 +402,10 @@ static int ksgxd(void *p)
> > >  	 * required for SECS pages, whose child pages blocked EREMOVE.
> > >  	 */
> > >  	__sgx_sanitize_pages(&sgx_dirty_page_list);
> > > -	__sgx_sanitize_pages(&sgx_dirty_page_list);
> > >
> > > -	/* sanity check: */
> > > -	WARN_ON(!list_empty(&sgx_dirty_page_list));
> > > +	left_dirty = __sgx_sanitize_pages(&sgx_dirty_page_list);
> > > +	if (left_dirty)
> > > +		pr_warn("%d unsanitized pages\n", left_dirty);
> > >
> > >  	while (!kthread_should_stop()) {
> > >  		if (try_to_freeze())
> >
> >
> > Reinette
>
> We need to return -ECANCELED on premature stop, and number of
> pages otherwise.
>
> In premature stop, nothing should be printed, as the number
> is by practical means a random number. Otherwise, it is an
> indicator of a bug in the driver, and therefore a non-zero
> number should be printed pr_err(), if that happens after the
> second call.

I.e. even though we print less we get more *information* about
what is going on inside the kernel. Warning is not correct for
either path IMHO.

BR, Jarkko
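
For illustration only, the return convention described above could be
modelled in userspace roughly as follows. This is a sketch under stated
assumptions, not the kernel code and not necessarily what the next
revision of the patch will look like: struct fake_page, sanitize_pass()
and report_second_pass() are hypothetical names, the stop flag stands in
for kthread_should_stop(), and fprintf() stands in for pr_err().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for an EPC page; not struct sgx_epc_page. */
struct fake_page {
	bool removable;	/* models whether removing the page would succeed */
};

/*
 * One sanitization pass over an array of pages.
 *
 * Returns -ECANCELED if a stop was requested before the pass finished
 * (the count would be meaningless), otherwise the number of pages that
 * are still dirty after the pass.
 */
long sanitize_pass(const struct fake_page *pages, long nr_pages,
		   const bool *stop_requested)
{
	long left_dirty = 0;
	long i;

	for (i = 0; i < nr_pages; i++) {
		if (*stop_requested)
			return -ECANCELED;

		if (!pages[i].removable)
			left_dirty++;
	}

	return left_dirty;
}

/*
 * Decide what to report after the second pass. Returns true if an
 * error message was printed.
 */
bool report_second_pass(long ret)
{
	/* Premature stop: print nothing. */
	if (ret == -ECANCELED)
		return false;

	/* Pages left dirty after a complete second pass point to a bug. */
	if (ret > 0) {
		fprintf(stderr, "%ld unsanitized pages\n", ret);
		return true;
	}

	return false;
}
```

Using 'long' for the count keeps a single return value usable for both
the negative error code and the page count, which is the reason for
preferring it over an unsigned type in this sketch.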