To: "Hansen, Dave", "linux-sgx@vger.kernel.org", "x86@kernel.org",
    "bp@alien8.de", "jarkko@kernel.org", "dave.hansen@linux.intel.com",
    "mingo@redhat.com", "tglx@linutronix.de", "hpa@zytor.com",
    "linux-kernel@vger.kernel.org", "Huang, Kai"
Cc: "kristen@linux.intel.com", "Chatre, Reinette", "stable@vger.kernel.org",
    "Christopherson,, Sean"
Subject: Re: [PATCH] x86/sgx: fix a NULL pointer
References: <20230717202938.94989-1-haitao.huang@linux.intel.com>
    <520111c9ccdd7356f9eaf20013e3e3c75b06398e.camel@intel.com>
Date: Wed, 26 Jul 2023 11:56:16 -0500
From: "Haitao Huang"
Organization: Intel
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 20 Jul 2023 19:52:22 -0500, Huang, Kai wrote:

> On Fri, 2023-07-21 at 00:32 +0000, Huang, Kai wrote:
>> On Wed, 2023-07-19 at 08:53 -0500, Haitao Huang wrote:
>> > Hi Dave and Kai
>> > On Tue, 18 Jul 2023 19:21:54 -0500, Dave Hansen wrote:
>> >
>> > > On 7/18/23 17:14, Huang, Kai wrote:
>> > > > Also perhaps the patch title is too vague. Adding more information
>> > > > doesn't hurt I think, e.g., mentioning it is a fix for NULL pointer
>> > > > dereference in the EAUG flow.
>> > >
>> > > Yeah, let's say something like:
>> > >
>> > >     x86/sgx: Resolve SECS reclaim vs. page fault race
>> > >
>> > The patch is not to resolve SECS vs #PF race though the race is a
>> > necessary condition to cause the NULL pointer. The same condition does
>> > not cause NULL pointer in the ELDU path of #PF, only in EAUG path of #PF.
>> >
>> > And the issue really is the NULL pointer not checked and fix was to
>> > reuse the same code to reload SECS in ELDU code path for EAUG code path
>> >
>> > How about this:
>> >
>> > x86/sgx: Reload reclaimed SECS for EAUG on #PF
>> >
>> > or
>> >
>> > x86/sgx: Fix a NULL pointer to SECS used for EAUG on #PF
>> >
>>
>> Perhaps you can add "EAUG" part to what Dave suggested?
>>
>>     x86/sgx: Resolves SECS reclaim vs. page fault race on EAUG
>>
>> (assuming Dave is fine with this :-))

Sure, I can use this too.

> Btw, do you have a real call trace? If you have, I think you can add
> that to the changelog too because that catches people's eye immediately.

Previously I was not able to reproduce without SGX cgroup patches.
Now I managed to get a trace with a QEMU setup with small EPC (8M), large
RAM (128G) and 128 vCPUs:

[ 1682.914263] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 1682.922966] #PF: supervisor read access in kernel mode
[ 1682.929115] #PF: error_code(0x0000) - not-present page
[ 1682.935264] PGD 0 P4D 0
[ 1682.938383] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 1682.943620] CPU: 43 PID: 2681 Comm: test_sgx Not tainted 6.3.0-rc4sgxcet #12
[ 1682.951989] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[ 1682.965504] RIP: 0010:sgx_encl_eaug_page+0xc7/0x210
[ 1682.971359] Code: 25 49 8b 96 98 04 00 00 48 8d 40 48 48 89 42 08 48 89 56 48 49 8d 96 98 04 00 00 48 89 56 50 49 89 86 98 04 00 00 49 8b 46 60 <8b> 10 48 c1 e2 05 488
[ 1682.993330] RSP: 0000:ffffb2e64725bc00 EFLAGS: 00010246
[ 1682.999585] RAX: 0000000000000000 RBX: ffff987e5abac428 RCX: 0000000000000000
[ 1683.008059] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff987e61aee000
[ 1683.016533] RBP: ffffb2e64725bcf0 R08: 0000000000000000 R09: ffffb2e64725bb58
[ 1683.025008] R10: 0000000000000000 R11: 00007f3f5c418fff R12: ffff987e61aee020
[ 1683.033479] R13: ffff987e505bc080 R14: ffff987e61aee000 R15: ffffb2e6420fcb20
[ 1683.041949] FS: 00007f3f5cb48740(0000) GS:ffff989cfe8c0000(0000) knlGS:0000000000000000
[ 1683.051540] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1683.058478] CR2: 0000000000000000 CR3: 0000000115896002 CR4: 0000000000770ee0
[ 1683.067018] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1683.075539] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1683.084085] PKRU: 55555554
[ 1683.087465] Call Trace:
[ 1683.090547] <TASK>
[ 1683.093220] ? __kmem_cache_alloc_node+0x16a/0x440
[ 1683.099034] ? xa_load+0x6e/0xa0
[ 1683.103038] sgx_vma_fault+0x119/0x230
[ 1683.107630] __do_fault+0x36/0x140
[ 1683.111828] do_fault+0x12f/0x400
[ 1683.115928] __handle_mm_fault+0x728/0x1110
[ 1683.121050] handle_mm_fault+0x105/0x310
[ 1683.125850] do_user_addr_fault+0x1ee/0x750
[ 1683.130957] ? __this_cpu_preempt_check+0x13/0x20
[ 1683.136667] exc_page_fault+0x76/0x180
[ 1683.141265] asm_exc_page_fault+0x27/0x30
[ 1683.146160] RIP: 0033:0x7ffc6496beea
[ 1683.150563] Code: 43 48 8b 4d 10 48 c7 c3 28 00 00 00 48 83 3c 19 00 75 31 48 83 c3 08 48 81 fb 00 01 00 00 75 ec 48 8b 19 48 8d 0d 00 00 00 00 <0f> 01 d7 48 8b 5d 101
[ 1683.172773] RSP: 002b:00007ffc64935b68 EFLAGS: 00000202
[ 1683.179138] RAX: 0000000000000003 RBX: 00007f3800000000 RCX: 00007ffc6496beea
[ 1683.187675] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 1683.196200] RBP: 00007ffc64935b70 R08: 0000000000000000 R09: 0000000000000000
[ 1683.204724] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1683.213310] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1683.221850] </TASK>
[ 1683.224636] Modules linked in: isofs intel_rapl_msr intel_rapl_common binfmt_misc kvm_intel nls_iso8859_1 kvm ppdev irqbypass input_leds parport_pc joydev parport rapi
[ 1683.291173] CR2: 0000000000000000
[ 1683.295271] ---[ end trace 0000000000000000 ]---

I'll add this to the commit as well.

Thanks
Haitao
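
For anyone triaging a trace like the one above, here is a minimal sketch (not part of the patch) that pulls the faulting kernel symbol and the CR2 fault address out of oops text. The helper name summarize_oops and both regular expressions are my own assumptions for illustration; they only match the line shapes seen in this particular trace.

```python
import re

# Hypothetical helper for illustration only; not part of the kernel tree.
# Matches a kernel-mode RIP line (code segment 0010), e.g.
#   RIP: 0010:sgx_encl_eaug_page+0xc7/0x210
RIP_RE = re.compile(
    r"RIP: 0010:(?P<sym>[\w.]+)\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)
# Matches the faulting address register, e.g. CR2: 0000000000000000
CR2_RE = re.compile(r"CR2: (?P<addr>[0-9a-f]{16})")


def summarize_oops(text):
    """Return (symbol, offset, size, cr2) from oops text, or None if absent."""
    rip = RIP_RE.search(text)
    cr2 = CR2_RE.search(text)
    if rip is None or cr2 is None:
        return None
    return (
        rip.group("sym"),
        int(rip.group("off"), 16),
        int(rip.group("size"), 16),
        int(cr2.group("addr"), 16),
    )


# Two lines copied from the trace above.
sample = """\
[ 1682.965504] RIP: 0010:sgx_encl_eaug_page+0xc7/0x210
[ 1683.058478] CR2: 0000000000000000 CR3: 0000000115896002 CR4: 0000000000770ee0
"""

summary = summarize_oops(sample)
print(summary)
```

A CR2 of zero together with the symbol sgx_encl_eaug_page is what points at a NULL dereference in the EAUG path. The user-mode RIP line (segment 0033) is deliberately not matched.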