From: Dmitrii Kuvaiskii
To: jarkko@kernel.org
Cc: dave.hansen@linux.intel.com, dmitrii.kuvaiskii@intel.com,
    haitao.huang@linux.intel.com, kai.huang@intel.com, kailun.qin@intel.com,
    linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org,
    mona.vij@intel.com, reinette.chatre@intel.com
Subject: Re: [PATCH 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows
Date: Tue, 30 Apr 2024 07:35:55 -0700
Message-Id: <20240430143555.893316-1-dmitrii.kuvaiskii@intel.com>
Organization: Intel Deutschland GmbH - Registered Address: Am Campeon 10, 85579 Neubiberg, Germany

On Mon, Apr 29, 2024 at 04:06:39PM +0300, Jarkko Sakkinen wrote:
> On Mon Apr 29, 2024 at 1:43 PM EEST, Dmitrii Kuvaiskii wrote:
> > SGX runtimes such as Gramine may implement EDMM-based lazy allocation of
> > enclave pages and may support MADV_DONTNEED semantics [1]. The former
> > implies #PF-based page allocation, and the latter implies the usage of
> > SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl.
> >
> > A trivial program like below (run under Gramine and with EDMM enabled)
> > stresses these two flows in the SGX driver and hangs:
> >
> >   /* repeatedly touch different enclave pages at random and mix with
> >    * `madvise(MADV_DONTNEED)` to stress EAUG/EREMOVE flows */
> >   static void* thread_func(void* arg) {
> >     size_t num_pages = 0xA000 / page_size;
> >     for (int i = 0; i < 5000; i++) {
> >       size_t page = get_random_ulong() % num_pages;
> >       char data = READ_ONCE(((char*)arg)[page * page_size]);
> >
> >       page = get_random_ulong() % num_pages;
> >       madvise(arg + page * page_size, page_size, MADV_DONTNEED);
> >     }
> >   }
> >
> >   addr = mmap(NULL, 0xA000, PROT_READ | PROT_WRITE, MAP_ANONYMOUS, -1, 0);
> >   pthread_t threads[16];
> >   for (int i = 0; i < 16; i++)
> >     pthread_create(&threads[i], NULL, thread_func, addr);
>
> I'm not convinced that kernel is the problem here but it could be also
> how Gramine is implemented.
>
> So maybe you could make a better case of that. The example looks a bit
> artificial to me.

I believe that these are bugs in the kernel (in the SGX driver). I
provided more detailed descriptions of the races and the ensuing bugs in
the other two replies; please check them.

The example is a stress test written to debug very infrequent hangs of
real-world applications that are run with Gramine, EDMM, and two
optimizations (lazy allocation and MADV_DONTNEED semantics). We observed
hangs of Node.js, PyTorch, R, iperf, Blender, and Nginx. To root-cause
these hangs, we wrote this artificial stress test. This test succeeds on
vanilla Linux, so ideally it should also pass on Gramine.

Please also note that the optimizations of lazy allocation and
MADV_DONTNEED provide a significant performance improvement for some
workloads that run on Gramine. For example, a Java workload with a 16GB
enclave size sees an approx. 57x improvement in total runtime. Thus, we
consider it important to permit these optimizations in Gramine, which
IIUC requires bug fixes in the SGX driver.
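
For reference, the statement that the stress test succeeds on vanilla Linux can be checked with a standalone sketch like the one below. It is not the exact test from the Gramine PR. Several pieces are assumptions made to get a self-contained build: `run_stress()`, `struct worker`, and `next_random()` (a plain xorshift generator standing in for `get_random_ulong()`) are hypothetical names, a volatile read stands in for `READ_ONCE()`, and the region is sized as 10 pages (0xA000 bytes with 4 KiB pages). `MAP_PRIVATE` is added because Linux `mmap()` needs one of `MAP_PRIVATE` or `MAP_SHARED`.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define NUM_PAGES   10    /* 0xA000 bytes with 4 KiB pages, as in the quoted test */
#define NUM_THREADS 16
#define NUM_ITERS   5000

static size_t page_size;

/* Hypothetical per-thread state: shared region plus a private RNG state. */
struct worker {
    char *base;
    uint64_t rng;
    int failed;
};

/* xorshift64, an assumed stand-in for get_random_ulong(); state must be nonzero. */
static uint64_t next_random(uint64_t *state) {
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return *state = x;
}

/* Touch a random page, then drop another random page with MADV_DONTNEED. */
static void *thread_func(void *arg) {
    struct worker *w = arg;
    for (int i = 0; i < NUM_ITERS; i++) {
        size_t page = next_random(&w->rng) % NUM_PAGES;
        (void)*(volatile char *)(w->base + page * page_size);

        page = next_random(&w->rng) % NUM_PAGES;
        if (madvise(w->base + page * page_size, page_size, MADV_DONTNEED) != 0)
            w->failed = 1;
    }
    return NULL;
}

/* Returns 0 if all threads ran to completion without an madvise() error. */
int run_stress(void) {
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    page_size = (size_t)ps;
    size_t region = NUM_PAGES * page_size;

    char *addr = mmap(NULL, region, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return -1;

    pthread_t threads[NUM_THREADS];
    struct worker workers[NUM_THREADS];
    int started = 0, rc = 0;

    for (int i = 0; i < NUM_THREADS; i++) {
        workers[i].base = addr;
        /* Odd multiplier keeps every seed nonzero for i + 1 >= 1. */
        workers[i].rng = (uint64_t)(i + 1) * 0x9E3779B97F4A7C15ULL;
        workers[i].failed = 0;
        if (pthread_create(&threads[i], NULL, thread_func, &workers[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }

    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        if (workers[i].failed)
            rc = -1;
    }

    munmap(addr, region);
    return rc;
}
```

Calling `run_stress()` from a small `main()` and building with `-pthread` should be enough to run it natively. Under Gramine with EDMM, the same access pattern would exercise the #PF-based EAUG path and the SGX_IOC_ENCLAVE_REMOVE_PAGES path instead.
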
You can find more info at
https://github.com/gramineproject/gramine/pull/1513.

Which parts do you consider artificial, and how could I modify the stress
test?

--
Dmitrii Kuvaiskii