From: Dmitry Vyukov
Date: Mon, 12 Feb 2024 11:02:38 +0100
Subject: Spurious SIGSEGV with rseq/membarrier
To: Mathieu Desnoyers, Peter Oskolkov, Peter Zijlstra, "Paul E. McKenney", Boqun Feng
Cc: LKML, Chris Kennelly
X-Mailing-List: linux-kernel@vger.kernel.org

Hi rseq/membarrier maintainers,

I've spent a bit of time debugging some spurious SIGSEGVs, and they
turned out to be caused by an interesting interaction between page
faults, rseq and membarrier. The manifestation is that
membarrier(EXPEDITED_RSEQ) is effectively not working for a thread
(it doesn't restart the thread's rseq critical section).

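For readers who have not used this command, here is a minimal sketch of how a process might register for and issue the rseq fence. It assumes that "membarrier(EXPEDITED_RSEQ)" above refers to MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ from <linux/membarrier.h> (Linux 5.10 or later). The helper names sys_membarrier, rseq_fence_supported, rseq_fence_register and rseq_fence are hypothetical and are not taken from tcmalloc.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc ships no membarrier() wrapper, so go through syscall(2).
 * Helper name is hypothetical. */
static int sys_membarrier(int cmd, unsigned int flags, int cpu_id)
{
    return (int)syscall(__NR_membarrier, cmd, flags, cpu_id);
}

/* Returns nonzero if the running kernel advertises the rseq fence.
 * MEMBARRIER_CMD_QUERY returns a bitmask of supported commands. */
int rseq_fence_supported(void)
{
    int mask = sys_membarrier(MEMBARRIER_CMD_QUERY, 0, 0);
    return mask >= 0 && (mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ) != 0;
}

/* One-time registration; private expedited commands are refused
 * for processes that have not registered. Returns 0 on success. */
int rseq_fence_register(void)
{
    return sys_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ, 0, 0);
}

/* Ask the kernel to restart any rseq critical section currently
 * running in the other threads of this process. Returns 0 on success. */
int rseq_fence(void)
{
    return sys_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ, 0, 0);
}
```

A process would call rseq_fence_register() once at startup and rseq_fence() each time it needs running critical sections to be restarted. Without prior registration the fence call fails with EPERM.
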
The real code is inside of tcmalloc and relates to the "slabs resizing"
procedure:
https://github.com/google/tcmalloc/blob/39775a2d57969eda9497f3673421766bc1e886a0/tcmalloc/internal/percpu_tcmalloc.cc#L176

The essence is:
Threads use a data structure inside of an rseq critical section. The
resize procedure replaces the old data structure with a new one, uses
a membarrier to ensure that threads don't use the old one any more,
and unmaps/mprotects the pages that back the old data structure. At
this point no threads use the old data structure anymore and no
threads should get SIGSEGV.

However, what happens is as follows:
A thread gets a minor page fault on the old data structure inside of
an rseq critical section. The page fault handler re-enables preemption
and allows other threads to be scheduled (I am not sure this is
actually important, but that's what I observed in all traces, and it
makes the failure scenario much more likely). Now the resize procedure
is executed: it replaces all pointers to the old data structure with
pointers to the new one, executes the membarrier and unmaps the old
data structure. Now the page fault handler resumes, verifies VMA
protection and finds out that the VMA is indeed inaccessible, so the
page fault is not a minor one but should rather result in SIGSEGV,
and sends SIGSEGV.

Note: at this point the thread has an rseq restart pending (from both
preemption and membarrier), and the restart indeed happens as part of
SIGSEGV delivery, but it's already too late.

I think the page fault handling should give the rseq restart
preference in this case, and realize the thread shouldn't be executing
the faulting instruction in the first place. In that case the thread
would be restarted and would access the new data structure after the
restart.

Unmapping/mprotecting the old data in this case is useful for 2 reasons:
1. It allows releasing memory (not possible to do reliably now).
2. It allows ensuring there are no logical bugs in the user-space code
and threads don't access the old data when they shouldn't. I was
actually tracking a potential bug in user-space code, but after
mprotecting the old data, I started seeing more and more confusing
crashes (this spurious SIGSEGV).
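
To make the sequence concrete, the resize procedure described above can be sketched roughly as follows. This is a simplified illustration and not the tcmalloc code. The names current_slabs, slabs_init, slabs_current and slabs_resize are hypothetical, a single resizing thread is assumed, and the rseq critical section on the reader side is not shown. The last step in slabs_resize is the one that can race with a page fault already in flight on the old region.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/membarrier.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical stand-in for the slabs. Readers would load this pointer
 * inside an rseq critical section (not shown here). */
static _Atomic(void *) current_slabs;
static size_t current_size; /* written only by the single resizer */

static void *map_region(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Registers for the rseq fence and maps the initial region. */
int slabs_init(size_t size)
{
    if (syscall(__NR_membarrier,
                MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ, 0, 0) != 0)
        return -1;
    void *p = map_region(size);
    if (p == NULL)
        return -1;
    current_size = size;
    atomic_store(&current_slabs, p);
    return 0;
}

void *slabs_current(void)
{
    return atomic_load(&current_slabs);
}

int slabs_resize(size_t new_size)
{
    void *new_slabs = map_region(new_size);
    if (new_slabs == NULL)
        return -1;

    /* Step 1: publish the new data structure. */
    void *old = atomic_exchange(&current_slabs, new_slabs);
    size_t old_size = current_size;
    current_size = new_size;
    assert(old != NULL); /* slabs_init() must have succeeded first */

    /* Step 2: restart every rseq critical section that may still hold
     * the old pointer. On failure the old region is left mapped, since
     * we cannot show that nobody uses it. */
    if (syscall(__NR_membarrier,
                MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ, 0, 0) != 0)
        return -1;

    /* Step 3: make the old region inaccessible. A thread that took a
     * page fault on it before step 2 can still be sent SIGSEGV here,
     * which is the problem reported in this mail. */
    return mprotect(old, old_size, PROT_NONE);
}
```
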