Date: Thu, 12 May 2022 08:36:48 +1200
From: Michael Cree
To: Yu Zhao
Cc: Linux-MM, linux-kernel, Hillf Danton, Joonsoo Kim
Subject: Re: Alpha: rare random memory corruption/segfault in user space bisected
References: <20220507015646.5377-1-hdanton@sina.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, May 07, 2022 at 11:27:15AM -0700, Yu Zhao wrote:
> On Fri, May 6, 2022 at 6:57 PM Hillf Danton wrote:
> >
> > On Sat, 7 May 2022 09:21:25 +1200 Michael Cree wrote:
> > > Alpha kernel has been exhibiting rare and random memory
> > > corruptions/segaults in user space since the 5.9.y kernel. First seen
> > > on the Debian Ports build daemon when running 5.10.y kernel resulting
> > > in the occasional (one or two a day) build failures with gcc ICEs either
> > > due to self detected corrupt memory structures or segfaults. Have been
> > > running 5.8.y kernel without such problems for over six months.
> > >
> > > Tried bisecting last year but went off track with incorrect good/bad
> > > determinations due to rare nature of bug. After trying a 5.16.y kernel
> > > early this year and seen the bug is still present retried the bisection
> > > and have got to:
> > >
> > > aae466b0052e1888edd1d7f473d4310d64936196 is the first bad commit
> > > commit aae466b0052e1888edd1d7f473d4310d64936196
> > > Author: Joonsoo Kim
> > > Date: Tue Aug 11 18:30:50 2020 -0700
> > >
> > > mm/swap: implement workingset detection for anonymous LRU
>
> This commit seems innocent to me. While not ruling out anything, i.e.,
> this commit, compiler, qemu, userspace itself, etc., my wild guess is
> the problem is memory barrier related. Two lock/unlock pairs, which
> imply two full barriers, were removed. This is not a small deal on
> Alpha, since it imposes no constraints on cache coherency, AFAIK.
>
> Can you please try the attached patch on top of this commit? Thanks!

Thanks, I have that running now for a day without any problem showing
up, but that's not long enough to be sure it has fixed the problem.
Will get back to you after another day or two of testing.

Cheers,
Michael.