Date: Fri, 26 Mar 2021 23:43:10 +0100
From: Borislav Petkov
To: William Roche
Cc: linux-kernel@vger.kernel.org, Tony Luck, linux-edac@vger.kernel.org
Subject: Re: [PATCH v1] RAS/CEC: Memory Corrected Errors consistent event filtering
Message-ID: <20210326224310.GL25229@zn.tnic>
References: <1616783429-6793-1-git-send-email-william.roche@oracle.com> <20210326190242.GI25229@zn.tnic>

On Fri, Mar 26, 2021 at 11:24:43PM +0100, William Roche wrote:
> What we want is to make cec_add_elem() to return !0 value only
> when the given pfn triggered an action, so that its callers should
> log the error.

No, this is not what the CEC does - it collects those errors and when it
reaches the threshold for any pfn, it offlines the corresponding page.

I know, the comment above talks about:

 * That error event entry causes cec_add_elem() to return !0 value and thus
 * signal to its callers to log the error.

but it doesn't do that.

Frankly, I don't see the point of logging the error - it already says

	pr_err("Soft-offlining pfn: 0x%llx\n", pfn);

which pfn it has offlined. And that is probably only mildly interesting
to people - so what, 4K got offlined, servers have so much memory
nowadays.

The only moment one should start worrying is if one gets those pretty
often but then you're probably better off simply scheduling maintenance
and replacing the faulty DIMM - problem solved.

> What I'm expecting from ras_cec is to "hide" CEs until they reach the
> action threshold where an action is tried against the impacted PFN,

That it does.

> and it's now the time to log the error with the entire notifiers
> chain.

And I'm not sure why we'd want to do that. It simply offlines the page.

But maybe you could explain what you're trying to achieve...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
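
For readers following the thread: the behaviour described above (count
corrected errors per pfn, offline the page once a threshold is reached)
can be sketched in userspace C roughly as below. This is only an
illustration under assumptions, not the kernel's implementation. The
names sketch_cec, sketch_entry, sketch_add_elem(), SKETCH_SLOTS and
SKETCH_THRESHOLD are hypothetical, the table is a plain linear-search
array, the "offlining" is just the log line quoted in the mail, and the
return convention is the sketch's own and says nothing about what
cec_add_elem() actually returns.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_SLOTS     8	/* hypothetical table size */
#define SKETCH_THRESHOLD 3	/* hypothetical action threshold */

/* One tracked page frame and the number of corrected errors seen on it. */
struct sketch_entry {
	uint64_t pfn;
	unsigned int count;
	int used;
};

struct sketch_cec {
	struct sketch_entry slot[SKETCH_SLOTS];
	unsigned int offlined;	/* pages "offlined" so far */
};

/*
 * Record one corrected error for @pfn.
 *
 * Sketch-only return convention (an assumption, not the kernel's):
 *    1  this error reached the threshold and the page was "offlined"
 *    0  the error was only counted
 *   -1  the table is full and the error could not be recorded
 */
static int sketch_add_elem(struct sketch_cec *c, uint64_t pfn)
{
	struct sketch_entry *e = NULL, *free_slot = NULL;
	size_t i;

	/* Look for an existing entry, remembering the first free slot. */
	for (i = 0; i < SKETCH_SLOTS; i++) {
		if (c->slot[i].used && c->slot[i].pfn == pfn) {
			e = &c->slot[i];
			break;
		}
		if (!c->slot[i].used && !free_slot)
			free_slot = &c->slot[i];
	}

	if (!e) {
		if (!free_slot)
			return -1;
		e = free_slot;
		e->pfn = pfn;
		e->count = 0;
		e->used = 1;
	}

	/* Below the threshold the error stays hidden: it is only counted. */
	if (++e->count < SKETCH_THRESHOLD)
		return 0;

	/* Threshold reached: stand-in for soft-offlining the page. */
	printf("Soft-offlining pfn: 0x%llx\n", (unsigned long long)pfn);
	e->used = 0;
	c->offlined++;
	return 1;
}
```

With a zero-initialized struct sketch_cec, the first two calls for a
given pfn are silently counted, and the third prints the offlining
message and frees the slot, so a later error on the same pfn starts
counting from scratch.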