Date: Fri, 2 Oct 2020 20:02:18 +0200
From: Borislav Petkov
To: Shiju Jose
Cc: James Morse, "linux-edac@vger.kernel.org",
    "linux-acpi@vger.kernel.org", "linux-kernel@vger.kernel.org",
    "tony.luck@intel.com", "rjw@rjwysocki.net", "lenb@kernel.org",
    Linuxarm
Subject: Re: [RFC PATCH 0/7] RAS/CEC: Extend CEC for errors count check on short time period
Message-ID: <20201002180125.GD17436@zn.tnic>
References: <20201002122235.1280-1-shiju.jose@huawei.com>
    <20201002124352.GC17436@zn.tnic>
    <19a8cc62b11c49e9b584857a6a6664e5@huawei.com>
    <59950d44-906b-684f-c876-e09c76e5f827@arm.com>
In-Reply-To: <59950d44-906b-684f-c876-e09c76e5f827@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Oct 02, 2020 at 06:33:17PM +0100, James Morse wrote:
> > I think adding the CPU error collection to the kernel
> > has the following advantages,
> > 1. The CPU error collection and isolation would not be active if the
> > rasdaemon stopped running or not running on a machine.

Wasn't there this thing called systemd which promised that it would
restart daemons when they fail?

And even if it is not there, you can always do your own cronjob which
checks rasdaemon presence and restarts it if it has died and sends a
mail to the admin to check why it had died.

Everything else I've trimmed but James has put it a lot more eloquently
than me and I cannot agree more with what he says. Doing this in
userspace is better in every aspect you can think of.

The current CEC thing runs in the kernel because it has a completely
different purpose - to limit corrected error reports which turn into
very expensive support calls for errors which were corrected but people
simply don't get that they were corrected. Instead, they throw hands in
the air and go "OMG, my hardware is failing".

Where those are, as James says:

> These are corrected errors. Nothing has gone wrong.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette