2013-04-11 03:29:29

by Mitsuhiro Tanino

[permalink] [raw]
Subject: [RFC Patch 2/2] mm: Add parameters to limit a rate of outputting memory error messages

This patch introduces new sysctl interfaces in order to limit
a rate of outputting memory error messages.

- vm.memory_failure_print_ratelimit:
Specify the minimum length of time between messages.
By default the rate limiting is disabled.

- vm.memory_failure_print_ratelimit_burst:
Specify the number of messages we can send before rate limiting.


Signed-off-by: Mitsuhiro Tanino <[email protected]>
---

diff --git a/a/Documentation/sysctl/vm.txt b/b/Documentation/sysctl/vm.txt
index 7dad994..eea6f4d 100644
--- a/a/Documentation/sysctl/vm.txt
+++ b/b/Documentation/sysctl/vm.txt
@@ -36,6 +36,8 @@ Currently, these files are in /proc/sys/vm:
- max_map_count
- memory_failure_dirty_panic
- memory_failure_early_kill
+- memory_failure_print_ratelimit
+- memory_failure_print_ratelimit_burst
- memory_failure_recovery
- min_free_kbytes
- min_slab_ratio
@@ -358,6 +360,30 @@ Applications can override this setting individually with the PR_MCE_KILL prctl

==============================================================

+memory_failure_print_ratelimit:
+
+Error messages related data lost which come from truncating
+dirty page cache are rate limited.
+memory_failure_print_ratelimit specifies the minimum length of
+time between these messages (in jiffies), by default the rate
+limiting is disabled.
+
+If a value is set to 5, we allow one error message every 5 seconds.
+
+==============================================================
+
+memory_failure_print_ratelimit_burst:
+
+While long term we enforce one message per
+memory_failure_print_ratelimit seconds, we do allow a burst of
+messages to pass through.
+memory_failure_print_ratelimit_burst specifies the number of
+messages we can send before rate limiting kicks in.
+If memory_failure_print_ratelimit is set to 0, this parameter
+is ineffective.
+
+==============================================================
+
memory_failure_recovery

Enable memory failure recovery (when supported by the platform)
diff --git a/a/include/linux/mm.h b/b/include/linux/mm.h
index 0025882..ca27bd9 100644
--- a/a/include/linux/mm.h
+++ b/b/include/linux/mm.h
@@ -1721,6 +1721,7 @@ extern int unpoison_memory(unsigned long pfn);
extern int sysctl_memory_failure_dirty_panic;
extern int sysctl_memory_failure_early_kill;
extern int sysctl_memory_failure_recovery;
+extern struct ratelimit_state sysctl_memory_failure_print_ratelimit;
extern void shake_page(struct page *p, int access);
extern atomic_long_t mce_bad_pages;
extern int soft_offline_page(struct page *page, int flags);
diff --git a/a/kernel/sysctl.c b/b/kernel/sysctl.c
index 452dd80..0703c2e 100644
--- a/a/kernel/sysctl.c
+++ b/b/kernel/sysctl.c
@@ -1421,6 +1421,20 @@ static struct ctl_table vm_table[] = {
.extra1 = &zero,
.extra2 = &one,
},
+ {
+ .procname = "memory_failure_print_ratelimit",
+ .data = &sysctl_memory_failure_print_ratelimit.interval,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_jiffies,
+ },
+ {
+ .procname = "memory_failure_print_ratelimit_burst",
+ .data = &sysctl_memory_failure_print_ratelimit.burst,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec,
+ },
#endif
{ }
};
diff --git a/a/mm/memory-failure.c b/b/mm/memory-failure.c
index 6d3c0ed..ce5bb1a 100644
--- a/a/mm/memory-failure.c
+++ b/b/mm/memory-failure.c
@@ -55,6 +55,7 @@
#include <linux/memory_hotplug.h>
#include <linux/mm_inline.h>
#include <linux/kfifo.h>
+#include <linux/ratelimit.h>
#include "internal.h"

int sysctl_memory_failure_dirty_panic __read_mostly = 0;
@@ -78,6 +79,16 @@ EXPORT_SYMBOL_GPL(hwpoison_filter_dev_minor);
EXPORT_SYMBOL_GPL(hwpoison_filter_flags_mask);
EXPORT_SYMBOL_GPL(hwpoison_filter_flags_value);

+/*
+ * This enforces a rate limit for outputting error message.
+ * The default interval is set to "0" HZ. This means that
+ * outputting error message is not limited by default.
+ * The default burst is set to "10". This parameter can control
+ * to output number of messages per interval.
+ * If interval is set to "0", the burst is ineffective.
+ */
+DEFINE_RATELIMIT_STATE(sysctl_memory_failure_print_ratelimit, 0 * HZ, 10);
+
static int hwpoison_filter_dev(struct page *p)
{
struct address_space *mapping;
@@ -622,13 +633,16 @@ static int me_pagecache_dirty(struct page *p, unsigned long pfn)
SetPageError(p);
if (mapping) {
/* Print more information about the file. */
- if (mapping->host != NULL && S_ISREG(mapping->host->i_mode))
- pr_info("MCE %#lx: File was corrupted: Dev:%s Inode:%lu Offset:%lu\n",
- page_to_pfn(p), mapping->host->i_sb->s_id,
- mapping->host->i_ino, page_index(p));
- else
- pr_info("MCE %#lx: A dirty page cache was corrupted.\n",
- page_to_pfn(p));
+ if (__ratelimit(&sysctl_memory_failure_print_ratelimit)) {
+ if (mapping->host != NULL &&
+ S_ISREG(mapping->host->i_mode))
+ pr_info("MCE %#lx: File was corrupted: Dev:%s Inode:%lu Offset:%lu\n",
+ page_to_pfn(p), mapping->host->i_sb->s_id,
+ mapping->host->i_ino, page_index(p));
+ else
+ pr_info("MCE %#lx: A dirty page cache was corrupted.\n",
+ page_to_pfn(p));
+ }

/*
* IO error will be reported by write(), fsync(), etc.


2013-04-11 07:32:12

by Naoya Horiguchi

[permalink] [raw]
Subject: Re: [RFC Patch 2/2] mm: Add parameters to limit a rate of outputting memory error messages

On Thu, Apr 11, 2013 at 12:59:38AM -0400, Naoya Horiguchi wrote:
> Hi Tanino-san,
>
> On Thu, Apr 11, 2013 at 12:27:15PM +0900, Mitsuhiro Tanino wrote:
> > This patch introduces new sysctl interfaces in order to limit
> > a rate of outputting memory error messages.
> >
> > - vm.memory_failure_print_ratelimit:
> > Specify the minimum length of time between messages.
> > By default the rate limiting is disabled.
> >
> > - vm.memory_failure_print_ratelimit_burst:
> > Specify the number of messages we can send before rate limiting.
> >
> > Signed-off-by: Mitsuhiro Tanino <[email protected]>
...
> > @@ -78,6 +79,16 @@ EXPORT_SYMBOL_GPL(hwpoison_filter_dev_minor);
> > EXPORT_SYMBOL_GPL(hwpoison_filter_flags_mask);
> > EXPORT_SYMBOL_GPL(hwpoison_filter_flags_value);
> >
> > +/*
> > + * This enforces a rate limit for outputting error message.
> > + * The default interval is set to "0" HZ. This means that
> > + * outputting error message is not limited by default.
> > + * The default burst is set to "10". This parameter can control
> > + * to output number of messages per interval.
> > + * If interval is set to "0", the burst is ineffective.
> > + */
> > +DEFINE_RATELIMIT_STATE(sysctl_memory_failure_print_ratelimit, 0 * HZ, 10);
> > +
> > static int hwpoison_filter_dev(struct page *p)
> > {
> > struct address_space *mapping;
> > @@ -622,13 +633,16 @@ static int me_pagecache_dirty(struct page *p, unsigned long pfn)
> > SetPageError(p);
> > if (mapping) {
> > /* Print more information about the file. */
> > - if (mapping->host != NULL && S_ISREG(mapping->host->i_mode))
> > - pr_info("MCE %#lx: File was corrupted: Dev:%s Inode:%lu Offset:%lu\n",
> > - page_to_pfn(p), mapping->host->i_sb->s_id,
> > - mapping->host->i_ino, page_index(p));
> > - else
> > - pr_info("MCE %#lx: A dirty page cache was corrupted.\n",
> > - page_to_pfn(p));
> > + if (__ratelimit(&sysctl_memory_failure_print_ratelimit)) {
> > + if (mapping->host != NULL &&
> > + S_ISREG(mapping->host->i_mode))
> > + pr_info("MCE %#lx: File was corrupted: Dev:%s Inode:%lu Offset:%lu\n",
> > + page_to_pfn(p), mapping->host->i_sb->s_id,
> > + mapping->host->i_ino, page_index(p));
> > + else
> > + pr_info("MCE %#lx: A dirty page cache was corrupted.\n",
> > + page_to_pfn(p));
> > + }
> >
> > /*
> > * IO error will be reported by write(), fsync(), etc.

I don't think it's enough to do ratelimit only for me_pagecache_dirty().
When tons of memory errors flood, all of printk()s in memory error handler
can print out tons of messages.

Thanks,
Naoya

2013-04-11 14:00:17

by Andi Kleen

[permalink] [raw]
Subject: Re: [RFC Patch 2/2] mm: Add parameters to limit a rate of outputting memory error messages

> I don't think it's enough to do ratelimit only for me_pagecache_dirty().
> When tons of memory errors flood, all of printk()s in memory error handler
> can print out tons of messages.

Note that when you really have a flood of uncorrected errors you'll
likely die soon anyways as something unrecoverable is very likely to
happen. Error memory recovery cannot fix large scale memory corruptions,
just the rare events that slip through all the other memory error correction
schemes.

So I wouldn't worry too much about that.

The flooding problem is typically more with corrected error reporting.

-Andi

--
[email protected] -- Speaking for myself only.

2013-04-11 14:47:15

by Naoya Horiguchi

[permalink] [raw]
Subject: Re: [RFC Patch 2/2] mm: Add parameters to limit a rate of outputting memory error messages

On Thu, Apr 11, 2013 at 04:00:12PM +0200, Andi Kleen wrote:
> > I don't think it's enough to do ratelimit only for me_pagecache_dirty().
> > When tons of memory errors flood, all of printk()s in memory error handler
> > can print out tons of messages.
>
> Note that when you really have a flood of uncorrected errors you'll
> likely die soon anyways as something unrecoverable is very likely to
> happen. Error memory recovery cannot fix large scale memory corruptions,
> just the rare events that slip through all the other memory error correction
> schemes.
>
> So I wouldn't worry too much about that.

I agree.
My previous comment is valid only when we assume the flooding can happen
(and I personally don't believe that can happen except for in testing.)

And for paranoid users, we can suggest that they set up mcelog script
triggering to turn off vm.memory_failure_recovery when memory errors flood.
Such users don't expect that memory error handling works fine in flooding,
so just suppressing kernel messages is pointless.

Thanks,
Naoya

2013-04-12 13:30:33

by Mitsuhiro Tanino

[permalink] [raw]
Subject: Re: [RFC Patch 2/2] mm: Add parameters to limit a rate of outputting memory error messages

(2013/04/11 23:47), Naoya Horiguchi wrote:
> On Thu, Apr 11, 2013 at 04:00:12PM +0200, Andi Kleen wrote:
>>> I don't think it's enough to do ratelimit only for me_pagecache_dirty().
>>> When tons of memory errors flood, all of printk()s in memory error handler
>>> can print out tons of messages.
>>
>> Note that when you really have a flood of uncorrected errors you'll
>> likely die soon anyways as something unrecoverable is very likely to
>> happen. Error memory recovery cannot fix large scale memory corruptions,
>> just the rare events that slip through all the other memory error correction
>> schemes.
>>
>> So I wouldn't worry too much about that.
>
> I agree.
> My previous comment is valid only when we assume the flooding can happen
> (and I personally don't believe that can happen except for in testing.)
>
> And for paranoid users, we can suggest that they set up mcelog script
> triggering to turn off vm.memory_failure_recovery when memory errors flood.
> Such users don't expect that memory error handling works fine in flooding,
> so just suppressing kernel messages is pointless.
>
> Thanks,
> Naoya

Hi Andi, Horiguchi-san, Kosaki-san

Thank you for your comments. I agree with your opinions.
I think that occurrence of uncorrected error is rare event, too.

I introduced a limitation feature using ratelimit in my patch in honor
of the previous discussion a half year ago. In the discussion, Andrew-san
threw a concern of a flood of uncorrected error for the patch proposed by
Horiguchi-san.

I think that ratelimit can be removed to output all "important messages".

I will try to resend patches sepalately,
one is for outputting error messages related to a corrupted file
and the other is for adding a panic knob to handle data lost of dirty cache
which is caused by both memory error and I/O error.

Regards,
Mitsuhiro Tanino