Quoting Aaron Staley ([email protected]):
> This is better explained here:
> http://serverfault.com/questions/516074/why-are-applications-in-a-memory-limited-lxc-container-writing-large-files-to-di
> (The
> highest-voted answer believes this to be a kernel bug.)
Hi,
in irc it has been suggested that indeed the kernel should be slowing
down new page creates while waiting for old page cache entries to be
written out to disk, rather than ooming.
With a 3.0.27-1-ac100 kernel, doing dd if=/dev/zero of=xxx bs=1M
count=100 is immediately killed. In contrast, doing the same from a
3.0.8 kernel did the right thing for me. But I did reproduce your
experiment below on ec2 with the same result.
So, cc:ing linux-mm in the hopes someone can tell us whether this
is expected behavior, known mis-behavior, or an unknown bug.
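For reference, the same OOM can be triggered without LXC by charging a
shell to a small memory cgroup. A minimal sketch, assuming a cgroup v1
memory controller mounted at /sys/fs/cgroup/memory (the group and file
names are illustrative):

    # Put the current shell into a 300M memory cgroup
    mkdir /sys/fs/cgroup/memory/ddtest
    echo 300M > /sys/fs/cgroup/memory/ddtest/memory.limit_in_bytes
    echo $$ > /sys/fs/cgroup/memory/ddtest/tasks
    # Write more than the limit: on affected kernels the dirty page cache
    # charged to the group outruns writeback and the memcg OOM killer
    # fires instead of throttling the writer.
    dd if=/dev/zero of=/tmp/bigfile bs=1M count=500
    dmesg | tail        # look for the OOM kill messages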
> Summary: I have set up a system where I am using LXC to create multiple
> virtualized containers on my system with limited resources. Unfortunately, I'm
> running into a troublesome scenario where the OOM killer is hard killing
> processes in my LXC container when I write a file larger than the
> memory limit (set to 300MB). There appears to be some issue with the
> file buffering (page cache) not respecting the container's memory limit.
>
>
> Reproducing:
>
> (Done on a c1.xlarge instance running on Amazon EC2.)
>
> Create 6 empty lxc containers (in my case I did lxc-create -n testcon -t
> ubuntu -- -r precise)
>
> Modify the configuration of each container to set
> lxc.cgroup.memory.limit_in_bytes = 300M
>
> Within each container, run in parallel:
> dd if=/dev/zero of=test2 bs=100k count=5010
>
> This will, with high probability, activate the OOM killer (as seen in dmesg); often
> the dd processes themselves will be killed.
>
> This has been verified to have problems on:
> Linux 3.8.0-25-generic #37-Ubuntu SMP and Linux ip-10-8-139-98
> 3.2.0-29-virtual #46-Ubuntu SMP Fri Jul 27 17:23:50 UTC 2012 x86_64 x86_64
> x86_64 GNU/Linux
>
> Please let me know your thoughts.
>
> Regards,
> Aaron Staley
> _______________________________________________
> Containers mailing list
> [email protected]
> https://lists.linuxfoundation.org/mailman/listinfo/containers
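The reproduction above can be scripted end to end. A rough sketch,
assuming the stock Ubuntu template, the default /var/lib/lxc layout, and
a host where lxc-attach works (container names are illustrative):

    # Create six containers, each capped at 300M
    for i in $(seq 1 6); do
        lxc-create -n testcon$i -t ubuntu -- -r precise
        echo "lxc.cgroup.memory.limit_in_bytes = 300M" \
            >> /var/lib/lxc/testcon$i/config
        lxc-start -n testcon$i -d
    done
    # Start the writers in parallel, one per container
    for i in $(seq 1 6); do
        lxc-attach -n testcon$i -- \
            dd if=/dev/zero of=/root/test2 bs=100k count=5010 &
    done
    wait
    dmesg | grep -i oom    # kills show up here on affected kernels

On affected kernels the OOM messages typically appear before the writes
complete, and the dd processes themselves are often the victims.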
On Mon, Jul 01, 2013 at 01:01:01PM -0500, Serge Hallyn wrote:
> Quoting Aaron Staley ([email protected]):
> > This is better explained here:
> > http://serverfault.com/questions/516074/why-are-applications-in-a-memory-limited-lxc-container-writing-large-files-to-di
> > (The
> > highest-voted answer believes this to be a kernel bug.)
>
> Hi,
>
> in irc it has been suggested that indeed the kernel should be slowing
> down new page creates while waiting for old page cache entries to be
> written out to disk, rather than ooming.
>
> With a 3.0.27-1-ac100 kernel, doing dd if=/dev/zero of=xxx bs=1M
> count=100 is immediately killed. In contrast, doing the same from a
> 3.0.8 kernel did the right thing for me. But I did reproduce your
> experiment below on ec2 with the same result.
>
> So, cc:ing linux-mm in the hopes someone can tell us whether this
> is expected behavior, known mis-behavior, or an unknown bug.
It's a known issue that was fixed/improved in e62e384 'memcg: prevent
OOM with too many dirty pages', included in 3.6+.
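To check whether a given kernel tree already carries that change, the
abbreviated commit id can be resolved against a clone of the mainline
tree; a small sketch:

    # Which release first contained the fix?
    git describe --contains e62e384
    # Subject line for reference
    git log -1 --oneline e62e384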
> > Summary: I have set up a system where I am using LXC to create multiple
> > virtualized containers on my system with limited resources. Unfortunately, I'm
> > running into a troublesome scenario where the OOM killer is hard killing
> > processes in my LXC container when I write a file larger than the
> > memory limit (set to 300MB). There appears to be some issue with the
> > file buffering (page cache) not respecting the container's memory limit.
> >
> >
> > Reproducing:
> >
> > (Done on a c1.xlarge instance running on Amazon EC2.)
> >
> > Create 6 empty lxc containers (in my case I did lxc-create -n testcon -t
> > ubuntu -- -r precise)
> >
> > Modify the configuration of each container to set
> > lxc.cgroup.memory.limit_in_bytes = 300M
> >
> > Within each container, run in parallel:
> > dd if=/dev/zero of=test2 bs=100k count=5010
> >
> > This will, with high probability, activate the OOM killer (as seen in dmesg); often
> > the dd processes themselves will be killed.
> >
> > This has been verified to have problems on:
> > Linux 3.8.0-25-generic #37-Ubuntu SMP and Linux ip-10-8-139-98
> > 3.2.0-29-virtual #46-Ubuntu SMP Fri Jul 27 17:23:50 UTC 2012 x86_64 x86_64
> > x86_64 GNU/Linux
> >
> > Please let me know your thoughts.
> >
> > Regards,
> > Aaron Staley
Quoting Johannes Weiner ([email protected]):
> On Mon, Jul 01, 2013 at 01:01:01PM -0500, Serge Hallyn wrote:
> > Quoting Aaron Staley ([email protected]):
> > > This is better explained here:
> > > http://serverfault.com/questions/516074/why-are-applications-in-a-memory-limited-lxc-container-writing-large-files-to-di
> > > (The
> > > highest-voted answer believes this to be a kernel bug.)
> >
> > Hi,
> >
> > in irc it has been suggested that indeed the kernel should be slowing
> > down new page creates while waiting for old page cache entries to be
> > written out to disk, rather than ooming.
> >
> > With a 3.0.27-1-ac100 kernel, doing dd if=/dev/zero of=xxx bs=1M
> > count=100 is immediately killed. In contrast, doing the same from a
> > 3.0.8 kernel did the right thing for me. But I did reproduce your
> > experiment below on ec2 with the same result.
> >
> > So, cc:ing linux-mm in the hopes someone can tell us whether this
> > is expected behavior, known mis-behavior, or an unknown bug.
>
> It's a known issue that was fixed/improved in e62e384 'memcg: prevent
Ah, ok. I see the commit message says:

    The solution is far from being ideal - long term solution is memcg aware
    dirty throttling - but it is meant to be a band aid until we have a real
    fix. We are seeing this happening during nightly backups which are placed
    ...

and:

    The issue is more visible with slower devices for output.
I'm guessing we see it on EC2 because of the slower backing storage there.
> OOM with too many dirty pages', included in 3.6+.
Is anyone actively working on the long term solution?
thanks,
-serge
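One way to watch the imbalance between dirtying and write-out while the
test runs is to poll the host's global counters; a quick sketch:

    # Dirty vs. Writeback pages system-wide while the containers write
    watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'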
On Mon 01-07-13 14:02:22, Serge Hallyn wrote:
> Quoting Johannes Weiner ([email protected]):
[...]
> > OOM with too many dirty pages', included in 3.6+.
>
> Is anyone actively working on the long term solution?
Patches for memcg dirty page accounting were posted quite some time ago.
I plan to look at them at some point, but I am rather busy with other
stuff right now. That would be just a first step though. Then we need to
hook into dirty page throttling and make it memcg aware, which sounds
like a bigger challenge.
--
Michal Hocko
SUSE Labs
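For context on what per-memcg dirty accounting eventually exposes: on
later kernels where it has been merged, per-cgroup dirty and writeback
counters appear in memory.stat. A sketch, assuming cgroup v1 and an
illustrative container cgroup path (these fields are not present on the
3.x kernels discussed above):

    # Per-cgroup dirty/writeback pages, once memcg dirty accounting exists
    grep -E '^(total_)?(dirty|writeback) ' \
        /sys/fs/cgroup/memory/lxc/testcon1/memory.stat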