2006-09-03 22:44:00

by Marc Perkel

Subject: Raid 0 Swap?

If I have two drives and I want swap to be fast: if I allocate swap space
on both drives, does it split the load between them? Or would it run
faster if I put swap on RAID 0?




2006-09-04 00:16:15

by Bernd Eckenfels

Subject: Re: Raid 0 Swap?

In article <[email protected]> you wrote:
> If I have two drives and I want swap to be fast if I allocate swap space
> on both drives does it break up the load between them? Or would it run
> faster if I did a Raid 0 swap?

If you set up two swap partitions with the same priority, the kernel will
distribute access across them; you don't need striping for that. (With
striping, however, you get somewhat better control over the stripe size.)
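
A minimal sketch of how one might check which swap areas share a priority,
assuming the usual five-column layout of /proc/swaps (Filename, Type, Size,
Used, Priority). Equal priorities are normally requested with swapon -p or
the pri= option in /etc/fstab. The device names and the SAMPLE text are made
up for illustration; on a real system one would pass in the contents of
/proc/swaps instead.

```python
from collections import defaultdict


def group_swap_by_priority(proc_swaps_text):
    """Map each priority to the list of swap areas that share it."""
    groups = defaultdict(list)
    lines = proc_swaps_text.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue  # ignore anything that is not a full data row
        name, priority = fields[0], int(fields[-1])
        groups[priority].append(name)
    return dict(groups)


# Hypothetical contents of /proc/swaps with two equal-priority partitions.
SAMPLE = """\
Filename                                Type            Size    Used    Priority
/dev/sda2                               partition       2097148 0       5
/dev/sdb2                               partition       2097148 0       5
"""

if __name__ == "__main__":
    for prio, areas in group_swap_by_priority(SAMPLE).items():
        print(prio, areas)
```

Areas that end up in the same group are the ones the kernel would alternate
between; areas with different priorities are filled highest priority first.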

Of course, you should not plan on swapping; it is just slow...

Regards
Bernd


2006-09-04 07:06:58

by Michael Tokarev

Subject: Re: Raid 0 Swap?

Marc Perkel wrote:
> If I have two drives and I want swap to be fast if I allocate swap space
> on both drives does it break up the load between them? Or would it run
> faster if I did a Raid 0 swap?

Don't do that - swap on raid0. Don't do that. Unless you don't care
about your data, of course. Seriously.

If something goes wrong with swap space, God only knows what will break.
It is trivial to break userspace data this way: when an app is swapped
out and there's an error reading it back from swap, its data file is very
likely to be corrupted, especially if the app is interrupted during a file
update. It is probably possible to corrupt the whole filesystem this way
too, when some kernel memory has been swapped out and is needed to write
some part of the filesystem but can't be read back.

I.e., your swap space must be reliable, at least no worse than your memory.
And with striping, you have a much higher chance of disk failure...

Yes, it sounds very promising at first to let the kernel stripe swap space
for faster operation. But first, try to avoid swapping in the
first place by installing an appropriate amount of memory, which is cheap
nowadays, so there will be no need for swapping. Once that's
done, it no longer matters whether your swap space is fast or
not. But make it *reliable*.

/mjt


2006-09-04 09:51:05

by Helge Hafting

Subject: Re: Raid 0 Swap?

Michael Tokarev wrote:
> Marc Perkel wrote:
>
>> If I have two drives and I want swap to be fast if I allocate swap space
>> on both drives does it break up the load between them? Or would it run
>> faster if I did a Raid 0 swap?
>>
>
> Don't do that - swap on raid0. Don't do that. Unless you don't care
> about your data, of course. Seriously.
>
> If something with swap space goes wrong, God only knows what will break.
> It is trivial to break userspace data this way, when an app is swapped
> out and there's an error reading it from swap, its data file very likely
> to be corrupt, especially when it is interrupted during file update.
> It is probably possible to corrupt the whole filesystem this way too,
> when some kernel memory has been swapped out and is needed to write some
> parts of filesystem, but it can't be read back.
>
I thought kernel data weren't swapped at all?
Mostly because kernel data could be needed immediately, with
no option of waiting for swap-in.
So bad swap should only really kill userspace programs,
although it can probably cause some bad delays in cases
where a userspace program calls into the kernel,
passing an address that happens to be in damaged swap.
You might then stall the kernel, holding some resources,
while the disk retries umpteen times.
> Ie, your swap space must be reliable. At least not worse than your memory.
> And with striping, you've much more chances of disk failure...
>
> Yes it sounds very promising at first, to let kernel stripe swap space,
> for faster operations. But hell, first, try to avoid swapping in the
> first place, by installing appropriate amount memory which is cheap
> nowadays, so there will be just no need for swapping. And when it's
> done, it's not relevant anymore whether your swap space is fast or
> not. But make it *reliable*.
>
Some swap is nice to have. "Ouch - sluggish server today,
I will have to look into it" is so much better
than "Eww - the OOM serial killer took out another 5 processes,
people are screaming!"

As for reliable swap - swap on raid-1 is nice - and it
probably performs better than single-disk swap too,
although not as fast as striped swap.

Helge Hafting




2006-09-04 10:29:22

by Michael Tokarev

Subject: Re: Raid 0 Swap?

Helge Hafting wrote:
> Michael Tokarev wrote:
[]
>> If something with swap space goes wrong, God only knows what will break.
>> It is trivial to break userspace data this way, when an app is swapped
>> out and there's an error reading it from swap, its data file very likely
>> to be corrupt, especially when it is interrupted during file update.
>> It is probably possible to corrupt the whole filesystem this way too,
>> when some kernel memory has been swapped out and is needed to write some
>> parts of filesystem, but it can't be read back.
>>
> I thought kernel data weren't swapped at all?
> Mostly because kernel data could be needed immediately, with
> no option of waiting for swapin. So, bad swap should only really kill
> userspace programs,
> although it probably can cause some bad delays in cases
> where the userspace program calls into the kernel,
> passing an address that happens to be in damaged swap.
> You might then stall the kernel holding some resources
> while the disk retries umpteen times.

Well, it's not that simple. The kernel uses both swappable and
non-swappable memory internally. For some things it's
unswappable; for others it's swappable. In general, it's
impossible to say which parts of the kernel will break (and
in which ways) if swap goes haywire.

>> Ie, your swap space must be reliable. At least not worse than your
>> memory.
>> And with striping, you've much more chances of disk failure...
>> Yes it sounds very promising at first, to let kernel stripe swap space,
>> for faster operations. But hell, first, try to avoid swapping in the
>> first place, by installing appropriate amount memory which is cheap
>> nowadays, so there will be just no need for swapping. And when it's
>> done, it's not relevant anymore whether your swap space is fast or
>> not. But make it *reliable*.
>>
> Some swap is nice to have. "Ouch - sluggish server today,
> I will have to look into it" is so much better
> than "Eww - the OOM serial killer took out another 5 processes,
> people are screaming!"

I didn't say "eliminate swap space"; I was talking about trying to avoid
swap usage. In other words, DO set up swap space, and DO it
in a reliable way. I did not mean "swap isn't needed", though yes, that
was not entirely clear from my statement.

> As for reliable swap - swap on raid-1 is nice - and it
> probably performs better than single-disk swap too,
> although not as fast as striped swap.

Yes, it's slower. But you don't really care, because in normal
operation there should be almost no swap usage. When swapping starts
occurring in amounts where the speed difference is noticeable, it's
time to add more memory (or to run less hungry processes),
not to speed up swap space.

/mjt


2006-09-04 10:48:27

by Jan Engelhardt

Subject: Re: Raid 0 Swap?


>> I thought kernel data weren't swapped at all?

If the swap code was swapped, who would swap it in again?

>Well, it's not that simple. Kernel uses both swappable and
>non-swappable memory internally. For some things, it's
>unswappable, for some, it's swappable. In general, it's
>impossible to say which parts of kernel will break (and
>in which ways) if swap goes haywire.

In general, everything you type in as C code (.bss, .data, .text) should be
unswappable. kmalloc()ed areas are resident too, and kmalloc has a
parameter which defines whether the allocation can/cannot push userspace
pages into the swap (GFP_ATOMIC/GFP_IO). So if there is some
kernel-allocation swapped out, it is most likely to be marked as
'userspace' so that the same algorithms can be used for swapin and -out.


Jan Engelhardt
--


2006-09-04 13:24:00

by Bodo Eggert

Subject: Re: Raid 0 Swap?

Michael Tokarev <[email protected]> wrote:

> Marc Perkel wrote:
>> If I have two drives and I want swap to be fast if I allocate swap space
>> on both drives does it break up the load between them? Or would it run
>> faster if I did a Raid 0 swap?

Swap has priorities, and it will do something like striping if two swap spaces
have the same priority.

[...]
> Ie, your swap space must be reliable. At least not worse than your memory.

It's mostly enough if it's as reliable as the system disk.

> And with striping, you've much more chances of disk failure...

The risk doesn't increase because of striping as such, but because of the
number of disks used.
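
As a rough illustration of that point: if each disk fails independently with
the same probability over some period, the chance that at least one of n
disks fails is 1 - (1 - p)^n, regardless of whether the disks are striped.
The independence assumption and the 3% figure below are illustrative only,
not measurements from this thread.

```python
def p_any_disk_fails(p_single, n_disks):
    """Probability that at least one of n independent disks fails."""
    return 1.0 - (1.0 - p_single) ** n_disks


if __name__ == "__main__":
    # p_single = 0.03 is a made-up annual failure rate for illustration.
    for n in (1, 2, 4):
        print(n, "disk(s):", round(p_any_disk_fails(0.03, n), 4))
```

For small p the result is close to n * p, which is why two disks roughly
double the exposure compared with one.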
--
I thank GMX for sabotaging the use of my addresses by means of lies
spread via SPF.

http://david.woodhou.se/why-not-spf.html

2006-09-04 15:30:24

by Valdis Klētnieks

Subject: Re: Raid 0 Swap?

On Mon, 04 Sep 2006 11:06:50 +0400, Michael Tokarev said:

> for faster operations. But hell, first, try to avoid swapping in the
> first place, by installing appropriate amount memory which is cheap
> nowadays,

Memory is indeed cheap. However, if you're already at the max supported
memory configuration for your system, buying another RAM socket to plug that
cheap memory card into can be *really* expensive.



2006-09-04 20:06:50

by Bernd Eckenfels

Subject: Re: Raid 0 Swap?

In article <[email protected]> you wrote:
> Memory is indeed cheap. However, if you're already at the max supported
> memory configuration for your system, buying another RAM socket to plug that
> cheap memory card into can be *really* expensive.

Don't expect any usable system performance if you swap regularly.

Regards
Bernd

2006-09-05 13:41:15

by Helge Hafting

Subject: Re: Raid 0 Swap?

Bernd Eckenfels wrote:
> In article <[email protected]> you wrote:
>
>> Memory is indeed cheap. However, if you're already at the max supported
>> memory configuration for your system, buying another RAM socket to plug that
>> cheap memory card into can be *really* expensive.
>>
>
> Don't expect any usable system performance if you swap regularly.
>
Not entirely correct. Performance with continuous swapping will
be fine as long as the swap bandwidth needed is lower than the available
disk bandwidth.

This is a narrow line to walk, though: memory bandwidth is
much higher than disk bandwidth, so it doesn't take much more
swapping before performance drops like a rock.
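
One way to picture how narrow that line is, as a toy model only: treat the
swap device as a single M/M/1 queue. That queueing model is an assumption of
this sketch, not something measured or claimed in the thread, and the
50 MB/s disk bandwidth is a hypothetical figure. Under it, the mean time to
service a swap request grows as 1 / (1 - utilization).

```python
def swap_slowdown(swap_mb_per_s, disk_mb_per_s):
    """Mean response-time multiplier of an M/M/1 queue at this utilization."""
    utilization = swap_mb_per_s / disk_mb_per_s
    if utilization >= 1.0:
        return float("inf")  # demand exceeds what the disk can deliver
    return 1.0 / (1.0 - utilization)


if __name__ == "__main__":
    disk = 50.0  # hypothetical sustained disk bandwidth in MB/s
    for demand in (5.0, 25.0, 45.0, 49.0, 55.0):
        print(demand, "MB/s of swap traffic ->", swap_slowdown(demand, disk))
```

The multiplier stays modest at low utilization and climbs steeply as the
swap traffic approaches the disk bandwidth, which is the cliff described
above.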

Helge Hafting

2006-09-05 23:40:21

by Bill Davidsen

Subject: Re: Raid 0 Swap?

Michael Tokarev wrote:
> Marc Perkel wrote:
>> If I have two drives and I want swap to be fast if I allocate swap space
>> on both drives does it break up the load between them? Or would it run
>> faster if I did a Raid 0 swap?
>
> Don't do that - swap on raid0. Don't do that. Unless you don't care
> about your data, of course. Seriously.

Please don't spread FUD. Particularly when there are valid technical
arguments to be made. RAID0 does increase the chance of any error, but
it is still a very small chance. With drive failures measured in years,
it's misleading to make it sound as if going RAID0 will result in a
rapid failure.
>
> If something with swap space goes wrong, God only knows what will break.

The same thing that will go wrong if your one and only swap drive goes
bad. You get a disk error, the kernel copes with it well or badly, the
system grinds to a halt or crashes. Flames do not come out and your cat
will NOT get pregnant (at least from swap failure).

> It is trivial to break userspace data this way, when an app is swapped
> out and there's an error reading it from swap, its data file very likely
> to be corrupt, especially when it is interrupted during file update.
> It is probably possible to corrupt the whole filesystem this way too,
> when some kernel memory has been swapped out and is needed to write some
> parts of filesystem, but it can't be read back.

More FUD.
>
> Ie, your swap space must be reliable. At least not worse than your memory.
> And with striping, you've much more chances of disk failure...

You roughly double your chance, which still leaves the chance of a
failure at about one every few years. But wait, there's more! Unless you
use the drive just for swapping, the chances are that an error will happen
somewhere else on the disk.
>
> Yes it sounds very promising at first, to let kernel stripe swap space,
> for faster operations. But hell, first, try to avoid swapping in the
> first place, by installing appropriate amount memory which is cheap
> nowadays, so there will be just no need for swapping. And when it's
> done, it's not relevant anymore whether your swap space is fast or
> not. But make it *reliable*.

The truth is that striping may not make swap faster, because (a) the
transfer rate may or may not be faster, and (b) the seek time will be
that of the slowest seek on any drive used. That is a good technical
reason why RAID0 may not help.

However, RAID1 will help, again not because it's more reliable, but
because when you do a swap in (that's where you really feel swap delay),
you increase the chance that there will be a copy of your data on a
drive which isn't busy.
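
Both effects can be sketched with a toy model. The assumptions here are
illustrative and not from the thread: each seek time is independent and
uniform on [0, 1] (1 being a full-stroke seek), a striped request touches
every drive, and each drive is busy independently with the same probability.

```python
def expected_slowest_seek(n_drives):
    """E[max] of n independent seeks, each uniform on [0, 1]."""
    return n_drives / (n_drives + 1)


def p_idle_copy(p_busy, n_mirrors):
    """Probability that at least one of n mirrors is idle."""
    return 1.0 - p_busy ** n_mirrors


if __name__ == "__main__":
    # Striping: the request waits for the slowest of the seeks.
    print("1 drive :", expected_slowest_seek(1))
    print("2 drives:", expected_slowest_seek(2))
    # Mirroring: a read needs only one copy on a drive that is not busy.
    print("1 copy  :", p_idle_copy(0.5, 1))
    print("2 copies:", p_idle_copy(0.5, 2))
```

Under these assumptions the expected seek for a two-drive stripe is worse
than for a single drive, while a second mirror raises the chance of finding
an idle copy.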

So the correct answer is not related to reliability problems, which are
rare, but performance problems, which is why you did the RAID in the
first place.

Final note: if you are building a really reliable system, RAID6 on all
data, redundant power supplies (the highest point of total failure),
then you should go to RAID0 for swap, on multiple controllers,
preferably on drives in different enclosures. RAID6 for swap sucks
rocks off the bottom of the ocean; three-way RAID1 performs well even
after a one-drive failure.

--
Bill Davidsen <[email protected]>
Obscure bug of 2004: BASH BUFFER OVERFLOW - if bash is being run by a
normal user and is setuid root, with the "vi" line edit mode selected,
and the character set is "big5," an off-by-one error occurs during
wildcard (glob) expansion.

2006-09-06 06:53:33

by Michael Tokarev

Subject: Re: Raid 0 Swap?

Bill Davidsen wrote:
> Michael Tokarev wrote:
>> Marc Perkel wrote:
>>> If I have two drives and I want swap to be fast if I allocate swap space
>>> on both drives does it break up the load between them? Or would it run
>>> faster if I did a Raid 0 swap?
>>
>> Don't do that - swap on raid0. Don't do that. Unless you don't care
>> about your data, of course. Seriously.
>
> Please don't spread FUD. Particularly when there are valid technical
> arguments to be made. RAID0 does increase the chance of any error, but
> it is still a very small chance. With drive failures measured in years,
> it's misleading to make it sound as if going RAID0 will result in a
> rapid failure.

It's not FUD, unfortunately. It's real life.

We had about 70 systems with swap on raid0 (not even raid0, but using
several swap partitions with equal priority, but it makes almost no real
difference). And in a year, 5 or 6 of them failed in a way which
required major recovery efforts each, exactly due to this swap-on-raid0
thing. That is, from 2*70=140 disk drives, 5 failed in a year.

Yes, maybe we just had bad luck and used a somehow defective batch of
drives. Yes, for an average home machine with only two drives you have
less chance of seeing a failure than with 140 drives. But it's still not
something to ignore on the grounds that "it's still a very small chance".
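
The arithmetic behind those figures, as a small sketch. It uses the numbers
quoted above (70 systems, 140 drives, 5 failed drives in a year) and assumes,
for illustration, that each failed drive was in a different system.

```python
def failure_rates(failed_drives, total_drives, total_systems):
    """Return (per-drive, per-system) failure fractions for the period."""
    per_drive = failed_drives / total_drives
    per_system = failed_drives / total_systems
    return per_drive, per_system


if __name__ == "__main__":
    per_drive, per_system = failure_rates(5, 140, 70)
    print("per drive :", round(per_drive * 100, 1), "% per year")
    print("per system:", round(per_system * 100, 1), "% per year")
```

With two drives per system, the per-system rate is twice the per-drive rate,
so a two-drive machine in this population would have seen a failure in
roughly one year out of fourteen.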

>> If something with swap space goes wrong, God only knows what will break.
>
> The same thing that will go wrong if your one and only swap drive goes
> bad. You get a disk error, the kernel copes with it well or badly, the
> system grinds to a halt or crashes. Flames do not come out and your cat
> will NOT get pregnant (at least from swap failure).

Yes. Well, flames did come once, but cats, indeed, didn't become pregnant
due to swap issues.

>> It is trivial to break userspace data this way, when an app is swapped
>> out and there's an error reading it from swap, its data file very likely
>> to be corrupt, especially when it is interrupted during file update.
>> It is probably possible to corrupt the whole filesystem this way too,
>> when some kernel memory has been swapped out and is needed to write some
>> parts of filesystem, but it can't be read back.
>
> More FUD.

Which is?

It all depends on the application, and on the "place" where the failure
is encountered. There are bad apps out there, and there are bad
places to interrupt them. Over about 15 years we have had plenty of
examples of, e.g., corrupted m$ office files, some of which were due to
swap problems too. (It's difficult to say for sure what the actual
problem was, but when the system shows a blue screen and we try to repair
the drive, the disk test shows errors in the swap area, and the document
that was open during the crash turns out to be corrupt. I dislike
windows and haven't used it for the last several years, but it still
serves as an example.)

Yes, if you have a single drive, it's mostly irrelevant whether it is
swap that dies or your data partition (or the whole drive): the data will
either be corrupted somehow or it won't. But when we're talking about
several drives, the data is most likely on raid1 (or raid5, 6, 10...),
so it is better protected.

Speaking of kernel space, again, I have had at least one case where
bad swap resulted in a corrupt filesystem. It was quite some time
ago, with linux-2.2 and a squid cache. The system was alive, but any
access to the squid FS resulted in a hanging process. After a forced
reboot the fs was in quite a bad state: a lot of lost+found entries
(some of which were files with old timestamps), whole directories
lost (squid does not create/delete directories at will), and many
other errors during fsck, some of which it wasn't able to fix.
I don't remember the details anymore, as it was quite some time ago.
And according to the logs, the problematic place on disk was in the swap
area, not elsewhere. I can't be sure what exactly caused this
corruption, or whether it's still possible with today's kernels,
but I don't really care. The one trivial lesson I learned is not
to use raid0 for swap. That's all; it's trivial to accomplish
and it costs nothing.

The point was that it's unreasonable to have swap on raid0 IF
you care about protecting your data. Especially since swap usage
should generally be reduced to a minimum, where you don't really
care how fast it is.

Especially with modern drives, whose reliability, it seems, is
decreasing: I mean, the error rate per megabyte stays almost the
same, but the number of megabytes increases rapidly ;)
[]
> Final note: if you are building a really reliable system, RAID6 on all
> data, redundant power supplies (the highest point of total failure),
> then you should go to RAID0 for swap, on multiple controllers,
> preferably on drives in different enclosures. RAID6 for swap sucks
> rocks off the bottom of the ocean, three way RAID1 performs well even
> after a one drive failure.

Well, it's arguable which raid levels to use for data (eg, for database
workloads, with concurrent direct writes, it's better to use raid10
because of the cost to calculate parity).

But raid0 for swap - again? - no, no thank you ;) You probably mistyped
0 for 1 above, or else I don't understand the whole last statement ;)

By the way, there are at least two types of drive failure: when a single
sector becomes unreadable (media failure), and when the whole device
fails for some reason. It turns out both types of failure occur on
a regular basis (well, it also depends on the environment, like
temperature and the like). Yes, neither is frequent, but neither is
something to ignore if you care about your system and data.

/mjt

2006-09-06 17:21:54

by Kyle Moffett

Subject: Re: Raid 0 Swap?

On Sep 05, 2006, at 19:44:30, Bill Davidsen wrote:
> Final note: if you are building a really reliable system, RAID6 on
> all data, redundant power supplies (the highest point of total
> failure), then you should go to RAID0 for swap, on multiple
> controllers, preferably on drives in different enclosures. RAID6
> for swap sucks rocks off the bottom of the ocean, three way RAID1
> performs well even after a one drive failure.

There are also some interesting high-performance FPGA-based products
out there which stack another layer or two of Reed-Solomon coding on
top of a group of N existing drives so that you can handle up to M
drive failures where M < N, and optionally also a failure of a stripe
of up to K sectors out of every group of J sectors. IIRC your
average CD and DVD uses this kind of encoding, so if you have a bunch
of scattered errors or a single big error up to like 9k long you can
still recover all the data while decoding. That kind of matrix
transformation would be dog-slow on a general-purpose CPU, but with
custom FPGA or VLSI chips you can do it in parallel, easily faster
than disk bandwidth.
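
To show the basic idea of rebuilding lost data from redundancy, here is a
sketch using plain XOR parity (the scheme behind RAID 4/5). It is far simpler
than the Reed-Solomon coding described above and tolerates only M = 1 lost
block; it is not how those FPGA products work. The block contents are
made-up examples.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    data = [b"swap", b"raid", b"disk"]  # one block per hypothetical drive
    parity = xor_blocks(data)           # stored on an extra drive
    rebuilt = recover_missing([data[0], data[2]], parity)
    print(rebuilt == data[1])
```

Reed-Solomon codes generalize this so that more than one lost block can be
rebuilt, at the cost of the heavier arithmetic mentioned above.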

Cheers,
Kyle Moffett

2006-09-18 09:54:31

by Denys Vlasenko

Subject: Re: Raid 0 Swap?

On Monday 04 September 2006 12:46, Jan Engelhardt wrote:
> >> I thought kernel data weren't swapped at all?
>
> If the swap code was swapped, who would swap it in again?
>
> >Well, it's not that simple. Kernel uses both swappable and
> >non-swappable memory internally. For some things, it's
> >unswappable, for some, it's swappable. In general, it's
> >impossible to say which parts of kernel will break (and
> >in which ways) if swap goes haywire.
>
> In general, everything you type in as C code (.bss, .data, .text) should be
> unswappable. kmalloc()ed areas are resident too, and kmalloc has a
> parameter which defines whether the allocation can/cannot push userspace
> pages into the swap (GFP_ATOMIC/GFP_IO). So if there is some
> kernel-allocation swapped out, it is most likely to be marked as
> 'userspace' so that the same algorithms can be used for swapin and -out.

What are you guys talking about? IIRC the kernel doesn't use
swap for its vital data structures. I recall only one
kernel thing which goes into swap: tmpfs data. Caching network
filesystems may also use swappable data, but currently grep
catches only cifs.

IOW swap is for dirtied userspace data. Please correct me
if I am wrong here.
--
vda

2006-12-28 09:17:41

by Jan Engelhardt

Subject: Re: Raid 0 Swap?


On Dec 28 2006 00:06, Mike Huber wrote:
>
> I would like to point out one key argument against raid0 swap partitions,
> which is that, should a drive failure occur, the least used programs in
> memory are most drastically affected. Unfortunately, in the case of a
> drastic drive failure in a standalone server, one of the most likely
> programs to be affected is getty, disallowing you from manually logging in.

However, the footprint of getty is rather small, so its chance to run is higher
than an idle bigger task (dbus, resmgr, hal, perhaps cron or X)


-`J'
--

2007-01-01 02:01:38

by Bill Davidsen

Subject: Re: Raid 0 Swap?

Jan Engelhardt wrote:
> On Dec 28 2006 00:06, Mike Huber wrote:
>> I would like to point out one key argument against raid0 swap partitions,
>> which is that, should a drive failure occur, the least used programs in
>> memory are most drastically affected. Unfortunately, in the case of a
>> drastic drive failure in a standalone server, one of the most likely
>> programs to be affected is getty, disallowing you from manually logging in.
>
> However, the footprint of getty is rather small, so its chance to run is higher
> than an idle bigger task (dbus, resmgr, hal, perhaps cron or X)

RAID-0 swap is not the thing to run if reliability is a must, clearly.
Interestingly, after a long fight with poor RAID-5 write speed, I moved
my swap to RAID-10, only to find that recovery disks don't know how to
use it. Tried Fedora and then a live CD (puppy, I think).

Details on the RAID-5 performance issue are in the linux-raid archives;
I won't rehash them here.
--
bill davidsen <[email protected]>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979