Hello!
I am trying to understand how NetApp can be so much
better at serving NFS than my quad Opteron 250
SAN-attached machine. So I need some help and some
pointers to understand how I can bring my Opteron
machine on par with (or within 70% of the NFS
performance of) my NetApp R200. I have run through
the NFS HOWTOs and have heard "that is why they cost
so much more", but I have to believe that most of the
ideas inside the NetApp are common knowledge (just not
in my head).
Can anyone shed some light on this?
TIA!
Phy
> I am trying to understand how NetApp can be so much
> better at serving NFS than my quad Opteron 250
> SAN-attached machine. [...]
>
> Can anyone shed some light on this?
Definitely sounds like something is wrong. You can do your own
comparisons of Linux 2.6 vs Netapp here (the OpenPower 720 is a ppc64
Linux box):
http://www.spec.org/sfs97r1/results/sfs97r1.html
Anton
On Tue, 11 Jan 2005, Anton Blanchard wrote:
>
>> I am trying to understand how NetApp can be so much
>> better at serving NFS than my quad Opteron 250
>> SAN-attached machine. [...]
>>
>> Can anyone shed some light on this?
You have to quantify what sort of hardware you're benchmarking in either
case, and how it's configured, before you can reasonably conclude too much...
I spent quite a bit of time benchmarking filers and Linux configurations
recently, and while NetApp makes some very fast and well-balanced filers,
they don't by any means have a lock on building a high-performance NFS box.
> Definitely sounds like something is wrong. You can do your own
> comparisons of Linux 2.6 vs Netapp here (the OpenPower 720 is a ppc64
> Linux box):
>
> http://www.spec.org/sfs97r1/results/sfs97r1.html
In actually using published sfs97r1 benchmarks to compare against the
hardware I was benchmarking (from EMC, NetApp and several roll-your-own
Linux boxes), I found the published benchmark information almost entirely
useless, given that vendors tend to provide wildly silly hardware
configurations. In the case of the OpenPower 720 (to use that as an
example), the benchmarked machine has 70 15k RPM disks spread across 12
fibre channel controllers, 64GB of RAM, 12GB of NVRAM and 7 network
interfaces...
> Anton
On Mon, 10 Jan 2005 23:42:30 PST, Joel Jaeggli said:
> In actually using published sfs97r1 benchmarks to compare against the
> hardware I was benchmarking (from EMC, NetApp and several roll-your-own
> Linux boxes), I found the published benchmark information almost entirely
> useless, given that vendors tend to provide wildly silly hardware
> configurations. In the case of the OpenPower 720 (to use that as an
> example), the benchmarked machine has 70 15k RPM disks spread across 12
> fibre channel controllers, 64GB of RAM, 12GB of NVRAM and 7 network
> interfaces...
If you threw that much hardware at a Linux system, and then tuned it so that it
didn't really care about userspace performance (oh.. say.. by giving the knfsd
thread an RT priority ;), and tuned things like the filesystem, the slab
allocator and the networking stack to NFS requirements, it probably would be
screaming fast too.. ;)
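Purely as an illustration of the kind of knobs meant here (not a claim that
this alone gets you filer performance), the tuning could look something like
the sketch below; the thread count, priority and buffer sizes are numbers
picked for the example, and /export is a placeholder path:

    # run a lot more nfsd threads than the usual default
    rpc.nfsd 128

    # give every nfsd thread a SCHED_FIFO (real-time) priority
    for pid in $(pgrep nfsd); do chrt -f -p 50 $pid; done

    # allow larger socket buffers for the server's RPC traffic
    echo 1048576 > /proc/sys/net/core/rmem_max
    echo 1048576 > /proc/sys/net/core/wmem_max

    # skip atime updates on the exported filesystem
    mount -o remount,noatime /export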
On Monday January 10, [email protected] wrote:
> I am trying to understand how NetApp can be so much
> better at serving NFS than my quad Opteron 250
> SAN-attached machine. [...]
>
> Can anyone shed some light on this?
If you want to come anywhere close to a NetApp, get a few hundred
megabytes of NVRAM (e.g. http://www.umem.com), and configure it as
an external journal for your filesystem (I know this can be done
for ext3; I don't know about other filesystems). Then make sure
your filesystem journals all data, not just metadata (the
data=journal option to ext3).
If you use a dedicated drive (or mirrored pair) in place of the NVRAM,
you will come reasonably close.
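Roughly, as a sketch only (/dev/nvram0 here stands for whatever block
device the NVRAM card presents, /dev/sdb1 for the disk holding the
exported filesystem, and /export for its mount point):

    # create the journal on the NVRAM device, then a filesystem that uses it
    mke2fs -b 4096 -O journal_dev /dev/nvram0
    mkfs.ext3 -b 4096 -J device=/dev/nvram0 /dev/sdb1

    # mount with full data journalling, so data goes through the NVRAM journal
    mount -t ext3 -o data=journal /dev/sdb1 /export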
NeilBrown
On Tue, Jan 11, 2005 at 04:19:57AM -0500, [email protected] wrote:
> On Mon, 10 Jan 2005 23:42:30 PST, Joel Jaeggli said:
>
> > [...] In the case of the OpenPower 720 (to use that as an example), the
> > benchmarked machine has 70 15k RPM disks spread across 12 fibre channel
> > controllers, 64GB of RAM, 12GB of NVRAM and 7 network interfaces...
>
> If you threw that much hardware at a Linux system,
... theory ... or have you actually tried?
> and then tuned it so that it
> didn't really care about userspace performance (oh.. say.. by giving the knfsd
> thread an RT priority ;), and tuned things like the filesystem, the slab
> allocator and the networking stack to NFS requirements, it probably would be
> screaming fast too.. ;)
You'd need to run a 2.4 kernel.
Current problems with 2.6:
1 ext3 causes kjournald oops on load
2 xfs has bad NFS/SMP/dcache interactions (you end up with undeletable
directories)
3 knfsd will give you stale handles (can be worked around by stat'ing
all your directories constantly on the server side; a rough sketch of
that workaround follows below)
The SGI XFS kernel from CVS actually almost solved (2) above, but not
entirely - I was going to report on that again to LKML. The other
problems are still, as far as I know, unsolved.
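For what it's worth, the workaround for (3) really is that crude; something
along these lines on the server is enough (the /export path and the 60
second interval are only placeholders for this sketch):

    # periodically stat every directory under the export
    # (/export and the interval are only examples)
    while true; do
            find /export -type d -exec stat {} \; > /dev/null 2>&1
            sleep 60
    done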
Not trying to flame anyone here, just trying to be realistic ;)
--
/ jakob
At 08:54 PM 11/01/2005, Neil Brown wrote:
>On Monday January 10, [email protected] wrote:
> > I am trying to understand how NetApp can be so much
> > better at serving NFS than my quad Opteron 250
> > SAN-attached machine. [...]
> >
> > Can anyone shed some light on this?
>
>If you want to come anywhere close to a NetApp, get a few hundred
>megabytes of NVRAM (e.g. http://www.umem.com), and configure it as
>an external journal for your filesystem (I know this can be done
>for ext3; I don't know about other filesystems). Then make sure
>your filesystem journals all data, not just metadata (the
>data=journal option to ext3).
NetApp's WAFL only journals metadata in NVRAM ...
(one of the primary reasons it's called WAFL is that the data write only
happens once..).
cheers,
lincoln.
On Tue, Jan 11, 2005 at 11:01:10AM +0100, Jakob Oestergaard wrote:
> 3 knfsd will give you stale handles (can be worked around by stat'ing
> all your directories constantly on the server side)
This should be fixed now. Bug reports to the contrary welcomed.
--Bruce Fields
On Tue, Jan 11, 2005 at 05:11:50PM +0100, Anders Saaby wrote:
> In which kernel version should this have been fixed?
2.6.10, I believe; see
http://marc.theaimsgroup.com/?l=linux-nfs&m=110021733807921&w=2
--Bruce Fields
Hi,
Thanks for the tip - it actually seems that it is fixed in 2.6.10.
I am now subscribed to linux-nfs :)
On Tuesday 11 January 2005 17:21, you wrote:
> On Tue, Jan 11, 2005 at 05:11:50PM +0100, Anders Saaby wrote:
> > In which kernel version should this have been fixed?
>
> 2.6.10, I believe; see
>
> http://marc.theaimsgroup.com/?l=linux-nfs&m=110021733807921&w=2
>
> --Bruce Fields
On Tue, 11 Jan 2005 11:01:10 +0100, Jakob Oestergaard said:
> > If you threw that much hardware at a Linux system,
>
> ... theory ... or have you actually tried?
Merely indicating a method to approach the problem space...
> Current problems with 2.6:
> 1 ext3 causes kjournald oops on load
> 2 xfs has bad NFS/SMP/dcache interactions (you end up with undeletable
> directories)
> 3 knfsd will give you stale handles (can be worked around by stat'ing
> all your directories constantly on the server side)
A mere matter of debugging. ;)
On Tuesday January 11, [email protected] wrote:
> At 08:54 PM 11/01/2005, Neil Brown wrote:
> >If you want to come anywhere close to a NetApp, get a few hundred
> >megabytes of NVRAM (e.g. http://www.umem.com), and configure it as
> >an external journal for your filesystem. [...]
>
> NetApp's WAFL only journals metadata in NVRAM ...
> (one of the primary reasons it's called WAFL is that the data write only
> happens once..).
>
That may be, though it doesn't fit with my (admittedly limited)
understanding of WAFL.
However, Linux NFS definitely runs faster over ext3 if data=journal is
selected.
NeilBrown
In article <[email protected]> you wrote:
>> NetApp's WAFL only journals metadata in NVRAM ...
>> (one of the primary reasons it's called WAFL is that the data write only
>> happens once..).
> That may be, though it doesn't fit with my (admittedly limited)
> understanding of WAFL.
Yes, AFAIK the NVRAM is used for the RAID-4, independent of WAFL, and as a
write-back cache.
However, since the read performance of Linux NFS is also poor (it is at
least not very self-tuning), the hardware is not really the reason for the
fast NFS implementation.
Greetings
Bernd
On 12.01.2005 at 00:36 (+0100), Bernd Eckenfels wrote:
> However, since the read performance of Linux NFS is also poor (it is at
> least not very self-tuning), the hardware is not really the reason for
> the fast NFS implementation.
Indeed: NFS readahead requests are often processed out of order by the
server (due to the basic unordered nature of RPC calls, the lack of
ordering between nfsd server threads, use of UDP, etc) and so I suspect
the generic readahead algorithm will tend to default to the random
access mode in many cases where it should really be doing sequential
access.
Cheers,
Trond
--
Trond Myklebust <[email protected]>
On Tue, Jan 11, 2005 at 09:43:17AM -0500, J. Bruce Fields wrote:
> On Tue, Jan 11, 2005 at 11:01:10AM +0100, Jakob Oestergaard wrote:
> > 3 knfsd will give you stale handles (can be worked around by stat'ing
> > all your directories constantly on the server side)
>
> This should be fixed now. Bug reports to the contrary welcomed.
Excellent!
It seems SGI has merged their XFS kernel up to 2.6.10 - I'll give that a
try and see what happens.
Thanks,
--
/ jakob