2007-08-23 05:16:25

by Richard Ballantyne

Subject: file system for solid state disks

What file system that is already in the linux kernel do people recommend
I use for my laptop that now contains a solid state disk?

I appreciate your feedback.

Thank you,
Richard Ballantyne


2007-08-23 05:52:55

by Jan Engelhardt

Subject: Re: file system for solid state disks


On Aug 23 2007 01:01, Richard Ballantyne wrote:
>
>What file system that is already in the linux kernel do people recommend
>I use for my laptop that now contains a solid state disk?

If I had to choose, the list of options seems to be:

- logfs
[unmerged]

- UBI layer with any fs you like
[just a guess]

- UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
[does not support ACLs/quotas]
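
For reference, a full invocation might look like this (a sketch; the
device name is a placeholder, and you mount the result like any other
block filesystem):

mkudffs --media-type=cdrw --utf8 /dev/sdX1
mount -t udf /dev/sdX1 /mnt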



Jan
--

2007-08-23 08:55:49

by Daniel J Blueman

Subject: Re: file system for solid state disks

On 23 Aug, 07:00, Jan Engelhardt <[email protected]> wrote:
> On Aug 23 2007 01:01, Richard Ballantyne wrote:
> >What file system that is already in the linux kernel do people recommend
> >I use for my laptop that now contains a solid state disk?
>
> If I had to choose, the list of options seems to be:
>
> - logfs
> [unmerged]
>
> - UBI layer with any fs you like
> [just a guess]
>
> - UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
> [does not support ACLs/quotas]

Isn't it that with modern rotational wear-levelling, re-writing hot
blocks many times is not an issue, as they are internally moved around
anyway? So, using a journalled filesystem such as ext3 is still good
(robustness and maturity in mind). Due to lack of write buffering,
perhaps a wandering log (journal) filesystem would be more suitable
though? I use ext3 on my >35MB/s compact flash filesystem.

I can see an advantage in selecting a lower-complexity filesystem,
since no additional spatial optimisation is needed, but those
optimisations do buy other efficiency (eg the Orlov allocator
reducing fragmentation, thus less overhead), right?

Also, it would be natural to employ 'elevator=none', but perhaps there
is a small advantage in holding a group of flash blocks 'ready' (like
SDRAM pages being selected on-chip for lower bus access latency) -
however this no longer holds when logical->physical remapping is
performed, so perhaps it's better without an elevator.
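
(For reference: the in-kernel name for running without an elevator is
'noop', and it can be switched at runtime -- a sketch, assuming the
disk shows up as sda:

cat /sys/block/sda/queue/scheduler         # list available schedulers
echo noop > /sys/block/sda/queue/scheduler

or boot with elevator=noop to make it the default.)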

Clearly, benchmarks speak...but perhaps it would make sense to have
libata disable the elevator for the (compact) flash block device?

Daniel
--
Daniel J Blueman

2007-08-23 10:26:43

by Theodore Ts'o

Subject: Re: file system for solid state disks

On Thu, Aug 23, 2007 at 07:52:46AM +0200, Jan Engelhardt wrote:
>
> On Aug 23 2007 01:01, Richard Ballantyne wrote:
> >
> >What file system that is already in the linux kernel do people recommend
> >I use for my laptop that now contains a solid state disk?
>
> If I had to choose, the list of options seems to be:
>
> - logfs
> [unmerged]
>
> - UBI layer with any fs you like
> [just a guess]

The question is whether the solid state disk gives you access to the
raw flash, or whether you have to go through the flash translation
layer because it's trying to look (exclusively) like a PATA or SATA
drive. There are some SSD's that have a form factor and interfaces
that make them a drop-in replacement for a laptop hard drive, and a
number of the newer laptops that are supporting SSD's seem to be these
because (a) they don't have to radically change their design, (b) so
they can be compatible with Windows, and (c) so that users can
purchase the laptop either with a traditional hard drive or an SSD as
an option, since at the moment SSD's are far more expensive than
disks.

So if you can't get access to the raw flash layer, then what you're
probably going to be looking at is a traditional block-oriented
filesystem, such as ext3, although there are clearly some things that
could be done such as disabling the elevator.
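
As an illustrative tweak on top of that -- assuming the SSD shows up
as sda1, and with values that are only examples -- mounting with
noatime avoids a metadata write on every file read, and a longer ext3
commit interval batches journal writes:

mount -o noatime,commit=120 /dev/sda1 /mnt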

- Ted

2007-08-23 11:25:27

by Jens Axboe

Subject: Re: file system for solid state disks

On Thu, Aug 23 2007, Theodore Tso wrote:
> On Thu, Aug 23, 2007 at 07:52:46AM +0200, Jan Engelhardt wrote:
> >
> > On Aug 23 2007 01:01, Richard Ballantyne wrote:
> > >
> > >What file system that is already in the linux kernel do people recommend
> > >I use for my laptop that now contains a solid state disk?
> >
> > If I had to choose, the list of options seems to be:
> >
> > - logfs
> > [unmerged]
> >
> > - UBI layer with any fs you like
> > [just a guess]
>
> The question is whether the solid state disk gives you access to the
> raw flash, or whether you have to go through the flash translation
> layer because it's trying to look (exclusively) like a PATA or SATA
> drive. There are some SSD's that have a form factor and interfaces
> that make them a drop-in replacement for a laptop hard drive, and a
> number of the newer laptops that are supporting SSD's seem to be these
> because (a) they don't have to radically change their design, (b) so
> they can be compatible with Windows, and (c) so that users can
> purchase the laptop either with a traditional hard drive or a SSD's as
> an option, since at the moment SSD's are far more expensive than
> disks.
>
> So if you can't get access to the raw flash layer, then what you're
> probably going to be looking at is a traditional block-oriented
> filesystem, such as ext3, although there are clearly some things that
> could be done such as disabling the elevator.

It's more complicated than that, I'd say. If the job of the elevator was
purely to sort requests based on sector criteria, then I'd agree that
noop was the best way to go. But the elevator also arbitrates access to
the disk for processes. Even if you don't pay a seek penalty, you still
would rather like to get your sync reads in without having to wait for
that huge writer that just queued hundreds of megabytes of io in front
of you (and will have done so behind your read, making you wait again
for a subsequent read).

My plan in this area is to add a simple storage profile and attach it to
the queue. Just start simple, allow a device driver to inform the block
layer that this device has no seek penalty. Then the io scheduler can
make more informed decisions on what to do - eg for ssd, sector
proximity may not have much meaning, so we should not take that into
account.
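
To make that concrete, a sketch of how such a hint might be surfaced
to userspace -- the sysfs attribute here is hypothetical, not
something the kernel exposes today:

# hypothetical: the driver sets a "no seek penalty" flag on the queue
cat /sys/block/sda/queue/rotational        # 0 would mean no seek penalty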

--
Jens Axboe

2007-08-23 12:45:58

by James Courtier-Dutton

Subject: Re: file system for solid state disks

Daniel J Blueman wrote:
> On 23 Aug, 07:00, Jan Engelhardt <[email protected]> wrote:
>
>> On Aug 23 2007 01:01, Richard Ballantyne wrote:
>>
>>> What file system that is already in the linux kernel do people recommend
>>> I use for my laptop that now contains a solid state disk?
>>>
>> If I had to choose, the list of options seems to be:
>>
>> - logfs
>> [unmerged]
>>
>> - UBI layer with any fs you like
>> [just a guess]
>>
>> - UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
>> [does not support ACLs/quotas]
>>
>
> Isn't it that with modern rotational wear-levelling, re-writing hot
> blocks many times is not an issue, as they are internally moved around
> anyway? So, using a journalled filesystem such as ext3 is still good
> (robustness and maturity in mind). Due to lack of write buffering,
> perhaps a wandering log (journal) filesystem would be more suitable
> though? I use ext3 on my >35MB/s compact flash filesystem.
>
> I can see an advantage in selecting a lower-complexity filesystem,
> since no additional spatial optimisation is needed, but those
> optimisations do buy other efficiency (eg the Orlov allocator
> reducing fragmentation, thus less overhead), right?
>
> Also, it would be natural to employ 'elevator=none', but perhaps there
> is a small advantage in holding a group of flash blocks 'ready' (like
> SDRAM pages being selected on-chip for lower bus access latency) -
> however this no longer holds when logical->physical remapping is
> performed, so perhaps it's better without an elevator.
>
> Clearly, benchmarks speak...but perhaps it would make sense to have
> libata disable the elevator for the (compact) flash block device?
>
> Daniel
>

Also, sector read-ahead will actually hurt performance on flash,
instead of speeding things up as it does with a spinning disc.
For example, a request might read 128 sectors instead of the one
requested, at little or no extra cost on a spinning disc.
For flash, reading 128 sectors instead of the one requested will have a
noticeable performance impact.
Spinning discs have high seek latency, low serial sector read latency
and equal latency for reads and writes; flash has low seek latency,
high serial sector read latency, and longer write than read times.

James

2007-08-23 12:56:29

by Daniel J Blueman

Subject: Re: file system for solid state disks

Hi Fengguang,

On 23/08/07, James Courtier-Dutton <[email protected]> wrote:
> Daniel J Blueman wrote:
> > On 23 Aug, 07:00, Jan Engelhardt <[email protected]> wrote:
> >> On Aug 23 2007 01:01, Richard Ballantyne wrote:
> >>
> >>> What file system that is already in the linux kernel do people recommend
> >>> I use for my laptop that now contains a solid state disk?
> >>>
> >> If I had to choose, the list of options seems to be:
> >>
> >> - logfs
> >> [unmerged]
> >>
> >> - UBI layer with any fs you like
> >> [just a guess]
> >>
> >> - UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
> >> [does not support ACLs/quotas]
> >
> > Isn't it that with modern rotational wear-levelling, re-writing hot
> > blocks many times is not an issue, as they are internally moved around
> > anyway? So, using a journalled filesystem such as ext3 is still good
> > (robustness and maturity in mind). Due to lack of write buffering,
> > perhaps a wandering log (journal) filesystem would be more suitable
> > though? I use ext3 on my >35MB/s compact flash filesystem.
> >
> > I can see an advantage in selecting a lower-complexity filesystem,
> > since no additional spatial optimisation is needed, but those
> > optimisations do buy other efficiency (eg the Orlov allocator
> > reducing fragmentation, thus less overhead), right?
> >
> > Also, it would be natural to employ 'elevator=none', but perhaps there
> > is a small advantage in holding a group of flash blocks 'ready' (like
> > SDRAM pages being selected on-chip for lower bus access latency) -
> > however this no longer holds when logical->physical remapping is
> > performed, so perhaps it's better without an elevator.
> >
> > Clearly, benchmarks speak...but perhaps it would make sense to have
> > libata disable the elevator for the (compact) flash block device?
> >
> > Daniel
>
> Also, sector read-ahead will actually hurt performance on flash,
> instead of speeding things up as it does with a spinning disc.
> For example, a request might read 128 sectors instead of the one
> requested, at little or no extra cost on a spinning disc.
> For flash, reading 128 sectors instead of the one requested will have a
> noticeable performance impact.
> Spinning discs have high seek latency, low serial sector read latency
> and equal latency for reads and writes; flash has low seek latency,
> high serial sector read latency, and longer write than read times.

I was having problems invoking the readahead logic on my compact flash
rootfs (ext3) when tweaking the RA with 'hdparm -a' from 8 to 1024
sectors and running some benchmarks (I forget which).
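
For reference, the same setting can also be inspected and changed with
blockdev (a sketch; the device name is a placeholder):

blockdev --getra /dev/sdX        # read-ahead in 512-byte sectors
blockdev --setra 1024 /dev/sdX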

Fengguang, what is your favourite benchmark for finding differences in
readahead values (running on eg ext3 on a flashdisk), with the current
RA semantics in mainline kernels (eg 2.6.23-rc3)?

Thanks,
Daniel
--
Daniel J Blueman

2007-08-23 13:44:17

by Wu Fengguang

Subject: Re: file system for solid state disks

On Thu, Aug 23, 2007 at 01:56:17PM +0100, Daniel J Blueman wrote:
> Hi Fengguang,
>
> On 23/08/07, James Courtier-Dutton <[email protected]> wrote:
> > Daniel J Blueman wrote:
> > > On 23 Aug, 07:00, Jan Engelhardt <[email protected]> wrote:
> > >> On Aug 23 2007 01:01, Richard Ballantyne wrote:
> > >>
> > >>> What file system that is already in the linux kernel do people recommend
> > >>> I use for my laptop that now contains a solid state disk?
> > >>>
> > >> If I had to choose, the list of options seems to be:
> > >>
> > >> - logfs
> > >> [unmerged]
> > >>
> > >> - UBI layer with any fs you like
> > >> [just a guess]
> > >>
> > >> - UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
> > >> [does not support ACLs/quotas]
> > >
> > > Isn't it that with modern rotational wear-levelling, re-writing hot
> > > blocks many times is not an issue, as they are internally moved around
> > > anyway? So, using a journalled filesystem such as ext3 is still good
> > > (robustness and maturity in mind). Due to lack of write buffering,
> > > perhaps a wandering log (journal) filesystem would be more suitable
> > > though? I use ext3 on my >35MB/s compact flash filesystem.
> > >
> > > I can see an advantage in selecting a lower-complexity filesystem,
> > > since no additional spatial optimisation is needed, but those
> > > optimisations do buy other efficiency (eg the Orlov allocator
> > > reducing fragmentation, thus less overhead), right?
> > >
> > > Also, it would be natural to employ 'elevator=none', but perhaps there
> > > is a small advantage in holding a group of flash blocks 'ready' (like
> > > SDRAM pages being selected on-chip for lower bus access latency) -
> > > however this no longer holds when logical->physical remapping is
> > > performed, so perhaps it's better without an elevator.
> > >
> > > Clearly, benchmarks speak...but perhaps it would make sense to have
> > > libata disable the elevator for the (compact) flash block device?
> > >
> > > Daniel
> >
> > Also, sector read-ahead will actually hurt performance on flash,
> > instead of speeding things up as it does with a spinning disc.
> > For example, a request might read 128 sectors instead of the one
> > requested, at little or no extra cost on a spinning disc.
> > For flash, reading 128 sectors instead of the one requested will have a
> > noticeable performance impact.
> > Spinning discs have high seek latency, low serial sector read latency
> > and equal latency for reads and writes; flash has low seek latency,
> > high serial sector read latency, and longer write than read times.

A little bit of readahead will be helpful for flash memory. Its latency is
low, but surely not zero. Asynchronous readahead will help to hide the latency.

> I was having problems invoking the readahead logic on my compact flash
> rootfs (ext3) when tweaking the RA with 'hdparm -a' from 8 to 1024
> sectors and running some benchmarks (I forget which).
>
> Fengguang, what is your favourite benchmark for finding differences in
> readahead values (running on eg ext3 on a flashdisk), with the current
> RA semantics in mainline kernels (eg 2.6.23-rc3)?

My favorite test cases are

big file:
time cp $file /dev/null &>/dev/null
time dd if=$file of=/dev/null bs=${bs:-4k} &>/dev/null

big file, parallel:
time diff $file $file.clone

small files:
time grep -qr 'doruimi' $dir 2>/dev/null

Don't forget to clear the page cache before each run:
echo 3 > /proc/sys/vm/drop_caches
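
A sketch tying these together across several readahead sizes (the
device name is a placeholder):

for ra in 8 64 256 1024; do
        blockdev --setra $ra /dev/sdX
        echo 3 > /proc/sys/vm/drop_caches
        time cp $file /dev/null
done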


Cheers,
Fengguang

2007-08-23 15:09:43

by Daniel J Blueman

Subject: Re: file system for solid state disks

On 23/08/07, Fengguang Wu <[email protected]> wrote:
> On Thu, Aug 23, 2007 at 01:56:17PM +0100, Daniel J Blueman wrote:
> > Hi Fengguang,
> >
> > On 23/08/07, James Courtier-Dutton <[email protected]> wrote:
> > > Daniel J Blueman wrote:
> > > > On 23 Aug, 07:00, Jan Engelhardt <[email protected]> wrote:
> > > >> On Aug 23 2007 01:01, Richard Ballantyne wrote:
> > > >>
> > > >>> What file system that is already in the linux kernel do people recommend
> > > >>> I use for my laptop that now contains a solid state disk?
> > > >>>
> > > >> If I had to choose, the list of options seems to be:
> > > >>
> > > >> - logfs
> > > >> [unmerged]
> > > >>
> > > >> - UBI layer with any fs you like
> > > >> [just a guess]
> > > >>
> > > >> - UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
> > > >> [does not support ACLs/quotas]
> > > >
> > > > Isn't it that with modern rotational wear-levelling, re-writing hot
> > > > blocks many times is not an issue, as they are internally moved around
> > > > anyway? So, using a journalled filesystem such as ext3 is still good
> > > > (robustness and maturity in mind). Due to lack of write buffering,
> > > > perhaps a wandering log (journal) filesystem would be more suitable
> > > > though? I use ext3 on my >35MB/s compact flash filesystem.
> > > >
> > > > I can see an advantage in selecting a lower-complexity filesystem,
> > > > since no additional spatial optimisation is needed, but those
> > > > optimisations do buy other efficiency (eg the Orlov allocator
> > > > reducing fragmentation, thus less overhead), right?
> > > >
> > > > Also, it would be natural to employ 'elevator=none', but perhaps there
> > > > is a small advantage in holding a group of flash blocks 'ready' (like
> > > > SDRAM pages being selected on-chip for lower bus access latency) -
> > > > however this no longer holds when logical->physical remapping is
> > > > performed, so perhaps it's better without an elevator.
> > > >
> > > > Clearly, benchmarks speak...but perhaps it would make sense to have
> > > > libata disable the elevator for the (compact) flash block device?
> > > >
> > > > Daniel
> > >
> > > Also, sector read-ahead will actually hurt performance on flash,
> > > instead of speeding things up as it does with a spinning disc.
> > > For example, a request might read 128 sectors instead of the one
> > > requested, at little or no extra cost on a spinning disc.
> > > For flash, reading 128 sectors instead of the one requested will have a
> > > noticeable performance impact.
> > > Spinning discs have high seek latency, low serial sector read latency
> > > and equal latency for reads and writes; flash has low seek latency,
> > > high serial sector read latency, and longer write than read times.
>
> A little bit of readahead will be helpful for flash memory. Its latency is
> low, but surely not zero. Asynchronous readahead will help to hide the latency.
>
> > I was having problems invoking the readahead logic on my compact flash
> > rootfs (ext3) when tweaking the RA with 'hdparm -a' from 8 to 1024
> > sectors and running some benchmarks (I forget which).
> >
> > Fengguang, what is your favourite benchmark for finding differences in
> > readahead values (running on eg ext3 on a flashdisk), with the current
> > RA semantics in mainline kernels (eg 2.6.23-rc3)?
>
> My favorite test cases are
>
> big file:
> time cp $file /dev/null &>/dev/null
> time dd if=$file of=/dev/null bs=${bs:-4k} &>/dev/null
>
> big file, parallel:
> time diff $file $file.clone
>
> small files:
> time grep -qr 'doruimi' $dir 2>/dev/null
>
> Don't forget to clear the page cache before each run:
> echo 3 > /proc/sys/vm/drop_caches

The maximal case we're looking for is where up to 1024-block
read-ahead doesn't pay off, but actually wastes finite bandwidth, and
thus time.

We clearly need a database-type workload that reads forward enough to
open the RA window some, but then reads at a different location. We
can then show that we benefit from a smaller read-ahead window, at
negligible cost to your linear case.

To open the RA window, I know we need no competing threads, but how
far do we need to read sequentially? I'll cook up a micro-benchmark
with memory-backed files.
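
Something along these lines, perhaps -- a sketch with dd, where $file
is a large test file and the counts are arbitrary:

# sequential phase: read enough to open the RA window
dd if=$file of=/dev/null bs=4k count=1024
# then jump to an unrelated offset, where a big RA window is wasted
dd if=$file of=/dev/null bs=4k count=64 skip=500000
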
--
Daniel J Blueman

2007-08-25 09:12:52

by Just Marc

Subject: Re: file system for solid state disks

Hi,

It's important to note that disk-replacement type SSDs perform much
better with very small block operations, generally 512 bytes. So the
lower your file system block size, the better -- this will be the single
most significant performance tweak you can make. That matches the
benchmarks I've seen, where the difference between 4KB and 512-byte
block sizes was almost 100%. YMMV -- always benchmark.
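
For ext3 that means choosing the block size at mkfs time -- a sketch;
note that ext2/3's minimum block size is 1024 bytes, not 512:

mke2fs -j -b 1024 /dev/sdX1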

On SSDs which contain built in wear leveling, pretty much any file
system can be used. For SSDs that lack such low level housekeeping,
use stuff like JFFS2.
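
JFFS2 sits on raw MTD flash rather than on a block device -- a minimal
usage sketch, with the MTD partition number a placeholder:

mount -t jffs2 /dev/mtdblock0 /mnt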

Marc

2007-08-29 17:36:57

by Bill Davidsen

Subject: Re: file system for solid state disks

Jens Axboe wrote:
> On Thu, Aug 23 2007, Theodore Tso wrote:
>> On Thu, Aug 23, 2007 at 07:52:46AM +0200, Jan Engelhardt wrote:
>>> On Aug 23 2007 01:01, Richard Ballantyne wrote:
>>>> What file system that is already in the linux kernel do people recommend
>>>> I use for my laptop that now contains a solid state disk?
>>> If I had to choose, the list of options seems to be:
>>>
>>> - logfs
>>> [unmerged]
>>>
>>> - UBI layer with any fs you like
>>> [just a guess]
>> The question is whether the solid state disk gives you access to the
>> raw flash, or whether you have to go through the flash translation
>> layer because it's trying to look (exclusively) like a PATA or SATA
>> drive. There are some SSD's that have a form factor and interfaces
>> that make them a drop-in replacement for a laptop hard drive, and a
>> number of the newer laptops that are supporting SSD's seem to be these
>> because (a) they don't have to radically change their design, (b) so
>> they can be compatible with Windows, and (c) so that users can
>> purchase the laptop either with a traditional hard drive or an SSD as
>> an option, since at the moment SSD's are far more expensive than
>> disks.
>>
>> So if you can't get access to the raw flash layer, then what you're
>> probably going to be looking at is a traditional block-oriented
>> filesystem, such as ext3, although there are clearly some things that
>> could be done such as disabling the elevator.
>
> It's more complicated than that, I'd say. If the job of the elevator was
> purely to sort requests based on sector criteria, then I'd agree that
> noop was the best way to go. But the elevator also arbitrates access to
> the disk for processes. Even if you don't pay a seek penalty, you still
> would rather like to get your sync reads in without having to wait for
> that huge writer that just queued hundreds of megabytes of io in front
> of you (and will have done so behind your read, making you wait again
> for a subsequent read).

In most cases the time spent in the elevator is minimal compared to the
benefits, even without your next suggestion.
>
> My plan in this area is to add a simple storage profile and attach it to
> the queue. Just start simple, allow a device driver to inform the block
> layer that this device has no seek penalty. Then the io scheduler can
> make more informed decisions on what to do - eg for ssd, sector
> proximity may not have much meaning, so we should not take that into
> account.
>
Eventually the optimal solution may require both bandwidth and seek
information. If "solid state disk" means flash on a peripheral bus,
it's probably not all that fast in transfer rate. If it means NV
memory, battery-backed or core, probably nothing changes, again *if*
it's on a peripheral bus; but if it's on a card plugged into the
backplane, the transfer rate may be high enough that ordering costs
more than waiting. This could be extended to nbd and iSCSI devices as
well, I think, to optimize performance.

Your plan seems a good one in this area, but if you agree that transfer
rate will be important (if it isn't already), perhaps you can leave
room in the design for that capability to be added easily.

--
Bill Davidsen <[email protected]>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot

2007-08-29 17:58:01

by Jens Axboe

Subject: Re: file system for solid state disks

On Wed, Aug 29 2007, Bill Davidsen wrote:
> Jens Axboe wrote:
> >On Thu, Aug 23 2007, Theodore Tso wrote:
> >>On Thu, Aug 23, 2007 at 07:52:46AM +0200, Jan Engelhardt wrote:
> >>>On Aug 23 2007 01:01, Richard Ballantyne wrote:
> >>>>What file system that is already in the linux kernel do people recommend
> >>>>I use for my laptop that now contains a solid state disk?
> >>>If I had to choose, the list of options seems to be:
> >>>
> >>>- logfs
> >>> [unmerged]
> >>>
> >>>- UBI layer with any fs you like
> >>> [just a guess]
> >>The question is whether the solid state disk gives you access to the
> >>raw flash, or whether you have to go through the flash translation
> >>layer because it's trying to look (exclusively) like a PATA or SATA
> >>drive. There are some SSD's that have a form factor and interfaces
> >>that make them a drop-in replacement for a laptop hard drive, and a
> >>number of the newer laptops that are supporting SSD's seem to be these
> >>because (a) they don't have to radically change their design, (b) so
> >>they can be compatible with Windows, and (c) so that users can
> >>purchase the laptop either with a traditional hard drive or an SSD as
> >>an option, since at the moment SSD's are far more expensive than
> >>disks.
> >>
> >>So if you can't get access to the raw flash layer, then what you're
> >>probably going to be looking at is a traditional block-oriented
> >>filesystem, such as ext3, although there are clearly some things that
> >>could be done such as disabling the elevator.
> >
> >It's more complicated than that, I'd say. If the job of the elevator was
> >purely to sort requests based on sector criteria, then I'd agree that
> >noop was the best way to go. But the elevator also arbitrates access to
> >the disk for processes. Even if you don't pay a seek penalty, you still
> >would rather like to get your sync reads in without having to wait for
> >that huge writer that just queued hundreds of megabytes of io in front
> >of you (and will have done so behind your read, making you wait again
> >for a subsequent read).
>
> In most cases the time spent in the elevator is minimal compared to the
> benefits, even without your next suggestion.

Runtime overhead, yes. Head optimizations like trying to avoid seeks,
definitely no. Those can be several milliseconds per request, and if you
waste that time often, then you are going noticeably slower than you
could be.

> >My plan in this area is to add a simple storage profile and attach it to
> >the queue. Just start simple, allow a device driver to inform the block
> >layer that this device has no seek penalty. Then the io scheduler can
> >make more informed decisions on what to do - eg for ssd, sector
> >proximity may not have much meaning, so we should not take that into
> >account.
> >
> Eventually the optimal solution may require both bandwidth and seek
> information. If "solid state disk" means flash on a peripheral bus,
> it's probably not all that fast in transfer rate. If it means NV
> memory, battery-backed or core, probably nothing changes, again *if*
> it's on a peripheral bus; but if it's on a card plugged into the
> backplane, the transfer rate may be high enough that ordering costs
> more than waiting. This could be extended to nbd and iSCSI devices as
> well, I think, to optimize performance.

I've yet to see any real runtime overhead problems for any workload, so
the ordering is not an issue imo. It's easy enough to bypass for any io
scheduler, should it become interesting.

--
Jens Axboe

2007-08-30 18:25:25

by Jan Engelhardt

Subject: Re: file system for solid state disks


On Aug 25 2007 09:41, Just Marc wrote:
>
> On SSDs which contain built in wear leveling, pretty much any file
> system can be used. For SSDs that lack such low level housekeeping,
> use stuff like JFFS2.

The question is, how can you find out whether it does automatic
wear-leveling? (Perhaps when a CF is advertised as "holds 10 years!"?)


Jan
--

2007-08-30 18:32:59

by Just Marc

Subject: Re: file system for solid state disks

One must consult the documentation of the device. This wear leveling
is low level, and most devices do not export any information about it.
Recent SSDs are starting to export some values through SMART that let
you monitor their state.
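
For drives that do expose them, smartctl will list the raw attributes
(a sketch; which attributes appear, if any, is entirely up to the
vendor):

smartctl -A /dev/sdX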

Some companies think that hiding is better than exposing...


Jan Engelhardt wrote:
> On Aug 25 2007 09:41, Just Marc wrote:
>
>> On SSDs which contain built in wear leveling, pretty much any file
>> system can be used. For SSDs that lack such low level housekeeping,
>> use stuff like JFFS2.
>>
>
> The question is, how can you find out whether it does automatic
> wear-leveling? (Perhaps when a CF is advertised as "holds 10 years!"?)
>
>
> Jan
>

2007-09-05 12:34:35

by Denys Vlasenko

Subject: Re: file system for solid state disks

On Thursday 23 August 2007 09:55, Daniel J Blueman wrote:
> On 23 Aug, 07:00, Jan Engelhardt <[email protected]> wrote:
> > On Aug 23 2007 01:01, Richard Ballantyne wrote:
> > >What file system that is already in the linux kernel do people recommend
> > >I use for my laptop that now contains a solid state disk?
> >
> > If I had to choose, the list of options seems to be:
> >
> > - logfs
> > [unmerged]
> >
> > - UBI layer with any fs you like
> > [just a guess]
> >
> > - UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
> > [does not support ACLs/quotas]
>
> Isn't it that with modern rotational wear-levelling, re-writing hot
> blocks many times is not an issue, as they are internally moved around
> anyway? So, using a journalled filesystem such as ext3 is still good
> (robustness and maturity in mind).

Crap hardware (one which only _claims_ to do it) is out there,
and is typically cheaper, so users preferentially buy that ;)
--
vda

2007-09-05 12:56:31

by linux-os (Dick Johnson)

Subject: Re: file system for solid state disks


On Wed, 5 Sep 2007, Denys Vlasenko wrote:

> On Thursday 23 August 2007 09:55, Daniel J Blueman wrote:
>> On 23 Aug, 07:00, Jan Engelhardt <[email protected]> wrote:
>>> On Aug 23 2007 01:01, Richard Ballantyne wrote:
>>>> What file system that is already in the linux kernel do people recommend
>>>> I use for my laptop that now contains a solid state disk?
>>>
>>> If I had to choose, the list of options seems to be:
>>>
>>> - logfs
>>> [unmerged]
>>>
>>> - UBI layer with any fs you like
>>> [just a guess]
>>>
>>> - UDF in Spared Flavor (mkudffs --media-type=cdrw --utf8)
>>> [does not support ACLs/quotas]
>>
>> Isn't it that with modern rotational wear-levelling, re-writing hot
>> blocks many times is not an issue, as they are internally moved around
>> anyway? So, using a journalled filesystem such as ext3 is still good
>> (robustness and maturity in mind).
>
> Crap hardware (one which only _claims_ to do it) is out there,
> and is typically cheaper, so users preferentially buy that ;)
> --
> vda

You might want to check and see what is actually being
used for the solid-state disk. Some solid state disks
are SRAM- or DRAM-based. SRAM is fast, doesn't require refresh,
is now as cheap as flash, and does R/W forever. It retains
its data for 10 years after power is removed, by using an
embedded battery.

http://en.wikipedia.org/wiki/Solid_state_drive

This is exactly what I proposed on this list a long
time ago. It is now a reality.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.22.1 on an i686 machine (5588.30 BogoMips).
My book : http://www.AbominableFirebug.com/

2007-09-05 13:05:15

by Manu Abraham

Subject: Re: file system for solid state disks

linux-os (Dick Johnson) wrote:


> http://en.wikipedia.org/wiki/Solid_state_drive
>
> This is exactly what I proposed on this list a long
> time ago. It is now a reality.

It's been around for a couple of years ;-)

http://forum.pcvsconsole.com/viewthread.php?tid=15802
http://www.anandtech.com/tradeshows/showdoc.aspx?i=2431&p=5