2004-03-03 18:57:39

by Steve Longerbeam

Subject: new special filesystem for consideration in 2.6/2.7

MontaVista Software has developed a new filesystem
targeted for embedded systems that we would like to
have considered for inclusion in 2.6 or 2.7. It is
called the Protected and Persistent RAM Special Filesystem
(PRAMFS). It was originally developed for three major consumer
electronics companies for use in their smart cell phones
and other consumer devices.

An intro to PRAMFS along with a technical specification
is at the SourceForge project web page at
http://pramfs.sourceforge.net/. A patch for 2.6.3 has
been released at the SF project site.

PRAMFS can be tested on a desktop by reserving some portion
of physical memory with "mem=". For example, a machine with
512M could reserve the top 32M with "mem=480M". PRAMFS would
then be mounted with:

mount -t pramfs -o physaddr=0x1e000000,init=0x2000000 none /mnt/pramfs
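The addresses in that command follow directly from the memory sizes. A minimal sketch of the arithmetic in Python, assuming system RAM is contiguous and starts at physical address 0 (the helper name is made up for illustration and is not part of PRAMFS):

```python
MIB = 1024 * 1024


def pramfs_mount_args(total_mib, reserved_mib):
    """Return (mem= argument, physaddr, init) for reserving the top of RAM.

    Assumes RAM is contiguous and begins at physical address 0.
    """
    usable_mib = total_mib - reserved_mib
    mem_arg = f"mem={usable_mib}M"
    physaddr = usable_mib * MIB       # first byte the kernel no longer manages
    init_size = reserved_mib * MIB    # size of the reserved region in bytes
    return mem_arg, hex(physaddr), hex(init_size)


print(pramfs_mount_args(512, 32))
# ('mem=480M', '0x1e000000', '0x2000000')
```

On machines where RAM does not start at address 0, the physaddr value would need the RAM base added to it.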

Thanks for your comments and consideration.

Steve



2004-03-05 18:09:20

by Steve Longerbeam

Subject: Re: new special filesystem for consideration in 2.6/2.7



Steve Longerbeam wrote:

> An intro to PRAMFS along with a technical specification
> is at the SourceForge project web page at
> http://pramfs.sourceforge.net/. A patch for 2.6.3 has
> been released at the SF project site.


A new patch for 2.6.4 is available, but it and the 2.6.3 patch
are each ~2900 lines, so I won't post here. But here's the intro:


PRAMFS Overview
===============


Many embedded systems have a block of non-volatile RAM separate from
normal system memory, i.e. memory for which the kernel maintains no page
descriptors. For such systems it would be beneficial to mount a
fast read/write filesystem over this "I/O memory", for storing frequently
accessed data that must survive system reboots and power cycles. An
example usage might be system logs under /var/log, or a user address
book in a cell phone or PDA.


Currently Linux has no support for a persistent, non-volatile RAM-based
filesystem, persistent meaning the filesystem survives a system reboot
or power cycle intact. The current RAM-based filesystems such as tmpfs
and ramfs have no actual backing store but exist entirely in the page and
buffer caches, hence the filesystem disappears after a system reboot or
power cycle.


A relatively straightforward solution is to write a simple block driver
for the non-volatile RAM, and mount over it any disk-based filesystem such
as ext2/ext3, reiserfs, etc.


But the disk-based fs over non-volatile RAM block driver approach has
some drawbacks:


1. Disk-based filesystems such as ext2/ext3 were designed for optimum
performance on spinning disk media, so they implement features such
as block groups, which attempt to group inode data into a contiguous
set of data blocks to minimize disk seeking when accessing files. For
RAM there is no such concern; a file's data blocks can be scattered
throughout the media with no access speed penalty at all. So block
groups in a filesystem mounted over RAM just add unnecessary
complexity. A better approach is to use a filesystem specifically
tailored to RAM media which does away with these disk-based features.
This increases the efficient use of space on the media, i.e. more
space is dedicated to actual file data storage and less to meta-data
needed to maintain that file data.


2. If the backing-store RAM is comparable in access speed to system memory,
there's really no point in caching the file I/O data in the page
cache. Better to move file data directly between the user buffers
and the backing store RAM, i.e. use direct I/O. This prevents the
unnecessary populating of the page cache with dirty pages. However
direct I/O has to be enabled at every file open. To enable direct
I/O at all times for all regular files requires either that
applications be modified to include the O_DIRECT flag on all file
opens, or that a new filesystem be used that always performs direct
I/O by default.


The Persistent/Protected RAM Special Filesystem (PRAMFS) is a
full-featured read/write filesystem that has been designed to address
these issues. PRAMFS is targeted to fast I/O memory, and if the memory
is non-volatile, the filesystem will be persistent.


In PRAMFS, direct I/O is enabled across all files in the filesystem, in
other words the O_DIRECT flag is forced on every open of a PRAMFS file.
Also, file I/O in the PRAMFS is always synchronous. There is no need
to block the current process while the transfer to/from the PRAMFS
is in progress, since one of the requirements of the PRAMFS is that the
filesystem exist in fast RAM. So file I/O in PRAMFS is always direct,
synchronous, and never blocks.


The data organization in PRAMFS can be thought of as an extremely
simplified version of ext2, such that the ratio of data to meta-data is
very high.


PRAMFS is also write protected. The page table entries that map the
backing-store RAM are normally marked read-only. Write operations into
the filesystem temporarily mark the affected pages as writeable, the
write operation is carried out with locks held, and then the pte is
marked read-only again. This feature provides some protection against
filesystem corruption caused by errant writes into the RAM due to
kernel bugs for instance. In case there are systems where the write
protection is not possible (for instance the RAM cannot be mapped
with page tables), this feature can be disabled with the
CONFIG_PRAMFS_NOWP config option.
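
The unprotect, write, re-protect sequence can be modelled in user space. The following Python sketch is only an illustration of the idea, not the PRAMFS code: the class and method names are hypothetical, a single flag stands in for the per-page table entries, and a threading lock stands in for the kernel locks.

```python
import threading
from contextlib import contextmanager


class ProtectedRegion:
    """User-space model of a RAM region that is normally read-only."""

    def __init__(self, size):
        self._data = bytearray(size)
        self._writable = False           # models the read-only page table entries
        self._lock = threading.Lock()    # models the lock held during a write

    def read(self, offset, length):
        return bytes(self._data[offset:offset + length])

    def raw_store(self, offset, payload):
        """A bare store into the region, as any stray code could issue."""
        if not self._writable:
            raise PermissionError("region is write protected")
        if offset < 0 or offset + len(payload) > len(self._data):
            raise ValueError("write outside the region")
        self._data[offset:offset + len(payload)] = payload

    @contextmanager
    def _writable_window(self):
        with self._lock:
            self._writable = True        # mark the region writeable
            try:
                yield
            finally:
                self._writable = False   # always restore read-only

    def fs_write(self, offset, payload):
        """The filesystem's own write path: unprotect, write, re-protect."""
        with self._writable_window():
            self.raw_store(offset, payload)


region = ProtectedRegion(16)
region.fs_write(0, b"log")
print(region.read(0, 3))
```

In this model a call to raw_store() outside fs_write() raises PermissionError, which plays the role of the fault an errant kernel write would take on a read-only mapping.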


In summary, PRAMFS is a light-weight, full-featured, and space-efficient
special filesystem that is ideal for systems with a block of fast
non-volatile RAM that need to access data on it using a standard
filesystem interface.

Supported mount options
=======================


The PRAMFS currently requires one mount option, and there are several
optional mount options:


physaddr=   Required. It tells PRAMFS the physical address of the
            start of the RAM that makes up the filesystem. The
            physical address must be located on a page boundary.


init=       Optional. It is used to initialize the memory to an
            empty filesystem. Any data in an existing filesystem
            will be lost if this option is given. The parameter to
            "init=" is the RAM size in bytes.


bs=         Optional. It is used to specify a block size. It is
            ignored if the "init=" option is not specified, since
            otherwise the block size is read from the PRAMFS
            super-block. The default blocksize is 2048 bytes,
            and the allowed block sizes are 512, 1024, 2048, and
            4096.


bpi=        Optional. It is used to specify the bytes per inode
            ratio, i.e. for every N bytes in the filesystem, an
            inode will be created. This behaves the same as the "-i"
            option to mke2fs. It is ignored if the "init=" option is
            not specified.


N=          Optional. It is used to specify the number of inodes to
            allocate in the inode table. If the option is not
            specified, the bytes-per-inode ratio is used to
            calculate the number of inodes. If neither the "N=" nor
            "bpi=" options are specified, the default behavior is to
            reserve 5% of the total space in the filesystem for the
            inode table. This option behaves the same as the "-N"
            option to mke2fs. It is ignored if the "init=" option is
            not specified.

Examples:


mount -t pramfs -o physaddr=0x20000000,init=0x2F000,bs=1024 none /mnt/pram


This example locates the filesystem at physical address 0x20000000, and
also requests an empty filesystem be initialized, of total size 0x2f000
bytes and blocksize 1024. The mount point is /mnt/pram.


mount -t pramfs -o physaddr=0x20000000 none /mnt/pram


This example locates the filesystem at physical address 0x20000000 as in
the first example, but uses the intact filesystem that already exists.
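
The option rules above can be summarized as a small validator. This is only an illustrative Python sketch, not the PRAMFS option parser: the function name, the returned dictionary, and the 4096-byte page size are assumptions.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages
ALLOWED_BLOCK_SIZES = (512, 1024, 2048, 4096)
DEFAULT_BLOCK_SIZE = 2048


def parse_pramfs_options(opts):
    """Validate a PRAMFS-style "-o" string and return the effective settings."""
    fields = dict(item.split("=", 1) for item in opts.split(","))

    if "physaddr" not in fields:
        raise ValueError("physaddr= is required")
    physaddr = int(fields["physaddr"], 0)   # base 0 accepts 0x... and decimal
    if physaddr % PAGE_SIZE:
        raise ValueError("physaddr must be on a page boundary")

    result = {"physaddr": physaddr, "init": None}
    if "init" not in fields:
        # Without init=, bs=, bpi= and N= are ignored: the geometry of the
        # existing filesystem comes from its super-block.
        return result

    result["init"] = int(fields["init"], 0)
    bs = int(fields["bs"], 0) if "bs" in fields else DEFAULT_BLOCK_SIZE
    if bs not in ALLOWED_BLOCK_SIZES:
        raise ValueError("bs must be one of 512, 1024, 2048, 4096")
    result["bs"] = bs
    for key in ("bpi", "N"):
        if key in fields:
            result[key] = int(fields[key], 0)
    return result


print(parse_pramfs_options("physaddr=0x20000000,init=0x2F000,bs=1024"))
print(parse_pramfs_options("physaddr=0x20000000"))
```

The two calls correspond to the two mount examples: the first returns an init size and block size, the second only the physical address.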


2004-03-05 18:44:34

by Dave Jones

Subject: Re: new special filesystem for consideration in 2.6/2.7

On Fri, Mar 05, 2004 at 10:09:09AM -0800, Steve Longerbeam wrote:
> >An intro to PRAMFS along with a technical specification
> >is at the SourceForge project web page at
> >http://pramfs.sourceforge.net/. A patch for 2.6.3 has
> >been released at the SF project site.
>
>
> A new patch for 2.6.4 is available, but it and the 2.6.3 patch
> are each ~2900 lines, so I won't post here. But here's the intro:

Without commenting on the code, the biggest thing holding back
inclusion of this is likely the comment about there likely being
patents held on parts of that code.

Dave

2004-03-05 18:53:44

by Steve Kenton

Subject: Re: new special filesystem for consideration in 2.6/2.7

People have kicked around ideas for persistent memory use as a
disk replacement etc. with memory mapped data spaces, but until
there is actual (affordable) hardware it remains just an interesting
thought experiment.

If the recent news about gigabit MRAM being a real possibility in
the not too distant future pans out, this may become more important.

smk

2004-03-05 19:00:01

by Steve Longerbeam

Subject: Re: new special filesystem for consideration in 2.6/2.7



Dave Jones wrote:

>On Fri, Mar 05, 2004 at 10:09:09AM -0800, Steve Longerbeam wrote:
> > >An intro to PRAMFS along with a technical specification
> > >is at the SourceForge project web page at
> > >http://pramfs.sourceforge.net/. A patch for 2.6.3 has
> > >been released at the SF project site.
> >
> >
> > A new patch for 2.6.4 is available, but it and the 2.6.3 patch
> > are each ~2900 lines, so I won't post here. But here's the intro:
>
>Without commenting on the code, the biggest thing holding back
>inclusion of this is likely the comment about there likely being
>patents held on parts of that code.
>
> Dave
>

Dave, true MV has a patent pending, but it would only affect any future use
of the technology in a _non GPL_ operating system. Used in Linux or any
future GPL software, no patent licenses or royalties are involved at all.

I am not a lawyer, but I think I got that right :)

Steve

2004-03-06 12:43:59

by Pavel Machek

Subject: Re: new special filesystem for consideration in 2.6/2.7

Hi!

> (PRAMFS). It was originally developed for three major consumer
> electronics companies for use in their smart cell phones
> and other consumer devices.
>
> An intro to PRAMFS along with a technical specification
> is at the SourceForge project web page at
> http://pramfs.sourceforge.net/. A patch for 2.6.3 has
> been released at the SF project site.

Well, I'd certainly love to see some usable linux cell phones.
(Well, one such beast in my pocket would probably be enough :-)
(Is there a way to make linux cell phone without second
cpu just for GSM stack?)

Comments about pramfs: RAM is not really random access,
you'll find that doing byte-sized random reads is way slower
than linear read,
but you are right that it is very different from disk.

How do you handle powerfail in the middle of write?
Do you run fsck or do you have some kind of logging?


--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms

2004-03-07 03:06:25

by Mike Fedyk

Subject: Re: new special filesystem for consideration in 2.6/2.7

Steve Longerbeam wrote:
>
>
> Dave Jones wrote:
>
>> On Fri, Mar 05, 2004 at 10:09:09AM -0800, Steve Longerbeam wrote:
>> > >An intro to PRAMFS along with a technical specification
>> > >is at the SourceForge project web page at
>> > >http://pramfs.sourceforge.net/. A patch for 2.6.3 has
>> > >been released at the SF project site.
>> >
>> >
>> > A new patch for 2.6.4 is available, but it and the 2.6.3 patch
>> > are each ~2900 lines, so I won't post here. But here's the intro:
>>
>> Without commenting on the code, the biggest thing holding back
>> inclusion of this is likely the comment about there likely being
>> patents held on parts of that code.
>>
>> Dave
>>
>
> Dave, true MV has a patent pending, but it would only affect any future use
> of the technology in a _non GPL_ operating system. Used in Linux or any
> future GPL software, no patent licenses or royalties are involved at all.

A statement in legal terms should be in the patch.

2004-03-07 03:07:26

by Mike Fedyk

Subject: Re: new special filesystem for consideration in 2.6/2.7

Steve Kenton wrote:
> People have kicked around ideas for persistent memory use as a
> disk replacement etc. with memory mapped data spaces, but until
> there is actual (affordable) hardware it remains just an interesting
> thought experiment.
>
> If the recent news about gigabit MRAM being a real possibility in
> the not too distant future pans out, this may become more important.

This is a reality in embedded devices. Go read the message again...

2004-03-07 09:49:44

by Christoph Hellwig

Subject: Re: new special filesystem for consideration in 2.6/2.7

On Wed, Mar 03, 2004 at 10:57:37AM -0800, Steve Longerbeam wrote:
> An intro to PRAMFS along with a technical specification
> is at the SourceForge project web page at
> http://pramfs.sourceforge.net/. A patch for 2.6.3 has
> been released at the SF project site.

What about posting that patch here instead of hiding it behind
half a dozen indirections?

2004-03-07 10:23:52

by Willy Tarreau

Subject: Re: new special filesystem for consideration in 2.6/2.7

On Sun, Mar 07, 2004 at 09:49:42AM +0000, Christoph Hellwig wrote:
> On Wed, Mar 03, 2004 at 10:57:37AM -0800, Steve Longerbeam wrote:
> > An intro to PRAMFS along with a technical specification
> > is at the SourceForge project web page at
> > http://pramfs.sourceforge.net/. A patch for 2.6.3 has
> > been released at the SF project site.
>
> What about posting that patch here instead of hiding it behind
> half a dozen indirections?

Because he said it's 2900 lines. Seems fair enough to save vger's bandwidth.

Willy

2004-03-08 05:17:53

by Steve Kenton

Subject: Re: new special filesystem for consideration in 2.6/2.7

>> If the recent news about gigabit MRAM being a real possibility in
>> the not too distant future pans out, this may become more important.

>This is a reality in embedded devices. Go read the message again...

Umm, yes and no. I did not mean to dis this proposal because I think it
is worthwhile. Rather, I was thinking about the problems with really
large amounts of data. I don't really think that a few kilobytes or
megabytes of data need the same sort of infrastructure that will be
required for terabytes or petabytes. As an extreme example, the few bytes
of nvram in the CMOS clock chips in the original PC/AT did not require
much support, while the multiple terabytes of data in my raid farm at
work would be very vulnerable under this proposal, since a rogue process
could cause lots of damage in very short order, as would losing a memory
bank to hardware failure.

In the last discussion I saw on the topic on lkml, there was discussion
about whether to even preserve the volume/directory/file abstraction at
all for memory mapped data spaces. That discussion was quite speculative
given the lack of affordable *really large* nvram type storage to compete
with 100+ gigabyte disks and even larger raids. That situation may be
changing. Hence, this may become more important.

smk

2004-03-08 17:57:23

by Steve Longerbeam

Subject: Re: new special filesystem for consideration in 2.6/2.7



Pavel Machek wrote:

>Hi!
>
>
>
>>(PRAMFS). It was originally developed for three major consumer
>>electronics companies for use in their smart cell phones
>>and other consumer devices.
>>
>>An intro to PRAMFS along with a technical specification
>>is at the SourceForge project web page at
>>http://pramfs.sourceforge.net/. A patch for 2.6.3 has
>>been released at the SF project site.
>>
>>
>
>Well, I'd certainly love to see some usable linux cell phones.
>(Well, one such beast in my pocket would probably be enough :-)
>(Is there a way to make linux cell phone without second
>cpu just for GSM stack?)
>

one of the chips used in their cell phones is the TI OMAP1510.
It has an embedded TMS320c55 DSP as well as an ARM 925.

>
>Comments about pramfs: RAM is not really random access,
>you'll find that doing byte-sized random reads is way slower
>than linear read,
>but you are right that it is very different from disk.
>
>
>How do you handle powerfail in the middle of write?
>

good question, I don't - not in software anyway. But the companies
I mentioned may have implemented some kind of h/w safe shutdown,
but I'm not sure.

>Do you run fsck or do you have some kind of logging?
>

If you mean journaling, no, pramfs is not a journaling fs.

And you're right, I still need to write an fsck for pramfs.
At this point there is no way to recover a corrupt fs.

Steve


2004-03-08 18:37:53

by Steve Longerbeam

Subject: Re: new special filesystem for consideration in 2.6/2.7



Stephen M. Kenton wrote:

>>>If the recent news about gigabit MRAM being a real possibility in
>>>the not too distant future pans out, this may become more important.
>>>
>>>
>
>
>
>>This is a reality in embedded devices. Go read the message again...
>>
>>
>
>Umm, yes and no. I did not mean to dis this proposal because I think it
>is worthwhile. Rather, I was thinking about the problems with really
>large amounts of data. I don't really think that a few kilobytes or
>megabytes of data need the same sort of infrastructure that will be
>required for terabytes or petabytes. As an extreme example, the few bytes
>of nvram in the CMOS clock chips in the original PC/AT did not require
>much support, while the multiple terabytes of data in my raid farm at
>work would be very vulnerable under this proposal, since a rogue process
>could cause lots of damage in very short order, as would losing a memory
>bank to hardware failure.
>
>In the last discussion I saw on the topic on lkml, there was discussion
>about whether to even preserve the volume/directory/file abstraction at
>all for memory mapped data spaces. That discussion was quite speculative
>given the lack of affordable *really large* nvram type storage to compete
>with 100+ gigabyte disks and even larger raids. That situation may be
>changing. Hence, this may become more important.
>
>

Hi Steve, I should note that pramfs was not designed with *really large*
nvram storage in mind. It was designed more to use space efficiently on
small amounts of nvram. For instance, pramfs inodes only have a 2-d block
pointer table (vs ext2/ext3's 3-d i_block[14] pointer table), so the max
file size is (b/4)^2 blocks or b^3/16 bytes, b being the blocksize. Also,
offsets within the fs are unsigned longs, so there's a 4 gig limit already
on 32-bit machines.

So in short, for 10s or even 100s of MB, pramfs is fine, but for GB or more
storage it's not an appropriate fs.
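
To make the size limit concrete, the formula can be evaluated for each allowed block size. A small Python sketch, assuming 4-byte block pointers (which is what the b/4 term implies); the function names are illustrative only:

```python
POINTER_SIZE = 4  # bytes per block pointer, implied by the b/4 term
MIB = 1024 * 1024


def max_file_blocks(block_size):
    """Blocks reachable through a two-level (2-d) pointer table."""
    pointers_per_block = block_size // POINTER_SIZE
    return pointers_per_block ** 2


def max_file_bytes(block_size):
    """Equivalent to b^3/16 for block size b."""
    return max_file_blocks(block_size) * block_size


for bs in (512, 1024, 2048, 4096):
    print(bs, max_file_bytes(bs) // MIB, "MiB")
# 512 8 MiB
# 1024 64 MiB
# 2048 512 MiB
# 4096 4096 MiB
```

With 4096-byte blocks the per-file limit works out to 2^32 bytes, the same figure as the 4 gig offset limit on 32-bit machines.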

Steve


2004-03-08 18:43:03

by Steve Longerbeam

Subject: Re: new special filesystem for consideration in 2.6/2.7



Christoph Hellwig wrote:

>On Wed, Mar 03, 2004 at 10:57:37AM -0800, Steve Longerbeam wrote:
>
>
>>An intro to PRAMFS along with a technical specification
>>is at the SourceForge project web page at
>>http://pramfs.sourceforge.net/. A patch for 2.6.3 has
>>been released at the SF project site.
>>
>>
>
>What about posting that patch here instead of hiding it behind
>half a dozen indirections?
>

~2900 lines! But here's a more direct link:

http://prdownloads.sourceforge.net/pramfs/pramfs-2.6.4-1.0.2.tar.gz?download


2004-03-08 22:35:21

by Pavel Machek

Subject: Re: new special filesystem for consideration in 2.6/2.7

Hi!

> >>(PRAMFS). It was originally developed for three major consumer
> >>electronics companies for use in their smart cell phones
> >>and other consumer devices.
> >>
> >>An intro to PRAMFS along with a technical specification
> >>is at the SourceForge project web page at
> >>http://pramfs.sourceforge.net/. A patch for 2.6.3 has
> >>been released at the SF project site.
> >
> >Well, I'd certainly love to see some usable linux cell phones.
> >(Well, one such beast in my pocket would probably be enough :-)
> >(Is there a way to make linux cell phone without second
> >cpu just for GSM stack?)
> >
>
> one of the chips used in their cell phones is the TI OMAP1510.
> It has an embedded TMS320c55 DSP as well as an ARM 925.

Hmm, but GSM stack needs to be realtime, and it probably will not be
GPL compatible (?). It is just pure curiosity, but I wonder how that
one is being solved...

Well...

Probably GSM stack can be binary-only kernel module?

Pavel

--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-11 12:34:34

by Adrian Bunk

Subject: Re: new special filesystem for consideration in 2.6/2.7

On Mon, Mar 08, 2004 at 10:42:52AM -0800, Steve Longerbeam wrote:
>
>
> Christoph Hellwig wrote:
>
> >On Wed, Mar 03, 2004 at 10:57:37AM -0800, Steve Longerbeam wrote:
> >
> >
> >>An intro to PRAMFS along with a technical specification
> >>is at the SourceForge project web page at
> >>http://pramfs.sourceforge.net/. A patch for 2.6.3 has
> >>been released at the SF project site.
> >>
> >>
> >
> >What about posting that patch here instead of hiding it behind
> >half a dozen indirections?
> >
>
> ~2900 lines! But here's a more direct link:
>
> http://prdownloads.sourceforge.net/pramfs/pramfs-2.6.4-1.0.2.tar.gz?download

The best link (wget'able) is
http://dl.sf.net/pramfs/pramfs-2.6.4-1.0.2.tar.gz

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2004-03-18 19:19:52

by Tim Bird

Subject: Re: new special filesystem for consideration in 2.6/2.7

Mike Fedyk wrote:
> Steve Longerbeam wrote:
>> Dave Jones wrote:
>>> Without commenting on the code, the biggest thing holding back
>>> inclusion of this is likely the comment about there likely being
>>> patents held on parts of that code.
>>
>> Dave, true MV has a patent pending, but it would only affect any
>> future use
>> of the technology in a _non GPL_ operating system. Used in Linux or any
>> future GPL software, no patent licenses or royalties are involved at all.
>
> A statement in legal terms should be in the patch.

This statement is in a comment at the top of every file in
the patch:

* This software is being distributed under the terms of the GNU General Public
* License version 2. Some or all of the technology encompassed by this
* software may be subject to one or more patents pending as of the date of
* this notice. No additional patent license will be required for GPL
* implementations of the technology. If you want to create a non-GPL
* implementation of the technology encompassed by this software, please
* contact [email protected] for details including licensing terms and fees.

I believe that this meets any required legal criteria for unencumbered
use in the Linux kernel (or other GPL projects).

=============================
Tim Bird
Architecture Group Co-Chair
CE Linux Forum
Senior Staff Engineer
Sony Electronics
E-mail: [email protected]
=============================