Hi!
This patch makes a lot of sense in my eyes, but maybe someone
disagrees. Applies cleanly to current 2.4.
Comments?
J?rn
--
Everything should be made as simple as possible, but not simpler.
-- Albert Einstein
--- linux-2.4.20/fs/partitions/Config.in~msdospartitions 2002-11-29 00:53:15.000000000 +0100
+++ linux-2.4.20/fs/partitions/Config.in 2003-04-10 20:12:41.000000000 +0200
@@ -37,9 +37,7 @@
if [ "$CONFIG_ALPHA" = "y" ]; then
define_bool CONFIG_OSF_PARTITION y
fi
- if [ "$CONFIG_AMIGA" != "y" -a "$CONFIG_ATARI" != "y" -a \
- "$CONFIG_MAC" != "y" -a "$CONFIG_SGI_IP22" != "y" -a \
- "$CONFIG_SGI_IP27" != "y" ]; then
+ if [ "$CONFIG_PARTITION_ADVANCED" != "y" -a "$CONFIG_X86" = "y" ]; then
define_bool CONFIG_MSDOS_PARTITION y
fi
if [ "$CONFIG_AMIGA" = "y" -o "$CONFIG_AFFS_FS" = "y" ]; then
On Llu, 2003-05-05 at 22:08, Jörn Engel wrote:
> Hi!
>
> This patch makes a lot of sense in my eyes, but maybe someone
> disagrees. Applies cleanly to current 2.4.
>
> Comments?
MSDOS partitions turn up in lots of places beyond x86 boxes. The
port maintainers made their decisions for sound reasons
On Tue, 6 May 2003 11:48:11 +0100, Alan Cox wrote:
> On Llu, 2003-05-05 at 22:08, J?rn Engel wrote:
> >
> > This patch makes a lot of sense in my eyes, but maybe someone
> > disagrees. Applies cleanly to current 2.4.
> >
> > Comments?
>
> MSDOS partitions turn up in lots of places beyond x86 boxes. The
> port maintainers made their decisions for sound reasons
Which were?
My reasons for this patch:
- Even if x86 is not the only platform that uses MSDOS partitions,
this should be a positive list, not a negative list. Else every new
platform has to explicitly express, *not* to use MSDOS partitions.
Seems awkward.
- Any platform needing MSDOS partitions but not explicitly listed can
still turn that option on through the advanced partitioning.
Currently, allnoconfig gives you MSDOS partitioning, which is not very
intuitive.
J?rn
--
Mundie uses a textbook tactic of manipulation: start with some
reasonable talk, and lead the audience to an unreasonable conclusion.
-- Bruce Perens
> My reasons for this patch:
>
> - Even if x86 is not the only platform that uses MSDOS partitions,
> this should be a positive list, not a negative list. Else every new
> platform has to explicitly express, *not* to use MSDOS partitions.
> Seems awkward.
>
> - Any platform needing MSDOS partitions but not explicitly listed can
> still turn that option on through the advanced partitioning.
> Currently, allnoconfig gives you MSDOS partitioning, which is not very
> intuitive.
Every platform that supports USB will be able to read USB Storage
Devices which almost everytime have FAT filesystems with MSDOS partitions.
So short of S/390 you get like every platform.
Ciao, Marcus
On Tue, 6 May 2003 14:28:44 +0200, Marcus Meissner wrote:
>
> Every platform that supports USB will be able to read USB Storage
> Devices which almost everytime have FAT filesystems with MSDOS partitions.
>
> So short of S/390 you get like every platform.
And short of most embedded systems.
Wouldn't it make more sense to add USB to the positive list then?
(There is still the alternative in the back of my head, to add all the
mainly embedded cpus to the current negative list, until more people
agree that it is ugly.;)
J?rn
--
If you're willing to restrict the flexibility of your approach,
you can almost always do something better.
-- John Carmack
On Tue, May 06, 2003 at 02:42:12PM +0200, J?rn Engel wrote:
> On Tue, 6 May 2003 14:28:44 +0200, Marcus Meissner wrote:
> >
> > Every platform that supports USB will be able to read USB Storage
> > Devices which almost everytime have FAT filesystems with MSDOS partitions.
> >
> > So short of S/390 you get like every platform.
>
> And short of most embedded systems.
CF cards - these have MSDOS partition tables on. CF cards get used on
embedded systems.
Therefore, it follows that if you have an embedded system with a CF socket,
you'll probably want the MSDOS partitioning enabled.
--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
On Tue, 6 May 2003 15:03:18 +0100, Russell King wrote:
> On Tue, May 06, 2003 at 02:42:12PM +0200, J?rn Engel wrote:
> > On Tue, 6 May 2003 14:28:44 +0200, Marcus Meissner wrote:
> > >
> > > Every platform that supports USB will be able to read USB Storage
> > > Devices which almost everytime have FAT filesystems with MSDOS partitions.
> > >
> > > So short of S/390 you get like every platform.
> >
> > And short of most embedded systems.
>
> CF cards - these have MSDOS partition tables on. CF cards get used on
> embedded systems.
>
> Therefore, it follows that if you have an embedded system with a CF socket,
> you'll probably want the MSDOS partitioning enabled.
Maybe I was just thinking the wrong way. Given that my systems don't
use IDE, SCSI, a floppy or anything emulating one of them, like USB
storage or CF. I don't want MSDOS partitioning, but in fact, I don't
want any of the disk-centric code at all, fs/partitions is just a part
of that.
Then the real fix would be to have (no disk stuff) => (some magic) =>
(no MSDOS partitions,...).
My proposition for (some magic) would be to add an option for having
disks (or floppies, emulating stuff, yadda, yadda...) and have some
things like partitioning depend on that option. I better sharpen my
claymore to cut through all that code then.
J?rn
--
Fancy algorithms are buggier than simple ones, and they're much harder
to implement. Use simple algorithms as well as simple data structures.
-- Rob Pike
On Tue, May 06, 2003 at 05:32:52PM +0200, J?rn Engel wrote:
> Maybe I was just thinking the wrong way. Given that my systems don't
> use IDE, SCSI, a floppy or anything emulating one of them, like USB
> storage or CF. I don't want MSDOS partitioning, but in fact, I don't
> want any of the disk-centric code at all, fs/partitions is just a part
> of that.
Maybe introducing a CONFIG_DISK option and making partitioning as a whole
depend on that ?
--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
On Tue, 6 May 2003, Russell King wrote:
> On Tue, May 06, 2003 at 05:32:52PM +0200, J?rn Engel wrote:
> > Maybe I was just thinking the wrong way. Given that my systems don't
> > use IDE, SCSI, a floppy or anything emulating one of them, like USB
> > storage or CF. I don't want MSDOS partitioning, but in fact, I don't
> > want any of the disk-centric code at all, fs/partitions is just a part
> > of that.
>
> Maybe introducing a CONFIG_DISK option and making partitioning as a whole
> depend on that ?
According to Alan it's nearly possible to configure the block layer out
entirely, which would be a good thing to associate with a CONFIG_DISK option
too.
Nicolas
Hi Nicolas.
>> Maybe introducing a CONFIG_DISK option and making partitioning
>> as a whole depend on that ?
> According to Alan it's nearly possible to configure the block
> layer out entirely, which would be a good thing to associate
> with a CONFIG_DISK option too.
If we're killing the entire block layer, why call it CONFIG_DISK ???
Surely CONFIG_BLOCK_DEV would be much more appropriate? I certainly
don't think of my SmartMedia flash cards as disks...
Best wishes from Riley.
---
* Nothing as pretty as a smile, nothing as ugly as a frown.
---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.476 / Virus Database: 273 - Release Date: 24-Apr-2003
On Maw, 2003-05-06 at 18:23, Nicolas Pitre wrote:
> According to Alan it's nearly possible to configure the block layer out
> entirely, which would be a good thing to associate with a CONFIG_DISK option
> too.
David Woodhouse I believe..
On Tue, May 06 2003, Alan Cox wrote:
> On Maw, 2003-05-06 at 18:23, Nicolas Pitre wrote:
> > According to Alan it's nearly possible to configure the block layer out
> > entirely, which would be a good thing to associate with a CONFIG_DISK option
> > too.
>
> David Woodhouse I believe..
Are we talking about everything below submit_bh/bio? Shouldn't be too
hard to write a small no-block.c for that...
--
Jens Axboe
On Tue, 6 May 2003, Jens Axboe wrote:
> On Tue, May 06 2003, Alan Cox wrote:
> > On Maw, 2003-05-06 at 18:23, Nicolas Pitre wrote:
> > > According to Alan it's nearly possible to configure the block layer out
> > > entirely, which would be a good thing to associate with a CONFIG_DISK option
> > > too.
> >
> > David Woodhouse I believe..
>
> Are we talking about everything below submit_bh/bio? Shouldn't be too
> hard to write a small no-block.c for that...
The idea is to configure out everything not needed when only NFS and/or JFFS
(which doesn't rely on the block layer to work) are used. Pretty useful for
networked or embedded machines.
Nicolas
On Tue, May 06 2003, Nicolas Pitre wrote:
> On Tue, 6 May 2003, Jens Axboe wrote:
>
> > On Tue, May 06 2003, Alan Cox wrote:
> > > On Maw, 2003-05-06 at 18:23, Nicolas Pitre wrote:
> > > > According to Alan it's nearly possible to configure the block layer out
> > > > entirely, which would be a good thing to associate with a CONFIG_DISK option
> > > > too.
> > >
> > > David Woodhouse I believe..
> >
> > Are we talking about everything below submit_bh/bio? Shouldn't be too
> > hard to write a small no-block.c for that...
>
> The idea is to configure out everything not needed when only NFS and/or JFFS
> (which doesn't rely on the block layer to work) are used. Pretty useful for
> networked or embedded machines.
I see, that would indeed be a bigger job :). Just the block layer would
not be hard, especially if you make the restriction that the block
drivers usable would be ones that used a make_request strategy for
handling requests. That would allow you to kill ll_rw_blk.c,
deadline-iosched.c, and elevator.c. That's some 21k of text and 2k of
data on this box.
--
Jens Axboe
On Tue, 6 May 2003, Jens Axboe wrote:
> On Tue, May 06 2003, Nicolas Pitre wrote:
> > On Tue, 6 May 2003, Jens Axboe wrote:
> >
> > > On Tue, May 06 2003, Alan Cox wrote:
> > > > On Maw, 2003-05-06 at 18:23, Nicolas Pitre wrote:
> > > > > According to Alan it's nearly possible to configure the block layer out
> > > > > entirely, which would be a good thing to associate with a CONFIG_DISK option
> > > > > too.
> > > >
> > > > David Woodhouse I believe..
> > >
> > > Are we talking about everything below submit_bh/bio? Shouldn't be too
> > > hard to write a small no-block.c for that...
> >
> > The idea is to configure out everything not needed when only NFS and/or JFFS
> > (which doesn't rely on the block layer to work) are used. Pretty useful for
> > networked or embedded machines.
>
> I see, that would indeed be a bigger job :). Just the block layer would
> not be hard, especially if you make the restriction that the block
> drivers usable would be ones that used a make_request strategy for
> handling requests. That would allow you to kill ll_rw_blk.c,
> deadline-iosched.c, and elevator.c. That's some 21k of text and 2k of
> data on this box.
That's certainly a good start.
Nicolas
On Tue, May 06 2003, Nicolas Pitre wrote:
> On Tue, 6 May 2003, Jens Axboe wrote:
>
> > On Tue, May 06 2003, Nicolas Pitre wrote:
> > > On Tue, 6 May 2003, Jens Axboe wrote:
> > >
> > > > On Tue, May 06 2003, Alan Cox wrote:
> > > > > On Maw, 2003-05-06 at 18:23, Nicolas Pitre wrote:
> > > > > > According to Alan it's nearly possible to configure the block layer out
> > > > > > entirely, which would be a good thing to associate with a CONFIG_DISK option
> > > > > > too.
> > > > >
> > > > > David Woodhouse I believe..
> > > >
> > > > Are we talking about everything below submit_bh/bio? Shouldn't be too
> > > > hard to write a small no-block.c for that...
> > >
> > > The idea is to configure out everything not needed when only NFS and/or JFFS
> > > (which doesn't rely on the block layer to work) are used. Pretty useful for
> > > networked or embedded machines.
> >
> > I see, that would indeed be a bigger job :). Just the block layer would
> > not be hard, especially if you make the restriction that the block
> > drivers usable would be ones that used a make_request strategy for
> > handling requests. That would allow you to kill ll_rw_blk.c,
> > deadline-iosched.c, and elevator.c. That's some 21k of text and 2k of
> > data on this box.
>
> That's certainly a good start.
How easy it is depends on which drivers need to work - which? mtdblock
comes to mind, a quick glance reveals that should be a breeze to make
work. It would even simplify it a bit, as the 'push request handling to
thread context' could be killed as well.
--
Jens Axboe
On Tue, May 06 2003, Jens Axboe wrote:
> On Tue, May 06 2003, Nicolas Pitre wrote:
> > On Tue, 6 May 2003, Jens Axboe wrote:
> >
> > > On Tue, May 06 2003, Nicolas Pitre wrote:
> > > > On Tue, 6 May 2003, Jens Axboe wrote:
> > > >
> > > > > On Tue, May 06 2003, Alan Cox wrote:
> > > > > > On Maw, 2003-05-06 at 18:23, Nicolas Pitre wrote:
> > > > > > > According to Alan it's nearly possible to configure the block layer out
> > > > > > > entirely, which would be a good thing to associate with a CONFIG_DISK option
> > > > > > > too.
> > > > > >
> > > > > > David Woodhouse I believe..
> > > > >
> > > > > Are we talking about everything below submit_bh/bio? Shouldn't be too
> > > > > hard to write a small no-block.c for that...
> > > >
> > > > The idea is to configure out everything not needed when only NFS and/or JFFS
> > > > (which doesn't rely on the block layer to work) are used. Pretty useful for
> > > > networked or embedded machines.
> > >
> > > I see, that would indeed be a bigger job :). Just the block layer would
> > > not be hard, especially if you make the restriction that the block
> > > drivers usable would be ones that used a make_request strategy for
> > > handling requests. That would allow you to kill ll_rw_blk.c,
> > > deadline-iosched.c, and elevator.c. That's some 21k of text and 2k of
> > > data on this box.
> >
> > That's certainly a good start.
>
> How easy it is depends on which drivers need to work - which? mtdblock
> comes to mind, a quick glance reveals that should be a breeze to make
> work. It would even simplify it a bit, as the 'push request handling to
> thread context' could be killed as well.
Something like this, I think the diffstat speaks for itself... Whether
this one works at all or not I dunno, it was done quickly and I don't
have any hardware to test it on. Should be trivial to make it work, if
it doesn't already.
axboe@smithers:/home/axboe $ diffstat mtdblock-1
mtdblock.c | 124 +++++++++++++++++--------------------------------------------
1 files changed, 36 insertions(+), 88 deletions(-)
===== drivers/mtd/mtdblock.c 1.43 vs edited =====
--- 1.43/drivers/mtd/mtdblock.c Mon Apr 21 01:21:18 2003
+++ edited/drivers/mtd/mtdblock.c Tue May 6 21:52:10 2003
@@ -381,105 +381,56 @@
* to dequeue requests before we are done.
*/
static struct request_queue mtd_queue;
-static void handle_mtdblock_request(void)
+static volatile int leaving = 0;
+static DECLARE_MUTEX_LOCKED(thread_sem);
+static DECLARE_WAIT_QUEUE_HEAD(thr_wq);
+
+static int mtdblock_make_request(request_queue_t *q, struct bio *bio)
{
- struct request *req;
- struct mtdblk_dev *mtdblk;
- unsigned int res;
+ struct mtdblk_dev **p = bio->bi_bdev->bd_disk->private_data;
+ struct mtdblk_dev *mtdblk = *p;
+ struct bio_vec *bv;
+ sector_t sector;
+ int i, err = 0;
- while ((req = elv_next_request(&mtd_queue)) != NULL) {
- struct mtdblk_dev **p = req->rq_disk->private_data;
- spin_unlock_irq(mtd_queue.queue_lock);
- mtdblk = *p;
- res = 0;
-
- if (! (req->flags & REQ_CMD))
- goto end_req;
-
- if ((req->sector + req->current_nr_sectors) > (mtdblk->mtd->size >> 9))
- goto end_req;
-
- // Handle the request
- switch (rq_data_dir(req))
- {
- int err;
-
- case READ:
- down(&mtdblk->cache_sem);
- err = do_cached_read (mtdblk, req->sector << 9,
- req->current_nr_sectors << 9,
- req->buffer);
- up(&mtdblk->cache_sem);
- if (!err)
- res = 1;
- break;
+ down(&mtdblk->cache_sem);
- case WRITE:
- // Read only device
- if ( !(mtdblk->mtd->flags & MTD_WRITEABLE) )
- break;
-
- // Do the write
- down(&mtdblk->cache_sem);
- err = do_cached_write (mtdblk, req->sector << 9,
- req->current_nr_sectors << 9,
- req->buffer);
- up(&mtdblk->cache_sem);
- if (!err)
- res = 1;
+ if (bio_data_dir(bio) == WRITE && !(mtdblk->mtd->flags & MTD_WRITEABLE))
+ goto end_io;
+
+ sector = bio->bi_sector;
+
+ bio_for_each_segment(bv, bio, i) {
+ unsigned long flags;
+ char *dst;
+
+ if (sector + (bv->bv_len >> 9) > (mtdblk->mtd->size >> 9)) {
+ err = -EIO;
break;
}
-end_req:
- spin_lock_irq(mtd_queue.queue_lock);
- if (!end_that_request_first(req, res, req->hard_cur_sectors)) {
- blkdev_dequeue_request(req);
- end_that_request_last(req);
- }
+ dst = bvec_kmap_irq(bv, &flags);
- }
-}
+ if (bio_data_dir(bio) == READ)
+ err = do_cached_read(mtdblk, sector << 9, bv->bv_len, dst);
+ else
+ err = do_cached_write(mtdblk, sector << 9, bv->bv_len, dst);
-static volatile int leaving = 0;
-static DECLARE_MUTEX_LOCKED(thread_sem);
-static DECLARE_WAIT_QUEUE_HEAD(thr_wq);
+ bvec_kunmap_irq(dst, &flags);
-int mtdblock_thread(void *dummy)
-{
- struct task_struct *tsk = current;
- DECLARE_WAITQUEUE(wait, tsk);
+ if (err)
+ break;
- /* we might get involved when memory gets low, so use PF_MEMALLOC */
- tsk->flags |= PF_MEMALLOC;
- daemonize("mtdblockd");
-
- while (!leaving) {
- add_wait_queue(&thr_wq, &wait);
- set_current_state(TASK_INTERRUPTIBLE);
- spin_lock_irq(mtd_queue.queue_lock);
- if (!elv_next_request(&mtd_queue) || blk_queue_plugged(&mtd_queue)) {
- spin_unlock_irq(mtd_queue.queue_lock);
- schedule();
- remove_wait_queue(&thr_wq, &wait);
- } else {
- remove_wait_queue(&thr_wq, &wait);
- set_current_state(TASK_RUNNING);
- handle_mtdblock_request();
- spin_unlock_irq(mtd_queue.queue_lock);
- }
+ sector += bv->bv_len >> 9;
+ bio_endio(bio, bv->bv_len, 0);
}
- up(&thread_sem);
+end_io:
+ up(&mtdblk->cache_sem);
+ bio_endio(bio, bio->bi_size, err);
return 0;
}
-static void mtdblock_request(struct request_queue *q)
-{
- /* Don't do anything, except wake the thread if necessary */
- wake_up(&thr_wq);
-}
-
-
static int mtdblock_ioctl(struct inode * inode, struct file * file,
unsigned int cmd, unsigned long arg)
{
@@ -557,8 +508,6 @@
}
}
-static spinlock_t mtddev_lock = SPIN_LOCK_UNLOCKED;
-
int __init init_mtdblock(void)
{
spin_lock_init(&mtdblks_lock);
@@ -572,8 +521,7 @@
register_mtd_user(¬ifier);
init_waitqueue_head(&thr_wq);
- blk_init_queue(&mtd_queue, &mtdblock_request, &mtddev_lock);
- kernel_thread (mtdblock_thread, NULL, CLONE_FS|CLONE_FILES|CLONE_SIGHAND);
+ blk_queue_make_request(&mtd_queue, mtdblock_make_request);
return 0;
}
--
Jens Axboe
On Tue, 2003-05-06 at 19:49, Jens Axboe wrote:
> I see, that would indeed be a bigger job :). Just the block layer would
> not be hard, especially if you make the restriction that the block
> drivers usable would be ones that used a make_request strategy for
> handling requests. That would allow you to kill ll_rw_blk.c,
> deadline-iosched.c, and elevator.c. That's some 21k of text and 2k of
> data on this box.
That's a little short of what I was intending. Ideally we stick 'struct
request', 'struct buffer_head' and 'struct bio' inside #ifdef
CONFIG_BLK_DEV, then kill all the dead code which uses them.
block_dev.c becomes...
int blkdev_open(struct inode * inode, struct file * filp)
{
return -ENXIO;
}
Don't look at JFFS; that still needs to be able to open a block device
even though it never actually _uses_ it. Look at the non-blkdev mount
path for JFFS2 instead. The _only_ thing we use the mtdblock device for
is to look at its minor number and use it to pick the right MTD device
-- it used to give us the locking on simultaneous mounts for free, a
constant device number for NFS exporting, and a cheap way to work around
the bug that the 'root=' command line option isn't available to
filesystems directly.
mtdblock.c cleanup noted with interest -- I'll play with that shortly;
thanks. Note that you don't actually need flash hardware, you can load
the 'mtdram' device which fakes it with vmalloc-backed storage instead.
Not too useful for powerfail-testing but for mounting something like
ext2 on mtdblock on mtdram it's fine.
--
dwmw2
On Tue, May 06 2003, David Woodhouse wrote:
> On Tue, 2003-05-06 at 19:49, Jens Axboe wrote:
> > I see, that would indeed be a bigger job :). Just the block layer would
> > not be hard, especially if you make the restriction that the block
> > drivers usable would be ones that used a make_request strategy for
> > handling requests. That would allow you to kill ll_rw_blk.c,
> > deadline-iosched.c, and elevator.c. That's some 21k of text and 2k of
> > data on this box.
>
> That's a little short of what I was intending. Ideally we stick 'struct
> request', 'struct buffer_head' and 'struct bio' inside #ifdef
> CONFIG_BLK_DEV, then kill all the dead code which uses them.
struct request can be a goner with my patch, the others not really.
request is really a block private property, so it's easy to kill off.
You are going for the really minimal approach, basically ruling out lots
of filesystems and requiring major surgery all around. While I can see
that make sense for an embedded kernel, I'm having a hard time
envisioning this as something mergable :-)
> mtdblock.c cleanup noted with interest -- I'll play with that shortly;
> thanks. Note that you don't actually need flash hardware, you can load
> the 'mtdram' device which fakes it with vmalloc-backed storage instead.
> Not too useful for powerfail-testing but for mounting something like
> ext2 on mtdblock on mtdram it's fine.
I'm attaching an updated version, I don't think it's safe to use atomic
kmaps across the do_cached_read/write.
Also, I want bio_endio() to increment the sector position of the bio as
well. Makes for a nicer api, and the sector var in mtdblock would then
be killable.
===== drivers/mtd/mtdblock.c 1.43 vs edited =====
--- 1.43/drivers/mtd/mtdblock.c Mon Apr 21 01:21:18 2003
+++ edited/drivers/mtd/mtdblock.c Wed May 7 09:18:30 2003
@@ -381,105 +381,55 @@
* to dequeue requests before we are done.
*/
static struct request_queue mtd_queue;
-static void handle_mtdblock_request(void)
+static volatile int leaving = 0;
+static DECLARE_MUTEX_LOCKED(thread_sem);
+static DECLARE_WAIT_QUEUE_HEAD(thr_wq);
+
+static int mtdblock_make_request(request_queue_t *q, struct bio *bio)
{
- struct request *req;
- struct mtdblk_dev *mtdblk;
- unsigned int res;
+ struct mtdblk_dev **p = bio->bi_bdev->bd_disk->private_data;
+ struct mtdblk_dev *mtdblk = *p;
+ struct bio_vec *bv;
+ sector_t sector;
+ int i, err = 0;
- while ((req = elv_next_request(&mtd_queue)) != NULL) {
- struct mtdblk_dev **p = req->rq_disk->private_data;
- spin_unlock_irq(mtd_queue.queue_lock);
- mtdblk = *p;
- res = 0;
-
- if (! (req->flags & REQ_CMD))
- goto end_req;
-
- if ((req->sector + req->current_nr_sectors) > (mtdblk->mtd->size >> 9))
- goto end_req;
-
- // Handle the request
- switch (rq_data_dir(req))
- {
- int err;
-
- case READ:
- down(&mtdblk->cache_sem);
- err = do_cached_read (mtdblk, req->sector << 9,
- req->current_nr_sectors << 9,
- req->buffer);
- up(&mtdblk->cache_sem);
- if (!err)
- res = 1;
- break;
+ down(&mtdblk->cache_sem);
- case WRITE:
- // Read only device
- if ( !(mtdblk->mtd->flags & MTD_WRITEABLE) )
- break;
-
- // Do the write
- down(&mtdblk->cache_sem);
- err = do_cached_write (mtdblk, req->sector << 9,
- req->current_nr_sectors << 9,
- req->buffer);
- up(&mtdblk->cache_sem);
- if (!err)
- res = 1;
+ if (bio_data_dir(bio) == WRITE && !(mtdblk->mtd->flags & MTD_WRITEABLE))
+ goto end_io;
+
+ sector = bio->bi_sector;
+
+ bio_for_each_segment(bv, bio, i) {
+ char *dst;
+
+ if (sector + (bv->bv_len >> 9) > (mtdblk->mtd->size >> 9)) {
+ err = -EIO;
break;
}
-end_req:
- spin_lock_irq(mtd_queue.queue_lock);
- if (!end_that_request_first(req, res, req->hard_cur_sectors)) {
- blkdev_dequeue_request(req);
- end_that_request_last(req);
- }
+ dst = kmap(bv->bv_page) + bv->bv_offset;
- }
-}
+ if (bio_data_dir(bio) == READ)
+ err = do_cached_read(mtdblk, sector << 9, bv->bv_len, dst);
+ else
+ err = do_cached_write(mtdblk, sector << 9, bv->bv_len, dst);
-static volatile int leaving = 0;
-static DECLARE_MUTEX_LOCKED(thread_sem);
-static DECLARE_WAIT_QUEUE_HEAD(thr_wq);
+ kunmap(bv->bv_page);
-int mtdblock_thread(void *dummy)
-{
- struct task_struct *tsk = current;
- DECLARE_WAITQUEUE(wait, tsk);
+ if (err)
+ break;
- /* we might get involved when memory gets low, so use PF_MEMALLOC */
- tsk->flags |= PF_MEMALLOC;
- daemonize("mtdblockd");
-
- while (!leaving) {
- add_wait_queue(&thr_wq, &wait);
- set_current_state(TASK_INTERRUPTIBLE);
- spin_lock_irq(mtd_queue.queue_lock);
- if (!elv_next_request(&mtd_queue) || blk_queue_plugged(&mtd_queue)) {
- spin_unlock_irq(mtd_queue.queue_lock);
- schedule();
- remove_wait_queue(&thr_wq, &wait);
- } else {
- remove_wait_queue(&thr_wq, &wait);
- set_current_state(TASK_RUNNING);
- handle_mtdblock_request();
- spin_unlock_irq(mtd_queue.queue_lock);
- }
+ sector += bv->bv_len >> 9;
+ bio_endio(bio, bv->bv_len, 0);
}
- up(&thread_sem);
+end_io:
+ up(&mtdblk->cache_sem);
+ bio_endio(bio, bio->bi_size, err);
return 0;
}
-static void mtdblock_request(struct request_queue *q)
-{
- /* Don't do anything, except wake the thread if necessary */
- wake_up(&thr_wq);
-}
-
-
static int mtdblock_ioctl(struct inode * inode, struct file * file,
unsigned int cmd, unsigned long arg)
{
@@ -557,8 +507,6 @@
}
}
-static spinlock_t mtddev_lock = SPIN_LOCK_UNLOCKED;
-
int __init init_mtdblock(void)
{
spin_lock_init(&mtdblks_lock);
@@ -572,8 +520,7 @@
register_mtd_user(¬ifier);
init_waitqueue_head(&thr_wq);
- blk_init_queue(&mtd_queue, &mtdblock_request, &mtddev_lock);
- kernel_thread (mtdblock_thread, NULL, CLONE_FS|CLONE_FILES|CLONE_SIGHAND);
+ blk_queue_make_request(&mtd_queue, mtdblock_make_request);
return 0;
}
--
Jens Axboe
On Wed, 2003-05-07 at 08:22, Jens Axboe wrote:
> > That's a little short of what I was intending. Ideally we stick 'struct
> > request', 'struct buffer_head' and 'struct bio' inside #ifdef
> > CONFIG_BLK_DEV, then kill all the dead code which uses them.
>
> struct request can be a goner with my patch, the others not really.
> request is really a block private property, so it's easy to kill off.
> You are going for the really minimal approach, basically ruling out lots
> of filesystems and requiring major surgery all around. While I can see
> that make sense for an embedded kernel, I'm having a hard time
> envisioning this as something mergable :-)
Last time I looked, it wasn't that bad until I got mired in VM code.
I haven't looked at that since we got CONFIG_SWAP, but I'm fairly sure
the VM bits will be a lot nicer now too.
> > mtdblock.c cleanup noted with interest -- I'll play with that shortly;
> > thanks. Note that you don't actually need flash hardware, you can load
> > the 'mtdram' device which fakes it with vmalloc-backed storage instead.
> > Not too useful for powerfail-testing but for mounting something like
> > ext2 on mtdblock on mtdram it's fine.
>
> I'm attaching an updated version, I don't think it's safe to use atomic
> kmaps across the do_cached_read/write.
It's not -- the flash read/write/erase functions may sleep.
> Also, I want bio_endio() to increment the sector position of the bio as
> well. Makes for a nicer api, and the sector var in mtdblock would then
> be killable.
OK. Let me know when you're done and I'll fix up FTL and NFTL
accordingly too.
--
dwmw2
On Wed, May 07 2003, David Woodhouse wrote:
> On Wed, 2003-05-07 at 08:22, Jens Axboe wrote:
> > > That's a little short of what I was intending. Ideally we stick 'struct
> > > request', 'struct buffer_head' and 'struct bio' inside #ifdef
> > > CONFIG_BLK_DEV, then kill all the dead code which uses them.
> >
> > struct request can be a goner with my patch, the others not really.
> > request is really a block private property, so it's easy to kill off.
> > You are going for the really minimal approach, basically ruling out lots
> > of filesystems and requiring major surgery all around. While I can see
> > that make sense for an embedded kernel, I'm having a hard time
> > envisioning this as something mergable :-)
>
> Last time I looked, it wasn't that bad until I got mired in VM code.
> I haven't looked at that since we got CONFIG_SWAP, but I'm fairly sure
> the VM bits will be a lot nicer now too.
I'll let you deal with that, my head is spinning from just thinking
about it :)
> > > mtdblock.c cleanup noted with interest -- I'll play with that shortly;
> > > thanks. Note that you don't actually need flash hardware, you can load
> > > the 'mtdram' device which fakes it with vmalloc-backed storage instead.
> > > Not too useful for powerfail-testing but for mounting something like
> > > ext2 on mtdblock on mtdram it's fine.
> >
> > I'm attaching an updated version, I don't think it's safe to use atomic
> > kmaps across the do_cached_read/write.
>
> It's not -- the flash read/write/erase functions may sleep.
Thought so.
> > Also, I want bio_endio() to increment the sector position of the bio as
> > well. Makes for a nicer api, and the sector var in mtdblock would then
> > be killable.
>
> OK. Let me know when you're done and I'll fix up FTL and NFTL
> accordingly too.
The only change to the patch attached in the previous mail is just to
drop the sector_t sector and use bio->bi_sector throughout the function.
So nothing major.
--
Jens Axboe