2006-08-24 21:33:00

by David Howells

Subject: [PATCH 01/17] BLOCK: Move functions out of buffer code [try #2]

From: David Howells <[email protected]>

Move some functions out of the buffering code that aren't strictly
buffering-specific. This is a precursor to being able to disable the block layer.

(*) Moved some stuff out of fs/buffer.c:

    (*) The file sync and general sync stuff moved to fs/sync.c.

    (*) The superblock sync stuff moved to fs/super.c.

    (*) do_invalidatepage() moved to mm/truncate.c.

    (*) try_to_release_page() moved to mm/filemap.c.

(*) Moved some related declarations between header files:

    (*) declarations for do_invalidatepage() and try_to_release_page() moved
        to linux/mm.h.

    (*) __set_page_dirty_buffers() moved to linux/buffer_head.h.

Signed-Off-By: David Howells <[email protected]>
---

fs/buffer.c | 174 -------------------------------------------
fs/super.c | 31 ++++++++
fs/sync.c | 113 ++++++++++++++++++++++++++++
include/linux/buffer_head.h | 3 -
include/linux/fs.h | 1
include/linux/mm.h | 4 +
mm/filemap.c | 30 +++++++
mm/page-writeback.c | 1
mm/truncate.c | 26 ++++++
9 files changed, 206 insertions(+), 177 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 71649ef..314b9c4 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -159,31 +159,6 @@ int sync_blockdev(struct block_device *b
}
EXPORT_SYMBOL(sync_blockdev);

-static void __fsync_super(struct super_block *sb)
-{
- sync_inodes_sb(sb, 0);
- DQUOT_SYNC(sb);
- lock_super(sb);
- if (sb->s_dirt && sb->s_op->write_super)
- sb->s_op->write_super(sb);
- unlock_super(sb);
- if (sb->s_op->sync_fs)
- sb->s_op->sync_fs(sb, 1);
- sync_blockdev(sb->s_bdev);
- sync_inodes_sb(sb, 1);
-}
-
-/*
- * Write out and wait upon all dirty data associated with this
- * superblock. Filesystem data as well as the underlying block
- * device. Takes the superblock lock.
- */
-int fsync_super(struct super_block *sb)
-{
- __fsync_super(sb);
- return sync_blockdev(sb->s_bdev);
-}
-
/*
* Write out and wait upon all dirty data associated with this
* device. Filesystem data as well as the underlying block
@@ -260,118 +235,6 @@ void thaw_bdev(struct block_device *bdev
EXPORT_SYMBOL(thaw_bdev);

/*
- * sync everything. Start out by waking pdflush, because that writes back
- * all queues in parallel.
- */
-static void do_sync(unsigned long wait)
-{
- wakeup_pdflush(0);
- sync_inodes(0); /* All mappings, inodes and their blockdevs */
- DQUOT_SYNC(NULL);
- sync_supers(); /* Write the superblocks */
- sync_filesystems(0); /* Start syncing the filesystems */
- sync_filesystems(wait); /* Waitingly sync the filesystems */
- sync_inodes(wait); /* Mappings, inodes and blockdevs, again. */
- if (!wait)
- printk("Emergency Sync complete\n");
- if (unlikely(laptop_mode))
- laptop_sync_completion();
-}
-
-asmlinkage long sys_sync(void)
-{
- do_sync(1);
- return 0;
-}
-
-void emergency_sync(void)
-{
- pdflush_operation(do_sync, 0);
-}
-
-/*
- * Generic function to fsync a file.
- *
- * filp may be NULL if called via the msync of a vma.
- */
-
-int file_fsync(struct file *filp, struct dentry *dentry, int datasync)
-{
- struct inode * inode = dentry->d_inode;
- struct super_block * sb;
- int ret, err;
-
- /* sync the inode to buffers */
- ret = write_inode_now(inode, 0);
-
- /* sync the superblock to buffers */
- sb = inode->i_sb;
- lock_super(sb);
- if (sb->s_op->write_super)
- sb->s_op->write_super(sb);
- unlock_super(sb);
-
- /* .. finally sync the buffers to disk */
- err = sync_blockdev(sb->s_bdev);
- if (!ret)
- ret = err;
- return ret;
-}
-
-long do_fsync(struct file *file, int datasync)
-{
- int ret;
- int err;
- struct address_space *mapping = file->f_mapping;
-
- if (!file->f_op || !file->f_op->fsync) {
- /* Why? We can still call filemap_fdatawrite */
- ret = -EINVAL;
- goto out;
- }
-
- ret = filemap_fdatawrite(mapping);
-
- /*
- * We need to protect against concurrent writers, which could cause
- * livelocks in fsync_buffers_list().
- */
- mutex_lock(&mapping->host->i_mutex);
- err = file->f_op->fsync(file, file->f_dentry, datasync);
- if (!ret)
- ret = err;
- mutex_unlock(&mapping->host->i_mutex);
- err = filemap_fdatawait(mapping);
- if (!ret)
- ret = err;
-out:
- return ret;
-}
-
-static long __do_fsync(unsigned int fd, int datasync)
-{
- struct file *file;
- int ret = -EBADF;
-
- file = fget(fd);
- if (file) {
- ret = do_fsync(file, datasync);
- fput(file);
- }
- return ret;
-}
-
-asmlinkage long sys_fsync(unsigned int fd)
-{
- return __do_fsync(fd, 0);
-}
-
-asmlinkage long sys_fdatasync(unsigned int fd)
-{
- return __do_fsync(fd, 1);
-}
-
-/*
* Various filesystems appear to want __find_get_block to be non-blocking.
* But it's the page lock which protects the buffers. To get around this,
* we get exclusion from try_to_free_buffers with the blockdev mapping's
@@ -1551,35 +1414,6 @@ static void discard_buffer(struct buffer
}

/**
- * try_to_release_page() - release old fs-specific metadata on a page
- *
- * @page: the page which the kernel is trying to free
- * @gfp_mask: memory allocation flags (and I/O mode)
- *
- * The address_space is to try to release any data against the page
- * (presumably at page->private). If the release was successful, return `1'.
- * Otherwise return zero.
- *
- * The @gfp_mask argument specifies whether I/O may be performed to release
- * this page (__GFP_IO), and whether the call may block (__GFP_WAIT).
- *
- * NOTE: @gfp_mask may go away, and this function may become non-blocking.
- */
-int try_to_release_page(struct page *page, gfp_t gfp_mask)
-{
- struct address_space * const mapping = page->mapping;
-
- BUG_ON(!PageLocked(page));
- if (PageWriteback(page))
- return 0;
-
- if (mapping && mapping->a_ops->releasepage)
- return mapping->a_ops->releasepage(page, gfp_mask);
- return try_to_free_buffers(page);
-}
-EXPORT_SYMBOL(try_to_release_page);
-
-/**
* block_invalidatepage - invalidate part of all of a buffer-backed page
*
* @page: the page which is affected
@@ -1630,14 +1464,6 @@ out:
}
EXPORT_SYMBOL(block_invalidatepage);

-void do_invalidatepage(struct page *page, unsigned long offset)
-{
- void (*invalidatepage)(struct page *, unsigned long);
- invalidatepage = page->mapping->a_ops->invalidatepage ? :
- block_invalidatepage;
- (*invalidatepage)(page, offset);
-}
-
/*
* We attach and possibly dirty the buffers atomically wrt
* __set_page_dirty_buffers() via private_lock. try_to_free_buffers
diff --git a/fs/super.c b/fs/super.c
index 6d4e817..22c2fd1 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -219,6 +219,37 @@ static int grab_super(struct super_block
return 0;
}

+/*
+ * Write out and wait upon all dirty data associated with this
+ * superblock. Filesystem data as well as the underlying block
+ * device. Takes the superblock lock. Requires a second blkdev
+ * flush by the caller to complete the operation.
+ */
+void __fsync_super(struct super_block *sb)
+{
+ sync_inodes_sb(sb, 0);
+ DQUOT_SYNC(sb);
+ lock_super(sb);
+ if (sb->s_dirt && sb->s_op->write_super)
+ sb->s_op->write_super(sb);
+ unlock_super(sb);
+ if (sb->s_op->sync_fs)
+ sb->s_op->sync_fs(sb, 1);
+ sync_blockdev(sb->s_bdev);
+ sync_inodes_sb(sb, 1);
+}
+
+/*
+ * Write out and wait upon all dirty data associated with this
+ * superblock. Filesystem data as well as the underlying block
+ * device. Takes the superblock lock.
+ */
+int fsync_super(struct super_block *sb)
+{
+ __fsync_super(sb);
+ return sync_blockdev(sb->s_bdev);
+}
+
/**
* generic_shutdown_super - common helper for ->kill_sb()
* @sb: superblock to kill
diff --git a/fs/sync.c b/fs/sync.c
index 955aef0..1de747b 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -10,11 +10,124 @@ #include <linux/writeback.h>
#include <linux/syscalls.h>
#include <linux/linkage.h>
#include <linux/pagemap.h>
+#include <linux/quotaops.h>
+#include <linux/buffer_head.h>

#define VALID_FLAGS (SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE| \
SYNC_FILE_RANGE_WAIT_AFTER)

/*
+ * sync everything. Start out by waking pdflush, because that writes back
+ * all queues in parallel.
+ */
+static void do_sync(unsigned long wait)
+{
+ wakeup_pdflush(0);
+ sync_inodes(0); /* All mappings, inodes and their blockdevs */
+ DQUOT_SYNC(NULL);
+ sync_supers(); /* Write the superblocks */
+ sync_filesystems(0); /* Start syncing the filesystems */
+ sync_filesystems(wait); /* Waitingly sync the filesystems */
+ sync_inodes(wait); /* Mappings, inodes and blockdevs, again. */
+ if (!wait)
+ printk("Emergency Sync complete\n");
+ if (unlikely(laptop_mode))
+ laptop_sync_completion();
+}
+
+asmlinkage long sys_sync(void)
+{
+ do_sync(1);
+ return 0;
+}
+
+void emergency_sync(void)
+{
+ pdflush_operation(do_sync, 0);
+}
+
+/*
+ * Generic function to fsync a file.
+ *
+ * filp may be NULL if called via the msync of a vma.
+ */
+int file_fsync(struct file *filp, struct dentry *dentry, int datasync)
+{
+ struct inode * inode = dentry->d_inode;
+ struct super_block * sb;
+ int ret, err;
+
+ /* sync the inode to buffers */
+ ret = write_inode_now(inode, 0);
+
+ /* sync the superblock to buffers */
+ sb = inode->i_sb;
+ lock_super(sb);
+ if (sb->s_op->write_super)
+ sb->s_op->write_super(sb);
+ unlock_super(sb);
+
+ /* .. finally sync the buffers to disk */
+ err = sync_blockdev(sb->s_bdev);
+ if (!ret)
+ ret = err;
+ return ret;
+}
+
+long do_fsync(struct file *file, int datasync)
+{
+ int ret;
+ int err;
+ struct address_space *mapping = file->f_mapping;
+
+ if (!file->f_op || !file->f_op->fsync) {
+ /* Why? We can still call filemap_fdatawrite */
+ ret = -EINVAL;
+ goto out;
+ }
+
+ ret = filemap_fdatawrite(mapping);
+
+ /*
+ * We need to protect against concurrent writers, which could cause
+ * livelocks in fsync_buffers_list().
+ */
+ mutex_lock(&mapping->host->i_mutex);
+ err = file->f_op->fsync(file, file->f_dentry, datasync);
+ if (!ret)
+ ret = err;
+ mutex_unlock(&mapping->host->i_mutex);
+ err = filemap_fdatawait(mapping);
+ if (!ret)
+ ret = err;
+out:
+ return ret;
+}
+
+static long __do_fsync(unsigned int fd, int datasync)
+{
+ struct file *file;
+ int ret = -EBADF;
+
+ file = fget(fd);
+ if (file) {
+ ret = do_fsync(file, datasync);
+ fput(file);
+ }
+ return ret;
+}
+
+asmlinkage long sys_fsync(unsigned int fd)
+{
+ return __do_fsync(fd, 0);
+}
+
+asmlinkage long sys_fdatasync(unsigned int fd)
+{
+ return __do_fsync(fd, 1);
+}
+
+/*
* sys_sync_file_range() permits finely controlled syncing over a segment of
* a file in the range offset .. (offset+nbytes-1) inclusive. If nbytes is
* zero then sys_sync_file_range() will operate from offset out to EOF.
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 737e407..64b508e 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -190,9 +190,7 @@ extern int buffer_heads_over_limit;
* Generic address_space_operations implementations for buffer_head-backed
* address_spaces.
*/
-int try_to_release_page(struct page * page, gfp_t gfp_mask);
void block_invalidatepage(struct page *page, unsigned long offset);
-void do_invalidatepage(struct page *page, unsigned long offset);
int block_write_full_page(struct page *page, get_block_t *get_block,
struct writeback_control *wbc);
int block_read_full_page(struct page*, get_block_t*);
@@ -302,4 +300,5 @@ static inline void lock_buffer(struct bu
__lock_buffer(bh);
}

+extern int __set_page_dirty_buffers(struct page *page);
#endif /* _LINUX_BUFFER_HEAD_H */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2561020..429bda5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1541,6 +1541,7 @@ extern int __filemap_fdatawrite_range(st
extern long do_fsync(struct file *file, int datasync);
extern void sync_supers(void);
extern void sync_filesystems(int wait);
+extern void __fsync_super(struct super_block *sb);
extern void emergency_sync(void);
extern void emergency_remount(void);
extern int do_remount_sb(struct super_block *sb, int flags,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index f0b135c..c3c25ef 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -767,7 +767,9 @@ int get_user_pages(struct task_struct *t
int len, int write, int force, struct page **pages, struct vm_area_struct **vmas);
void print_bad_pte(struct vm_area_struct *, pte_t, unsigned long);

-int __set_page_dirty_buffers(struct page *page);
+extern int try_to_release_page(struct page * page, gfp_t gfp_mask);
+extern void do_invalidatepage(struct page *page, unsigned long offset);
+
int __set_page_dirty_nobuffers(struct page *page);
int redirty_page_for_writepage(struct writeback_control *wbc,
struct page *page);
diff --git a/mm/filemap.c b/mm/filemap.c
index b9a60c4..a5ea7e0 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2474,3 +2474,33 @@ generic_file_direct_IO(int rw, struct ki
}
return retval;
}
+
+/**
+ * try_to_release_page() - release old fs-specific metadata on a page
+ *
+ * @page: the page which the kernel is trying to free
+ * @gfp_mask: memory allocation flags (and I/O mode)
+ *
+ * The address_space is to try to release any data against the page
+ * (presumably at page->private). If the release was successful, return `1'.
+ * Otherwise return zero.
+ *
+ * The @gfp_mask argument specifies whether I/O may be performed to release
+ * this page (__GFP_IO), and whether the call may block (__GFP_WAIT).
+ *
+ * NOTE: @gfp_mask may go away, and this function may become non-blocking.
+ */
+int try_to_release_page(struct page *page, gfp_t gfp_mask)
+{
+ struct address_space * const mapping = page->mapping;
+
+ BUG_ON(!PageLocked(page));
+ if (PageWriteback(page))
+ return 0;
+
+ if (mapping && mapping->a_ops->releasepage)
+ return mapping->a_ops->releasepage(page, gfp_mask);
+ return try_to_free_buffers(page);
+}
+
+EXPORT_SYMBOL(try_to_release_page);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index e630188..f75d033 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -29,6 +29,7 @@ #include <linux/smp.h>
#include <linux/sysctl.h>
#include <linux/cpu.h>
#include <linux/syscalls.h>
+#include <linux/buffer_head.h>

/*
* The maximum number of pages to writeout in a single bdflush/kupdate
diff --git a/mm/truncate.c b/mm/truncate.c
index cf1b015..081437d 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -16,6 +16,32 @@ #include <linux/buffer_head.h> /* grr. t
do_invalidatepage */


+/**
+ * do_invalidatepage - invalidate part or all of a page
+ * @page: the page which is affected
+ * @offset: the index of the truncation point
+ *
+ * do_invalidatepage() is called when all or part of the page has become
+ * invalidated by a truncate operation.
+ *
+ * do_invalidatepage() does not have to release all buffers, but it must
+ * ensure that no dirty buffer is left outside @offset and that no I/O
+ * is underway against any of the blocks which are outside the truncation
+ * point, because the caller is about to free (and possibly reuse) those
+ * blocks on-disk.
+ */
+void do_invalidatepage(struct page *page, unsigned long offset)
+{
+ void (*invalidatepage)(struct page *, unsigned long);
+ invalidatepage = page->mapping->a_ops->invalidatepage;
+#ifdef CONFIG_BLOCK
+ if (!invalidatepage)
+ invalidatepage = block_invalidatepage;
+#endif
+ if (invalidatepage)
+ (*invalidatepage)(page, offset);
+}
+
static inline void truncate_partial_page(struct page *page, unsigned partial)
{
memclear_highpage_flush(page, partial, PAGE_CACHE_SIZE-partial);
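
The one behavioural change in this patch is in do_invalidatepage(): it now prefers the address space's own ->invalidatepage, falls back to block_invalidatepage() only when CONFIG_BLOCK is set, and otherwise does nothing. A minimal userspace sketch of that dispatch follows. HAVE_BLOCK_LAYER, struct page_model and the model_* functions are hypothetical stand-ins for illustration, not kernel definitions, and the return value exists only so the sketch can be checked (the kernel function returns void).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_BLOCK; define as 0 to model a kernel
 * built without the block layer. */
#ifndef HAVE_BLOCK_LAYER
#define HAVE_BLOCK_LAYER 1
#endif

typedef void (*invalidate_fn)(unsigned long offset);

/* Simplified model of a page whose mapping may supply its own hook. */
struct page_model {
	invalidate_fn invalidatepage;	/* NULL if the filesystem has none */
};

static int block_fallback_calls;	/* uses of the generic fallback */
static int fs_hook_calls;		/* uses of the filesystem hook */

static void model_block_invalidatepage(unsigned long offset)
{
	(void)offset;
	block_fallback_calls++;
}

static void model_fs_invalidatepage(unsigned long offset)
{
	(void)offset;
	fs_hook_calls++;
}

/* Same shape as the patched do_invalidatepage(): prefer the filesystem
 * hook, use the block-layer default only when it is compiled in, and do
 * nothing if neither is available.
 * Returns 1 if a handler ran, 0 otherwise. */
static int model_do_invalidatepage(const struct page_model *page,
				   unsigned long offset)
{
	invalidate_fn fn = page->invalidatepage;

#if HAVE_BLOCK_LAYER
	if (!fn)
		fn = model_block_invalidatepage;
#endif
	if (!fn)
		return 0;
	fn(offset);
	return 1;
}
```

Compared with the old GNU `?:` form removed from fs/buffer.c, the explicit NULL check is what lets the fallback be compiled out without leaving a dangling reference to block_invalidatepage().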


2006-08-24 21:33:05

by David Howells

Subject: [PATCH 02/17] BLOCK: Remove duplicate declaration of exit_io_context() [try #2]

From: David Howells <[email protected]>

Remove the duplicate declaration of exit_io_context() from linux/sched.h.

Signed-Off-By: David Howells <[email protected]>
---

include/linux/sched.h | 1 -
kernel/exit.c | 1 +
2 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6674fc1..c12c5f9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -709,7 +709,6 @@ #endif /* CONFIG_SMP */


struct io_context; /* See blkdev.h */
-void exit_io_context(void);
struct cpuset;

#define NGROUPS_SMALL 32
diff --git a/kernel/exit.c b/kernel/exit.c
index dba194a..e0abd78 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -38,6 +38,7 @@ #include <linux/compat.h>
#include <linux/pipe_fs_i.h>
#include <linux/audit.h> /* for audit_free() */
#include <linux/resource.h>
+#include <linux/blkdev.h>

#include <asm/uaccess.h>
#include <asm/unistd.h>

2006-08-24 21:34:14

by David Howells

Subject: [PATCH 06/17] BLOCK: Move bdev_cache_init() declaration to headerfile [try #2]

From: David Howells <[email protected]>

Move the bdev_cache_init() extern declaration from fs/dcache.c to
linux/blkdev.h.

Signed-Off-By: David Howells <[email protected]>
---

fs/dcache.c | 2 +-
include/linux/blkdev.h | 1 +
2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 1b4a3a3..886ca6f 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -32,6 +32,7 @@ #include <linux/security.h>
#include <linux/seqlock.h>
#include <linux/swap.h>
#include <linux/bootmem.h>
+#include <linux/blkdev.h>


int sysctl_vfs_cache_pressure __read_mostly = 100;
@@ -1742,7 +1743,6 @@ kmem_cache_t *filp_cachep __read_mostly;

EXPORT_SYMBOL(d_genocide);

-extern void bdev_cache_init(void);
extern void chrdev_init(void);

void __init vfs_caches_init_early(void)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index aafe827..41a643f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -840,5 +840,6 @@ #define MODULE_ALIAS_BLOCKDEV(major,mino
#define MODULE_ALIAS_BLOCKDEV_MAJOR(major) \
MODULE_ALIAS("block-major-" __stringify(major) "-*")

+extern void bdev_cache_init(void);

#endif

2006-08-24 21:33:36

by David Howells

Subject: [PATCH 14/17] BLOCK: Move the Ext3 device ioctl compat stuff to the Ext3 driver [try #2]

From: David Howells <[email protected]>

Move the Ext3 device ioctl compat stuff from fs/compat_ioctl.c to the Ext3
driver so that the Ext3 header file doesn't need to be included.

Signed-Off-By: David Howells <[email protected]>
---

fs/compat_ioctl.c | 27 -----------------------
fs/ext3/dir.c | 3 +++
fs/ext3/file.c | 3 +++
fs/ext3/ioctl.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++-
include/linux/ext3_fs.h | 6 +++++
5 files changed, 66 insertions(+), 28 deletions(-)

diff --git a/fs/compat_ioctl.c b/fs/compat_ioctl.c
index 24d5538..de3d422 100644
--- a/fs/compat_ioctl.c
+++ b/fs/compat_ioctl.c
@@ -45,8 +45,6 @@ #include <linux/auto_fs4.h>
#include <linux/tty.h>
#include <linux/vt_kern.h>
#include <linux/fb.h>
-#include <linux/ext3_jbd.h>
-#include <linux/ext3_fs.h>
#include <linux/videodev.h>
#include <linux/netdevice.h>
#include <linux/raw.h>
@@ -158,22 +156,6 @@ static int rw_long(unsigned int fd, unsi
return err;
}

-static int do_ext3_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
-{
- /* These are just misnamed, they actually get/put from/to user an int */
- switch (cmd) {
- case EXT3_IOC32_GETVERSION: cmd = EXT3_IOC_GETVERSION; break;
- case EXT3_IOC32_SETVERSION: cmd = EXT3_IOC_SETVERSION; break;
- case EXT3_IOC32_GETRSVSZ: cmd = EXT3_IOC_GETRSVSZ; break;
- case EXT3_IOC32_SETRSVSZ: cmd = EXT3_IOC_SETRSVSZ; break;
- case EXT3_IOC32_GROUP_EXTEND: cmd = EXT3_IOC_GROUP_EXTEND; break;
-#ifdef CONFIG_JBD_DEBUG
- case EXT3_IOC32_WAIT_FOR_READONLY: cmd = EXT3_IOC_WAIT_FOR_READONLY; break;
-#endif
- }
- return sys_ioctl(fd, cmd, (unsigned long)compat_ptr(arg));
-}
-
struct compat_video_event {
int32_t type;
compat_time_t timestamp;
@@ -2714,15 +2696,6 @@ HANDLE_IOCTL(PIO_UNIMAP, do_unimap_ioctl
HANDLE_IOCTL(GIO_UNIMAP, do_unimap_ioctl)
HANDLE_IOCTL(KDFONTOP, do_kdfontop_ioctl)
#endif
-HANDLE_IOCTL(EXT3_IOC32_GETVERSION, do_ext3_ioctl)
-HANDLE_IOCTL(EXT3_IOC32_SETVERSION, do_ext3_ioctl)
-HANDLE_IOCTL(EXT3_IOC32_GETRSVSZ, do_ext3_ioctl)
-HANDLE_IOCTL(EXT3_IOC32_SETRSVSZ, do_ext3_ioctl)
-HANDLE_IOCTL(EXT3_IOC32_GROUP_EXTEND, do_ext3_ioctl)
-COMPATIBLE_IOCTL(EXT3_IOC_GROUP_ADD)
-#ifdef CONFIG_JBD_DEBUG
-HANDLE_IOCTL(EXT3_IOC32_WAIT_FOR_READONLY, do_ext3_ioctl)
-#endif
/* One SMB ioctl needs translations. */
#define SMB_IOC_GETMOUNTUID_32 _IOR('u', 1, compat_uid_t)
HANDLE_IOCTL(SMB_IOC_GETMOUNTUID_32, do_smb_getmountuid)
diff --git a/fs/ext3/dir.c b/fs/ext3/dir.c
index fbb0d4e..7ba8917 100644
--- a/fs/ext3/dir.c
+++ b/fs/ext3/dir.c
@@ -44,6 +44,9 @@ const struct file_operations ext3_dir_op
.read = generic_read_dir,
.readdir = ext3_readdir, /* we take BKL. needed?*/
.ioctl = ext3_ioctl, /* BKL held */
+#ifdef CONFIG_COMPAT
+ .compat_ioctl = ext3_compat_ioctl,
+#endif
.fsync = ext3_sync_file, /* BKL held */
#ifdef CONFIG_EXT3_INDEX
.release = ext3_release_dir,
diff --git a/fs/ext3/file.c b/fs/ext3/file.c
index 1efefb6..40320da 100644
--- a/fs/ext3/file.c
+++ b/fs/ext3/file.c
@@ -114,6 +114,9 @@ const struct file_operations ext3_file_o
.readv = generic_file_readv,
.writev = generic_file_writev,
.ioctl = ext3_ioctl,
+#ifdef CONFIG_COMPAT
+ .compat_ioctl = ext3_compat_ioctl,
+#endif
.mmap = generic_file_mmap,
.open = generic_file_open,
.release = ext3_release_file,
diff --git a/fs/ext3/ioctl.c b/fs/ext3/ioctl.c
index 3a6b012..12daa68 100644
--- a/fs/ext3/ioctl.c
+++ b/fs/ext3/ioctl.c
@@ -13,9 +13,10 @@ #include <linux/capability.h>
#include <linux/ext3_fs.h>
#include <linux/ext3_jbd.h>
#include <linux/time.h>
+#include <linux/compat.h>
+#include <linux/smp_lock.h>
#include <asm/uaccess.h>

-
int ext3_ioctl (struct inode * inode, struct file * filp, unsigned int cmd,
unsigned long arg)
{
@@ -252,3 +253,55 @@ #endif
return -ENOTTY;
}
}
+
+#ifdef CONFIG_COMPAT
+long ext3_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+ struct inode *inode = file->f_dentry->d_inode;
+ int ret;
+
+ /* These are just misnamed, they actually get/put from/to user an int */
+ switch (cmd) {
+ case EXT3_IOC32_GETFLAGS:
+ cmd = EXT3_IOC_GETFLAGS;
+ break;
+ case EXT3_IOC32_SETFLAGS:
+ cmd = EXT3_IOC_SETFLAGS;
+ break;
+ case EXT3_IOC32_GETVERSION:
+ cmd = EXT3_IOC_GETVERSION;
+ break;
+ case EXT3_IOC32_SETVERSION:
+ cmd = EXT3_IOC_SETVERSION;
+ break;
+ case EXT3_IOC32_GROUP_EXTEND:
+ cmd = EXT3_IOC_GROUP_EXTEND;
+ break;
+ case EXT3_IOC32_GETVERSION_OLD:
+ cmd = EXT3_IOC_GETVERSION_OLD;
+ break;
+ case EXT3_IOC32_SETVERSION_OLD:
+ cmd = EXT3_IOC_SETVERSION_OLD;
+ break;
+#ifdef CONFIG_JBD_DEBUG
+ case EXT3_IOC32_WAIT_FOR_READONLY:
+ cmd = EXT3_IOC_WAIT_FOR_READONLY;
+ break;
+#endif
+ case EXT3_IOC32_GETRSVSZ:
+ cmd = EXT3_IOC_GETRSVSZ;
+ break;
+ case EXT3_IOC32_SETRSVSZ:
+ cmd = EXT3_IOC_SETRSVSZ;
+ break;
+ case EXT3_IOC_GROUP_ADD:
+ break;
+ default:
+ return -ENOIOCTLCMD;
+ }
+ lock_kernel();
+ ret = ext3_ioctl(inode, file, cmd, (unsigned long) compat_ptr(arg));
+ unlock_kernel();
+ return ret;
+}
+#endif
diff --git a/include/linux/ext3_fs.h b/include/linux/ext3_fs.h
index 90cfba2..690c730 100644
--- a/include/linux/ext3_fs.h
+++ b/include/linux/ext3_fs.h
@@ -237,6 +237,8 @@ #define EXT3_IOC_SETRSVSZ _IOW('f', 6,
/*
* ioctl commands in 32 bit emulation
*/
+#define EXT3_IOC32_GETFLAGS FS_IOC32_GETFLAGS
+#define EXT3_IOC32_SETFLAGS FS_IOC32_SETFLAGS
#define EXT3_IOC32_GETVERSION _IOR('f', 3, int)
#define EXT3_IOC32_SETVERSION _IOW('f', 4, int)
#define EXT3_IOC32_GETRSVSZ _IOR('f', 5, int)
@@ -245,6 +247,9 @@ #define EXT3_IOC32_GROUP_EXTEND _IOW('f
#ifdef CONFIG_JBD_DEBUG
#define EXT3_IOC32_WAIT_FOR_READONLY _IOR('f', 99, int)
#endif
+#define EXT3_IOC32_GETVERSION_OLD FS_IOC32_GETVERSION
+#define EXT3_IOC32_SETVERSION_OLD FS_IOC32_SETVERSION
+

/*
* Mount options
@@ -828,6 +833,7 @@ extern void ext3_set_aops(struct inode *
/* ioctl.c */
extern int ext3_ioctl (struct inode *, struct file *, unsigned int,
unsigned long);
+extern long ext3_compat_ioctl (struct file *, unsigned int, unsigned long);

/* namei.c */
extern int ext3_orphan_add(handle_t *, struct inode *);
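
The translation step in ext3_compat_ioctl() is needed because an ioctl number encodes the size of its argument type, so a command declared with an int argument for 32-bit callers has a different number from the native one. The sketch below models this in userspace. It assumes the asm-generic bit layout and that the native GETVERSION command is declared with a long argument; all MODEL_* names and model_translate_cmd() are hypothetical, and the value 515 for ENOIOCTLCMD is reproduced here because userspace errno.h does not define it.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the asm-generic ioctl number layout: direction in bits 30-31,
 * argument size in bits 16-29, type in bits 8-15, number in bits 0-7.
 * Real code uses _IOR()/_IOW(); these macros are illustrative only. */
#define MODEL_IOC_READ	2u
#define MODEL_IOR(type, nr, size) \
	((MODEL_IOC_READ << 30) | ((uint32_t)(size) << 16) | \
	 ((uint32_t)(type) << 8) | (uint32_t)(nr))

/* GETVERSION as numbered for a 32-bit caller (4-byte argument) and for a
 * 64-bit kernel (8-byte argument). Fixed-width types keep the sketch
 * independent of the host's own sizeof(long). */
#define MODEL_IOC32_GETVERSION	MODEL_IOR('f', 3, sizeof(int32_t))
#define MODEL_IOC_GETVERSION	MODEL_IOR('f', 3, sizeof(int64_t))

/* Kernel-internal "not handled here" code. */
#define MODEL_ENOIOCTLCMD	515

/* Translate a 32-bit command into its native equivalent.
 * Returns 0 and updates *cmd on success; returns -MODEL_ENOIOCTLCMD and
 * leaves *cmd untouched otherwise, like the default: branch in
 * ext3_compat_ioctl(). */
static int model_translate_cmd(uint32_t *cmd)
{
	switch (*cmd) {
	case MODEL_IOC32_GETVERSION:
		*cmd = MODEL_IOC_GETVERSION;
		return 0;
	default:
		return -MODEL_ENOIOCTLCMD;
	}
}
```

Returning the "not handled" code for unknown commands, rather than passing them through, leaves the generic compat layer free to try its own table.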

2006-08-24 21:34:13

by David Howells

Subject: [PATCH 16/17] BLOCK: Move the msdos device ioctl compat stuff to the msdos driver [try #2]

From: David Howells <[email protected]>

Move the msdos device ioctl compat stuff from fs/compat_ioctl.c to the msdos
driver so that the msdos header file doesn't need to be included.

Signed-Off-By: David Howells <[email protected]>
---

fs/compat_ioctl.c | 49 ------------------------------------------------
fs/fat/dir.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 54 insertions(+), 49 deletions(-)

diff --git a/fs/compat_ioctl.c b/fs/compat_ioctl.c
index de3d422..7b8a9b4 100644
--- a/fs/compat_ioctl.c
+++ b/fs/compat_ioctl.c
@@ -108,7 +108,6 @@ #include <linux/usbdevice_fs.h>
#include <linux/nbd.h>
#include <linux/random.h>
#include <linux/filter.h>
-#include <linux/msdos_fs.h>
#include <linux/pktcdvd.h>

#include <linux/hiddev.h>
@@ -1939,51 +1938,6 @@ static int mtd_rw_oob(unsigned int fd, u
return err;
}

-#define VFAT_IOCTL_READDIR_BOTH32 _IOR('r', 1, struct compat_dirent[2])
-#define VFAT_IOCTL_READDIR_SHORT32 _IOR('r', 2, struct compat_dirent[2])
-
-static long
-put_dirent32 (struct dirent *d, struct compat_dirent __user *d32)
-{
- if (!access_ok(VERIFY_WRITE, d32, sizeof(struct compat_dirent)))
- return -EFAULT;
-
- __put_user(d->d_ino, &d32->d_ino);
- __put_user(d->d_off, &d32->d_off);
- __put_user(d->d_reclen, &d32->d_reclen);
- if (__copy_to_user(d32->d_name, d->d_name, d->d_reclen))
- return -EFAULT;
-
- return 0;
-}
-
-static int vfat_ioctl32(unsigned fd, unsigned cmd, unsigned long arg)
-{
- struct compat_dirent __user *p = compat_ptr(arg);
- int ret;
- mm_segment_t oldfs = get_fs();
- struct dirent d[2];
-
- switch(cmd)
- {
- case VFAT_IOCTL_READDIR_BOTH32:
- cmd = VFAT_IOCTL_READDIR_BOTH;
- break;
- case VFAT_IOCTL_READDIR_SHORT32:
- cmd = VFAT_IOCTL_READDIR_SHORT;
- break;
- }
-
- set_fs(KERNEL_DS);
- ret = sys_ioctl(fd,cmd,(unsigned long)&d);
- set_fs(oldfs);
- if (ret >= 0) {
- ret |= put_dirent32(&d[0], p);
- ret |= put_dirent32(&d[1], p + 1);
- }
- return ret;
-}
-
struct raw32_config_request
{
compat_int_t raw_minor;
@@ -2728,9 +2682,6 @@ HANDLE_IOCTL(SONET_GETFRSENSE, do_atm_io
HANDLE_IOCTL(BLKBSZGET_32, do_blkbszget)
HANDLE_IOCTL(BLKBSZSET_32, do_blkbszset)
HANDLE_IOCTL(BLKGETSIZE64_32, do_blkgetsize64)
-/* vfat */
-HANDLE_IOCTL(VFAT_IOCTL_READDIR_BOTH32, vfat_ioctl32)
-HANDLE_IOCTL(VFAT_IOCTL_READDIR_SHORT32, vfat_ioctl32)
/* Raw devices */
HANDLE_IOCTL(RAW_SETBIND, raw_ioctl)
HANDLE_IOCTL(RAW_GETBIND, raw_ioctl)
diff --git a/fs/fat/dir.c b/fs/fat/dir.c
index 698b85b..8e99330 100644
--- a/fs/fat/dir.c
+++ b/fs/fat/dir.c
@@ -20,6 +20,7 @@ #include <linux/msdos_fs.h>
#include <linux/dirent.h>
#include <linux/smp_lock.h>
#include <linux/buffer_head.h>
+#include <linux/compat.h>
#include <asm/uaccess.h>

static inline loff_t fat_make_i_pos(struct super_block *sb,
@@ -740,11 +741,64 @@ static int fat_dir_ioctl(struct inode *
ret = buf.result;
return ret;
}
+#define VFAT_IOCTL_READDIR_BOTH32 _IOR('r', 1, struct compat_dirent[2])
+#define VFAT_IOCTL_READDIR_SHORT32 _IOR('r', 2, struct compat_dirent[2])
+
+static long fat_compat_put_dirent32(struct dirent *d,
+ struct compat_dirent __user *d32)
+{
+ if (!access_ok(VERIFY_WRITE, d32, sizeof(struct compat_dirent)))
+ return -EFAULT;
+
+ __put_user(d->d_ino, &d32->d_ino);
+ __put_user(d->d_off, &d32->d_off);
+ __put_user(d->d_reclen, &d32->d_reclen);
+ if (__copy_to_user(d32->d_name, d->d_name, d->d_reclen))
+ return -EFAULT;
+
+ return 0;
+}
+
+static long fat_compat_dir_ioctl(struct file *file, unsigned cmd,
+ unsigned long arg)
+{
+ struct compat_dirent __user *p = compat_ptr(arg);
+ int ret;
+ mm_segment_t oldfs = get_fs();
+ struct dirent d[2];
+
+ switch (cmd) {
+ case VFAT_IOCTL_READDIR_BOTH32:
+ cmd = VFAT_IOCTL_READDIR_BOTH;
+ break;
+ case VFAT_IOCTL_READDIR_SHORT32:
+ cmd = VFAT_IOCTL_READDIR_SHORT;
+ break;
+ default:
+ return -ENOIOCTLCMD;
+ }
+
+ set_fs(KERNEL_DS);
+ lock_kernel();
+ ret = fat_dir_ioctl(file->f_dentry->d_inode, file,
+ cmd, (unsigned long) &d);
+ unlock_kernel();
+ set_fs(oldfs);
+ if (ret >= 0) {
+ ret |= fat_compat_put_dirent32(&d[0], p);
+ ret |= fat_compat_put_dirent32(&d[1], p + 1);
+ }
+ return ret;
+}
+

const struct file_operations fat_dir_operations = {
.read = generic_read_dir,
.readdir = fat_readdir,
.ioctl = fat_dir_ioctl,
+#ifdef CONFIG_COMPAT
+ .compat_ioctl = fat_compat_dir_ioctl,
+#endif
.fsync = file_fsync,
};
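
The compat handler above runs the native ioctl into a kernel-side pair of dirents and then narrows each one into the 32-bit layout. A rough userspace model of that narrowing step is sketched below. The struct layouts are simplified assumptions (64-bit d_ino/d_off on the native side, 32-bit on the compat side), all model_* names are hypothetical, and the length check is an addition of this sketch that the kernel helper does not perform; plain memcpy() stands in for the user-space copy routines.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_NAME_MAX 256

struct model_dirent {			/* native layout (assumed) */
	int64_t  d_ino;
	int64_t  d_off;
	uint16_t d_reclen;		/* used here as the name length */
	char     d_name[MODEL_NAME_MAX];
};

struct model_compat_dirent {		/* 32-bit layout (assumed) */
	uint32_t d_ino;
	int32_t  d_off;
	uint16_t d_reclen;
	char     d_name[MODEL_NAME_MAX];
};

/* Narrow one entry field by field. Returns 0 on success, -1 if the name
 * would not fit (a check added for this sketch). */
static int model_put_dirent32(const struct model_dirent *d,
			      struct model_compat_dirent *d32)
{
	if (d->d_reclen > sizeof(d32->d_name))
		return -1;

	d32->d_ino = (uint32_t)d->d_ino;
	d32->d_off = (int32_t)d->d_off;
	d32->d_reclen = d->d_reclen;
	memcpy(d32->d_name, d->d_name, d->d_reclen);
	return 0;
}

/* Convert both entries of the pair, OR-ing the results together in the
 * same way the handler accumulates its return value. */
static int model_put_pair(const struct model_dirent d[2],
			  struct model_compat_dirent out[2])
{
	int ret = 0;

	ret |= model_put_dirent32(&d[0], &out[0]);
	ret |= model_put_dirent32(&d[1], &out[1]);
	return ret;
}
```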

2006-08-24 21:35:56

by David Howells

Subject: [PATCH 10/17] BLOCK: Move the loop device ioctl compat stuff to the loop driver [try #2]

From: David Howells <[email protected]>

Move the loop device ioctl compat stuff from fs/compat_ioctl.c to the loop
driver so that the loop header file doesn't need to be included.

Signed-Off-By: David Howells <[email protected]>
---

drivers/block/loop.c | 76 ++++++++++++++++++++++++++++++++++++++++++
fs/compat_ioctl.c | 68 --------------------------------------
include/linux/compat_ioctl.h | 6 ---
3 files changed, 76 insertions(+), 74 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 7b3b94d..48ad173 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -66,6 +66,7 @@ #include <linux/smp_lock.h>
#include <linux/swap.h>
#include <linux/slab.h>
#include <linux/loop.h>
+#include <linux/compat.h>
#include <linux/suspend.h>
#include <linux/writeback.h>
#include <linux/buffer_head.h> /* for invalidate_bdev() */
@@ -1174,6 +1175,78 @@ static int lo_ioctl(struct inode * inode
return err;
}

+#ifdef CONFIG_COMPAT
+struct loop_info32 {
+ compat_int_t lo_number; /* ioctl r/o */
+ compat_dev_t lo_device; /* ioctl r/o */
+ compat_ulong_t lo_inode; /* ioctl r/o */
+ compat_dev_t lo_rdevice; /* ioctl r/o */
+ compat_int_t lo_offset;
+ compat_int_t lo_encrypt_type;
+ compat_int_t lo_encrypt_key_size; /* ioctl w/o */
+ compat_int_t lo_flags; /* ioctl r/o */
+ char lo_name[LO_NAME_SIZE];
+ unsigned char lo_encrypt_key[LO_KEY_SIZE]; /* ioctl w/o */
+ compat_ulong_t lo_init[2];
+ char reserved[4];
+};
+
+static long lo_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+ struct inode *inode = file->f_dentry->d_inode;
+ mm_segment_t old_fs = get_fs();
+ struct loop_info l;
+ struct loop_info32 __user *ul;
+ int err = -ENOIOCTLCMD;
+
+ ul = compat_ptr(arg);
+
+ lock_kernel();
+ switch(cmd) {
+ case LOOP_SET_STATUS:
+ err = get_user(l.lo_number, &ul->lo_number);
+ err |= __get_user(l.lo_device, &ul->lo_device);
+ err |= __get_user(l.lo_inode, &ul->lo_inode);
+ err |= __get_user(l.lo_rdevice, &ul->lo_rdevice);
+ err |= __copy_from_user(&l.lo_offset, &ul->lo_offset,
+ 8 + (unsigned long)l.lo_init - (unsigned long)&l.lo_offset);
+ if (err) {
+ err = -EFAULT;
+ } else {
+ set_fs (KERNEL_DS);
+ err = lo_ioctl(inode, file, cmd, (unsigned long)&l);
+ set_fs (old_fs);
+ }
+ break;
+ case LOOP_GET_STATUS:
+ set_fs (KERNEL_DS);
+ err = lo_ioctl(inode, file, cmd, (unsigned long)&l);
+ set_fs (old_fs);
+ if (!err) {
+ err = put_user(l.lo_number, &ul->lo_number);
+ err |= __put_user(l.lo_device, &ul->lo_device);
+ err |= __put_user(l.lo_inode, &ul->lo_inode);
+ err |= __put_user(l.lo_rdevice, &ul->lo_rdevice);
+ err |= __copy_to_user(&ul->lo_offset, &l.lo_offset,
+ (unsigned long)l.lo_init - (unsigned long)&l.lo_offset);
+ if (err)
+ err = -EFAULT;
+ }
+ break;
+ case LOOP_CLR_FD:
+ case LOOP_GET_STATUS64:
+ case LOOP_SET_STATUS64:
+ arg = (unsigned long) compat_ptr(arg);
+ case LOOP_SET_FD:
+ case LOOP_CHANGE_FD:
+ err = lo_ioctl(inode, file, cmd, arg);
+ break;
+ }
+ unlock_kernel();
+ return err;
+}
+#endif
+
static int lo_open(struct inode *inode, struct file *file)
{
struct loop_device *lo = inode->i_bdev->bd_disk->private_data;
@@ -1201,6 +1274,9 @@ static struct block_device_operations lo
.open = lo_open,
.release = lo_release,
.ioctl = lo_ioctl,
+#ifdef CONFIG_COMPAT
+ .compat_ioctl = lo_compat_ioctl,
+#endif
};

/*
diff --git a/fs/compat_ioctl.c b/fs/compat_ioctl.c
index 4063a93..d33a2b1 100644
--- a/fs/compat_ioctl.c
+++ b/fs/compat_ioctl.c
@@ -40,7 +40,6 @@ #include <linux/if_ppp.h>
#include <linux/if_pppox.h>
#include <linux/mtio.h>
#include <linux/cdrom.h>
-#include <linux/loop.h>
#include <linux/auto_fs.h>
#include <linux/auto_fs4.h>
#include <linux/tty.h>
@@ -1214,71 +1213,6 @@ static int cdrom_ioctl_trans(unsigned in
return err;
}

-struct loop_info32 {
- compat_int_t lo_number; /* ioctl r/o */
- compat_dev_t lo_device; /* ioctl r/o */
- compat_ulong_t lo_inode; /* ioctl r/o */
- compat_dev_t lo_rdevice; /* ioctl r/o */
- compat_int_t lo_offset;
- compat_int_t lo_encrypt_type;
- compat_int_t lo_encrypt_key_size; /* ioctl w/o */
- compat_int_t lo_flags; /* ioctl r/o */
- char lo_name[LO_NAME_SIZE];
- unsigned char lo_encrypt_key[LO_KEY_SIZE]; /* ioctl w/o */
- compat_ulong_t lo_init[2];
- char reserved[4];
-};
-
-static int loop_status(unsigned int fd, unsigned int cmd, unsigned long arg)
-{
- mm_segment_t old_fs = get_fs();
- struct loop_info l;
- struct loop_info32 __user *ul;
- int err = -EINVAL;
-
- ul = compat_ptr(arg);
- switch(cmd) {
- case LOOP_SET_STATUS:
- err = get_user(l.lo_number, &ul->lo_number);
- err |= __get_user(l.lo_device, &ul->lo_device);
- err |= __get_user(l.lo_inode, &ul->lo_inode);
- err |= __get_user(l.lo_rdevice, &ul->lo_rdevice);
- err |= __copy_from_user(&l.lo_offset, &ul->lo_offset,
- 8 + (unsigned long)l.lo_init - (unsigned long)&l.lo_offset);
- if (err) {
- err = -EFAULT;
- } else {
- set_fs (KERNEL_DS);
- err = sys_ioctl (fd, cmd, (unsigned long)&l);
- set_fs (old_fs);
- }
- break;
- case LOOP_GET_STATUS:
- set_fs (KERNEL_DS);
- err = sys_ioctl (fd, cmd, (unsigned long)&l);
- set_fs (old_fs);
- if (!err) {
- err = put_user(l.lo_number, &ul->lo_number);
- err |= __put_user(l.lo_device, &ul->lo_device);
- err |= __put_user(l.lo_inode, &ul->lo_inode);
- err |= __put_user(l.lo_rdevice, &ul->lo_rdevice);
- err |= __copy_to_user(&ul->lo_offset, &l.lo_offset,
- (unsigned long)l.lo_init - (unsigned long)&l.lo_offset);
- if (err)
- err = -EFAULT;
- }
- break;
- default: {
- static int count;
- if (++count <= 20)
- printk("%s: Unknown loop ioctl cmd, fd(%d) "
- "cmd(%08x) arg(%08lx)\n",
- __FUNCTION__, fd, cmd, arg);
- }
- }
- return err;
-}
-
extern int tty_ioctl(struct inode * inode, struct file * file, unsigned int cmd, unsigned long arg);

#ifdef CONFIG_VT
@@ -2810,8 +2744,6 @@ HANDLE_IOCTL(MTIOCGET32, mt_ioctl_trans)
HANDLE_IOCTL(MTIOCPOS32, mt_ioctl_trans)
HANDLE_IOCTL(CDROMREADAUDIO, cdrom_ioctl_trans)
HANDLE_IOCTL(CDROM_SEND_PACKET, cdrom_ioctl_trans)
-HANDLE_IOCTL(LOOP_SET_STATUS, loop_status)
-HANDLE_IOCTL(LOOP_GET_STATUS, loop_status)
#define AUTOFS_IOC_SETTIMEOUT32 _IOWR(0x93,0x64,unsigned int)
HANDLE_IOCTL(AUTOFS_IOC_SETTIMEOUT32, ioc_settimeout)
#ifdef CONFIG_VT
diff --git a/include/linux/compat_ioctl.h b/include/linux/compat_ioctl.h
index 269d000..13cea44 100644
--- a/include/linux/compat_ioctl.h
+++ b/include/linux/compat_ioctl.h
@@ -394,12 +394,6 @@ COMPATIBLE_IOCTL(DVD_WRITE_STRUCT)
COMPATIBLE_IOCTL(DVD_AUTH)
/* pktcdvd */
COMPATIBLE_IOCTL(PACKET_CTRL_CMD)
-/* Big L */
-ULONG_IOCTL(LOOP_SET_FD)
-ULONG_IOCTL(LOOP_CHANGE_FD)
-COMPATIBLE_IOCTL(LOOP_CLR_FD)
-COMPATIBLE_IOCTL(LOOP_GET_STATUS64)
-COMPATIBLE_IOCTL(LOOP_SET_STATUS64)
/* Big A */
/* sparc only */
/* Big Q for sound/OSS */
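
The translation that lo_compat_ioctl() performs can be modelled in userspace. The following is a minimal sketch, not the driver code: the struct names, field set and DEMO_NAME_SIZE are hypothetical stand-ins for loop_info32, loop_info and LO_NAME_SIZE, and plain assignments stand in for get_user()/put_user(). It shows why fields whose width differs between the 32-bit and native layouts have to be copied one by one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NAME_SIZE 16	/* hypothetical; stands in for LO_NAME_SIZE */

/* Layout as a 32-bit caller sees it: every field has a fixed width. */
struct demo_info32 {
	int32_t  number;
	uint32_t inode;		/* a 32-bit "unsigned long" */
	int32_t  offset;
	char     name[DEMO_NAME_SIZE];
};

/* Native layout: inode is a full unsigned long (64 bits on LP64). */
struct demo_info {
	int           number;
	unsigned long inode;
	int           offset;
	char          name[DEMO_NAME_SIZE];
};

/* 32-bit -> native, as needed before calling the native "set" handler. */
static void demo_from_compat(struct demo_info *dst,
			     const struct demo_info32 *src)
{
	assert(dst && src);
	memset(dst, 0, sizeof(*dst));
	dst->number = src->number;
	dst->inode = src->inode;		/* widened, zero-extended */
	dst->offset = src->offset;
	memcpy(dst->name, src->name, sizeof(dst->name));
}

/* native -> 32-bit, as needed after calling the native "get" handler. */
static void demo_to_compat(struct demo_info32 *dst,
			   const struct demo_info *src)
{
	assert(dst && src);
	memset(dst, 0, sizeof(*dst));
	dst->number = src->number;
	dst->inode = (uint32_t)src->inode;	/* narrowed, may truncate */
	dst->offset = src->offset;
	memcpy(dst->name, src->name, sizeof(dst->name));
}
```

In the patch itself only the leading fields are copied individually; the tail from lo_offset onwards has the same layout in both structures and is copied in one block.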

2006-08-24 21:34:53

by David Howells

Subject: [PATCH 08/17] BLOCK: Dissociate generic_writepages() from mpage stuff [try #2]

From: David Howells <[email protected]>

Dissociate the generic_writepages() function from the mpage stuff, moving its
declaration to linux/mm.h and actually emitting a full implementation into
mm/page-writeback.c.

The implementation is a partial duplicate of mpage_writepages() with all BIO
references removed.

It is used by NFS to do writeback.

Signed-Off-By: David Howells <[email protected]>
---

fs/block_dev.c | 1
fs/mpage.c | 2 +
include/linux/mpage.h | 6 --
include/linux/writeback.h | 2 +
mm/page-writeback.c | 135 +++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 140 insertions(+), 6 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 3753457..8debde8 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -17,6 +17,7 @@ #include <linux/blkdev.h>
#include <linux/module.h>
#include <linux/blkpg.h>
#include <linux/buffer_head.h>
+#include <linux/writeback.h>
#include <linux/mpage.h>
#include <linux/mount.h>
#include <linux/uio.h>
diff --git a/fs/mpage.c b/fs/mpage.c
index 1e45982..6792251 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -693,6 +693,8 @@ out:
* the call was made get new I/O started against them. If wbc->sync_mode is
* WB_SYNC_ALL then we were called for data integrity and we must wait for
* existing IO to complete.
+ *
+ * !!!! If you fix this you should check generic_writepages() also!!!!
*/
int
mpage_writepages(struct address_space *mapping,
diff --git a/include/linux/mpage.h b/include/linux/mpage.h
index 3ca8804..517c098 100644
--- a/include/linux/mpage.h
+++ b/include/linux/mpage.h
@@ -20,9 +20,3 @@ int mpage_writepages(struct address_spac
struct writeback_control *wbc, get_block_t get_block);
int mpage_writepage(struct page *page, get_block_t *get_block,
struct writeback_control *wbc);
-
-static inline int
-generic_writepages(struct address_space *mapping, struct writeback_control *wbc)
-{
- return mpage_writepages(mapping, wbc, NULL);
-}
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 9e38b56..671c43b 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -110,6 +110,8 @@ balance_dirty_pages_ratelimited(struct a
}

int pdflush_operation(void (*fn)(unsigned long), unsigned long arg0);
+extern int generic_writepages(struct address_space *mapping,
+ struct writeback_control *wbc);
int do_writepages(struct address_space *mapping, struct writeback_control *wbc);
int sync_page_range(struct inode *inode, struct address_space *mapping,
loff_t pos, loff_t count);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f75d033..668716c 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -30,6 +30,7 @@ #include <linux/sysctl.h>
#include <linux/cpu.h>
#include <linux/syscalls.h>
#include <linux/buffer_head.h>
+#include <linux/pagevec.h>

/*
* The maximum number of pages to writeout in a single bdflush/kupdate
@@ -543,6 +544,140 @@ void __init page_writeback_init(void)
register_cpu_notifier(&ratelimit_nb);
}

+/**
+ * generic_writepages - walk the list of dirty pages of the given
+ * address space and writepage() all of them.
+ *
+ * @mapping: address space structure to write
+ * @wbc: subtract the number of written pages from *@wbc->nr_to_write
+ *
+ * This is a library function, which implements the writepages()
+ * address_space_operation.
+ *
+ * If a page is already under I/O, generic_writepages() skips it, even
+ * if it's dirty. This is desirable behaviour for memory-cleaning writeback,
+ * but it is INCORRECT for data-integrity system calls such as fsync(). fsync()
+ * and msync() need to guarantee that all the data which was dirty at the time
+ * the call was made get new I/O started against them. If wbc->sync_mode is
+ * WB_SYNC_ALL then we were called for data integrity and we must wait for
+ * existing IO to complete.
+ *
+ * !!!! Derived from mpage_writepages() - if you fix this you should check that
+ * also !!!!
+ */
+int generic_writepages(struct address_space *mapping,
+ struct writeback_control *wbc)
+{
+ struct backing_dev_info *bdi = mapping->backing_dev_info;
+ int ret = 0;
+ int done = 0;
+ int (*writepage)(struct page *page, struct writeback_control *wbc);
+ struct pagevec pvec;
+ int nr_pages;
+ pgoff_t index;
+ pgoff_t end; /* Inclusive */
+ int scanned = 0;
+ int range_whole = 0;
+
+ if (wbc->nonblocking && bdi_write_congested(bdi)) {
+ wbc->encountered_congestion = 1;
+ return 0;
+ }
+
+ writepage = mapping->a_ops->writepage;
+
+ /* deal with chardevs and other special files */
+ if (!writepage)
+ return 0;
+
+ pagevec_init(&pvec, 0);
+ if (wbc->range_cyclic) {
+ index = mapping->writeback_index; /* Start from prev offset */
+ end = -1;
+ } else {
+ index = wbc->range_start >> PAGE_CACHE_SHIFT;
+ end = wbc->range_end >> PAGE_CACHE_SHIFT;
+ if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
+ range_whole = 1;
+ scanned = 1;
+ }
+retry:
+ while (!done && (index <= end) &&
+ (nr_pages = pagevec_lookup_tag(&pvec, mapping, &index,
+ PAGECACHE_TAG_DIRTY,
+ min(end - index, (pgoff_t)PAGEVEC_SIZE-1) + 1))) {
+ unsigned i;
+
+ scanned = 1;
+ for (i = 0; i < nr_pages; i++) {
+ struct page *page = pvec.pages[i];
+
+ /*
+ * At this point we hold neither mapping->tree_lock nor
+ * lock on the page itself: the page may be truncated or
+ * invalidated (changing page->mapping to NULL), or even
+ * swizzled back from swapper_space to tmpfs file
+ * mapping
+ */
+
+ lock_page(page);
+
+ if (unlikely(page->mapping != mapping)) {
+ unlock_page(page);
+ continue;
+ }
+
+ if (!wbc->range_cyclic && page->index > end) {
+ done = 1;
+ unlock_page(page);
+ continue;
+ }
+
+ if (wbc->sync_mode != WB_SYNC_NONE)
+ wait_on_page_writeback(page);
+
+ if (PageWriteback(page) ||
+ !clear_page_dirty_for_io(page)) {
+ unlock_page(page);
+ continue;
+ }
+
+ ret = (*writepage)(page, wbc);
+ if (ret) {
+ if (ret == -ENOSPC)
+ set_bit(AS_ENOSPC, &mapping->flags);
+ else
+ set_bit(AS_EIO, &mapping->flags);
+ }
+
+ if (unlikely(ret == AOP_WRITEPAGE_ACTIVATE))
+ unlock_page(page);
+ if (ret || (--(wbc->nr_to_write) <= 0))
+ done = 1;
+ if (wbc->nonblocking && bdi_write_congested(bdi)) {
+ wbc->encountered_congestion = 1;
+ done = 1;
+ }
+ }
+ pagevec_release(&pvec);
+ cond_resched();
+ }
+ if (!scanned && !done) {
+ /*
+ * We hit the last page and there is more work to be done: wrap
+ * back to the start of the file
+ */
+ scanned = 1;
+ index = 0;
+ goto retry;
+ }
+ if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
+ mapping->writeback_index = index;
+ return ret;
+}
+
+EXPORT_SYMBOL(generic_writepages);
+
int do_writepages(struct address_space *mapping, struct writeback_control *wbc)
{
int ret;
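
The range_cyclic scan with its single wrap-around is the least obvious part of generic_writepages(). The following is a simplified userspace model, not the kernel code: a byte array stands in for the dirty tag in the page cache, *cursor plays the role of mapping->writeback_index, and pages are visited one at a time rather than in pagevec batches. All demo_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * dirty[i] != 0 means page i is dirty. Cleans at most nr_to_write pages,
 * starting at *cursor, and returns how many pages were "written".
 */
static size_t demo_cyclic_writeback(unsigned char *dirty, size_t npages,
				    size_t *cursor, long nr_to_write)
{
	size_t index;
	size_t written = 0;
	int scanned = 0;	/* set once a pass has found dirty pages */
	int done = 0;

	assert(dirty && cursor);
	index = *cursor;	/* start from the previous offset */
retry:
	while (!done && index < npages) {
		if (!dirty[index]) {
			index++;
			continue;
		}
		scanned = 1;
		dirty[index++] = 0;	/* stands in for ->writepage() */
		written++;
		if (--nr_to_write <= 0)
			done = 1;
	}
	if (!scanned && !done) {
		/* Nothing found up to the end: wrap to the start, once. */
		scanned = 1;
		index = 0;
		goto retry;
	}
	*cursor = index;	/* remember where to resume next time */
	return written;
}
```

As in the patch, the wrap happens only when the first pass found nothing at all. If the first pass did find dirty pages, anything dirty before the starting offset is left for the next call.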

2006-08-24 21:35:36

by David Howells

Subject: [PATCH 07/17] BLOCK: Remove dependence on existence of blockdev_superblock [try #2]

From: David Howells <[email protected]>

Move the extern declaration of blockdev_superblock from fs/fs-writeback.c to
linux/blkdev.h and remove the direct dependence on it by wrapping the
comparison in a macro, sb_is_blkdev_sb().

Signed-Off-By: David Howells <[email protected]>
---

fs/fs-writeback.c | 8 +++-----
include/linux/blkdev.h | 4 ++++
2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 892643d..d9de186 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -23,8 +23,6 @@ #include <linux/blkdev.h>
#include <linux/backing-dev.h>
#include <linux/buffer_head.h>

-extern struct super_block *blockdev_superblock;
-
/**
* __mark_inode_dirty - internal function
* @inode: inode to mark
@@ -320,7 +318,7 @@ sync_sb_inodes(struct super_block *sb, s

if (!bdi_cap_writeback_dirty(bdi)) {
list_move(&inode->i_list, &sb->s_dirty);
- if (sb == blockdev_superblock) {
+ if (sb_is_blkdev_sb(sb)) {
/*
* Dirty memory-backed blockdev: the ramdisk
* driver does this. Skip just this inode
@@ -337,14 +335,14 @@ sync_sb_inodes(struct super_block *sb, s

if (wbc->nonblocking && bdi_write_congested(bdi)) {
wbc->encountered_congestion = 1;
- if (sb != blockdev_superblock)
+ if (!sb_is_blkdev_sb(sb))
break; /* Skip a congested fs */
list_move(&inode->i_list, &sb->s_dirty);
continue; /* Skip a congested blockdev */
}

if (wbc->bdi && bdi != wbc->bdi) {
- if (sb != blockdev_superblock)
+ if (!sb_is_blkdev_sb(sb))
break; /* fs has the wrong queue */
list_move(&inode->i_list, &sb->s_dirty);
continue; /* blockdev has wrong queue */
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 41a643f..e3f30d5 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -16,6 +16,10 @@ #include <linux/stringify.h>

#include <asm/scatterlist.h>

+extern struct super_block *blockdev_superblock;
+
+#define sb_is_blkdev_sb(sb) ((sb) == blockdev_superblock)
+
struct scsi_ioctl_command;

struct request_queue;
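
The value of hiding the comparison behind sb_is_blkdev_sb() is that callers no longer name blockdev_superblock themselves. The sketch below illustrates that idea in userspace. The DEMO_HAVE_BLOCK switch and the #else branch are assumptions about how such a macro could later be compiled out; this patch only adds the single definition shown in the diff above. All demo_ names are hypothetical.

```c
#include <stddef.h>

struct demo_super_block { int id; };	/* stand-in for struct super_block */

#define DEMO_HAVE_BLOCK 1	/* hypothetical; 0 models "no block layer" */

#if DEMO_HAVE_BLOCK
static struct demo_super_block demo_blockdev_sb = { 1 };
static struct demo_super_block *demo_blockdev_superblock = &demo_blockdev_sb;
#define demo_sb_is_blkdev_sb(sb) ((sb) == demo_blockdev_superblock)
#else
/* No such superblock exists, so nothing can ever match it. */
#define demo_sb_is_blkdev_sb(sb) ((void)(sb), 0)
#endif

/*
 * Mirrors the congestion test in sync_sb_inodes(): a congested ordinary
 * filesystem ends the scan (1), whereas for the blockdev superblock only
 * the current inode is skipped (0).
 */
static int demo_stop_on_congestion(struct demo_super_block *sb)
{
	return !demo_sb_is_blkdev_sb(sb);
}
```

Because the caller only sees the macro, it compiles unchanged whichever branch of the #if is selected.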

2006-08-24 21:34:59

by David Howells

Subject: [PATCH 09/17] BLOCK: Move __invalidate_device() to block_dev.c [try #2]

From: David Howells <[email protected]>

Move __invalidate_device() from fs/inode.c to fs/block_dev.c so that it can
more easily be disabled when the block layer is disabled.

Signed-Off-By: David Howells <[email protected]>
---

fs/block_dev.c | 21 +++++++++++++++++++++
fs/inode.c | 21 ---------------------
2 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 8debde8..ba26d3c 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1306,3 +1306,24 @@ void close_bdev_excl(struct block_device
}

EXPORT_SYMBOL(close_bdev_excl);
+
+int __invalidate_device(struct block_device *bdev)
+{
+ struct super_block *sb = get_super(bdev);
+ int res = 0;
+
+ if (sb) {
+ /*
+ * no need to lock the super, get_super holds the
+ * read mutex so the filesystem cannot go away
+ * under us (->put_super runs with the write lock
+ * hold).
+ */
+ shrink_dcache_sb(sb);
+ res = invalidate_inodes(sb);
+ drop_super(sb);
+ }
+ invalidate_bdev(bdev, 0);
+ return res;
+}
+EXPORT_SYMBOL(__invalidate_device);
diff --git a/fs/inode.c b/fs/inode.c
index 0bf9f04..6426bb0 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -363,27 +363,6 @@ int invalidate_inodes(struct super_block
}

EXPORT_SYMBOL(invalidate_inodes);
-
-int __invalidate_device(struct block_device *bdev)
-{
- struct super_block *sb = get_super(bdev);
- int res = 0;
-
- if (sb) {
- /*
- * no need to lock the super, get_super holds the
- * read mutex so the filesystem cannot go away
- * under us (->put_super runs with the write lock
- * hold).
- */
- shrink_dcache_sb(sb);
- res = invalidate_inodes(sb);
- drop_super(sb);
- }
- invalidate_bdev(bdev, 0);
- return res;
-}
-EXPORT_SYMBOL(__invalidate_device);

static int can_unuse(struct inode *inode)
{

2006-08-24 21:36:07

by David Howells

Subject: [PATCH 12/17] BLOCK: Move the ReiserFS device ioctl compat stuff to the ReiserFS driver [try #2]

From: David Howells <[email protected]>

Move the ReiserFS device ioctl compat stuff from fs/compat_ioctl.c to the
ReiserFS driver so that fs/compat_ioctl.c no longer needs to include the
ReiserFS header file.

Signed-Off-By: David Howells <[email protected]>
---

fs/compat_ioctl.c | 12 ------------
fs/reiserfs/dir.c | 3 +++
fs/reiserfs/file.c | 4 ++++
fs/reiserfs/ioctl.c | 35 +++++++++++++++++++++++++++++++++++
include/linux/reiserfs_fs.h | 9 +++++++++
5 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/fs/compat_ioctl.c b/fs/compat_ioctl.c
index c4d2849..5e84342 100644
--- a/fs/compat_ioctl.c
+++ b/fs/compat_ioctl.c
@@ -59,7 +59,6 @@ #include <linux/rtc.h>
#include <linux/pci.h>
#include <linux/module.h>
#include <linux/serial.h>
-#include <linux/reiserfs_fs.h>
#include <linux/if_tun.h>
#include <linux/ctype.h>
#include <linux/ioctl32.h>
@@ -2016,16 +2015,6 @@ static int vfat_ioctl32(unsigned fd, uns
return ret;
}

-#define REISERFS_IOC_UNPACK32 _IOW(0xCD,1,int)
-
-static int reiserfs_ioctl32(unsigned fd, unsigned cmd, unsigned long ptr)
-{
- if (cmd == REISERFS_IOC_UNPACK32)
- cmd = REISERFS_IOC_UNPACK;
-
- return sys_ioctl(fd,cmd,ptr);
-}
-
struct raw32_config_request
{
compat_int_t raw_minor;
@@ -2786,7 +2775,6 @@ HANDLE_IOCTL(BLKGETSIZE64_32, do_blkgets
/* vfat */
HANDLE_IOCTL(VFAT_IOCTL_READDIR_BOTH32, vfat_ioctl32)
HANDLE_IOCTL(VFAT_IOCTL_READDIR_SHORT32, vfat_ioctl32)
-HANDLE_IOCTL(REISERFS_IOC_UNPACK32, reiserfs_ioctl32)
/* Raw devices */
HANDLE_IOCTL(RAW_SETBIND, raw_ioctl)
HANDLE_IOCTL(RAW_GETBIND, raw_ioctl)
diff --git a/fs/reiserfs/dir.c b/fs/reiserfs/dir.c
index 9aabcc0..657050a 100644
--- a/fs/reiserfs/dir.c
+++ b/fs/reiserfs/dir.c
@@ -22,6 +22,9 @@ const struct file_operations reiserfs_di
.readdir = reiserfs_readdir,
.fsync = reiserfs_dir_fsync,
.ioctl = reiserfs_ioctl,
+#ifdef CONFIG_COMPAT
+ .compat_ioctl = reiserfs_compat_ioctl,
+#endif
};

static int reiserfs_dir_fsync(struct file *filp, struct dentry *dentry,
diff --git a/fs/reiserfs/file.c b/fs/reiserfs/file.c
index 1627edd..719b367 100644
--- a/fs/reiserfs/file.c
+++ b/fs/reiserfs/file.c
@@ -2,6 +2,7 @@
* Copyright 2000 by Hans Reiser, licensing governed by reiserfs/README
*/

+#include <linux/config.h>
#include <linux/time.h>
#include <linux/reiserfs_fs.h>
#include <linux/reiserfs_acl.h>
@@ -1568,6 +1569,9 @@ const struct file_operations reiserfs_fi
.read = generic_file_read,
.write = reiserfs_file_write,
.ioctl = reiserfs_ioctl,
+#ifdef CONFIG_COMPAT
+ .compat_ioctl = reiserfs_compat_ioctl,
+#endif
.mmap = generic_file_mmap,
.release = reiserfs_file_release,
.fsync = reiserfs_sync_file,
diff --git a/fs/reiserfs/ioctl.c b/fs/reiserfs/ioctl.c
index a986b5e..9c57578 100644
--- a/fs/reiserfs/ioctl.c
+++ b/fs/reiserfs/ioctl.c
@@ -9,6 +9,7 @@ #include <linux/time.h>
#include <asm/uaccess.h>
#include <linux/pagemap.h>
#include <linux/smp_lock.h>
+#include <linux/compat.h>

static int reiserfs_unpack(struct inode *inode, struct file *filp);

@@ -94,6 +95,40 @@ int reiserfs_ioctl(struct inode *inode,
}
}

+#ifdef CONFIG_COMPAT
+long reiserfs_compat_ioctl(struct file *file, unsigned int cmd,
+ unsigned long arg)
+{
+ struct inode *inode = file->f_dentry->d_inode;
+ int ret;
+
+ /* These are just misnamed, they actually get/put from/to user an int */
+ switch (cmd) {
+ case REISERFS_IOC32_UNPACK:
+ cmd = REISERFS_IOC_UNPACK;
+ break;
+ case REISERFS_IOC32_GETFLAGS:
+ cmd = REISERFS_IOC_GETFLAGS;
+ break;
+ case REISERFS_IOC32_SETFLAGS:
+ cmd = REISERFS_IOC_SETFLAGS;
+ break;
+ case REISERFS_IOC32_GETVERSION:
+ cmd = REISERFS_IOC_GETVERSION;
+ break;
+ case REISERFS_IOC32_SETVERSION:
+ cmd = REISERFS_IOC_SETVERSION;
+ break;
+ default:
+ return -ENOIOCTLCMD;
+ }
+ lock_kernel();
+ ret = reiserfs_ioctl(inode, file, cmd, (unsigned long) compat_ptr(arg));
+ unlock_kernel();
+ return ret;
+}
+#endif
+
/*
** reiserfs_unpack
** Function try to convert tail from direct item into indirect.
diff --git a/include/linux/reiserfs_fs.h b/include/linux/reiserfs_fs.h
index 54c3054..6ab7be9 100644
--- a/include/linux/reiserfs_fs.h
+++ b/include/linux/reiserfs_fs.h
@@ -2167,6 +2167,8 @@ #define SPARE_SPACE 500
/* prototypes from ioctl.c */
int reiserfs_ioctl(struct inode *inode, struct file *filp,
unsigned int cmd, unsigned long arg);
+long reiserfs_compat_ioctl(struct file *filp,
+ unsigned int cmd, unsigned long arg);

/* ioctl's command */
#define REISERFS_IOC_UNPACK _IOW(0xCD,1,long)
@@ -2177,6 +2179,13 @@ #define REISERFS_IOC_SETFLAGS FS_IOC_SE
#define REISERFS_IOC_GETVERSION FS_IOC_GETVERSION
#define REISERFS_IOC_SETVERSION FS_IOC_SETVERSION

+/* the 32 bit compat definitions with int argument */
+#define REISERFS_IOC32_UNPACK _IOW(0xCD, 1, int)
+#define REISERFS_IOC32_GETFLAGS FS_IOC32_GETFLAGS
+#define REISERFS_IOC32_SETFLAGS FS_IOC32_SETFLAGS
+#define REISERFS_IOC32_GETVERSION FS_IOC32_GETVERSION
+#define REISERFS_IOC32_SETVERSION FS_IOC32_SETVERSION
+
/* Locking primitives */
/* Right now we are still falling back to (un)lock_kernel, but eventually that
would evolve into real per-fs locks */
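
The reason reiserfs_compat_ioctl() has to rewrite the command number at all is that the size of the argument type is encoded in the number, so a command declared with long differs between 32-bit and 64-bit callers. The sketch below demonstrates this with a simplified stand-in for _IOW(), assuming the common asm-generic layout (8 bits of number, 8 bits of type, 14 bits of size, 2 bits of direction); some architectures use a different split. Fixed-width types stand in for int and a 64-bit long, and all DEMO_/demo_ names are hypothetical.

```c
#include <stdint.h>

#define DEMO_IOC_WRITE 1u
#define DEMO_IOW(type, nr, argtype) \
	((DEMO_IOC_WRITE << 30) | ((uint32_t)sizeof(argtype) << 16) | \
	 ((uint32_t)(type) << 8) | (uint32_t)(nr))
#define DEMO_IOC_SIZE(cmd) (((cmd) >> 16) & 0x3fffu)

/* Same type and number, different argument size. */
#define DEMO_IOC_UNPACK   DEMO_IOW(0xCD, 1, int64_t)	/* 64-bit long */
#define DEMO_IOC32_UNPACK DEMO_IOW(0xCD, 1, int32_t)	/* 32-bit long */

/*
 * Map a command issued by a 32-bit caller to its native equivalent.
 * Returns 0 for commands this handler does not know about; the kernel
 * handler returns -ENOIOCTLCMD in that case.
 */
static uint32_t demo_translate_cmd(uint32_t cmd)
{
	switch (cmd) {
	case DEMO_IOC32_UNPACK:
		return DEMO_IOC_UNPACK;
	default:
		return 0;
	}
}
```

Under these assumptions the two numbers are 0x4008CD01 and 0x4004CD01, differing only in the size field, which is why the native handler would not recognise the 32-bit command without translation.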

2006-08-24 21:36:40

by David Howells

Subject: [PATCH 15/17] BLOCK: Stop CIFS from using EXT2 ioctl numbers directly [try #2]

From: David Howells <[email protected]>

Stop CIFS from using EXT2 ioctl numbers directly, making it use the ones in
linux/fs.h instead.

Signed-Off-By: David Howells <[email protected]>
---

0 files changed, 0 insertions(+), 0 deletions(-)

2006-08-24 21:36:39

by David Howells

Subject: [PATCH 11/17] BLOCK: Move common FS-specific ioctls to linux/fs.h [try #2]

From: David Howells <[email protected]>

Move common FS-specific ioctls from linux/ext2_fs.h to linux/fs.h as FS_IOC_*
and FS_IOC32_* and have the users of them use those as a base.

Also move the GETFLAGS/SETFLAGS flags to linux/fs.h as FS_*_FL macros, and then
have the other users use them as a base.
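
As an illustration of what using the FS_*_FL macros "as a base" can look like, here is a minimal userspace sketch of a table-driven translation in the style of the jfs_map table touched by this patch. The three FS_*_FL values are the ones this patch adds to linux/fs.h; the DEMOFS_ flags, the table and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Values as added to linux/fs.h by this patch. */
#define FS_IMMUTABLE_FL	0x00000010
#define FS_APPEND_FL	0x00000020
#define FS_NODUMP_FL	0x00000040

/* Hypothetical private flag bits of an imaginary filesystem. */
#define DEMOFS_IMMUTABLE 0x1
#define DEMOFS_APPEND	 0x2
#define DEMOFS_NODUMP	 0x4

static const struct {
	unsigned long demo_flag;
	unsigned long fs_flag;
} demo_map[] = {
	{ DEMOFS_IMMUTABLE, FS_IMMUTABLE_FL },
	{ DEMOFS_APPEND,    FS_APPEND_FL },
	{ DEMOFS_NODUMP,    FS_NODUMP_FL },
};

#define DEMO_MAP_LEN (sizeof(demo_map) / sizeof(demo_map[0]))

/* Private flags -> generic flags, as a GETFLAGS handler would report. */
static unsigned long demo_to_fs_flags(unsigned long demo_flags)
{
	unsigned long out = 0;
	size_t i;

	for (i = 0; i < DEMO_MAP_LEN; i++)
		if (demo_flags & demo_map[i].demo_flag)
			out |= demo_map[i].fs_flag;
	return out;
}

/*
 * Generic flags -> private flags, as a SETFLAGS handler would need.
 * Returns -1 instead of silently dropping a flag it cannot represent.
 */
static int demo_from_fs_flags(unsigned long fs_flags, unsigned long *out)
{
	unsigned long known = 0;
	size_t i;

	assert(out != NULL);
	*out = 0;
	for (i = 0; i < DEMO_MAP_LEN; i++) {
		known |= demo_map[i].fs_flag;
		if (fs_flags & demo_map[i].fs_flag)
			*out |= demo_map[i].demo_flag;
	}
	return (fs_flags & ~known) ? -1 : 0;
}
```

With the generic names in linux/fs.h, a filesystem written this way needs neither linux/ext2_fs.h nor private copies of the ext2 constants.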

Signed-Off-By: David Howells <[email protected]>
---

fs/cifs/ioctl.c | 7 ++---
fs/compat_ioctl.c | 15 ----------
fs/hfsplus/hfsplus_fs.h | 8 +----
fs/hfsplus/ioctl.c | 17 +++++------
fs/jfs/ioctl.c | 15 +++++-----
include/linux/ext2_fs.h | 66 ++++++++++++++++++++++++-------------------
include/linux/ext3_fs.h | 20 ++++++++++---
include/linux/fs.h | 39 +++++++++++++++++++++++++
include/linux/reiserfs_fs.h | 28 ++++++++----------
9 files changed, 125 insertions(+), 90 deletions(-)

diff --git a/fs/cifs/ioctl.c b/fs/cifs/ioctl.c
index b0ea668..e34c7db 100644
--- a/fs/cifs/ioctl.c
+++ b/fs/cifs/ioctl.c
@@ -22,7 +22,6 @@
*/

#include <linux/fs.h>
-#include <linux/ext2_fs.h>
#include "cifspdu.h"
#include "cifsglob.h"
#include "cifsproto.h"
@@ -74,7 +73,7 @@ #endif /* CONFIG_CIFS_POSIX */
}
break;
#ifdef CONFIG_CIFS_POSIX
- case EXT2_IOC_GETFLAGS:
+ case FS_IOC_GETFLAGS:
if(CIFS_UNIX_EXTATTR_CAP & caps) {
if (pSMBFile == NULL)
break;
@@ -82,12 +81,12 @@ #ifdef CONFIG_CIFS_POSIX
&ExtAttrBits, &ExtAttrMask);
if(rc == 0)
rc = put_user(ExtAttrBits &
- EXT2_FL_USER_VISIBLE,
+ FS_FL_USER_VISIBLE,
(int __user *)arg);
}
break;

- case EXT2_IOC_SETFLAGS:
+ case FS_IOC_SETFLAGS:
if(CIFS_UNIX_EXTATTR_CAP & caps) {
if(get_user(ExtAttrBits,(int __user *)arg)) {
rc = -EFAULT;
diff --git a/fs/compat_ioctl.c b/fs/compat_ioctl.c
index d33a2b1..c4d2849 100644
--- a/fs/compat_ioctl.c
+++ b/fs/compat_ioctl.c
@@ -123,21 +123,6 @@ #include <linux/dvb/frontend.h>
#include <linux/dvb/video.h>
#include <linux/lp.h>

-/* Aiee. Someone does not find a difference between int and long */
-#define EXT2_IOC32_GETFLAGS _IOR('f', 1, int)
-#define EXT2_IOC32_SETFLAGS _IOW('f', 2, int)
-#define EXT3_IOC32_GETVERSION _IOR('f', 3, int)
-#define EXT3_IOC32_SETVERSION _IOW('f', 4, int)
-#define EXT3_IOC32_GETRSVSZ _IOR('f', 5, int)
-#define EXT3_IOC32_SETRSVSZ _IOW('f', 6, int)
-#define EXT3_IOC32_GROUP_EXTEND _IOW('f', 7, unsigned int)
-#ifdef CONFIG_JBD_DEBUG
-#define EXT3_IOC32_WAIT_FOR_READONLY _IOR('f', 99, int)
-#endif
-
-#define EXT2_IOC32_GETVERSION _IOR('v', 1, int)
-#define EXT2_IOC32_SETVERSION _IOW('v', 2, int)
-
static int do_ioctl32_pointer(unsigned int fd, unsigned int cmd,
unsigned long arg, struct file *f)
{
diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h
index 8a1ca5e..3915635 100644
--- a/fs/hfsplus/hfsplus_fs.h
+++ b/fs/hfsplus/hfsplus_fs.h
@@ -246,12 +246,8 @@ #define hfs_part_find hfsplus_part_find

/* ext2 ioctls (EXT2_IOC_GETFLAGS and EXT2_IOC_SETFLAGS) to support
* chattr/lsattr */
-#define HFSPLUS_IOC_EXT2_GETFLAGS _IOR('f', 1, long)
-#define HFSPLUS_IOC_EXT2_SETFLAGS _IOW('f', 2, long)
-
-#define EXT2_FLAG_IMMUTABLE 0x00000010 /* Immutable file */
-#define EXT2_FLAG_APPEND 0x00000020 /* writes to file may only append */
-#define EXT2_FLAG_NODUMP 0x00000040 /* do not dump file */
+#define HFSPLUS_IOC_EXT2_GETFLAGS FS_IOC_GETFLAGS
+#define HFSPLUS_IOC_EXT2_SETFLAGS FS_IOC_SETFLAGS


/*
diff --git a/fs/hfsplus/ioctl.c b/fs/hfsplus/ioctl.c
index 13cf848..79fd104 100644
--- a/fs/hfsplus/ioctl.c
+++ b/fs/hfsplus/ioctl.c
@@ -28,11 +28,11 @@ int hfsplus_ioctl(struct inode *inode, s
case HFSPLUS_IOC_EXT2_GETFLAGS:
flags = 0;
if (HFSPLUS_I(inode).rootflags & HFSPLUS_FLG_IMMUTABLE)
- flags |= EXT2_FLAG_IMMUTABLE; /* EXT2_IMMUTABLE_FL */
+ flags |= FS_IMMUTABLE_FL; /* EXT2_IMMUTABLE_FL */
if (HFSPLUS_I(inode).rootflags & HFSPLUS_FLG_APPEND)
- flags |= EXT2_FLAG_APPEND; /* EXT2_APPEND_FL */
+ flags |= FS_APPEND_FL; /* EXT2_APPEND_FL */
if (HFSPLUS_I(inode).userflags & HFSPLUS_FLG_NODUMP)
- flags |= EXT2_FLAG_NODUMP; /* EXT2_NODUMP_FL */
+ flags |= FS_NODUMP_FL; /* EXT2_NODUMP_FL */
return put_user(flags, (int __user *)arg);
case HFSPLUS_IOC_EXT2_SETFLAGS: {
if (IS_RDONLY(inode))
@@ -44,32 +44,31 @@ int hfsplus_ioctl(struct inode *inode, s
if (get_user(flags, (int __user *)arg))
return -EFAULT;

- if (flags & (EXT2_FLAG_IMMUTABLE|EXT2_FLAG_APPEND) ||
+ if (flags & (FS_IMMUTABLE_FL|FS_APPEND_FL) ||
HFSPLUS_I(inode).rootflags & (HFSPLUS_FLG_IMMUTABLE|HFSPLUS_FLG_APPEND)) {
if (!capable(CAP_LINUX_IMMUTABLE))
return -EPERM;
}

/* don't silently ignore unsupported ext2 flags */
- if (flags & ~(EXT2_FLAG_IMMUTABLE|EXT2_FLAG_APPEND|
- EXT2_FLAG_NODUMP))
+ if (flags & ~(FS_IMMUTABLE_FL|FS_APPEND_FL|FS_NODUMP_FL))
return -EOPNOTSUPP;

- if (flags & EXT2_FLAG_IMMUTABLE) { /* EXT2_IMMUTABLE_FL */
+ if (flags & FS_IMMUTABLE_FL) { /* EXT2_IMMUTABLE_FL */
inode->i_flags |= S_IMMUTABLE;
HFSPLUS_I(inode).rootflags |= HFSPLUS_FLG_IMMUTABLE;
} else {
inode->i_flags &= ~S_IMMUTABLE;
HFSPLUS_I(inode).rootflags &= ~HFSPLUS_FLG_IMMUTABLE;
}
- if (flags & EXT2_FLAG_APPEND) { /* EXT2_APPEND_FL */
+ if (flags & FS_APPEND_FL) { /* EXT2_APPEND_FL */
inode->i_flags |= S_APPEND;
HFSPLUS_I(inode).rootflags |= HFSPLUS_FLG_APPEND;
} else {
inode->i_flags &= ~S_APPEND;
HFSPLUS_I(inode).rootflags &= ~HFSPLUS_FLG_APPEND;
}
- if (flags & EXT2_FLAG_NODUMP) /* EXT2_NODUMP_FL */
+ if (flags & FS_NODUMP_FL) /* EXT2_NODUMP_FL */
HFSPLUS_I(inode).userflags |= HFSPLUS_FLG_NODUMP;
else
HFSPLUS_I(inode).userflags &= ~HFSPLUS_FLG_NODUMP;
diff --git a/fs/jfs/ioctl.c b/fs/jfs/ioctl.c
index 67b3774..37db524 100644
--- a/fs/jfs/ioctl.c
+++ b/fs/jfs/ioctl.c
@@ -6,7 +6,6 @@
*/

#include <linux/fs.h>
-#include <linux/ext2_fs.h>
#include <linux/ctype.h>
#include <linux/capability.h>
#include <linux/time.h>
@@ -22,13 +21,13 @@ static struct {
long jfs_flag;
long ext2_flag;
} jfs_map[] = {
- {JFS_NOATIME_FL, EXT2_NOATIME_FL},
- {JFS_DIRSYNC_FL, EXT2_DIRSYNC_FL},
- {JFS_SYNC_FL, EXT2_SYNC_FL},
- {JFS_SECRM_FL, EXT2_SECRM_FL},
- {JFS_UNRM_FL, EXT2_UNRM_FL},
- {JFS_APPEND_FL, EXT2_APPEND_FL},
- {JFS_IMMUTABLE_FL, EXT2_IMMUTABLE_FL},
+ {JFS_NOATIME_FL, FS_NOATIME_FL},
+ {JFS_DIRSYNC_FL, FS_DIRSYNC_FL},
+ {JFS_SYNC_FL, FS_SYNC_FL},
+ {JFS_SECRM_FL, FS_SECRM_FL},
+ {JFS_UNRM_FL, FS_UNRM_FL},
+ {JFS_APPEND_FL, FS_APPEND_FL},
+ {JFS_IMMUTABLE_FL, FS_IMMUTABLE_FL},
{0, 0},
};

diff --git a/include/linux/ext2_fs.h b/include/linux/ext2_fs.h
index facf34e..9996d2e 100644
--- a/include/linux/ext2_fs.h
+++ b/include/linux/ext2_fs.h
@@ -169,41 +169,49 @@ #define EXT2_TIND_BLOCK (EXT2_DIND_BLO
#define EXT2_N_BLOCKS (EXT2_TIND_BLOCK + 1)

/*
- * Inode flags
- */
-#define EXT2_SECRM_FL 0x00000001 /* Secure deletion */
-#define EXT2_UNRM_FL 0x00000002 /* Undelete */
-#define EXT2_COMPR_FL 0x00000004 /* Compress file */
-#define EXT2_SYNC_FL 0x00000008 /* Synchronous updates */
-#define EXT2_IMMUTABLE_FL 0x00000010 /* Immutable file */
-#define EXT2_APPEND_FL 0x00000020 /* writes to file may only append */
-#define EXT2_NODUMP_FL 0x00000040 /* do not dump file */
-#define EXT2_NOATIME_FL 0x00000080 /* do not update atime */
+ * Inode flags (GETFLAGS/SETFLAGS)
+ */
+#define EXT2_SECRM_FL FS_SECRM_FL /* Secure deletion */
+#define EXT2_UNRM_FL FS_UNRM_FL /* Undelete */
+#define EXT2_COMPR_FL FS_COMPR_FL /* Compress file */
+#define EXT2_SYNC_FL FS_SYNC_FL /* Synchronous updates */
+#define EXT2_IMMUTABLE_FL FS_IMMUTABLE_FL /* Immutable file */
+#define EXT2_APPEND_FL FS_APPEND_FL /* writes to file may only append */
+#define EXT2_NODUMP_FL FS_NODUMP_FL /* do not dump file */
+#define EXT2_NOATIME_FL FS_NOATIME_FL /* do not update atime */
/* Reserved for compression usage... */
-#define EXT2_DIRTY_FL 0x00000100
-#define EXT2_COMPRBLK_FL 0x00000200 /* One or more compressed clusters */
-#define EXT2_NOCOMP_FL 0x00000400 /* Don't compress */
-#define EXT2_ECOMPR_FL 0x00000800 /* Compression error */
+#define EXT2_DIRTY_FL FS_DIRTY_FL
+#define EXT2_COMPRBLK_FL FS_COMPRBLK_FL /* One or more compressed clusters */
+#define EXT2_NOCOMP_FL FS_NOCOMP_FL /* Don't compress */
+#define EXT2_ECOMPR_FL FS_ECOMPR_FL /* Compression error */
/* End compression flags --- maybe not all used */
-#define EXT2_BTREE_FL 0x00001000 /* btree format dir */
-#define EXT2_INDEX_FL 0x00001000 /* hash-indexed directory */
-#define EXT2_IMAGIC_FL 0x00002000 /* AFS directory */
-#define EXT2_JOURNAL_DATA_FL 0x00004000 /* Reserved for ext3 */
-#define EXT2_NOTAIL_FL 0x00008000 /* file tail should not be merged */
-#define EXT2_DIRSYNC_FL 0x00010000 /* dirsync behaviour (directories only) */
-#define EXT2_TOPDIR_FL 0x00020000 /* Top of directory hierarchies*/
-#define EXT2_RESERVED_FL 0x80000000 /* reserved for ext2 lib */
-
-#define EXT2_FL_USER_VISIBLE 0x0003DFFF /* User visible flags */
-#define EXT2_FL_USER_MODIFIABLE 0x000380FF /* User modifiable flags */
+#define EXT2_BTREE_FL FS_BTREE_FL /* btree format dir */
+#define EXT2_INDEX_FL FS_INDEX_FL /* hash-indexed directory */
+#define EXT2_IMAGIC_FL FS_IMAGIC_FL /* AFS directory */
+#define EXT2_JOURNAL_DATA_FL FS_JOURNAL_DATA_FL /* Reserved for ext3 */
+#define EXT2_NOTAIL_FL FS_NOTAIL_FL /* file tail should not be merged */
+#define EXT2_DIRSYNC_FL FS_DIRSYNC_FL /* dirsync behaviour (directories only) */
+#define EXT2_TOPDIR_FL FS_TOPDIR_FL /* Top of directory hierarchies*/
+#define EXT2_RESERVED_FL FS_RESERVED_FL /* reserved for ext2 lib */
+
+#define EXT2_FL_USER_VISIBLE FS_FL_USER_VISIBLE /* User visible flags */
+#define EXT2_FL_USER_MODIFIABLE FS_FL_USER_MODIFIABLE /* User modifiable flags */

/*
* ioctl commands
*/
-#define EXT2_IOC_GETFLAGS _IOR('f', 1, long)
-#define EXT2_IOC_SETFLAGS _IOW('f', 2, long)
-#define EXT2_IOC_GETVERSION _IOR('v', 1, long)
-#define EXT2_IOC_SETVERSION _IOW('v', 2, long)
+#define EXT2_IOC_GETFLAGS FS_IOC_GETFLAGS
+#define EXT2_IOC_SETFLAGS FS_IOC_SETFLAGS
+#define EXT2_IOC_GETVERSION FS_IOC_GETVERSION
+#define EXT2_IOC_SETVERSION FS_IOC_SETVERSION
+
+/*
+ * ioctl commands in 32 bit emulation
+ */
+#define EXT2_IOC32_GETFLAGS FS_IOC32_GETFLAGS
+#define EXT2_IOC32_SETFLAGS FS_IOC32_SETFLAGS
+#define EXT2_IOC32_GETVERSION FS_IOC32_GETVERSION
+#define EXT2_IOC32_SETVERSION FS_IOC32_SETVERSION

/*
* Structure of an inode on the disk
diff --git a/include/linux/ext3_fs.h b/include/linux/ext3_fs.h
index 9f9cce7..90cfba2 100644
--- a/include/linux/ext3_fs.h
+++ b/include/linux/ext3_fs.h
@@ -220,14 +220,14 @@ struct ext3_new_group_data {
/*
* ioctl commands
*/
-#define EXT3_IOC_GETFLAGS _IOR('f', 1, long)
-#define EXT3_IOC_SETFLAGS _IOW('f', 2, long)
+#define EXT3_IOC_GETFLAGS FS_IOC_GETFLAGS
+#define EXT3_IOC_SETFLAGS FS_IOC_SETFLAGS
#define EXT3_IOC_GETVERSION _IOR('f', 3, long)
#define EXT3_IOC_SETVERSION _IOW('f', 4, long)
#define EXT3_IOC_GROUP_EXTEND _IOW('f', 7, unsigned long)
#define EXT3_IOC_GROUP_ADD _IOW('f', 8,struct ext3_new_group_input)
-#define EXT3_IOC_GETVERSION_OLD _IOR('v', 1, long)
-#define EXT3_IOC_SETVERSION_OLD _IOW('v', 2, long)
+#define EXT3_IOC_GETVERSION_OLD FS_IOC_GETVERSION
+#define EXT3_IOC_SETVERSION_OLD FS_IOC_SETVERSION
#ifdef CONFIG_JBD_DEBUG
#define EXT3_IOC_WAIT_FOR_READONLY _IOR('f', 99, long)
#endif
@@ -235,6 +235,18 @@ #define EXT3_IOC_GETRSVSZ _IOR('f', 5,
#define EXT3_IOC_SETRSVSZ _IOW('f', 6, long)

/*
+ * ioctl commands in 32 bit emulation
+ */
+#define EXT3_IOC32_GETVERSION _IOR('f', 3, int)
+#define EXT3_IOC32_SETVERSION _IOW('f', 4, int)
+#define EXT3_IOC32_GETRSVSZ _IOR('f', 5, int)
+#define EXT3_IOC32_SETRSVSZ _IOW('f', 6, int)
+#define EXT3_IOC32_GROUP_EXTEND _IOW('f', 7, unsigned int)
+#ifdef CONFIG_JBD_DEBUG
+#define EXT3_IOC32_WAIT_FOR_READONLY _IOR('f', 99, int)
+#endif
+
+/*
* Mount options
*/
struct ext3_mount_options {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 429bda5..7339d41 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -216,6 +216,45 @@ #define BMAP_IOCTL 1 /* obsolete - kept
#define FIBMAP _IO(0x00,1) /* bmap access */
#define FIGETBSZ _IO(0x00,2) /* get the block size used for bmap */

+#define FS_IOC_GETFLAGS _IOR('f', 1, long)
+#define FS_IOC_SETFLAGS _IOW('f', 2, long)
+#define FS_IOC_GETVERSION _IOR('v', 1, long)
+#define FS_IOC_SETVERSION _IOW('v', 2, long)
+#define FS_IOC32_GETFLAGS _IOR('f', 1, int)
+#define FS_IOC32_SETFLAGS _IOW('f', 2, int)
+#define FS_IOC32_GETVERSION _IOR('v', 1, int)
+#define FS_IOC32_SETVERSION _IOW('v', 2, int)
+
+/*
+ * Inode flags (FS_IOC_GETFLAGS / FS_IOC_SETFLAGS)
+ */
+#define FS_SECRM_FL 0x00000001 /* Secure deletion */
+#define FS_UNRM_FL 0x00000002 /* Undelete */
+#define FS_COMPR_FL 0x00000004 /* Compress file */
+#define FS_SYNC_FL 0x00000008 /* Synchronous updates */
+#define FS_IMMUTABLE_FL 0x00000010 /* Immutable file */
+#define FS_APPEND_FL 0x00000020 /* writes to file may only append */
+#define FS_NODUMP_FL 0x00000040 /* do not dump file */
+#define FS_NOATIME_FL 0x00000080 /* do not update atime */
+/* Reserved for compression usage... */
+#define FS_DIRTY_FL 0x00000100
+#define FS_COMPRBLK_FL 0x00000200 /* One or more compressed clusters */
+#define FS_NOCOMP_FL 0x00000400 /* Don't compress */
+#define FS_ECOMPR_FL 0x00000800 /* Compression error */
+/* End compression flags --- maybe not all used */
+#define FS_BTREE_FL 0x00001000 /* btree format dir */
+#define FS_INDEX_FL 0x00001000 /* hash-indexed directory */
+#define FS_IMAGIC_FL 0x00002000 /* AFS directory */
+#define FS_JOURNAL_DATA_FL 0x00004000 /* Reserved for ext3 */
+#define FS_NOTAIL_FL 0x00008000 /* file tail should not be merged */
+#define FS_DIRSYNC_FL 0x00010000 /* dirsync behaviour (directories only) */
+#define FS_TOPDIR_FL 0x00020000 /* Top of directory hierarchies*/
+#define FS_RESERVED_FL 0x80000000 /* reserved for ext2 lib */
+
+#define FS_FL_USER_VISIBLE 0x0003DFFF /* User visible flags */
+#define FS_FL_USER_MODIFIABLE 0x000380FF /* User modifiable flags */
+
+
#define SYNC_FILE_RANGE_WAIT_BEFORE 1
#define SYNC_FILE_RANGE_WRITE 2
#define SYNC_FILE_RANGE_WAIT_AFTER 4
diff --git a/include/linux/reiserfs_fs.h b/include/linux/reiserfs_fs.h
index daa2d83..54c3054 100644
--- a/include/linux/reiserfs_fs.h
+++ b/include/linux/reiserfs_fs.h
@@ -813,21 +813,19 @@ #define sd_v1_first_direct_byte(sdp) \
#define set_sd_v1_first_direct_byte(sdp,v) \
((sdp)->sd_first_direct_byte = cpu_to_le32(v))

-#include <linux/ext2_fs.h>
-
/* inode flags stored in sd_attrs (nee sd_reserved) */

/* we want common flags to have the same values as in ext2,
so chattr(1) will work without problems */
-#define REISERFS_IMMUTABLE_FL EXT2_IMMUTABLE_FL
-#define REISERFS_APPEND_FL EXT2_APPEND_FL
-#define REISERFS_SYNC_FL EXT2_SYNC_FL
-#define REISERFS_NOATIME_FL EXT2_NOATIME_FL
-#define REISERFS_NODUMP_FL EXT2_NODUMP_FL
-#define REISERFS_SECRM_FL EXT2_SECRM_FL
-#define REISERFS_UNRM_FL EXT2_UNRM_FL
-#define REISERFS_COMPR_FL EXT2_COMPR_FL
-#define REISERFS_NOTAIL_FL EXT2_NOTAIL_FL
+#define REISERFS_IMMUTABLE_FL FS_IMMUTABLE_FL
+#define REISERFS_APPEND_FL FS_APPEND_FL
+#define REISERFS_SYNC_FL FS_SYNC_FL
+#define REISERFS_NOATIME_FL FS_NOATIME_FL
+#define REISERFS_NODUMP_FL FS_NODUMP_FL
+#define REISERFS_SECRM_FL FS_SECRM_FL
+#define REISERFS_UNRM_FL FS_UNRM_FL
+#define REISERFS_COMPR_FL FS_COMPR_FL
+#define REISERFS_NOTAIL_FL FS_NOTAIL_FL

/* persistent flags that file inherits from the parent directory */
#define REISERFS_INHERIT_MASK ( REISERFS_IMMUTABLE_FL | \
@@ -2174,10 +2172,10 @@ int reiserfs_ioctl(struct inode *inode,
#define REISERFS_IOC_UNPACK _IOW(0xCD,1,long)
/* define following flags to be the same as in ext2, so that chattr(1),
lsattr(1) will work with us. */
-#define REISERFS_IOC_GETFLAGS EXT2_IOC_GETFLAGS
-#define REISERFS_IOC_SETFLAGS EXT2_IOC_SETFLAGS
-#define REISERFS_IOC_GETVERSION EXT2_IOC_GETVERSION
-#define REISERFS_IOC_SETVERSION EXT2_IOC_SETVERSION
+#define REISERFS_IOC_GETFLAGS FS_IOC_GETFLAGS
+#define REISERFS_IOC_SETFLAGS FS_IOC_SETFLAGS
+#define REISERFS_IOC_GETVERSION FS_IOC_GETVERSION
+#define REISERFS_IOC_SETVERSION FS_IOC_SETVERSION

/* Locking primitives */
/* Right now we are still falling back to (un)lock_kernel, but eventually that

2006-08-24 21:39:20

by David Howells

Subject: [PATCH 04/17] BLOCK: Separate the bounce buffering code from the highmem code [try #2]

From: David Howells <[email protected]>

Move the bounce buffer code from mm/highmem.c to mm/bounce.c so that it can be
more easily disabled when the block layer is disabled.

!!!NOTE!!! There may be a bug in this code: Should init_emergency_pool() be
contingent on CONFIG_HIGHMEM?
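
As an illustration only (not part of the patch), the following userspace
sketch models the decision __blk_queue_bounce() makes for a write: a segment
whose page frame number is at or above the queue's bounce limit is copied into
a page taken from a low pool, and any other segment is passed through
untouched. All demo_* names and the tiny 8-byte "page" are hypothetical, and
the pool is assumed to have been set up by the caller with pfns below the
limit.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_SEG_SIZE 8	/* hypothetical tiny "page" size */

/* Hypothetical stand-in for a bio_vec plus the page it points at. */
struct demo_seg {
	unsigned long pfn;
	unsigned char data[DEMO_SEG_SIZE];
};

/* Mirrors the test "page_to_pfn(page) < q->bounce_pfn" from the diff. */
static int demo_needs_bounce(const struct demo_seg *seg,
			     unsigned long bounce_pfn)
{
	return seg->pfn >= bounce_pfn;
}

/*
 * For a write, copy every unreachable segment into the next free pool
 * entry.  out[i] ends up pointing either at the original segment or at
 * its bounced copy.  Returns the number of segments bounced.
 */
static int demo_bounce_write(const struct demo_seg *orig, int n,
			     unsigned long bounce_pfn,
			     struct demo_seg *pool,
			     const struct demo_seg **out)
{
	int i, bounced = 0;

	for (i = 0; i < n; i++) {
		if (!demo_needs_bounce(&orig[i], bounce_pfn)) {
			out[i] = &orig[i];	/* device can reach it */
			continue;
		}
		memcpy(pool[bounced].data, orig[i].data, DEMO_SEG_SIZE);
		out[i] = &pool[bounced];
		bounced++;
	}
	return bounced;
}
```

For a read the copy goes the other way, after the I/O completes, which is the
job of copy_to_high_bio_irq() in the code being moved.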

Signed-Off-By: David Howells <[email protected]>
---

mm/Makefile | 1
mm/bounce.c | 302 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
mm/highmem.c | 281 ------------------------------------------------------
3 files changed, 303 insertions(+), 281 deletions(-)

diff --git a/mm/Makefile b/mm/Makefile
index 9dd824c..84fff32 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -12,6 +12,7 @@ obj-y := bootmem.o filemap.o mempool.o
readahead.o swap.o truncate.o vmscan.o \
prio_tree.o util.o mmzone.o vmstat.o $(mmu-y)

+obj-y += bounce.o
obj-$(CONFIG_SWAP) += page_io.o swap_state.o swapfile.o thrash.o
obj-$(CONFIG_HUGETLBFS) += hugetlb.o
obj-$(CONFIG_NUMA) += mempolicy.o
diff --git a/mm/bounce.c b/mm/bounce.c
new file mode 100644
index 0000000..e042f87
--- /dev/null
+++ b/mm/bounce.c
@@ -0,0 +1,302 @@
+/* bounce.c: bounce buffer handling for block devices
+ *
+ * - Split from highmem.c
+ */
+
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/swap.h>
+#include <linux/bio.h>
+#include <linux/pagemap.h>
+#include <linux/mempool.h>
+#include <linux/blkdev.h>
+#include <linux/init.h>
+#include <linux/hash.h>
+#include <linux/highmem.h>
+#include <linux/blktrace_api.h>
+#include <asm/tlbflush.h>
+
+#define POOL_SIZE 64
+#define ISA_POOL_SIZE 16
+
+static mempool_t *page_pool, *isa_page_pool;
+
+#ifdef CONFIG_HIGHMEM
+static __init int init_emergency_pool(void)
+{
+ struct sysinfo i;
+ si_meminfo(&i);
+ si_swapinfo(&i);
+
+ if (!i.totalhigh)
+ return 0;
+
+ page_pool = mempool_create_page_pool(POOL_SIZE, 0);
+ BUG_ON(!page_pool);
+ printk("highmem bounce pool size: %d pages\n", POOL_SIZE);
+
+ return 0;
+}
+
+__initcall(init_emergency_pool);
+
+/*
+ * highmem version, map in to vec
+ */
+static void bounce_copy_vec(struct bio_vec *to, unsigned char *vfrom)
+{
+ unsigned long flags;
+ unsigned char *vto;
+
+ local_irq_save(flags);
+ vto = kmap_atomic(to->bv_page, KM_BOUNCE_READ);
+ memcpy(vto + to->bv_offset, vfrom, to->bv_len);
+ kunmap_atomic(vto, KM_BOUNCE_READ);
+ local_irq_restore(flags);
+}
+
+#else /* CONFIG_HIGHMEM */
+
+#define bounce_copy_vec(to, vfrom) \
+ memcpy(page_address((to)->bv_page) + (to)->bv_offset, vfrom, (to)->bv_len)
+
+#endif /* CONFIG_HIGHMEM */
+
+/*
+ * allocate pages in the DMA region for the ISA pool
+ */
+static void *mempool_alloc_pages_isa(gfp_t gfp_mask, void *data)
+{
+ return mempool_alloc_pages(gfp_mask | GFP_DMA, data);
+}
+
+/*
+ * gets called "every" time someone init's a queue with BLK_BOUNCE_ISA
+ * as the max address, so check if the pool has already been created.
+ */
+int init_emergency_isa_pool(void)
+{
+ if (isa_page_pool)
+ return 0;
+
+ isa_page_pool = mempool_create(ISA_POOL_SIZE, mempool_alloc_pages_isa,
+ mempool_free_pages, (void *) 0);
+ BUG_ON(!isa_page_pool);
+
+ printk("isa bounce pool size: %d pages\n", ISA_POOL_SIZE);
+ return 0;
+}
+
+/*
+ * Simple bounce buffer support for highmem pages. Depending on the
+ * queue gfp mask set, *to may or may not be a highmem page. kmap it
+ * always, it will do the Right Thing
+ */
+static void copy_to_high_bio_irq(struct bio *to, struct bio *from)
+{
+ unsigned char *vfrom;
+ struct bio_vec *tovec, *fromvec;
+ int i;
+
+ __bio_for_each_segment(tovec, to, i, 0) {
+ fromvec = from->bi_io_vec + i;
+
+ /*
+ * not bounced
+ */
+ if (tovec->bv_page == fromvec->bv_page)
+ continue;
+
+ /*
+ * fromvec->bv_offset and fromvec->bv_len might have been
+ * modified by the block layer, so use the original copy,
+ * bounce_copy_vec already uses tovec->bv_len
+ */
+ vfrom = page_address(fromvec->bv_page) + tovec->bv_offset;
+
+ flush_dcache_page(tovec->bv_page);
+ bounce_copy_vec(tovec, vfrom);
+ }
+}
+
+static void bounce_end_io(struct bio *bio, mempool_t *pool, int err)
+{
+ struct bio *bio_orig = bio->bi_private;
+ struct bio_vec *bvec, *org_vec;
+ int i;
+
+ if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags))
+ set_bit(BIO_EOPNOTSUPP, &bio_orig->bi_flags);
+
+ /*
+ * free up bounce indirect pages used
+ */
+ __bio_for_each_segment(bvec, bio, i, 0) {
+ org_vec = bio_orig->bi_io_vec + i;
+ if (bvec->bv_page == org_vec->bv_page)
+ continue;
+
+ dec_zone_page_state(bvec->bv_page, NR_BOUNCE);
+ mempool_free(bvec->bv_page, pool);
+ }
+
+ bio_endio(bio_orig, bio_orig->bi_size, err);
+ bio_put(bio);
+}
+
+static int bounce_end_io_write(struct bio *bio, unsigned int bytes_done, int err)
+{
+ if (bio->bi_size)
+ return 1;
+
+ bounce_end_io(bio, page_pool, err);
+ return 0;
+}
+
+static int bounce_end_io_write_isa(struct bio *bio, unsigned int bytes_done, int err)
+{
+ if (bio->bi_size)
+ return 1;
+
+ bounce_end_io(bio, isa_page_pool, err);
+ return 0;
+}
+
+static void __bounce_end_io_read(struct bio *bio, mempool_t *pool, int err)
+{
+ struct bio *bio_orig = bio->bi_private;
+
+ if (test_bit(BIO_UPTODATE, &bio->bi_flags))
+ copy_to_high_bio_irq(bio_orig, bio);
+
+ bounce_end_io(bio, pool, err);
+}
+
+static int bounce_end_io_read(struct bio *bio, unsigned int bytes_done, int err)
+{
+ if (bio->bi_size)
+ return 1;
+
+ __bounce_end_io_read(bio, page_pool, err);
+ return 0;
+}
+
+static int bounce_end_io_read_isa(struct bio *bio, unsigned int bytes_done, int err)
+{
+ if (bio->bi_size)
+ return 1;
+
+ __bounce_end_io_read(bio, isa_page_pool, err);
+ return 0;
+}
+
+static void __blk_queue_bounce(request_queue_t *q, struct bio **bio_orig,
+ mempool_t *pool)
+{
+ struct page *page;
+ struct bio *bio = NULL;
+ int i, rw = bio_data_dir(*bio_orig);
+ struct bio_vec *to, *from;
+
+ bio_for_each_segment(from, *bio_orig, i) {
+ page = from->bv_page;
+
+ /*
+ * is destination page below bounce pfn?
+ */
+ if (page_to_pfn(page) < q->bounce_pfn)
+ continue;
+
+ /*
+ * irk, bounce it
+ */
+ if (!bio)
+ bio = bio_alloc(GFP_NOIO, (*bio_orig)->bi_vcnt);
+
+ to = bio->bi_io_vec + i;
+
+ to->bv_page = mempool_alloc(pool, q->bounce_gfp);
+ to->bv_len = from->bv_len;
+ to->bv_offset = from->bv_offset;
+ inc_zone_page_state(to->bv_page, NR_BOUNCE);
+
+ if (rw == WRITE) {
+ char *vto, *vfrom;
+
+ flush_dcache_page(from->bv_page);
+ vto = page_address(to->bv_page) + to->bv_offset;
+ vfrom = kmap(from->bv_page) + from->bv_offset;
+ memcpy(vto, vfrom, to->bv_len);
+ kunmap(from->bv_page);
+ }
+ }
+
+ /*
+ * no pages bounced
+ */
+ if (!bio)
+ return;
+
+ /*
+ * at least one page was bounced, fill in possible non-highmem
+ * pages
+ */
+ __bio_for_each_segment(from, *bio_orig, i, 0) {
+ to = bio_iovec_idx(bio, i);
+ if (!to->bv_page) {
+ to->bv_page = from->bv_page;
+ to->bv_len = from->bv_len;
+ to->bv_offset = from->bv_offset;
+ }
+ }
+
+ bio->bi_bdev = (*bio_orig)->bi_bdev;
+ bio->bi_flags |= (1 << BIO_BOUNCED);
+ bio->bi_sector = (*bio_orig)->bi_sector;
+ bio->bi_rw = (*bio_orig)->bi_rw;
+
+ bio->bi_vcnt = (*bio_orig)->bi_vcnt;
+ bio->bi_idx = (*bio_orig)->bi_idx;
+ bio->bi_size = (*bio_orig)->bi_size;
+
+ if (pool == page_pool) {
+ bio->bi_end_io = bounce_end_io_write;
+ if (rw == READ)
+ bio->bi_end_io = bounce_end_io_read;
+ } else {
+ bio->bi_end_io = bounce_end_io_write_isa;
+ if (rw == READ)
+ bio->bi_end_io = bounce_end_io_read_isa;
+ }
+
+ bio->bi_private = *bio_orig;
+ *bio_orig = bio;
+}
+
+void blk_queue_bounce(request_queue_t *q, struct bio **bio_orig)
+{
+ mempool_t *pool;
+
+ /*
+ * for non-isa bounce case, just check if the bounce pfn is equal
+ * to or bigger than the highest pfn in the system -- in that case,
+ * don't waste time iterating over bio segments
+ */
+ if (!(q->bounce_gfp & GFP_DMA)) {
+ if (q->bounce_pfn >= blk_max_pfn)
+ return;
+ pool = page_pool;
+ } else {
+ BUG_ON(!isa_page_pool);
+ pool = isa_page_pool;
+ }
+
+ blk_add_trace_bio(q, *bio_orig, BLK_TA_BOUNCE);
+
+ /*
+ * slow path
+ */
+ __blk_queue_bounce(q, bio_orig, pool);
+}
+
+EXPORT_SYMBOL(blk_queue_bounce);
diff --git a/mm/highmem.c b/mm/highmem.c
index 9b2a540..1ac20d6 100644
--- a/mm/highmem.c
+++ b/mm/highmem.c
@@ -29,13 +29,6 @@ #include <linux/highmem.h>
#include <linux/blktrace_api.h>
#include <asm/tlbflush.h>

-static mempool_t *page_pool, *isa_page_pool;
-
-static void *mempool_alloc_pages_isa(gfp_t gfp_mask, void *data)
-{
- return mempool_alloc_pages(gfp_mask | GFP_DMA, data);
-}
-
/*
* Virtual_count is not a pure "count".
* 0 means that it is not mapped, and has not been mapped
@@ -204,282 +197,8 @@ void fastcall kunmap_high(struct page *p
}

EXPORT_SYMBOL(kunmap_high);
-
-#define POOL_SIZE 64
-
-static __init int init_emergency_pool(void)
-{
- struct sysinfo i;
- si_meminfo(&i);
- si_swapinfo(&i);
-
- if (!i.totalhigh)
- return 0;
-
- page_pool = mempool_create_page_pool(POOL_SIZE, 0);
- BUG_ON(!page_pool);
- printk("highmem bounce pool size: %d pages\n", POOL_SIZE);
-
- return 0;
-}
-
-__initcall(init_emergency_pool);
-
-/*
- * highmem version, map in to vec
- */
-static void bounce_copy_vec(struct bio_vec *to, unsigned char *vfrom)
-{
- unsigned long flags;
- unsigned char *vto;
-
- local_irq_save(flags);
- vto = kmap_atomic(to->bv_page, KM_BOUNCE_READ);
- memcpy(vto + to->bv_offset, vfrom, to->bv_len);
- kunmap_atomic(vto, KM_BOUNCE_READ);
- local_irq_restore(flags);
-}
-
-#else /* CONFIG_HIGHMEM */
-
-#define bounce_copy_vec(to, vfrom) \
- memcpy(page_address((to)->bv_page) + (to)->bv_offset, vfrom, (to)->bv_len)
-
#endif

-#define ISA_POOL_SIZE 16
-
-/*
- * gets called "every" time someone init's a queue with BLK_BOUNCE_ISA
- * as the max address, so check if the pool has already been created.
- */
-int init_emergency_isa_pool(void)
-{
- if (isa_page_pool)
- return 0;
-
- isa_page_pool = mempool_create(ISA_POOL_SIZE, mempool_alloc_pages_isa,
- mempool_free_pages, (void *) 0);
- BUG_ON(!isa_page_pool);
-
- printk("isa bounce pool size: %d pages\n", ISA_POOL_SIZE);
- return 0;
-}
-
-/*
- * Simple bounce buffer support for highmem pages. Depending on the
- * queue gfp mask set, *to may or may not be a highmem page. kmap it
- * always, it will do the Right Thing
- */
-static void copy_to_high_bio_irq(struct bio *to, struct bio *from)
-{
- unsigned char *vfrom;
- struct bio_vec *tovec, *fromvec;
- int i;
-
- __bio_for_each_segment(tovec, to, i, 0) {
- fromvec = from->bi_io_vec + i;
-
- /*
- * not bounced
- */
- if (tovec->bv_page == fromvec->bv_page)
- continue;
-
- /*
- * fromvec->bv_offset and fromvec->bv_len might have been
- * modified by the block layer, so use the original copy,
- * bounce_copy_vec already uses tovec->bv_len
- */
- vfrom = page_address(fromvec->bv_page) + tovec->bv_offset;
-
- flush_dcache_page(tovec->bv_page);
- bounce_copy_vec(tovec, vfrom);
- }
-}
-
-static void bounce_end_io(struct bio *bio, mempool_t *pool, int err)
-{
- struct bio *bio_orig = bio->bi_private;
- struct bio_vec *bvec, *org_vec;
- int i;
-
- if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags))
- set_bit(BIO_EOPNOTSUPP, &bio_orig->bi_flags);
-
- /*
- * free up bounce indirect pages used
- */
- __bio_for_each_segment(bvec, bio, i, 0) {
- org_vec = bio_orig->bi_io_vec + i;
- if (bvec->bv_page == org_vec->bv_page)
- continue;
-
- dec_zone_page_state(bvec->bv_page, NR_BOUNCE);
- mempool_free(bvec->bv_page, pool);
- }
-
- bio_endio(bio_orig, bio_orig->bi_size, err);
- bio_put(bio);
-}
-
-static int bounce_end_io_write(struct bio *bio, unsigned int bytes_done, int err)
-{
- if (bio->bi_size)
- return 1;
-
- bounce_end_io(bio, page_pool, err);
- return 0;
-}
-
-static int bounce_end_io_write_isa(struct bio *bio, unsigned int bytes_done, int err)
-{
- if (bio->bi_size)
- return 1;
-
- bounce_end_io(bio, isa_page_pool, err);
- return 0;
-}
-
-static void __bounce_end_io_read(struct bio *bio, mempool_t *pool, int err)
-{
- struct bio *bio_orig = bio->bi_private;
-
- if (test_bit(BIO_UPTODATE, &bio->bi_flags))
- copy_to_high_bio_irq(bio_orig, bio);
-
- bounce_end_io(bio, pool, err);
-}
-
-static int bounce_end_io_read(struct bio *bio, unsigned int bytes_done, int err)
-{
- if (bio->bi_size)
- return 1;
-
- __bounce_end_io_read(bio, page_pool, err);
- return 0;
-}
-
-static int bounce_end_io_read_isa(struct bio *bio, unsigned int bytes_done, int err)
-{
- if (bio->bi_size)
- return 1;
-
- __bounce_end_io_read(bio, isa_page_pool, err);
- return 0;
-}
-
-static void __blk_queue_bounce(request_queue_t *q, struct bio **bio_orig,
- mempool_t *pool)
-{
- struct page *page;
- struct bio *bio = NULL;
- int i, rw = bio_data_dir(*bio_orig);
- struct bio_vec *to, *from;
-
- bio_for_each_segment(from, *bio_orig, i) {
- page = from->bv_page;
-
- /*
- * is destination page below bounce pfn?
- */
- if (page_to_pfn(page) < q->bounce_pfn)
- continue;
-
- /*
- * irk, bounce it
- */
- if (!bio)
- bio = bio_alloc(GFP_NOIO, (*bio_orig)->bi_vcnt);
-
- to = bio->bi_io_vec + i;
-
- to->bv_page = mempool_alloc(pool, q->bounce_gfp);
- to->bv_len = from->bv_len;
- to->bv_offset = from->bv_offset;
- inc_zone_page_state(to->bv_page, NR_BOUNCE);
-
- if (rw == WRITE) {
- char *vto, *vfrom;
-
- flush_dcache_page(from->bv_page);
- vto = page_address(to->bv_page) + to->bv_offset;
- vfrom = kmap(from->bv_page) + from->bv_offset;
- memcpy(vto, vfrom, to->bv_len);
- kunmap(from->bv_page);
- }
- }
-
- /*
- * no pages bounced
- */
- if (!bio)
- return;
-
- /*
- * at least one page was bounced, fill in possible non-highmem
- * pages
- */
- __bio_for_each_segment(from, *bio_orig, i, 0) {
- to = bio_iovec_idx(bio, i);
- if (!to->bv_page) {
- to->bv_page = from->bv_page;
- to->bv_len = from->bv_len;
- to->bv_offset = from->bv_offset;
- }
- }
-
- bio->bi_bdev = (*bio_orig)->bi_bdev;
- bio->bi_flags |= (1 << BIO_BOUNCED);
- bio->bi_sector = (*bio_orig)->bi_sector;
- bio->bi_rw = (*bio_orig)->bi_rw;
-
- bio->bi_vcnt = (*bio_orig)->bi_vcnt;
- bio->bi_idx = (*bio_orig)->bi_idx;
- bio->bi_size = (*bio_orig)->bi_size;
-
- if (pool == page_pool) {
- bio->bi_end_io = bounce_end_io_write;
- if (rw == READ)
- bio->bi_end_io = bounce_end_io_read;
- } else {
- bio->bi_end_io = bounce_end_io_write_isa;
- if (rw == READ)
- bio->bi_end_io = bounce_end_io_read_isa;
- }
-
- bio->bi_private = *bio_orig;
- *bio_orig = bio;
-}
-
-void blk_queue_bounce(request_queue_t *q, struct bio **bio_orig)
-{
- mempool_t *pool;
-
- /*
- * for non-isa bounce case, just check if the bounce pfn is equal
- * to or bigger than the highest pfn in the system -- in that case,
- * don't waste time iterating over bio segments
- */
- if (!(q->bounce_gfp & GFP_DMA)) {
- if (q->bounce_pfn >= blk_max_pfn)
- return;
- pool = page_pool;
- } else {
- BUG_ON(!isa_page_pool);
- pool = isa_page_pool;
- }
-
- blk_add_trace_bio(q, *bio_orig, BLK_TA_BOUNCE);
-
- /*
- * slow path
- */
- __blk_queue_bounce(q, bio_orig, pool);
-}
-
-EXPORT_SYMBOL(blk_queue_bounce);
-
#if defined(HASHED_PAGE_VIRTUAL)

#define PA_HASH_ORDER 7

2006-08-24 21:33:35

by David Howells

Subject: [PATCH 03/17] BLOCK: Stop fallback_migrate_page() from using page_has_buffers() [try #2]

From: David Howells <[email protected]>

Stop fallback_migrate_page() from using page_has_buffers(), since that might not
be available when the block layer is disabled. Use PagePrivate() instead, since
that's more general.
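
As an illustration only, the check after this change can be modelled in
userspace roughly as below. The demo_* names, the flag bit and the
"releasable" field are hypothetical stand-ins, not the kernel's definitions;
the point is only that the test needs nothing but a generic page flag.

```c
#include <errno.h>

#define DEMO_PG_PRIVATE (1u << 0)	/* hypothetical flag bit */

/* Hypothetical stand-in for struct page. */
struct demo_page {
	unsigned int flags;
	int releasable;		/* can the private data be dropped? */
};

/* Models PagePrivate(): a plain flag test, no buffer_head knowledge. */
static int demo_page_private(const struct demo_page *page)
{
	return (page->flags & DEMO_PG_PRIVATE) != 0;
}

/* Models try_to_release_page(): returns nonzero on success. */
static int demo_try_to_release(struct demo_page *page)
{
	if (!page->releasable)
		return 0;
	page->flags &= ~DEMO_PG_PRIVATE;
	return 1;
}

/* Models the check in fallback_migrate_page() after the change. */
static int demo_fallback_migrate(struct demo_page *page)
{
	if (demo_page_private(page) && !demo_try_to_release(page))
		return -EAGAIN;
	return 0;
}
```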

Signed-Off-By: David Howells <[email protected]>
---

mm/migrate.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 3f1e0c2..0227163 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -525,7 +525,7 @@ static int fallback_migrate_page(struct
* Buffers may be managed in a filesystem specific way.
* We must have no buffers or drop them.
*/
- if (page_has_buffers(page) &&
+ if (PagePrivate(page) &&
!try_to_release_page(page, GFP_KERNEL))
return -EAGAIN;

2006-08-24 21:33:41

by David Howells

Subject: [PATCH 13/17] BLOCK: Move the Ext2 device ioctl compat stuff to the Ext2 driver [try #2]

From: David Howells <[email protected]>

Move the Ext2 device ioctl compat stuff from fs/compat_ioctl.c to the Ext2
driver so that fs/compat_ioctl.c no longer needs to include the Ext2 header
file.
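
As an illustration only, the userspace sketch below shows why the translation
is needed at all: the native command numbers encode sizeof(long) while the
32-bit ones encode sizeof(int), so on a 64-bit kernel the two differ only in
the size field. It assumes a Linux <sys/ioctl.h> that provides _IOR, _IOW,
_IOC_SIZE and _IOC_NR; the DEMO_* names are hypothetical copies of the
definitions in the diff, and the -1 return is a placeholder for the kernel's
-ENOIOCTLCMD.

```c
#include <sys/ioctl.h>

/* Hypothetical copies of the native and 32-bit command numbers. */
#define DEMO_IOC_GETFLAGS	_IOR('f', 1, long)
#define DEMO_IOC_SETFLAGS	_IOW('f', 2, long)
#define DEMO_IOC32_GETFLAGS	_IOR('f', 1, int)
#define DEMO_IOC32_SETFLAGS	_IOW('f', 2, int)

/*
 * Map a 32-bit command onto its native equivalent.  An if-chain is used
 * rather than a switch so that this also compiles where long and int
 * have the same size and the two sets of numbers coincide.
 * Returns 0 on success, -1 for an unrecognised command.
 */
static int demo_translate_cmd(unsigned int cmd, unsigned int *native)
{
	if (cmd == DEMO_IOC32_GETFLAGS)
		*native = DEMO_IOC_GETFLAGS;
	else if (cmd == DEMO_IOC32_SETFLAGS)
		*native = DEMO_IOC_SETFLAGS;
	else
		return -1;
	return 0;
}
```

Once the command has been rewritten, the handler can call the ordinary ioctl
routine, which is what ext2_compat_ioctl() does in the patch.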

Signed-Off-By: David Howells <[email protected]>
---

fs/compat_ioctl.c | 17 -----------------
fs/ext2/dir.c | 3 +++
fs/ext2/ext2.h | 1 +
fs/ext2/file.c | 6 ++++++
fs/ext2/ioctl.c | 32 ++++++++++++++++++++++++++++++++
5 files changed, 42 insertions(+), 17 deletions(-)

diff --git a/fs/compat_ioctl.c b/fs/compat_ioctl.c
index 5e84342..24d5538 100644
--- a/fs/compat_ioctl.c
+++ b/fs/compat_ioctl.c
@@ -45,7 +45,6 @@ #include <linux/auto_fs4.h>
#include <linux/tty.h>
#include <linux/vt_kern.h>
#include <linux/fb.h>
-#include <linux/ext2_fs.h>
#include <linux/ext3_jbd.h>
#include <linux/ext3_fs.h>
#include <linux/videodev.h>
@@ -159,18 +158,6 @@ static int rw_long(unsigned int fd, unsi
return err;
}

-static int do_ext2_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
-{
- /* These are just misnamed, they actually get/put from/to user an int */
- switch (cmd) {
- case EXT2_IOC32_GETFLAGS: cmd = EXT2_IOC_GETFLAGS; break;
- case EXT2_IOC32_SETFLAGS: cmd = EXT2_IOC_SETFLAGS; break;
- case EXT2_IOC32_GETVERSION: cmd = EXT2_IOC_GETVERSION; break;
- case EXT2_IOC32_SETVERSION: cmd = EXT2_IOC_SETVERSION; break;
- }
- return sys_ioctl(fd, cmd, (unsigned long)compat_ptr(arg));
-}
-
static int do_ext3_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
{
/* These are just misnamed, they actually get/put from/to user an int */
@@ -2727,10 +2714,6 @@ HANDLE_IOCTL(PIO_UNIMAP, do_unimap_ioctl
HANDLE_IOCTL(GIO_UNIMAP, do_unimap_ioctl)
HANDLE_IOCTL(KDFONTOP, do_kdfontop_ioctl)
#endif
-HANDLE_IOCTL(EXT2_IOC32_GETFLAGS, do_ext2_ioctl)
-HANDLE_IOCTL(EXT2_IOC32_SETFLAGS, do_ext2_ioctl)
-HANDLE_IOCTL(EXT2_IOC32_GETVERSION, do_ext2_ioctl)
-HANDLE_IOCTL(EXT2_IOC32_SETVERSION, do_ext2_ioctl)
HANDLE_IOCTL(EXT3_IOC32_GETVERSION, do_ext3_ioctl)
HANDLE_IOCTL(EXT3_IOC32_SETVERSION, do_ext3_ioctl)
HANDLE_IOCTL(EXT3_IOC32_GETRSVSZ, do_ext3_ioctl)
diff --git a/fs/ext2/dir.c b/fs/ext2/dir.c
index 92ea826..3e7a84a 100644
--- a/fs/ext2/dir.c
+++ b/fs/ext2/dir.c
@@ -661,5 +661,8 @@ const struct file_operations ext2_dir_op
.read = generic_read_dir,
.readdir = ext2_readdir,
.ioctl = ext2_ioctl,
+#ifdef CONFIG_COMPAT
+ .compat_ioctl = ext2_compat_ioctl,
+#endif
.fsync = ext2_sync_file,
};
diff --git a/fs/ext2/ext2.h b/fs/ext2/ext2.h
index e65a019..c19ac15 100644
--- a/fs/ext2/ext2.h
+++ b/fs/ext2/ext2.h
@@ -137,6 +137,7 @@ extern void ext2_set_inode_flags(struct
/* ioctl.c */
extern int ext2_ioctl (struct inode *, struct file *, unsigned int,
unsigned long);
+extern long ext2_compat_ioctl(struct file *, unsigned int, unsigned long);

/* namei.c */
struct dentry *ext2_get_parent(struct dentry *child);
diff --git a/fs/ext2/file.c b/fs/ext2/file.c
index 23e2c7c..e8bbed9 100644
--- a/fs/ext2/file.c
+++ b/fs/ext2/file.c
@@ -46,6 +46,9 @@ const struct file_operations ext2_file_o
.aio_read = generic_file_aio_read,
.aio_write = generic_file_aio_write,
.ioctl = ext2_ioctl,
+#ifdef CONFIG_COMPAT
+ .compat_ioctl = ext2_compat_ioctl,
+#endif
.mmap = generic_file_mmap,
.open = generic_file_open,
.release = ext2_release_file,
@@ -63,6 +66,9 @@ const struct file_operations ext2_xip_fi
.read = xip_file_read,
.write = xip_file_write,
.ioctl = ext2_ioctl,
+#ifdef CONFIG_COMPAT
+ .compat_ioctl = ext2_compat_ioctl,
+#endif
.mmap = xip_file_mmap,
.open = generic_file_open,
.release = ext2_release_file,
diff --git a/fs/ext2/ioctl.c b/fs/ext2/ioctl.c
index 3ca9afd..1dfba77 100644
--- a/fs/ext2/ioctl.c
+++ b/fs/ext2/ioctl.c
@@ -11,6 +11,8 @@ #include "ext2.h"
#include <linux/capability.h>
#include <linux/time.h>
#include <linux/sched.h>
+#include <linux/compat.h>
+#include <linux/smp_lock.h>
#include <asm/current.h>
#include <asm/uaccess.h>

@@ -80,3 +82,33 @@ int ext2_ioctl (struct inode * inode, st
return -ENOTTY;
}
}
+
+#ifdef CONFIG_COMPAT
+long ext2_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+ struct inode *inode = file->f_dentry->d_inode;
+ int ret;
+
+ /* These are just misnamed, they actually get/put from/to user an int */
+ switch (cmd) {
+ case EXT2_IOC32_GETFLAGS:
+ cmd = EXT2_IOC_GETFLAGS;
+ break;
+ case EXT2_IOC32_SETFLAGS:
+ cmd = EXT2_IOC_SETFLAGS;
+ break;
+ case EXT2_IOC32_GETVERSION:
+ cmd = EXT2_IOC_GETVERSION;
+ break;
+ case EXT2_IOC32_SETVERSION:
+ cmd = EXT2_IOC_SETVERSION;
+ break;
+ default:
+ return -ENOIOCTLCMD;
+ }
+ lock_kernel();
+ ret = ext2_ioctl(inode, file, cmd, (unsigned long) compat_ptr(arg));
+ unlock_kernel();
+ return ret;
+}
+#endif

2006-08-24 21:38:11

by David Howells

Subject: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

From: David Howells <[email protected]>

Make it possible to disable the block layer. Not all embedded devices require
it; some can make do with just JFFS2, NFS, ramfs, etc., none of which requires
the block layer to be present.

This patch does the following:

(*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
support.

(*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
an item that uses the block layer. This includes:

(*) Block I/O tracing.

(*) Disk partition code.

(*) All filesystems that are block based, e.g. Ext3, ReiserFS, ISOFS.

(*) The SCSI layer. As far as I can tell, even SCSI chardevs use the
block layer to do scheduling.

(*) Various block-based device drivers, such as IDE, the old CDROM
drivers and USB storage.

(*) MTD blockdev handling and FTL.

(*) JFFS - which uses set_bdev_super(), something it could avoid doing by
taking a leaf out of JFFS2's book.

(*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is,
however, still used in places, and so is still available.

(*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
parts of linux/fs.h.

(*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.

(*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.

(*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
is not enabled.

(*) fs/no-block.c is created to hold out-of-line stubs and things that are
required when CONFIG_BLOCK is not set:

(*) Default blockdev file operations (to give error ENODEV on opening).

(*) Makes some /proc changes:

(*) /proc/devices does not list any blockdevs.

(*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.

(*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.

(*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
given a command other than Q_SYNC or if a special device is specified.

(*) In init/do_mounts.c, no reference is made to the blockdev routines if
CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2.

(*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
error ENOSYS by way of cond_syscall if so).

(*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
CONFIG_BLOCK is not set, since they can't then happen.
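
As an illustration only, the general pattern used throughout this patch can be
sketched in userspace as below: the real implementation is compiled when
CONFIG_BLOCK is set, and a stub that reports ENODEV takes its place otherwise.
The demo_* names and DEMO_Q_* values are hypothetical; only the ENODEV
behaviour and the Q_SYNC rule are taken from the description above.

```c
#include <errno.h>
#include <stddef.h>

/*
 * CONFIG_BLOCK is forced off here on purpose, to model a kernel
 * configured without the block layer.
 */
#undef CONFIG_BLOCK

/* Hypothetical command numbers, not the kernel's values. */
#define DEMO_Q_SYNC	1
#define DEMO_Q_GETQUOTA	2

#ifdef CONFIG_BLOCK
/* Stand-in for the real open path; always succeeds in this model. */
static inline int demo_blkdev_open(void)
{
	return 0;
}
#else
/* Stub: opening a blockdev file gives ENODEV. */
static inline int demo_blkdev_open(void)
{
	return -ENODEV;
}
#endif

/*
 * Models the sys_quotactl() rule: without the block layer, only a
 * Q_SYNC with no special device is allowed through.
 */
static int demo_quotactl(int cmd, const char *special)
{
#ifndef CONFIG_BLOCK
	if (cmd != DEMO_Q_SYNC || special != NULL)
		return -ENODEV;
#endif
	(void)cmd;
	(void)special;
	return 0;
}
```

Keeping the stubs behind the same names means callers need no #ifdefs of
their own.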

Signed-Off-By: David Howells <[email protected]>
---

block/Kconfig | 14 +++++++++
block/Kconfig.iosched | 3 ++
block/Makefile | 2 +
drivers/block/Kconfig | 4 +++
drivers/cdrom/Kconfig | 2 +
drivers/char/Kconfig | 1 +
drivers/char/random.c | 4 +++
drivers/ide/Kconfig | 4 +++
drivers/ieee1394/Kconfig | 2 +
drivers/infiniband/ulp/iser/Kconfig | 2 +
drivers/infiniband/ulp/srp/Kconfig | 2 +
drivers/md/Kconfig | 3 ++
drivers/message/i2o/Kconfig | 2 +
drivers/mmc/Kconfig | 2 +
drivers/mmc/Makefile | 3 +-
drivers/mtd/Kconfig | 12 ++++----
drivers/mtd/devices/Kconfig | 2 +
drivers/s390/block/Kconfig | 2 +
drivers/scsi/Kconfig | 8 +++--
drivers/usb/storage/Kconfig | 2 +
fs/Kconfig | 32 +++++++++++++++++-----
fs/Makefile | 14 +++++++--
fs/compat_ioctl.c | 18 ++++++++++++
fs/no-block.c | 22 +++++++++++++++
fs/partitions/Makefile | 2 +
fs/proc/proc_misc.c | 11 +++++++
fs/quota.c | 44 +++++++++++++++++++++---------
fs/super.c | 4 +++
fs/xfs/Kconfig | 1 +
include/linux/blkdev.h | 52 ++++++++++++++++++++++++++---------
include/linux/buffer_head.h | 16 +++++++++++
include/linux/compat_ioctl.h | 2 +
include/linux/elevator.h | 3 ++
include/linux/fs.h | 25 +++++++++++++++--
include/linux/genhd.h | 4 +++
include/linux/mpage.h | 3 ++
include/linux/raid/md.h | 3 ++
include/linux/raid/md_k.h | 3 ++
include/scsi/scsi_tcq.h | 3 +-
init/Kconfig | 2 +
init/do_mounts.c | 13 ++++++++-
kernel/sys_ni.c | 5 +++
mm/Makefile | 2 +
mm/filemap.c | 4 +++
mm/migrate.c | 2 +
mm/page-writeback.c | 8 +++--
security/seclvl.c | 4 +++
47 files changed, 308 insertions(+), 70 deletions(-)

diff --git a/block/Kconfig b/block/Kconfig
index b6f5f0a..9cc0d0b 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -1,6 +1,18 @@
#
# Block layer core configuration
#
+config BLOCK
+ bool "Enable the block layer"
+ default y
+ help
+ This permits the block layer to be removed from the kernel if it's not
+ needed (on some embedded devices for example). If this option is
+ disabled, then blockdev files will become unusable and some
+ filesystems (such as ext3) will become unavailable. Say Y here unless
+ you know you really don't want to mount disks and suchlike.
+
+if BLOCK
+
#XXX - it makes sense to enable this only for 32-bit subarch's, not for x86_64
#for instance.
config LBD
@@ -33,4 +45,6 @@ config LSF

If unsure, say Y.

+endif
+
source block/Kconfig.iosched
diff --git a/block/Kconfig.iosched b/block/Kconfig.iosched
index 48d090e..903f0d3 100644
--- a/block/Kconfig.iosched
+++ b/block/Kconfig.iosched
@@ -1,3 +1,4 @@
+if BLOCK

menu "IO Schedulers"

@@ -67,3 +68,5 @@ config DEFAULT_IOSCHED
default "noop" if DEFAULT_NOOP

endmenu
+
+endif
diff --git a/block/Makefile b/block/Makefile
index c05de0e..085e967 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -2,7 +2,7 @@ #
# Makefile for the kernel block layer
#

-obj-y := elevator.o ll_rw_blk.o ioctl.o genhd.o scsi_ioctl.o
+obj-$(CONFIG_BLOCK) := elevator.o ll_rw_blk.o ioctl.o genhd.o scsi_ioctl.o

obj-$(CONFIG_IOSCHED_NOOP) += noop-iosched.o
obj-$(CONFIG_IOSCHED_AS) += as-iosched.o
diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
index b5382ce..422e31d 100644
--- a/drivers/block/Kconfig
+++ b/drivers/block/Kconfig
@@ -2,6 +2,8 @@ #
# Block device driver configuration
#

+if BLOCK
+
menu "Block devices"

config BLK_DEV_FD
@@ -468,3 +470,5 @@ config ATA_OVER_ETH
devices like the Coraid EtherDrive (R) Storage Blade.

endmenu
+
+endif
diff --git a/drivers/cdrom/Kconfig b/drivers/cdrom/Kconfig
index ff5652d..4b12e90 100644
--- a/drivers/cdrom/Kconfig
+++ b/drivers/cdrom/Kconfig
@@ -3,7 +3,7 @@ # CDROM driver configuration
#

menu "Old CD-ROM drivers (not SCSI, not IDE)"
- depends on ISA
+ depends on ISA && BLOCK

config CD_NO_IDESCSI
bool "Support non-SCSI/IDE/ATAPI CDROM drives"
diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index c40e487..b9c6777 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -984,6 +984,7 @@ config GPIO_VR41XX

config RAW_DRIVER
tristate "RAW driver (/dev/raw/rawN) (OBSOLETE)"
+ depends on BLOCK
help
The raw driver permits block devices to be bound to /dev/raw/rawN.
Once bound, I/O against /dev/raw/rawN uses efficient zero-copy I/O.
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 4c3a5ca..b430a12 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -655,6 +655,7 @@ void add_interrupt_randomness(int irq)
add_timer_randomness(irq_timer_state[irq], 0x100 + irq);
}

+#ifdef CONFIG_BLOCK
void add_disk_randomness(struct gendisk *disk)
{
if (!disk || !disk->random)
@@ -667,6 +668,7 @@ void add_disk_randomness(struct gendisk
}

EXPORT_SYMBOL(add_disk_randomness);
+#endif

#define EXTRACT_SIZE 10

@@ -918,6 +920,7 @@ void rand_initialize_irq(int irq)
}
}

+#ifdef CONFIG_BLOCK
void rand_initialize_disk(struct gendisk *disk)
{
struct timer_rand_state *state;
@@ -932,6 +935,7 @@ void rand_initialize_disk(struct gendisk
disk->random = state;
}
}
+#endif

static ssize_t
random_read(struct file * file, char __user * buf, size_t nbytes, loff_t *ppos)
diff --git a/drivers/ide/Kconfig b/drivers/ide/Kconfig
index b6fb167..69d627b 100644
--- a/drivers/ide/Kconfig
+++ b/drivers/ide/Kconfig
@@ -4,6 +4,8 @@ #
# Andre Hedrick <[email protected]>
#

+if BLOCK
+
menu "ATA/ATAPI/MFM/RLL support"

config IDE
@@ -1082,3 +1084,5 @@ config BLK_DEV_HD
endif

endmenu
+
+endif
diff --git a/drivers/ieee1394/Kconfig b/drivers/ieee1394/Kconfig
index 1867375..c9d84b9 100644
--- a/drivers/ieee1394/Kconfig
+++ b/drivers/ieee1394/Kconfig
@@ -122,7 +122,7 @@ config IEEE1394_VIDEO1394

config IEEE1394_SBP2
tristate "SBP-2 support (Harddisks etc.)"
- depends on IEEE1394 && SCSI && (PCI || BROKEN)
+ depends on IEEE1394 && BLOCK && SCSI && (PCI || BROKEN)
help
This option enables you to use SBP-2 devices connected to your IEEE
1394 bus. SBP-2 devices include harddrives and DVD devices.
diff --git a/drivers/infiniband/ulp/iser/Kconfig b/drivers/infiniband/ulp/iser/Kconfig
index fead87d..f945953 100644
--- a/drivers/infiniband/ulp/iser/Kconfig
+++ b/drivers/infiniband/ulp/iser/Kconfig
@@ -1,6 +1,6 @@
config INFINIBAND_ISER
tristate "ISCSI RDMA Protocol"
- depends on INFINIBAND && SCSI
+ depends on INFINIBAND && BLOCK && SCSI
select SCSI_ISCSI_ATTRS
---help---
Support for the ISCSI RDMA Protocol over InfiniBand. This
diff --git a/drivers/infiniband/ulp/srp/Kconfig b/drivers/infiniband/ulp/srp/Kconfig
index 8fe3be4..63d7d5a 100644
--- a/drivers/infiniband/ulp/srp/Kconfig
+++ b/drivers/infiniband/ulp/srp/Kconfig
@@ -1,6 +1,6 @@
config INFINIBAND_SRP
tristate "InfiniBand SCSI RDMA Protocol"
- depends on INFINIBAND && SCSI
+ depends on INFINIBAND && BLOCK && SCSI
---help---
Support for the SCSI RDMA Protocol over InfiniBand. This
allows you to access storage devices that speak SRP over
diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index bf869ed..1e91f90 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -2,6 +2,8 @@ #
# Block device driver configuration
#

+if BLOCK
+
menu "Multi-device support (RAID and LVM)"

config MD
@@ -251,3 +253,4 @@ config DM_MULTIPATH_EMC

endmenu

+endif
diff --git a/drivers/message/i2o/Kconfig b/drivers/message/i2o/Kconfig
index fef6771..6443392 100644
--- a/drivers/message/i2o/Kconfig
+++ b/drivers/message/i2o/Kconfig
@@ -88,7 +88,7 @@ config I2O_BUS

config I2O_BLOCK
tristate "I2O Block OSM"
- depends on I2O
+ depends on I2O && BLOCK
---help---
Include support for the I2O Block OSM. The Block OSM presents disk
and other structured block devices to the operating system. If you
diff --git a/drivers/mmc/Kconfig b/drivers/mmc/Kconfig
index 45bcf09..f540bd8 100644
--- a/drivers/mmc/Kconfig
+++ b/drivers/mmc/Kconfig
@@ -21,7 +21,7 @@ config MMC_DEBUG

config MMC_BLOCK
tristate "MMC block device driver"
- depends on MMC
+ depends on MMC && BLOCK
default y
help
Say Y here to enable the MMC block device driver support.
diff --git a/drivers/mmc/Makefile b/drivers/mmc/Makefile
index d2957e3..b1f6e03 100644
--- a/drivers/mmc/Makefile
+++ b/drivers/mmc/Makefile
@@ -24,7 +24,8 @@ obj-$(CONFIG_MMC_AU1X) += au1xmmc.o
obj-$(CONFIG_MMC_OMAP) += omap.o
obj-$(CONFIG_MMC_AT91RM9200) += at91_mci.o

-mmc_core-y := mmc.o mmc_queue.o mmc_sysfs.o
+mmc_core-y := mmc.o mmc_sysfs.o
+mmc_core-$(CONFIG_BLOCK) += mmc_queue.o

ifeq ($(CONFIG_MMC_DEBUG),y)
EXTRA_CFLAGS += -DDEBUG
diff --git a/drivers/mtd/Kconfig b/drivers/mtd/Kconfig
index 1344ad7..188cd37 100644
--- a/drivers/mtd/Kconfig
+++ b/drivers/mtd/Kconfig
@@ -166,7 +166,7 @@ config MTD_CHAR

config MTD_BLOCK
tristate "Caching block device access to MTD devices"
- depends on MTD
+ depends on MTD && BLOCK
---help---
Although most flash chips have an erase size too large to be useful
as block devices, it is possible to use MTD devices which are based
@@ -188,7 +188,7 @@ config MTD_BLOCK

config MTD_BLOCK_RO
tristate "Readonly block device access to MTD devices"
- depends on MTD_BLOCK!=y && MTD
+ depends on MTD_BLOCK!=y && MTD && BLOCK
help
This allows you to mount read-only file systems (such as cramfs)
from an MTD device, without the overhead (and danger) of the caching
@@ -199,7 +199,7 @@ config MTD_BLOCK_RO

config FTL
tristate "FTL (Flash Translation Layer) support"
- depends on MTD
+ depends on MTD && BLOCK
---help---
This provides support for the original Flash Translation Layer which
is part of the PCMCIA specification. It uses a kind of pseudo-
@@ -215,7 +215,7 @@ config FTL

config NFTL
tristate "NFTL (NAND Flash Translation Layer) support"
- depends on MTD
+ depends on MTD && BLOCK
---help---
This provides support for the NAND Flash Translation Layer which is
used on M-Systems' DiskOnChip devices. It uses a kind of pseudo-
@@ -238,7 +238,7 @@ config NFTL_RW

config INFTL
tristate "INFTL (Inverse NAND Flash Translation Layer) support"
- depends on MTD
+ depends on MTD && BLOCK
---help---
This provides support for the Inverse NAND Flash Translation
Layer which is used on M-Systems' newer DiskOnChip devices. It
@@ -255,7 +255,7 @@ config INFTL

config RFD_FTL
tristate "Resident Flash Disk (Flash Translation Layer) support"
- depends on MTD
+ depends on MTD && BLOCK
---help---
This provides support for the flash translation layer known
as the Resident Flash Disk (RFD), as used by the Embedded BIOS
diff --git a/drivers/mtd/devices/Kconfig b/drivers/mtd/devices/Kconfig
index 16c02b5..440f685 100644
--- a/drivers/mtd/devices/Kconfig
+++ b/drivers/mtd/devices/Kconfig
@@ -136,7 +136,7 @@ config MTDRAM_ABS_POS

config MTD_BLOCK2MTD
tristate "MTD using block device"
- depends on MTD
+ depends on MTD && BLOCK
help
This driver allows a block device to appear as an MTD. It would
generally be used in the following cases:
diff --git a/drivers/s390/block/Kconfig b/drivers/s390/block/Kconfig
index 929d6ff..b250c53 100644
--- a/drivers/s390/block/Kconfig
+++ b/drivers/s390/block/Kconfig
@@ -1,4 +1,4 @@
-if S390
+if S390 && BLOCK

comment "S/390 block device drivers"
depends on S390
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 96a81cd..afcbe19 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -3,11 +3,13 @@ menu "SCSI device support"
config RAID_ATTRS
tristate "RAID Transport Class"
default n
+ depends on BLOCK
---help---
Provides RAID

config SCSI
tristate "SCSI device support"
+ depends on BLOCK
---help---
If you want to use a SCSI hard disk, SCSI tape drive, SCSI CD-ROM or
any other SCSI device under Linux, say Y and make sure that you know
@@ -43,7 +45,7 @@ comment "SCSI support type (disk, tape,

config BLK_DEV_SD
tristate "SCSI disk support"
- depends on SCSI
+ depends on SCSI && BLOCK
---help---
If you want to use SCSI hard disks, Fibre Channel disks,
USB storage or the SCSI or parallel port version of
@@ -98,7 +100,7 @@ config CHR_DEV_OSST

config BLK_DEV_SR
tristate "SCSI CDROM support"
- depends on SCSI
+ depends on SCSI && BLOCK
---help---
If you want to use a SCSI or FireWire CD-ROM under Linux,
say Y and read the SCSI-HOWTO and the CDROM-HOWTO at
@@ -473,7 +475,7 @@ source "drivers/scsi/megaraid/Kconfig.me

config SCSI_SATA
tristate "Serial ATA (SATA) support"
- depends on SCSI
+ depends on SCSI && BLOCK
help
This driver family supports Serial ATA host controllers
and devices.
diff --git a/drivers/usb/storage/Kconfig b/drivers/usb/storage/Kconfig
index be9eec2..578aa13 100644
--- a/drivers/usb/storage/Kconfig
+++ b/drivers/usb/storage/Kconfig
@@ -8,7 +8,7 @@ comment "may also be needed; see USB_STO

config USB_STORAGE
tristate "USB Mass Storage support"
- depends on USB
+ depends on USB && BLOCK
select SCSI
---help---
Say Y here if you want to connect USB mass storage devices to your
diff --git a/fs/Kconfig b/fs/Kconfig
index 3f00a9f..dc5e69b 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -4,6 +4,8 @@ #

menu "File systems"

+if BLOCK
+
config EXT2_FS
tristate "Second extended fs support"
help
@@ -383,8 +385,11 @@ config MINIX_FS
partition (the one containing the directory /) cannot be compiled as
a module.

+endif
+
config ROMFS_FS
tristate "ROM file system support"
+ depends on BLOCK
---help---
This is a very small read-only file system mainly intended for
initial ram disks of installation disks, but it could be used for
@@ -530,6 +535,7 @@ config FUSE_FS
If you want to develop a userspace FS, or if you want to use
a filesystem based on FUSE, answer Y or M.

+if BLOCK
menu "CD-ROM/DVD Filesystems"

config ISO9660_FS
@@ -597,7 +603,9 @@ config UDF_NLS
depends on (UDF_FS=m && NLS) || (UDF_FS=y && NLS=y)

endmenu
+endif

+if BLOCK
menu "DOS/FAT/NT Filesystems"

config FAT_FS
@@ -782,6 +790,7 @@ config NTFS_RW
It is perfectly safe to say N here.

endmenu
+endif

menu "Pseudo filesystems"

@@ -907,7 +916,7 @@ menu "Miscellaneous filesystems"

config ADFS_FS
tristate "ADFS file system support (EXPERIMENTAL)"
- depends on EXPERIMENTAL
+ depends on BLOCK && EXPERIMENTAL
help
The Acorn Disc Filing System is the standard file system of the
RiscOS operating system which runs on Acorn's ARM-based Risc PC
@@ -935,7 +944,7 @@ config ADFS_FS_RW

config AFFS_FS
tristate "Amiga FFS file system support (EXPERIMENTAL)"
- depends on EXPERIMENTAL
+ depends on BLOCK && EXPERIMENTAL
help
The Fast File System (FFS) is the common file system used on hard
disks by Amiga(tm) systems since AmigaOS Version 1.3 (34.20). Say Y
@@ -957,7 +966,7 @@ config AFFS_FS

config HFS_FS
tristate "Apple Macintosh file system support (EXPERIMENTAL)"
- depends on EXPERIMENTAL
+ depends on BLOCK && EXPERIMENTAL
select NLS
help
If you say Y here, you will be able to mount Macintosh-formatted
@@ -970,6 +979,7 @@ config HFS_FS

config HFSPLUS_FS
tristate "Apple Extended HFS file system support"
+ depends on BLOCK
select NLS
select NLS_UTF8
help
@@ -983,7 +993,7 @@ config HFSPLUS_FS

config BEFS_FS
tristate "BeOS file system (BeFS) support (read only) (EXPERIMENTAL)"
- depends on EXPERIMENTAL
+ depends on BLOCK && EXPERIMENTAL
select NLS
help
The BeOS File System (BeFS) is the native file system of Be, Inc's
@@ -1010,7 +1020,7 @@ config BEFS_DEBUG

config BFS_FS
tristate "BFS file system support (EXPERIMENTAL)"
- depends on EXPERIMENTAL
+ depends on BLOCK && EXPERIMENTAL
help
Boot File System (BFS) is a file system used under SCO UnixWare to
allow the bootloader access to the kernel image and other important
@@ -1032,7 +1042,7 @@ config BFS_FS

config EFS_FS
tristate "EFS file system support (read only) (EXPERIMENTAL)"
- depends on EXPERIMENTAL
+ depends on BLOCK && EXPERIMENTAL
help
EFS is an older file system used for non-ISO9660 CD-ROMs and hard
disk partitions by SGI's IRIX operating system (IRIX 6.0 and newer
@@ -1047,7 +1057,7 @@ config EFS_FS

config JFFS_FS
tristate "Journalling Flash File System (JFFS) support"
- depends on MTD
+ depends on MTD && BLOCK
help
JFFS is the Journaling Flash File System developed by Axis
Communications in Sweden, aimed at providing a crash/powerdown-safe
@@ -1232,6 +1242,7 @@ endchoice

config CRAMFS
tristate "Compressed ROM file system support (cramfs)"
+ depends on BLOCK
select ZLIB_INFLATE
help
Saying Y here includes support for CramFs (Compressed ROM File
@@ -1251,6 +1262,7 @@ config CRAMFS

config VXFS_FS
tristate "FreeVxFS file system support (VERITAS VxFS(TM) compatible)"
+ depends on BLOCK
help
FreeVxFS is a file system driver that support the VERITAS VxFS(TM)
file system format. VERITAS VxFS(TM) is the standard file system
@@ -1268,6 +1280,7 @@ config VXFS_FS

config HPFS_FS
tristate "OS/2 HPFS file system support"
+ depends on BLOCK
help
OS/2 is IBM's operating system for PC's, the same as Warp, and HPFS
is the file system used for organizing files on OS/2 hard disk
@@ -1284,6 +1297,7 @@ config HPFS_FS

config QNX4FS_FS
tristate "QNX4 file system support (read only)"
+ depends on BLOCK
help
This is the file system used by the real-time operating systems
QNX 4 and QNX 6 (the latter is also called QNX RTP).
@@ -1311,6 +1325,7 @@ config QNX4FS_RW

config SYSV_FS
tristate "System V/Xenix/V7/Coherent file system support"
+ depends on BLOCK
help
SCO, Xenix and Coherent are commercial Unix systems for Intel
machines, and Version 7 was used on the DEC PDP-11. Saying Y
@@ -1349,6 +1364,7 @@ config SYSV_FS

config UFS_FS
tristate "UFS file system support (read only)"
+ depends on BLOCK
help
BSD and derivate versions of Unix (such as SunOS, FreeBSD, NetBSD,
OpenBSD and NeXTstep) use a file system called UFS. Some System V
@@ -1923,11 +1939,13 @@ config 9P_FS

endmenu

+if BLOCK
menu "Partition Types"

source "fs/partitions/Kconfig"

endmenu
+endif

source "fs/nls/Kconfig"

diff --git a/fs/Makefile b/fs/Makefile
index 8913542..8071c64 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -5,12 +5,18 @@ # 14 Sep 2000, Christoph Hellwig <hch@in
# Rewritten to use lists instead of if-statements.
#

-obj-y := open.o read_write.o file_table.o buffer.o bio.o super.o \
- block_dev.o char_dev.o stat.o exec.o pipe.o namei.o fcntl.o \
+obj-y := open.o read_write.o file_table.o super.o \
+ char_dev.o stat.o exec.o pipe.o namei.o fcntl.o \
ioctl.o readdir.o select.o fifo.o locks.o dcache.o inode.o \
attr.o bad_inode.o file.o filesystems.o namespace.o aio.o \
- seq_file.o xattr.o libfs.o fs-writeback.o mpage.o direct-io.o \
- ioprio.o pnode.o drop_caches.o splice.o sync.o
+ seq_file.o xattr.o libfs.o fs-writeback.o \
+ pnode.o drop_caches.o splice.o sync.o
+
+ifeq ($(CONFIG_BLOCK),y)
+obj-y += buffer.o bio.o block_dev.o direct-io.o mpage.o ioprio.o
+else
+obj-y += no-block.o
+endif

obj-$(CONFIG_INOTIFY) += inotify.o
obj-$(CONFIG_INOTIFY_USER) += inotify_user.o
diff --git a/fs/compat_ioctl.c b/fs/compat_ioctl.c
index 7b8a9b4..af160e9 100644
--- a/fs/compat_ioctl.c
+++ b/fs/compat_ioctl.c
@@ -645,6 +645,7 @@ out:
}
#endif

+#ifdef CONFIG_BLOCK
struct hd_geometry32 {
unsigned char heads;
unsigned char sectors;
@@ -869,6 +870,7 @@ static int sg_grt_trans(unsigned int fd,
}
return err;
}
+#endif /* CONFIG_BLOCK */

struct sock_fprog32 {
unsigned short len;
@@ -992,6 +994,7 @@ static int ppp_ioctl_trans(unsigned int
}


+#ifdef CONFIG_BLOCK
struct mtget32 {
compat_long_t mt_type;
compat_long_t mt_resid;
@@ -1164,6 +1167,7 @@ static int cdrom_ioctl_trans(unsigned in

return err;
}
+#endif /* CONFIG_BLOCK */

extern int tty_ioctl(struct inode * inode, struct file * file, unsigned int cmd, unsigned long arg);

@@ -1493,6 +1497,7 @@ ret_einval(unsigned int fd, unsigned int
return -EINVAL;
}

+#ifdef CONFIG_BLOCK
static int broken_blkgetsize(unsigned int fd, unsigned int cmd, unsigned long arg)
{
/* The mkswap binary hard codes it to Intel value :-((( */
@@ -1527,12 +1532,14 @@ static int blkpg_ioctl_trans(unsigned in

return sys_ioctl(fd, cmd, (unsigned long)a);
}
+#endif

static int ioc_settimeout(unsigned int fd, unsigned int cmd, unsigned long arg)
{
return rw_long(fd, AUTOFS_IOC_SETTIMEOUT, arg);
}

+#ifdef CONFIG_BLOCK
/* Fix sizeof(sizeof()) breakage */
#define BLKBSZGET_32 _IOR(0x12,112,int)
#define BLKBSZSET_32 _IOW(0x12,113,int)
@@ -1553,6 +1560,7 @@ static int do_blkgetsize64(unsigned int
{
return sys_ioctl(fd, BLKGETSIZE64, (unsigned long)compat_ptr(arg));
}
+#endif

/* Bluetooth ioctls */
#define HCIUARTSETPROTO _IOW('U', 200, int)
@@ -1573,6 +1581,7 @@ #define HIDPCONNDEL _IOW('H', 201, int)
#define HIDPGETCONNLIST _IOR('H', 210, int)
#define HIDPGETCONNINFO _IOR('H', 211, int)

+#ifdef CONFIG_BLOCK
struct floppy_struct32 {
compat_uint_t size;
compat_uint_t sect;
@@ -1897,6 +1906,7 @@ out:
kfree(karg);
return err;
}
+#endif

struct mtd_oob_buf32 {
u_int32_t start;
@@ -1938,6 +1948,7 @@ static int mtd_rw_oob(unsigned int fd, u
return err;
}

+#ifdef CONFIG_BLOCK
struct raw32_config_request
{
compat_int_t raw_minor;
@@ -2002,6 +2013,7 @@ static int raw_ioctl(unsigned fd, unsign
}
return ret;
}
+#endif /* CONFIG_BLOCK */

struct serial_struct32 {
compat_int_t type;
@@ -2608,6 +2620,7 @@ HANDLE_IOCTL(SIOCBRDELIF, dev_ifsioc)
HANDLE_IOCTL(SIOCRTMSG, ret_einval)
HANDLE_IOCTL(SIOCGSTAMP, do_siocgstamp)
#endif
+#ifdef CONFIG_BLOCK
HANDLE_IOCTL(HDIO_GETGEO, hdio_getgeo)
HANDLE_IOCTL(BLKRAGET, w_long)
HANDLE_IOCTL(BLKGETSIZE, w_long)
@@ -2633,14 +2646,17 @@ HANDLE_IOCTL(FDGETFDCSTAT32, fd_ioctl_tr
HANDLE_IOCTL(FDWERRORGET32, fd_ioctl_trans)
HANDLE_IOCTL(SG_IO,sg_ioctl_trans)
HANDLE_IOCTL(SG_GET_REQUEST_TABLE, sg_grt_trans)
+#endif
HANDLE_IOCTL(PPPIOCGIDLE32, ppp_ioctl_trans)
HANDLE_IOCTL(PPPIOCSCOMPRESS32, ppp_ioctl_trans)
HANDLE_IOCTL(PPPIOCSPASS32, ppp_sock_fprog_ioctl_trans)
HANDLE_IOCTL(PPPIOCSACTIVE32, ppp_sock_fprog_ioctl_trans)
+#ifdef CONFIG_BLOCK
HANDLE_IOCTL(MTIOCGET32, mt_ioctl_trans)
HANDLE_IOCTL(MTIOCPOS32, mt_ioctl_trans)
HANDLE_IOCTL(CDROMREADAUDIO, cdrom_ioctl_trans)
HANDLE_IOCTL(CDROM_SEND_PACKET, cdrom_ioctl_trans)
+#endif
#define AUTOFS_IOC_SETTIMEOUT32 _IOWR(0x93,0x64,unsigned int)
HANDLE_IOCTL(AUTOFS_IOC_SETTIMEOUT32, ioc_settimeout)
#ifdef CONFIG_VT
@@ -2679,12 +2695,14 @@ HANDLE_IOCTL(SONET_SETFRAMING, do_atm_io
HANDLE_IOCTL(SONET_GETFRAMING, do_atm_ioctl)
HANDLE_IOCTL(SONET_GETFRSENSE, do_atm_ioctl)
/* block stuff */
+#ifdef CONFIG_BLOCK
HANDLE_IOCTL(BLKBSZGET_32, do_blkbszget)
HANDLE_IOCTL(BLKBSZSET_32, do_blkbszset)
HANDLE_IOCTL(BLKGETSIZE64_32, do_blkgetsize64)
/* Raw devices */
HANDLE_IOCTL(RAW_SETBIND, raw_ioctl)
HANDLE_IOCTL(RAW_GETBIND, raw_ioctl)
+#endif
/* Serial */
HANDLE_IOCTL(TIOCGSERIAL, serial_struct_ioctl)
HANDLE_IOCTL(TIOCSSERIAL, serial_struct_ioctl)
diff --git a/fs/no-block.c b/fs/no-block.c
new file mode 100644
index 0000000..d269a93
--- /dev/null
+++ b/fs/no-block.c
@@ -0,0 +1,22 @@
+/* no-block.c: implementation of routines required for non-BLOCK configuration
+ *
+ * Copyright (C) 2006 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells ([email protected])
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/fs.h>
+
+static int no_blkdev_open(struct inode * inode, struct file * filp)
+{
+ return -ENODEV;
+}
+
+const struct file_operations def_blk_fops = {
+ .open = no_blkdev_open,
+};
diff --git a/fs/partitions/Makefile b/fs/partitions/Makefile
index d713ce6..67e665f 100644
--- a/fs/partitions/Makefile
+++ b/fs/partitions/Makefile
@@ -2,7 +2,7 @@ #
# Makefile for the linux kernel.
#

-obj-y := check.o
+obj-$(CONFIG_BLOCK) := check.o

obj-$(CONFIG_ACORN_PARTITION) += acorn.o
obj-$(CONFIG_AMIGA_PARTITION) += amiga.o
diff --git a/fs/proc/proc_misc.c b/fs/proc/proc_misc.c
index 9f2cfc3..ed8646e 100644
--- a/fs/proc/proc_misc.c
+++ b/fs/proc/proc_misc.c
@@ -268,12 +268,15 @@ static int devinfo_show(struct seq_file
if (i == 0)
seq_printf(f, "Character devices:\n");
chrdev_show(f, i);
- } else {
+ }
+#ifdef CONFIG_BLOCK
+ else {
i -= CHRDEV_MAJOR_HASH_SIZE;
if (i == 0)
seq_printf(f, "\nBlock devices:\n");
blkdev_show(f, i);
}
+#endif
return 0;
}

@@ -346,6 +349,7 @@ static int stram_read_proc(char *page, c
}
#endif

+#ifdef CONFIG_BLOCK
extern struct seq_operations partitions_op;
static int partitions_open(struct inode *inode, struct file *file)
{
@@ -369,6 +373,7 @@ static struct file_operations proc_disks
.llseek = seq_lseek,
.release = seq_release,
};
+#endif

#ifdef CONFIG_MODULES
extern struct seq_operations modules_op;
@@ -686,7 +691,9 @@ #endif
entry->proc_fops = &proc_kmsg_operations;
create_seq_entry("devices", 0, &proc_devinfo_operations);
create_seq_entry("cpuinfo", 0, &proc_cpuinfo_operations);
+#ifdef CONFIG_BLOCK
create_seq_entry("partitions", 0, &proc_partitions_operations);
+#endif
create_seq_entry("stat", 0, &proc_stat_operations);
create_seq_entry("interrupts", 0, &proc_interrupts_operations);
#ifdef CONFIG_SLAB
@@ -698,7 +705,9 @@ #endif
create_seq_entry("buddyinfo",S_IRUGO, &fragmentation_file_operations);
create_seq_entry("vmstat",S_IRUGO, &proc_vmstat_file_operations);
create_seq_entry("zoneinfo",S_IRUGO, &proc_zoneinfo_file_operations);
+#ifdef CONFIG_BLOCK
create_seq_entry("diskstats", 0, &proc_diskstats_operations);
+#endif
#ifdef CONFIG_MODULES
create_seq_entry("modules", 0, &proc_modules_operations);
#endif
diff --git a/fs/quota.c b/fs/quota.c
index d6a2be8..b9dae76 100644
--- a/fs/quota.c
+++ b/fs/quota.c
@@ -338,6 +338,34 @@ static int do_quotactl(struct super_bloc
}

/*
+ * Look up the superblock on which quota operations will be performed.
+ * - Use the name of a block device to find the superblock mounted on it.
+ */
+static inline struct super_block *quotactl_block(const char __user *special)
+{
+#ifdef CONFIG_BLOCK
+ struct block_device *bdev;
+ struct super_block *sb;
+ char *tmp = getname(special);
+
+ if (IS_ERR(tmp))
+ return ERR_PTR(PTR_ERR(tmp));
+ bdev = lookup_bdev(tmp);
+ putname(tmp);
+ if (IS_ERR(bdev))
+ return ERR_PTR(PTR_ERR(bdev));
+ sb = get_super(bdev);
+ bdput(bdev);
+ if (!sb)
+ return ERR_PTR(-ENODEV);
+
+ return sb;
+#else
+ return ERR_PTR(-ENODEV);
+#endif
+}
+
+/*
* This is the system call interface. This communicates with
* the user-level programs. Currently this only supports diskquota
* calls. Maybe we need to add the process quotas etc. in the future,
@@ -347,25 +375,15 @@ asmlinkage long sys_quotactl(unsigned in
{
uint cmds, type;
struct super_block *sb = NULL;
- struct block_device *bdev;
- char *tmp;
int ret;

cmds = cmd >> SUBCMDSHIFT;
type = cmd & SUBCMDMASK;

if (cmds != Q_SYNC || special) {
- tmp = getname(special);
- if (IS_ERR(tmp))
- return PTR_ERR(tmp);
- bdev = lookup_bdev(tmp);
- putname(tmp);
- if (IS_ERR(bdev))
- return PTR_ERR(bdev);
- sb = get_super(bdev);
- bdput(bdev);
- if (!sb)
- return -ENODEV;
+ sb = quotactl_block(special);
+ if (IS_ERR(sb))
+ return PTR_ERR(sb);
}

ret = check_quotactl_valid(sb, type, cmds, id);
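
Note on the fs/quota.c change above: quotactl_block() returns either a valid
superblock pointer or an errno encoded in the pointer, so sys_quotactl() needs
only one IS_ERR() check whether or not the block layer is built in. A minimal
userspace sketch of that convention follows. The ERR_PTR/PTR_ERR/IS_ERR
definitions here are simplified stand-ins for the kernel helpers, and
DEMO_BLOCK, demo_sb, lookup_sb() and demo_quotactl() are hypothetical names
used only for illustration; they are not part of the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO values of the address space. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical placeholder for the kernel's struct super_block. */
struct super_block {
	int dummy;
};

static struct super_block demo_sb;

/*
 * Hypothetical lookup helper. Build with -DDEMO_BLOCK to mimic
 * CONFIG_BLOCK=y; without it, the lookup always fails with -ENODEV,
 * as the #else branch of quotactl_block() does.
 */
static struct super_block *lookup_sb(const char *special)
{
#ifdef DEMO_BLOCK
	if (!special)
		return ERR_PTR(-EFAULT);
	return &demo_sb;
#else
	(void)special;
	return ERR_PTR(-ENODEV);
#endif
}

/* The caller has a single error path for both configurations. */
static long demo_quotactl(const char *special)
{
	struct super_block *sb = lookup_sb(special);

	if (IS_ERR(sb))
		return PTR_ERR(sb);
	return 0;
}
```

The point of the design is that the #ifdef lives in one small helper rather
than in the body of the system call.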
diff --git a/fs/super.c b/fs/super.c
index 22c2fd1..33ce475 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -570,8 +570,10 @@ int do_remount_sb(struct super_block *sb
{
int retval;

+#ifdef CONFIG_BLOCK
if (!(flags & MS_RDONLY) && bdev_read_only(sb->s_bdev))
return -EACCES;
+#endif
if (flags & MS_RDONLY)
acct_auto_close(sb);
shrink_dcache_sb(sb);
@@ -691,6 +693,7 @@ void kill_litter_super(struct super_bloc

EXPORT_SYMBOL(kill_litter_super);

+#ifdef CONFIG_BLOCK
static int set_bdev_super(struct super_block *s, void *data)
{
s->s_bdev = data;
@@ -786,6 +789,7 @@ void kill_block_super(struct super_block
}

EXPORT_SYMBOL(kill_block_super);
+#endif

int get_sb_nodev(struct file_system_type *fs_type,
int flags, void *data,
diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
index 26b364c..35115bc 100644
--- a/fs/xfs/Kconfig
+++ b/fs/xfs/Kconfig
@@ -1,5 +1,6 @@
config XFS_FS
tristate "XFS filesystem support"
+ depends on BLOCK
help
XFS is a high performance journaling filesystem which originated
on the SGI IRIX platform. It is completely multi-threaded, can
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e3f30d5..fb85d63 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -16,6 +16,21 @@ #include <linux/stringify.h>

#include <asm/scatterlist.h>

+#ifdef CONFIG_LBD
+# include <asm/div64.h>
+# define sector_div(a, b) do_div(a, b)
+#else
+# define sector_div(n, b)( \
+{ \
+ int _res; \
+ _res = (n) % (b); \
+ (n) /= (b); \
+ _res; \
+} \
+)
+#endif
+
+#ifdef CONFIG_BLOCK
extern struct super_block *blockdev_superblock;

#define sb_is_blkdev_sb(sb) ((sb) == blockdev_superblock)
@@ -825,20 +840,6 @@ struct work_struct;
int kblockd_schedule_work(struct work_struct *work);
void kblockd_flush(void);

-#ifdef CONFIG_LBD
-# include <asm/div64.h>
-# define sector_div(a, b) do_div(a, b)
-#else
-# define sector_div(n, b)( \
-{ \
- int _res; \
- _res = (n) % (b); \
- (n) /= (b); \
- _res; \
-} \
-)
-#endif
-
#define MODULE_ALIAS_BLOCKDEV(major,minor) \
MODULE_ALIAS("block-major-" __stringify(major) "-" __stringify(minor))
#define MODULE_ALIAS_BLOCKDEV_MAJOR(major) \
@@ -846,4 +847,27 @@ #define MODULE_ALIAS_BLOCKDEV_MAJOR(majo

extern void bdev_cache_init(void);

+#else /* CONFIG_BLOCK */
+/*
+ * stubs for when the block layer is configured out
+ */
+#define buffer_heads_over_limit 0
+
+static inline long blk_congestion_wait(int rw, long timeout)
+{
+ return timeout;
+}
+
+static inline long nr_blockdev_pages(void)
+{
+ return 0;
+}
+
+static inline void bdev_cache_init(void) {}
+static inline void exit_io_context(void) {}
+
+#define sb_is_blkdev_sb(sb) 0
+
+#endif /* CONFIG_BLOCK */
+
#endif
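
Note on sector_div(), which the hunk above moves outside the CONFIG_BLOCK
guard so that it stays available when the block layer is disabled: the
non-LBD variant divides its first argument in place and evaluates to the
remainder. The sketch below reproduces that behaviour in userspace. It relies
on GCC statement expressions, as the original macro does; demo_sector_div(),
demo_quotient() and demo_remainder() are hypothetical names chosen for this
example.

```c
/*
 * Userspace copy of the non-LBD sector_div() logic: (n) is replaced by
 * the quotient and the macro evaluates to the remainder.
 * Requires GCC or Clang (statement expression extension).
 */
#define demo_sector_div(n, b) ( \
{ \
	int _res; \
	_res = (n) % (b); \
	(n) /= (b); \
	_res; \
} \
)

/* Returns the quotient left in n after the in-place division. */
static unsigned long demo_quotient(unsigned long n, unsigned int b)
{
	(void)demo_sector_div(n, b);
	return n;
}

/* Returns the value the macro itself evaluates to, i.e. the remainder. */
static unsigned int demo_remainder(unsigned long n, unsigned int b)
{
	return demo_sector_div(n, b);
}
```

For example, splitting sector 1000 with 63 sectors per track leaves a
quotient of 15 and a remainder of 55.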
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 64b508e..131ffd3 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -14,6 +14,8 @@ #include <linux/pagemap.h>
#include <linux/wait.h>
#include <asm/atomic.h>

+#ifdef CONFIG_BLOCK
+
enum bh_state_bits {
BH_Uptodate, /* Contains valid data */
BH_Dirty, /* Is dirty */
@@ -301,4 +303,18 @@ static inline void lock_buffer(struct bu
}

extern int __set_page_dirty_buffers(struct page *page);
+
+#else /* CONFIG_BLOCK */
+
+static inline void buffer_init(void) {}
+static inline int try_to_free_buffers(struct page *page) { return 1; }
+static inline int sync_blockdev(struct block_device *bdev) { return 0; }
+static inline int inode_has_buffers(struct inode *inode) { return 0; }
+static inline void invalidate_inode_buffers(struct inode *inode) {}
+static inline int remove_inode_buffers(struct inode *inode) { return 1; }
+static inline int sync_mapping_buffers(struct address_space *mapping) { return 0; }
+static inline void invalidate_bdev(struct block_device *bdev, int destroy_dirty_buffers) {}
+
+
+#endif /* CONFIG_BLOCK */
#endif /* _LINUX_BUFFER_HEAD_H */
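
Note on the stubs added to buffer_head.h above: when CONFIG_BLOCK is off, each
function is replaced by a static inline that returns a fixed "nothing to do"
value, so callers compile unchanged and need no #ifdef of their own. A minimal
sketch of the pattern, assuming a hypothetical DEMO_FEATURE switch and
hypothetical demo_* names (none of them appear in the patch):

```c
#include <stddef.h>

struct page;	/* opaque here; only pointers to it are passed around */

#ifdef DEMO_FEATURE
/* Real implementation lives elsewhere when the feature is built in. */
int demo_try_to_free_buffers(struct page *page);
#else
/*
 * Stub: with no buffers there is nothing to free, so report success,
 * mirroring the "return 1" stub for try_to_free_buffers() in the patch.
 */
static inline int demo_try_to_free_buffers(struct page *page)
{
	(void)page;
	return 1;
}
#endif

/* Caller is identical in both configurations. */
static int demo_release_page(struct page *page)
{
	return demo_try_to_free_buffers(page);
}
```

Because the stub is static inline, the compiler can fold the constant into
the caller, so the disabled configuration should cost nothing at run time.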
diff --git a/include/linux/compat_ioctl.h b/include/linux/compat_ioctl.h
index 13cea44..307f2db 100644
--- a/include/linux/compat_ioctl.h
+++ b/include/linux/compat_ioctl.h
@@ -90,6 +90,7 @@ COMPATIBLE_IOCTL(FDTWADDLE)
COMPATIBLE_IOCTL(FDFMTTRK)
COMPATIBLE_IOCTL(FDRAWCMD)
/* 0x12 */
+#ifdef CONFIG_BLOCK
COMPATIBLE_IOCTL(BLKRASET)
COMPATIBLE_IOCTL(BLKROSET)
COMPATIBLE_IOCTL(BLKROGET)
@@ -103,6 +104,7 @@ COMPATIBLE_IOCTL(BLKTRACESETUP)
COMPATIBLE_IOCTL(BLKTRACETEARDOWN)
ULONG_IOCTL(BLKRASET)
ULONG_IOCTL(BLKFRASET)
+#endif
/* RAID */
COMPATIBLE_IOCTL(RAID_VERSION)
COMPATIBLE_IOCTL(GET_ARRAY_INFO)
diff --git a/include/linux/elevator.h b/include/linux/elevator.h
index 1713ace..d2f4b0a 100644
--- a/include/linux/elevator.h
+++ b/include/linux/elevator.h
@@ -1,6 +1,8 @@
#ifndef _LINUX_ELEVATOR_H
#define _LINUX_ELEVATOR_H

+#ifdef CONFIG_BLOCK
+
typedef int (elevator_merge_fn) (request_queue_t *, struct request **,
struct bio *);

@@ -150,4 +152,5 @@ enum {

#define rq_end_sector(rq) ((rq)->sector + (rq)->nr_sectors)

+#endif /* CONFIG_BLOCK */
#endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7339d41..b72e3d0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1477,6 +1477,7 @@ #else
extern void putname(const char *name);
#endif

+#ifdef CONFIG_BLOCK
extern int register_blkdev(unsigned int, const char *);
extern int unregister_blkdev(unsigned int, const char *);
extern struct block_device *bdget(dev_t);
@@ -1485,11 +1486,15 @@ extern void bd_forget(struct inode *inod
extern void bdput(struct block_device *);
extern struct block_device *open_by_devnum(dev_t, unsigned);
extern struct block_device *open_partition_by_devnum(dev_t, unsigned);
-extern const struct file_operations def_blk_fops;
extern const struct address_space_operations def_blk_aops;
+#else
+static inline void bd_forget(struct inode *inode) {}
+#endif
+extern const struct file_operations def_blk_fops;
extern const struct file_operations def_chr_fops;
extern const struct file_operations bad_sock_fops;
extern const struct file_operations def_fifo_fops;
+#ifdef CONFIG_BLOCK
extern int ioctl_by_bdev(struct block_device *, unsigned, unsigned long);
extern int blkdev_ioctl(struct inode *, struct file *, unsigned, unsigned long);
extern long compat_blkdev_ioctl(struct file *, unsigned, unsigned long);
@@ -1505,6 +1510,7 @@ #else
#define bd_claim_by_disk(bdev, holder, disk) bd_claim(bdev, holder)
#define bd_release_from_disk(bdev, disk) bd_release(bdev)
#endif
+#endif

/* fs/char_dev.c */
#define CHRDEV_MAJOR_HASH_SIZE 255
@@ -1518,14 +1524,19 @@ extern int chrdev_open(struct inode *, s
extern void chrdev_show(struct seq_file *,off_t);

/* fs/block_dev.c */
-#define BLKDEV_MAJOR_HASH_SIZE 255
#define BDEVNAME_SIZE 32 /* Largest string for a blockdev identifier */
+
+#ifdef CONFIG_BLOCK
+#define BLKDEV_MAJOR_HASH_SIZE 255
extern const char *__bdevname(dev_t, char *buffer);
extern const char *bdevname(struct block_device *bdev, char *buffer);
extern struct block_device *lookup_bdev(const char *);
extern struct block_device *open_bdev_excl(const char *, int, void *);
extern void close_bdev_excl(struct block_device *);
extern void blkdev_show(struct seq_file *,off_t);
+#else
+#define BLKDEV_MAJOR_HASH_SIZE 0
+#endif

extern void init_special_inode(struct inode *, umode_t, dev_t);

@@ -1539,6 +1550,7 @@ extern const struct file_operations rdwr

extern int fs_may_remount_ro(struct super_block *);

+#ifdef CONFIG_BLOCK
/*
* return READ, READA, or WRITE
*/
@@ -1550,9 +1562,10 @@ #define bio_rw(bio) ((bio)->bi_rw & (RW
#define bio_data_dir(bio) ((bio)->bi_rw & 1)

extern int check_disk_change(struct block_device *);
-extern int invalidate_inodes(struct super_block *);
extern int __invalidate_device(struct block_device *);
extern int invalidate_partition(struct gendisk *, int);
+#endif
+extern int invalidate_inodes(struct super_block *);
unsigned long invalidate_mapping_pages(struct address_space *mapping,
pgoff_t start, pgoff_t end);
unsigned long invalidate_inode_pages(struct address_space *mapping);
@@ -1585,7 +1598,9 @@ extern void emergency_sync(void);
extern void emergency_remount(void);
extern int do_remount_sb(struct super_block *sb, int flags,
void *data, int force);
+#ifdef CONFIG_BLOCK
extern sector_t bmap(struct inode *, sector_t);
+#endif
extern int notify_change(struct dentry *, struct iattr *);
extern int permission(struct inode *, int, struct nameidata *);
extern int generic_permission(struct inode *, int,
@@ -1668,9 +1683,11 @@ static inline void insert_inode_hash(str
extern struct file * get_empty_filp(void);
extern void file_move(struct file *f, struct list_head *list);
extern void file_kill(struct file *f);
+#ifdef CONFIG_BLOCK
struct bio;
extern void submit_bio(int, struct bio *);
extern int bdev_read_only(struct block_device *);
+#endif
extern int set_blocksize(struct block_device *, int);
extern int sb_set_blocksize(struct super_block *, int);
extern int sb_min_blocksize(struct super_block *, int);
@@ -1751,6 +1768,7 @@ static inline void do_generic_file_read(
actor);
}

+#ifdef CONFIG_BLOCK
ssize_t __blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode,
struct block_device *bdev, const struct iovec *iov, loff_t offset,
unsigned long nr_segs, get_block_t get_block, dio_iodone_t end_io,
@@ -1788,6 +1806,7 @@ static inline ssize_t blockdev_direct_IO
return __blockdev_direct_IO(rw, iocb, inode, bdev, iov, offset,
nr_segs, get_block, end_io, DIO_OWN_LOCKING);
}
+#endif

extern const struct file_operations generic_ro_fops;

diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index e4af57e..41f276f 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -11,6 +11,8 @@ #define _LINUX_GENHD_H

#include <linux/types.h>

+#ifdef CONFIG_BLOCK
+
enum {
/* These three have identical behaviour; use the second one if DOS FDISK gets
confused about extended/logical partitions starting past cylinder 1023. */
@@ -420,3 +422,5 @@ static inline struct block_device *bdget
#endif

#endif
+
+#endif
diff --git a/include/linux/mpage.h b/include/linux/mpage.h
index 517c098..cc5fb75 100644
--- a/include/linux/mpage.h
+++ b/include/linux/mpage.h
@@ -9,6 +9,7 @@
* (And no, it doesn't do the #ifdef __MPAGE_H thing, and it doesn't do
* nested includes. Get it right in the .c file).
*/
+#ifdef CONFIG_BLOCK

struct writeback_control;
typedef int (writepage_t)(struct page *page, struct writeback_control *wbc);
@@ -20,3 +21,5 @@ int mpage_writepages(struct address_spac
struct writeback_control *wbc, get_block_t get_block);
int mpage_writepage(struct page *page, get_block_t *get_block,
struct writeback_control *wbc);
+
+#endif
diff --git a/include/linux/raid/md.h b/include/linux/raid/md.h
index eb3e547..c588709 100644
--- a/include/linux/raid/md.h
+++ b/include/linux/raid/md.h
@@ -53,6 +53,8 @@ #include <linux/raid/md_p.h>
#include <linux/raid/md_u.h>
#include <linux/raid/md_k.h>

+#ifdef CONFIG_MD
+
/*
* Different major versions are not compatible.
* Different minor versions are only downward compatible.
@@ -95,5 +97,6 @@ extern void md_new_event(mddev_t *mddev)

extern void md_update_sb(mddev_t * mddev);

+#endif /* CONFIG_MD */
#endif

diff --git a/include/linux/raid/md_k.h b/include/linux/raid/md_k.h
index d288902..920b94f 100644
--- a/include/linux/raid/md_k.h
+++ b/include/linux/raid/md_k.h
@@ -18,6 +18,8 @@ #define _MD_K_H
/* and dm-bio-list.h is not under include/linux because.... ??? */
#include "../../../drivers/md/dm-bio-list.h"

+#ifdef CONFIG_BLOCK
+
#define LEVEL_MULTIPATH (-4)
#define LEVEL_LINEAR (-1)
#define LEVEL_FAULTY (-5)
@@ -362,5 +364,6 @@ static inline void safe_put_page(struct
if (p) put_page(p);
}

+#endif /* CONFIG_BLOCK */
#endif

diff --git a/include/scsi/scsi_tcq.h b/include/scsi/scsi_tcq.h
index e47e36a..bc34746 100644
--- a/include/scsi/scsi_tcq.h
+++ b/include/scsi/scsi_tcq.h
@@ -5,7 +5,6 @@ #include <linux/blkdev.h>
#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_device.h>

-
#define MSG_SIMPLE_TAG 0x20
#define MSG_HEAD_TAG 0x21
#define MSG_ORDERED_TAG 0x22
@@ -13,6 +12,7 @@ #define MSG_ORDERED_TAG 0x22
#define SCSI_NO_TAG (-1) /* identify no tag in use */


+#ifdef CONFIG_BLOCK

/**
* scsi_get_tag_type - get the type of tag the device supports
@@ -131,4 +131,5 @@ static inline struct scsi_cmnd *scsi_fin
return sdev->current_cmnd;
}

+#endif /* CONFIG_BLOCK */
#endif /* _SCSI_SCSI_TCQ_H */
diff --git a/init/Kconfig b/init/Kconfig
index a099fc6..814bacc 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -92,7 +92,7 @@ config LOCALVERSION_AUTO

config SWAP
bool "Support for paging of anonymous memory (swap)"
- depends on MMU
+ depends on MMU && BLOCK
default y
help
This option allows you to choose whether you want to have support
diff --git a/init/do_mounts.c b/init/do_mounts.c
index 94aeec7..dbb2604 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -284,7 +284,11 @@ void __init mount_block_root(char *name,
{
char *fs_names = __getname();
char *p;
+#ifdef CONFIG_BLOCK
char b[BDEVNAME_SIZE];
+#else
+ const char *b = name;
+#endif

get_fs_names(fs_names);
retry:
@@ -303,7 +307,9 @@ retry:
* Allow the user to distinguish between failed sys_open
* and bad superblock on root device.
*/
+#ifdef CONFIG_BLOCK
__bdevname(ROOT_DEV, b);
+#endif
printk("VFS: Cannot open root device \"%s\" or %s\n",
root_device_name, b);
printk("Please append a correct \"root=\" boot option\n");
@@ -315,7 +321,10 @@ retry:
for (p = fs_names; *p; p += strlen(p)+1)
printk(" %s", p);
printk("\n");
- panic("VFS: Unable to mount root fs on %s", __bdevname(ROOT_DEV, b));
+#ifdef CONFIG_BLOCK
+ __bdevname(ROOT_DEV, b);
+#endif
+ panic("VFS: Unable to mount root fs on %s", b);
out:
putname(fs_names);
}
@@ -386,8 +395,10 @@ #ifdef CONFIG_BLK_DEV_FD
change_floppy("root floppy");
}
#endif
+#ifdef CONFIG_BLOCK
create_dev("/dev/root", ROOT_DEV);
mount_block_root("/dev/root", root_mountflags);
+#endif
}

/*
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 6991bec..7a3b2e7 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -134,3 +134,8 @@ cond_syscall(sys_madvise);
cond_syscall(sys_mremap);
cond_syscall(sys_remap_file_pages);
cond_syscall(compat_sys_move_pages);
+
+/* block-layer dependent */
+cond_syscall(sys_bdflush);
+cond_syscall(sys_ioprio_set);
+cond_syscall(sys_ioprio_get);
diff --git a/mm/Makefile b/mm/Makefile
index 84fff32..3af3154 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -12,7 +12,7 @@ obj-y := bootmem.o filemap.o mempool.o
readahead.o swap.o truncate.o vmscan.o \
prio_tree.o util.o mmzone.o vmstat.o $(mmu-y)

-obj-y += bounce.o
+obj-$(CONFIG_BLOCK) += bounce.o
obj-$(CONFIG_SWAP) += page_io.o swap_state.o swapfile.o thrash.o
obj-$(CONFIG_HUGETLBFS) += hugetlb.o
obj-$(CONFIG_NUMA) += mempolicy.o
diff --git a/mm/filemap.c b/mm/filemap.c
index a5ea7e0..88d9cd1 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2003,6 +2003,7 @@ inline int generic_write_checks(struct f
if (unlikely(*pos + *count > inode->i_sb->s_maxbytes))
*count = inode->i_sb->s_maxbytes - *pos;
} else {
+#ifdef CONFIG_BLOCK
loff_t isize;
if (bdev_read_only(I_BDEV(inode)))
return -EPERM;
@@ -2014,6 +2015,9 @@ inline int generic_write_checks(struct f

if (*pos + *count > isize)
*count = isize - *pos;
+#else
+ return -EPERM;
+#endif
}
return 0;
}
diff --git a/mm/migrate.c b/mm/migrate.c
index 0227163..bedc0ed 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -409,6 +409,7 @@ int migrate_page(struct address_space *m
}
EXPORT_SYMBOL(migrate_page);

+#ifdef CONFIG_BLOCK
/*
* Migration function for pages with buffers. This function can only be used
* if the underlying filesystem guarantees that no other references to "page"
@@ -466,6 +467,7 @@ int buffer_migrate_page(struct address_s
return 0;
}
EXPORT_SYMBOL(buffer_migrate_page);
+#endif

/*
* Writeback a page to clean the dirty state
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 668716c..b450520 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -800,9 +800,11 @@ int fastcall set_page_dirty(struct page

if (likely(mapping)) {
int (*spd)(struct page *) = mapping->a_ops->set_page_dirty;
- if (spd)
- return (*spd)(page);
- return __set_page_dirty_buffers(page);
+#ifdef CONFIG_BLOCK
+ if (!spd)
+ spd = __set_page_dirty_buffers;
+#endif
+ return (*spd)(page);
}
if (!PageDirty(page)) {
if (!TestSetPageDirty(page))
diff --git a/security/seclvl.c b/security/seclvl.c
index c26dd7d..fc00df2 100644
--- a/security/seclvl.c
+++ b/security/seclvl.c
@@ -377,6 +377,7 @@ static int seclvl_settime(struct timespe
/* claim the blockdev to exclude mounters, release on file close */
static int seclvl_bd_claim(struct inode *inode)
{
+#ifdef CONFIG_BLOCK
int holder;
struct block_device *bdev = NULL;
dev_t dev = inode->i_rdev;
@@ -389,12 +390,14 @@ static int seclvl_bd_claim(struct inode
/* claimed, mark it to release on close */
inode->i_security = current;
}
+#endif
return 0;
}

/* release the blockdev if you claimed it */
static void seclvl_bd_release(struct inode *inode)
{
+#ifdef CONFIG_BLOCK
if (inode && S_ISBLK(inode->i_mode) && inode->i_security == current) {
struct block_device *bdev = inode->i_bdev;
if (bdev) {
@@ -403,6 +406,7 @@ static void seclvl_bd_release(struct ino
inode->i_security = NULL;
}
}
+#endif
}

/**

2006-08-24 21:39:12

by David Howells

Subject: [PATCH 05/17] BLOCK: Don't call block_sync_page() from AFS [try #2]

From: David Howells <[email protected]>

The AFS filesystem specifies block_sync_page() as its sync_page address op,
which needs to be checked, and so is commented out for the moment.

Signed-Off-By: David Howells <[email protected]>
---

fs/afs/file.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/afs/file.c b/fs/afs/file.c
index 67d6634..e1ba855 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -37,7 +37,7 @@ struct inode_operations afs_file_inode_o

const struct address_space_operations afs_fs_aops = {
.readpage = afs_file_readpage,
- .sync_page = block_sync_page,
+// .sync_page = block_sync_page,
.set_page_dirty = __set_page_dirty_nobuffers,
.releasepage = afs_file_releasepage,
.invalidatepage = afs_file_invalidatepage,

2006-08-24 21:41:19

by Trond Myklebust

Subject: Re: [PATCH 15/17] BLOCK: Stop CIFS from using EXT2 ioctl numbers directly [try #2]

On Thu, 2006-08-24 at 22:33 +0100, David Howells wrote:
> From: David Howells <[email protected]>
>
> Stop CIFS from using EXT2 ioctl numbers directly, making it use the ones in
> linux/fs.h instead.
>
> Signed-Off-By: David Howells <[email protected]>
> ---
>
> 0 files changed, 0 insertions(+), 0 deletions(-)

Err... NACK?

Trond

2006-08-25 08:12:31

by David Howells

Subject: Re: [PATCH 15/17] BLOCK: Stop CIFS from using EXT2 ioctl numbers directly [try #2]

Trond Myklebust <[email protected]> wrote:

> Err... NACK?

I failed to notice that StGIT folded this into a previous patch.

David

2006-08-25 08:38:42

by David Howells

Subject: Re: [PATCH 11/17] BLOCK: Move common FS-specific ioctls to linux/fs.h [try #2]

Arnd Bergmann <[email protected]> wrote:

> case FS_IOC_GETFLAGS32:
> case FS_IOC_GETFLAGS64:

That'll get you a "duplicate case statement" warning on a 32-bit arch.

David

2006-08-25 08:44:55

by Arnd Bergmann

Subject: Re: [PATCH 11/17] BLOCK: Move common FS-specific ioctls to linux/fs.h [try #2]

On Friday 25 August 2006 10:38, David Howells wrote:
>
> Arnd Bergmann <[email protected]> wrote:
>
> > case FS_IOC_GETFLAGS32:
> > case FS_IOC_GETFLAGS64:
>
> That'll get you a "duplicate case statement" warning on a 32-bit arch.
>

No, I defined them with u32 and u64 arguments, respectively, so the
numbers are distinct on 32 bit.

Arnd <><
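
To make the point about distinct numbers concrete: the `_IOR()` macro folds the size of its argument type into the command number, so two commands that share a magic and sequence number still differ when one takes a 32-bit argument and the other a 64-bit one. The sketch below is a user-space illustration only; the `EX_IOC_*` names and the `'f', 1` pair are assumptions made for the example, not the definitions from the patch under discussion.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>	/* _IOR() */

/*
 * Hypothetical stand-ins for the FS_IOC_GETFLAGS32/64 names quoted in the
 * thread. The magic 'f' and number 1 are assumptions for illustration.
 */
#define EX_IOC_GETFLAGS32	_IOR('f', 1, uint32_t)
#define EX_IOC_GETFLAGS64	_IOR('f', 1, uint64_t)

/*
 * The argument size is encoded in the command number, so the two values
 * differ regardless of how wide "long" is on the build architecture.
 */
static_assert(EX_IOC_GETFLAGS32 != EX_IOC_GETFLAGS64,
	      "size field keeps the two commands distinct");

/* Classify a command the way a handler's switch statement would. */
static int ex_flags_width(unsigned long cmd)
{
	switch (cmd) {
	case EX_IOC_GETFLAGS32:
		return 32;
	case EX_IOC_GETFLAGS64:
		return 64;
	default:
		return -1;
	}
}
```

Both macros expand to integer constant expressions, so if the two values ever collided the compiler would reject the switch with a duplicate-case error rather than letting it through silently.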

2006-08-25 09:27:40

by David Howells

Subject: Re: [PATCH 10/17] BLOCK: Move the loop device ioctl compat stuff to the loop driver [try #2]

Arnd Bergmann <[email protected]> wrote:

> The structure should be called compat_loop_info by convention, when it
> is moved to the file itself.

Seems reasonable.

> I guess this should be implemented like loop_set_status_old(), by calling a
> new loop_info64_from_compat() function and then the regular
> loop_set_status().

Whilst that does seem reasonable, it seems like a lot of extra code for
something that I wouldn't have thought would be that performance critical.
Surely these two ioctls aren't called that often...

On the other hand, it does reduce stack space usage somewhat - which is a good
thing - and that can be reduced still further by moving the on-stack struct
compat_loop_info variable and the copy_to/from_user into loop_set_status_compat
and loop_get_status_compat.

Anyway, how about the attached patch (to go on top of the one you've already
got)?

David


diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 48ad173..82bc5bd 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1176,7 +1176,7 @@ static int lo_ioctl(struct inode * inode
}

#ifdef CONFIG_COMPAT
-struct loop_info32 {
+struct compat_loop_info {
compat_int_t lo_number; /* ioctl r/o */
compat_dev_t lo_device; /* ioctl r/o */
compat_ulong_t lo_inode; /* ioctl r/o */
@@ -1191,47 +1191,120 @@ struct loop_info32 {
char reserved[4];
};

+static noinline int
+loop_info64_from_compat(const struct compat_loop_info __user *arg,
+ struct loop_info64 *info64)
+{
+ struct compat_loop_info info;
+
+ if (copy_from_user(&info, arg, sizeof(info)))
+ return -EFAULT;
+
+ memset(info64, 0, sizeof(*info64));
+ info64->lo_number = info.lo_number;
+ info64->lo_device = info.lo_device;
+ info64->lo_inode = info.lo_inode;
+ info64->lo_rdevice = info.lo_rdevice;
+ info64->lo_offset = info.lo_offset;
+ info64->lo_sizelimit = 0;
+ info64->lo_encrypt_type = info.lo_encrypt_type;
+ info64->lo_encrypt_key_size = info.lo_encrypt_key_size;
+ info64->lo_flags = info.lo_flags;
+ info64->lo_init[0] = info.lo_init[0];
+ info64->lo_init[1] = info.lo_init[1];
+ if (info.lo_encrypt_type == LO_CRYPT_CRYPTOAPI)
+ memcpy(info64->lo_crypt_name, info.lo_name, LO_NAME_SIZE);
+ else
+ memcpy(info64->lo_file_name, info.lo_name, LO_NAME_SIZE);
+ memcpy(info64->lo_encrypt_key, info.lo_encrypt_key, LO_KEY_SIZE);
+ return 0;
+}
+
+static noinline int
+loop_info64_to_compat(const struct loop_info64 *info64,
+ struct compat_loop_info __user *arg)
+{
+ struct compat_loop_info info;
+
+ memset(&info, 0, sizeof(info));
+ info.lo_number = info64->lo_number;
+ info.lo_device = info64->lo_device;
+ info.lo_inode = info64->lo_inode;
+ info.lo_rdevice = info64->lo_rdevice;
+ info.lo_offset = info64->lo_offset;
+ info.lo_encrypt_type = info64->lo_encrypt_type;
+ info.lo_encrypt_key_size = info64->lo_encrypt_key_size;
+ info.lo_flags = info64->lo_flags;
+ info.lo_init[0] = info64->lo_init[0];
+ info.lo_init[1] = info64->lo_init[1];
+ if (info.lo_encrypt_type == LO_CRYPT_CRYPTOAPI)
+ memcpy(info.lo_name, info64->lo_crypt_name, LO_NAME_SIZE);
+ else
+ memcpy(info.lo_name, info64->lo_file_name, LO_NAME_SIZE);
+ memcpy(info.lo_encrypt_key, info64->lo_encrypt_key, LO_KEY_SIZE);
+
+ /* error in case values were truncated */
+ if (info.lo_device != info64->lo_device ||
+ info.lo_rdevice != info64->lo_rdevice ||
+ info.lo_inode != info64->lo_inode ||
+ info.lo_offset != info64->lo_offset ||
+ info.lo_init[0] != info64->lo_init[0] ||
+ info.lo_init[1] != info64->lo_init[1])
+ return -EOVERFLOW;
+
+ if (copy_to_user(arg, &info, sizeof(info)))
+ return -EFAULT;
+ return 0;
+}
+
+static int
+loop_set_status_compat(struct loop_device *lo,
+ const struct compat_loop_info __user *arg)
+{
+ struct loop_info64 info64;
+ int ret;
+
+ ret = loop_info64_from_compat(arg, &info64);
+ if (ret < 0)
+ return ret;
+ return loop_set_status(lo, &info64);
+}
+
+static int
+loop_get_status_compat(struct loop_device *lo,
+ struct compat_loop_info __user *arg)
+{
+ struct loop_info64 info64;
+ int err = 0;
+
+ if (!arg)
+ err = -EINVAL;
+ if (!err)
+ err = loop_get_status(lo, &info64);
+ if (!err)
+ err = loop_info64_to_compat(&info64, arg);
+ return err;
+}
+
static long lo_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
struct inode *inode = file->f_dentry->d_inode;
- mm_segment_t old_fs = get_fs();
- struct loop_info l;
- struct loop_info32 __user *ul;
- int err = -ENOIOCTLCMD;
-
- ul = compat_ptr(arg);
+ struct loop_device *lo = inode->i_bdev->bd_disk->private_data;
+ int err;

lock_kernel();
switch(cmd) {
case LOOP_SET_STATUS:
- err = get_user(l.lo_number, &ul->lo_number);
- err |= __get_user(l.lo_device, &ul->lo_device);
- err |= __get_user(l.lo_inode, &ul->lo_inode);
- err |= __get_user(l.lo_rdevice, &ul->lo_rdevice);
- err |= __copy_from_user(&l.lo_offset, &ul->lo_offset,
- 8 + (unsigned long)l.lo_init - (unsigned long)&l.lo_offset);
- if (err) {
- err = -EFAULT;
- } else {
- set_fs (KERNEL_DS);
- err = lo_ioctl(inode, file, cmd, (unsigned long)&l);
- set_fs (old_fs);
- }
+ mutex_lock(&lo->lo_ctl_mutex);
+ err = loop_set_status_compat(
+ lo, (const struct compat_loop_info __user *) arg);
+ mutex_unlock(&lo->lo_ctl_mutex);
break;
case LOOP_GET_STATUS:
- set_fs (KERNEL_DS);
- err = lo_ioctl(inode, file, cmd, (unsigned long)&l);
- set_fs (old_fs);
- if (!err) {
- err = put_user(l.lo_number, &ul->lo_number);
- err |= __put_user(l.lo_device, &ul->lo_device);
- err |= __put_user(l.lo_inode, &ul->lo_inode);
- err |= __put_user(l.lo_rdevice, &ul->lo_rdevice);
- err |= __copy_to_user(&ul->lo_offset, &l.lo_offset,
- (unsigned long)l.lo_init - (unsigned long)&l.lo_offset);
- if (err)
- err = -EFAULT;
- }
+ mutex_lock(&lo->lo_ctl_mutex);
+ err = loop_get_status_compat(
+ lo, (struct compat_loop_info __user *) arg);
+ mutex_unlock(&lo->lo_ctl_mutex);
break;
case LOOP_CLR_FD:
case LOOP_GET_STATUS64:
@@ -1241,6 +1314,9 @@ static long lo_compat_ioctl(struct file
case LOOP_CHANGE_FD:
err = lo_ioctl(inode, file, cmd, arg);
break;
+ default:
+ err = -ENOIOCTLCMD;
+ break;
}
unlock_kernel();
return err;

2006-08-25 11:02:08

by David Howells

Subject: Re: [PATCH 10/17] BLOCK: Move the loop device ioctl compat stuff to the loop driver [try #2]

Arnd Bergmann <[email protected]> wrote:

> My idea was to do the copy_from_user in loop_set_status_compat instead
> of loop_info64_from_compat, but your solution should be completely
> equivalent.

This way will have used less stack when it gets to the main part of the loop
driver.

David
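
The two conversion directions in the compat patch above follow a common pattern: widening a 32-bit layout into the native one cannot lose data, while narrowing has to detect truncation and fail rather than hand back silently wrong values. A minimal user-space sketch, assuming cut-down hypothetical structures (`ex_info32`, `ex_info64`) in place of `compat_loop_info` and `loop_info64`, and leaving out the user-copy step:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down stand-ins for compat_loop_info / loop_info64. */
struct ex_info32 {
	uint32_t offset;
	uint32_t inode;
};

struct ex_info64 {
	uint64_t offset;
	uint64_t inode;
};

static_assert(sizeof(struct ex_info32) < sizeof(struct ex_info64),
	      "the compat layout is the narrower one");

/* Widening never loses information, so it cannot fail. */
static void ex_info64_from_32(const struct ex_info32 *in,
			      struct ex_info64 *out)
{
	memset(out, 0, sizeof(*out));
	out->offset = in->offset;
	out->inode = in->inode;
}

/* Narrowing may truncate: copy first, then compare against the source. */
static int ex_info64_to_32(const struct ex_info64 *in,
			   struct ex_info32 *out)
{
	memset(out, 0, sizeof(*out));
	out->offset = (uint32_t)in->offset;
	out->inode = (uint32_t)in->inode;

	if (out->offset != in->offset || out->inode != in->inode)
		return -EOVERFLOW;
	return 0;
}
```

Keeping each direction in its own small function also means the narrow structure only occupies stack while the conversion runs, which is the stack-usage argument made in this subthread.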

2006-08-25 14:09:13

by Christoph Hellwig

Subject: Re: [PATCH 02/17] BLOCK: Remove duplicate declaration of exit_io_context() [try #2]

On Thu, Aug 24, 2006 at 10:32:56PM +0100, David Howells wrote:
> From: David Howells <[email protected]>
>
> Remove the duplicate declaration of exit_io_context() from linux/sched.h.
>
> Signed-Off-By: David Howells <[email protected]>

Ok.

2006-08-25 14:08:47

by Christoph Hellwig

Subject: Re: [PATCH 01/17] BLOCK: Move functions out of buffer code [try #2]

On Thu, Aug 24, 2006 at 10:32:53PM +0100, David Howells wrote:
> From: David Howells <[email protected]>
>
> Move some functions out of the buffering code that aren't strictly buffering
> specific. This is a precursor to being able to disable the block layer.
>
> (*) Moved some stuff out of fs/buffer.c:
>
> (*) The file sync and general sync stuff moved to fs/sync.c.
>
> (*) The superblock sync stuff moved to fs/super.c.
>
> (*) do_invalidatepage() moved to mm/truncate.c.
>
> (*) try_to_release_page() moved to mm/filemap.c.
>
> (*) Moved some related declarations between header files:
>
> (*) declarations for do_invalidatepage() and try_to_release_page() moved
> to linux/mm.h.
>
> (*) __set_page_dirty_buffers() moved to linux/buffer_head.h.
>
> Signed-Off-By: David Howells <[email protected]>

Please remove the CONFIG_BLOCK that slipped in and should only be in the
final patch. Otherwise ACK.

2006-08-25 14:10:18

by Christoph Hellwig

Subject: Re: [PATCH 03/17] BLOCK: Stop fallback_migrate_page() from using page_has_buffers() [try #2]

On Thu, Aug 24, 2006 at 10:32:58PM +0100, David Howells wrote:
> From: David Howells <[email protected]>
>
> Stop fallback_migrate_page() from using page_has_buffers() since that might not
> be available. Use PagePrivate() instead since that's more general.

We should document somewhere where to use which of those functions,
especially as they are currently 100% functionally identical.

Also if we ever get private data for anything but buffers these kinds of
checks in generic code will cause problems. Maybe we should just
kill the default fallback in this case?

2006-08-25 14:12:48

by Christoph Hellwig

Subject: Re: [PATCH 04/17] BLOCK: Separate the bounce buffering code from the highmem code [try #2]

> --- /dev/null
> +++ b/mm/bounce.c
> @@ -0,0 +1,302 @@
> +/* bounce.c: bounce buffer handling for block devices
> + *
> + * - Split from highmem.c
> + */

Please don't mention the filename in the top-of-file comment; it's redundant
and gets out of sync too easily on renames. Otherwise the patch looks good.

2006-08-25 14:13:50

by Christoph Hellwig

Subject: Re: [PATCH 06/17] BLOCK: Move bdev_cache_init() declaration to headerfile [try #2]

On Thu, Aug 24, 2006 at 10:33:06PM +0100, David Howells wrote:
> From: David Howells <[email protected]>
>
> Move the bdev_cache_init() extern declaration from fs/dcache.c to
> linux/blkdev.h.
>
> Signed-Off-By: David Howells <[email protected]>
> ---
>
> fs/dcache.c | 2 +-
> include/linux/blkdev.h | 1 +
> 2 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 1b4a3a3..886ca6f 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -32,6 +32,7 @@ #include <linux/security.h>
> #include <linux/seqlock.h>
> #include <linux/swap.h>
> #include <linux/bootmem.h>
> +#include <linux/blkdev.h>
>
>
> int sysctl_vfs_cache_pressure __read_mostly = 100;
> @@ -1742,7 +1743,6 @@ kmem_cache_t *filp_cachep __read_mostly;
>
> EXPORT_SYMBOL(d_genocide);
>
> -extern void bdev_cache_init(void);
> extern void chrdev_init(void);

Please move chrdev_init as well while you're at it. Otherwise ok.

2006-08-25 14:13:34

by Christoph Hellwig

Subject: Re: [PATCH 05/17] BLOCK: Don't call block_sync_page() from AFS [try #2]

> diff --git a/fs/afs/file.c b/fs/afs/file.c
> index 67d6634..e1ba855 100644
> --- a/fs/afs/file.c
> +++ b/fs/afs/file.c
> @@ -37,7 +37,7 @@ struct inode_operations afs_file_inode_o
>
> const struct address_space_operations afs_fs_aops = {
> .readpage = afs_file_readpage,
> - .sync_page = block_sync_page,
> +// .sync_page = block_sync_page,

Commenting things out using // isn't very nice. Either remove it completely
or use #if 0 with a normal /* */ comment that describes why it's not compiled.
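
As a sketch of the second option, the disabled initializer can sit inside an `#if 0` block together with a comment giving the reason. Everything prefixed `ex_` below is hypothetical and stands in for the real operations table; the wording of the comment is likewise only an example.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct address_space_operations. */
struct ex_aops {
	int (*readpage)(void *page);
	void (*sync_page)(void *page);
};

static int ex_readpage(void *page)
{
	(void)page;
	return 0;
}

static const struct ex_aops ex_fs_aops = {
	.readpage	= ex_readpage,
#if 0
	/*
	 * Not compiled: it still has to be checked whether this filesystem
	 * should use the block layer's sync_page implementation at all.
	 */
	.sync_page	= ex_block_sync_page,
#endif
};

/* Callers must cope with the hook being absent. Returns 1 if it ran. */
static int ex_sync(const struct ex_aops *ops, void *page)
{
	if (!ops->sync_page)
		return 0;
	ops->sync_page(page);
	return 1;
}
```

The member left out of the initializer is implicitly a null pointer, so the table behaves exactly as if the line had been deleted, while the reason for disabling it stays visible in the source.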

2006-08-25 14:14:33

by Christoph Hellwig

Subject: Re: [PATCH 07/17] BLOCK: Remove dependence on existence of blockdev_superblock [try #2]

On Thu, Aug 24, 2006 at 10:33:08PM +0100, David Howells wrote:
> From: David Howells <[email protected]>
>
> Move blockdev_superblock extern declaration from fs/fs-writeback.c to a
> headerfile and remove the dependence on it by wrapping it in a macro.

Ok.

2006-08-25 14:16:47

by Christoph Hellwig

Subject: Re: [PATCH 08/17] BLOCK: Dissociate generic_writepages() from mpage stuff [try #2]

On Thu, Aug 24, 2006 at 10:33:11PM +0100, David Howells wrote:
> From: David Howells <[email protected]>
>
> Dissociate the generic_writepages() function from the mpage stuff, moving its
> declaration to linux/mm.h and actually emitting a full implementation into
> mm/page-writeback.c.
>
> The implementation is a partial duplicate of mpage_writepages() with all BIO
> references removed.
>
> It is used by NFS to do writeback.

This duplication is rather unfortunate, but I don't see a way to disentangle
this any better, so ok.

> @@ -693,6 +693,8 @@ out:
> * the call was made get new I/O started against them. If wbc->sync_mode is
> * WB_SYNC_ALL then we were called for data integrity and we must wait for
> * existing IO to complete.
> + *
> + * !!!! If you fix this you should check generic_writepages() also!!!!

This isn't a very elegant comment style :) What about a little less shouting?

> int pdflush_operation(void (*fn)(unsigned long), unsigned long arg0);
> +extern int generic_writepages(struct address_space *mapping,
> + struct writeback_control *wbc);
> int do_writepages(struct address_space *mapping, struct writeback_control *wbc);

Please try to fit the style of the surrounding prototypes, that is, no
'extern'.

2006-08-25 14:17:14

by Christoph Hellwig

Subject: Re: [PATCH 09/17] BLOCK: Move __invalidate_device() to block_dev.c [try #2]

On Thu, Aug 24, 2006 at 10:33:13PM +0100, David Howells wrote:
> From: David Howells <[email protected]>
>
> Move __invalidate_device() from fs/inode.c to fs/block_dev.c so that it can
> more easily be disabled when the block layer is disabled.

Ok.

2006-08-25 14:27:56

by Christoph Hellwig

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

> +++ b/drivers/char/random.c
> @@ -655,6 +655,7 @@ void add_interrupt_randomness(int irq)
> add_timer_randomness(irq_timer_state[irq], 0x100 + irq);
> }
>
> +#ifdef CONFIG_BLOCK
> void add_disk_randomness(struct gendisk *disk)
> {
> if (!disk || !disk->random)
> @@ -667,6 +668,7 @@ void add_disk_randomness(struct gendisk
> }
>
> EXPORT_SYMBOL(add_disk_randomness);
> +#endif
>
> #define EXTRACT_SIZE 10
>
> @@ -918,6 +920,7 @@ void rand_initialize_irq(int irq)
> }
> }
>
> +#ifdef CONFIG_BLOCK
> void rand_initialize_disk(struct gendisk *disk)
> {
> struct timer_rand_state *state;
> @@ -932,6 +935,7 @@ void rand_initialize_disk(struct gendisk
> disk->random = state;
> }
> }
> +#endif

Can you put these two into a single ifdef block?

> index fead87d..f945953 100644
> --- a/drivers/infiniband/ulp/iser/Kconfig
> +++ b/drivers/infiniband/ulp/iser/Kconfig
> @@ -1,6 +1,6 @@
> config INFINIBAND_ISER
> tristate "ISCSI RDMA Protocol"
> - depends on INFINIBAND && SCSI
> + depends on INFINIBAND && BLOCK && SCSI

SCSI should (and does in your patch) depend on BLOCK, so you don't
need this additional dependency.

> - depends on INFINIBAND && SCSI
> + depends on INFINIBAND && BLOCK && SCSI

ditto.

> config BLK_DEV_SD
> tristate "SCSI disk support"
> - depends on SCSI
> + depends on SCSI && BLOCK

ditto.

> config BLK_DEV_SR
> tristate "SCSI CDROM support"
> - depends on SCSI
> + depends on SCSI && BLOCK

ditto.

> config SCSI_SATA
> tristate "Serial ATA (SATA) support"
> - depends on SCSI
> + depends on SCSI && BLOCK

ditto.

> config USB_STORAGE
> tristate "USB Mass Storage support"
> - depends on USB
> + depends on USB && BLOCK

ditto.

> index 3f00a9f..dc5e69b 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -4,6 +4,8 @@ #
>
> menu "File systems"
>
> +if BLOCK
> +
> config EXT2_FS
> tristate "Second extended fs support"
> help
> @@ -383,8 +385,11 @@ config MINIX_FS
> partition (the one containing the directory /) cannot be compiled as
> a module.
>
> +endif
> +
> config ROMFS_FS
> tristate "ROM file system support"
> + depends on BLOCK

Care to group all block-based filesystems together so that a single
if BLOCK will do it?

> +ifeq ($(CONFIG_BLOCK),y)
> +obj-y += buffer.o bio.o block_dev.o direct-io.o mpage.o ioprio.o
> +else
> +obj-y += no-block.o
> +endif
>
> obj-$(CONFIG_INOTIFY) += inotify.o
> obj-$(CONFIG_INOTIFY_USER) += inotify_user.o

> index 7b8a9b4..af160e9 100644
> --- a/fs/compat_ioctl.c
> +++ b/fs/compat_ioctl.c
> @@ -645,6 +645,7 @@ out:
> }
> #endif
>
> +#ifdef CONFIG_BLOCK
> struct hd_geometry32 {
> unsigned char heads;
> unsigned char sectors;
> @@ -869,6 +870,7 @@ static int sg_grt_trans(unsigned int fd,
> }
> return err;
> }
> +#endif /* CONFIG_BLOCK */

again, try to reorder things here to only require a single ifdef block
(or rather two, a second one for the array entries) if possible.

> --- /dev/null
> +++ b/fs/no-block.c
> @@ -0,0 +1,22 @@
> +/* no-block.c: implementation of routines required for non-BLOCK configuration
> + *
> + * Copyright (C) 2006 Red Hat, Inc. All Rights Reserved.
> + * Written by David Howells ([email protected])
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version
> + * 2 of the License, or (at your option) any later version.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/fs.h>
> +
> +static int no_blkdev_open(struct inode * inode, struct file * filp)
> +{
> + return -ENODEV;
> +}
> +
> +const struct file_operations def_blk_fops = {
> + .open = no_blkdev_open,
> +};

Can we put this into some other file under #ifndef CONFIG_BLOCK to
avoid the separate file and makefile ugliness?
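
One way to read this suggestion is that the stub lives next to the code that needs it and is selected by the preprocessor instead of by the Makefile. The following is a user-space sketch under that assumption; `EX_CONFIG_BLOCK` stands in for `CONFIG_BLOCK`, and the `ex_` types are hypothetical, cut-down replacements for the kernel structures.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct inode, struct file, file_operations. */
struct ex_inode;
struct ex_file;

struct ex_file_operations {
	int (*open)(struct ex_inode *inode, struct ex_file *filp);
};

#ifdef EX_CONFIG_BLOCK
/* With the block layer built in, the real table is defined elsewhere. */
extern const struct ex_file_operations ex_def_blk_fops;
#else
/* No block layer: opening a block special file always fails. */
static int ex_no_blkdev_open(struct ex_inode *inode, struct ex_file *filp)
{
	(void)inode;
	(void)filp;
	return -ENODEV;
}

const struct ex_file_operations ex_def_blk_fops = {
	.open = ex_no_blkdev_open,
};
#endif
```

The trade-off is the one argued over in this thread: a conditional in a .c file avoids an extra source file and a Makefile branch, at the cost of the kind of in-file `#ifdef` that is normally discouraged.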

> diff --git a/include/scsi/scsi_tcq.h b/include/scsi/scsi_tcq.h
> index e47e36a..bc34746 100644
> --- a/include/scsi/scsi_tcq.h
> +++ b/include/scsi/scsi_tcq.h
> @@ -5,7 +5,6 @@ #include <linux/blkdev.h>
> #include <scsi/scsi_cmnd.h>
> #include <scsi/scsi_device.h>
>
> -
> #define MSG_SIMPLE_TAG 0x20
> #define MSG_HEAD_TAG 0x21
> #define MSG_ORDERED_TAG 0x22
> @@ -13,6 +12,7 @@ #define MSG_ORDERED_TAG 0x22
> #define SCSI_NO_TAG (-1) /* identify no tag in use */
>
>
> +#ifdef CONFIG_BLOCK
>
> /**
> * scsi_get_tag_type - get the type of tag the device supports
> @@ -131,4 +131,5 @@ static inline struct scsi_cmnd *scsi_fin
> return sdev->current_cmnd;
> }
>
> +#endif /* CONFIG_BLOCK */
> #endif /* _SCSI_SCSI_TCQ_H */

No one should include this file unless block device support is enabled,
so I don't see the point for the ifdefs. Ditto for many other header
files you touch that don't contain any stubs for generic code.


And btw, shouldn't the option be CONFIG_BLK_DEV instead of CONFIG_BLOCK
to fit the various CONFIG_BLK_DEV_FOO options we have?

2006-08-25 14:53:08

by Alexey Dobriyan

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Fri, Aug 25, 2006 at 03:27:53PM +0100, Christoph Hellwig wrote:
> > --- a/fs/Kconfig
> > +++ b/fs/Kconfig
> > @@ -4,6 +4,8 @@ #
> >
> > menu "File systems"
> >
> > +if BLOCK
> > +
> > config EXT2_FS
> > tristate "Second extended fs support"
> > help
> > @@ -383,8 +385,11 @@ config MINIX_FS
> > partition (the one containing the directory /) cannot be compiled as
> > a module.
> >
> > +endif
> > +
> > config ROMFS_FS
> > tristate "ROM file system support"
> > + depends on BLOCK
>
> care to group all block-based filesystem in a group so that a single
> if BLOCK will do it?

Note that fs/Kconfig in -mm is mostly split into individual fs/*/Kconfig
files.

2006-08-25 16:11:25

by David Howells

Subject: Re: [PATCH 03/17] BLOCK: Stop fallback_migrate_page() from using page_has_buffers() [try #2]

Christoph Hellwig <[email protected]> wrote:

> Also if we ever get private data for anything but buffers these kinds of
> checks in generic code will cause problems.

Like NFS, for example.

David

2006-08-25 16:12:30

by David Howells

Subject: Re: [PATCH 05/17] BLOCK: Don't call block_sync_page() from AFS [try #2]

Christoph Hellwig <[email protected]> wrote:

> Commenting things out using // isn't very nice. Either remove it completely
> or use #if 0 with a normal /* */ comment that describes why it's not compiled.

It's something I'll look at when the new motherboard for my AFS server
arrives. Till then, however...

David

2006-08-25 16:23:12

by David Howells

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Christoph Hellwig <[email protected]> wrote:

> Can you put these two into a single ifdef block?

I suppose it could make sense to move the two disk random source functions
together.

> > config USB_STORAGE
> > tristate "USB Mass Storage support"
> > - depends on USB
> > + depends on USB && BLOCK
>
> ditto.

ditto?

> again, try to reorder things here to only require a single ifdef block
> (or rather two, a second one for the array entries) if possible.

The problem with reordering things is that it makes the patch bigger, and that
makes people complain about not minimalising the changes.

> Can we put this into some other file under #ifndef CONFIG_BLOCK to
> avoid the separate file and makefile ugliness?

*blink*

What've you done with the real Christoph Hellwig? You're actually *advocating*
the use of a cpp-conditional in a .c file!

It doesn't really belong in any of the files that are left.

> No one should include this file unless block device support is enabled,
> so I don't see the point for the ifdefs. Ditto for many other header
> files you touch that don't contain any stubs for generic code.

Someone did. Might've been USB storage now that I think about it.

> And btw, shouldn't the option be CONFIG_BLK_DEV instead of CONFIG_BLOCK
> to fit the various CONFIG_BLK_DEV_FOO options we have?

No.

I'm not enabling a specific block device driver. I'm taking out the entire
block layer, block drivers, block scheduler and everything that depends on it
(such as SCSI).

David

2006-08-25 18:46:29

by David Howells

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

David Howells <[email protected]> wrote:

> Christoph Hellwig <[email protected]> wrote:
>
> > Can you put these two into a single ifdef block?
>
> I suppose it could make sense to move the two disk random source functions
> together.

I don't think I should. drivers/char/random.c seems to be carefully laid out
with similar functions grouped together under grouping banners.

David

2006-08-29 11:51:57

by Christoph Hellwig

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Fri, Aug 25, 2006 at 05:23:05PM +0100, David Howells wrote:
> > > config USB_STORAGE
> > > tristate "USB Mass Storage support"
> > > - depends on USB
> > > + depends on USB && BLOCK
> >
> > ditto.
>
> ditto?

Same as above. USB_STORAGE already selects scsi so it shouldn't need
to depend on block.

2006-08-29 12:23:25

by David Howells

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Christoph Hellwig <[email protected]> wrote:

> Same as above. USB_STORAGE already selects scsi so it shouldn't need
> to depend on block.

Ah, you've got it the wrong way round.

Because USB_STORAGE _selects_ SCSI rather than depending on it, even if SCSI
is disabled, USB_STORAGE can be enabled, and that turns on CONFIG_SCSI, even
if not all of its dependencies are available.

Run "make allyesconfig" and then try to turn off CONFIG_SCSI without this...

David

2006-08-29 12:25:46

by Stefan Richter

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Christoph Hellwig wrote:
> On Fri, Aug 25, 2006 at 05:23:05PM +0100, David Howells wrote:
>> > > config USB_STORAGE
>> > > tristate "USB Mass Storage support"
>> > > - depends on USB
>> > > + depends on USB && BLOCK
>> >
>> > ditto.
>>
>> ditto?
>
> Same as above. USB_STORAGE already selects scsi so it shouldn't need
> to depend on block.

David,
same with config IEEE1394_SBP2.

(sbp2 and usb-storage use one or two block layer symbols directly for
the sole purpose of tuning the SCSI request queue. I.e. they depend on
BLOCK just because they are SCSI drivers.)
--
Stefan Richter
-=====-=-==- =--- ===-=
http://arcgraph.de/sr/

2006-08-29 12:25:06

by Christoph Hellwig

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Tue, Aug 29, 2006 at 01:23:18PM +0100, David Howells wrote:
> Christoph Hellwig <[email protected]> wrote:
>
> > Same as above. USB_STORAGE already selects scsi so it shouldn't need
> > to depend on block.
>
> Ah, you've got it the wrong way round.
>
> Because USB_STORAGE _selects_ SCSI rather than depending on it, even if SCSI
> is disabled, USB_STORAGE can be enabled, and that turns on CONFIG_SCSI, even
> if not all of its dependencies are available.
>
> Run "make allyesconfig" and then try to turn off CONFIG_SCSI without this...

Eeek. The easy fix is to change USB_STORAGE to depend on SCSI (*), but in
addition to that we should probably fix Kconfig as well to adhere to
such constraints.


(*) that selects is really wrong to start with, the other scsi drivers don't
select scsi either.

2006-08-29 13:54:42

by Stefan Richter

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Christoph Hellwig wrote:
> On Tue, Aug 29, 2006 at 01:23:18PM +0100, David Howells wrote:
>> Christoph Hellwig <[email protected]> wrote:
>>
>>> Same as above. USB_STORAGE already selects scsi so it shouldn't need
>>> to depend on block.
>> Ah, you've got it the wrong way round.
>>
>> Because USB_STORAGE _selects_ SCSI rather than depending on it, even if SCSI
>> is disabled, USB_STORAGE can be enabled, and that turns on CONFIG_SCSI, even
>> if not all of its dependencies are available.
>>
>> Run "make allyesconfig" and then try to turn off CONFIG_SCSI without this...
>
> Eeek. The easy fix is to change USB_STORAGE to depend on SCSI (*), but in
> addition to that we should probably fix Kconfig as well to adhere to
> such constraints.
>
>
> (*) that selects is really wrong to start with, the other scsi drivers don't
> select scsi either.

It is not wrong per se.

If SCSI is set to "N", then any menu items which depend on SCSI are not
visible anymore. This is not a problem with any of the items in the SCSI
configuration section.

But it is a problem for any items that live _in other configuration
sections_, such as USB_STORAGE (currently not affected because it
selects SCSI) and IEEE1394_SBP2 (does select SCSI now too in -mm).

If "select" cannot be fixed or is not en vogue for any other reason, the
configuration tools need to be improved otherwise, so that users are
guided to options like USB_STORAGE and IEEE1394_SBP2 when SCSI or other
"foreign" options were disabled.

The kernel configuration is currently presented as a tree, although the
dependencies of config options are not a tree. That's where "select" helps.
--
Stefan Richter
-=====-=-==- =--- ===-=
http://arcgraph.de/sr/

2006-08-29 14:17:10

by Stefan Richter

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

I wrote:
> If SCSI is set to "N", then any menu items which depend on SCSI are not
> visible anymore.
[...]
> If "select" cannot be fixed or is not en vogue for any other reason, the
> configuration tools need to be improved otherwise, so that users are
> guided to options like USB_STORAGE and IEEE1394_SBP2 when SCSI or other
> "foreign" options were disabled.

An easy but crude fix would be to add a corresponding hint to the help text
of the immediately superordinate config option. E.g. at IEEE1394: "Also
enable SCSI support to be able to switch on SBP-2 support (IEEE 1394
protocol e.g. for storage devices)." But this is extremely ugly /1./
because it would litter help texts of generic options with redundant
information about specific options and /2./ because it requires users to
find and read help texts in order to convince the configurator to make
options visible.
--
Stefan Richter
-=====-=-==- =--- ===-=
http://arcgraph.de/sr/

2006-08-29 19:59:49

by Greg KH

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Tue, Aug 29, 2006 at 01:25:01PM +0100, Christoph Hellwig wrote:
> On Tue, Aug 29, 2006 at 01:23:18PM +0100, David Howells wrote:
> > Christoph Hellwig <[email protected]> wrote:
> >
> > > Same as above. USB_STORAGE already selects scsi so it shouldn't need
> > > to depend on block.
> >
> > Ah, you've got it the wrong way round.
> >
> > Because USB_STORAGE _selects_ SCSI rather than depending on it, even if SCSI
> > is disabled, USB_STORAGE can be enabled, and that turns on CONFIG_SCSI, even
> > if not all of its dependencies are available.
> >
> > Run "make allyesconfig" and then try to turn off CONFIG_SCSI without this...
>
> Eeek. The easy fix is to change USB_STORAGE to depend on SCSI (*), but in
> addition to that we should probably fix Kconfig as well to adhere to
> such constraints.

No, the reason this was switched around like this (it used to be the
other way), was that people constantly complained about not being able
to select the usb-storage driver in their configurations.

Can't seem to please everyone these days :)

thanks,

greg k-h

2006-08-29 21:09:58

by John Stoffel

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

>>>>> "Greg" == Greg KH <[email protected]> writes:

Greg> On Tue, Aug 29, 2006 at 01:25:01PM +0100, Christoph Hellwig wrote:
>> On Tue, Aug 29, 2006 at 01:23:18PM +0100, David Howells wrote:
>> > Christoph Hellwig <[email protected]> wrote:
>> >
>> > > Same as above. USB_STORAGE already selects scsi so it shouldn't need
>> > > to depend on block.
>> >
>> > Ah, you've got it the wrong way round.
>> >
>> > Because USB_STORAGE _selects_ SCSI rather than depending on it, even if SCSI
>> > is disabled, USB_STORAGE can be enabled, and that turns on CONFIG_SCSI, even
>> > if not all of its dependencies are available.
>> >
>> > Run "make allyesconfig" and then try to turn off CONFIG_SCSI without this...
>>
>> Eeek. The easy fix is to change USB_STORAGE to depend on SCSI (*), but in
>> addition to that we should probably fix Kconfig as well to adhere to
>> such constraints.

Greg> No, the reason this was switched around like this (it used to be the
Greg> other way), was that people constantly complained about not being able
Greg> to select the usb-storage driver in their configurations.

Maybe the better solution is to remove SCSI as an option, and to just
offer SCSI drivers and USB-STORAGE and other SCSI core using drivers
instead. Then the SCSI core gets pulled in automatically. It's not
like people care about the SCSI core, just the drivers which depend on
it.

John

2006-08-30 01:11:46

by Roman Zippel

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Hi,

On Tue, 29 Aug 2006, Stefan Richter wrote:

> If "select" cannot be fixed or is not en vogue for any other reason, the
> configuration tools need to be improved otherwise, so that users are guided to
> options like USB_STORAGE and IEEE1394_SBP2 when SCSI or other "foreign"
> options were disabled.
>
> The kernel configuration is currently presented as a tree, although the
> dependencies of config options are not a tree. That's where "select" helps.

Actually dependencies are a tree and kconfig verifies that it's valid as
well, and that's where "select" can wreak havoc.
select really creates a reverse dependency, i.e. the value of SCSI depends
now on the USB_STORAGE value. This means now that all dependencies of the
selected symbol have to be selected as well (either by the selecting
symbol or by the selected symbol). With more complex dependencies this can
quickly get out of hand in order to maintain a valid and correct
dependency tree. That's why I'm not really happy about the current massive
use of select and I'd rather find solutions with normal dependencies,
which unfortunately isn't trivial, select OTOH was a simple hack.

bye, Roman
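
The two options named above (the selecting symbol or the selected symbol
pulls in the remaining dependencies) might look like this in Kconfig. This
is only a sketch: it assumes BLOCK is the one dependency of SCSI that
matters here, and the prompt strings are illustrative.

```kconfig
# Alternative A: the selecting symbol selects the whole chain itself.
config USB_STORAGE
	tristate "USB Mass Storage support"
	depends on USB
	select BLOCK
	select SCSI

# Alternative B: the selected symbol selects its own requirement
# instead of depending on it.
config SCSI
	tristate "SCSI device support"
	select BLOCK
```

In both cases each further dependency added to SCSI has to be carried
along by hand, which is the maintenance problem described above.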

2006-08-30 01:13:03

by Roman Zippel

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Hi,

On Tue, 29 Aug 2006, Stefan Richter wrote:

> An easy but crude fix would be to add a corresponding hint to the help text of
> the immediately superordinate config option. E.g. at IEEE1394: "Also enable
> SCSI support to be able to switch on SBP-2 support (IEEE 1394 protocol e.g.
> for storage devices)." But this is extremely ugly /1./ because it would litter
> help texts of generic options with redundant information about specific
> options and /2./ because it requires users to find and read help texts in
> order to convince the configurator to make options visible.

You can also add a simple comment which is only visible if !SCSI.

bye, Roman
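
A sketch of such a comment entry, placed next to the other IEEE 1394
options. The wording of the comment is an assumption; the SBP-2 entry
assumes a plain "depends on SCSI" rather than a select.

```kconfig
comment "SBP-2 support (for storage devices) requires SCSI"
	depends on IEEE1394 && SCSI=n

config IEEE1394_SBP2
	tristate "SBP-2 support (Harddisks etc.)"
	depends on IEEE1394 && SCSI && (PCI || BROKEN)
```

The comment line shows up in the menu only while SCSI is disabled, which is
exactly when the IEEE1394_SBP2 prompt is hidden.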

2006-08-30 11:33:36

by Stefan Richter

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Roman Zippel wrote:
> On Tue, 29 Aug 2006, Stefan Richter wrote:
[...]
>> The kernel configuration is currently presented as a tree, although the
>> dependencies of config options are not a tree. That's where "select" helps.
>
> Actually dependencies are a tree and kconfig verifies that it's valid as
> well, and that's where "select" can wreak havoc.

OK. You are right, they are both trees. But the menu tree is different
from the dependency tree. I can see two reasons: 1. We expect the menu's
layout to reflect function rather than implementation. 2. Menu tree and
dependency tree are directed trees, but only the menu tree has a root
(i.e. _one_ root).

> select really creates a reverse dependency, i.e. the value of SCSI depends
> now on the USB_STORAGE value.

It doesn't really revert the dependency. It changes the path that the
user takes to enable interdependent options. Thereby it changes _how_
the configurator ensures (or rather, _should_ ensure) that dependencies
are fulfilled.

> This means now that all dependencies of the
> selected symbol have to be selected as well (either by the selecting
> symbol or by the selected symbol). With more complex dependencies this can
> quickly get out of hand in order to maintain a valid and correct
> dependency tree. That's why I'm not really happy about the current massive
> use of select and I'd rather find solutions with normal dependencies,
> which unfortunately isn't trivial, select OTOH was a simple hack.

"select" would not be needed if the configurator wouldn't make an option
_invisible_ if it depends on another disabled option. It would be nice
if the option would stay visible (or better yet, would be optionally
visible) and had pointers to unfulfilled dependencies.

Or, more generally speaking, "select" would not be needed if there were
other means to switch the configurator's UI to a layout that exposes
more details about dependencies. There is already such a UI mode which
fully exposes _fulfilled_ dependencies.
--
Stefan Richter
-=====-=-==- =--- ====-
http://arcgraph.de/sr/

2006-08-30 18:37:41

by Stefan Richter

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Roman Zippel wrote:
> On Tue, 29 Aug 2006, Stefan Richter wrote:
>> An easy but crude fix would be to add a corresponding hint to the help text of
>> the immediately superordinate config option.
[...]
> You can also add a simple comment which is only visible if !SCSI.

Thanks, I will do so.
--
Stefan Richter
-=====-=-==- =--- ====-
http://arcgraph.de/sr/

2006-08-30 21:43:59

by Adrian Bunk

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Wed, Aug 30, 2006 at 08:33:36PM +0200, Stefan Richter wrote:
> Roman Zippel wrote:
> >On Tue, 29 Aug 2006, Stefan Richter wrote:
> >>An easy but crude fix would be to add a corresponding hint to the help text
> >>of
> >>the immediately superordinate config option.
> [...]
> >You can also add a simple comment which is only visible if !SCSI.
>
> Thanks, I will do so.

Please don't do this.

USB_STORAGE switched from depending on SCSI to select'ing SCSI three
years ago, and ATA in 2.6.19 will also select SCSI for a good reason:

When doing anything kconfig related, you must always remember that the
vast majority of kconfig users are not kernel hackers.

> Stefan Richter

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2006-08-30 22:44:45

by Roman Zippel

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Hi,

On Wed, 30 Aug 2006, Adrian Bunk wrote:

> USB_STORAGE switched from depending on SCSI to select'ing SCSI three
> years ago, and ATA in 2.6.19 will also select SCSI for a good reason:

It was already silly three years ago.

> When doing anything kconfig related, you must always remember that the
> vast majority of kconfig users are not kernel hackers.

What does that mean, that only kernel hackers can read?

bye, Roman

2006-08-30 22:54:30

by Stefan Richter

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Adrian Bunk wrote:
> USB_STORAGE switched from depending on SCSI to select'ing SCSI three
> years ago, and ATA in 2.6.19 will also select SCSI for a good reason:
>
> When doing anything kconfig related, you must always remember that the
> vast majority of kconfig users are not kernel hackers.

I agree with that.
But multi-level dependencies are a show-stopper at the moment.
--
Stefan Richter
-=====-=-==- =--- =====
http://arcgraph.de/sr/

2006-08-30 23:12:27

by Adrian Bunk

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Thu, Aug 31, 2006 at 12:50:03AM +0200, Stefan Richter wrote:
> Adrian Bunk wrote:
> >USB_STORAGE switched from depending on SCSI to select'ing SCSI three
> >years ago, and ATA in 2.6.19 will also select SCSI for a good reason:
> >
> >When doing anything kconfig related, you must always remember that the
> >vast majority of kconfig users are not kernel hackers.
>
> I agree with that.
> But multi-level dependencies are a show-stopper at the moment.

config IEEE1394_SBP2
tristate "SBP-2 support (Harddisks etc.)"
depends on IEEE1394 && BLOCK && (PCI || BROKEN)
select SCSI

should work fine.

> Stefan Richter

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2006-08-30 23:38:38

by Adrian Bunk

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Thu, Aug 31, 2006 at 12:41:02AM +0200, Roman Zippel wrote:

> Hi,

Hi Roman,

> On Wed, 30 Aug 2006, Adrian Bunk wrote:
>...
> > When doing anything kconfig related, you must always remember that the
> > vast majority of kconfig users are not kernel hackers.
>
> What does that mean, that only kernel hackers can read?

no. But sending users from one menu to another for first manually
selecting this or that option is less easy for the user than the usage
of select.

> bye, Roman

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2006-08-31 00:01:33

by Roman Zippel

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Hi,

On Thu, 31 Aug 2006, Adrian Bunk wrote:

> > > When doing anything kconfig related, you must always remember that the
> > > vast majority of kconfig users are not kernel hackers.
> >
> > What does that mean, that only kernel hackers can read?
>
> no. But sending users from one menu to another for first manually
> selecting this or that option is less easy for the user than the usage
> of select.

How often does he have to do that? Is it really worth it fucking with the
kconfig system?

bye, Roman

2006-08-31 03:01:37

by Matthew Wilcox

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Tue, Aug 29, 2006 at 05:08:46PM -0400, John Stoffel wrote:
> Maybe the better solution is to remove SCSI as an option, and to just
> offer SCSI drivers and USB-STORAGE and other SCSI core using drivers
> instead. Then the SCSI core gets pulled in automatically. It's not
> like people care about the SCSI core, just the drivers which depend on
> it.

People don't want to have to say "no" to umpteen scsi drivers. They
just want to say "no" to SCSI, because they know they don't have scsi.

2006-08-31 03:06:06

by Shaya Potter

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Wed, 2006-08-30 at 21:01 -0600, Matthew Wilcox wrote:
> On Tue, Aug 29, 2006 at 05:08:46PM -0400, John Stoffel wrote:
> > Maybe the better solution is to remove SCSI as an option, and to just
> > offer SCSI drivers and USB-STORAGE and other SCSI core using drivers
> > instead. Then the SCSI core gets pulled in automatically. It's not
> > like people care about the SCSI core, just the drivers which depend on
> > it.
>
> People don't want to have to say "no" to umpteen scsi drivers. They
> just want to say "no" to SCSI, because they know they don't have scsi.

so then that shows a problem with the kconfig syntax.

CONFIG_SCSI should perhaps be hidden, and what's visible to the user is
CONFIG_SCSI_DRIVER

USB-STORAGE would automatically pull in CONFIG_SCSI as would
CONFIG_SCSI_DRIVER.

or perhaps I'm just talking out of my ass.

2006-08-31 08:05:42

by Stefan Richter

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Roman Zippel wrote:
> On Thu, 31 Aug 2006, Adrian Bunk wrote:
[...]
>> sending users from one menu to another for first manually
>> selecting this or that option is less easy for the user than the usage
>> of select.
>
> How often does he have to do that? Is it really worth it fucking with the
> kconfig system?

Adrian, Roman,
both the comment hack and the 'select' hack introduce redundancy into
the Kconfig files and add maintenance cost. In addition, 'select'
currently brings a danger of inconsistent configuration. As I said
before, the proper solution would be enhancements to the "make
XYZconfig" UIs to comfortably present unfulfilled dependencies, based on
'depends on' alone. Alas my posting here is yet another one without a
patch included...
--
Stefan Richter
-=====-=-==- =--- =====
http://arcgraph.de/sr/

2006-08-31 08:57:31

by Stefan Richter

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Shaya Potter wrote:
> On Wed, 2006-08-30 at 21:01 -0600, Matthew Wilcox wrote:
>> On Tue, Aug 29, 2006 at 05:08:46PM -0400, John Stoffel wrote:
>> > Maybe the better solution is to remove SCSI as an option, and to just
>> > offer SCSI drivers and USB-STORAGE and other SCSI core using drivers
[...]
>> People don't want to have to say "no" to umpteen scsi drivers. They
>> just want to say "no" to SCSI, because they know they don't have scsi.
>
> so then that shows a problem with the kconfig syntax.
>
> CONFIG_SCSI should perhaps be hidden, and what's visible to the user is
> CONFIG_SCSI_DRIVER
[...]

But drivers like usb-storage and sbp2 are SCSI drivers too. What you
mean is CONFIG_SCSI_DRIVERS_WHICH_APPEAR_IN_THE_SCSI_MENU.

It all just revolves around the fact that the menu layout does not match
the dependency graph. We currently sacrifice clarity and integrity of
the Kconfigs in order to solve presentational issues.
--
Stefan Richter
-=====-=-==- =--- =====
http://arcgraph.de/sr/

2006-08-31 10:14:30

by David Howells

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

John Stoffel <[email protected]> wrote:

> Maybe the better solution is to remove SCSI as an option, and to just
> offer SCSI drivers and USB-STORAGE and other SCSI core using drivers
> instead. Then the SCSI core gets pulled in automatically. It's not
> like people care about the SCSI core, just the drivers which depend on
> it.

How do you modularise it then?

David

2006-08-31 12:33:22

by Shaya Potter

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Thu, 2006-08-31 at 10:53 +0200, Stefan Richter wrote:
> Shaya Potter wrote:
> > On Wed, 2006-08-30 at 21:01 -0600, Matthew Wilcox wrote:
> >> On Tue, Aug 29, 2006 at 05:08:46PM -0400, John Stoffel wrote:
> >> > Maybe the better solution is to remove SCSI as an option, and to just
> >> > offer SCSI drivers and USB-STORAGE and other SCSI core using drivers
> [...]
> >> People don't want to have to say "no" to umpteen scsi drivers. They
> >> just want to say "no" to SCSI, because they know they don't have scsi.
> >
> > so then that shows a problem with the kconfig syntax.
> >
> > CONFIG_SCSI should perhaps be hidden, and what's visible to the user is
> > CONFIG_SCSI_DRIVER
> [...]
>
> But drivers like usb-storage and sbp2 are SCSI drivers too. What you
> mean is CONFIG_SCSI_DRIVERS_WHICH_APPEAR_IN_THE_SCSI_MENU.

when I said "driver" I meant more along the line of SCSI hardware
instead of things that use the "Linux" scsi subsystem. i.e. usb, sata
are not scsi hardware even though they use the scsi subsystem.

Or to put it another way, perhaps no "select"able option should ever be
visibly selectable in XYZconfig. And XYZconfig should only show an
option that is "select"able if by selecting it one ends up with a
consistent configuration.

So you have a "virtual" SCSI_SUBSYSTEM which usb-storage, sbp2, sata all
pull in by selecting it.

you have SCSI_HARDWARE that adaptec, buslogic, lsilogic...... depend on.
SCSI_HARDWARE would also select "SCSI_SUBSYSTEM".
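
The scheme proposed above, as a Kconfig sketch. SCSI_SUBSYSTEM and
SCSI_HARDWARE are the names used in this message; SCSI_FOO_HBA and all
prompt strings are hypothetical.

```kconfig
# Hidden symbol: it has no prompt, so it can only be switched on
# by another symbol selecting it.
config SCSI_SUBSYSTEM
	tristate

# Visible switch for real SCSI host adapters.
config SCSI_HARDWARE
	tristate "SCSI hardware support"
	select SCSI_SUBSYSTEM

# Drivers outside the SCSI menu pull in the subsystem directly.
config USB_STORAGE
	tristate "USB Mass Storage support"
	depends on USB
	select SCSI_SUBSYSTEM

# Hypothetical host adapter driver inside the SCSI menu.
config SCSI_FOO_HBA
	tristate "Hypothetical SCSI host adapter driver"
	depends on SCSI_HARDWARE
```

Saying "no" to SCSI_HARDWARE hides every host adapter driver at once, while
usb-storage stays selectable. Any dependencies of SCSI_SUBSYSTEM itself
would still have to be handled by the symbols that select it.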

2006-08-31 13:20:33

by Stefan Richter

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Shaya Potter wrote:
> when I said "driver" I meant more along the line of SCSI hardware
> instead of things that use the "Linux" scsi subsystem.
[...]
> So you have a "virtual" SCSI_SUBSYSTEM which usb-storage, sbp2, sata all
> pull in by selecting it.
>
> you have SCSI_HARDWARE that adaptec, buslogic, lsilogic...... depend on.
> SCSI_HARDWARE would also select "SCSI_SUBSYSTEM".

One nit: SBP-2 is SCSI.
--
Stefan Richter
-=====-=-==- =--- =====
http://arcgraph.de/sr/

2006-08-31 13:27:31

by Shaya Potter

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Thu, 2006-08-31 at 15:16 +0200, Stefan Richter wrote:
> Shaya Potter wrote:
> > when I said "driver" I meant more along the line of SCSI hardware
> > instead of things that use the "Linux" scsi subsystem.
> [...]
> > So you have a "virtual" SCSI_SUBSYSTEM which usb-storage, sbp2, sata all
> > pull in by selecting it.
> >
> > you have SCSI_HARDWARE that adaptec, buslogic, lsilogic...... depend on.
> > SCSI_HARDWARE would also select "SCSI_SUBSYSTEM".
>
> One nit: SBP-2 is SCSI.

ok, but you should get the point.

basically anything "selectable" perhaps should be a purely "virtual" (as
in not shown in XYZconfig) option.

how one wants to name the options doesn't really bother me.

2006-09-01 00:15:46

by David Woodhouse

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Thu, 2006-08-31 at 00:41 +0200, Roman Zippel wrote:
> > USB_STORAGE switched from depending on SCSI to select'ing SCSI three
> > years ago, and ATA in 2.6.19 will also select SCSI for a good reason:
>
> It was already silly three years ago.

I agree.

> > When doing anything kconfig related, you must always remember that the
> > vast majority of kconfig users are not kernel hackers.
>
> What does that mean, that only kernel hackers can read?

No, it means that we're pandering to Aunt Tillie.

--
dwmw2

2006-09-01 00:45:28

by Randy Dunlap

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Thu, 31 Aug 2006 17:15:17 -0700 David Woodhouse wrote:

> On Thu, 2006-08-31 at 00:41 +0200, Roman Zippel wrote:
> > > USB_STORAGE switched from depending on SCSI to select'ing SCSI three
> > > years ago, and ATA in 2.6.19 will also select SCSI for a good reason:
> >
> > It was already silly three years ago.
>
> I agree.
>
> > > When doing anything kconfig related, you must always remember that the
> > > vast majority of kconfig users are not kernel hackers.
> >
> > What does that mean, that only kernel hackers can read?
>
> No, it means that we're pandering to Aunt Tillie.

But David, you edit .config anyway, so who is "make *config" for?
Not that I want to enable Tillie very much..

---
~Randy

2006-09-01 01:27:33

by David Woodhouse

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Thu, 2006-08-31 at 17:48 -0700, Randy.Dunlap wrote:
> But David, you edit .config anyway, so who is "make *config" for?
> Not that I want to enable Tillie very much..

I edit .config but still have to use 'make oldconfig' afterwards. And it
screws me over because of all this 'select' nonsense. This used to
work...
sed -i /^CONFIG_SCSI=/d .config
yes n | make oldconfig

So "make *config" certainly isn't optimised for me, although of course I
do have to use it. It seems to be increasingly optimised for Aunt
Tillie.

--
dwmw2

2006-09-01 01:47:09

by Adrian Bunk

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Thu, Aug 31, 2006 at 06:27:27PM -0700, David Woodhouse wrote:
> On Thu, 2006-08-31 at 17:48 -0700, Randy.Dunlap wrote:
> > But David, you edit .config anyway, so who is "make *config" for?
> > Not that I want to enable Tillie very much..
>
> I edit .config but still have to use 'make oldconfig' afterwards. And it
> screws me over because of all this 'select' nonsense. This used to
> work...
> sed -i /^CONFIG_SCSI=/d .config
> yes n | make oldconfig
>
> So "make *config" certainly isn't optimised for me, although of course I
> do have to use it. It seems to be increasingly optimised for Aunt
> Tillie.

The vast majority of kconfig users, who might have a master's in computer
science (like our Aunt Tillie has) but aren't kernel hackers, have
different needs from kernel hackers.

I know how hard it is to e.g. find a maximum .config with FW_LOADER=n.

Normal kconfig users and kernel hackers have different needs, and the
real solution fitting the needs of both groups would in this case be
a patch to kconfig that allows a kernel hacker to specify which option
to deselect and does the rest automatically.

> dwmw2

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2006-09-01 13:47:55

by Jörn Engel

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Thu, 31 August 2006 18:27:27 -0700, David Woodhouse wrote:
>
> I edit .config but still have to use 'make oldconfig' afterwards. And it
> screws me over because of all this 'select' nonsense. This used to
> work...
> sed -i /^CONFIG_SCSI=/d .config
> yes n | make oldconfig
>
> So "make *config" certainly isn't optimised for me, although of course I
> do have to use it. It seems to be increasingly optimised for Aunt
> Tillie.

Coming from you, the Aunt Tillie argument doesn't make more sense than
it did coming from ESR.

The actual problem existed before select just as it does afterwards.
People have to search extensively through Kconfig files to come up with
a useful .config. Before people had to magically know that
USB_STORAGE requires SCSI. "Magically" because USB_STORAGE didn't
show up in either make menuconfig, make xconfig or .config. Now
people have to magically know that SCSI=n requires USB_STORAGE=n.
You have the exact same problem and it has nothing to do with Aunt
Tillie.

What this shows is that select was a bad idea, as it merely shifted
the problem around instead of fixing it. It appears as if Stefan is
looking in the right direction for a decent fix and I'd like to see
patches from him. If only to stop these bad analogies ESR tried to
argue with. :)

Jörn

--
It is better to die of hunger having lived without grief and fear,
than to live with a troubled spirit amid abundance.
-- Epictetus

2006-09-01 15:35:58

by Stefan Richter

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Jörn Engel wrote:
...
> The actual problem existed before select just as it does afterwards.
> People have to search extensively through Kconfig files to come up with
> a useful .config. Before people had to magically know that
> USB_STORAGE requires SCSI. "Magically" because USB_STORAGE didn't
> show up in either make menuconfig, make xconfig or .config. Now
> people have to magically know that SCSI=n requires USB_STORAGE=n.

Yes and no. In the latter case, they have to magically know that at
least menuconfig and xconfig can be tricked to list depending options.

> What this shows is that select was a bad idea, as it merely shifted
> the problem around instead of fixing it. It appears as if Stefan is
> looking in the right direction for a decent fix and I'd like to see
> patches from him.
...

Could be a fun project right after Stefan R got rid of the kernel
freezes (months old) and data corruptions (years old) assigned to him at
bugzilla.kernel.org...
--
Stefan Richter
-=====-=-==- =--= ----=
http://arcgraph.de/sr/

2006-09-01 16:22:48

by Jörn Engel

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Fri, 1 September 2006 17:31:51 +0200, Stefan Richter wrote:
>
> Yes and no. In the latter case, they have to magically know that at
> least menuconfig and xconfig can be tricked to list depending options.

True. Marginally better than horrible then. :)

> Could be a fun project [...]

Absolutely. Assuming that select gets removed in the process, and
concentrating on oldconfig, would it be enough to have something like
this in the .config?

# CONFIG_USB_STORAGE has unmet dependencies: CONFIG_SCSI, CONFIG_BLOCK

Now people looking for usb mass storage can find the option without
grepping through Kconfig files, but also every single driver for every
single disabled subsystem shows up. Might be a bit too much.

Jörn

--
Invincibility is in oneself, vulnerability is in the opponent.
-- Sun Tzu

2006-09-01 16:34:12

by Adrian Bunk

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Fri, Sep 01, 2006 at 06:19:20PM +0200, Jörn Engel wrote:
> On Fri, 1 September 2006 17:31:51 +0200, Stefan Richter wrote:
> >
> > Yes and no. In the latter case, they have to magically know that at
> > least menuconfig and xconfig can be tricked to list depending options.
>
> True. Marginally better than horrible then. :)
>
> > Could be a fun project [...]
>
> Absolutely. Assuming that select gets removed in the process, and
> concentrating on oldconfig, would it be enough to have something like
> this in the .config?
>
> # CONFIG_USB_STORAGE has unmet dependencies: CONFIG_SCSI, CONFIG_BLOCK
>
> Now people looking for usb mass storage can find the option without
> grepping through Kconfig files, but also every single driver for every
> single disabled subsystem shows up. Might be a bit too much.

Common use case:
A driver was changed to use FW_LOADER.
The .config for the old kernel contains CONFIG_FW_LOADER=n.
The user runs "make oldconfig" with the old .config in the new kernel.

How do you plan to handle this reasonably without select?

> Jörn

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2006-09-01 17:56:21

by Stefan Richter

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

Adrian Bunk wrote:
> On Fri, Sep 01, 2006 at 06:19:20PM +0200, Jörn Engel wrote:
[...]
>> Assuming that select gets removed in the process, and
>> concentrating on oldconfig, would it be enough to have something like
>> this in the .config?
>>
>> # CONFIG_USB_STORAGE has unmet dependencies: CONFIG_SCSI, CONFIG_BLOCK
>>
>> Now people looking for usb mass storage can find the option without
>> grepping through Kconfig files, but also every single driver for every
>> single disabled subsystem shows up. Might be a bit too much.

This comment or similar things are apparently not necessary _within
subsystems_, just across subsystems, i.e. where the hierarchy of
subdirectories and files does not match the hierarchy of dependencies.

> Common use case:
> A driver was changed to use FW_LOADER.
> The .config for the old kernel contains CONFIG_FW_LOADER=n.
> The user runs "make oldconfig" with the old .config in the new kernel.
>
> How do you plan to handle this reasonably without select?

"make oldconfig" could ask questions when it sees a need to disable
formerly enabled options.
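
A rough sketch of that check, assuming a simplified model in which each option
has a flat list of dependencies. SOME_DRIVER is a hypothetical placeholder for
the driver that started using FW_LOADER, and the function name is invented for
illustration; this is not how scripts/kconfig works today:

```python
def options_needing_a_question(old_enabled, new_depends_on):
    """Find formerly enabled options whose dependencies in the new kernel are unmet.

    Returns a dict mapping each such option to the list of missing dependencies,
    i.e. the cases where oldconfig would have to ask instead of silently
    disabling the option.
    """
    questions = {}
    for option in sorted(old_enabled):
        missing = [dep for dep in new_depends_on.get(option, [])
                   if dep not in old_enabled]
        if missing:
            questions[option] = missing
    return questions


if __name__ == "__main__":
    # Old .config: the driver was enabled, FW_LOADER was not.
    old_enabled = {"SOME_DRIVER"}
    # New kernel: the driver now depends on FW_LOADER (hypothetical table).
    new_depends_on = {"SOME_DRIVER": ["FW_LOADER"]}
    for option, missing in options_needing_a_question(old_enabled, new_depends_on).items():
        print("%s needs %s; enable them or drop %s?" % (option, ", ".join(missing), option))
```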

In general I think:
As long as we talk about the various prefab UIs to manipulate .config
(i.e. "make {allyes, allmod, allno, def, g, menu, old, rand, silentold,
update-po-, x, ''}config"), there may be ways to implement modes of
operation which do what people expect from 'select' but with 'depends
on' alone. To ensure that no user group is discriminated against in the
process, committees could be formed. (<- attempt at irony)

It will get difficult to entirely please users who don't use these
interfaces to .config. But it seems these users are better off without
'select'.
--
Stefan Richter
-=====-=-==- =--= ----=
http://arcgraph.de/sr/

2006-09-01 18:09:50

by Sam Ravnborg

Subject: Re: [PATCH 17/17] BLOCK: Make it possible to disable the block layer [try #2]

On Fri, Sep 01, 2006 at 07:51:39PM +0200, Stefan Richter wrote:
>
> It will get difficult to entirely please users who don't use these
> interfaces to .config.

We have a number of more or less friendly interfaces to configure
the kernel. Yet some people prefer to directly modify the .config.
That is fine; let them do so.
But to get an overview of the sometimes complex logic, they have
to turn to more powerful tools such as menuconfig.

Editing .config is a second class citizen way of configuring the
kernel, and menuconfig is first class IMHO.

So enhancing the .config file to express the dependencies
is not the way forward. We should do this in menuconfig (and friends)
and let users use the dedicated interface to edit their kernel
configuration using the dedicated tools and not by editing .config.

Much of the discussion is centered on "select", which is indeed
ugly and in some cases ill-used.
But the prime issue is that using select makes it hard to
un-select certain configuration items. And avoiding select makes it
un-intuitive to enable some configuration items.
So we simply need to:
1) Make it easy to un-select selected configuration items by unselecting
the relevant items
2) Make it possible to select 'non-visible' options by providing a way
to satisfy the dependencies.

And maybe 2) makes select almost obsolete.
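
As an illustration of 2), here is a small sketch of how a frontend might work
out what has to be switched on before a 'non-visible' option can be enabled.
It assumes a simplified model where each symbol has a flat list of
dependencies; the table and the function name are hypothetical and do not
correspond to existing kconfig code:

```python
def symbols_to_enable(target, depends_on, enabled):
    """Return the disabled symbols needed for target, dependencies first.

    Walks the dependency graph depth-first so that every symbol appears
    after the symbols it depends on; the target itself comes last.
    """
    order = []
    seen = set(enabled)  # already-enabled symbols need no action

    def visit(symbol):
        if symbol in seen:
            return
        seen.add(symbol)  # mark before recursing so cycles terminate
        for dep in depends_on.get(symbol, []):
            visit(dep)
        order.append(symbol)

    visit(target)
    return order


if __name__ == "__main__":
    # Hypothetical, simplified dependency table.
    depends_on = {
        "USB_STORAGE": ["SCSI", "BLOCK"],
        "SCSI": ["BLOCK"],
        "BLOCK": [],
    }
    print(symbols_to_enable("USB_STORAGE", depends_on, enabled=set()))
```

A frontend could present that list to the user for confirmation, which covers
the common uses of select without enabling anything behind the user's back.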

Sam