Date: Tue, 4 Apr 2017 13:50:14 +0900
From: Minchan Kim <minchan@kernel.org>
To: Sergey Senozhatsky
Cc: Andrew Morton, Sergey Senozhatsky
Subject: Re: [PATCH 2/5] zram: partial IO refactoring
Message-ID: <20170404045014.GA32020@bbox>
References: <1491196653-7388-1-git-send-email-minchan@kernel.org>
 <1491196653-7388-3-git-send-email-minchan@kernel.org>
 <20170404021706.GA475@jagdpanzerIV.localdomain>
In-Reply-To: <20170404021706.GA475@jagdpanzerIV.localdomain>

Hi Sergey,

On Tue, Apr 04, 2017 at 11:17:06AM +0900, Sergey Senozhatsky wrote:
> Hello,
>
> On (04/03/17 14:17), Minchan Kim wrote:
> > +static bool zram_special_page_read(struct zram *zram, u32 index,
> > +				struct page *page,
> > +				unsigned int offset, unsigned int len)
> > +{
> > +	struct zram_meta *meta = zram->meta;
> > +
> > +	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> > +	if (unlikely(!meta->table[index].handle) ||
> > +			zram_test_flag(meta, index, ZRAM_SAME)) {
> > +		void *mem;
> > +
> > +		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> > +		mem = kmap_atomic(page);
> > +		zram_fill_page(mem + offset, len, meta->table[index].element);
> > +		kunmap_atomic(mem);
> > +		return true;
> > +	}
> > +	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> > +
> > +	return false;
> > +}
> > +
> > +static bool zram_special_page_write(struct zram *zram, u32 index,
> > +					struct page *page)
> > +{
> > +	unsigned long element;
> > +	void *mem = kmap_atomic(page);
> > +
> > +	if (page_same_filled(mem, &element)) {
> > +		struct zram_meta *meta = zram->meta;
> > +
> > +		kunmap_atomic(mem);
> > +		/* Free memory associated with this sector now. */
> > +		bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> > +		zram_free_page(zram, index);
> > +		zram_set_flag(meta, index, ZRAM_SAME);
> > +		zram_set_element(meta, index, element);
> > +		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> > +
> > +		atomic64_inc(&zram->stats.same_pages);
> > +		return true;
> > +	}
> > +	kunmap_atomic(mem);
> > +
> > +	return false;
> > +}
>
> zram_special_page_read() and zram_special_page_write() have a slightly
> different locking semantics.
>
> zram_special_page_read() copy-out ZRAM_SAME page having slot unlocked
> (can the slot got overwritten in the meantime?), while

IMHO, yes, it can be overwritten, but that does not corrupt the kernel.
If such a race happens, it is the user's fault; it is the caller who
should serialize those accesses. zram is a dumb block device: it just
reads and writes blocks as the user asks, and the one promise it must
keep is that it never corrupts the kernel. From that point of view, I
think it is not a problem for zram_special_page_read() to return stale
data.
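To illustrate (a rough timeline I sketched, not code from the patch):
the reader may fill the page from either the old or the new element,
because it re-reads the element after dropping the slot lock, but
either way the result is a well-formed page:

	CPU0 (read)                     CPU1 (overwrite)
	bit_spin_lock(ZRAM_ACCESS)
	sees !handle or ZRAM_SAME
	bit_spin_unlock(ZRAM_ACCESS)
	                                bit_spin_lock(ZRAM_ACCESS)
	                                zram_free_page()
	                                zram_set_element(meta, index, new)
	                                bit_spin_unlock(ZRAM_ACCESS)
	zram_fill_page(mem + offset, len,
	               meta->table[index].element) <- old or new value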
> zram_special_page_write() keeps the slot locked through out the entire
> operation.

zram_special_page_write() is a different story because it updates the
zram table slot via zram_set_flag()/zram_set_element(), so that update
must be protected by zram itself.

> >  static void zram_meta_free(struct zram_meta *meta, u64 disksize)
> >  {
> >  	size_t num_pages = disksize >> PAGE_SHIFT;
> > @@ -504,169 +548,104 @@ static void zram_free_page(struct zram *zram, size_t index)
> >  	zram_set_obj_size(meta, index, 0);
> >  }
> >
> > -static int zram_decompress_page(struct zram *zram, char *mem, u32 index)
> > +static int zram_decompress_page(struct zram *zram, struct page *page, u32 index)
> >  {
> > -	int ret = 0;
> > -	unsigned char *cmem;
> > -	struct zram_meta *meta = zram->meta;
> > +	int ret;
> >  	unsigned long handle;
> >  	unsigned int size;
> > +	void *src, *dst;
> > +	struct zram_meta *meta = zram->meta;
> > +
> > +	if (zram_special_page_read(zram, index, page, 0, PAGE_SIZE))
> > +		return 0;
> >
> >  	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> >  	handle = meta->table[index].handle;
> >  	size = zram_get_obj_size(meta, index);
> >
> > -	if (!handle || zram_test_flag(meta, index, ZRAM_SAME)) {
> > -		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> > -		zram_fill_page(mem, PAGE_SIZE, meta->table[index].element);
> > -		return 0;
> > -	}
> > -
> > -	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
> > +	src = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
> >  	if (size == PAGE_SIZE) {
> > -		copy_page(mem, cmem);
> > +		dst = kmap_atomic(page);
> > +		copy_page(dst, src);
> > +		kunmap_atomic(dst);
> > +		ret = 0;
> >  	} else {
> >  		struct zcomp_strm *zstrm = zcomp_stream_get(zram->comp);
> >
> > -		ret = zcomp_decompress(zstrm, cmem, size, mem);
> > +		dst = kmap_atomic(page);
> > +		ret = zcomp_decompress(zstrm, src, size, dst);
> > +		kunmap_atomic(dst);
> >  		zcomp_stream_put(zram->comp);
> >  	}
> >  	zs_unmap_object(meta->mem_pool, handle);
> >  	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> >
> >  	/* Should NEVER happen. Return bio error if it does. */
> > -	if (unlikely(ret)) {
> > +	if (unlikely(ret))
> >  		pr_err("Decompression failed! err=%d, page=%u\n", ret, index);
> > -		return ret;
> > -	}
> >
> > -	return 0;
> > +	return ret;
> >  }
> >
> >  static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
> > -			  u32 index, int offset)
> > +			u32 index, int offset)
> >  {
> >  	int ret;
> >  	struct page *page;
> > -	unsigned char *user_mem, *uncmem = NULL;
> > -	struct zram_meta *meta = zram->meta;
> > -	page = bvec->bv_page;
> >
> > -	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> > -	if (unlikely(!meta->table[index].handle) ||
> > -			zram_test_flag(meta, index, ZRAM_SAME)) {
> > -		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> > -		handle_same_page(bvec, meta->table[index].element);
> > +	page = bvec->bv_page;
> > +	if (zram_special_page_read(zram, index, page, bvec->bv_offset,
> > +				bvec->bv_len))
>
> so, I think zram_bvec_read() path calls zram_special_page_read() twice:
>
> a) direct zram_special_page_read() call
>
> b) zram_decompress_page()->zram_special_page_read()
>
> is it supposed to be so?

Yes, because zram_decompress_page() is also called by zram_bvec_write()
in the partial IO case, so it needs a zram_special_page_read() check of
its own.
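For example, a partial write is a read-modify-write of the whole page,
roughly like the sketch below (not the patch itself; the
compress-and-store helper __zram_bvec_write() here is a hypothetical
placeholder):

static int zram_bvec_write_partial(struct zram *zram, struct bio_vec *bvec,
				u32 index, int offset)
{
	int ret;
	void *src, *dst;
	struct page *page = alloc_page(GFP_NOIO);

	if (!page)
		return -ENOMEM;

	/* Read the old contents of the whole page first ... */
	ret = zram_decompress_page(zram, page, index);
	if (ret)
		goto out;

	/* ... merge the caller's sub-page data into it ... */
	src = kmap_atomic(bvec->bv_page);
	dst = kmap_atomic(page);
	memcpy(dst + offset, src + bvec->bv_offset, bvec->bv_len);
	kunmap_atomic(dst);
	kunmap_atomic(src);

	/* ... then compress and store the merged full page. */
	ret = __zram_bvec_write(zram, page, index);
out:
	__free_page(page);
	return ret;
}

So if the sector is a special (unallocated or same-filled) one, the
zram_special_page_read() inside zram_decompress_page() is what fills
the bounce page on this path.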
Maybe we can make it simpler by removing the direct
zram_special_page_read() call from zram_bvec_read(). I will take a
look.
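Something like the below, perhaps (a completely untested sketch,
reusing is_partial_io() and a bounce page for the partial case):

static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
			u32 index, int offset)
{
	int ret;
	struct page *page = bvec->bv_page;

	if (is_partial_io(bvec)) {
		/* Use a bounce page so the decompress path always
		 * works on a whole page. */
		page = alloc_page(GFP_NOIO);
		if (!page)
			return -ENOMEM;
	}

	ret = zram_decompress_page(zram, page, index);
	if (ret)
		goto out;

	if (is_partial_io(bvec)) {
		void *dst = kmap_atomic(bvec->bv_page);
		void *src = kmap_atomic(page);

		memcpy(dst + bvec->bv_offset, src + offset, bvec->bv_len);
		kunmap_atomic(src);
		kunmap_atomic(dst);
	}
out:
	if (is_partial_io(bvec))
		__free_page(page);
	return ret;
}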