Skip to content

Commit d923243

Browse files
snitmmhdzumair
authored andcommitted
dm thin: handle running out of data space vs concurrent discard
commit a685557fbbc3122ed11e8ad3fa63a11ebc5de8c3 upstream. Discards issued to a DM thin device can complete to userspace (via fstrim) _before_ the metadata changes associated with the discards is reflected in the thinp superblock (e.g. free blocks). As such, if a user constructs a test that loops repeatedly over these steps, block allocation can fail due to discards not having completed yet: 1) fill thin device via filesystem file 2) remove file 3) fstrim From initial report, here: https://www.redhat.com/archives/dm-devel/2018-April/msg00022.html "The root cause of this issue is that dm-thin will first remove mapping and increase corresponding blocks' reference count to prevent them from being reused before DISCARD bios get processed by the underlying layers. However. increasing blocks' reference count could also increase the nr_allocated_this_transaction in struct sm_disk which makes smd->old_ll.nr_allocated + smd->nr_allocated_this_transaction bigger than smd->old_ll.nr_blocks. In this case, alloc_data_block() will never commit metadata to reset the begin pointer of struct sm_disk, because sm_disk_get_nr_free() always return an underflow value." While there is room for improvement to the space-map accounting that thinp is making use of: the reality is this test is inherently racey and will result in the previous iteration's fstrim's discard(s) completing vs concurrent block allocation, via dd, in the next iteration of the loop. No amount of space map accounting improvements will be able to allow user's to use a block before a discard of that block has completed. So the best we can really do is allow DM thinp to gracefully handle such aggressive use of all the pool's data by degrading the pool into out-of-data-space (OODS) mode. We _should_ get that behaviour already (if space map accounting didn't falsely cause alloc_data_block() to believe free space was available).. but short of that we handle the current reality that dm_pool_alloc_data_block() can return -ENOSPC. Reported-by: Dennis Yang <[email protected]> Cc: [email protected] Signed-off-by: Mike Snitzer <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
1 parent 5396122 commit d923243

File tree

1 file changed

+9
-2
lines changed

1 file changed

+9
-2
lines changed

drivers/md/dm-thin.c

Lines changed: 9 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -992,6 +992,8 @@ static void schedule_external_copy(struct thin_c *tc, dm_block_t virt_block,
992992

993993
static void set_pool_mode(struct pool *pool, enum pool_mode new_mode);
994994

995+
static void requeue_bios(struct pool *pool);
996+
995997
static void check_for_space(struct pool *pool)
996998
{
997999
int r;
@@ -1004,8 +1006,10 @@ static void check_for_space(struct pool *pool)
10041006
if (r)
10051007
return;
10061008

1007-
if (nr_free)
1009+
if (nr_free) {
10081010
set_pool_mode(pool, PM_WRITE);
1011+
requeue_bios(pool);
1012+
}
10091013
}
10101014

10111015
/*
@@ -1082,7 +1086,10 @@ static int alloc_data_block(struct thin_c *tc, dm_block_t *result)
10821086

10831087
r = dm_pool_alloc_data_block(pool->pmd, result);
10841088
if (r) {
1085-
metadata_operation_failed(pool, "dm_pool_alloc_data_block", r);
1089+
if (r == -ENOSPC)
1090+
set_pool_mode(pool, PM_OUT_OF_DATA_SPACE);
1091+
else
1092+
metadata_operation_failed(pool, "dm_pool_alloc_data_block", r);
10861093
return r;
10871094
}
10881095

0 commit comments

Comments
 (0)