
data lost due to pmfs_evict_inode is not atomic #26

@iaoing

Bug

This is a concurrency and crash-consistency bug.

If VFS invokes pmfs_evict_inode while another process creates a file or directory, the new file (or directory) can be allocated an inode number that is still in the truncate_list. If a crash occurs after the creation and before pmfs_truncate_del runs, recovery traverses the truncate_list and alters the size of the newly created file (or directory).

A similar bug could occur in PMFS, since PMFS and WineFS share the same truncate list mechanism.
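
The interleaving described above can be sketched with a small user-space model. This is a simplified illustration, not WineFS code: `struct sim_pm`, the `sim_*` functions, the lowest-free-number allocator, and the directory size of 4096 are all assumptions made for the example, and recovery is reduced to "apply the recorded truncate size to any inode still on the list".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_NUM_INODES 64
#define SIM_FIRST_INO  33   /* matches the inode number seen in this report */
#define SIM_DIR_SIZE   4096 /* assumed size of a freshly created directory */

/* Hypothetical stand-in for the state kept in persistent memory. */
struct sim_pm {
	bool     allocated[SIM_NUM_INODES];
	uint64_t size[SIM_NUM_INODES];
	bool     on_truncate_list[SIM_NUM_INODES];
	uint64_t truncate_size[SIM_NUM_INODES];
};

/* Hand out the lowest free inode number (assumed allocator policy). */
static int sim_alloc_inode(struct sim_pm *pm)
{
	for (int ino = SIM_FIRST_INO; ino < SIM_NUM_INODES; ino++) {
		if (!pm->allocated[ino]) {
			pm->allocated[ino] = true;
			pm->size[ino] = 0;
			return ino;
		}
	}
	return -1;
}

/* Recovery: apply every pending truncate record, then clear the list. */
static void sim_recover(struct sim_pm *pm)
{
	for (int ino = 0; ino < SIM_NUM_INODES; ino++) {
		if (!pm->on_truncate_list[ino])
			continue;
		if (pm->allocated[ino])
			pm->size[ino] = pm->truncate_size[ino];
		pm->on_truncate_list[ino] = false;
	}
}

/* Returns the size of 'dir' as seen after the crash and recovery. */
uint64_t sim_dir_size_after_recovery(bool crash_before_truncate_del)
{
	struct sim_pm pm = {0};

	int foo = sim_alloc_inode(&pm);        /* touch foo */

	/* rm foo: a truncate record exists and the inode number is freed,
	 * but the record has not been removed from the list yet. */
	pm.on_truncate_list[foo] = true;
	pm.truncate_size[foo] = pm.size[foo];  /* 0 for an empty file */
	pm.allocated[foo] = false;

	/* mkdir dir runs inside that window and reuses the number. */
	int dir = sim_alloc_inode(&pm);
	assert(dir == foo);
	pm.size[dir] = SIM_DIR_SIZE;

	if (!crash_before_truncate_del)
		pm.on_truncate_list[foo] = false;  /* truncate_del completed */

	sim_recover(&pm);                      /* crash, then remount */
	return pm.size[dir];
}
```

In this model, `sim_dir_size_after_recovery(true)` returns 0 (the stale record overwrites the directory size), while `sim_dir_size_after_recovery(false)` returns 4096.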

Reproduce

First, modify the source code in

if (destroy == 0) {
	pmfs_dbg_verbose("%s: destroying %lu\n", __func__, inode->i_ino);
	pmfs_free_dram_resource(sb, sih);
}
/* now it is safe to remove the inode from the truncate list */
pmfs_truncate_del(inode);
/* TODO: Since we don't use page-cache, do we really need the following
 * call? */
as shown below.

if (destroy == 0) {
	pmfs_dbg_verbose("%s: destroying %lu\n", __func__, inode->i_ino);
	pmfs_free_dram_resource(sb, sih);
}
pmfs_dbg("start sleep 50 seconds");
msleep(50000);
pmfs_dbg("end sleep 50 seconds");
/* now it is safe to remove the inode from the truncate list */
pmfs_truncate_del(inode);

Run the commands below, following the comments.

# mount fs
insmod winefs
mount -t winefs -o init,dbgmask=255 /dev/pmem0 /mnt/pmem0

touch /mnt/pmem0/foo # terminal 1
rm /mnt/pmem0/foo # terminal 1; takes 50 seconds because of the source modification

mkdir /mnt/pmem0/dir # terminal 2; run while rm is still sleeping
cat /dev/pmem0 > img.1 # terminal 2; save the PM image to simulate a crash (must finish before rm foo completes)

# wait until all commands are done

# The syslog shows that foo (touch foo) was assigned inode 33.
# During pmfs_evict_inode (rm foo), inode number 33 is freed.
# Then dir (mkdir dir) is assigned inode number 33.
# However, when 33 is allocated to dir, `pmfs_truncate_del` has not yet been invoked, so 33 is still in the truncate list.
# The saved image therefore has: (a) inode 33 still in the truncate list (in PM); (b) inode 33 allocated to 'dir'.
dmesg 

# umount fs 
umount /mnt/pmem0
rmmod winefs
insmod winefs

# restore the saved image
dd if=img.1 of=/dev/pmem0 bs=1048576 count=128 # adjust count to match the device size
mount -t winefs -o dbgmask=255 /dev/pmem0 /mnt/pmem0

# The command below shows nothing in the dir directory.
# This is because the size of dir was reset to 0 while recovering the truncate list, which still contains inode 33.
ls -a /mnt/pmem0/dir

Fix

I do not have a good idea for a fix so far.
