1. Overview

When we connect a USB drive to a Linux system and its filesystem is mounted, it becomes part of the filesystem hierarchy. This integration allows seamless data transfer between the device and the computer. However, there are pitfalls. For example, if we physically disconnect or turn off a drive without properly unmounting it first, we may lose some data or even corrupt the entire filesystem.

In this tutorial, we’ll explore the basic steps and precautions necessary to safely remove USB drives using the Linux command line.

2. Checking Device Usage and Unmount

Let’s take the simple case of copying files. By default, Windows uses a write-through policy for USB devices, writing data to the device without significant delay. Linux and macOS, on the other hand, use a write-back policy that relies heavily on caching to improve overall performance. As a result, data can remain in the cache instead of being written to the device immediately, so even when the copy appears to be complete, the data may not have fully reached the USB device. In such a scenario, a power failure, crash, or physical disconnection of the device can result in data loss or corruption.
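To make the effect of write-back caching visible, we can look at the Dirty and Writeback counters that the kernel exposes in /proc/meminfo. The following is a minimal sketch; the function name pending_writeback_kb is our own, and the counters are system-wide, so they cover every disk and not only the USB drive:

```shell
# Hypothetical helper: sums the Dirty and Writeback counters (in kB) found in
# a meminfo-style file. Reads /proc/meminfo unless another file is given.
# Note: these values are system-wide, not per device.
pending_writeback_kb() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { sum += $2 }
         END { print sum + 0 }' "${1:-/proc/meminfo}"
}

# Example usage right after a large copy has "finished":
#   pending_writeback_kb
```

A value that stays well above zero after a copy suggests that data is still waiting to be written somewhere, possibly to our USB drive.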

2.1. Identifying the USB Device Using df and lsblk

The df command provides information about the space usage of mounted filesystems. With the -h option, it prints sizes in human-readable units, listing each mounted filesystem with its size, used space, available space, and mount point.

Let’s look for the device corresponding to our USB drive:

$ df -h
Filesystem   Size  Used Avail Use% Mounted on
[...]
/dev/dm-3    917G  716G  192G  79% /media/francesco/106bfc11-23d5-49c1-8c10-953cbb082a14

In this example, /dev/dm-3 is our USB device. The dm in dm-3 stands for device mapper, and the number that follows is a sequential identifier. dm-3 doesn’t refer directly to physical drives, like sda or sdb, but rather to a virtual block device that the system uses to handle complex disk operations. We often see dm-X in systems using LVM, encrypted volumes, or other sophisticated storage solutions.
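On a system with many mounts, the df listing can be long. As a rough sketch, we can keep only the header and the rows whose mount point lies under /media or /run/media. This assumes that the desktop automounter uses one of those two locations, as in our example, and that mount points contain no spaces; the function name removable_mounts is our own:

```shell
# Hypothetical helper: filters `df` output read from stdin, keeping the header
# line and any filesystem mounted under /media/ or /run/media/.
# Assumption: mount points do not contain spaces.
removable_mounts() {
    awk 'NR == 1 || $NF ~ /^\/(run\/)?media\//'
}

# Example usage:
#   df -h | removable_mounts
```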

Using lsblk with the -o option followed by a list of columns, such as NAME and KNAME, we can see more detail:

$ lsblk -o NAME,KNAME,FSTYPE,TYPE,MOUNTPOINT,SIZE
NAME                                          KNAME  FSTYPE      TYPE  MOUNTPOINT                                              SIZE
[...]
sdc                                           sdc                disk                                                        931,5G
└─sdc1                                        sdc1   crypto_LUKS part                                                        931,5G
  └─luks-d99ee6e1-7262-4267-ac15-b93674b9f666 dm-3   ext4        crypt /media/francesco/106bfc11-23d5-49c1-8c10-953cbb082a14 931,5G

In this output, KNAME, which stands for “kernel name”, and NAME refer to the same device. So our USB disk has two equivalent device names:

  • /dev/dm-3
  • /dev/mapper/luks-d99ee6e1-7262-4267-ac15-b93674b9f666

We can easily verify that the second device is a symbolic link to the first:

$ ls -l /dev/dm-3 /dev/mapper/luks-d99ee6e1-7262-4267-ac15-b93674b9f666
brw-rw---- 1 root disk 253, 3 May  1 22:26 /dev/dm-3
lrwxrwxrwx 1 root root      7 May  1 22:26 /dev/mapper/luks-d99ee6e1-7262-4267-ac15-b93674b9f666 -> ../dm-3

We can use whichever one we prefer.
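If we want to check this equivalence in a script instead of reading the ls output, we can resolve both names with readlink -f and compare the results. This is only a sketch, and same_device is a name we made up for illustration:

```shell
# Hypothetical helper: succeeds if both paths resolve to the same existing
# file after following symbolic links (uses GNU readlink -f).
same_device() {
    a=$(readlink -f -- "$1") || return 1
    b=$(readlink -f -- "$2") || return 1
    [ -e "$a" ] && [ "$a" = "$b" ]
}

# Example usage:
#   same_device /dev/dm-3 /dev/mapper/luks-d99ee6e1-7262-4267-ac15-b93674b9f666 \
#       && echo "same device"
```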

2.2. Checking for Active Usage of the Device

iostat is a tool for monitoring the load on I/O devices. Let’s print the I/O statistics of /dev/dm-3 every two seconds, five times:

$ iostat -d /dev/dm-3 2 5 | { head -3 ; grep dm-3 ; }
[...]
Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
dm-3            181,50     72540,00        32,00         0,00     145080         64          0
dm-3            270,00     75490,00         0,00         0,00     150980          0          0
dm-3            227,00     76104,00         6,00         0,00     152208         12          0
dm-3            131,00     68398,00     20480,00         0,00     136796      40960          0

In this case, the high kB_read/s and kB_wrtn/s values show that the disk is actively reading and writing data.

While iostat helps us understand device activity, lsof allows us to identify specific processes that have files open on the /dev/dm-3 mount point, i.e., /media/francesco/106bfc11-23d5-49c1-8c10-953cbb082a14:

$ lsof | { head -1 ; grep /media/francesco/106bfc11-23d5-49c1-8c10-953cbb082a14 ; }
COMMAND      PID  [...]     USER   FD   [...] NAME
nemo      393849  [...] francesco  23r  [...] /media/francesco/[...]/file1.7z
nemo      393849  [...] francesco  24w  [...] /media/francesco/[...]/file2.7z
[...]

This output means that nemo, the default file manager in Cinnamon, is reading file1.7z and writing file2.7z. In this case, it’s just finishing a file copy.
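When many files are open, the listing can get long. lsof also accepts a mount point as an argument and then lists only the files open on that filesystem, so we can pipe its output through a small filter that reduces it to one line per process. The following is a sketch, and open_file_holders is a hypothetical name:

```shell
# Hypothetical helper: reads lsof output (including its header line) from
# stdin and prints each distinct "COMMAND PID" pair once.
open_file_holders() {
    awk 'NR > 1 { print $1, $2 }' | sort -u
}

# Example usage:
#   lsof /media/francesco/106bfc11-23d5-49c1-8c10-953cbb082a14 | open_file_holders
```

An empty result suggests that no process visible to our user has files open on the drive. Without root privileges, lsof may not be able to inspect processes that belong to other users.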

When it’s done, we can check the device I/O statistics again:

$ iostat -d /dev/dm-3 2 5 | { head -3 ; grep dm-3 ; }
Linux 5.15.0-105-generic (asusrog) 	05/01/2024 	_x86_64_	(8 CPU)

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
dm-3             27,29      4606,04      3048,04         0,00  449510259  297463396          0
dm-3             66,50         0,00       264,00         0,00          0        528          0
dm-3              0,00         0,00         0,00         0,00          0          0          0
dm-3              0,00         0,00         0,00         0,00          0          0          0
dm-3              0,00         0,00         0,00         0,00          0          0          0

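The first dm-3 line reports averages since boot, so we can ignore it. The last three readings of kB_read/s and kB_wrtn/s are 0, which means there’s no ongoing I/O on the device. Since the copy has finished and nothing else is reading or writing, we can safely unmount the USB drive.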
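The same check can be automated. The sketch below reads iostat -d output and succeeds only if the last three samples for a device show zero read and write rates. The function name device_is_idle and the choice of three samples are our own assumptions, and the column positions match the iostat output shown here:

```shell
# Hypothetical helper: reads `iostat -d` output from stdin and succeeds only
# if the last three samples for device $1 have kB_read/s and kB_wrtn/s at zero.
# Assumes kB_read/s is column 3 and kB_wrtn/s is column 4, as in the output above.
device_is_idle() {
    awk -v dev="$1" '
        $1 == dev {
            r = $3; w = $4
            gsub(",", ".", r); gsub(",", ".", w)   # tolerate comma decimals
            n++
            busy[n] = (r + 0 > 0 || w + 0 > 0)
        }
        END {
            if (n < 3) exit 1                      # not enough samples
            for (i = n - 2; i <= n; i++)
                if (busy[i]) exit 1
            exit 0
        }'
}

# Example usage:
#   iostat -d /dev/dm-3 2 5 | device_is_idle dm-3 && echo "no recent I/O"
```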

2.3. Safely Unmounting the USB Drive

Using the sync command before umount is a recommended practice, although not always strictly necessary. sync forces the system to write all cached data that hasn’t yet reached the disk (the so-called dirty buffers) to the drive, ensuring that all pending write operations are completed before we unmount it. This is especially important to prevent data loss:

$ sync

sync produces no output, but it doesn’t terminate until all pending writes on all disks have been completed. Therefore, sync may exit immediately or after a few seconds. If it seems to hang, we need to investigate the device activity as in the previous steps.
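If we only care about one drive, GNU sync accepts the -f option to flush just the filesystem that contains a given path. The sketch below combines it with timeout so that a stalled flush gets reported instead of blocking the terminal. The function name flush_mount and the 60-second default are arbitrary choices of ours, and a process stuck in uninterruptible I/O may not stop right when the limit expires:

```shell
# Hypothetical helper: flushes the filesystem containing path $1, giving up
# after $2 seconds (default 60). Relies on GNU coreutils `sync -f` and `timeout`.
flush_mount() {
    mnt=$1
    limit=${2:-60}
    if timeout "$limit" sync -f "$mnt"; then
        echo "flushed: $mnt"
    else
        echo "sync failed or did not finish within ${limit}s: $mnt" >&2
        return 1
    fi
}

# Example usage:
#   flush_mount /media/francesco/106bfc11-23d5-49c1-8c10-953cbb082a14
```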

Once the sync operation is complete, let’s unmount the USB device. We can use the umount command followed by the mount point or by one of the two equivalent device names we identified earlier:

$ umount /dev/dm-3

umount detaches the storage device’s filesystem from the computer’s main filesystem, making it safe to physically remove the device. However, as an extra precaution, let’s wait a few seconds to make sure that the disk LED shows no activity.
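Before pulling the cable, we can also confirm that the mount point is really gone by looking it up in /proc/mounts. This is a minimal sketch: is_mounted is a hypothetical name, and since /proc/mounts escapes spaces in paths (as \040), the helper assumes a mount point without spaces:

```shell
# Hypothetical helper: succeeds if mount point $1 appears in a mounts-style
# table. Reads /proc/mounts unless another file is given as $2.
# Assumption: the mount point contains no spaces.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example usage after running umount:
#   is_mounted /media/francesco/106bfc11-23d5-49c1-8c10-953cbb082a14 \
#       || echo "unmounted"
```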

Finally, it’s worth noting that depending on the configuration of our Linux machine, some of the commands described so far may require root privileges.

3. Conclusion

In this article, we’ve learned the critical steps and precautions for safely removing USB drives under Linux. We’ve seen how to verify device usage and perform proper unmounting procedures to prevent potential data loss or corruption.

It’s important to remember that the integrity of our data depends heavily on strict adherence to these best practices. In addition, we should take every precaution to avoid physical damage, shock to the disks during use, and power outages.
