How To Defragment An XFS File System

The XFS file system generally does a pretty good job of keeping itself clean and tidy, but it can still become fragmented over time. Here we’re going to show you how to check the level of fragmentation on your XFS file system and how to defragment it if required, which can further increase disk performance.

Defining Fragmentation

Fragmentation happens when a file is stored on disk in non-contiguous extents. This basically means that the file is split up and stored all over the disk, which reduces overall performance because the disk must seek data from many different locations. Defragmentation is the process of reversing this and ensuring our files are stored in contiguous, sequential extents.

The XFS file system works with file extents, which are contiguous blocks of data. This means that by default the file system attempts to use the disk in sequence, which increases resistance to fragmentation. However, despite the best efforts of XFS, fragmentation can still occur, albeit at a lower level than on file systems that do not work in a similar manner.

Another benefit of the XFS file system is that the defragmentation process can be performed completely online. This means that there is no need to schedule downtime to unmount the file system before defragmenting. It will, however, increase disk I/O while running, which could negatively impact system performance.

Note: Typically defragmentation should only be run against systems that are using spinning rust (also known as traditional hard disk drives) and not solid state disks (SSDs). This is because SSDs gain much less from contiguous blocks of data than traditional HDDs do. Performing unnecessary defragmentation on an SSD can reduce its life over time, so it will generally be fine to let TRIM take care of things. Red Hat even advises against periodically defragmenting an entire XFS file system, stating that this is normally not warranted.

Viewing Current XFS Fragmentation Levels

Before blindly running the defragmentation tool on our XFS file system, we first want to get an idea of the current fragmentation levels as it may be that no defragmentation is even required!

First we need to find our XFS file systems. This can be done a few different ways; one simple method is to run the ‘blkid’ command as shown below.

[root@centos7 ~]# blkid
/dev/xvda1: UUID="1f790447-ebef-4ca0-b229-d0bc1985d47f" TYPE="xfs"

As shown, /dev/xvda1 on this Linux system contains an XFS file system.
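If blkid lists a lot of devices, the kernel's mount table can be filtered instead. The following is a minimal sketch, assuming the usual /proc/mounts layout (device, mount point, file system type, options); the function name list_xfs_mounts is made up for this example.

```shell
#!/bin/sh
# list_xfs_mounts: print "device mountpoint" for every XFS entry in a mount table.
# The optional argument is a file in /proc/mounts format (default: /proc/mounts).
list_xfs_mounts() {
    # Field 3 of each line is the file system type.
    awk '$3 == "xfs" { print $1, $2 }' "${1:-/proc/mounts}"
}

# Usage: list_xfs_mounts
```

Running `findmnt -t xfs` should give similar information on systems that ship util-linux.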

The xfs_db command can be used with the -r option to open a device or file as read only, which is required for a mounted file system. Here we specify our XFS file system /dev/xvda1.

[root@centos7 ~]# xfs_db -r /dev/xvda1
xfs_db>

This brings us to the xfs_db prompt where we can run additional commands. Type ‘help’ for a full listing of what can be run here.

In this particular instance we’re going to make use of the ‘frag’ command, which is used to get fragmentation data. The -d flag is used to display directory data, while the -f flag is used for file data.

xfs_db> frag -d
actual 5933, ideal 5617, fragmentation factor 5.33%

xfs_db> frag -f
actual 132484, ideal 129817, fragmentation factor 2.01%

To exit the xfs_db prompt, simply type ‘quit’ and press enter.
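The reported percentage appears to be (actual - ideal) / actual, which matches the numbers above: (5933 - 5617) / 5933 is roughly 5.33%. As a rough sketch, the same check can be run without the interactive prompt by passing the command to xfs_db with its -c option and recomputing the factor from the output. The helper name frag_factor is hypothetical, and the parsing assumes the output line looks exactly like the ones shown in this article.

```shell
#!/bin/sh
# frag_factor: read an xfs_db "frag" output line on stdin and print the
# fragmentation percentage recomputed as (actual - ideal) / actual * 100.
# Assumption: the line looks like "actual N, ideal M, fragmentation factor P%".
frag_factor() {
    awk '/^actual/ {
        gsub(",", "")                      # drop commas so the fields are plain numbers
        if ($2 > 0)                        # avoid dividing by zero on an empty file system
            printf "%.2f\n", ($2 - $4) / $2 * 100
    }'
}

# Usage on a real device (needs root; device name taken from this article):
#   xfs_db -r -c "frag -f" /dev/xvda1 | frag_factor
```

Keeping the parsing separate from the xfs_db call makes it easy to feed in saved output and compare before and after figures.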

There’s nothing too serious in this particular instance, but let’s see if we can improve the fragmentation levels by defragmenting our XFS file system.

Defragmenting An XFS File System

The defragmentation is done with the xfs_fsr tool, which is a file system reorganizer for XFS. It works by improving the organization of the disk extents of every file to ensure that contiguous space is being used.

This tool performs online defragmentation, meaning that it can be run on a file system that is currently mounted and in use. Just be aware of the increased I/O usage that will come as a result; on a production system it may be best to run it during a period of low disk activity if possible.

To run xfs_fsr, simply enter the command as shown below.

[root@centos7 ~]# xfs_fsr
xfs_fsr -m /proc/mounts -t 7200 -f /var/tmp/.fsrlast_xfs ...
Completed all 10 passes

By default, xfs_fsr with no other arguments runs against all mounted XFS file systems listed in the mount table. The man page names /etc/mtab as the default, although the output above shows /proc/mounts being used on this system. You can alternatively specify a file system such as /dev/xvda1, or even a specific file, if you don’t want to defragment everything available. The default time that xfs_fsr will run for is 7200 seconds, which is 2 hours. After this time it saves its current progress to the /var/tmp/.fsrlast_xfs file, which essentially allows us to pause and resume the XFS defragmentation process. These three defaults (mount table, run time, and progress file) can be changed by specifying the -m, -t or -f flags respectively.
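The two ways of invoking xfs_fsr can be sketched as a pair of small wrappers. This is only an illustration: the function names fsr_all and fsr_one and the DRY_RUN variable are made up here, and -v (verbose) is added by choice. As I read the man page synopsis, -t belongs with the form that walks every mounted file system rather than with an explicit target, so the two are kept separate.

```shell
#!/bin/sh
# fsr_all SECONDS: reorganize every mounted XFS file system for at most
#                  SECONDS (defaults to 7200, the xfs_fsr default).
# fsr_one TARGET:  reorganize a single device, mount point, or file.
# With DRY_RUN set to any non-empty value, the command is printed, not run.
fsr_all() {
    ${DRY_RUN:+echo} xfs_fsr -v -t "${1:-7200}"
}

fsr_one() {
    ${DRY_RUN:+echo} xfs_fsr -v "$1"
}

# Usage (as root):
#   DRY_RUN=1 fsr_all 600      # show the command for a 10 minute run
#   fsr_one /dev/xvda1         # defragment only this file system
```

Printing the command first is a cheap way to double check the target before generating a lot of disk I/O.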

Here are the fragmentation levels reported by xfs_db after running xfs_fsr with the default options.

xfs_db> frag -d
actual 5935, ideal 5619, fragmentation factor 5.32%

xfs_db> frag -f
actual 129878, ideal 129863, fragmentation factor 0.01%

In this particular case the directory fragmentation has barely changed, while the file fragmentation has almost been eliminated.

It’s worth noting that xfs_fsr does not run on files that are currently mapped in memory. I have found that after running xfs_fsr, rebooting, and running xfs_fsr again, the fragmentation level dropped by a further 0.01%. This is not exactly practical or very useful, but it goes to show that fragmented files previously held open in memory were skipped by the defragmentation process.
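To check whether an individual file, such as one that was skipped, is still fragmented, its extent map can be printed with xfs_bmap. The sketch below counts the data extents in that output. It assumes the usual layout of a file name header followed by lines of the form "N: [start..end]: blocks", with unallocated ranges marked "hole"; the function name count_extents is hypothetical.

```shell
#!/bin/sh
# count_extents: read `xfs_bmap FILE` output on stdin and print the number of
# data extents, skipping the file name header line and any "hole" entries.
count_extents() {
    awk '/^[ \t]*[0-9]+: \[/ && $NF != "hole" { n++ } END { print n + 0 }'
}

# Usage: xfs_bmap /path/to/file | count_extents
```

A large file that reports only a handful of extents is unlikely to benefit from being passed to xfs_fsr.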

XFS Defragmentation Performance Information

From my investigation the xfs_fsr tool does not appear to be multithreaded. On a 4 CPU core Linux system I found that only one of the cores was at close to 100% I/O wait (97.3% in the snapshot below) while the others were at or near 0%, suggesting that only one CPU core was performing the required disk operations. Below is a snapshot of ‘top’ taken while running xfs_fsr showing this.

[root@centos7 ~]# top
top - 21:42:21 up 66 days, 22:46,  3 users,  load average: 0.40, 0.12, 0.08
Tasks: 185 total,   2 running, 183 sleeping,   0 stopped,   0 zombie
%Cpu0  :  0.0 us,  0.0 sy,  0.0 ni, 99.7 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  0.3 us,  1.7 sy,  0.0 ni,  0.3 id, 97.3 wa,  0.0 hi,  0.0 si,  0.3 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16005736 total, 13166964 free,   657916 used,  2180856 buff/cache
KiB Swap:   520188 total,   520188 free,        0 used. 15037076 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
11653 root      20   0  126380  17212    680 R   1.7  0.1   0:00.40 xfs_fsr

I also noticed that my reads and writes were maxing out at times, so be prepared for reduced disk performance while defragmenting. While doing absolutely nothing else except running xfs_fsr, I noted the following I/O usage with the ‘iotop’ command. When xfs_fsr was not running, reads and writes on the test server were 0 bytes per second, confirming that all of the I/O was coming from xfs_fsr.

Total DISK READ :     180.74 M/s | Total DISK WRITE :     212.45 M/s
Actual DISK READ:     180.74 M/s | Actual DISK WRITE:     212.83 M/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
  26455 be/4 root      107.73 M/s  122.63 M/s  0.00 % 99.99 % xfs_fsr

This may not look like much but it’s about as fast as the disk on my test system can perform.

The overall time that it will take for xfs_fsr to complete will depend on the size of your disk, the speed of disk I/O, and how much fragmentation exists. For the 8 GB disk that I have been testing on, it took less than a couple of minutes to complete while making use of almost all disk I/O.

As I did not have very much fragmentation in place I was not able to perform a meaningful before and after disk I/O test to compare the performance differences before and after defragmentation.

We can further help the XFS file system reduce fragmentation by mounting it with ‘allocsize’ specified. For example, see the configuration line below from /etc/fstab.

/dev/xvda1  /  xfs  defaults,allocsize=512m  0 0

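With this option set, XFS uses a fixed 512 MB end-of-file speculative preallocation when files are extended with buffered writes, instead of its default dynamic heuristics, which helps growing files stay in contiguous sequential space. For most users the defaults that XFS provides will generally be fine. For further information on XFS tuning I recommend taking a look at the XFS FAQ as a good starting point.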
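After remounting, it can be worth confirming that the option actually took effect. Here is a small sketch that pulls the allocsize value out of a mount option string. The function name get_allocsize and the "dynamic" placeholder are my own choices, and the kernel may report the value in different units than the ones written in /etc/fstab.

```shell
#!/bin/sh
# get_allocsize: print the allocsize value from a comma-separated mount option
# string, or "dynamic" when the option is absent (the XFS default behavior).
get_allocsize() {
    printf '%s\n' "$1" | tr ',' '\n' |
        awk -F= '$1 == "allocsize" { print $2; found = 1 }
                 END { if (!found) print "dynamic" }'
}

# Usage: get_allocsize "$(findmnt -no OPTIONS /)"
```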

Conclusion

With the xfs_db tool we can view current levels of fragmentation present within an XFS file system. Should we identify that the fragmentation levels need to be addressed, we can then make use of the xfs_fsr tool to perform an online defragmentation of the file system. This generally does not need to be run very often, as XFS already does a good job of making use of contiguous file extents on disk. However, running defragmentation as required may increase disk I/O performance if your system makes use of traditional hard disk drives.

  1. Hi,

    Just a question: I have heavy directory fragmentation on my server, around 32%… How can I defragment this? File fragmentation is fine, but the directories are still very fragmented.

    Matt

    • Could you provide the output showing this? You’re using XFS?

      • Hi Jarrod,

        Here is the output and error for directory fragmentation:

        Metadata corruption detected at block 0x4603c26060/0x8000
        Metadata CRC error detected at block 0x467ffffb9a/0x200
        actual 112413, ideal 76081, fragmentation factor 32.32%
        xfs_db>

        Here it is for file fragmentation:

        Metadata corruption detected at block 0x4603c26060/0x8000
        Metadata CRC error detected at block 0x467ffffb9a/0x200
        actual 6977723, ideal 6964597, fragmentation factor 0.19%
        xfs_db>

        If you have an idea how to solve the corrupt data and CRC errors, I’ll take that too ;)

        For information, it’s a data storage server for 3D production. All artists read and write on it at the same time, plus the render farm (62 computers in total connected to the file system 24/7).

        It’s a custom 2x Xeon 64gb ECC ram (soon 572 gb), below the description of the server

        SuperMicro Storage Server R6048-E1CR36N, Rackmount 6U,
        Chassis CSE-847BE1C-R1K28LPB, 1200W Redundant PSU
        (Platinium 94%), Mainboard X10DRi-T4+, 2x Xeon E5-2600v3
        series, up to 1.5TB DDR4-2133 Reg ECC, LSI3108 SAS3 HW
        RAID, 36x Hot-swap SAS3 Bays (with LSI Expander), Quad
        10GBase-T LAN w/ Intel X540, etc.
        SuperMicro Rack Mount Kit MCP-220-84606-0N
        SuperMicro Advanced Swap 3-Years hardware warranty
        2x Intel Xeon E5-2620v3 (6-core / 2.4GHz / 15Mb / SKT2011v3)
        64GB DDR4-2133 (8x 8GB), Reg. ECC LRDIMM (up to 1.5TB)
        Supermicro AOC-S3108L SAS3 RAID Card w/ 2GB cache
        Supermicro SuperCap Module (BBU)
        Supermicro 2-port Int-to-Ext SAS3 expansion (to ext. expanders)
        Samsung SM863 DataCenter SSD 480GB SATA3 / min. 3.08PB
        (RAID 1)
        36x HDD 6TB SAS3 (12Gb/s) 7.2k 128Mb (Seagate RE 24/7)
        (RAID 6 / 33+2+1)
        Intel E10G42BFSR Network Adapter Dual 10GBase-SR ( PCIE)

        I made a hardware RAID 60 with the AOC-S3108L SAS3 RAID card, put this RAID into LVM, and formatted it as XFS so the storage can be expanded easily. We write a 1 GB file at 1.2 GB/s and read at 1.0 GB/s or 720 MB/s depending on network traffic.

        All your tips are welcome ;)

        Matt

        • Interesting, I have not personally come across this issue before so we’ll see how we go!

          Was this while running xfs_repair?

          Is this related? https://bugzilla.redhat.com/show_bug.cgi?id=1315895

          • Hi Jarrod,

            So, I tried xfs_repair and it did not solve this error message or the directory fragmentation. For file fragmentation, no problem.

            The messages appeared during frag -d / frag -f.
            Yes, I think it’s a bug like the one in the link but, as I said, there are error messages while all the data is OK. No data was lost. Maybe the SAS card’s “auto-cleanup / auto-repair” function causes this kind of message.

            Matt
