Key Takeaways

  • A file's actual size is the number of bytes that make up the file. Its effective size on the hard disk is the number of file system blocks needed to store it. The two differ because disk space is allocated in whole blocks.
  • The du command can be used to check the size of files, directories, and the total disk space used by the current directory and subdirectories.
  • Run "du -h" to see directory sizes in a human-readable format.

The Linux du command can report both the actual disk usage and the true size of a file or directory. We'll explain why these values aren't the same.

Why are Actual Disk Usage and True Size Different?

The size of a file and the space it occupies on your hard drive are rarely the same. Disk space is allocated in blocks. If a file is smaller than a block, an entire block is still allocated to it because the file system doesn't have a smaller unit of real estate to use.

Unless a file's size is an exact multiple of blocks, the space it uses on the hard drive must always be rounded up to the next whole block. For example, if a file is larger than two blocks but smaller than three, it still takes three blocks of space to store it.
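
The rounding described above can be sketched with shell integer arithmetic. The 4,096-byte block size and the 9,000-byte file size are assumed values chosen only for illustration:

```shell
# Round a byte count up to a whole number of blocks.
block_size=4096   # assumed file system block size, in bytes
file_size=9000    # hypothetical file: larger than two blocks, smaller than three

# Adding (block_size - 1) before dividing rounds the result up.
blocks=$(( (file_size + block_size - 1) / block_size ))
on_disk=$(( blocks * block_size ))

echo "blocks needed: $blocks"
echo "bytes on disk: $on_disk"
```

With these numbers, the file needs three blocks, so it occupies 12,288 bytes on disk even though it holds only 9,000 bytes of content.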

Two measurements are used in relation to file size. The first is the actual size of the file, which is the number of bytes of content that make up the file. The second is the effective size of the file on the hard disk. This is the number of file system blocks necessary to store that file.
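
One way to see both measurements side by side is the GNU stat command. This is a minimal sketch that assumes GNU coreutils; the file name sample.txt is hypothetical, and the space-on-disk figure depends on your file system:

```shell
# Create a two-byte scratch file in a temporary directory.
tmpdir=$(mktemp -d)
printf '1\n' > "$tmpdir/sample.txt"

# %s = size in bytes, %b = number of blocks allocated,
# %B = size in bytes of each block counted by %b
actual=$(stat -c '%s' "$tmpdir/sample.txt")
blocks=$(stat -c '%b' "$tmpdir/sample.txt")
unit=$(stat -c '%B' "$tmpdir/sample.txt")
allocated=$(( blocks * unit ))

echo "actual size:   $actual bytes"
echo "space on disk: $allocated bytes"

rm -rf "$tmpdir"
```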

How to Check a File's Size

Let's look at a simple example. We'll redirect a single character into a file to create a small file:

echo "1" > geek.txt

Now, we'll use ls with the -l (long format listing) option to look at the file length:

ls -l geek.txt

The length is the numeric value that follows the owner and group entries (dave dave in this example), which is two bytes. Why is it two bytes when we only sent one character to the file? Let's take a look at what's happening inside the file.

We'll use the hexdump command, which will give us an exact byte count and allow us to "see" non-printing characters as hexadecimal values. We'll also use the -C (canonical) option to force the output to show hexadecimal values in the body of the output, as well as their alphanumeric character equivalents:

hexdump -C geek.txt

The output shows us that, beginning at offset 00000000 in the file, there's a byte that contains a hexadecimal value of 31, and one that contains a hexadecimal value of 0A. The right-hand portion of the output depicts these values as alphanumeric characters, wherever possible.

The hexadecimal value of 31 is used to represent the digit one. The hexadecimal value of 0A is used to represent the Line Feed character, which cannot be shown as an alphanumeric character, so it's shown as a period (.) instead. The Line Feed character is added by echo. By default, echo starts a new line after it displays the text it needs to write to the terminal window.

That tallies with the output from ls and agrees with the file length of two bytes.
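
To confirm that the extra byte comes from echo, you can compare it with printf, which writes only the characters it's given. This is a small sketch; the file names are hypothetical:

```shell
tmpdir=$(mktemp -d)

echo "1"   > "$tmpdir/with_newline.txt"     # echo appends a Line Feed (0A)
printf '1' > "$tmpdir/without_newline.txt"  # printf adds nothing extra

# wc -c counts the bytes in each file
with_nl=$(wc -c < "$tmpdir/with_newline.txt")
without_nl=$(wc -c < "$tmpdir/without_newline.txt")

echo "echo:   $with_nl bytes"
echo "printf: $without_nl bytes"

rm -rf "$tmpdir"
```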

Now, we'll use the du command to look at the file size:

du geek.txt

It says the size is four, but four of what?

There Are Blocks, and Then There Are Blocks

When du reports file sizes in blocks, the size it uses depends on several factors. You can specify which block size it should use on the command line. If you don't force du to use a particular block size, it follows a set of rules to decide which one to use.

First, it checks the following environment variables:

  • DU_BLOCK_SIZE
  • BLOCK_SIZE
  • BLOCKSIZE

If any of these exist, the block size is set, and du stops checking. If none are set, du defaults to a block size of 1,024 bytes. Unless, that is, an environment variable called POSIXLY_CORRECT is set. If that's the case, du defaults to a block size of 512 bytes.
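
The following sketch shows the environment variables in action, assuming GNU du. Setting a variable on the same line as the command affects only that one run. The scratch file is hypothetical, and the numbers you see depend on your file system:

```shell
tmpdir=$(mktemp -d)
echo "1" > "$tmpdir/geek.txt"

# Ask du to count in 1,024-byte blocks, then in 512-byte blocks
in_1k=$(DU_BLOCK_SIZE=1024 du "$tmpdir/geek.txt" | cut -f1)
in_512=$(DU_BLOCK_SIZE=512 du "$tmpdir/geek.txt" | cut -f1)

# DU_BLOCK_SIZE is checked first, so BLOCK_SIZE is ignored here
both_set=$(DU_BLOCK_SIZE=512 BLOCK_SIZE=1024 du "$tmpdir/geek.txt" | cut -f1)

echo "1,024-byte blocks: $in_1k"
echo "512-byte blocks:   $in_512"
echo "both variables:    $both_set"

rm -rf "$tmpdir"
```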

So, how do we find out which one is in use? You can check each environment variable to work it out, but there's a quicker way. Let's compare the results to the block size the file system uses instead.

To discover the block size the file system uses, we'll use the tune2fs program with the -l (list superblock) option. We'll pipe the output through grep to print only the lines that contain the word "Block."

In this example, we'll look at the file system on the first partition of the first hard drive, sda1, and we'll need to use sudo:

sudo tune2fs -l /dev/sda1 | grep Block

The file system block size is 4,096 bytes. If we divide that by the result we got from du (four), it shows the du default block size is 1,024 bytes. We now know several important things.

First, we know the smallest amount of file system real estate that can be devoted to storing a file is 4,096 bytes. This means even our tiny, two-byte file is taking up 4 KB of hard drive space.

The second thing to keep in mind is that applications dedicated to reporting on hard drive and file system statistics, such as du, ls, and tune2fs, can have different notions of what "block" means. The tune2fs application reports true file system block sizes, while ls and du can be configured or forced to use other block sizes. Those block sizes are not intended to relate to the file system block size; they're just "chunks" those commands use in their output.

Finally, other than using different block sizes, the answers from du and tune2fs convey the same meaning. The tune2fs result was one block of 4,096 bytes, and the du result was four blocks of 1,024 bytes.
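
The arithmetic behind that comparison can be written out as a short sketch. The two values are taken from the example above, and the calculation assumes the file occupies exactly one file system block:

```shell
fs_block=4096   # file system block size reported by tune2fs in the example
du_count=4      # number of blocks du reported for geek.txt

# One file system block split across the blocks du counted
du_block=$(( fs_block / du_count ))

echo "du is counting in blocks of $du_block bytes"
echo "space used: $(( du_count * du_block )) bytes"
```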

Using du to Check File Size

With no command line parameters or options, du lists the total disk space the current directory and all subdirectories are using.

Let's take a look at an example:

du

The size is reported in the default block size of 1,024 bytes per block. The entire subdirectory tree is traversed.

Using du on a Different Directory

If you want du to report on a different directory than the current one, you can pass the path to the directory on the command line:

du ~/.cache/evolution/

Using du on a Specific File

If you want du to report on a specific file, pass the path to that file on the command line. You can also pass a shell pattern to select a group of files, such as *.txt:

du ~/.bash_aliases
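
Here's a minimal sketch of the shell pattern form, using hypothetical scratch files. The shell expands *.txt before du runs, so du simply receives a list of file names:

```shell
tmpdir=$(mktemp -d)
echo "one" > "$tmpdir/a.txt"
echo "two" > "$tmpdir/b.txt"
echo "log" > "$tmpdir/c.log"

# Only the two .txt files match the pattern; c.log is not reported
listing=$(du -- "$tmpdir"/*.txt)
echo "$listing"

rm -rf "$tmpdir"
```

The -- marks the end of the options, which protects against file names that begin with a hyphen.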

Reporting on Files in Directories

To have du report on the files in the current directory and subdirectories, use the -a (all files) option:

du -a

For each directory, the size of each file is reported, as well as a total for each directory.

Limiting Directory Tree Depth

You can tell du to list the directory tree to a certain depth. To do so, use the -d (max depth) option and provide a depth value as a parameter. Note that all subdirectories are scanned and used to calculate the reported totals, but they're not all listed. To set a maximum directory depth of one level, use this command:

du -d 1

The output lists the total size of each subdirectory in the current directory and also provides a total for the current directory itself.

To list directories one level deeper, use this command:

du -d 2
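
To see the effect of the depth limit on a known layout, you can build a small throwaway tree. This is a sketch; the directory names a, b, and c are hypothetical:

```shell
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/a/b/c"
echo "data" > "$tmpdir/a/b/c/file.txt"

# Depth 1 lists ./a and the current directory (.)
depth1=$(cd "$tmpdir" && du -d 1)

# Depth 2 also lists ./a/b
depth2=$(cd "$tmpdir" && du -d 2)

echo "$depth1"
echo "---"
echo "$depth2"

rm -rf "$tmpdir"
```

The total shown for ./a is the same in both listings, because the deeper directories are still scanned even when they aren't listed.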

Setting the Block Size

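You can use the --block-size option to set a block size for du for the current operation. GNU du accepts any unambiguous abbreviation of a long option, so the examples here use the shorter --block. To use a block size of one byte, use the following command to report the disk usage of the directories and files in bytes: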

du --block=1

If you want to use a block size of one megabyte, you can use the -m (megabyte) option, which is the same as --block=1M:

du -m
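
The sketch below compares a few block sizes on a single scratch file, assuming GNU du. It uses the full option name, --block-size, and its short form, -B. The numbers depend on your file system:

```shell
tmpdir=$(mktemp -d)
echo "1" > "$tmpdir/geek.txt"

# Disk usage counted in one-byte blocks, then in 1,024-byte blocks
in_bytes=$(du --block-size=1 "$tmpdir/geek.txt" | cut -f1)
in_kib=$(du --block-size=1K "$tmpdir/geek.txt" | cut -f1)

# -B is the short form of --block-size
in_kib_short=$(du -B 1K "$tmpdir/geek.txt" | cut -f1)

echo "bytes:            $in_bytes"
echo "1K blocks:        $in_kib"
echo "1K blocks via -B: $in_kib_short"

rm -rf "$tmpdir"
```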

If you want the sizes reported in the most appropriate block size according to the disk space used by the directories and files, use the -h (human-readable) option:

du -h

To see the apparent size of the file rather than the amount of hard drive space used to store the file, use the --apparent-size option:

du --apparent-size

You can combine this with the -a (all) option to see the apparent size of each file:

du --apparent-size -a

Each file is listed, along with its apparent size.
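
To compare the two figures for one file, this sketch recreates the two-byte file from earlier in a temporary directory and reports both values in bytes. It assumes GNU du, and the on-disk figure depends on your file system:

```shell
tmpdir=$(mktemp -d)
echo "1" > "$tmpdir/geek.txt"

# Apparent size: the number of bytes of content in the file
apparent=$(du --apparent-size --block-size=1 "$tmpdir/geek.txt" | cut -f1)

# Disk usage: the space allocated to hold those bytes
on_disk=$(du --block-size=1 "$tmpdir/geek.txt" | cut -f1)

echo "apparent size: $apparent bytes"
echo "disk usage:    $on_disk bytes"

rm -rf "$tmpdir"
```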

Displaying Only Totals

If you want du to report only the total for the directory, use the -s (summarize) option. You can also combine this with other options, such as the -h (human-readable) option:

du -h -s

Here, we'll use it with the --apparent-size option:

du --apparent-size -s
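
As a quick check of what -s does, this sketch runs du with and without the option on a hypothetical scratch tree. The summary is the same line that the full listing ends with:

```shell
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/a/b"
echo "data" > "$tmpdir/a/b/file.txt"

full=$(du "$tmpdir")        # one line per directory, grand total last
summary=$(du -s "$tmpdir")  # only the grand total

echo "$full"
echo "---"
echo "$summary"

rm -rf "$tmpdir"
```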

Displaying Modification Times

To see the time and date of the most recent modification to any file in a directory or its subdirectories, use the --time option:

du --time -d 2

Strange Results?

If you see strange results from du, especially when you cross-reference sizes to the output from other commands, it's usually due to the different block sizes to which different commands can be set or those to which they default. It could also be due to the differences between real file sizes and the disk space required to store them.

If you need to match the output of other commands, experiment with the --block option in du.
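
For example, ls with the -s (size) option also reports allocated space in blocks. The sketch below, which assumes GNU ls and du and uses a hypothetical scratch file, forces both commands to use 1,024-byte blocks so their figures can be compared directly:

```shell
tmpdir=$(mktemp -d)
echo "1" > "$tmpdir/geek.txt"

# ls -s prints the allocated size before the file name
ls_blocks=$(ls -s --block-size=1K "$tmpdir/geek.txt" | awk '{print $1}')

# du prints the allocated size, a tab, then the file name
du_blocks=$(du --block-size=1K "$tmpdir/geek.txt" | cut -f1)

echo "ls -s: $ls_blocks"
echo "du:    $du_blocks"

rm -rf "$tmpdir"
```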
