
Bug 1971186

Summary: use fstrim at conclusion of installations
Product: [Fedora] Fedora
Reporter: Chris Murphy <bugzilla>
Component: lorax
Assignee: Brian Lane <bcl>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 35
CC: anaconda-maint-list, bcl, davdunc, davide, fedoraproject, jonathan, kellin, ngompa13, reallylongword, vanmeeuwen+fedora, vponcova, w
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-05-18 23:07:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1972376

Description Chris Murphy 2021-06-12 19:50:12 UTC
Description of problem:

There can be various deleted files that were needed for the installation (including downloaded RPMs) that will remain on backing media. This is a problem in particular for images. Consider running fstrim on all /mnt/sysimage file systems at the conclusion of the installation.

Version-Release number of selected component (if applicable):
anaconda-35.16-2.fc35

How reproducible:
Always, at least for images


Steps to Reproduce:

# ls -ls
total 890660
699896 -rw-r--r--. 1 root root 5368709120 Jun 12 13:08 Fedora-Cloud-Base-Rawhide-20210605.n.0.aarch64.raw
# fstrim -v /mnt
/mnt: 593 MiB (621817856 bytes) trimmed
# ls -ls
total 890656
699892 -rw-r--r--. 1 root root 5368709120 Jun 12 13:08 Fedora-Cloud-Base-Rawhide-20210605.n.0.aarch64.raw
# umount /mnt
# mount /dev/mapper/loop0p2 /mnt
# fstrim -v /mnt
/mnt: 3.6 GiB (3906367488 bytes) trimmed
# ls -ls
total 870316
679552 -rw-r--r--. 1 root root 5368709120 Jun 12 13:08 Fedora-Cloud-Base-Rawhide-20210605.n.0.aarch64.raw
# 

Actual results:

~20 MiB of garbage removed when doing fstrim.
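
The ~20 MiB figure can be checked from the first column of the `ls -ls` output above, assuming GNU ls is reporting allocated size in its default 1 KiB blocks. A minimal sketch of the arithmetic (the variable names are only illustrative):

```shell
#!/bin/sh
# Allocated sizes of the raw image, taken from the first column of `ls -ls`
# in the transcript above (assumed to be 1 KiB blocks, the GNU ls default).
before_kib=699896   # before any fstrim
after_kib=679552    # after fstrim on both mounted file systems

freed_kib=$(( before_kib - after_kib ))
# Round to the nearest MiB instead of truncating.
freed_mib=$(( (freed_kib + 512) / 1024 ))

echo "freed: ${freed_kib} KiB (~${freed_mib} MiB)"
# prints: freed: 20344 KiB (~20 MiB)
```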

Expected results:

Images should be as small as possible prior to compression.

Additional info:

Comment 1 Chris Murphy 2021-06-19 22:04:28 UTC
A more extreme example. Start with this:

325M -rw-r--r--. 1 root root 325M Jun 19 15:26 Fedora-Cloud-Base-Rawhide-20210619.n.0.x86_64.raw.xz

After unxz:
4.1G -rw-r--r--. 1 root root 5.0G Jun 19 15:26 Fedora-Cloud-Base-Rawhide-20210619.n.0.x86_64.raw

Image after losetup->kpartx->mount p2 (btrfs)->fstrim btrfs
412M -rw-r--r--. 1 root root 5.0G Jun 19 15:46 Fedora-Cloud-Base-Rawhide-20210619.n.0.x86_64.raw

Is it appropriate to put this in kickstart as a %post script?
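
For anyone wanting to repeat the measurement, the losetup->kpartx->mount->fstrim sequence could look roughly like the sketch below. This is not taken from any Fedora tool: `DRY_RUN`, `run`, `trim_image_partition`, the image name `disk.raw` and the `/dev/loop0` placeholder are all illustrative assumptions. By default it only prints the commands; running them for real needs root and has no cleanup on failure.

```shell
#!/bin/sh
# Sketch of the losetup -> kpartx -> mount -> fstrim sequence.
# All names here are hypothetical. With DRY_RUN non-empty (the default)
# commands are printed, not executed; use DRY_RUN= as root to execute them.
DRY_RUN="${DRY_RUN-1}"

run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

trim_image_partition() {
    image="$1"   # raw disk image file
    part="$2"    # number of the partition holding the file system to trim
    mnt="$3"     # existing, empty mount point

    if [ -n "$DRY_RUN" ]; then
        echo "losetup --find --show $image"
        loopdev=/dev/loop0      # placeholder; the real device name may differ
    else
        loopdev="$(losetup --find --show "$image")" || return 1
    fi

    run kpartx -av "$loopdev"   # map the partitions under /dev/mapper
    run mount "/dev/mapper/$(basename "$loopdev")p${part}" "$mnt"
    run fstrim -v "$mnt"        # discard blocks the file system is not using
    run umount "$mnt"
    run kpartx -d "$loopdev"    # remove the partition mappings
    run losetup -d "$loopdev"   # detach the loop device
}

trim_image_partition disk.raw 2 /mnt
```

Comparing `ls -ls` (or `du`) on the image before and after shows whether the discards actually reached the backing file.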

Comment 2 Chris Murphy 2021-06-19 22:38:50 UTC
Set to block "Make btrfs the default file system for Fedora Cloud" just for tracking. The issue affects other file systems and images too, so fixing this would be general purpose.

Comment 3 Chris Murphy 2021-06-19 23:34:17 UTC
https://kojipkgs.fedoraproject.org//packages/Fedora-Cloud-Base/Rawhide/20210619.n.0/data/logs/image/oz-x86_64.log

> echo "Zeroing out empty space."
> # This forces the filesystem to reclaim space from deleted files
> dd bs=1M if=/dev/zero of=/var/tmp/zeros || :
> rm -f /var/tmp/zeros
> echo "(Don't worry -- that out-of-space error was expected.)"

Is this obsolete now? I think we're better off replacing it with fstrim instead. Also, this dd command must exclude baremetal installs or they'd take forever zeroing out the media.

Comment 4 Chris Murphy 2021-06-23 01:48:33 UTC
lorax/src/pylorax/installer.py:433:                    # For image installs, run fstrim to discard unused blocks. This way
lorax/src/pylorax/creator.py:558:    rc = execWithRedirect("/usr/sbin/fsck.ext4", ["-y", "-f", "-E", "discard", rootfs_img])

Comment 5 Chris Murphy 2021-06-27 03:04:29 UTC
See also:

creating smaller cloud images
https://pagure.io/cloud-sig/issue/335

QCOW images in recent Fedora-Cloud-Base-Vagrant libvirt boxes for Rawhide are not sparse 
https://pagure.io/cloud-sig/issue/340

Comment 6 Neal Gompa 2021-07-15 08:25:20 UTC
(In reply to Chris Murphy from comment #3)
> https://kojipkgs.fedoraproject.org//packages/Fedora-Cloud-Base/Rawhide/
> 20210619.n.0/data/logs/image/oz-x86_64.log
> 
> > echo "Zeroing out empty space."
> > # This forces the filesystem to reclaim space from deleted files
> > dd bs=1M if=/dev/zero of=/var/tmp/zeros || :
> > rm -f /var/tmp/zeros
> > echo "(Don't worry -- that out-of-space error was expected.)"
> 
> Is this obsolete now? I think we're better off replacing it with fstrim
> instead. Also, this dd command must exclude baremetal installs or they'd
> take forever zeroing out the media.

This is *definitely* not obsolete. Removing this caused us to go from ~300MB to ~900MB. I'm putting it back and adding a sync in https://pagure.io/fedora-kickstarts/pull-request/824

Comment 7 Chris Murphy 2021-07-16 03:14:33 UTC
>This is *definitely* not obsolete. Removing this caused us to go from ~300MB to ~900MB. I'm putting it back and adding a sync in https://pagure.io/fedora-kickstarts/pull-request/824

It's a mirage, as I explained here: https://pagure.io/cloud-sig/issue/340#comment-743548

It's pointless to write zeros, delete them *and* do fstrim. Pick one. Writing zeros alone gives a fully allocated image that compresses rather well. fstrim alone gives a smaller image and takes much less time to create. Doing both gets you the same size as the fstrim-only option, but with massive write amplification and disk contention for no benefit.
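
A toy illustration of the difference, using two plain 16 MiB files rather than a real image build: one is filled with zeros (standing in for the zero method) and one is a hole (standing in for trimmed free space). The file names, sizes and helper functions are arbitrary assumptions, and it relies on GNU coreutils (`du -B1`, `truncate`).

```shell
#!/bin/sh
# Toy illustration only; file names and the 16 MiB size are arbitrary.
workdir="$(mktemp -d)"

# "Zero method": the free space is physically written with zeros.
dd if=/dev/zero of="$workdir/zeroed.img" bs=1M count=16 conv=fsync 2>/dev/null
# "Trim method": the free space is a hole, nothing is allocated for it.
truncate -s 16M "$workdir/trimmed.img"

apparent()   { du -B1 --apparent-size "$1" | cut -f1; }  # logical size
allocated()  { du -B1 "$1" | cut -f1; }                  # bytes on disk
compressed() { gzip -c < "$1" | wc -c | tr -d ' '; }     # size after gzip

zeroed_apparent="$(apparent "$workdir/zeroed.img")"
zeroed_allocated="$(allocated "$workdir/zeroed.img")"
zeroed_gzip="$(compressed "$workdir/zeroed.img")"

trimmed_apparent="$(apparent "$workdir/trimmed.img")"
trimmed_allocated="$(allocated "$workdir/trimmed.img")"
trimmed_gzip="$(compressed "$workdir/trimmed.img")"

echo "zeroed:  apparent=$zeroed_apparent allocated=$zeroed_allocated gzip=$zeroed_gzip"
echo "trimmed: apparent=$trimmed_apparent allocated=$trimmed_allocated gzip=$trimmed_gzip"

rm -rf "$workdir"
```

Both files read back as the same 16 MiB of zeros and compress to the same size; only the space allocated on disk differs, and that depends on the file system backing the temporary directory.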

Comment 8 Neal Gompa 2021-07-16 12:51:00 UTC
Fine, I changed to do *just* fstrim and added a sync right after: https://pagure.io/fedora-kickstarts/pull-request/826

Let's see how that goes...

Comment 9 Neal Gompa 2021-07-16 13:48:58 UTC
It does not work: https://koji.fedoraproject.org/koji/taskinfo?taskID=72006902

The result is 840MB!

So I'll switch to the other way, and let's see how that goes...

Comment 10 Neal Gompa 2021-07-17 01:59:27 UTC
It works with the zero method (with no fstrim): https://koji.fedoraproject.org/koji/taskinfo?taskID=72011344

The result is 279MB.

Comment 11 Chris Murphy 2021-07-17 04:12:18 UTC
https://pagure.io/fedora-kickstarts/pull-request/826#request_diff
Sorry for the lack of clarity. Running fstrim before sync won't work: the sync is what commits the file deletion to disk, and only once the deletion is committed can fstrim do the correct thing.
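
The ordering described here could be sketched as follows. This is not the content of the pull request: `DRY_RUN`, `run`, `trim_after_cleanup` and the `/mnt/sysimage` argument are illustrative assumptions, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only; all names are hypothetical. With DRY_RUN non-empty (the
# default) commands are printed; use DRY_RUN= as root to execute them.
DRY_RUN="${DRY_RUN-1}"

run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

trim_after_cleanup() {
    mountpoint="$1"
    # 1. Commit pending deletions, so the file system records the blocks as free.
    run sync
    # 2. Only now can fstrim discard those blocks.
    run fstrim -v "$mountpoint"
}

trim_after_cleanup /mnt/sysimage
```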

Comment 12 Ben Cotton 2021-08-10 13:07:43 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 35 development cycle.
Changing version to 35.

Comment 13 Brian Lane 2021-09-24 23:27:30 UTC
I finally got some time to look into this. First off, the koji createImage task appears to be using Oz so none of what I'm about to type applies :)

I *did* manage to see a slight improvement in size by adding fstrim to livemedia-creator. PR is here:
https://github.com/weldr/lorax/pull/1172


I've added output of the image size before and after fstrim and fallocate --dig-holes, and turned on verbose output for fstrim and fallocate.
For partitioned disk, filesystem image, and live ISO builds using the fedora-minimal.ks from lorax, the image file that lmc creates shrinks as follows:

disk - 237MiB smaller
filesystem - 133MiB smaller
minimal iso (ext4 install.img) - 158MiB smaller
live installer iso - 751MiB smaller

This is measured using du -B1 on the image file before and after running fstrim+fallocate. You can now see these numbers in program.log when running livemedia-creator.
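
A rough way to take the same kind of measurement by hand, assuming GNU du and util-linux fallocate. The helper names (`allocated_bytes`, `shrink_mib`, `dig_and_report`) are made up for this sketch and are not what livemedia-creator uses internally.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical.

# Bytes actually allocated on disk for a file (GNU du, 1-byte blocks).
allocated_bytes() {
    du -B1 "$1" | cut -f1
}

# Whole MiB saved between a "before" and an "after" byte count.
shrink_mib() {
    echo $(( ($1 - $2) / 1048576 ))
}

# Punch holes over the zero-filled ranges of an image file, then report
# how much smaller its on-disk allocation became.
dig_and_report() {
    img="$1"
    before="$(allocated_bytes "$img")"
    fallocate --dig-holes "$img" || return 1
    after="$(allocated_bytes "$img")"
    echo "$img: $(shrink_mib "$before" "$after") MiB smaller"
}
```

Usage would be something like `dig_and_report disk.img` on an unmounted image file. fstrim itself has to be run on the mounted file system beforehand.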

Comment 14 Chris Murphy 2022-05-19 15:49:26 UTC
Looks like I never opened a releng issue to make sure the VMs have discard="unmap" set so that fstrim is effective.

Comment 15 Chris Murphy 2022-05-19 15:49:36 UTC
https://pagure.io/releng/issue/10801