
Bug 2238982

Summary: System acts as if there is no space in /home though nearly 2G is available
Product: [Fedora] Fedora
Reporter: lnie <lnie>
Component: kernel
Assignee: fedora-kernel-btrfs
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 39
CC: acaringi, adscvr, airlied, alciregi, awilliam, bskeggs, bugzilla, davide, esandeen, fedora-kernel-btrfs, hdegoede, hpa, igor.raits, jarod, josef, kernel-maint, lgoncalv, linville, masami256, mchehab, ngompa13, ptalbert, steved
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version: kernel-6.5.5-300.fc39 kernel-6.5.5-100.fc37 kernel-6.5.5-200.fc38
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-09-25 01:43:10 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2143447
Attachments:
Description | Flags
screencast | none
journal | none
screencast from /home <10G system | none
screencast from /home > 10G system | none
mount swapon btrfs inspect-internal dump-super and dmesg | none

Description lnie 2023-09-14 16:05:10 UTC
Boot Fedora-Workstation-Live-x86_64-39_Beta-1.1.iso in a VM, create an 8G btrfs volume with /home as its mount point, create partitions for /boot and /boot/efi, and finish the installation.
Boot into the newly installed system and then try to fill up the disk space.
I used scp. You will find that you cannot copy files to that system or create files on it;
apps also don't work well, and gnome-shell quits frequently.
Because ~2G of space is still reported as free, no low-disk-space warning pops up.
It seems this bug only happens when a btrfs volume smaller than 10G is mounted at /home;
ext4/xfs on LVM or standard partitions are not affected.

Reproducible: Always

Comment 1 lnie 2023-09-14 16:06:53 UTC
Created attachment 1988841 [details]
screencast

Comment 2 lnie 2023-09-14 16:07:42 UTC
Created attachment 1988842 [details]
journal

Comment 3 lnie 2023-09-15 10:20:58 UTC
Proposing as a Final FE, as it acts like a disaster and may cause data loss. Besides, it arguably violates https://fedoraproject.org/wiki/Fedora_39_Final_Release_Criteria#Default_application_functionality

Comment 4 Adam Williamson 2023-09-15 17:17:37 UTC
Can you test if this was the same in F38? I wonder if it's to do with the compression feature...

Comment 5 Josef Bacik 2023-09-15 20:50:53 UTC
Can I get a btrfs filesystem usage /home?

Comment 6 lnie 2023-09-17 14:58:03 UTC
> Can you test if this was the same in F38? I wonder if it's to do with the compression feature...

Yes, I also saw this problem in F38.

Comment 7 lnie 2023-09-17 14:59:06 UTC
> Can I get a btrfs filesystem usage /home?

Sure, here it is:

lnie@localhost-live:~$ sudo btrfs filesystem usage /home
Overall:
    Device size:   8.00GiB
    Device allocated:   8.00GiB
    Device unallocated:   1.00MiB
    Device missing:     0.00B
    Device slack:     0.00B
    Used:   6.00GiB
    Free (estimated):   1.99GiB (min: 1.99GiB)
    Free (statfs, df):   1.99GiB
    Data ratio:      1.00
    Metadata ratio:      1.00
    Global reserve:   5.98MiB (used: 0.00B)
    Multiple profiles:        no

Data,single: Size:7.99GiB, Used:6.00GiB (75.09%)
   /dev/vda1   7.99GiB

Metadata,single: Size:8.00MiB, Used:6.58MiB (82.23%)
   /dev/vda1   8.00MiB

System,single: Size:4.00MiB, Used:16.00KiB (0.39%)
   /dev/vda1   4.00MiB

Unallocated:
   /dev/vda1   1.00MiB
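
A minimal Python sketch of how the report above can be read: it pulls the data, metadata and unallocated figures out of the text and computes the remaining headroom for each. The function names (`to_bytes`, `summarize_usage`) are hypothetical helpers, and the regular expressions assume the line layout shown in this comment; other btrfs-progs versions may format the output differently.

```python
import re

UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}
SIZE = r"([\d.]+)(B|KiB|MiB|GiB|TiB)"


def to_bytes(number: str, unit: str) -> float:
    """Convert a (number, unit) pair such as ('8.00', 'MiB') to bytes."""
    return float(number) * UNITS[unit]


def summarize_usage(report: str) -> dict:
    """Extract free data, free metadata and unallocated bytes from the report."""
    data = re.search(rf"Data,\w+: Size:{SIZE}, Used:{SIZE}", report)
    meta = re.search(rf"Metadata,\w+: Size:{SIZE}, Used:{SIZE}", report)
    unalloc = re.search(rf"Device unallocated:\s+{SIZE}", report)
    if not (data and meta and unalloc):
        raise ValueError("report is missing an expected line")
    return {
        "data_free": to_bytes(*data.group(1, 2)) - to_bytes(*data.group(3, 4)),
        "metadata_free": to_bytes(*meta.group(1, 2)) - to_bytes(*meta.group(3, 4)),
        "unallocated": to_bytes(*unalloc.group(1, 2)),
    }


# Relevant lines copied from the report in this comment.
REPORT = """\
    Device unallocated:   1.00MiB
Data,single: Size:7.99GiB, Used:6.00GiB (75.09%)
Metadata,single: Size:8.00MiB, Used:6.58MiB (82.23%)
"""

MIB = 1024**2
summary = summarize_usage(REPORT)
for name, value in summary.items():
    print(f"{name}: {value / MIB:.2f} MiB")
```

Read this way, the data block group still has about 1.99GiB free, while metadata has under 1.5MiB free and only 1.00MiB of the device is unallocated.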

Comment 8 lnie 2023-09-18 03:03:12 UTC
Created attachment 1989292 [details]
screencast from /home <10G system

Comment 9 lnie 2023-09-18 03:08:09 UTC
Created attachment 1989293 [details]
screencast from /home > 10G system

Comment 10 lnie 2023-09-18 03:11:34 UTC
As you can see from the attached screencasts, on the /home <10G system, when you try to remove or create a file, the horrible "Read-only system" message is printed, though after a reboot you will find that the file you were trying to remove has in fact been removed. It's confusing in a pretty bad way. I'm thinking maybe we should also consider this as a blocker :(
For developers: I saw "writing error in swap file" printed when I tried to save a file on the /home >10G system, and "can't open file for writing" on the /home <10G system.

Comment 11 Josef Bacik 2023-09-18 12:54:40 UTC
Huh interesting, we're not allocating enough metadata.  We have regression tests for this, let me see if I can reproduce it locally.

Comment 12 Josef Bacik 2023-09-18 13:01:31 UTC
Can I see the output from the following commands

mount
swapon
btrfs inspect-internal dump-super <device that /home lives on>

and then for the case that the file system flips read only can I get the dmesg output as well?

Comment 13 Josef Bacik 2023-09-18 19:32:38 UTC
I'm seeing a few issues here, and I've nailed down a few of them, and some of them I'm just guessing at because I don't have all the information.

1) There's a reporting problem.  In the case where it shows you have 2GiB free, you have no metadata space, and that's why you're getting an -ENOSPC.  'df' is not showing the right amount because you filled up metadata space so much that it messed up our internal "does this fs have enough metadata to do anything?" logic, so we're showing you the full data you have free when we should be showing 0 because there's no metadata.  This is why we recommend using "btrfs filesystem usage", as it shows the whole answer.  However, this is a bug, and I've sent a patch upstream: https://lore.kernel.org/linux-btrfs/a9e6b02e9eb7a0532c401a898661b0511c31d0e8.1695047676.git.josef@toxicpanda.com/

2) We really shouldn't let you get that close to full on metadata, we're failing to anticipate the data usage.  In your case you're using a 512mb chunk size, and this results in a lot of checksums.  But it also creates this racy environment where we think we have plenty of slack space, until suddenly we don't.  For data we assume reservation size == actual use (which it is in this case, but isn't in compression), so we're allowing metadata to overcommit in a situation where we could lose that slack space and then we're in trouble.  I was able to reproduce this particular behavior and fixed the underlying issue here https://lore.kernel.org/linux-btrfs/b97e47ce0ce1d41d221878de7d6090b90aa7a597.1695065233.git.josef@toxicpanda.com/

3) Flipping read only.  This is concerning and I don't have the dmesg to validate my theory, but I assume it is a result of the situation that is caused by problem #2.  In my testing I got myself down to <1mib of free metadata, and if you hit the timing just right we could still think we have room for the overcommit and then fail to make a metadata allocation and then flip read only because of that.  The fix is to fix the underlying cause, which is what #2 fixes.  However that's just my best guess, I'd need to see the dmesg to validate we're flipping readonly in the allocation path.
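
As an illustration of point 1 only, here is a toy Python model of the intended reporting behaviour: when free metadata cannot cover a reserve, report 0 available instead of the free data space. This is not the kernel's logic or the linked patch; `reported_available` is a hypothetical function, and using the global reserve as the threshold is an assumption made for the example. The figures are the ones from Comment 7.

```python
MIB = 1024**2
GIB = 1024**3


def reported_available(data_free: int, metadata_free: int, reserve: int) -> int:
    """Toy model: report no space when free metadata cannot cover the reserve."""
    if metadata_free < reserve:
        return 0
    return data_free


# Figures from the "btrfs filesystem usage" output in Comment 7.
data_free = int(1.99 * GIB)               # Data: 7.99GiB size, 6.00GiB used
metadata_free = int((8.00 - 6.58) * MIB)  # Metadata: 8.00MiB size, 6.58MiB used
global_reserve = int(5.98 * MIB)          # Global reserve: 5.98MiB

# Assumed threshold: the global reserve. Metadata is below it, so this prints 0.
print(reported_available(data_free, metadata_free, global_reserve))
```

Under this model the 8G file system would have shown 0 available rather than 1.99GiB, which matches the behaviour users actually experienced.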

Comment 14 lnie 2023-09-19 01:49:30 UTC
Created attachment 1989483 [details]
mount swapon btrfs inspect-internal dump-super and dmesg

Comment 15 lnie 2023-09-19 02:28:09 UTC
I have a maybe silly question: as you can see, on the /home <10G btrfs system, metadata-size/data-size is ~0.00097, while on the /home >10G btrfs system that number is ~0.0175, roughly 18 times larger. Isn't that a problem?

lnie@fedora:~$  sudo btrfs filesystem usage /home
[sudo] password for lnie:
Overall:
    Device size:  15.00GiB
    Device allocated:  15.00GiB
    Device unallocated:   1.00MiB
    Device missing:     0.00B
    Device slack:     0.00B
    Used:  14.74GiB
    Free (estimated):   8.01MiB (min: 8.01MiB)
    Free (statfs, df):   8.01MiB
    Data ratio:      1.00
    Metadata ratio:      1.00
    Global reserve:  15.23MiB (used: 0.00B)
    Multiple profiles:        no

Data,single: Size:14.74GiB, Used:14.73GiB (99.95%)
   /dev/vda1  14.74GiB

Metadata,single: Size:264.00MiB, Used:15.69MiB (5.94%)
   /dev/vda1 264.00MiB

System,single: Size:4.00MiB, Used:16.00KiB (0.39%)
   /dev/vda1   4.00MiB

Unallocated:
   /dev/vda1   1.00MiB
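
For reference, the metadata-to-data size ratio for the two file systems can be worked out from the two "btrfs filesystem usage" reports in this bug. The short Python sketch below only redoes that arithmetic; the variable names are illustrative, and the results carry the rounding of the two-decimal figures in the reports.

```python
MIB = 1024**2
GIB = 1024**3

# Block-group sizes as printed by "btrfs filesystem usage" in this bug.
small_metadata, small_data = 8.00 * MIB, 7.99 * GIB     # 8G /home
large_metadata, large_data = 264.00 * MIB, 14.74 * GIB  # 15G /home

small_ratio = small_metadata / small_data
large_ratio = large_metadata / large_data

print(f"8G  /home: metadata/data = {small_ratio:.5f}")
print(f"15G /home: metadata/data = {large_ratio:.5f}")
print(f"factor between them      = {large_ratio / small_ratio:.1f}")
```

The ratios come out at about 0.00098 and 0.0175, a factor of roughly 18.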

Comment 16 Josef Bacik 2023-09-19 15:24:04 UTC
OK, perfect: dmesg shows that you ran out of metadata space, so my second fix will resolve that.

The >10gib fs is doing what I want, allocating another metadata block group.

Btrfs divides the disk up into chunks (called block groups) and dedicates these chunks to either Data, Metadata, or System chunks.  For smaller file systems like yours we try to make the metadata chunks smaller, 256mib at a time.  As you've noticed this means that you're losing about 1% of your disk to metadata.  In the case where you're just blowing large data chunks onto your device this doesn't make much sense, unfortunately we don't know that's what's going to happen so we have to make it work for the general case.  If you decided to take a bunch of snapshots you'd use more metadata, or if you had lots of small files you'd also use a lot more metadata.  We avoid allocating new chunks except when we absolutely need to in order to reduce the amount of metadata overhead there is.  In your case we actually were too conservative and ended up causing other problems.  It's a tricky balancing act with different tradeoffs on either side.
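
The block-group behaviour described above can be sketched with a toy Python model. `ToyDevice` is hypothetical and heavily simplified: it only tracks how many bytes are dedicated to each chunk type and refuses an allocation that does not fit in the unallocated space. The 4MiB System and 8MiB initial Metadata sizes are taken from the usage reports in this bug; treating the 264MiB on the larger file system as 8MiB plus one 256MiB chunk is an inference, not something confirmed in the thread.

```python
from dataclasses import dataclass, field

MIB = 1024**2
GIB = 1024**3


@dataclass
class ToyDevice:
    """Toy model of a device carved into Data, Metadata and System chunks."""
    size: int
    allocated: dict = field(
        default_factory=lambda: {"Data": 0, "Metadata": 0, "System": 0}
    )

    @property
    def unallocated(self) -> int:
        return self.size - sum(self.allocated.values())

    def allocate(self, kind: str, chunk: int) -> bool:
        """Dedicate `chunk` bytes to `kind`; fail if it does not fit."""
        if chunk > self.unallocated:
            return False
        self.allocated[kind] += chunk
        return True


# 8G case: data claims nearly everything before metadata ever grows.
small = ToyDevice(size=8 * GIB)
small.allocate("System", 4 * MIB)
small.allocate("Metadata", 8 * MIB)
small.allocate("Data", small.unallocated - 1 * MIB)
grew_small = small.allocate("Metadata", 256 * MIB)  # no room left

# 15G case: a 256MiB metadata chunk is allocated while space is still free.
large = ToyDevice(size=15 * GIB)
large.allocate("System", 4 * MIB)
large.allocate("Metadata", 8 * MIB)
grew_large = large.allocate("Metadata", 256 * MIB)
large.allocate("Data", large.unallocated - 1 * MIB)

print("8G  metadata grew:", grew_small, "| metadata MiB:", small.allocated["Metadata"] // MIB)
print("15G metadata grew:", grew_large, "| metadata MiB:", large.allocated["Metadata"] // MIB)
```

In this model the 8G device ends up with 1MiB unallocated and metadata stuck at 8MiB, while the 15G device ends up with 264MiB of metadata, which lines up with the two reports. The real allocator is more flexible than this (for example in how it sizes chunks), so this only illustrates the ordering problem, not the actual code path.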

Comment 17 Davide Cavalca 2023-09-20 20:38:53 UTC
The two fixes Josef mentioned are queued up for 6.5.5: https://gitlab.com/cki-project/kernel-ark/-/commits/fedora-6.5?ref_type=heads

Comment 18 Fedora Update System 2023-09-24 00:51:10 UTC
FEDORA-2023-0defc5c6ec has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-0defc5c6ec

Comment 19 Fedora Update System 2023-09-24 19:57:48 UTC
FEDORA-2023-4c8291ba6a has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-4c8291ba6a

Comment 20 Fedora Update System 2023-09-24 20:01:52 UTC
FEDORA-2023-3100e4d61c has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2023-3100e4d61c

Comment 21 Fedora Update System 2023-09-25 01:33:27 UTC
FEDORA-2023-3100e4d61c has been pushed to the Fedora 37 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-3100e4d61c`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-3100e4d61c

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 22 Fedora Update System 2023-09-25 01:43:10 UTC
FEDORA-2023-0defc5c6ec has been pushed to the Fedora 39 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 23 Fedora Update System 2023-09-25 01:44:15 UTC
FEDORA-2023-4c8291ba6a has been pushed to the Fedora 38 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-4c8291ba6a`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-4c8291ba6a

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 24 Fedora Update System 2023-09-27 02:37:54 UTC
FEDORA-2023-3100e4d61c has been pushed to the Fedora 37 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 25 Fedora Update System 2023-09-27 02:43:32 UTC
FEDORA-2023-4c8291ba6a has been pushed to the Fedora 38 stable repository.
If problem still persists, please make note of it in this bug report.