Bug 1406435

Summary: If no limit to memory provided on container /tmp ends up at 4EB
Product: [Fedora] Fedora
Reporter: James Hogarth <james.hogarth>
Component: oci-systemd-hook
Assignee: Daniel Walsh <dwalsh>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 25
CC: amurdaca, dwalsh, james.hogarth, lsm5, nalin
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: oci-systemd-hook-0.1.6-1.gitfe22236.fc25 oci-systemd-hook-0.1.6-1.gitfe22236.fc26
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-03-14 17:23:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description James Hogarth 2016-12-20 14:09:33 UTC
Description of problem:
When using a systemd based container the oci-systemd-hook makes use of /sys/fs/cgroup/memory/memory.limit_in_bytes to determine the size to make /tmp 

If not explicitly limiting the container this results in a 'limit' of 9223372036854771712/2 ... or 4EB to make it readable.

a) this is insane as few systems will have 4EB of RAM available to back this ;)
b) this causes issues on anything that uses the detected size to work out max sizing 
c) this breaks 32bit apps in a lovely subtle way. Since getconf FILESIZEBITS /tmp returns 32, the assumption is that large file support is not required. So statfs() (and family) get called instead of the 64bit equivalents and the application promptly fails with: statfs("/tmp/isjbeOWOh", 0xffa15b60)    = -1 EOVERFLOW (Value too large for defined data type)
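
To illustrate point c), here is a minimal sketch of why a 4EB tmpfs cannot be described through the 32bit statfs() interface. It assumes a 4096-byte block size (typical, but not taken from this report) and that the non-LFS structure holds block counts in 32-bit fields; the function names are hypothetical and this is not code from any of the applications involved.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed tmpfs block size; 4096 is common on x86 but is an assumption here. */
#define ASSUMED_BLOCK_SIZE 4096ULL

/* Number of whole filesystem blocks needed to describe size_bytes. */
uint64_t block_count(uint64_t size_bytes, uint64_t block_size)
{
    return size_bytes / block_size;
}

/* True if the block count fits the 32-bit field a non-LFS 32bit
 * statfs() caller receives; when it does not fit, the call is
 * expected to fail with EOVERFLOW instead of truncating. */
bool fits_32bit_statfs(uint64_t size_bytes, uint64_t block_size)
{
    return block_count(size_bytes, block_size) <= UINT32_MAX;
}
```

With these assumptions, a /tmp of 9223372036854771712/2 bytes needs about 2^50 blocks, far above the 2^32 - 1 that a 32-bit field can hold, whereas the 64M /run mount from the same container fits comfortably.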


Version-Release number of selected component (if applicable):
docker-1.12.3-12.git97974ae.fc25.x86_64
oci-systemd-hook-0.1.4-3.git41491a3.fc25.x86_64


How reproducible:
deterministic

Steps to Reproduce:
1. Create the Dockerfile:

cat > Dockerfile.systemd-test << EOF
FROM centos:latest

RUN yum -y install systemd bash
ENTRYPOINT ["/sbin/init"]
EOF


2. docker build -f Dockerfile.systemd-test -t systemd-test . 
3. docker run -d --name systemd-test systemd-test
4. docker exec systemd-test df -h

Actual results:
Filesystem                                             Size  Used Avail Use% Mounted on
/dev/mapper/luks-533c35ab-2572-45ad-b18b-5203c8b8563f  231G  183G   46G  81% /
tmpfs                                                  3.9G     0  3.9G   0% /dev
tmpfs                                                  3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/mapper/luks-533c35ab-2572-45ad-b18b-5203c8b8563f  231G  183G   46G  81% /etc/hosts
shm                                                     64M     0   64M   0% /dev/shm
tmpfs                                                   64M   16K   64M   1% /run
tmpfs                                                  4.0E     0  4.0E   0% /tmp


Expected results:
A /tmp that is a sensible size

Additional info:
This happens on a RHEL7 host as well and also with a fedora container

Comment 1 Daniel Walsh 2016-12-20 14:22:56 UTC
I believe the limit is half of physical memory for a tmpfs, which is the same for the host.

   Mount options for tmpfs
       size=nbytes
              Override  default  maximum  size of the filesystem.  The size is
              given in bytes, and rounded up to entire pages.  The default  is
              half  of the memory.  The size parameter also accepts a suffix %
              to limit this tmpfs instance to that percentage of your physical
              RAM:  the default, when neither size nor nr_blocks is specified,
              is size=50%


Are you saying we are doing something wrong that is allowing this to go beyond that?

Comment 2 James Hogarth 2016-12-20 14:34:42 UTC
For anyone that bumps into this in their applications there are a couple of workarounds that can be employed:

1) use docker run -m XG -d systemd

Where X is the memory to limit the guest to, rather than permitting infinite which is the default. The hook then uses X/2 for the /tmp size 

2) use docker run -v /tmp -d systemd

This then has /tmp in the volumes listed in the container config so the hook skips mounting of /tmp letting it get the sensible sizing of the underlying volume presented. Note that --tmpfs /tmp:rw,mode=1777,size=10G doesn't work as it doesn't get listed in volumes, which is what oci-systemd-hook checks. 

It might be sensible for the hook to also check any tmpfs structures rather than just volume mounts as a nice way of configuring this.

This workaround has the distinct disadvantage that volumes don't get removed by default when a container is removed so could end up with space usage creep.

Comment 3 James Hogarth 2016-12-20 14:41:51 UTC
Yeah Dan - that might be the intention but the path used to get that info doesn't result in the expected values ...

https://github.com/projectatomic/oci-systemd-hook/blob/master/src/systemdhook.c#L482

That gets the limit as half the limit set by the memory cgroup:

/sys/fs/cgroup/memory/memory.limit_in_bytes

But in a default setup there is no memory limit, so this defaults to maximum accessible on the architecture ... which is ~8EB ... not the actual physical RAM size on the host.

This then results in the 4EB /tmp assigned ... which is kind of crazy ;)
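
The arithmetic behind that number can be sketched as follows. This is a simplified model of "half the cgroup limit, expressed in KB", not the hook's actual code; the function names and the exact order of the divisions are assumptions.

```c
#include <stdint.h>

/* Value reported by memory.limit_in_bytes when no limit is set,
 * as quoted in the description of this bug. */
#define UNLIMITED_CGROUP_LIMIT 9223372036854771712ULL

/* Half of the cgroup memory limit, expressed in KiB. */
uint64_t tmp_size_kb(uint64_t limit_bytes)
{
    return limit_bytes / 2 / 1024;
}

/* Convert KiB to EiB (1 EiB = 2^50 KiB) for a human-readable figure. */
double kb_to_eib(uint64_t kb)
{
    return (double)kb / (double)(1ULL << 50);
}
```

For the unlimited value this yields 4503599627370494 KiB, just under 4 EiB, which df -h rounds to the 4.0E shown above. With an explicit 8 GiB limit the same calculation gives a 4 GiB /tmp.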

Note that the hook doesn't use default tmpfs mount options for the sizing but explicitly sets it:

https://github.com/projectatomic/oci-systemd-hook/blob/master/src/systemdhook.c#L513

rc = asprintf(&options, "mode=1777,size=%" PRIu64 "k", memory_limit_in_kb);
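
As a rough illustration of what that format string produces, here is a standalone sketch that builds the same option string with snprintf() into a caller-supplied buffer. The function name and the buffer handling are hypothetical; only the format string comes from the line quoted above.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write the tmpfs mount options for /tmp into buf.
 * Returns 0 on success, -1 if formatting fails or buf is too small. */
int format_tmp_options(char *buf, size_t buflen, uint64_t memory_limit_in_kb)
{
    int n = snprintf(buf, buflen, "mode=1777,size=%" PRIu64 "k",
                     memory_limit_in_kb);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return 0;
}
```

Passing the 4503599627370494 KiB derived from an unlimited cgroup gives "mode=1777,size=4503599627370494k", so the oversized value is handed to the tmpfs mount explicitly rather than leaving the kernel's 50% default in effect.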

Comment 4 Daniel Walsh 2016-12-20 15:42:13 UTC
https://github.com/projectatomic/oci-systemd-hook/pull/40

Any chance you could check if this fixes your issue?

Comment 5 Daniel Walsh 2016-12-20 15:42:56 UTC
James, are you also saying that docker run --tmpfs /tmp:... does not fix the issue?

Comment 6 James Hogarth 2016-12-21 13:05:27 UTC
Sure, I'll give it a go this afternoon after my lunch ;)

and indeed I tried docker run --tmpfs /tmp:rw,mode=1777,size=10G and it didn't fix it ... when you do a df -h in the container it still shows as 4EB in that instance.

I haven't had time to delve into the whys of that deeply, but on a quick check --tmpfs based things don't populate Mounts in the docker structure returned by inspect on the container, they appear in Tmpfs instead.

The code to check if it's mounted already only looks at Mounts:

https://github.com/projectatomic/oci-systemd-hook/blob/master/src/systemdhook.c#L775

A workaround from the hook's point of view would be to check both Mounts and Tmpfs ... a better 'fix' for this, given it'd be a little unexpected not to consider this a mount, would be for docker to populate tmpfs entries into the Mounts array as well.

 docker run --tmpfs /tmp:rw,mode=1777,size=15G --name blahbalhfoo  centos /bin/bash

 docker inspect blahbalhfoo  | grep -iE '(mount|volume|tmpfs)'
        "MountLabel": "system_u:object_r:container_file_t:s0:c204,c767",
            "VolumeDriver": "",
            "VolumesFrom": null,
            "Tmpfs": {
        "Mounts": [],
            "Volumes": null,
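
The difference between the two checks can be sketched like this. The struct is a hypothetical, heavily simplified stand-in for the container config (two lists of destination paths); it is not the data structure the hook parses, and the function names are invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical simplified view of the container config. */
struct container_config {
    const char *const *mounts;  /* destinations listed under "Mounts" */
    size_t n_mounts;
    const char *const *tmpfs;   /* destinations listed under "Tmpfs" */
    size_t n_tmpfs;
};

/* Linear search for dest in a list of destination paths. */
static bool contains(const char *const *paths, size_t n, const char *dest)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(paths[i], dest) == 0)
            return true;
    return false;
}

/* Check only Mounts, as described above for the current hook. */
bool in_mounts_only(const struct container_config *c, const char *dest)
{
    return contains(c->mounts, c->n_mounts, dest);
}

/* Suggested workaround: treat a Tmpfs entry as already configured too. */
bool in_mounts_or_tmpfs(const struct container_config *c, const char *dest)
{
    return in_mounts_only(c, dest) || contains(c->tmpfs, c->n_tmpfs, dest);
}
```

For a container started with --tmpfs /tmp:..., the Mounts-only check reports /tmp as not configured, so the hook would go on to mount its own /tmp, while the combined check would let it skip the mount.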

Comment 7 James Hogarth 2016-12-21 15:06:48 UTC
Confirming that a build from master (with the PR already merged) behaves as expected, sizing /tmp at half my system RAM:

[ja.hogarth@lap37607 ansible_role_wsc]$ docker exec systemd-test df -h
Filesystem                                             Size  Used Avail Use% Mounted on
/dev/mapper/luks-533c35ab-2572-45ad-b18b-5203c8b8563f  231G  192G   37G  85% /
tmpfs                                                  3.9G     0  3.9G   0% /dev
tmpfs                                                  3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/mapper/luks-533c35ab-2572-45ad-b18b-5203c8b8563f  231G  192G   37G  85% /etc/hosts
shm                                                     64M     0   64M   0% /dev/shm
tmpfs                                                   64M   16K   64M   1% /run
tmpfs                                                  3.9G     0  3.9G   0% /tmp
tmpfs                                                  3.9G  4.0K  3.9G   1% /var/log
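
One policy that would produce this kind of result is to cap the cgroup limit at the host's physical RAM before halving it. The sketch below shows that idea only; it is an assumption for illustration and not necessarily what the merged PR does, and the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical policy: never size /tmp from more memory than the host
 * actually has. Returns half of the effective limit, in KiB. */
uint64_t clamped_tmp_size_kb(uint64_t cgroup_limit_bytes,
                             uint64_t host_ram_bytes)
{
    uint64_t effective = cgroup_limit_bytes < host_ram_bytes
                             ? cgroup_limit_bytes
                             : host_ram_bytes;
    return effective / 2 / 1024;
}
```

Under this policy an unlimited container on an 8 GiB host gets a 4 GiB /tmp, and a container started with a 2 GiB memory limit still gets 1 GiB, matching the X/2 behaviour described for docker run -m.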

Comment 8 James Hogarth 2016-12-21 15:16:04 UTC
The --tmpfs case now has a separate bug and reproducible test case: bz1406830

Comment 9 Daniel Walsh 2017-02-09 14:23:48 UTC
Fixed in oci-systemd-hook-0.1.5-1.git16f7c8a.fc25

Comment 10 Fedora Update System 2017-03-12 11:41:56 UTC
oci-systemd-hook-0.1.6-1.gitfe22236.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-a25973481c

Comment 11 Fedora Update System 2017-03-12 11:42:12 UTC
oci-systemd-hook-0.1.6-1.gitfe22236.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2017-5e4259e590

Comment 12 Fedora Update System 2017-03-13 00:21:49 UTC
oci-systemd-hook-0.1.6-1.gitfe22236.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-5e4259e590

Comment 13 Fedora Update System 2017-03-13 01:51:16 UTC
oci-systemd-hook-0.1.6-1.gitfe22236.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-a25973481c

Comment 14 Fedora Update System 2017-03-14 17:23:00 UTC
oci-systemd-hook-0.1.6-1.gitfe22236.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

Comment 15 Fedora Update System 2017-04-01 16:57:23 UTC
oci-systemd-hook-0.1.6-1.gitfe22236.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.