Bug 1666209 - Nagios cannot start after system reboot because of missing directory
Summary: Nagios cannot start after system reboot because of missing directory
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora EPEL
Classification: Fedora
Component: nagios
Version: epel7
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Guido Aulisi
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-01-15 08:28 UTC by Stefan Joosten
Modified: 2021-03-22 00:37 UTC (History)

Fixed In Version: nagios-4.4.3-1.fc28 nagios-4.4.3-1.fc29 nagios-4.4.3-1.el6 nagios-4.4.3-1.el7 nagios-4.4.6-4.el8 nagios-4.4.6-4.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-03-22 00:31:13 UTC
Type: Bug
Embargoed:



Description Stefan Joosten 2019-01-15 08:28:49 UTC
Description of problem:

Nagios fails to start after a reboot (on a CentOS 7 system). This is caused by a missing directory in the path where Nagios wants to write its lock file (/var/run/nagios/nagios.pid).

Directory {,/var}/run/nagios is created by the RPM when the package is installed, and the service runs fine at that moment. But /run is a tmpfs, so all of its content is cleared on reboot. Starting the service via `systemctl start nagios` does not recreate the directory, causing Nagios to exit abnormally and require manual intervention to get going again.
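
One way to confirm the cause described above is to check which filesystem backs /run. The following is a minimal sketch, assuming a Linux system that exposes its mount table at /proc/mounts. The function name `mount_fstype` is made up for illustration and is not part of Nagios or systemd.

```shell
#!/bin/sh
# mount_fstype is a hypothetical helper, not part of Nagios or systemd.
# It prints the filesystem type of a mount point by reading a mount table
# in /proc/mounts format (fields: device, mount point, type, options, ...).
mount_fstype() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}
    [ -r "$mounts_file" ] || return 1
    # If the mount point is listed more than once, the last entry is printed,
    # because later mounts shadow earlier ones.
    awk -v mp="$mount_point" '
        $2 == mp { fstype = $3 }
        END { if (fstype != "") print fstype }
    ' "$mounts_file"
}

# Prints the type if /run is a separate mount (tmpfs in the scenario above);
# prints nothing if /run is not listed in the mount table.
mount_fstype /run || echo "could not read the mount table" >&2
```

Reading the mount table directly keeps the check free of extra tools; the optional second argument exists only so the function can be tried against a saved copy of a mount table.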


Version-Release number of selected component (if applicable):
nagios-4.3.4-5.el7.x86_64 (stable)
nagios-4.4.2-3.el7.x86_64 (testing)

How reproducible:

I was able to reproduce this on two CentOS 7 systems, and also with the latest nagios package from testing (by enabling epel-testing). Install the "nagios" package, optionally start the service, and reboot the machine. Then log back in and try to start the "nagios" service: it fails to start.

# yum install nagios
# ls -laZ /var/run/nagios
drwxr-x---. nagios nagios system_u:object_r:nagios_var_run_t:s0 .
drwxr-xr-x. root   root   system_u:object_r:var_run_t:s0   ..
# 


Steps to Reproduce:
1. Install Nagios:
 # yum install nagios
2. Verify existence of lock file directory: 
 # if [[ -d /run/nagios ]]; then echo "Directory /run/nagios exists"; else echo "Directory /run/nagios does not exist"; fi
3. Optional: start the Nagios service
 # systemctl start nagios
4. Reboot the machine
 # reboot
5. Try to start the Nagios service again (will fail)
 # systemctl start nagios
6. Inspect error message
7. Verify existence of lock file directory again and discover it is missing.
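
Steps 6 and 7 list no commands. A minimal sketch of the directory check for step 7 follows, reusing the message wording from step 2. The function name `report_dir` is hypothetical.

```shell
#!/bin/sh
# report_dir is a hypothetical helper: it reports whether a path exists
# and is a directory, using the same wording as step 2.
report_dir() {
    if [ -d "$1" ]; then
        echo "Directory $1 exists"
    else
        echo "Directory $1 does not exist"
    fi
}

# Step 7: run this after the reboot and the failed start attempt.
report_dir /run/nagios
```

For step 6, `systemctl status nagios.service` and `journalctl -xe --unit nagios` are the commands the failure message itself points to.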

Actual results:
# systemctl start nagios
Job for nagios.service failed because the control process exited with error code. See "systemctl status nagios.service" and "journalctl -xe" for details.

# journalctl -xe --unit nagios
..
Jan 15 08:30:43 <hostname> nagios[5823]: Failed to obtain lock on file /var/run/nagios/nagios.pid: No such file or directory
..


Expected results:
# systemctl start nagios
should exit normally and the Nagios service should be running.

The /run/nagios directory should be created upon start of the service.
Or the Nagios package configuration file could be changed to place the lock file directly in /run instead of in a subdirectory of it.


Additional info:

Out of the box the /etc/nagios/nagios.cfg contains:
# grep lock_file /etc/nagios/nagios.cfg 
lock_file=/var/run/nagios/nagios.pid
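
Since the failing path comes from the `lock_file` setting, the directory to check can be derived from the configuration instead of being hard-coded. This is only a sketch: the function name `lock_file_dir` is hypothetical, and it assumes the setting is written as `lock_file=<path>` with no spaces around the equals sign.

```shell
#!/bin/sh
# lock_file_dir is a hypothetical helper: it prints the directory part of the
# lock_file setting found in a Nagios main configuration file.
lock_file_dir() {
    cfg=${1:-/etc/nagios/nagios.cfg}
    [ -r "$cfg" ] || return 1
    # Commented-out lines do not match because of the ^ anchor.
    # If several lock_file= lines exist, the last one is used here.
    lock_file=$(sed -n 's/^lock_file=//p' "$cfg" | tail -n 1)
    [ -n "$lock_file" ] || return 1
    dirname "$lock_file"
}

# Report whether the directory for the lock file currently exists.
if dir=$(lock_file_dir); then
    [ -d "$dir" ] && echo "$dir exists" || echo "$dir is missing"
else
    echo "no lock_file setting found" >&2
fi
```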

Comment 1 Stefan Joosten 2019-01-15 09:45:33 UTC
Two ways of fixing I came up with are:
A. Change nagios.cfg to go directly to /var/run by taking out the `nagios` subdirectory of the `lock_file` option
B. Have the systemd.service create the `nagios` runtime directory. This might be easier if this problem does not occur on EL6 for example (perhaps its init script already takes care of this, I haven't checked).

Solution A, changing nagios.cfg:

--- nagios.cfg	2019-01-15 10:14:06.346940829 +0100
+++ nagios.cfg.fix_lock_path	2019-01-15 10:14:16.884977509 +0100
@@ -166,7 +166,7 @@
 # This is the lockfile that Nagios will use to store its PID number
 # in when it is running in daemon mode.
 
-lock_file=/var/run/nagios/nagios.pid
+lock_file=/var/run/nagios.pid


Solution B, changing the systemd unit file:

--- nagios.service	2019-01-15 10:30:12.572302032 +0100
+++ nagios.service.fix_lock_path	2019-01-15 10:39:19.450194861 +0100
@@ -7,6 +7,8 @@
 Type=forking
 User=nagios
 Group=nagios
+RuntimeDirectory=nagios
+RuntimeDirectoryMode=0750
 PIDFile=/var/run/nagios/nagios.pid
 # Mimic older config file wants
 EnvironmentFile=-/etc/sysconfig/nagios

This creates the /run/nagios directory with the user and group permissions set; the mode is set to 0750, matching the directory created by the RPM.
However, this does cause a new issue (!) upon removal of the package. Directory /run/nagios is now removed by systemd when the service stops, causing `yum remove nagios` to print a warning:
   Erasing    : nagios-4.4.2-3.el7.x86_64
 warning: file /var/run/nagios: remove failed: No such file or directory
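
The two settings from solution B could also be tried without editing the packaged unit file, by placing them in a systemd drop-in. The sketch below rests on assumptions: the helper name `write_runtime_dir_dropin` and the file name `runtime-dir.conf` are hypothetical, and this has not been tested against the EPEL package.

```shell
#!/bin/sh
# write_runtime_dir_dropin is a hypothetical helper. It writes a drop-in
# file containing the two settings from solution B into the given directory.
write_runtime_dir_dropin() {
    dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/runtime-dir.conf" << 'EOF'
[Service]
RuntimeDirectory=nagios
RuntimeDirectoryMode=0750
EOF
}

# Usage, as root (left commented out because it changes system state):
#   write_runtime_dir_dropin /etc/systemd/system/nagios.service.d
#   systemctl daemon-reload
#   systemctl restart nagios
```

A drop-in leaves the file shipped by the RPM untouched, so a package update does not overwrite the change.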


Of course, pick whichever you prefer; either seems to work for me, or perhaps you'll think of another solution.
I just hope this helps :)

Comment 2 Stefan Joosten 2019-01-15 09:57:11 UTC
Uh.. I think I just assumed solution A (editing the path of the lock_file) would work... But I tested it and of course it does not work because user nagios has no permission to write to /run :

nagios[6017]: Failed to obtain lock on file /var/run/nagios.pid: Permission denied

So disregard solution A.

Comment 3 Fedora Update System 2019-01-17 00:14:39 UTC
nagios-4.4.3-1.el7 has been submitted as an update to Fedora EPEL 7. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2019-d661b588d2

Comment 4 Fedora Update System 2019-01-17 00:25:19 UTC
nagios-4.4.3-1.el6 has been submitted as an update to Fedora EPEL 6. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2019-17b388679b

Comment 5 Fedora Update System 2019-01-17 00:43:00 UTC
nagios-4.4.3-1.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2019-376ecc221c

Comment 6 Fedora Update System 2019-01-17 00:55:16 UTC
nagios-4.4.3-1.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2019-0b44528ff1

Comment 7 Fedora Update System 2019-01-18 01:00:24 UTC
nagios-4.4.3-1.el7 has been pushed to the Fedora EPEL 7 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2019-d661b588d2

Comment 8 Fedora Update System 2019-01-18 01:31:47 UTC
nagios-4.4.3-1.el6 has been pushed to the Fedora EPEL 6 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2019-17b388679b

Comment 9 Fedora Update System 2019-01-18 03:04:54 UTC
nagios-4.4.3-1.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-0b44528ff1

Comment 10 Fedora Update System 2019-01-18 03:36:14 UTC
nagios-4.4.3-1.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-376ecc221c

Comment 11 Fedora Update System 2019-01-30 01:31:56 UTC
nagios-4.4.3-1.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

Comment 12 Fedora Update System 2019-01-30 02:06:37 UTC
nagios-4.4.3-1.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.

Comment 13 Fedora Update System 2019-02-02 00:36:20 UTC
nagios-4.4.3-1.el6 has been pushed to the Fedora EPEL 6 stable repository. If problems still persist, please make note of it in this bug report.

Comment 14 Fedora Update System 2019-02-02 00:39:21 UTC
nagios-4.4.3-1.el7 has been pushed to the Fedora EPEL 7 stable repository. If problems still persist, please make note of it in this bug report.

Comment 15 Stefan Joosten 2019-03-22 14:41:53 UTC
Thanks for working on this.
Unfortunately the problem still persists for me.

I'm on CentOS 7 using nagios-4.4.3-1.el7 from EPEL.
I extracted the RPM and had a look at file usr/lib/systemd/system/nagios.service
It does not seem to include my little patch of adding the two lines "RuntimeDirectory" and "RuntimeDirectoryMode".
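
The inspection described here can be scripted. This is a rough sketch with a hypothetical function name, `unit_has_runtime_dir`; the unit file path is the one named above, relative to wherever the file is installed or extracted.

```shell
#!/bin/sh
# unit_has_runtime_dir is a hypothetical helper: it succeeds if the given
# unit file contains an uncommented RuntimeDirectory= setting.
# RuntimeDirectoryMode= alone does not count, and neither does a line that
# still carries a leading "+" from a diff.
unit_has_runtime_dir() {
    grep -q '^[[:space:]]*RuntimeDirectory=' "$1" 2>/dev/null
}

unit=/usr/lib/systemd/system/nagios.service
if unit_has_runtime_dir "$unit"; then
    echo "$unit sets RuntimeDirectory"
else
    echo "$unit does not set RuntimeDirectory (or cannot be read)"
fi
```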

I can still reproduce the error as originally reported. After a reboot, Nagios fails to start on CentOS 7 for me unless I manually create the directory for its lock/PID file.

This is my current patch to usr/lib/systemd/system/nagios.service :

--- nagios.service	2019-03-22 15:38:48.066376767 +0100
+++ nagios.service.lock_file.patch	2019-03-22 15:38:37.921396470 +0100
@@ -10,6 +10,8 @@
 ExecStop=/usr/bin/kill -s TERM ${MAINPID}
 ExecStopPost=/usr/bin/rm -f /var/spool/nagios/cmd/nagios.cmd
 ExecReload=/usr/bin/kill -s HUP ${MAINPID}
+RuntimeDirectory=nagios
+RuntimeDirectoryMode=0750
 
 [Install]
 WantedBy=multi-user.target

After these are added the nagios service works as intended for me.
I changed the bug's status, hope that's OK.

Comment 16 Mike Surcouf 2019-07-18 09:48:01 UTC
On CentOS 7,

+RuntimeDirectory=nagios
+RuntimeDirectoryMode=0750

didn't work.

I used this workaround, which will survive package updates:

mkdir -p /etc/systemd/system/nagios.service.d
cat > /etc/systemd/system/nagios.service.d/overides.conf << "EOF"
[Service]
ExecStartPre=/usr/bin/mkdir -p /var/run/nagios
ExecStartPre=/usr/bin/chown nagios /var/run/nagios
EOF
systemctl daemon-reload
systemctl restart nagios

It would be nice if this were fixed in the service file, though.
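
A different systemd mechanism, not proposed anywhere in this report, is a tmpfiles.d entry that recreates the directory at boot. The following is a sketch under assumptions: it presumes the `nagios` user and group exist and uses mode 0750 as in comment 1; the helper name `write_nagios_tmpfiles` and the file name `nagios.conf` are hypothetical, and this has not been tested against the EPEL package.

```shell
#!/bin/sh
# write_nagios_tmpfiles is a hypothetical helper. It writes a tmpfiles.d
# entry asking systemd-tmpfiles to create /run/nagios.
# Fields: type, path, mode, user, group, age ("-" means no cleanup age).
write_nagios_tmpfiles() {
    conf_dir=$1
    mkdir -p "$conf_dir" || return 1
    printf 'd /run/nagios 0750 nagios nagios -\n' > "$conf_dir/nagios.conf"
}

# Usage, as root (left commented out because it changes system state):
#   write_nagios_tmpfiles /etc/tmpfiles.d
#   systemd-tmpfiles --create /etc/tmpfiles.d/nagios.conf
```

Unlike the `ExecStartPre` workaround, this creates the directory with owner, group and mode in one step, independently of the service unit.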

Comment 17 Mike Surcouf 2019-07-18 09:49:13 UTC
BTW, I didn't put the plus signs in (they were copied from the commit).

Comment 18 Fedora Admin user for bugzilla script actions 2020-08-18 14:57:36 UTC
This package has changed maintainer in Fedora.
Reassigning to the new maintainer of this component.

Comment 19 Fedora Admin user for bugzilla script actions 2021-02-20 00:05:28 UTC
This package has changed maintainer in Fedora. Reassigning to the new maintainer of this component.

Comment 20 Fedora Update System 2021-03-07 11:14:17 UTC
FEDORA-EPEL-2021-e9c2beec98 has been submitted as an update to Fedora EPEL 8. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2021-e9c2beec98

Comment 21 Fedora Update System 2021-03-07 12:07:53 UTC
FEDORA-EPEL-2021-04cc5bcb08 has been submitted as an update to Fedora EPEL 7. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2021-04cc5bcb08

Comment 22 Fedora Update System 2021-03-07 15:26:37 UTC
FEDORA-EPEL-2021-04cc5bcb08 has been pushed to the Fedora EPEL 7 testing repository.

You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2021-04cc5bcb08

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 23 Fedora Update System 2021-03-07 15:27:26 UTC
FEDORA-EPEL-2021-e9c2beec98 has been pushed to the Fedora EPEL 8 testing repository.

You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2021-e9c2beec98

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 24 Fedora Update System 2021-03-22 00:31:13 UTC
FEDORA-EPEL-2021-e9c2beec98 has been pushed to the Fedora EPEL 8 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 25 Fedora Update System 2021-03-22 00:37:18 UTC
FEDORA-EPEL-2021-04cc5bcb08 has been pushed to the Fedora EPEL 7 stable repository.
If the problem still persists, please make note of it in this bug report.

