Bug 1657041 - libvirt-daemon-config-network %post expects network access, which won't work on silverblue/rpm-ostree
Summary: libvirt-daemon-config-network %post expects network access, which won't work ...
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: libvirt
Version: 34
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1352154
Reported: 2018-12-06 22:21 UTC by Colin Walters
Modified: 2021-02-09 15:07 UTC
CC List: 13 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:



Description Colin Walters 2018-12-06 22:21:40 UTC
For rpm-ostree, we are trying to support generating OSTree commits server side which are replicated to client systems.

This implies that everything done in %post must be system-independent and ideally predictable and bitwise reproducible.

The current %post scriptlet at https://src.fedoraproject.org/rpms/libvirt/blob/master/f/libvirt.spec#_1400
violates this.

The general fix is to move this kind of work to daemon startup.  For example, you could create a `libvirt-daemon-config-network-init.service` systemd unit with ConditionPathExists=!/etc/libvirt/qemu/networks/default.xml, and introspect the network there.
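
To illustrate the "only run when the file is missing" idea, here is a minimal Python sketch of the guard such a first-start unit would rely on. It is not libvirt code: the function name `init_default_network` and the `render_config` callback are hypothetical, and the real probing logic is left to the caller.

```python
import os
import tempfile
from pathlib import Path


def init_default_network(config_path, render_config):
    """Write the network config once; leave an existing file untouched.

    Mirrors the effect of ConditionPathExists=!<path>: the probing
    callback only runs when the config file does not exist yet.
    Both names here are hypothetical, for illustration only.
    """
    config_path = Path(config_path)
    if config_path.exists():
        return False  # already initialized on an earlier start

    config_path.parent.mkdir(parents=True, exist_ok=True)

    # Write to a temporary file in the same directory, then rename it
    # into place so a crash cannot leave a half-written config behind.
    fd, tmp_name = tempfile.mkstemp(dir=config_path.parent)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(render_config())
    os.replace(tmp_name, config_path)
    return True
```

Because the result is written once and never regenerated, whatever the probe chose stays fixed across reboots.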

Comment 1 Colin Walters 2018-12-06 22:24:42 UTC
Taking this to the next level, it doesn't make sense to encode *dynamic* networking state into a persistent data store in /etc.  Rather, the systemd unit could do the probing and then write the config file into /run/libvirt or so.

Comment 2 Steve Milner 2018-12-06 22:26:27 UTC
I hit this trying to do some virt based testing on Fedora Silverblue.

Comment 3 Laine Stump 2018-12-07 22:28:38 UTC
It's not really dynamic - once it's set, it should remain consistent across host reboots. The only reason that it's not set in stone in the original source files is that the "factory" choice of subnet may not work for some hosts.

For example (and this is the specific situation that led to the current code in the specfile %post - for a *very* long history, see Bug 1146232), if you install libvirt in a virtual machine whose network connection is via the libvirt default network on the L0 host, then it will already have a network connection using 192.168.122.0/24, and if the L1 host creates a new bridge in the virtual machine with address 192.168.122.1/24 (i.e. the same bridge as is created in the L0 host), this will lead to network connectivity being lost for the guest. So we have to do *something* to make sure the choice for subnet of the network is usable.
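
As a rough illustration of that "something", the following Python sketch picks a 192.168.N.0/24 subnet that does not overlap any network already in use on the host. It is not the specfile's actual logic: the function name, the candidate range, and the assumption that the caller passes in IPv4 CIDR strings (for example, collected from the routing table) are all hypothetical.

```python
import ipaddress


def pick_free_subnet(in_use, preferred=122, candidates=range(123, 255)):
    """Return a 192.168.N.0/24 network that overlaps none of `in_use`.

    `in_use` is an iterable of IPv4 CIDR strings. The preferred third
    octet is tried first, then each candidate in order. Returns None
    if every candidate conflicts. All parameters are illustrative.
    """
    used = [ipaddress.ip_network(cidr, strict=False) for cidr in in_use]
    for octet in [preferred, *candidates]:
        net = ipaddress.ip_network(f"192.168.{octet}.0/24")
        if not any(net.overlaps(other) for other in used):
            return net
    return None
```

In the nested case described above, the L1 host already has a route for 192.168.122.0/24, so `pick_free_subnet(["192.168.122.0/24"])` skips the factory default and returns the next candidate instead.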

But if we try to make the choice at the time libvirtd is started, that sometimes works and sometimes doesn't, because libvirtd.service is started before the network is guaranteed to be fully up. We might *think* it's okay to use a particular subnet because libvirtd happened to start up more quickly during the first start after install, but then the host network config would later start up a conflicting interface (or add a conflicting route).

So we added the code that is in the specfile's %post - because the install is usually running when the system is fully started up, it's more likely that any conflicting interface/route would have already been started.

However, this method has its own problems, since there are situations when the network environment at libvirt install time is different from the network environment at the time it is run  - the most annoying example is the Fedora Live CD image, which is created in some sort of container somewhere, and could later be run in a virtual machine connected to a host's libvirt virtual network.

But we don't really want to make libvirtd.service wait until the networking subsystem is fully up before it starts - someone might have networking infrastructure that uses virtual machines, and that would fail miserably if libvirtd.service couldn't start up until networking was fully up.

In spite of that, I've toyed with the idea of the config having a "super double secret probationary default" option (props to Animal House) that could be initially set for a network, and then the first time that network was started, it would delay until "networking is up" (whatever that means - I think the systemd target is different depending on whether or not NetworkManager is enabled, and we definitely don't want to make libvirt require NetworkManager!)

Anyway, TL;DR - 1) we can't have a dynamic address stored in /var/run because once chosen, the subnet must remain consistent across subsequent reboots, but 2) we have thought about the idea of somehow delaying the selection of subnet until the first run of libvirtd. 3) In the past we hadn't done that because it *still* doesn't solve the problem for everyone, but 4) this BZ may give us reason to look into it again.

Comment 4 Colin Walters 2018-12-10 15:54:06 UTC
> However, this method has its own problems, since there are situations when the network environment at libvirt install time is different from the network environment at the time it is run  - the most annoying example is the Fedora Live CD image, 

That's what this bug is about, yes.  rpm-ostree based systems (Fedora Atomic Host, Fedora Silverblue) *always* work this way.

In fact, when rpm-ostree runs scripts today, we disable networking:
https://github.com/projectatomic/rpm-ostree/blob/f811828543d46ca7264e6616dca29f39d715d4e1/src/libpriv/rpmostree-bwrap.c#L305

Note this even occurs on the *client* side.  I'm typing this from a Fedora Silverblue system which doesn't have libvirt by default, but I `rpm-ostree install libvirt`.

When the libvirt %post script runs on my local system, it won't see any network interfaces at all.

Comment 5 Daniel Berrangé 2018-12-10 15:56:05 UTC
(In reply to Colin Walters from comment #4)
> Note this even occurs on the *client* side.  I'm typing this from a Fedora
> Silverblue system which doesn't have libvirt by default, but I `rpm-ostree
> install libvirt`.
> 
> When the libvirt %post script runs on my local system, it won't see any
> network interfaces at all.

I can understand not having network when building images, but how is libvirt going to provide network connectivity for guests if it isn't given any network interfaces on client systems ?

Comment 6 Colin Walters 2018-12-14 13:42:07 UTC
> but how is libvirt going to provide network connectivity for guests if it isn't given any network interfaces on client systems ?

I was talking about the %post script.  rpm-ostree doesn't change how systemd units are run in any way.

https://bugzilla.redhat.com/show_bug.cgi?id=1657041#c0
specifically mentions moving the network detection to a systemd unit.

Comment 7 Ben Cotton 2019-08-13 16:56:15 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle.
Changing version to '31'.

Comment 8 Ben Cotton 2019-08-13 19:29:40 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle.
Changing version to 31.

Comment 9 Ben Cotton 2020-11-03 15:06:03 UTC
This message is a reminder that Fedora 31 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '31'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 31 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 10 Ben Cotton 2021-02-09 15:07:11 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 34 development cycle.
Changing version to 34.

