633538 – builds do not abort on failure; installation improperly performed

Note: This is a public test instance of Red Hat Bugzilla. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback at bugzilla.redhat.com.

Bug 633538 - builds do not abort on failure; installation improperly performed

Summary: builds do not abort on failure; installation improperly performed

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Dogtag Certificate System
Classification:	Retired
Component:	Build
Sub Component:
Version:	unspecified
Hardware:	All
OS:	Linux
Priority:	high
Severity:	medium
Target Milestone:	---
Assignee:	John Dennis
QA Contact:	Chandrasekar Kannan
Docs Contact:
URL:
Whiteboard:
Depends On:	643206
Blocks:	dogtagIPAv2
TreeView+	depends on / blocked

Reported:	2010-09-13 22:03 UTC by John Dennis
Modified:	2015-09-28 17:50 UTC (History)
CC List:	5 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2010-12-03 20:46:21 UTC
Embargoed:

Attachments	(Terms of Use)
patch to abort build when errors occur and separate building from installation (deleted) 2010-09-15 18:25 UTC, John Dennis	no flags	Details \| Diff
Identical changes for tomcatjss (deleted) 2010-09-16 15:54 UTC, John Dennis	no flags	Details \| Diff
View All

Description John Dennis 2010-09-13 22:03:58 UTC

If one runs build_pki or any of the build_dogtag scripts and there is a build failure in one part of the processing the failure is never detected and the build process continues on it's merry way blithely reporting success despite serious build failures. The output of the build is verbose and it's easy to miss the error messages in the volume of output and instead erroneously conclude when you see "BUILD SUCCESSFUL" and a zero exit status it actually succeeded when it fact it failed.

The build process should immediately exit when a failure is detected.

I tracked the problem to a couple of key issues.

1) The Ant "exec" task has a "failonerror" parameter with this definition:

Stop the buildprocess if the command exits with a return code
signaling failure. Defaults to false.

Thus unless failonerror is explicitly set to true the failure is effectively ignored, although the error messages are output. There are many places in the various ant xml files with exec tasks which do not set failonerror.

2) The build_pki script ignores the exit status of the build scripts it invokes. Thus even after fixing the ant failure behavior the script merrily continues on its way trying to install components which didn't build. The build_pki script needs to check the exit status of the commands it invokes.

On a somewhat related matter build_pki is overstepping it's bounds because it's also attempts to install the rpm's its just built. Building and installing are two logically independent operations and should be handled separately, one should be able to just perform a build without modifying the system the build was performed on. In addition the fact build_pki attempts to install each components rpm's independently after it's built means the installation will fail if there a version or dependency conflicts. When installing all the packages need to be installed together at one time which must be done only after all the packages have been successfully built. This is another good reason to separate the build and install steps. My suggestion would be to have another script which locates all the rpms in the release area. This can then be used to either just list the rpms, or as input to yum/rpm, or to copy to a target test machine, or to build a tarball. Thus building is one step and another script will collect all the built rpms for easy processing, a more flexible scenario. If that script were called built_rpms you could do

cd pki/dogtag/scripts

# Build and install with rpm
./build_pki
rpm -Uhv --force `./built_rpms`

-or-

# Build and install with yum
./build_pki
yum localinstall `./built_rpms`

-or-

# Build and copy to test machine
./build_pki
scp `./built_rpms` test_machine:/tmp/pki_rpms

I will shortly supply a patch to address all the above problems.

Comment 1 John Dennis 2010-09-15 18:25:25 UTC

Created attachment 447534 [details]
patch to abort build when errors occur and separate building from installation

The attached patch does the following:

1) adds failonerror="true" to every ant exec task

2) bullet proofs capturing the exit status in dogtag/config-ext/build_dogtag_pki

3) modifies build_pki to:

   a) remove all installation code (building != installation)

   b) always checks the exit status of commands it runs and exits with message if it fails

4) adds a new script called "built_rpms" which output to stdout the list of built rpms in the current source tree (optionally can point it at another tree, apply inclusion/exclusion filters, etc. See usage).

Installation can be more easily and flexibly done via the built_rpms script (see usage)

Comment 2 John Dennis 2010-09-16 15:54:50 UTC

Created attachment 447778 [details]
Identical changes for tomcatjss

The first patch missed making the same changes to tomcatjss. This patch adds the same changes for tomcatjss

Comment 3 Matthew Harmsen 2010-09-20 20:41:25 UTC

attachment 447534 [details] +mharmsen
CAVEATS:
Everything in this attachment is approved EXCEPT the changes to "build_pki".

The "build_pki" script implements a "chain" build order required to make certain that all build-time dependencies are resolved. It has been suggested in the past to envelop this script inside of a "Mock" environment so as to not replace the existing packages on the build machine.  Release engineering nightly builds have been utilizing this script successfully to generate repositories utilized by quality engineering for months.

attachment 447778 [details] +mharmsen
NO CAVEATS.

Comment 6 John Dennis 2010-11-01 17:03:57 UTC

this bug can be closed once we have the cmake based build system working, bug #643206. Marking this bug as dependent on that.

Comment 7 John Dennis 2010-12-03 20:46:21 UTC

closing due to conversion to cmake

Note You need to log in before you can comment on or make changes to this bug.