This is the mail archive of the mailing list for the GCC project.


Re: GCC Buildbot

On 20 September 2017 at 17:01, Paulo Matos <> wrote:
> Hi all,
> I am internally running buildbot for a few projects, including one for a
> simple gcc setup for a private port. After some discussions with David
> Edelsohn at the last couple of Cauldrons, who told me this might be
> interesting for the community in general, I have contacted Sergio DJ
> with a few questions on his buildbot configuration for GDB. I then
> stripped out his configuration and transformed it into one for GCC,
> with a few private additions, and ported it to the most recent buildbot
> version, nine (numerically 0.9.x).

That's something I'd have liked to discuss at the Cauldron, but I
couldn't attend.

> To make a long story short:
> With brief documentation in:
> and configuration in:
> Now, this is still pretty raw but it:
> * Configures a Fedora x86_64 machine for C, C++ and Objective-C
> (./configure --disable-multilib)
> * Does an incremental build
> * Runs all tests
> * Grabs the test results and stores them as properties
> * Creates a tarball of the sum and log files from the testsuite
> directory and uploads them
> This mail's intention is to gauge the interest in having a buildbot for
> GCC. Buildbot is a generic Python framework for building test
> automation, so the possibilities are pretty much endless: all workflows
> are programmed in Python, and with buildbot nine the interface is also
> modifiable, if required.

I think there is no question about the interest in such a service. It's
almost mandatory nowadays.

FYI, I've been involved in some "bots" for GCC for the past 4-5 years.
Our interest is in the ARM and AArch64 targets.

I don't want to start a Buildbot vs Jenkins vs something else war,
but I can share my experience. I did look at Buildbot, including when
the GDB guys started their own, but I must admit that I have trouble
with Python ;-)

A general warning would be: avoid sharing resources, it's always
a cause of trouble.

In ST, I stopped using my team's Jenkins instance because it
was overloaded, needed to be restarted at inconvenient times, ...
I'm now using a nice crontab :-)
Still in ST, I am using our Compute Farm, which is a large number
of x86_64 servers, where you submit batch jobs, wait, then parse
the results; the workspace is deleted upon job completion.
I have to cope with various rules to keep a decent throughput
and minimize pending time as much as possible.

Yet, probably because the machines are shared with other users
running large (much larger?) programs at the same time, I have to face
random failures (processes killed randomly, interrupted system calls,
etc.). Trying to handle these problems gracefully is very time consuming.

I upload the results on a Linaro server, so that I can share them
when I report a regression. For disk space reasons, I currently
keep about 2 months of results. For the trunk:

In Linaro, we use Jenkins, and a few dedicated x86_64 builders
as well as arm and aarch64 builders and test machines. We have
much less cpu power than what I can currently use in ST, so
we run fewer builds and fewer configurations. But even there we
see a few random test results (mostly when threads and libgomp
are involved).

These random false failures have been preventing us from sending
results automatically.
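Neither bot necessarily works this way, but one mitigation is to diff DejaGnu .sum files against a baseline and filter out known-flaky tests before deciding whether to send anything. A minimal sketch (the helper names and the substring-based flaky filter are hypothetical):

```python
import re

# Map "FAIL: gcc.dg/foo.c" lines of a DejaGnu .sum file to {test: status}.
def parse_sum(text):
    statuses = ('PASS', 'FAIL', 'XFAIL', 'XPASS',
                'UNRESOLVED', 'UNSUPPORTED', 'UNTESTED', 'ERROR')
    results = {}
    for line in text.splitlines():
        m = re.match(r'^(%s): (.+)$' % '|'.join(statuses), line)
        if m:
            results[m.group(2)] = m.group(1)
    return results

# Tests failing now that were not failing in the baseline, with
# known-flaky tests (matched by substring patterns) filtered out.
def regressions(baseline, current, flaky=()):
    bad = {'FAIL', 'UNRESOLVED', 'ERROR'}
    return sorted(
        test for test, status in current.items()
        if status in bad
        and baseline.get(test) not in bad
        and not any(pat in test for pat in flaky))
```

With a pattern like `'libgomp'` in the flaky list, a randomly failing libgomp test would not trigger a notification, while a genuine new FAIL still would.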

> If this is something of interest, then we will need to understand what
> is required, among those:
> - which machines we can use as workers: we certainly need more worker
> (previously known as slave) machines to test GCC in different
> archs/configurations;

To cover various archs, it may be more practical to build cross-compilers,
using "cheap" x86_64 builders, and relying on qemu or other simulators
to run the tests. I don't think the GCC compute farm can offer powerful
enough machines for all the archs we want to test.

It's not as good as using native hardware, but it is often faster.
And it does not prevent us from using native hardware for, say,
weekly bootstraps.
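In buildbot terms, such a cross builder could configure a cross compiler on a cheap x86_64 worker and drive the testsuite through a DejaGnu simulator board. A sketch only: the target triplet and the board name below are examples, not a proposed setup.

```python
# Sketch of a cross-compiler builder: build a cross gcc on an x86_64
# worker and run the tests under a simulator via a DejaGnu board file.
# The target and board names are examples.
from buildbot.plugins import util, steps

cross = util.BuildFactory()
cross.addStep(steps.Git(repourl='https://gcc.gnu.org/git/gcc.git',
                        mode='incremental'))
cross.addStep(steps.Configure(command=[
    '../gcc/configure',
    '--target=aarch64-linux-gnu',   # example target
    '--disable-multilib']))
cross.addStep(steps.Compile(command=['make', '-j8']))
# qemu (or another simulator) runs the tests through a DejaGnu board
# file; "qemu-aarch64" here is a hypothetical board name.
cross.addStep(steps.ShellCommand(
    name='check-gcc',
    command=['make', 'check-gcc',
             'RUNTESTFLAGS=--target_board=qemu-aarch64'],
    flunkOnFailure=False))
```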

> - what kind of build configurations do we need and what they should do:
> for example, do we want to build gcc standalone against system (the one
> installed in the worker) binutils, glibc, etc or do we want a builder to
> bootstrap everything?

Using the system tools is OK for native builders, maybe not when
building cross toolchains.

That said, I think it's far safer to stick to given binutils/glibc/newlib
versions and monitor only gcc changes. There are already frequent
regressions, and this makes it easier to be sure they are related to
gcc changes only.

And have a mechanism to upgrade such components after checking
the impact on the gcc testsuite.
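Concretely, pinning could be as simple as checking out fixed release tags for the companion components while only gcc tracks the branch under test. A sketch; the repository URLs and tag names below are illustrative, not a recommendation.

```python
# Pin binutils/glibc to fixed release tags; only gcc follows trunk.
# URLs and tag names are illustrative.
from buildbot.plugins import util, steps

PINNED = {
    'binutils': ('https://sourceware.org/git/binutils-gdb.git',
                 'binutils-2_29'),
    'glibc':    ('https://sourceware.org/git/glibc.git',
                 'glibc-2.26'),
}

f = util.BuildFactory()
for name, (repo, tag) in sorted(PINNED.items()):
    f.addStep(steps.Git(repourl=repo, branch=tag,
                        mode='incremental', workdir=name))
# gcc alone tracks the branch under test, so a new regression can only
# come from a gcc change; bumping a pinned tag is a deliberate,
# separately-validated step.
f.addStep(steps.Git(repourl='https://gcc.gnu.org/git/gcc.git',
                    mode='incremental', workdir='gcc'))
```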

In Linaro we have a job tracking all master branches, it is almost
always red :(

> - initially I was doing fresh builds and uploading a tarball (~450 MB)
> for download. This took way too long. I have moved to incremental builds
> with no tarball generation but if required we could do this for forced
> builds and/or nightly. Ideas?
> - We are currently running the whole testsuite for each incremental
> build (~40mins). If we want a faster turnaround time, we could run just
> an important subset of tests. Suggestions?

FWIW, in ST, I do builds from scratch, and I have:
- a build-only bot, which only builds binutils+glibc/newlib+gcc for 9
targets; it takes about 30-40 minutes and runs for every commit
on trunk

- a "check" bot, which builds and runs the testsuite for gcc, g++, libstdc++
and gfortran for 34 arm/aarch64 target & runtestflags combinations.
Each run takes about 3-4 hours.
It runs for every commit on the active branches (gcc-5, gcc-6 and gcc-7
at the moment), and for the daily bump plus any commit "related" to arm
and aarch64. Which means a lot when Richard pushes his 77-patch
series ;-) (it took about 3 days to catch up)

In practice, I think it is fast enough, I am often the bottleneck when
it comes to looking at the results :-)
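On the faster-turnaround question above, one option is a quick per-commit builder that runs only a subset of the testsuite and leaves full runs to a nightly scheduler. A sketch; the .exp selection below is an example, not a recommendation.

```python
# Quick per-commit check: run only a few DejaGnu .exp files instead of
# the whole testsuite.  The selection is an example only.
from buildbot.plugins import steps

quick_check = steps.ShellCommand(
    name='quick-check',
    command=['make', 'check-gcc',
             'RUNTESTFLAGS=dg.exp compile.exp execute.exp'],
    flunkOnFailure=False)
```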

> - would we like to run anything on the compiler besides the gcc
> testsuite? I know Honza does, or used to do, lots of firefox builds to
> test LTO. Shall we build those, for example? I noticed there's a testing
> subpage which contains a few other libraries, should we build these?
> (

That would be great, of course.

> - Currently we have a force build which allows people to force a build
> on the worker. This requires no authentication and can certainly be
> abused. We can add some sort of authentication, like for example, only
> allow users with a email? For now, it's not a problem.
> -  We are building gcc for C, C++, ObjC (Which is the default). Shall we
> add more languages to the mix?
> - the gdb buildbot has a feature I have disabled (the TRY scheduler)
> which allows people to submit patches to the buildbot, buildbot patches
> the current svn version, builds and tests that. Would we want something
> like this?

I think this is very useful.
We have something like that both at Linaro and ST.
On a few occasions, I manually submitted other people's patches
for testing after they were posted to gcc-patches@. It always
caught a few problems in some less common configurations.
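For reference, in buildbot nine this is the Try_Userpass scheduler. A sketch only; the port, credentials and builder name are placeholders.

```python
# Sketch of a "try" scheduler: lets developers submit a local patch
# for testing before committing.  Port, credentials and builder name
# are placeholders.
from buildbot.plugins import schedulers

try_sched = schedulers.Try_Userpass(
    name='try',
    builderNames=['gcc-fedora-x86_64'],
    port=8031,
    userpass=[('developer', 'changeme')])
```

A developer would then run something like `buildbot try --connect=pb --master=<host>:8031 --username=developer --passwd=changeme` from a patched checkout to get the diff built and tested.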

> - buildbot can notify people if the build fails or if there's a test
> regression. Notification can be sent to IRC and email for example. What
> would people prefer to have as the settings for notifications?

I've recently seen complaints on the gdb list because the buildbot
was sending notifications to too many people. I'm afraid that this
is going to be a touchy area if the notifications contain too many
false positives.
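To limit that risk, the notifier can at least be restricted to newly-failing builds rather than every red one, and to a single list address rather than every blamed committer. A sketch of buildbot nine's MailNotifier; the addresses are placeholders.

```python
# Notify only on newly-failing builds and send to one list address
# instead of mailing individual committers.  Addresses are placeholders.
from buildbot.plugins import reporters

mn = reporters.MailNotifier(
    fromaddr='buildbot@example.org',
    mode=('problem',),            # build failed but the previous one didn't
    sendToInterestedUsers=False,  # don't mail individual committers
    extraRecipients=['gcc-testresults@example.org'])
```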

> - an example of a successful build is:
> This build shows several Changes because between the start and finish of
> a build there were several new commits. Properties show among other
> things test results. Responsible users show the people who were involved
> in the changes for the build.
> I am sure there are lots of other questions and issues. Please let me
> know if you find this interesting and what you would like to see
> implemented.

To summarize, I think such bots are very valuable, even if they only
act as post-commit validations.

But as other people expressed, the main difficulty is what to do with
the results. Analyzing regression reports to make sure they are
not false positives is very time consuming.

Having a buggy bisect framework can also lead to embarrassing
situations, like when I blamed a C++ front-end patch for a regression
in fortran ;-)

Most of the time, I consider it more efficient for the project to warn
the author of the patch that introduced the regression than to try to
fix it myself. Except for the most trivial ones, fixing them myself has
several times resulted in duplicated effort and wasted time. But of
course, there are many gcc developers more efficient than me here :)

Regarding cpu power, maybe we could get free slots in
some cloud? (Travis? Amazon? ...)

Thanks for working on this and starting this discussion.


> Kind regards,
> --
> Paulo Matos
