GNU Tools Cauldron 2012


group_photo.jpg

cauldron logo small.png

Contents

  1. GNU Tools Cauldron 2012
    1. The cake
    2. Schedule
    3. Organizers
    4. Sponsors
    5. Mailing lists
    6. Workshop description
    7. Accommodation
    8. Presentations
      1. Keynote Presentation - Free software, GNU and GCC
      2. Status of High-Level Loop Optimizations in GCC
      3. GDB Roadmap
      4. Finding races and memory errors with GCC instrumentation (AddressSanitizer)
      5. Fission
      6. Control-flow preservation in GCC for safety-critical uses
      7. Free Software: A viable model for Commercial Success
      8. Compiler Optimizations for Dynamic Scripting Language Interpreters and JITs
      9. An implementation of predicated value numbering
      10. Improving Function Pointer Security for Virtual Method Dispatches
      11. The Local Register Allocator Project
      12. What's New in C++11
      13. Towards feature parity of GDB remote and native debugging
      14. The Quest for Cheaper Variable Tracking in GCC
      15. GDB vs. MPI (Message Passing Interface)
      16. New programming abstractions for concurrency
      17. Supporting Parallel Component Debugging Using the GDB Python Interface
      18. Reducing DWARF debuginfo size
      19. Towards Multicore GDB
      20. The Cilk Plus Implementation on GCC
      21. GCC Doc Futures
      22. Pre-Parsed Headers
      23. C++ Conversion BoF
      24. G++ diagnostics: present and (near) future
      25. Straight-line strength reduction in GCC
      26. Identifying compiler options to minimize energy consumption by embedded programs
      27. The Benefit of GCC's open structure on instrumentation in the HPC area
      28. Status of the x32 psABI
      29. StarPU's C Extensions for Hybrid CPU/GPU Task Programming, or, An Experience in Turning a Clumsy API Into Language Extensions
      30. PowerPC BoF
      31. GCC GNAT Ada in jet engine control systems
    9. How to reach the workshop
    10. Getting to Airport

The cake

cake1.jpg cake2.jpg

To celebrate the 25th anniversary of GCC, all the hackers present gathered around a cake representing a VT-100 terminal displaying the original announcement of the project, and sang the Free Software Song, composed by RMS, under the lead of our release manager Richard Guenther.

free_software_song.pdf

Schedule

The schedule for the workshop can be accessed here.

The conference booklet is booklet.pdf.

We will be running all the presentations registered in advance as a single stream (Stream 1 in the schedule). Presentations registered on the first day of the workshop will likely need to run in a parallel stream (Stream 2 in the schedule).

Presentations have been assigned 45-minute slots (ideally 30 minutes for the presentation and 15 minutes for questions). If you think you will need more or less time, please contact us at tools-cauldron-admin@googlegroups.com.

Organizers

Organizing committee:

Sponsors

Mailing lists

  1. Abstract submissions, registration, administrivia questions: tools-cauldron-admin@googlegroups.com

  2. Announcements and discussions related to the conference: gcc@gcc.gnu.org .

Workshop description

We are pleased to announce another gathering of GNU tools developers. The basic format of this meeting will be similar to the last one at the Google offices in London (http://gcc.gnu.org/wiki/GCCGathering2011).

The purpose of this workshop is to gather all GNU tools developers, discuss current and future work, coordinate efforts, exchange reports on ongoing efforts, discuss development plans for the next 12 months, and hold developer tutorials and any other related discussions.

We will meet at the Lesser Town Campus of Charles University in Prague (Malostranske Namesti 25, Prague, Czech Republic map1, map2).

We are inviting every developer working on the GNU toolchain: GCC, GDB, binutils, runtimes, etc. The basic format of the meeting will be similar to the London one but, in addition to discussion topics selected at the conference, we are looking for advance submissions.

If you have a topic that you would like to present, please submit an abstract describing what you plan to present. We are accepting three types of submissions:

Note that we will not be doing in-depth reviews of the presentations. We are mainly looking at applicability and deciding on scheduling. There will be time at the conference to add other topics of discussion, similar to what we did at the London meeting.

To register your abstract, send e-mail to tools-cauldron-admin@googlegroups.com .

Your submission should contain the following information:

If you intend to participate, but not necessarily present, please let us know as well. Send a message to tools-cauldron-admin@googlegroups.com stating your intent to participate.

Accommodation

The conference venue can be conveniently reached by public transport, either by Metro (subway, underground train) line A (green line) to the Malostranská station followed by a short walk, or by tramway lines No. 12, 20 or 22 to the Malostranské náměstí stop. The tramway stop is situated right across the square from the conference venue. Public transit schematics can be downloaded at http://www.dpp.cz/en/transport-around%20prague/transit-schematics/.

Because the venue is right in the center of Prague, it is easy to find lodging options on common booking sites, like http://www.marys.cz/.

Some options in walking distance from the venue include:

Presentations

Keynote Presentation - Free software, GNU and GCC

Presenter: Richard Stallman

rms.jpg

The free software movement's goals, and how GNU and GCC are part of achieving them.

Temporary location of the video recording: http://kam.mff.cuni.cz/~hubicka/rms/rms.html

Download: OGG Video

Status of High-Level Loop Optimizations in GCC

We will present the state of high-level loop optimizations in GCC. To find a viable path forward, several alternatives will be presented as a basis for discussion.

GDB Roadmap

Status of missing features and of features being worked on (as far as known to me). How to keep the code of unused template methods separate from the code output. Implementation choices for dynamic types (such as variable-length arrays) in GDB.

Finding races and memory errors with GCC instrumentation (AddressSanitizer)

We will present two dynamic testing tools based on compile-time instrumentation. AddressSanitizer (ASan) finds memory bugs, such as use-after-free and out-of-bounds accesses to heap, stack and globals. This tool could be seen as a partial replacement for Valgrind and similar tools. The major advantages over Valgrind are the speed (less than 2x slowdown on average) and the ability to handle bugs related to stack and globals.

AddressSanitizer can also fully replace Mudflap. ThreadSanitizer (TSan) finds data races. It uses the same race detection algorithm as the Valgrind-based TSan, but compile-time instrumentation allows it to be much faster (2x-4x slowdown). Both tools are implemented using GCC and LLVM infrastructures, so we will provide a comparison between GCC and LLVM from our perspective. We will also share our experience in deploying these testing tools in large software projects.

More info:

Fission

Fission is about improving debugger usability and link-time performance. We've designed DWARF extensions that allow us to split the bulk of the debug information from the object files, allowing us to substantially reduce total link time and the size of the linked binary. In addition, because the input files to the linker are significantly smaller, bandwidth needed for a distributed build system is also reduced. The final executable will contain an index of the debug information, allowing the debugger to locate the debug information on demand, so that debugger start-up time can also be reduced. A full description of the project is on the GCC wiki: http://gcc.gnu.org/wiki/DebugFission.

Control-flow preservation in GCC for safety-critical uses

The proposed presentation is about the introduction of a "-fpreserve-control-flow" option in GCC, which directs the compiler operations so that the control flow expressed in the source persists in the generated assembly code.

The interest is twofold:

The basic idea is to allow inferring which values were taken by boolean operands or expressions from information on the execution flow at the corresponding machine branch points (provided by the instrumented execution environment). Very roughly, we need the relevant branches to remain in place, and debug info accurate enough to map them to source expressions in the presence of arbitrarily complex constructs, which poses a few challenges to solve in the compiler.

At this point, we have a stable implementation in our local gcc 4.5 series, supporting optimizations up to -O1. We use this to offer a non-intrusive coverage analysis framework, using valgrind or qemu as virtual execution environments instrumented to produce execution traces.

We are about to port this to gcc 4.7 and would be happy to contribute it to mainline after discussing the approach with other developers.

The presentation will include an introduction to the

Free Software: A viable model for Commercial Success

This talk will discuss our experience at AdaCore, one of only a handful of 100% Free Software companies. All of our commercial products are licensed under the GPL and other Free Software licenses. People often assume that there is a conflict between the use of such licenses and the needs of a commercial software company. Our experience at AdaCore shows that, on the contrary, the Free Software model can be very successful both for us as a company and for our customers. We think this model can be used in many other circumstances, and want to encourage free software enthusiasts to consider it.

Video of the talk: http://www.youtube.com/watch?v=PwRUk7KD8mc

Compiler Optimizations for Dynamic Scripting Language Interpreters and JITs

Modern Dynamic Scripting Languages such as Python, Ruby and PHP traditionally have been implemented as interpreters written in C. With their increasing usage in web frameworks and cloud computing infrastructure, they frequently are deployed on GNU/Linux systems, which means they are compiled with GCC. This presentation will examine compiler optimizations that can improve the performance of these types of languages. As performance demands of these languages increase, some implementations are turning to JITs and this talk will explore some compiler features that assist JITs for these languages.

An implementation of predicated value numbering

Presenter: Michael Matz

I will talk about my ongoing work to replace the current value numberer with one capable of dealing with predicates (i.e., value numbers that depend on conditions), hopefully with some other advantages over the current one as well.
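
As a rough illustration of what a value number depending on a condition can mean (a hypothetical example, not taken from the talk): inside a branch guarded by `a == b`, the expressions `a + 1` and `b + 1` compute the same value, so a predicate-aware value numberer could in principle reuse the first result. All function names below are invented for this sketch.

```cpp
#include <cassert>

// As written by the programmer: without knowledge of the predicate,
// `a + 1` and `b + 1` look like two unrelated expressions.
int sum_original(int a, int b) {
    if (a == b) {
        int x = a + 1;
        int y = b + 1;  // equal to x whenever a == b holds
        return x + y;
    }
    return 0;
}

// What a predicate-aware value numberer could derive: under a == b,
// `b + 1` gets the same value number as `a + 1`, so y becomes x.
int sum_after_predicated_vn(int a, int b) {
    if (a == b) {
        int x = a + 1;
        return x + x;
    }
    return 0;
}

// Checks that both versions agree on a small range of inputs.
void check_predicated_vn() {
    for (int a = -3; a <= 3; ++a) {
        for (int b = -3; b <= 3; ++b) {
            assert(sum_original(a, b) == sum_after_predicated_vn(a, b));
        }
    }
}
```

The equivalence only holds inside the guarded region, which is why the value number has to be tied to the predicate rather than to the expression alone.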

Improving Function Pointer Security for Virtual Method Dispatches

A common vector of attack in C++ programs is for attackers to make use of use-after-free bugs in the program to overwrite vtable pointers and hijack program execution. If the attacker discovers a use-after-free bug in the program, he waits until the object has been freed, then re-allocates the same memory to be an object of the same size and overwrites the vtable pointer in that object. When the object is accessed by the program (the use-after-free bug), it uses the attacker's vtable pointer to then go and start executing the attacker's code.

We are working on an approach to detect when attackers have overwritten vtable pointers, without significant performance penalties, and without changing the C++ ABI.

The Local Register Allocator Project

The Local Register Allocator (LRA) project is focused on replacing the famous GCC reload pass. The project history, motivation and goals, and the different approaches considered for the LRA implementation are discussed. An overview of LRA and its structure, the tasks solved by LRA, and the current state of the project, including SPEC2000 benchmark results on some major platforms, are given.

The future of the LRA project is discussed, along with possible new RA optimizations that utilize new CPU features and could be implemented on top of LRA.

What's New in C++11

The second revision of the C++ language standard, C++ 2011, was ratified last year, thirteen years after the first one. In this talk, I will discuss notable additions to the language since C++98/03.
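
For readers unfamiliar with the new standard, here is a small sample of C++11 additions (`auto`, range-based `for`, lambdas, `nullptr`, brace initialization, `static_assert`). It is only an illustrative sketch; the selection and the function names are our own and are not taken from the talk.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sum the even numbers, using a range-based for loop and `auto`.
int sum_of_evens(const std::vector<int>& values) {
    int total = 0;
    for (auto v : values) {
        if (v % 2 == 0) total += v;
    }
    return total;
}

// Count the elements above a threshold, using a lambda with a capture.
int count_above(const std::vector<int>& values, int threshold) {
    auto above = [threshold](int v) { return v > threshold; };
    return static_cast<int>(std::count_if(values.begin(), values.end(), above));
}

// Exercises the helpers plus a few more C++11 features.
void cxx11_demo() {
    std::vector<int> values{1, 2, 3, 4, 5, 6};  // brace initialization
    int* none = nullptr;                        // typed null pointer constant
    static_assert(sizeof(long long) >= 8, "long long is at least 64 bits");
    assert(none == nullptr);
    assert(sum_of_evens(values) == 12);
    assert(count_above(values, 3) == 3);
}
```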

Towards feature parity of GDB remote and native debugging

Presenter: Ulrich Weigand

GDB supports debugging applications running natively on the host system as well as debugging applications running on a remote system. In the latter case, GDB talks to a remote stub on the target side. If the target system is running Linux, this remote stub is usually gdbserver, which comes as part of the GDB distribution itself. In principle, there should be no difference between debugging a Linux process natively and debugging it remotely via gdbserver. However, for historical reasons, the GDB native Linux target is implemented as a completely separate code base from gdbserver. Over time, this has led to the unfortunate situation that certain GDB features are in fact only available when debugging natively (and other features are available only when debugging remotely). I'm planning to present an overview of the current state of affairs, including a couple of improvements that were implemented recently. I'm also planning to discuss proposals on how to move forward, ideally towards a goal of achieving feature parity between native and remote/gdbserver debugging by actually sharing a single code base.

The Quest for Cheaper Variable Tracking in GCC

GCC's variable tracking pass got visibly more expensive with the introduction of VTA, Variable Tracking at Assignments. The pass scans each basic block for insns relevant for variable location debug information generation, propagates locations and values across basic blocks with global dataflow analysis, and finally generates notes with location or value expressions for variables.

The last part has been recently improved from an algorithm whose worst case was exponential to one that is linear on the variable/value equivalence graph size. The other parts have gained some memory savings by keeping global equivalences in a global data structure rather than in per-block equivalence sets, but there's need and room for performance improvements, particularly in the confluence operation in dataflow analysis.

The goal of this session is to present the current inner workings of the variable tracking pass, including the recent changes and existing plans, and then to open the floor for discussion, requests and suggestions of further improvements.

GDB vs. MPI (Message Passing Interface)

The MPI (Message Passing Interface) standard is the one established method to achieve highest scale parallelism on today's biggest supercomputers. There are many implementations including free ones. Yet the standard makes life for debuggers pretty difficult.

The MPI API hides away all sorts of management information in handles to give maximum flexibility to implementors. Unfortunately, this includes data type information of all messages. Therefore, debuggers are pretty much unable to show the contents of messages that are exchanged between parallel processes.

We implemented a solution for GDB using two stages: one to collect data type information from the MPI API and a GDB plugin to print a message's contents in a correct and convenient way. With this, GDB and MPI work together like they should have in the first place ... in our opinion.

New programming abstractions for concurrency

Parallelization is becoming more important than in the past, and for more developers. Parallel code often results in a concurrent execution of parts of the program (i.e., when threads do not execute truly in parallel but have to coordinate or synchronize with each other). Because concurrent code is typically more complex than sequential code, we need to provide programming abstractions that make these tasks easier for programmers.

In this talk, I will first give a brief overview of concurrency and the associated programming challenges, and then describe two programming abstractions that have been recently added to GCC: the C++11/C11 atomics and Transactional Memory. Both are based on the C++11/C11 memory model, which I will also introduce.
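
A minimal sketch of the C++11 atomics mentioned above, assuming a simple shared counter (the function name and the workload are hypothetical, chosen only for illustration). Each thread increments the counter with `fetch_add`, so no update is lost even though the threads run concurrently.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Starts `num_threads` threads that each add 1 to a shared counter
// `increments_per_thread` times, and returns the final count.
int count_with_atomics(int num_threads, int increments_per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // Atomic read-modify-write; relaxed ordering is enough
                // here because only the final total is observed.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // Joining makes every increment visible to the calling thread.
    for (auto& w : workers) w.join();
    return counter.load();
}

// Checks that no increment was lost.
void check_atomic_counter() {
    assert(count_with_atomics(4, 10000) == 40000);
}
```

With a plain `int` counter the same loop would contain a data race, which the C++11/C11 memory model treats as undefined behavior. Depending on the platform, building this may require a threading option such as `-pthread`.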

Supporting Parallel Component Debugging Using the GDB Python Interface

In this presentation, we will introduce the work we have undertaken in a joint R&D effort of STMicroelectronics and the Laboratoire d'Informatique de Grenoble on the GDB project.

In the context of parallel and embedded computing, debugging is well-recognized as a complex activity. Nowadays, such applications are no longer developed from scratch, relying only on the programming language primitives. Instead, they lean upon more advanced programming models allowing an easier expression of parallelism.

Interactive debuggers like GDB evolved from their earlier times when they could only handle machine instructions to support the source languages used by developers to write their applications. We believe that their next evolution could be the support of programming models, which would help the developers to manipulate higher level abstractions like the entities or communication mechanisms defined by the programming model. These abstractions will have the advantage of being closer to the concepts the developer dealt with during development time and they will help her to keep focused on application execution behaviour.

Hence, our work consists in improving GDB towards the support of such programming models. On top of GDB's Python interface, and extending it with contributed patches whenever it was required, we prepared a framework supporting the debugging of an ST home-made embedded component framework for MPSoC systems, running on an x86 simulator. The presentation will detail how we leveraged GDB to gather relevant runtime information about the component framework and the set of new features we developed, along with use-cases about their usage.

Reducing DWARF debuginfo size

Generating, linking, reading and storing DWARF debuginfo take significant resources, time and space. We want to discuss some efforts that have recently been done to reduce some of that in the compiler, linker, package manager and tools, like debuggers, that use the DWARF debug information. We are interested in discussing efforts that worked, the various tradeoffs, efforts that didn't produce significant results and ideas for future DWARF reduction work and/or standardization.

Towards Multicore GDB

Multicore systems have been around for a while, but the next generation takes it to a whole new level, with high-performance embedded designs consisting of anywhere from 30 to 1,000 cores. GDB needs significant work to be useful in debugging these targets, both in user interface and to improve performance.

The first part of the task is to expand GDB's vocabulary by formalizing the notion of core as its own first-class object, conceptually similar to a thread but persistent, and by introducing the "process/thread/core set", by which the user works with groups of threads, cores, etc, rather than just one at a time.

The second part is to partition the debugging workload so that GDB is less of a bottleneck. For instance, we introduce the notion of an agent library that can run on each core and handles some tasks locally, such as testing of a breakpoint condition, only notifying GDB when the condition is true.

This presentation will review the current status of multicore work, and look ahead to additional ideas to facilitate debugging of future multicore systems.

The Cilk Plus Implementation on GCC

In the current era of multicore processors, it is necessary for programmers to write efficient code to exploit their full capabilities. In this presentation, we address the Intel (r) Cilk(tm) Plus language extension that is implemented in a GCC branch. Cilk Plus is a set of language constructs for C/C++ for data and task parallelism.

The first construct defines three keywords (_Cilk_spawn, _Cilk_sync and _Cilk_for) that can be used on an existing serial program to make it task parallel. The keywords are simple to use, make the program easy to read and provide strong guarantees of serial equivalence. However, they require the help of a runtime whose source is also included in the compiler branch.

The other components in Cilk Plus provide data-parallelism constructs. First, array notations help the compiler schedule batches of iterations to execute in parallel. If the processor has vectorization support, this construct can help the compiler vectorize the code. In addition, there are built-in functions that provide intrinsic operations such as finding the maximum/minimum, sum and product of all the array elements. Second, elemental functions provide an option to take a scalar function in standard C and C++ and deploy it on many elements of arrays without prescribing an order of operations among the array elements. This allows the compiler to generate a vector version of the function, which vectorizes across a batch of consecutive calls to the elemental function. Finally, we provide a pragma and a set of clauses called pragma SIMD that allow users to communicate intent for vector execution and certain pertinent information to ease the job of the compiler in generating vector code.

In the first part of the presentation, we explain all the components of Cilk Plus. After this, we walk through the compiler modifications in GCC and present some of the performance that can be achieved on some common benchmarks that were converted to Cilk Plus. We end the presentation with some future work and optimization opportunities in the compiler.

GCC Doc Futures

Presenter: Benjamin Kosnik

A complete survey of gcc documentation: what exists, in what formats and why, outstanding legal issues in the GPL vs. GFDL war, where documentation is located in the source, install, and website, how documentation is packaged for releases, how patches are tracked for release notes, how porting information and other derived/contributed information can be made part of canonical GCC documentation sources. A method for integrating wiki content into canonical manuals will be proposed. Known issues with the current documentation will be enumerated, and the audience will be queried and otherwise inspired/cajoled into contributing a more complete list of known issues. Priorities will be assigned to this derived list of known issues, and volunteers solicited to implement solutions in time for the next major gcc release.

Pre-Parsed Headers

In this talk we will discuss the status of the pre-parsed headers (PPH) project. In particular, we will describe implementation challenges, the current state of the PPH branch, lessons learned during implementation and future plans.

C++ Conversion BoF

G++ diagnostics: present and (near) future

Presenters: Paolo Carlini and Dodji Seketeli

Between the 4.6 and 4.7 release series, a lot of work went into the C++ front end (and the preprocessor) to improve the diagnostics and add new warnings, not to mention hundreds of fixes for new and long-standing bugs: e.g., -ftrack-macro-expansion, PR c++/48934, -Wdelete-non-virtual-dtor, -Wzero-as-null-pointer-constant. Most definitely, 4.8 will get some form of "caret diagnostics", and more work is ongoing. From many points of view, however, Clang++ still has an edge: for example ranges, "typedef unwrapping", a spell checker, etc. Which kinds of improvements would we like to see in GCC as soon as possible? Which ones are doable with moderate effort, and which require extended infrastructural work? Which diagnostics would we like to handle differently from Clang?

Straight-line strength reduction in GCC

GCC has long lacked a strength reduction capability outside of loops. Previous attempts to address this within existing frameworks, such as partial redundancy elimination, have not been successful. A primary reason for this is that these frameworks process individual expressions independently. For strength reduction, a determination of profitability often requires examining chains of related strength reduction candidates. This short presentation will demonstrate the issues involved and outline a new SSA dominator-based proposal for efficiently performing non-loop strength reduction.
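
To make the idea concrete, here is a hand-written before/after pair for straight-line code, assuming a chain of multiplications by a common stride `s`. This is only a sketch of the kind of rewrite involved, with invented function names; it is not the proposed implementation.

```cpp
#include <cassert>

// Before: three related multiplications by the same stride `s`.
long sum_before(long x, long s) {
    long a = x * s;
    long b = (x + 1) * s;
    long c = (x + 2) * s;
    return a + b + c;
}

// After: each later product is derived from the previous one with an
// addition, since (x + 1) * s == x * s + s and (x + 2) * s == (x + 1) * s + s.
long sum_after(long x, long s) {
    long a = x * s;
    long b = a + s;
    long c = b + s;
    return a + b + c;
}

// Checks that both versions agree on a small range of inputs.
void check_strength_reduction() {
    for (long x = -4; x <= 4; ++x) {
        for (long s = -4; s <= 4; ++s) {
            assert(sum_before(x, s) == sum_after(x, s));
        }
    }
}
```

Whether the rewrite pays off depends on the whole chain: replacing `b` alone saves one multiplication, while `c` only becomes cheap because `b` is already available. That is the reason related candidates need to be examined together rather than one expression at a time.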

Identifying compiler options to minimize energy consumption by embedded programs

The Benefit of GCC's open structure on instrumentation in the HPC area

Presenters: Johannes Ziegenbalg and Bert Wesarg (Technische Universität Dresden)

Function instrumentation is one foundational method of gathering performance data. This data is stored on disk in event trace files so that a performance analysis can be run later on. Unfortunately, automatic instrumentation often results in lots of trace events being generated during the measurement run, especially in high-performance computing applications. This may alter the program behavior due to a large runtime overhead. Additionally, the trace file becomes too large to be analyzed efficiently. Therefore, instrumentation filtering is inevitable. Though GCC is one of the few compilers which support function instrumentation filtering at compile time without altering the source code, its filtering is difficult, if not impossible, to control.

In our presentation, we talk about our achievements using the instrumentation framework InterAspect to generate a GCC plugin which provides better control over the instrumentation.

We also present our plans to reduce the induced overhead by improving the generated code from the function instrumentation.

Status of the x32 psABI

Presenter: H.J. Lu

This talk presents the current status of the x32 psABI, which brings x86-64 features to 32-bit applications while keeping the memory footprint to 32 bits. It will discuss the performance of the new ABI and the challenges it faces.

StarPU's C Extensions for Hybrid CPU/GPU Task Programming, or, An Experience in Turning a Clumsy API Into Language Extensions

StarPU started as a run-time support library for hybrid CPU/GPU task programming, later supplemented by a GCC plug-in. The GCC plug-in allows programmers to annotate C code to describe tasks and their implementations. Each task may have one or more implementations, such as CPU implementations or implementations written in OpenCL.

StarPU's support library schedules tasks over the available CPU cores and GPUs, and is also responsible for scheduling any data transfers between main memory and GPUs.

This talk will present the rationale for StarPU's C extensions and describe them. We will then report on our experience turning a C API into convenient language extensions, and discuss this use case for GCC plug-ins.

PowerPC BoF

GCC GNAT Ada in jet engine control systems

How to reach the workshop

From the Airport

From the railway station "Hlavni nadrazi"

From the railway station "Nadrazi Holesovice"

See an aerial view and a street plan of the surroundings of Malostranske namesti: http://kam.mff.cuni.cz/reach/reachpic.html (330 kb).

Current public transport timetables and fares: http://www.dpp.cz/en/.

The workshop events will take place on the 1st floor of the School of Informatics building.

Getting to Airport

tshirt.png

None: cauldron2012 (last edited 2014-01-02 10:53:59 by TobiasBurnus)