[Bug ipa/98748] New: Increased precision for points-to analysis

bugzilla at bash9 dot xyz gcc-bugzilla@gcc.gnu.org
Tue Jan 19 13:09:53 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98748

            Bug ID: 98748
           Summary: Increased precision for points-to analysis
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: ipa
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bugzilla at bash9 dot xyz
                CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Hi,

this ticket is to discuss the possibility of extending the points-to analysis
in GCC. The points-to analysis in GCC is currently field-sensitive,
context-insensitive and flow-insensitive. However the current GCC
implementation suffers from some limitations:

1) field-sensitivity is unavailable for heap variables (i.e. those that are
allocated via malloc)
2) field-sensitivity is similarly unavailable for structs allocated in dynamic
arrays (again via malloc).

I understand that these two limitations stem from the fact that no type
information is available at the allocation site. I also believe that the
current IPA-PTA may not take advantage of type information (however, I am not
100% sure of this at the moment).

I've been reading this paper [0] which has a Datalog implementation for a
points-to analysis that computes the points-to solution with field-sensitivity
for heap variables and field-sensitivity for dynamic arrays. The implementation
is also available on GitHub [1]. At the moment it analyzes LLVM-IR. 

This added sensitivity should be enough to unlock data-layout transformations
at the granularity of alias-sets (i.e., it will change the layout of all
variables pointing to the same object) as opposed to a coarser type granularity
(i.e., to change the layout of all variables with a given type). For example:
dropping unused fields from structs [2] and transforming a "struct of arrays to
array of structs & vice versa".

If one went and performed the most direct translation of the available analysis
in GCC, one would need to iterate over all gimple code and generate Datalog
facts that would then be fed to the solver. I believe that XSB is an
implementation of Datalog that could be called from a GCC-plugin to compute the
points-to analysis. However, it would be preferable to have the points-to
analysis be performed directly in GCC.

Please let me know if you have thoughts about this. I am happy to have a
dialogue so that we can all come to a maintainable and reliable solution.

Thanks!


[0] Balatsouras, George, and Yannis Smaragdakis. "Structure-sensitive points-to
analysis for C and C++." International Static Analysis Symposium. Springer,
Berlin, Heidelberg, 2016. (available at the author's website)

[1] https://github.com/plast-lab/cclyzer

[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92801


More information about the Gcc-bugs mailing list