SuccessChanges

Summary

  1. [InstCombine] Aggregate reconstruction simplification (PR47060) (details)
  2. [NFCI][InstCombine] Pacify GCC builds - don't name variable and enum class identically (details)
  3. Add missing parsing for attributes to std.generic_atomic_rmw op (details)
  4. Don't leave the FPOptions in a UnaryOperator uninitialized. (details)
  5. Use consistent code for setting FPFeatures from operator constructors. (details)
  6. Always keep unset fields in FPOptionsOverride zeroed. (details)
  7. Replace setter named 'getAsOpaqueInt' with a real getter. (details)
  8. [StackSafety] Skip ambiguous lifetime analysis (details)
  9. Initial MLIR python bindings based on the C API. (details)
Commit ae7f08812e0995481eb345cecc5dd4529829ba44 by lebedev.ri
[InstCombine] Aggregate reconstruction simplification (PR47060)

This pattern happens in clang C++ exception lowering code, on unwind branch.
We end up having a `landingpad` block after each `invoke`, where RAII
cleanup is performed, and the elements of an aggregate `{i8*, i32}`
holding exception info are `extractvalue`'d, and we then branch to common block
that takes extracted `i8*` and `i32` elements (via `phi` nodes),
form a new aggregate, and finally `resume`'s the exception.

The problem is that, if the cleanup block is effectively empty,
it shouldn't be there, there shouldn't be that `landingpad` and `resume`,
said `invoke` should be a  `call`.

Indeed, we do that simplification in e.g. SimplifyCFG `SimplifyCFGOpt::simplifyResume()`.
But the thing is, all this extra `extractvalue` + `phi` + `insertvalue` cruft,
while it is pointless, does not look like "empty cleanup block".
So the `SimplifyCFGOpt::simplifyResume()` fails, and the exception is has
higher cost than it could have on unwind branch :S

This doesn't happen *that* often, but it will basically happen once per C++
function with complex CFG that called more than one other function
that isn't known to be `nounwind`.

I think, this is a missing fold in InstCombine, so i've implemented it.

I think, the algorithm/implementation is rather self-explanatory:
1. Find a chain of `insertvalue`'s that fully tell us the initializer of the aggregate.
2. For each element, try to find from which aggregate it was extracted.
   If it was extracted from the aggregate with identical type,
   from identical element index, great.
3. If all elements were found to have been extracted from the same aggregate,
   then we can just use said original source aggregate directly,
   instead of re-creating it.
4. If we fail to find said aggregate when looking only in the current block,
   we need be PHI-aware - we might have different source aggregate when coming
   from each predecessor.

I'm not sure if this already handles everything, and there are some FIXME's,
i'll deal with all that later in followups.

I'd be fine with going with post-commit review here code-wise,
but just in case there are thoughts, i'm posting this.

On RawSpeed, for example, this has the following effect:
```
| statistic name                                    | baseline | proposed |     Δ |       % | abs(%) |
|---------------------------------------------------|---------:|---------:|------:|--------:|-------:|
| instcombine.NumAggregateReconstructionsSimplified |        0 |     1253 |  1253 |   0.00% |  0.00% |
| simplifycfg.NumInvokes                            |      948 |     1355 |   407 |  42.93% | 42.93% |
| instcount.NumInsertValueInst                      |     4382 |     3210 | -1172 | -26.75% | 26.75% |
| simplifycfg.NumSinkCommonCode                     |      574 |      458 |  -116 | -20.21% | 20.21% |
| simplifycfg.NumSinkCommonInstrs                   |     1154 |      921 |  -233 | -20.19% | 20.19% |
| instcount.NumExtractValueInst                     |    29017 |    26397 | -2620 |  -9.03% |  9.03% |
| instcombine.NumDeadInst                           |   166618 |   174705 |  8087 |   4.85% |  4.85% |
| instcount.NumPHIInst                              |    51526 |    50678 |  -848 |  -1.65% |  1.65% |
| instcount.NumLandingPadInst                       |    20865 |    20609 |  -256 |  -1.23% |  1.23% |
| instcount.NumInvokeInst                           |    34023 |    33675 |  -348 |  -1.02% |  1.02% |
| simplifycfg.NumSimpl                              |   113634 |   114708 |  1074 |   0.95% |  0.95% |
| instcombine.NumSunkInst                           |    15030 |    14930 |  -100 |  -0.67% |  0.67% |
| instcount.TotalBlocks                             |   219544 |   219024 |  -520 |  -0.24% |  0.24% |
| instcombine.NumCombined                           |   644562 |   645805 |  1243 |   0.19% |  0.19% |
| instcount.TotalInsts                              |  2139506 |  2135377 | -4129 |  -0.19% |  0.19% |
| instcount.NumBrInst                               |   156988 |   156821 |  -167 |  -0.11% |  0.11% |
| instcount.NumCallInst                             |  1206144 |  1207076 |   932 |   0.08% |  0.08% |
| instcount.NumResumeInst                           |     5193 |     5190 |    -3 |  -0.06% |  0.06% |
| asm-printer.EmittedInsts                          |   948580 |   948299 |  -281 |  -0.03% |  0.03% |
| instcount.TotalFuncs                              |    11509 |    11507 |    -2 |  -0.02% |  0.02% |
| inline.NumDeleted                                 |    97595 |    97597 |     2 |   0.00% |  0.00% |
| inline.NumInlined                                 |   210514 |   210522 |     8 |   0.00% |  0.00% |
```
So we manage to increase the amount of `invoke` -> `call` conversions in SimplifyCFG by almost a half,
and there is a very apparent decrease in instruction and basic block count.

On vanilla llvm-test-suite:
```
| statistic name                                    | baseline | proposed |     Δ |       % | abs(%) |
|---------------------------------------------------|---------:|---------:|------:|--------:|-------:|
| instcombine.NumAggregateReconstructionsSimplified |        0 |      744 |   744 |   0.00% |  0.00% |
| instcount.NumInsertValueInst                      |     2705 |     2053 |  -652 | -24.10% | 24.10% |
| simplifycfg.NumInvokes                            |     1212 |     1424 |   212 |  17.49% | 17.49% |
| instcount.NumExtractValueInst                     |    21681 |    20139 | -1542 |  -7.11% |  7.11% |
| simplifycfg.NumSinkCommonInstrs                   |    14575 |    14361 |  -214 |  -1.47% |  1.47% |
| simplifycfg.NumSinkCommonCode                     |     6815 |     6743 |   -72 |  -1.06% |  1.06% |
| instcount.NumLandingPadInst                       |    14851 |    14712 |  -139 |  -0.94% |  0.94% |
| instcount.NumInvokeInst                           |    27510 |    27332 |  -178 |  -0.65% |  0.65% |
| instcombine.NumDeadInst                           |  1438173 |  1443371 |  5198 |   0.36% |  0.36% |
| instcount.NumResumeInst                           |     2880 |     2872 |    -8 |  -0.28% |  0.28% |
| instcombine.NumSunkInst                           |    55187 |    55076 |  -111 |  -0.20% |  0.20% |
| instcount.NumPHIInst                              |   321366 |   320916 |  -450 |  -0.14% |  0.14% |
| instcount.TotalBlocks                             |   886816 |   886493 |  -323 |  -0.04% |  0.04% |
| instcount.TotalInsts                              |  7663845 |  7661108 | -2737 |  -0.04% |  0.04% |
| simplifycfg.NumSimpl                              |   886791 |   887171 |   380 |   0.04% |  0.04% |
| instcount.NumCallInst                             |   553552 |   553733 |   181 |   0.03% |  0.03% |
| instcombine.NumCombined                           |  3200512 |  3201202 |   690 |   0.02% |  0.02% |
| instcount.NumBrInst                               |   741794 |   741656 |  -138 |  -0.02% |  0.02% |
| simplifycfg.NumHoistCommonInstrs                  |    14443 |    14445 |     2 |   0.01% |  0.01% |
| asm-printer.EmittedInsts                          |  7978085 |  7977916 |  -169 |   0.00% |  0.00% |
| inline.NumDeleted                                 |    73188 |    73189 |     1 |   0.00% |  0.00% |
| inline.NumInlined                                 |   291959 |   291968 |     9 |   0.00% |  0.00% |
```
Roughly similar effect, less instructions and blocks total.

See also: rGe492f0e03b01a5e4ec4b6333abb02d303c3e479e.

Compile-time wise, this appears to be roughly geomean-neutral:
http://llvm-compile-time-tracker.com/compare.php?from=39617aaed95ac00957979bc1525598c1be80e85e&to=b59866cf30420da8f8e3ca239ed3bec577b23387&stat=instructions

And this is a win size-wize in general:
http://llvm-compile-time-tracker.com/compare.php?from=39617aaed95ac00957979bc1525598c1be80e85e&to=b59866cf30420da8f8e3ca239ed3bec577b23387&stat=size-text

See https://bugs.llvm.org/show_bug.cgi?id=47060

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D85787
The file was modifiedclang/test/CodeGenCXX/nrvo.cpp
The file was modifiedllvm/lib/Transforms/InstCombine/InstCombineInternal.h
The file was modifiedllvm/lib/Transforms/InstCombine/InstructionCombining.cpp
The file was modifiedllvm/test/Transforms/InstCombine/phi-aware-aggregate-reconstruction.ll
The file was modifiedllvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
The file was modifiedllvm/test/Transforms/InstCombine/aggregate-reconstruction.ll
Commit 0ec1f0f332c7a8c4833c91ebdb88382660119f19 by lebedev.ri
[NFCI][InstCombine] Pacify GCC builds - don't name variable and enum class identically
The file was modifiedllvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
Commit de71b46a519db014ce906a39f8a0e1b235ef1568 by joker.eph
Add missing parsing for attributes to std.generic_atomic_rmw op

Fix llvm.org/pr47182

Differential Revision: https://reviews.llvm.org/D86030
The file was modifiedmlir/lib/Dialect/StandardOps/IR/Ops.cpp
The file was modifiedmlir/test/IR/core-ops.mlir
Commit 9860e68450cd04ec2c53fe2dbcfab64a86c76673 by richard
Don't leave the FPOptions in a UnaryOperator uninitialized.

We don't appear to use these FPOptions for anything right now, but
they shouldn't be uninitialized because that makes our AST file output
nondeterministic.
The file was modifiedclang/lib/AST/Expr.cpp
The file was addedclang/test/PCH/determinism.cpp
Commit ae3067055b33f6ab5657fbae5845cc743b91c299 by richard
Use consistent code for setting FPFeatures from operator constructors.
The file was modifiedclang/lib/AST/Expr.cpp
Commit ae500e4d0964adea69372d083416b0f13e9a87eb by richard
Always keep unset fields in FPOptionsOverride zeroed.

There are three fields that the FPOptions default constructor sets to
non-zero values; those fields previously could have been zero or
non-zero depending on whether they'd been explicitly removed from the
FPOptionsOverride set. However, that doesn't seem to ever actually
happen, so this is NFC, except that it makes the AST file representation
of FPOptionsOverride make more sense.
The file was modifiedclang/include/clang/Basic/LangOptions.h
The file was modifiedclang/test/PCH/determinism.cpp
Commit 948219d1098736758123c43f995ec784db5d921e by richard
Replace setter named 'getAsOpaqueInt' with a real getter.

Clean up a bunch of places where the opaque forms of FPOptions and
FPOptionsOverride were being used inappropriately.
The file was modifiedclang/lib/Sema/SemaAttr.cpp
The file was modifiedclang/include/clang/Sema/Sema.h
The file was modifiedclang/include/clang/Basic/LangOptions.h
The file was modifiedclang/lib/Serialization/ASTReaderStmt.cpp
The file was modifiedclang/lib/Serialization/ASTWriter.cpp
The file was modifiedclang/lib/Sema/TreeTransform.h
The file was modifiedclang/lib/Serialization/ASTReader.cpp
The file was modifiedclang/include/clang/Serialization/ASTReader.h
The file was modifiedclang/lib/Parse/ParseDeclCXX.cpp
The file was modifiedclang/lib/Sema/Sema.cpp
Commit e10e7829bf6f10c053c05e42b676d7acaf54a221 by Vitaly Buka
[StackSafety] Skip ambiguous lifetime analysis

If we can't identify alloca used in lifetime marker we
need to assume to worst case scenario.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D84630
The file was modifiedllvm/include/llvm/Analysis/StackLifetime.h
The file was modifiedllvm/test/CodeGen/AArch64/stack-tagging.ll
The file was modifiedllvm/lib/Analysis/StackLifetime.cpp
The file was addedllvm/test/Transforms/SafeStack/X86/no-crash-on-lifetime.ll
The file was modifiedllvm/test/Analysis/StackSafetyAnalysis/lifetime.ll
Commit fcd2969da9e04a70103bfbf8a382c0842fcf6aaf by stellaraccident
Initial MLIR python bindings based on the C API.

* Basic support for context creation, module parsing and dumping.

Differential Revision: https://reviews.llvm.org/D85481
The file was addedmlir/lib/Bindings/Python/IRModules.cpp
The file was addedmlir/lib/Bindings/Python/IRModules.h
The file was modifiedmlir/lib/Bindings/Python/CMakeLists.txt
The file was modifiedmlir/lib/Bindings/Python/MainModule.cpp
The file was addedmlir/test/Bindings/Python/ir_test.py