Commit
ffee8040534495fa739808e6c66a7fc73eca27bb
by sguelton
Correctly track GCOVProfiling IR update
Differential Revision: https://reviews.llvm.org/D82742
|
 | llvm/lib/Transforms/Instrumentation/GCOVProfiling.cpp |
Commit
3ee580d0176f69a9f724469660f1d1805e0b6a06
by sam.parker
[ARM][LowOverheadLoops] Handle reductions
While validating live-out values, record instructions that look like a reduction. This comprises a vector op (for now only vadd), a vorr (vmov) which stores the previous value of the vadd, and then a vpsel in the exit block which is predicated upon a vctp. This vctp will combine the last two iterations using the vmov and vadd into a vector which can then be consumed by a vaddv.
Once we have determined that it's safe to perform tail-predication, we need to change this sequence of instructions so that the predication doesn't produce incorrect code. This involves changing the register allocation of the vadd so it updates itself and the predication on the final iteration will not update the falsely predicated lanes. This mimics what the vmov, vctp and vpsel do and so we then don't need any of those instructions.
Differential Revision: https://reviews.llvm.org/D75533
|
 | llvm/lib/CodeGen/ReachingDefAnalysis.cpp |
 | llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp |
 | llvm/test/CodeGen/Thumb2/LowOverheadLoops/reductions.ll |
 | llvm/include/llvm/CodeGen/ReachingDefAnalysis.h |
 | llvm/test/CodeGen/Thumb2/LowOverheadLoops/vector-arith-codegen.ll |
 | llvm/lib/Target/ARM/ARMBaseInstrInfo.h |
Commit
91823163955859abbdcad5901d765aeae860939e
by Saiyedul.Islam
[AMDGPU] Spill more than wavesize CSR SGPRs
In case of more than wavesize CSR SGPR spills, lanes of the reserved VGPR were getting overwritten due to wrap-around.
Reserve a VGPR (when NumVGPRSpillLanes = 0, WaveSize, 2*WaveSize, ..) when one of two conditions is true: 1. The reserved VGPR being tracked by VGPRReservedForSGPRSpill is not yet reserved. 2. All spill lanes of the reserved VGPR(s) are full and another spill lane is required.
Reviewed By: arsenm, kerbowa
Differential Revision: https://reviews.llvm.org/D82463
|
 | llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp |
 | llvm/test/CodeGen/AMDGPU/spill_more_than_wavesize_csr_sgprs.ll |
Commit
a8e582c8307ba1d33c05d272b5c1b755fa809b51
by hans
[ThinLTO] Always parse module level inline asm with AT&T dialect (PR46503)
clang-cl passes -x86-asm-syntax=intel to the cc1 invocation so that assembly listings produced by the /FA flag are printed in Intel dialect. That flag however should not affect the *parsing* of inline assembly in the program. (See r322652)
When compiling normally, AsmPrinter::emitInlineAsm is used for assembling and defaults to the AT&T dialect. However, when compiling for ThinLTO, the code which parses module level inline asm to find symbols for the symbol table was failing to set the dialect. This patch fixes that. (See the bug for more details.)
Differential revision: https://reviews.llvm.org/D82862
|
 | llvm/lib/Object/ModuleSymbolTable.cpp |
 | clang/test/CodeGen/thinlto-inline-asm.c |
Commit
f12cd99c440a83d53a8717a9c8cdc4df41f39f3d
by sam.mccall
[clangd] Config: compile Fragment -> CompiledFragment -> Config
Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82612
|
 | clang-tools-extra/clangd/ConfigProvider.h |
 | clang-tools-extra/clangd/unittests/ConfigYAMLTests.cpp |
 | clang-tools-extra/clangd/ConfigFragment.h |
 | clang-tools-extra/clangd/ConfigYAML.cpp |
 | clang-tools-extra/clangd/unittests/CMakeLists.txt |
 | clang-tools-extra/clangd/unittests/ConfigTesting.h |
 | clang-tools-extra/clangd/ConfigCompile.cpp |
 | clang-tools-extra/clangd/CMakeLists.txt |
 | clang-tools-extra/clangd/unittests/ConfigCompileTests.cpp |
Commit
52f65323660051a5d039d475edfd4a3018682dcb
by endre.fulop
[analyzer][CrossTU] Lower CTUImportThreshold default value
Summary: The default value of 100 makes the analysis slow. Projects of considerable size can take more time to finish than is practical. The new default setting of 8 is based on the analysis of LLVM itself: with the old default value of 100, the analysis time was over an order of magnitude slower. Thresholding the load of ASTUnits is to be extended in the future with a more fine-tunable solution that accommodates the specifics of the project analyzed.
Reviewers: martong, balazske, Szelethus
Subscribers: whisperity, xazax.hun, baloghadamsoftware, szepet, rnkovacs, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, Charusso, steakhal, ASDenysPetrov, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82561
|
 | clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def |
 | clang/test/Analysis/analyzer-config.c |
Commit
9d347f6efa3018faf2fa159e25830817f4d2f41d
by llvmgnsyncbot
[gn build] Port f12cd99c440
|
 | llvm/utils/gn/secondary/clang-tools-extra/clangd/unittests/BUILD.gn |
 | llvm/utils/gn/secondary/clang-tools-extra/clangd/BUILD.gn |
Commit
a1aed80a35f3f775cdb1d68c4388723691abc0dd
by paul.walker
[SVE] Relax merge requirement for IR based divides.
We currently lower SDIV to SDIV_MERGE_OP1. This forces the value for inactive lanes in a way that can hamper register allocation; however, the lowering has no requirement for inactive lanes.
Instead this patch replaces SDIV_MERGE_OP1 with SDIV_PRED thus freeing the register allocator. Once done the only user of SDIV_MERGE_OP1 is intrinsic lowering so I've removed the node and perform ISel on the intrinsic directly. This also allows us to implement MOVPRFX based zeroing in the same manner as SUB.
This patch also renames UDIV_MERGE_OP1 and [F]ADD_MERGE_OP1 for the same reason but in the ADD cases the ISel code is already as required.
Differential Revision: https://reviews.llvm.org/D82783
|
 | llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll |
 | llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td |
 | llvm/lib/Target/AArch64/AArch64ISelLowering.h |
 | llvm/lib/Target/AArch64/SVEInstrFormats.td |
 | llvm/lib/Target/AArch64/AArch64ISelLowering.cpp |
Commit
76b2d9cbebd227d42e2099a0eb89c800b945997a
by Tony.Tye
[AMDGPU] Correct AMDGPUUsage.rst DW_AT_LLVM_lane_pc example
- Correct typo of DW_OP_xaddr to DW_OP_addrx in AMDGPUUsage.rst for DW_AT_LLVM_lane_pc example.
Change-Id: I1b0ee2b24362a0240388e4c2f044c1d4883509b9
|
 | llvm/docs/AMDGPUUsage.rst |
Commit
f0ecfb789bb2d3de57876927e03a5c26da8419c8
by sam.parker
[NFC][ARM] Add test.
|
 | llvm/test/CodeGen/Thumb2/LowOverheadLoops/varying-outer-2d-reduction.ll |
Commit
8270a903baf55122289499ba00a979e9c04dcd44
by pavel
[lldb] Scalar re-fix UB in float->int conversions
The refactor in 48ca15592f1 reintroduced UB when converting out-of-bounds floating point numbers to integers -- the behavior for ULongLong() was originally fixed in r341685, but did not survive my refactor because I based my template code on one of the methods which did not have this fix.
This time, I apply the fix to all float->int conversions, instead of just the "double->unsigned long long" case. I also use a slightly simpler version of the code, with fewer round-trips (APFloat->APSInt->native_int vs APFloat->native_float->APInt->native_int).
I also add some unit tests for the conversions.
|
 | lldb/unittests/Utility/ScalarTest.cpp |
 | lldb/source/Utility/Scalar.cpp |
Commit
7f37d8830635bf119a5f630dd3958c8f45780805
by gchatelet
[Alignment][NFC] Migrate MachineFrameInfo::CreateSpillStackObject to Align
This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Differential Revision: https://reviews.llvm.org/D82934
|
 | llvm/lib/CodeGen/FixupStatepointCallerSaved.cpp |
 | llvm/include/llvm/CodeGen/MachineFrameInfo.h |
Commit
85460c4ea273784dd45da558ad9a6f13a79b2d91
by david.stenberg
[DebugInfo] Do not emit entry values for composite locations
Summary: This is a fix for PR45009.
When working on D67492 I made DwarfExpression emit a single DW_OP_entry_value operation covering the whole composite location description that is produced if a register does not have a valid DWARF number, and is instead composed of multiple register pieces. Looking closer at the standard, this appears to not be valid DWARF. A DW_OP_entry_value operation's block can only be a DWARF expression or a register location description, so it appears to not be valid for it to hold a composite location description like that.
See DWARFv5 sec. 2.5.1.7:
"The DW_OP_entry_value operation pushes the value that the described location held upon entering the current subprogram. It has two operands: an unsigned LEB128 length, followed by a block containing a DWARF expression or a register location description (see Section 2.6.1.1.3 on page 39)."
Here is a dwarf-discuss mail thread regarding this:
http://lists.dwarfstd.org/pipermail/dwarf-discuss-dwarfstd.org/2020-March/004610.html
There was not a strong consensus reached there, but people seem to lean towards that operations specified under 2.6 (e.g. DW_OP_piece) may not be part of a DWARF expression, and thus the DW_OP_entry_value operation can't contain those.
Perhaps we instead want to emit an entry value operation for each DW_OP_reg* operation, e.g.:
- DW_OP_entry_value(DW_OP_regx sub_reg0), DW_OP_stack_value, DW_OP_piece 8, - DW_OP_entry_value(DW_OP_regx sub_reg1), DW_OP_stack_value, DW_OP_piece 8, [...]
The question then becomes how the call site should look; should a composite location description be emitted there, and we then leave it up to the debugger to match those two composite location descriptions? Another alternative could be to emit a call site parameter entry for each sub-register, but firstly I'm unsure if that is even valid DWARF, and secondly it seems like that would complicate the collection of call site values quite a bit. As far as I can tell GCC does not emit any entry values / call sites in these cases, so we do not have something to compare with, but the former seems like the more reasonable approach.
Currently when trying to emit a call site entry for a parameter composed of multiple DWARF registers a (DwarfRegs.size() == 1) assert is triggered in addMachineRegExpression(). Until the call site representation is figured out, and until there is use for these entry values in practice, this commit simply stops the invalid DWARF from being emitted.
Reviewers: djtodoro, vsk, aprantl
Reviewed By: djtodoro, vsk
Subscribers: jyknight, hiraditya, fedor.sergeev, jrtc27, llvm-commits
Tags: #debug-info, #llvm
Differential Revision: https://reviews.llvm.org/D75270
|
 | llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp |
 | llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h |
 | llvm/test/DebugInfo/Sparc/entry-value-complex-reg-expr.ll |
Commit
917bdfaca6df575f617b0f3aa989183ab187e8ac
by grimar
[llvm-readobj] - Simplify and refine hash table tests
Now we are able to have default values for macros in YAML descriptions. I've applied this to the hash table tests and also fixed a few copy-paste issues in their comments.
Differential revision: https://reviews.llvm.org/D82870
|
 | llvm/test/tools/llvm-readobj/ELF/gnuhash.test |
 | llvm/test/tools/llvm-readobj/ELF/hash-symbols.test |
 | llvm/test/tools/llvm-readobj/ELF/hash-histogram.test |
Commit
61f967dccabab67f9996a4fb1c6ec4fa4f23f005
by grimar
[llvm-readobj] - Don't crash when checking the number of dynamic symbols.
When deriving the number of symbols from the DT_HASH table, we could crash while calculating the number of symbols in the symbol table when SHT_DYNSYM has sh_entsize == 0.
The patch fixes the issue.
Differential revision: https://reviews.llvm.org/D82877
|
 | llvm/test/tools/llvm-readobj/ELF/dyn-symbols-size-from-hash-table.test |
 | llvm/tools/llvm-readobj/ELFDumper.cpp |
Commit
7dcc3858e72666dc12240c8a4bd278775cd807ea
by sam.mccall
[clangd] Fix name conflict again, unbreak GCC. NFC
|
 | clang-tools-extra/clangd/unittests/ConfigTesting.h |