external/llvm.git -

Age	Commit message	Author
37 hours	[clang-format] Fix a bug in formatting goto labels in macros (#92494) upstream-release/18.x	Owen Pan
	Fixes #92300. (cherry picked from commit d89f20058b45e3836527e816af7ed7372e1d554d)
37 hours	release/18.x: [clang-format] Don't always break before << between string literals (#92214) (#94091)	Owen Pan
3 days	[PPCMergeStringPool] Only replace constant once (#92996)	Nikita Popov
	In #88846 I changed this code to use RAUW to perform the replacement instead of manual updates -- but kept the outer loop, which means we try to perform RAUW once per user. However, some of the users might be freed by the RAUW operation, resulting in use-after-free. The case where this happens is constant users where the replacement might result in the destruction of the original constant. Fixes https://github.com/llvm/llvm-project/issues/92991. (cherry picked from commit 9f85bc834b07ebfec9e5e02deb9255a0f6ec5cc7)
3 days	Bump version to 18.1.7 (#93723)	Tom Stellard

2024-05-18	[libcxx][libcxxabi] Fix build for OpenBSD (#92186)	John Ericson
	- No indirect syscalls on OpenBSD. Instead there is a `futex` function which issues a direct syscall.
	- Monotonic clock is available despite the full POSIX suite of timers not being available in its entirety. See https://lists.boost.org/boost-bugs/2015/07/41690.php and https://github.com/boostorg/log/commit/c98b1f459add14d5ce3e9e63e2469064601d7f71 for a description of an analogous problem and fix for Boost.
	(cherry picked from commit af7467ce9f447d6fe977b73db1f03a18d6bbd511)
2024-05-17	[clang] Don't assume location of compiler-rt for OpenBSD (#92183)	John Ericson
	If the `/usr/lib/...` path where compiler-rt is conventionally installed on OpenBSD does not exist, fall back to the regular logic to find it. This is a minimal change to allow OpenBSD cross compilation from a toolchain that doesn't adopt all of OpenBSD's monorepo's conventions. (cherry picked from commit be10746f3a4381456eb5082a968766201c17ab5d)
2024-05-17	[GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483)	DianQK
	Fixes #91312. Don't perform the transform if the alias may be replaced at link time. (cherry picked from commit c79690040acf5bb3d857558b0878db47f7f23dc3)
2024-05-17	[Arm64EC] Correctly handle sret in entry thunks. (#92326)	Eli Friedman
	I accidentally left out the code to transfer sret attributes to entry thunks, so values weren't being passed in the right registers, and the sret pointer wasn't returned in the correct register. Fixes #90229
2024-05-17	[Arm64EC] Improve alignment mangling in arm64ec thunks. (#90115)	Eli Friedman
	In some cases, MSVC's mangling for arm64ec thunks includes the alignment of a struct. I added some code to try to match... but it never really worked right. The issues:
	- Alignment is only mangled if it's 16 or more (I guess the default is supposed to be 8).
	- Alignment isn't mangled on return values (since the memory is allocated by the caller).
	The current patch leaves hooks to make alignment mangling work... but doesn't actually ever mangle alignment: clang never actually encodes a relevant alignment into the IR. Once we get clang to emit the real size/alignment of structs, we can start emitting it.
2024-05-17	[workflows] Fix libclang-abi-tests to work with new version scheme (#91865)	Tom Stellard
	(cherry picked from commit d06270ee00e37b247eb99268fb2f106dbeee08ff)
2024-05-17	[RISCV] Add an unaligned-scalar-mem feature like we had in clang 17.	Craig Topper
	This is ORed with the fast-unaligned-access feature which applies to scalar and vector together.
2024-05-16	Update llvm/test/Transforms/InstCombine/bit_ceil.ll	Tom Stellard
	Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
2024-05-16	[InstCombine] Drop nuw flag when CtlzOp is a sub nuw (#91776)	Yingwei Zheng
	See the following case:
	```
	define i32 @src1(i32 %x) {
	  %dec = sub nuw i32 -2, %x
	  %ctlz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false)
	  %sub = sub nsw i32 32, %ctlz
	  %shl = shl i32 1, %sub
	  %ugt = icmp ult i32 %x, -2
	  %sel = select i1 %ugt, i32 %shl, i32 1
	  ret i32 %sel
	}
	define i32 @tgt1(i32 %x) {
	  %dec = sub nuw i32 -2, %x
	  %ctlz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false)
	  %sub = sub nsw i32 32, %ctlz
	  %and = and i32 %sub, 31
	  %shl = shl nuw i32 1, %and
	  ret i32 %shl
	}
	```
	`nuw` in `%dec` should be dropped after the select instruction is eliminated. Alive2: https://alive2.llvm.org/ce/z/7S9529 Fixes https://github.com/llvm/llvm-project/issues/91691. (cherry picked from commit b5f4210e9f51f938ae517f219f04f9ab431a2684)
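The problematic input can be checked by hand. The following is a minimal sketch in plain C++, not LLVM code: the helper names `ctlz32`, `subWrapsUnsigned` and `tgt1` are made up for illustration, and `tgt1` models the IR of `@tgt1` with ordinary wrapping arithmetic. It shows that `x = -1` (0xFFFFFFFF) is an input for which `-2 - x` wraps as an unsigned subtraction, so a `nuw` flag left on `%dec` would make the result poison even though the wrapped computation yields 1.

```cpp
#include <cstdint>

// Count leading zeros of a 32-bit value. Returns 32 for zero, which matches
// llvm.ctlz called with its second argument (is_zero_poison) set to false.
inline unsigned ctlz32(std::uint32_t v) {
    unsigned n = 0;
    for (std::uint32_t bit = 0x80000000u; bit != 0 && (v & bit) == 0; bit >>= 1)
        ++n;
    return n;
}

// True when a - b wraps as an unsigned subtraction, i.e. the case in which
// `sub nuw a, b` would produce poison.
inline bool subWrapsUnsigned(std::uint32_t a, std::uint32_t b) {
    return b > a;
}

// Wrapping-arithmetic model of @tgt1 with the nuw flag on %dec dropped.
inline std::uint32_t tgt1(std::uint32_t x) {
    std::uint32_t dec = 0xFFFFFFFEu - x;    // sub i32 -2, %x (wraps for x == -1)
    std::uint32_t sub = 32u - ctlz32(dec);  // sub nsw i32 32, %ctlz
    std::uint32_t amt = sub & 31u;          // and i32 %sub, 31
    return std::uint32_t(1) << amt;         // shl nuw i32 1, %and
}
```

In `@src1` the same input takes the constant arm of the `select`, so the poison value never reaches the return; once the `select` is gone, nothing shields the result from it.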
2024-05-15	Revert "[SLP]Fix a crash if the argument of call was affected by minbitwidth ↵	Rose
	analysis." After reconsidering the words of @nikic, I have decided to revisit the patches I suggested be backported. Upon further analysis, I think there is a high likelihood that this change added to release 18.x was referencing a crash that was caused by a PR that isn't added. I will, however, keep the test that was added just in case. This reverts commit 6e071cf30599e821be56b75e6041cfedb7872216.
2024-05-15	[GlobalIsel][AArch64] fix out of range access in regbankselect (#92072)	Thorsten Schütt
	Fixes https://github.com/llvm/llvm-project/issues/92062 (cherry picked from commit d422e90fcbdddd68749918ddd86c94188807efce)
2024-05-15	[SystemZ] Handle address clobbering in splitMove(). (#92105)	Jonas Paulsson
	When expanding an L128 (which is used to reload i128) it is possible that the quadword destination register clobbers an address register. This patch adds an assertion against the case where both of the expanded parts clobber the address, and in the case where one of the expanded parts does so, puts it last. Fixes #91437 (cherry picked from commit d6ee7e8481fbaee30f37d82778ef12e135db5e67)
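The ordering rule described above can be sketched as follows. This is an illustrative model only, not the SystemZ backend code: `HalfLoad`, `orderHalves` and the single `addrReg` parameter are hypothetical simplifications, and the default low-then-high order is an assumption.

```cpp
#include <cassert>
#include <utility>

// Hypothetical description of one half of a split 128-bit load.
struct HalfLoad {
    unsigned destReg;  // register written by this half
};

// Returns the two halves in emission order. A half that overwrites the
// address register is emitted last, so the other half can still use the
// address. Both halves clobbering the address is treated as a logic error.
inline std::pair<HalfLoad, HalfLoad>
orderHalves(HalfLoad low, HalfLoad high, unsigned addrReg) {
    bool lowClobbers = low.destReg == addrReg;
    bool highClobbers = high.destReg == addrReg;
    assert(!(lowClobbers && highClobbers) && "both halves clobber the address");
    if (lowClobbers)
        return {high, low};  // low overwrites the address: emit it last
    return {low, high};      // default order; high may clobber, it is last
}
```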
2024-05-15	release/18.x: [libclc] Fix linking against libIRReader	Thomas Debesse
	Fixes https://github.com/llvm/llvm-project/issues/91551
2024-05-14	[InstSimplify] Do not simplify freeze in `simplifyWithOpReplaced` (#91215)	Yingwei Zheng
	See the LangRef: > All uses of a value returned by the same ‘freeze’ instruction are guaranteed to always observe the same value, while different ‘freeze’ instructions may yield different values. It is incorrect to replace freezes with the simplified value. Proof: https://alive2.llvm.org/ce/z/3Dn9Cd https://alive2.llvm.org/ce/z/Qyh5h6 Fixes https://github.com/llvm/llvm-project/issues/91178 (cherry picked from commit d085b42cbbefe79a41113abcd2b1e1f2a203acef) Revert "[InstSimplify] Do not simplify freeze in `simplifyWithOpReplaced` (#91215)" This reverts commit 1c2eb18d52976fef89972e89c52d2ec5ed7e4868. [InstSimplify] Do not simplify freeze in `simplifyWithOpReplaced` (#91215) See the LangRef: > All uses of a value returned by the same ‘freeze’ instruction are guaranteed to always observe the same value, while different ‘freeze’ instructions may yield different values. It is incorrect to replace freezes with the simplified value. Proof: https://alive2.llvm.org/ce/z/3Dn9Cd https://alive2.llvm.org/ce/z/Qyh5h6 Fixes https://github.com/llvm/llvm-project/issues/91178 (cherry picked from commit d085b42cbbefe79a41113abcd2b1e1f2a203acef)
2024-05-14	[X86][Driver] Do not add `-evex512` for `-march=native` when the target ↵	Phoebe Wang
	doesn't support AVX512 (#91694) (cherry picked from commit 87f3407856e61a73798af4e41b28bc33b5bf4ce6)
2024-05-14	[AArch64][SelectionDAG] Mask for SUBS with multiple users cannot be elided ↵	Weihang Fan
	(#90911) In DAGCombiner, the `performCONDCombine` function attempts to remove AND instructions in front of SUBS (cmp) instructions for which the AND is transparent. The rules for that are correct, but it fails to take into account the case where the SUBS instruction has multiple users with different condition codes for comparison and simply removes the AND for all of them. This causes a miscompilation in the attached test case. (cherry picked from commit 72eaa0ed9934bfaa2449091bbc6e45648d1396d6)
2024-05-14	[RISCV] Use 'riscv-isa' module flag to set ELF flags and attributes. (#85155)	Craig Topper
	Walk all the ISA strings and set the subtarget bits for any extension we find in any string. This allows LTO output to have ELF attributes from the union of all of the files used to compile it.
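A minimal sketch of the "union over all ISA strings" idea, under loud assumptions: it treats each ISA string as a plain `_`-separated list and does no real RISC-V ISA parsing, and the names `addExtensions` and `unionOfExtensions` are invented for illustration rather than taken from the LLVM sources.

```cpp
#include <set>
#include <string>
#include <vector>

// Split one ISA string on '_' and add every non-empty piece to the set.
// Simplification: the leading piece (e.g. "rv64i2p1") is kept as one token.
inline void addExtensions(const std::string &isa, std::set<std::string> &out) {
    std::string::size_type start = 0;
    while (start <= isa.size()) {
        std::string::size_type end = isa.find('_', start);
        if (end == std::string::npos)
            end = isa.size();
        if (end > start)
            out.insert(isa.substr(start, end - start));
        start = end + 1;
    }
}

// Collect every extension token that appears in any of the ISA strings.
inline std::set<std::string>
unionOfExtensions(const std::vector<std::string> &isaStrings) {
    std::set<std::string> all;
    for (const std::string &isa : isaStrings)
        addExtensions(isa, all);
    return all;
}
```

For two translation units compiled with the illustrative strings `rv64i2p1_m2p0` and `rv64i2p1_c2p0`, the merged set contains `rv64i2p1`, `m2p0` and `c2p0`.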
2024-05-14	[RISCV] Store RVC and TSO ELF flags explicitly in RISCVTargetStreamer. NFCI ↵	Craig Topper
	(#83344) Instead of caching STI in the RISCVELFTargetStreamer, store the two flags we need from it. My goal is to allow RISCVAsmPrinter to override these flags using IR module metadata for LTO. So they need to be separated from the STI used to construct the TargetStreamer. This patch should be NFC as long as no one is changing the contents of the STI that was used to construct the TargetStreamer between the constructor and the use of the flags.
2024-05-14	[RISCV] Add canonical ISA string as Module metadata in IR. (#80760)	Craig Topper
	In an LTO build, we don't set the ELF attributes to indicate what extensions were compiled with. The target CPU/Attrs in RISCVTargetMachine do not get set for an LTO build. Each function gets a target-cpu/feature attribute, but this isn't usable to set ELF attributes since we wouldn't know what function to use. We can't just pick one since it might have been compiled with an attribute like target_version. This patch adds the ISA as Module metadata so we can retrieve it in the backend. Individual translation units can still be compiled with different strings so we need to collect the unique set when Modules are merged. The backend will need to combine the unique ISA strings to produce a single value for the ELF attributes. This will be done in a separate patch.
2024-05-13	[RISCV][lld] Set the type of TLSDESC relocation's referenced local symbol to ↵	Paul Kirth
	STT_NOTYPE When adding fixups for RISCV_TLSDESC_ADD_LO and RISCV_TLSDESC_LOAD_LO, the local labels added for RISCV TLSDESC relocations have STT_TLS set, which is incorrect. Instead, these labels should have `STT_NOTYPE`. This patch stops adding such fixups and avoids setting STT_TLS on these symbols. Failing to do so can cause LLD to emit an error `has an STT_TLS symbol but doesn't have an SHF_TLS section`. We additionally adjust how LLD services these relocations to avoid errors with incompatible relocation and symbol types. Reviewers: topperc, MaskRay Reviewed By: MaskRay Pull Request: https://github.com/llvm/llvm-project/pull/85817 (cherry picked from commit dfe4ca9b7f4a422500d78280dc5eefd1979939e6)
2024-05-13	[PPCMergeStringPool] Avoid replacing constant with instruction (#88846)	Nikita Popov
	String pool merging currently, for a reason that's not entirely clear to me, tries to create GEP instructions instead of GEP constant expressions when replacing constant references. It only uses constant expressions in cases where this is required. However, it does not catch all cases where such a requirement exists. For example, the landingpad catch clause has to be a constant. Fix this by always using the constant expression variant, which also makes the implementation simpler. Additionally, there are some edge cases where even replacement with a constant GEP is not legal. The one I am aware of is the llvm.eh.typeid.for intrinsic, so add a special case to forbid replacements for it. Fixes https://github.com/llvm/llvm-project/issues/88844. (cherry picked from commit 3a3aeb8eba40e981d3a9ff92175f949c2f3d4434)
2024-05-13	[clang-format] Fix a crash with AlignArrayOfStructures option (#86420)	Owen Pan
	Fixes #86109. (cherry picked from commit cceedc939a43c7c732a5888364251775bffc2dba)
2024-05-13	[Clang][Sema] Revise the transformation of CTAD parameters of nested class ↵	Younan Zhang
	templates (#91628) This fixes a regression introduced by bee78b88f. When we form a deduction guide for a constructor, basically, we do the following work:
	- Collect template parameters from the constructor's surrounding class template, if present.
	- Collect template parameters from the constructor.
	- Splice these template parameters together into a new template parameter list.
	- Turn all the references (e.g. the function parameter list) to the invented parameter list by applying a `TreeTransform` to the function type.
	In the previous fix, we handled cases of nested class templates by substituting the "outer" template parameters (i.e. those not declared at the surrounding class template or the constructor) with the instantiating template arguments. The approach per se makes sense, but there was a flaw in the following case:
	```cpp
	template <typename U, typename... Us> struct X {
	  template <typename V> struct Y {
	    template <typename T> Y(T) {}
	  };
	  template <typename T> Y(T) -> Y<T>;
	};
	X<int>::Y y(42);
	```
	While we're transforming the parameters for `Y(T)`, we first attempt to transform all references to `V` and `T`; then, we handle the references to outer parameters `U` and `Us` using the template arguments from `X<int>` by transforming the same `ParamDecl`. However, the first step results in the reference `T` being `<template-param-0-1>` because the invented `T` is the last of the parameter list of the deduction guide, and what we're substituting with is a corresponding parameter pack (which is `Us`, though empty). Hence we're messing up the substitution. I think we can resolve it by reversing the substitution order, which means handling outer template parameters first and then the inner parameters. There's no release note because this is a regression in 18, and I hope we can catch up with the last release. Fixes https://github.com/llvm/llvm-project/issues/88142 (cherry picked from commit 8c852ab57932a5cd954cb0d050c3d2ab486428df)
2024-05-13	[lld][WebAssembly] Fix test on Windows, use llvm-ar instead of ar	Reid Kleckner
	(cherry picked from commit 4b4763ffebaed9f1fee94b8ad5a1a450a9726683)
2024-05-13	Reland "[clang-repl] Keep the first llvm::Module empty to avoid invalid ↵	Vassil Vassilev
	memory access. (#89031)" Original commit message: " Clang's CodeGen is designed to work with a single llvm::Module. In many cases for convenience various CodeGen parts have a reference to the llvm::Module (TheModule or Module) which does not change when a new module is pushed. However, the execution engine wants to take ownership of the module which does not map well to CodeGen's design. To work around this we clone the module and pass it down. With some effort it is possible to teach CodeGen to ask the CodeGenModule for its current module and that would have an overall positive impact on CodeGen improving the encapsulation of various parts but that's not resilient to future regression. This patch takes a more conservative approach and keeps the first llvm::Module empty intentionally and does not pass it to the Jit. That's also not bulletproof because we have to guarantee that CodeGen does not write on the blueprint. However, we have inserted some assertions to catch accidental additions to that canary module. This change fixes a long-standing invalid memory access reported by valgrind when we enable the TBAA optimization passes. It also unblocks progress on https://github.com/llvm/llvm-project/pull/84758. " This patch reverts adc4f6233df734fbe3793118ecc89d3584e0c90f and removes the check of `named_metadata_empty` of the first llvm::Module because on darwin clang inserts some harmless metadata which we can ignore. (cherry picked from commit a3f07d36cbc9e3a0d004609d140474c1d8a25bb6)
2024-05-13	[workflows] Add a job for requesting a release note on release branch PRs ↵	Tom Stellard
	(#91826) We have been collecting release notes from the PRs for most of the 18.1.x releases and this just helps automate the process. (cherry picked from commit c99d1156c28dfed67a8479dd97608d1f0d6cd593)
2024-05-10	[OpenMP] Fix child processes to use affinity_none (#91391)	Jonathan Peyton
	When a child process is forked with OpenMP already initialized, the child process resets its affinity mask and sets proc-bind-var to false so that the entire original affinity mask is used. This patch corrects an issue with the affinity initialization code setting affinity to compact instead of none for this special case of forked children. The test trying to catch this was only testing explicit setting of KMP_AFFINITY=none. Add test run for no KMP_AFFINITY setting. Fixes: #91098 (cherry picked from commit 73bb8d9d92f689863c94d48517e89d35dae0ebcf)
2024-05-10	[llvm][lld] Pre-commit tests for RISCV TLSDESC symbols	Paul Kirth
	Currently, we mistakenly mark the local labels used in RISC-V TLSDESC as TLS symbols, when they should not be. This patch adds tests with the current incorrect behavior, and subsequent patches will address the issue. Reviewers: MaskRay, topperc Reviewed By: MaskRay Pull Request: https://github.com/llvm/llvm-project/pull/85816 (cherry picked from commit f6f474c4ef9694a4ca8f08d59fd112c250fb9c73)
2024-05-10	[AArch64][GlobalISel] Fix legalizer assert for G_INSERT_VECTOR_ELT	Amara Emerson
	We should moreElements <3 x s1> to <4 x s1> before we try to widen the element, otherwise we end up with a <3 x s21> nonsense type. (cherry picked from commit a01e9ce86f4c1bc9af819902db9f287b6d23f54f) Test has been changed from original commit due to a fallback in a G_BITCAST. Added abort=2 so we can see partial legalization and check no crash.
2024-05-09	[InterleavedLoadCombine] Bail out on non-byte-sized vector element type (#90705)	Nikita Popov
	Vectors are always tightly packed, and elements of non-byte-sized type usually do not have a well-defined (byte) offset. Fixes https://github.com/llvm/llvm-project/issues/90695. (cherry picked from commit d484c4d3501a7ff3d00a6e0cfad026a3b01d320c)
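The bail-out condition can be sketched in a few lines. This is a hedged illustration, not the pass's actual code; `elementByteOffset` is a hypothetical helper that only computes an offset when the element size is a whole number of bytes.

```cpp
#include <cstdint>

// Returns true and sets byteOffset when every element of a tightly packed
// vector starts on a byte boundary, i.e. when the element size is a whole
// number of bytes. Returns false (bail out) otherwise, e.g. for i1 or i4.
inline bool elementByteOffset(unsigned elementBits, unsigned index,
                              std::uint64_t &byteOffset) {
    if (elementBits == 0 || elementBits % 8 != 0)
        return false;
    byteOffset = static_cast<std::uint64_t>(index) * (elementBits / 8);
    return true;
}
```

With 4-bit elements, for instance, element 1 starts at bit 4, in the middle of byte 0, so no byte offset describes it.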
2024-05-09	[AArch64][GISEL] Consider fcmp true and fcmp false in cond code selection ↵	Marc Auberer
	(#86972) (#91580) Fixes #86917 `FCMP_TRUE` and `FCMP_FALSE` were previously not considered and we ended up in an llvm_unreachable assertion.
2024-05-09	[FunctionAttrs] Fix incorrect nonnull inference for non-inbounds GEP (#91180)	Nikita Popov
	For inbounds GEPs, if the source pointer is non-null, the result must also be non-null. However, this does not hold for non-inbounds GEPs. Fixes https://github.com/llvm/llvm-project/issues/91177. (cherry picked from commit f34d30cdae0f59698f660d5cc8fb993fb3441064)
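A small worked example of why the inference fails without `inbounds`: a GEP that is not `inbounds` is plain wrapping address arithmetic, so a non-null base plus a suitable offset can land exactly on address zero. The helper name `gepAddress` is made up for this sketch, which models addresses as integers.

```cpp
#include <cstdint>

// Address computed by a GEP without `inbounds`: wrapping two's-complement
// arithmetic on the pointer value, with no requirement to stay in an object.
inline std::uintptr_t gepAddress(std::uintptr_t base, std::intptr_t byteOffset) {
    return base + static_cast<std::uintptr_t>(byteOffset);
}
```

Here `gepAddress(8, -8)` is 0 although the base address 8 is non-null, so `nonnull` on the source pointer says nothing about the result.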
2024-05-09	[clang-format] Don't remove parentheses of fold expressions (#91045)	Owen Pan
	Fixes #90966. (cherry picked from commit db0ed5533368414b1c4e1c884eef651c66359da2)
2024-05-08	[AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (#89622)	Jay Foad
	As well as flipping the sense of the bit, GFX12 moved it from bit 0 to bit 1 in the encoded simm16 operand. (cherry picked from commit e0a763c490d8ef58dca867e0ef834978ccf8e17d)
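The two changes (flipped sense, moved bit) can be sketched as below. The polarity is an assumption made for illustration, not taken from the commit text: before GFX12, bit 0 set is taken to mean "do not wait", and on GFX12, bit 1 set is taken to mean "wait". `Gen` and `encodeWaitEvent` are hypothetical names.

```cpp
#include <cstdint>

enum class Gen { PreGFX12, GFX12 };

// Builds the simm16 operand of s_wait_event for the export_ready case.
// Assumed polarity: pre-GFX12 uses bit 0 as a "don't wait" flag, while
// GFX12 uses bit 1 as a "wait" flag.
inline std::uint16_t encodeWaitEvent(Gen gen, bool waitExportReady) {
    if (gen == Gen::GFX12)
        return static_cast<std::uint16_t>(waitExportReady ? (1u << 1) : 0u);
    return static_cast<std::uint16_t>(waitExportReady ? 0u : (1u << 0));
}
```

Under these assumptions, waiting for export_ready encodes as 0 before GFX12 and as 2 on GFX12, which is why reusing the old constant on GFX12 would not request the wait.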
2024-05-08	[SelectionDAG] Mark frame index as "aliased" at argument copy elision (#89712)	Björn Pettersson
	This is a fix for miscompiles reported in https://github.com/llvm/llvm-project/issues/89060. After argument copy elision the IR value for the eliminated alloca is aliasing with the fixed stack object. This patch makes sure that we mark the fixed stack object as being aliased with IR values so that, for example, schedulers do not reorder accesses to the fixed stack object. This could otherwise happen when there is a mix of MemOperands referring to the shared fixed stack slot via both the IR value for the elided alloca, and via a fixed stack pseudo source value (as would be the case when lowering the arguments). (cherry picked from commit d8b253be56b3e9073b3e59123cf2da0bcde20c63)
2024-05-08	[X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125)	Phoebe Wang
	AVX doesn't provide 16-bit BROADCAST instruction. Fixes #91005
2024-05-08	[X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit patterns (#91106)	Phoebe Wang
	With KNL/KNC being deprecated, we don't need to care about such no VLX cases anymore. We may remove such patterns in the future. Fixes #90844 (cherry picked from commit 7963d9a2b3c20561278a85b19e156e013231342c)
2024-05-08	[AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (#90595)	David Stuttard
	Code to determine if a waitcnt is required before a barrier instruction only considered S_BARRIER. gfx12 adds barrier_signal/wait so need to enhance the existing code to look for a barrier start (which is just an S_BARRIER for earlier architectures).
2024-05-08	[Workflows] Re-write release-binaries workflow (#89521)	Tom Stellard
	This updates the release-binaries workflow so that the different build stages are split across multiple jobs. This saves money by reducing the time spent on the larger github runners and also makes it easier to debug, because now it's possible to build a smaller release package (with clang and lld) using only the free GitHub runners. The workflow no longer uses the test-release.sh script but instead uses the Release.cmake cache. This gives the workflow more flexibility and ensures that the binary package will always be created even if the tests fail. This idea to split the stages comes from the "LLVM Precommit CI through Github Actions" RFC: https://discourse.llvm.org/t/rfc-llvm-precommit-ci-through-github-actions/76456 (cherry picked from commit abac98479b81cc0cc717bb6cdbae6f774e3b0232)
2024-05-08	workflows: Fix incorrect input name in release-binaries.yml (#84604)	Tom Stellard
	In aa02002491333c42060373bc84f1ff5d2c76b4ce the input name was changed from tag to release-version, but the code was never updated. (cherry picked from commit 8d220d109d28dac352c563ab062fb72132b7eca1)
2024-05-08	workflows: Fixes for building the release binaries (#83694)	Tom Stellard
	Since aa02002491333c42060373bc84f1ff5d2c76b4ce we weren't installing the correct dependencies, and since 2836d8edbfbcd461b25101ed58f93c862d65903a we must pass a custom token to github-upload-release.py for verifying permissions. (cherry picked from commit 51207756b0692f325cf75560185cf0336239b3e0)
2024-05-08	[Github] Add repository checks to release-binaries workflow (#84437)	Aiden Grossman
	This patch adds repository checks to the release-binaries workflow jobs. People were observing that the job was running on a schedule in their forks. This only happens on old forks, but those probably exist in great number given how prolific LLVM is. This is also good practice anyways, on top of solving the direct problem of these jobs running with the cron schedule on people's forks. (cherry picked from commit 9f5be5f0092a636274953389cd5771c45ac0a568)
2024-05-08	[CMake][Release] Enable CMAKE_POSITION_INDEPENDENT_CODE (#90139)	Tom Stellard
	Set this in the cache file directly instead of via the test-release.sh script so that the release builds can be reproduced with just the cache file. (cherry picked from commit 53ff002c6f7ec64a75ab0990b1314cc6b4bb67cf)
2024-05-08	[CMake][Release] Refactor cache file and use two stages for non-PGO builds ↵	Tom Stellard
	(#89812) Completely refactor the cache file to simplify it and remove unnecessary variables. The main functional change here is that the non-PGO builds now use two stages, so `ninja -C build stage2-package` can be used with both PGO and non-PGO builds. (cherry picked from commit 6473fbf2d68c8486d168f29afc35d3e8a6fabe69)
2024-05-08	[CMake][Release] Add stage2-package target (#89517)	Tom Stellard
	This target will be used to generate the release binary package for uploading to GitHub. (cherry picked from commit a38f201f1ec70c2b1f3cf46e7f291c53bb16753e)
2024-05-08	Bump version to 18.1.6 (#91094)	Tom Stellard