tl;dr: Not supporting sframes in LLD would be very unfortunate and bad for the llvm project.
There are three issues; I’ll address each in turn:
How technically deficient is the format?
I don’t think anyone would argue that there are not improvements to be made. However, calling the monolithic section an “elf violation” here overstates the problems a great deal. Yes, it is a single-section that cannot be appended one after the other, and yes, it doesn’t follow the semantics of elf groups. In particular, a non-group section has relocations that point into discardable sections.
However, there is strong precedent within elf for this, especially within unwind info: Both .eh_frame and .debug_frame work this way. Here are the section headers and relevant relocations for a small C++ that contains a group for a discardable function. It includes both sframes and eh_frames, and the pattern is identical:
$ cat u.cpp
template<class T> class bar { public: void foo(const T t) { } };
void func() { bar<int> b; b.foo(6); }
augustine:~/tmp $ g++ -Wa,-gsframe u.cpp -c && objdump -h u.o -r
u.o: file format elf64-x86-64
Sections:
Idx Name Size VMA LMA File off Algn
0 .group 00000008 0000000000000000 0000000000000000 00000040 2**2
...
4 .text._ZN3barIiE3fooEi 0000000e 0000000000000000 0000000000000000 00000064 2**1
CONTENTS, ALLOC, LOAD, READONLY, CODE
...
CONTENTS, READONLY
7 .eh_frame 00000058 0000000000000000 0000000000000000 000000a0 2**3
CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
8 .sframe 00000062 0000000000000000 0000000000000000 000000f8 2**3
...
RELOCATION RECORDS FOR [.eh_frame]:
OFFSET TYPE VALUE
0000000000000020 R_X86_64_PC32 .text
0000000000000040 R_X86_64_PC32 .text._ZN3barIiE3fooEi
RELOCATION RECORDS FOR [.sframe]:
OFFSET TYPE VALUE
000000000000001c R_X86_64_PC32 .text
0000000000000030 R_X86_64_PC32 .text._ZN3barIiE3fooEi
Clang generates a similar set of relocations for .eh_frame:
RELOCATION RECORDS FOR [.eh_frame]:
OFFSET TYPE VALUE
0000000000000020 R_X86_64_PC32 .text
0000000000000040 R_X86_64_PC32 .text._ZN3barIiE3fooEi
0000000000000060 R_X86_64_PC32 .text+0x0000000000000020
All four open-source linkers, gnu-ld, gnu-gold, lld, and mold handle .eh_frame without even a comment about how this would be better handled in groups. And all four properly discard the relevant portion of .eh_frame when the group sections are discarded. It’s as if the reference to the discarded sections never existed, and the reference to the kept section is sufficient for unwinding.
The logic to do this is not difficult for formats designed for it. SFrame is designed for it, just like eh_frame.
Is the format sufficiently useful to include in LLVM?
There are two ways to think about this question. The first is a simple existence proof:
Since my original RFC, the gnu toolchain has added support for two additional targets, and others are under consideration. It has reasonably good traction there. Kernel developers are also adopting it via the gnu toolchain. A cursory glance at related mailing lists shows perhaps a dozen companies involved in its development in one way or another.
The contributors have varying levels of technical sophistication, but I’m sure most are aware of the various other unwind solutions in the pipeline. And they still have concluded this is a useful project to pursue. Regardless of what any one person thinks of it, enough companies and people have done enough work to prove that a broad group expects it will be a useful feature.
The second way of thinking about its usefulness is a more technical analysis. This is what some internal evaluations of the different options have concluded:
[@davidxl] Shadow stack support is limited to only a handful of targets. It is not ready to be turned on by default in production. Our performance experiments indicates non trivial performance regressions with it on.”
“[@rosedt] We have evaluated alternatives (see shadow stack reply). For kernel, there’s two use cases. One is the use case for profiling user space applications and the other is the use of using it as the unwinder in the kernel for live patching.
The hardware assisted use case can help with the former, but it is usually limited to a fix number of elements. If you have a stack deeper than that limit, sframes can still prove useful.
For the live kernel patching, hardware assistance is not helpful, as we need to unwind stacks of tasks that are currently off the run queue, and may not have executed for some time.
Regarding the syscall handling and BPP, the Deferred stacktrace infrastructure has been accepted upstream. This is needed for SFrames, as the SFrame tables exist in user space. Profiling happens via an interrupt or NMI. In that context, user space tables are unsafe to read. The deferred infrastructure is a way to delay the user space stack reading until the task goes back to user space. In that context, the user space tables are safe to read.
It may be that other unwind mechanisms turn out to cover some of these use-cases, but that is far from obvious, and sframes have traction today. There is no one-size-fits-all solution, but SFrame is useful to a wide variety of use cases. Further, it is much easier for users to fully evaluate sframes with them in upstream LLVM.
In the near future, a toolchain without sframe support will not be viable for advanced linux kernel development. That alone is a strong case for including it, regardless of any other issues it may or may not have.
Should I have handled upstreaming this differently?
Although there were only a few comments on my original RFC, they were all positive. I left it open for several weeks. No one asked for more analysis or wondered about its future. I would have been happy to do that. Perhaps this is it. In fact, I have gotten more “glad to see this going in” comments for this work than anything else I have contributed to LLVM.
As far as how that patch was submitted: I generally accept any reviewer github suggests, without regard to whether they are my colleagues or not. Most of these PRs have had non-colleague approvals.
I apologize for not including you in the last one. I do tend to add people involved in the project as they have a motivation to review and tend to be more responsive. But I do welcome all comments–whether or not I take them.
The PR was open for a good two weeks before you commented. And I did accept all but one of your suggestions. Quite frankly, the MC changes were routine relaxation issues, following well-known patterns. I did not expect it to be controversial, but I will be more careful in the future.
Let us know if you have further concerns.