Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose StressLog via CDAC and port StressLogAnalyzer to managed code #104999

Open
wants to merge 54 commits into
base: main
Choose a base branch
from

Conversation

jkoritzinsky
Copy link
Member

Current state in main

StressLog is an in-memory logging apparatus in CoreCLR and NativeAOT. This log is an in-memory circular log that stores messages as "offsets to format string in memory and array of arguments as size_t values".

This format has been generally stable but has evolved slowly over time. Unlike most types in CoreCLR, the StressLog was never DACized. Furthermore, it was never abstracted on the SOS-DAC boundary. As a result, SOS has intimate knowledge of the memory layout and format of the StressLog structures. The StressLog types are one of the reasons the SOSBreakingChangeVersion exists (and are responsible for 2 of the 4 version increases done over the years).

SOS supports the stresslog types by having copies of the stresslog header in its codebase and implementing some of the methods only declared when STRESS_LOG_READONLY is defined. The !dumplog and !histinit and other GC history commands utilize the stress log.

Additionally, dotnet/runtime has a tool called StressLogAnalyzer that supports reading memory-mapped stress logs (which don't require dumps). This code-base has a modified copy of the stress-log dumping/reading code from dotnet/diagnostics (including some dead code and some shims to make things compile) to support dumping and analyzing GC behavior based on stress logs.

After this PR

With this PR, a StressLog contract is defined in the cDAC. This contract provides 3 versions to enable SOS to use cDAC for all stress log dumping, even for older target runtimes. These versions differ by message format and the lookup process for the format string from the provided offset. Each of these messages corresponds to a different SOS Breaking Change Version. For versions that are supported by memory-mapped stress logs, each version corresponds to a version of the memory-mapped stress log. As there are few versions and they only differ by pointer-size, the contract descriptors for these cases can be hard-coded in SOS if a descriptor is not available.

CoreCLR (and NativeAOT) will implement version 2 of the contract, which is the newest version in this PR.

The StressLog contract exposes the information needed to dump the header of the stress log in SOS, get the logs for all threads, and get all messages from a given thread's log.

The contract does not include an implementation of formatting a stress log message into a string, but it does include a conceptual description of how to do so.

StressLogAnalyzer has been re-written into C# and uses the cDAC reader to read the StressLog. The following behavior differences are noted:

  • Specifying -tid without a thread while also specifying a thread will not change thread ids to hex. When filtering threads, specify --hex to get hex thread ids.
  • Specifying -f without a format string while also specifying a format string will not output format strings. When specifying format strings, pass -pf to print format strings as well. 

Follow-ups

This PR does not present a mechanism to push a managed cDAC package to the internal feed for SOS to consume. This would be necessary to move SOS to use cDAC to implement StressLog dumping in managed code, or any of the other commands currently implemented in the native portion of SOS.

Once a package exists that SOS can consume, the code to read the StressLog in C++ can be removed from CoreCLR as well. I don't want to remove it before we can remove the copy in SOS.

After this PR is merged, we can update the memory-mapped stresslog format to store a pointer to the contract descriptor in the runtime image. That will ensure that any future changes to the stresslog format won't require versioning the memory-mapped stresslog format version, only updating the contract.

Start on the data parsing and fill out the contract more

Implement stress message payload parsing and thread stress log iteration

Implement everything except for printing the string output

Implement output string formatting for stress log messages

Move all StressMsg logic into the contract and out of the data type.

Split the stress log contract into different versions and refactor out the StressLogHeader to be entirely a StressLogAnalyzer concern (not used in the cDAC implementation at all)

Expose the module table directly to avoid going through the StressLog object to read the module table (which isn't present in the memory-mapped log)

Give a nicer error message for an easy-to-fall-in special case in the data descriptor scraping.

Fix a few datadescriptor problems
Move stress message formatting out of the contract.
…more cases (including the alignment scenarios).
Copy link
Contributor

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

…r debug return logging (make it a const instead of a function)
@jkoritzinsky
Copy link
Member Author

@Maoni0 @cshung I've opened #108089 for the fixes for the TRACE_GC build as they're in main and not due to this PR.

jkoritzinsky and others added 3 commits December 14, 2024 12:37
… the same logic will be used from any StressLog reader.

The implementation of the formatter will not be part of the contract itself, but the description of the format string's structure is part of the contract (this is already the case in this branch).
@jkoritzinsky
Copy link
Member Author

@elinor-fung this is ready for another round of review.

}
}

#pragma warning disable CS9107 // Parameter 'Target target' is captured into the state of the enclosing type and its value is also passed to the base constructor. The value might be captured by the base class as well.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a particular reason to not just fix this? If we have an explicit field/property on the base class and have all the subclasses go through that, it should be the current logic without the extra field stored on the subclasses, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have an explicit property on the base class. I'll see if I can clean this up to not need the pragma and still be in a style I want.

string FormatVTable(TargetPointer pointer);
string FormatStackTrace(TargetPointer pointer);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change


namespace Microsoft.Diagnostics.DataContractReader;

public sealed class PrintfStressMessageFormatter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this really belongs in the Contracts assembly. It looks like a helper for use on top of the contracts, which seems more like part of a contract consumer - so the StressLogAnalyzer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My plan was to reuse this in other places that need to also implement stress message formatting, like an eventual implementation of SOS's DumpLog. It would need the exact same implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants