Incorrect Encoding in Console Output with WinExe OutputType #111057

Open
lindexi opened this issue Jan 3, 2025 · 3 comments
Labels
area-System.Console untriaged New issue has not been triaged by the area owner

Comments

@lindexi
Member

lindexi commented Jan 3, 2025

Description

When setting the OutputType to WinExe, the console output is encoded incorrectly, resulting in garbled text.

Reproduction Steps

  1. Create a console project and set the OutputType to WinExe.
  2. Perform console output.

You will notice that the encoding of StandardOutput in the console is set to CodePage=0, which causes other software attempting to capture the application's output to display garbled text for some Unicode content.

I have written a simple demo program to illustrate this issue. Here is my code:

using System.Diagnostics;
using System.Text;

var codePage = Console.OutputEncoding.CodePage; // The code page will be 0 when the OutputType is WinExe.

if (args.Length > 0)
{
    Console.WriteLine($"CodePage={codePage} Text=\u6797");
}
else
{
    var self = Path.Join(AppContext.BaseDirectory, "YemwearqufeballnoBayboqemli.exe");
    var processStartInfo = new ProcessStartInfo(self, "foo");
    processStartInfo.RedirectStandardOutput = true;
    processStartInfo.StandardOutputEncoding = Encoding.UTF8;
    var process = Process.Start(processStartInfo)!;
    var text = process.StandardOutput.ReadToEnd();
    // You can find the output text is "CodePage=0 Text=��"
    _ = text;
}

You can access the entire project code from https://github.com/lindexi/lindexi_gd/tree/0dc56dbf6f635a7cc9cbda295b1cbe40c2eab8d9/Workbench/YemwearqufeballnoBayboqemli

Expected behavior

When setting OutputType to WinExe, it should still be possible to obtain the correct output encoding.

Actual behavior

Currently, it results in garbled text. This directly affects debugging WinExe applications in Rider, and it is not possible to set Console.OutputEncoding to UTF-8.

Reference: dotnet-campus/dotnetCampus.Logger#32
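
A possible way around the last point, not confirmed anywhere in this thread, is to leave Console.OutputEncoding alone and replace Console.Out with a writer that encodes to UTF-8 directly on the standard output stream. The sketch below assumes the capturing process decodes the pipe as UTF-8 (as the demo above does); CreateUtf8Writer is a hypothetical helper name.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: wraps any stream in a writer that emits UTF-8 without a BOM.
// AutoFlush keeps redirected output from sitting in the writer's buffer.
static TextWriter CreateUtf8Writer(Stream stream) =>
    new StreamWriter(stream, new UTF8Encoding(encoderShouldEmitUTF8Identifier: false))
    {
        AutoFlush = true
    };

// Assumption: stdout is redirected to a pipe read by another process.
// Console.OpenStandardOutput returns the raw stream, so no console code page is involved.
Console.SetOut(CreateUtf8Writer(Console.OpenStandardOutput()));
Console.WriteLine("Text=\u6797");
```

This only changes what the process writes; whether a given tool (Rider, for example) reads the pipe as UTF-8 is a separate question.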

Regression?

No response

Known Workarounds

No response

Configuration

No response

Other information

No response

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jan 3, 2025
Contributor

Tagging subscribers to this area: @dotnet/area-system-console
See info in area-owners.md if you want to be subscribed.

@hez2010
Contributor

hez2010 commented Jan 3, 2025

Why are you asking for console when you explicitly disabled console by using WinExe?

@tannergooding
Member

tannergooding commented Jan 3, 2025

> Why are you asking for console when you explicitly disabled console by using WinExe?

Not having a visible output window is not the same as not having output altogether.

System.Console ultimately defaults to a thin wrapper over the standard C input/output streams. On Windows, it additionally uses the Win32 Console APIs to try and query various information and ensure it behaves "better" in the default environment.

There are then many ways for an exe to not have a console window, such as by using the CreateProcess parameters that disable it. There are equally many ways for a winexe to have a console, such as by using AllocConsole.
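
As a rough illustration of the AllocConsole case, a WinExe could request a console itself through P/Invoke. This is a minimal sketch, not something taken from the issue: TryAllocConsole is a hypothetical wrapper name, and it assumes the call happens before the process first touches System.Console.

```csharp
using System;
using System.Runtime.InteropServices;

// Win32 AllocConsole from kernel32: allocates a new console for the calling process.
// It returns false if the call fails, for example when a console is already attached.
[DllImport("kernel32.dll")]
static extern bool AllocConsole();

// Hypothetical wrapper: never calls the Win32 API on non-Windows platforms.
static bool TryAllocConsole() => OperatingSystem.IsWindows() && AllocConsole();

if (TryAllocConsole())
{
    // Assumption: Console has not been used yet, so it binds to the new console.
    Console.WriteLine("Console allocated for this WinExe process.");
}
```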

The general issue here looks to be that Console.OutputEncoding on Windows calls GetConsoleOutputCP and then doesn't handle the failure result, which is 0. It just passes that 0 down, which defaults the encoding to ANSI.
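
To make that failure mode concrete, here is a small sketch of treating a 0 result from GetConsoleOutputCP as "no console" instead of as a code page. It is not the runtime's actual code; ChooseCodePage is a hypothetical helper, and falling back to UTF-8 is an assumption for illustration only.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Win32 GetConsoleOutputCP from kernel32: returns the console's output code page.
// Per the discussion above, it yields 0 when the process has no console.
[DllImport("kernel32.dll")]
static extern uint GetConsoleOutputCP();

// Hypothetical policy: 0 means the query failed, so use the caller's fallback
// rather than passing 0 along as if it were a real code page.
static int ChooseCodePage(uint reported, int fallback) =>
    reported == 0 ? fallback : (int)reported;

// Only query the Win32 API on Windows; elsewhere behave as if no console exists.
uint reported = OperatingSystem.IsWindows() ? GetConsoleOutputCP() : 0;
int chosen = ChooseCodePage(reported, Encoding.UTF8.CodePage);
Console.Error.WriteLine($"reported={reported} chosen={chosen}");
```

Which fallback is right (UTF-8, the system ANSI code page, or something else) is exactly the kind of compatibility question raised in the rest of this comment.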


The general console environment on Windows has changed a lot over recent years, while System.Console in .NET hasn't really had any changes to account for this, for the existence of pseudo-consoles, virtual terminal sequences, system code page differences, etc. Many (but not all) of the Win32 Console* APIs are correspondingly no longer recommended for use and have better alternatives. Likewise, the mix of using some Console* APIs but abstracting the standard C input/output streams in others leads to various disconnects like the above.

I expect it's something that could be fixed, but it is not a trivial task and has a high chance of impacting existing Windows console applications. Some of these nuances also show up on Linux, since the Linux environment for console/terminal handling is a bit different and doesn't "cleanly" map onto what .NET had exposed (which was largely oriented around the Windows APIs from 25 years ago).
