Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: Expose span on SpanSplitEnumerator #109874

Open
JeremyKuhne opened this issue Nov 15, 2024 · 5 comments
Open

[API Proposal]: Expose span on SpanSplitEnumerator #109874

JeremyKuhne opened this issue Nov 15, 2024 · 5 comments
Assignees
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Memory needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration

Comments

@JeremyKuhne
Copy link
Member

JeremyKuhne commented Nov 15, 2024

Background and motivation

Exposing it would allow writing extension methods on it, so one could do something like Split('\0').ToArray().

// Not possible to write currently
public static T[] ToArray<T>(this MemoryExtensions.SpanSplitEnumerator<T> enumerator) where T : IEquatable<T>

#934 #104534

API Proposal

public partial struct SpanSplitEnumerator<T>
{
   public ReadOnlySpan<T> Span;
}

The current API, for context:

public ref struct SpanSplitEnumerator<T>
{
    public SpanSplitEnumerator<T> GetEnumerator();
    public bool MoveNext();
    public Range Current { get; }
}

API Usage

// This is a common pattern in Win32
public static string[] SplitStringList(this ReadOnlySpan<char> strings)
    => strings.Split('\0').ToArray();

// Usable by above and any other code that needs to convert to an array
public static T[] ToArray<T>(this MemoryExtensions.SpanSplitEnumerator<T> enumerator) where T : IEquatable<T>
// ... implementation that grabs the ranges and creates the appropriately sized array if needed

Alternative Designs

Could possibly expose range related methods so one could do something like enumerator[enumerator.Current]?

Risks

None known.

@JeremyKuhne JeremyKuhne added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Nov 15, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Nov 15, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Nov 15, 2024
@MihaZupan MihaZupan added area-System.Memory and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Nov 16, 2024
@bartonjs
Copy link
Member

Assuming that Span is the full captured span that's being split, I worry the name is too broad (someone could easily assume that it's the Span equivalent to the Range from Current).

While FullSpan was the suggestion off the top of my head, Source might actually be the most apt.

@jeffhandley
Copy link
Member

Assigned to @adamsitnik for further triage.

@adamsitnik
Copy link
Member

@JeremyKuhne I am looking at the provided example: Split('\0').ToArray() and I am not sure if I understand what given span would be exposing, the current slice? inputSpan[enumerator.Current]?

Which would be the equivalent of:

static char[] Equivalent(ReadOnlySpan<char> chars)
{
    SpanSplitEnumerator<char> enumerator = chars.Split('\0');
    // missing enumerator.MoveNext() so the returned array would always be empty
    return chars[enumerator.Current].ToArray();
}

Or you would like the Split('\0').ToArray() to return all the chars except of provided null character?

@adamsitnik adamsitnik added needs-author-action An issue or pull request that requires more info or actions from the author. and removed untriaged New issue has not been triaged by the area owner labels Jan 23, 2025
@bartonjs
Copy link
Member

My understanding is he wants the property to be the span that was passed to Split in the first place (which is why I suggested it be called Source):

ReadOnlySpan<char> str = "Hello|World";
var enumerator = str.Split('|');
Assert.True(enumerator.Source == str);
Assert.True(enumerator.MoveNext());
Assert.True(enumerator.Source == str);
Assert.True(enumerator.Current.Length == 5);
Assert.True(enumerator.MoveNext());
Assert.True(enumerator.Source == str);
Assert.True(enumerator.Current.Length == 5);
Assert.False(enumerator.MoveNext());
Assert.True(enumerator.Source == str);

which lets him change

private static string[] Split(ReadOnlySpan<char> source, char delim)
{
    int i = 0;

    foreach (var ignored in source.Split(delim))
    {
        i++;
    }

    string[] ret = new string[i];
    i = 0;

    foreach (Range range in source.Split(delim))
    {
        ret[i] = source[range].ToString();
        i++;
    }
}

into

private static string[] Split(ReadOnlySpan<char> source, char delim)
{
    return source.Split(delim).ToArray();
}

private static string[] ToArray(this MemoryExtensions.SpanSplitEnumerator<char> enumerator)
{
    int i = 0;

    foreach (var ignored in enumerator)
    {
        i++;
    }

    string[] ret = new string[i];
    i = 0;

    foreach (Range range in enumerator)
    {
        ret[i] = enumerator.Source[range].ToString();
        i++;
    }
}

@JeremyKuhne
Copy link
Member Author

@adamsitnik, @bartonjs assumption is correct. :)

@dotnet-policy-service dotnet-policy-service bot added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed needs-author-action An issue or pull request that requires more info or actions from the author. labels Jan 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Memory needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration
Projects
None yet
Development

No branches or pull requests

5 participants