Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LibraryImport generated code feedback #69608

Closed
stephentoub opened this issue May 20, 2022 · 8 comments
Closed

LibraryImport generated code feedback #69608

stephentoub opened this issue May 20, 2022 · 8 comments
Labels
area-System.Runtime.InteropServices source-generator Indicates an issue with a source generator feature
Milestone

Comments

@stephentoub
Copy link
Member

stephentoub commented May 20, 2022

I spent a little time looking over the code generated for the LibraryImports in CoreLib. Some random feedback / comments / questions...

Performance

  1. [Addressed: [LibraryImportGenerator] Allow span copy for char arrays instead of manual copy loop #69764] Manual copy loops. For RegSetValueEx, I see this code being generated for marshaling:
                __lpData_gen_native__marshaller = new(lpData, new System.Span<byte>(__lpData_gen_native__marshaller__stackptr, 512), sizeof(ushort));
                {
                    System.ReadOnlySpan<char> __lpData_gen_native__marshaller__managedSpan = __lpData_gen_native__marshaller.GetManagedValuesSource();
                    System.Span<ushort> __lpData_gen_native__marshaller__nativeSpan = System.Runtime.InteropServices.MemoryMarshal.Cast<byte, ushort>(__lpData_gen_native__marshaller.GetNativeValuesDestination());
                    for (int __i0 = 0; __i0 < __lpData_gen_native__marshaller__managedSpan.Length; ++__i0)
                    {
                        __lpData_gen_native__marshaller__nativeSpan[__i0] = __lpData_gen_native__marshaller__managedSpan[__i0];
                    }
                }

That copy loop could be fairly inefficient. Can't we use MemoryMarshal.Cast in a case like this to treat the char input as a ushort and then use the vectorized ReadOnlySpan.CopyTo?

  1. [Already the case] SkipLocalsInit. Some of the generated code emits fairly large stackallocs, e.g. stackalloc byte[512]. That can add measurable overhead if the user's code hasn't set SkipLocalsInit. Can we add [SkipLocalsInit] to the generated method if there isn't already one provided by the user at a level that would encompass the generated method?

  2. GetArrayDataReference. The GetCalendars method in corelib is being generated like this:

        internal static partial int GetCalendars(string localeName, global::System.Globalization.CalendarId[] calendars, int calendarsCapacity)
        {
            int __retVal;
            //
            // Marshal
            //
            ref global::System.Globalization.CalendarId __byref_calendars = ref (calendars == null ? ref *(global::System.Globalization.CalendarId*)0 : ref System.Runtime.InteropServices.MemoryMarshal.GetArrayDataReference(calendars));
            //
            // Pin
            //
            fixed (void* __localeName_gen_native__pinned = localeName)
            {
                ushort* __localeName_gen_native = (ushort*)__localeName_gen_native__pinned;
                fixed (byte* __calendars_gen_native = &System.Runtime.CompilerServices.Unsafe.As<global::System.Globalization.CalendarId, byte>(ref __byref_calendars))
                {
                    __retVal = __PInvoke__(__localeName_gen_native, __calendars_gen_native, calendarsCapacity);
                }
            }

            return __retVal;
            //
            // Local P/Invoke
            //
            [System.Runtime.InteropServices.DllImportAttribute("System.Globalization.Native", EntryPoint = "GlobalizationNative_GetCalendars", ExactSpelling = false)]
            extern static unsafe int __PInvoke__(ushort* localeName, byte* calendars, int calendarsCapacity);
        }

Why is it getting a ref to the array via that complicated null check + GetArrayDataReference followed later by using fixed on an Unsafe.As cast in order to get a byte*, rather than just using fixed with the array to get a void* and then casting that pointer to byte*?

  1. SafeHandle disposal. When marshaling out a SafeHandle, the generator emits code like this:
            __retVal = new global::Microsoft.Win32.SafeHandles.SafeFindHandle();
            try
            {
                //
                // Pin
                //
                fixed (void* __lpFileName_gen_native__pinned = lpFileName)
                {
                    ushort* __lpFileName_gen_native = (ushort*)__lpFileName_gen_native__pinned;
                    fixed (global::Interop.Kernel32.WIN32_FIND_DATA* __lpFindFileData_gen_native = &lpFindFileData)
                    {
                        {
                            System.Runtime.InteropServices.Marshal.SetLastSystemError(0);
                            __retVal_gen_native = __PInvoke__(__lpFileName_gen_native, fInfoLevelId, __lpFindFileData_gen_native, fSearchOp, lpSearchFilter, dwAdditionalFlags);
                            __lastError = System.Runtime.InteropServices.Marshal.GetLastSystemError();
                        }
                    }
                }

                __invokeSucceeded = true;
            }
            finally
            {
                if (__invokeSucceeded)
                {
                    //
                    // GuaranteedUnmarshal
                    //
                    System.Runtime.InteropServices.Marshal.InitHandle(__retVal, __retVal_gen_native);
                }
            }

It's using a try/finally and __invokeSucceeded, so it's trying to accomodate something in that try region throwing. But if something there does throw, the allocated SafeHandle won't be disposed, which will in turn leave it for finalization. Should there be a catch block?

catch
{
    __retVal.Dispose();
    throw;
}

or potentially instead a catch with an exception filter that does the disposal and returns false to avoid interrupting the unwind?

  1. [Addressed: Libraryimport src gen audit #69619] Out initialization. When there's an out variable, the generated code starts off by zero'ing it, e.g.
        internal static partial int CLSIDFromProgID(string lpszProgID, out global::System.Guid lpclsid)
        {
            lpclsid = default;
            ...

Does the built-in marshaling do this, or is this done to work around C# definite assignment errors? If we're doing it just for the latter, we could instead use Unsafe.SkipInit(out ...), and avoid the extra zero'ing / indirect write.

Style

  1. Blank lines. The generator currently emits code like:
            int __throwOnError_gen_native;
            int __ignoreCase_gen_native;
            //
            // Marshal
            //
            __throwOnError_gen_native = (int)(throwOnError ? 1 : 0);
            __ignoreCase_gen_native = (int)(ignoreCase ? 1 : 0);
            //
            // Pin
            //
            fixed (void* __name_gen_native__pinned = name)
            {
                ushort* __name_gen_native = (ushort*)__name_gen_native__pinned;
                {
                    __PInvoke__(assembly, __name_gen_native, __throwOnError_gen_native, __ignoreCase_gen_native, type, keepAlive, assemblyLoadContext);
                }
            }

Could we add a blank line, purely for readability, before each big comment block that's preceded by statements? Also, do we need the extra "//" lines, or could these just be single-line comments? It takes up a lot of vertical real estate, and I'm not sure what it helps.

  1. [Addressed: Libraryimport src gen audit #69619] Comments. Instead of "Marshal", could we make it a bit more descriptive, e.g. "Marshal the input arguments.", and instead of "Pin", it could be "Pin the input parameters". Similarly for "Unmarshal", which could be "Marshal the output values" or something like that.

  2. [Addressed: Style improvements for generated p/invokes #69638] Variable naming. Lots of variables include "_gen_native"... presumably the "_native" part is meant to mean that's the variable that's going to be passed to the P/Invoke... what's the meaning / value-add of the "_gen"? The variables are already prefixed with "__", presumably to differentiate them from the values being used?

  3. [Addressed: Libraryimport src gen audit #69619] Modifier ordering. IDE0036 has a preference for the order of modifiers. Since we have to choose some ordering, it'd be nice to snap to its preference (https://docs.microsoft.com/en-us/dotnet/fundamentals/code-analysis/style-rules/ide0036#overview), e.g. today we might output "extern static unsafe void PInvoke", but following its ordering it would be "static extern unsafe void PInvoke"

  4. [Addressed: Style improvements for generated p/invokes #69638] Braces. The actual PInvoke call often ends up being surrounded by an extra level of braces, e.g.

            //
            // Pin
            //
            fixed (void* __resourceName_gen_native__pinned = resourceName)
            {
                ushort* __resourceName_gen_native = (ushort*)__resourceName_gen_native__pinned;
                {
                    __retVal = __PInvoke__(assembly, __resourceName_gen_native, assemblyRef, retFileName);
                }
            }

Why?

  1. [Addressed: [LibraryImportGenerator] Reduce unnecessary casting/locals in pinning path #69804] Locals for storing cast values. As an example, the TraceResolvingHandlerInvoked method in corelib has this generated code:
            fixed (void* __assemblyName_gen_native__pinned = assemblyName)
            {
                ushort* __assemblyName_gen_native = (ushort*)__assemblyName_gen_native__pinned;
                fixed (void* __handlerName_gen_native__pinned = handlerName)
                {
                    ushort* __handlerName_gen_native = (ushort*)__handlerName_gen_native__pinned;
                    fixed (void* __alcName_gen_native__pinned = alcName)
                    {
                        ushort* __alcName_gen_native = (ushort*)__alcName_gen_native__pinned;
                        fixed (void* __resultAssemblyName_gen_native__pinned = resultAssemblyName)
                        {
                            ushort* __resultAssemblyName_gen_native = (ushort*)__resultAssemblyName_gen_native__pinned;
                            fixed (void* __resultAssemblyPath_gen_native__pinned = resultAssemblyPath)
                            {
                                ushort* __resultAssemblyPath_gen_native = (ushort*)__resultAssemblyPath_gen_native__pinned;
                                {
                                    __retVal_gen_native = __PInvoke__(__assemblyName_gen_native, __handlerName_gen_native, __alcName_gen_native, __resultAssemblyName_gen_native, __resultAssemblyPath_gen_native);
                                }
                            }
                        }
                    }
                }
            }

It's creating new locals just to store the void* cast to a ushort*, which is then only used as an argument. Could it instead be:

            fixed (void* __assemblyName_gen_native = assemblyName)
                fixed (void* __handlerName_gen_native = handlerName)
                    fixed (void* __alcName_gen_native = alcName)
                        fixed (void* __resultAssemblyName_gen_native = resultAssemblyName)
                            fixed (void* __resultAssemblyPath_gen_native = resultAssemblyPath)
                            {
                                    __retVal_gen_native = __PInvoke__((ushort*)__assemblyName_gen_native, (ushort*)__handlerName_gen_native, (ushort*)__alcName_gen_native, (ushort*)__resultAssemblyName_gen_native, (ushort*)__resultAssemblyPath_gen_native);
                            }

or even declare PInvoke to take void* instead of ushort* and then the casts wouldn't be needed at all?

            fixed (void* __assemblyName_gen_native = assemblyName)
                fixed (void* __handlerName_gen_native = handlerName)
                    fixed (void* __alcName_gen_native = alcName)
                        fixed (void* __resultAssemblyName_gen_native = resultAssemblyName)
                            fixed (void* __resultAssemblyPath_gen_native = resultAssemblyPath)
                            {
                                    __retVal_gen_native = __PInvoke__(__assemblyName_gen_native, __handlerName_gen_native, __alcName_gen_native, __resultAssemblyName_gen_native, __resultAssemblyPath_gen_native);
                            }
  1. Simplifiable boilerplate. Consider a method like Debugger.LaunchInternal:
        private static partial bool LaunchInternal()
        {
            bool __retVal;
            int __retVal_gen_native;
            __retVal_gen_native = __PInvoke__();
            //
            // Unmarshal
            //
            __retVal = __retVal_gen_native != 0;
            return __retVal;
            //
            // Local P/Invoke
            //
            [System.Runtime.InteropServices.DllImportAttribute("QCall", EntryPoint = "DebugDebugger_Launch", ExactSpelling = false)]
            extern static unsafe int __PInvoke__();
        }

If someone were writing this by hand, it would just be:

        private static partial bool LaunchInternal()
        {
            return __PInvoke__() != 0;

            [System.Runtime.InteropServices.DllImportAttribute("QCall", EntryPoint = "DebugDebugger_Launch", ExactSpelling = false)]
            extern static unsafe int __PInvoke__();
        }

Have we considered enabling such simplified emitted code when we can detect it?

  1. [Addressed: Libraryimport src gen audit #69619] Double underscores. I assume we're using the "__" prefix to avoid conflict with user-defined names, like for arguments? Why does the "__PInvoke__" also have a "__" suffix?
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label May 20, 2022
@stephentoub stephentoub added area-System.Runtime.InteropServices and removed untriaged New issue has not been triaged by the area owner labels May 20, 2022
@stephentoub stephentoub added this to the 7.0.0 milestone May 20, 2022
@ghost
Copy link

ghost commented May 20, 2022

Tagging subscribers to this area: @dotnet/interop-contrib
See info in area-owners.md if you want to be subscribed.

Issue Details

I spent a little time looking over the code generated for the LibraryImports in CoreLib. Some random feedback / comments / questions...

Performance

  1. Manual copy loops. For RegSetValueEx, I see this code being generated for marshaling:
                __lpData_gen_native__marshaller = new(lpData, new System.Span<byte>(__lpData_gen_native__marshaller__stackptr, 512), sizeof(ushort));
                {
                    System.ReadOnlySpan<char> __lpData_gen_native__marshaller__managedSpan = __lpData_gen_native__marshaller.GetManagedValuesSource();
                    System.Span<ushort> __lpData_gen_native__marshaller__nativeSpan = System.Runtime.InteropServices.MemoryMarshal.Cast<byte, ushort>(__lpData_gen_native__marshaller.GetNativeValuesDestination());
                    for (int __i0 = 0; __i0 < __lpData_gen_native__marshaller__managedSpan.Length; ++__i0)
                    {
                        __lpData_gen_native__marshaller__nativeSpan[__i0] = __lpData_gen_native__marshaller__managedSpan[__i0];
                    }
                }

That copy loop could be fairly inefficient. Can't we use MemoryMarshal.Cast in a case like this to treat the char input as a ushort and then use the vectorized ReadOnlySpan.CopyTo?

  1. SkipLocalsInit. Some of the generated code emits fairly large stackallocs, e.g. stackalloc byte[512]. That can add measurable overhead if the user's code hasn't set SkipLocalsInit. Can we add [SkipLocalsInit] to the generated method if there isn't already one provided by the user at a level that would encompass the generated method?

  2. GetArrayDataReference. The GetCalendars method in corelib is being generated like this:

        internal static partial int GetCalendars(string localeName, global::System.Globalization.CalendarId[] calendars, int calendarsCapacity)
        {
            int __retVal;
            //
            // Marshal
            //
            ref global::System.Globalization.CalendarId __byref_calendars = ref (calendars == null ? ref *(global::System.Globalization.CalendarId*)0 : ref System.Runtime.InteropServices.MemoryMarshal.GetArrayDataReference(calendars));
            //
            // Pin
            //
            fixed (void* __localeName_gen_native__pinned = localeName)
            {
                ushort* __localeName_gen_native = (ushort*)__localeName_gen_native__pinned;
                fixed (byte* __calendars_gen_native = &System.Runtime.CompilerServices.Unsafe.As<global::System.Globalization.CalendarId, byte>(ref __byref_calendars))
                {
                    __retVal = __PInvoke__(__localeName_gen_native, __calendars_gen_native, calendarsCapacity);
                }
            }

            return __retVal;
            //
            // Local P/Invoke
            //
            [System.Runtime.InteropServices.DllImportAttribute("System.Globalization.Native", EntryPoint = "GlobalizationNative_GetCalendars", ExactSpelling = false)]
            extern static unsafe int __PInvoke__(ushort* localeName, byte* calendars, int calendarsCapacity);
        }

Why is it getting a ref to the array via that complicated null check + GetArrayDataReference followed later by using fixed on an Unsafe.As cast in order to get a byte*, rather than just using fixed with the array to get a void* and then casting that pointer to byte*?

  1. SafeHandle disposal. When marshaling out a SafeHandle, the generator emits code like this:
            __retVal = new global::Microsoft.Win32.SafeHandles.SafeFindHandle();
            try
            {
                //
                // Pin
                //
                fixed (void* __lpFileName_gen_native__pinned = lpFileName)
                {
                    ushort* __lpFileName_gen_native = (ushort*)__lpFileName_gen_native__pinned;
                    fixed (global::Interop.Kernel32.WIN32_FIND_DATA* __lpFindFileData_gen_native = &lpFindFileData)
                    {
                        {
                            System.Runtime.InteropServices.Marshal.SetLastSystemError(0);
                            __retVal_gen_native = __PInvoke__(__lpFileName_gen_native, fInfoLevelId, __lpFindFileData_gen_native, fSearchOp, lpSearchFilter, dwAdditionalFlags);
                            __lastError = System.Runtime.InteropServices.Marshal.GetLastSystemError();
                        }
                    }
                }

                __invokeSucceeded = true;
            }
            finally
            {
                if (__invokeSucceeded)
                {
                    //
                    // GuaranteedUnmarshal
                    //
                    System.Runtime.InteropServices.Marshal.InitHandle(__retVal, __retVal_gen_native);
                }
            }

It's using a try/finally and __invokeSucceeded, so it's trying to accomodate something in that try region throwing. But if something there does throw, the allocated SafeHandle won't be disposed, which will in turn leave it for finalization. Should there be a catch block?

catch
{
    __retVal.Dispose();
    throw;
}

or potentially instead a catch with an exception filter that does the disposal and returns false to avoid interrupting the unwind?

  1. Out initialization. When there's an out variable, the generated code starts off by zero'ing it, e.g.
        internal static partial int CLSIDFromProgID(string lpszProgID, out global::System.Guid lpclsid)
        {
            lpclsid = default;
            ...

Does the built-in marshaling do this, or is this done to work around C# definite assignment errors? If we're doing it just for the latter, we could instead use Unsafe.SkipInit(out ...), and avoid the extra zero'ing / indirect write.

Style

  1. Blank lines. The generator currently emits code like:
            int __throwOnError_gen_native;
            int __ignoreCase_gen_native;
            //
            // Marshal
            //
            __throwOnError_gen_native = (int)(throwOnError ? 1 : 0);
            __ignoreCase_gen_native = (int)(ignoreCase ? 1 : 0);
            //
            // Pin
            //
            fixed (void* __name_gen_native__pinned = name)
            {
                ushort* __name_gen_native = (ushort*)__name_gen_native__pinned;
                {
                    __PInvoke__(assembly, __name_gen_native, __throwOnError_gen_native, __ignoreCase_gen_native, type, keepAlive, assemblyLoadContext);
                }
            }

Could we add a blank line, purely for readability, before each big comment block that's preceded by statements? Also, do we need the extra "//" lines, or could these just be single-line comments? It takes up a lot of vertical real estate, and I'm not sure what it helps.

  1. Comments. Instead of "Marshal", could we make it a bit more descriptive, e.g. "Marshal the input arguments.", and instead of "Pin", it could be "Pin the input parameters". Similarly for "Unmarshal", which could be "Marshal the output values" or something like that.

  2. Variable naming. Lots of variables include "_gen_native"... presumably the "_native" part is meant to mean that's the variable that's going to be passed to the P/Invoke... what's the meaning / value-add of the "_gen"? The variables are already prefixed with "__", presumably to differentiate them from the values being used?

  3. Modifier ordering. IDE0036 has a preference for the order of modifiers. Since we have to choose some ordering, it'd be nice to snap to its preference (https://docs.microsoft.com/en-us/dotnet/fundamentals/code-analysis/style-rules/ide0036#overview), e.g. today we might output "extern static unsafe void PInvoke", but following its ordering it would be "static extern unsafe void PInvoke"

  4. Braces. The actual PInvoke call often ends up being surrounded by an extra level of braces, e.g.

            //
            // Pin
            //
            fixed (void* __resourceName_gen_native__pinned = resourceName)
            {
                ushort* __resourceName_gen_native = (ushort*)__resourceName_gen_native__pinned;
                {
                    __retVal = __PInvoke__(assembly, __resourceName_gen_native, assemblyRef, retFileName);
                }
            }

Why?

  1. Locals for storing cast values. As an example, the TraceResolvingHandlerInvoked method in corelib has this generated code:
            fixed (void* __assemblyName_gen_native__pinned = assemblyName)
            {
                ushort* __assemblyName_gen_native = (ushort*)__assemblyName_gen_native__pinned;
                fixed (void* __handlerName_gen_native__pinned = handlerName)
                {
                    ushort* __handlerName_gen_native = (ushort*)__handlerName_gen_native__pinned;
                    fixed (void* __alcName_gen_native__pinned = alcName)
                    {
                        ushort* __alcName_gen_native = (ushort*)__alcName_gen_native__pinned;
                        fixed (void* __resultAssemblyName_gen_native__pinned = resultAssemblyName)
                        {
                            ushort* __resultAssemblyName_gen_native = (ushort*)__resultAssemblyName_gen_native__pinned;
                            fixed (void* __resultAssemblyPath_gen_native__pinned = resultAssemblyPath)
                            {
                                ushort* __resultAssemblyPath_gen_native = (ushort*)__resultAssemblyPath_gen_native__pinned;
                                {
                                    __retVal_gen_native = __PInvoke__(__assemblyName_gen_native, __handlerName_gen_native, __alcName_gen_native, __resultAssemblyName_gen_native, __resultAssemblyPath_gen_native);
                                }
                            }
                        }
                    }
                }
            }

It's creating new locals just to store the void* cast to a ushort*, which is then only used as an argument. Could it instead be:

            fixed (void* __assemblyName_gen_native = assemblyName)
                fixed (void* __handlerName_gen_native = handlerName)
                    fixed (void* __alcName_gen_native = alcName)
                        fixed (void* __resultAssemblyName_gen_native = resultAssemblyName)
                            fixed (void* __resultAssemblyPath_gen_native = resultAssemblyPath)
                            {
                                    __retVal_gen_native = __PInvoke__((ushort*)__assemblyName_gen_native, (ushort*)__handlerName_gen_native, (ushort*)__alcName_gen_native, (ushort*)__resultAssemblyName_gen_native, (ushort*)__resultAssemblyPath_gen_native);
                            }

or even declare PInvoke to take void* instead of ushort* and then the casts wouldn't be needed at all?

            fixed (void* __assemblyName_gen_native = assemblyName)
                fixed (void* __handlerName_gen_native = handlerName)
                    fixed (void* __alcName_gen_native = alcName)
                        fixed (void* __resultAssemblyName_gen_native = resultAssemblyName)
                            fixed (void* __resultAssemblyPath_gen_native = resultAssemblyPath)
                            {
                                    __retVal_gen_native = __PInvoke__(__assemblyName_gen_native, __handlerName_gen_native, __alcName_gen_native, __resultAssemblyName_gen_native, __resultAssemblyPath_gen_native);
                            }
  1. Simplifiable boilerplate. Consider a method like Debugger.LaunchInternal:
        private static partial bool LaunchInternal()
        {
            bool __retVal;
            int __retVal_gen_native;
            __retVal_gen_native = __PInvoke__();
            //
            // Unmarshal
            //
            __retVal = __retVal_gen_native != 0;
            return __retVal;
            //
            // Local P/Invoke
            //
            [System.Runtime.InteropServices.DllImportAttribute("QCall", EntryPoint = "DebugDebugger_Launch", ExactSpelling = false)]
            extern static unsafe int __PInvoke__();
        }

If someone were writing this by hand, it would just be:

        private static partial bool LaunchInternal()
        {
            return __PInvoke__() != 0;

            [System.Runtime.InteropServices.DllImportAttribute("QCall", EntryPoint = "DebugDebugger_Launch", ExactSpelling = false)]
            extern static unsafe int __PInvoke__();
        }

Have we considered enabling such simplified emitted code when we can detect it?

  1. Double underscores. I assume we're using the "__" prefix to avoid conflict with user-defined names, like for arguments? Why does the "__PInvoke__" also have a "__" suffix?
Author: stephentoub
Assignees: -
Labels:

area-System.Runtime.InteropServices

Milestone: -

@AaronRobinsonMSFT AaronRobinsonMSFT added the source-generator Indicates an issue with a source generator feature label May 20, 2022
@elinor-fung
Copy link
Member

  1. SkipLocalsInit... Can we add [SkipLocalsInit] to the generated method if there isn't already one provided by the user at a level that would encompass the generated method?

@stephentoub could you point us at an example for this? We should already be doing this (and should have tests for it), so it's definitely a bug if it is not happening. I believe most of the runtime libraries already have it set it at the module level, so we don't add it on the method.

@AaronRobinsonMSFT
Copy link
Member

AaronRobinsonMSFT commented May 21, 2022

@stephentoub Using Unsafe.SkipInit is possible except when the out is a pointer value - good idea. The remaining 4 functional statements the team is looking at - see @elinor-fung's comment above.

I appreciate the aspirational aspect of the style issues and I think we can move closer, but for our first public version we are only going to focus on the easy ones. Updated comments on the stages - good idea and will help with design and education. Modifier order is an oops on our part, good idea. The "__" suffix is superfluous and agree that we can remove it.

The rest are really about simplifying code. This gets really complicated fast because it involves going back through the Roslyn syntax trees and updating them. I see very little practical value in this and minimizing it is going to become very tedious. For right now the rest of the style suggestions are going to be deferred/rejected.

Seems I misinterpreted the stated cost for some of the style issues. Will handle on a case by case basis.

@stephentoub
Copy link
Member Author

stephentoub commented May 22, 2022

We should already be doing this

You're right, it's behaving as I'd expect. I'd tried it in a separate project but must have been doing something wrong. I see it now doing what I expect it to be doing.

The remaining 4 functional statements the team is looking at

Thanks.

The rest are really about simplifying code.

Yes. Though sometimes stylistic improvements also map to smaller IL and asm, e.g. here's (6) on the style list, before and after:
SharpLab
image

This gets really complicated fast because it involves going back through the Roslyn syntax trees and updating them.

I'm curious why we're using Roslyn's syntax trees to generate the resulting code, especially if it makes it harder to generate the desired output. We can also just write out the C# directly, which is what all of the other generators in dotnet/runtime do (the logging generator, the json generator, and the regex generator).

Will handle on a case by case basis.

Sounds good. It's feedback; address it as you see fit.

@jkoritzinsky
Copy link
Member

jkoritzinsky commented May 24, 2022

Performance

  1. GetArrayDataReference. The GetCalendars method in corelib is being generated like this:
        internal static partial int GetCalendars(string localeName, global::System.Globalization.CalendarId[] calendars, int calendarsCapacity)
        {
            int __retVal;
            //
            // Marshal
            //
            ref global::System.Globalization.CalendarId __byref_calendars = ref (calendars == null ? ref *(global::System.Globalization.CalendarId*)0 : ref System.Runtime.InteropServices.MemoryMarshal.GetArrayDataReference(calendars));
            //
            // Pin
            //
            fixed (void* __localeName_gen_native__pinned = localeName)
            {
                ushort* __localeName_gen_native = (ushort*)__localeName_gen_native__pinned;
                fixed (byte* __calendars_gen_native = &System.Runtime.CompilerServices.Unsafe.As<global::System.Globalization.CalendarId, byte>(ref __byref_calendars))
                {
                    __retVal = __PInvoke__(__localeName_gen_native, __calendars_gen_native, calendarsCapacity);
                }
            }

            return __retVal;
            //
            // Local P/Invoke
            //
            [System.Runtime.InteropServices.DllImportAttribute("System.Globalization.Native", EntryPoint = "GlobalizationNative_GetCalendars", ExactSpelling = false)]
            extern static unsafe int __PInvoke__(ushort* localeName, byte* calendars, int calendarsCapacity);
        }

Why is it getting a ref to the array via that complicated null check + GetArrayDataReference followed later by using fixed on an Unsafe.As cast in order to get a byte*, rather than just using fixed with the array to get a void* and then casting that pointer to byte*?

We can't use a fixed statement with arrays as C# normalizes an empty array to a null value, which breaks a lot of interop scenarios with arrays. We have the complex initialization to handle null arrays as a null array causes an exception in GetArrayDataReference. See

fixed (char* pSource = &MemoryMarshal.GetNonNullPinnableReference(source))
, https://github.com/dotnet/runtime/pull/52030/files#diff-4b9bcd238c75b08d184c630e8e90372464f848bbb2ed0466c0c3bafe74ee27b7R23, dotnet/runtimelab#1051 (comment),
// COMPAT: We cannot generate the same code that the C# compiler generates for
.

  1. SafeHandle disposal. When marshaling out a SafeHandle, the generator emits code like this:
            __retVal = new global::Microsoft.Win32.SafeHandles.SafeFindHandle();
            try
            {
                //
                // Pin
                //
                fixed (void* __lpFileName_gen_native__pinned = lpFileName)
                {
                    ushort* __lpFileName_gen_native = (ushort*)__lpFileName_gen_native__pinned;
                    fixed (global::Interop.Kernel32.WIN32_FIND_DATA* __lpFindFileData_gen_native = &lpFindFileData)
                    {
                        {
                            System.Runtime.InteropServices.Marshal.SetLastSystemError(0);
                            __retVal_gen_native = __PInvoke__(__lpFileName_gen_native, fInfoLevelId, __lpFindFileData_gen_native, fSearchOp, lpSearchFilter, dwAdditionalFlags);
                            __lastError = System.Runtime.InteropServices.Marshal.GetLastSystemError();
                        }
                    }
                }

                __invokeSucceeded = true;
            }
            finally
            {
                if (__invokeSucceeded)
                {
                    //
                    // GuaranteedUnmarshal
                    //
                    System.Runtime.InteropServices.Marshal.InitHandle(__retVal, __retVal_gen_native);
                }
            }

It's using a try/finally and __invokeSucceeded, so it's trying to accomodate something in that try region throwing. But if something there does throw, the allocated SafeHandle won't be disposed, which will in turn leave it for finalization. Should there be a catch block?

catch
{
    __retVal.Dispose();
    throw;
}

or potentially instead a catch with an exception filter that does the disposal and returns false to avoid interrupting the unwind?

Our SafeHandle marshalling was designed to match the built-in marshalling as closely as possible, as SafeHandle has particular guarantees in interop that we didn't want to break. Changing this portion of the behavior seems doable, so we'll consider it.

I'm curious why we're using Roslyn's syntax trees to generate the resulting code, especially if it makes it harder to generate the desired output. We can also just write out the C# directly, which is what all of the other generators in dotnet/runtime do (the logging generator, the json generator, and the regex generator).

As we designed our generator, we decided that using at least some of the Roslyn syntax APIs would help make our code easier to maintain as we'd know if a particular API returned a statement, expression, type name, etc. just by looking at the signature. Additionally, it handles indentation for us, which is quite a convenient feature. Honestly I feel like just using string concatenation would make our generator harder to maintain and extend as we'd have to heavily document what each method returns as the type system wouldn't be able to help us, and analyzing the syntax after generation to optimize code after generation would actually be even more difficult than using the Roslyn APIs. We don't want to special-case or overcomplicate things in our generator too much as it can quickly make the system unmaintainable (i.e. MCG).

We could probably use a speculative semantic model to do some post-generation analysis and simplification if the gains are large enough, but I don't know if it's worth the effort (or even possible with Roslyn's API). Roslyn doesn't currently support getting a speculative semantic model for a new method body for partial method with no body, so that combined with the other costs makes me feel like going this direction is too expensive.

@GSPP
Copy link

GSPP commented May 25, 2022

Can this complex "GetArrayDataReference" expression be made into a helper method in the CompilerServices namespace? That would remove the need to emit this code again and again. This would reduce binary size, improve compile times and make the code more readable.

@jkoritzinsky
Copy link
Member

jkoritzinsky commented Aug 5, 2022

@GSPP that change was handled with the V2 marshaller design and the new ArrayMarshaller<T, TUnmanagedElement> type (the complex code is in the static GetPinnableReference method).

I've extracted performance suggestion 4 (safehandle) to #73496 as that's the last feedback item that we plan to address. I'll close this as completed in the meantime.

@ghost ghost locked as resolved and limited conversation to collaborators Sep 5, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Runtime.InteropServices source-generator Indicates an issue with a source generator feature
Projects
None yet
Development

No branches or pull requests

5 participants