Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Random NullReference and InvalidCast crashes in mono class libraries #1188

Closed
Belorus opened this issue Jan 12, 2018 · 18 comments
Closed

Random NullReference and InvalidCast crashes in mono class libraries #1188

Belorus opened this issue Jan 12, 2018 · 18 comments
Assignees
Labels
Area: App Runtime Issues in `libmonodroid.so`. Area: Mono Runtime Mono-related issues: BCL bugs, AOT issues, etc. need-info Issues that need more information from the author.

Comments

@Belorus
Copy link

Belorus commented Jan 12, 2018

Steps to Reproduce

We have a big game that communicates with server via HTTP.

<AndroidLinkMode>SdkOnly</AndroidLinkMode>
<AotAssemblies>True</AotAssemblies>
<EnableLLVM>True</EnableLLVM>
<AndroidEnableSGenConcurrent>false</AndroidEnableSGenConcurrent>

GCBridge - default, it is tarjan in this XA version.

Actual Behavior

Most of the crashes come from SSL/TLS stack, but List.Add, Dictionary.Add also crash for some users, but 50 times less often.

Crashes on Androids:

Xamarin caused by: android.runtime.JavaProxyThrowable: System.NullReferenceException: Object reference not set to an instance of an object
  at (wrapper remoting-invoke-with-check) Mono.Security.Protocol.Tls.SslStreamBase:set_CheckCertRevocationStatus (bool)
Xamarin caused by: android.runtime.JavaProxyThrowable: System.NullReferenceException: Object reference not set to an instance of an object
  at Mono.Security.Protocol.Tls.SslClientStream.add_ServerCertValidation2 (Mono.Security.Protocol.Tls.CertificateValidationCallback2 value) <0x9d0ee038 + 0x0005c> in <cdde401f4fc64c87a9a513ce5f60d5ad>:0 
  at (wrapper remoting-invoke-with-check) Mono.Security.Protocol.Tls.SslClientStream:add_ServerCertValidation2 (Mono.Security.Protocol.Tls.CertificateValidationCallback2)
Xamarin caused by: android.runtime.JavaProxyThrowable: System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'SslStream'.
System.Net.Security.SslStream.get_Impl()
System.Net.Security.SslStream.get_Impl()(wrapper remoting-invoke-with-check)
Xamarin caused by: android.runtime.JavaProxyThrowable: System.NullReferenceException: Object reference not set to an instance of an object
System.Net.HttpWebRequest.get_ClientCertificates()
System.Net.HttpWebRequest.get_ClientCertificates()(wrapper remoting-invoke-with-check)
Xamarin caused by: android.runtime.JavaProxyThrowable: System.NullReferenceException: Object reference not set to an instance of an object
System.Net.HttpWebRequest.get_ServerCertValidationCallback()
System.Net.HttpWebRequest.get_ServerCertValidationCallback()(wrapper remoting-invoke-with-check)
Xamarin caused by: android.runtime.JavaProxyThrowable: System.NullReferenceException: Object reference not set to an instance of an object
Mono.Security.Protocol.Tls.TlsStream.ToArray()
Mono.Security.Protocol.Tls.TlsStream.ToArray()(wrapper remoting-invoke-with-check)
Xamarin caused by: android.runtime.JavaProxyThrowable: System.NullReferenceException: Object reference not set to an instance of an object
System.Collections.Generic.List<T>.Add(T item)
Xamarin caused by: android.runtime.JavaProxyThrowable: System.NullReferenceException: Object reference not set to an instance of an object
  at System.Runtime.CompilerServices.TaskAwaiter`1[TResult].UnsafeOnCompleted (System.Action continuation) <0x9c1fa264 + 0x00014> in <162b154355a14010abd9ab37698d0a10>:0 
  at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[TResult].AwaitUnsafeOnCompleted[TAwaiter,TStateMachine] (TAwaiter& awaiter, TStateMachine& stateMachine) [0x0004c] in <162b154355a14010abd9ab37698d0a10>:0 
--- End of stack trace from previous location where exception was thrown ---
  at System.Runtime.CompilerServices.AsyncMethodBuilderCore+<>c.<ThrowAsync>b__6_1 (System.Object state) <0x9c1f9ba0 + 0x00058> in <162b154355a14010abd9ab37698d0a10>:0 
  at System.Threading.QueueUserWorkItemCallback.WaitCallback_Context (System.Object state) <0x9c151b64 + 0x0004f> in <162b154355a14010abd9ab37698d0a10>:0 
  at System.Threading.ExecutionContext.RunInternal (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, System.Boolean preserveSyncCtx) <0x9c14ba74 + 0x001bb> in <162b154355a14010abd9ab37698d0a10>:0 
  at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem () <0x9c151afc + 0x0003b> in <162b154355a14010abd9ab37698d0a10>:0 
  at System.Threading.ThreadPoolWorkQueue.Dispatch () [0x00074] in <162b154355a14010abd9ab37698d0a10>:0 
  at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback () <0x9c15199c + 0x00007> in <162b154355a14010abd9ab37698d0a10>:0 
Xamarin caused by: android.runtime.JavaProxyThrowable: System.NullReferenceException: Object reference not set to an instance of an object
System.Runtime.CompilerServices.TaskAwaiter.OnCompletedInternal(Task task, Action continuation, bool continueOnCapturedContext, bool flowExecutionContext)
System.Runtime.CompilerServices.TaskAwaiter<TResult>.UnsafeOnCompleted(Action continuation)
System.Runtime.CompilerServices.AsyncTaskMethodBuilder<TResult>.AwaitUnsafeOnCompleted<TAwaiter, TStateMachine>(ref TAwaiter awaiter, ref TStateMachine stateMachine)<162b154355a14010abd9ab37698d0a10>:0
System.Threading.QueueUserWorkItemCallback.WaitCallback_Context(object state)
System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, object state, bool preserveSyncCtx)
System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
System.Threading.ThreadPoolWorkQueue.Dispatch()<162b154355a14010abd9ab37698d0a10>:0
System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()

An so on.

Version Information

Microsoft Visual Studio Professional 2017
Version 15.5.2
VisualStudio.15.Release/15.5.2+27130.2010
Microsoft .NET Framework
Version 4.7.02053

Installed Version: Professional

Xamarin 4.8.0.753 (6575bd113)
Visual Studio extension to enable development for Xamarin.iOS and Xamarin.Android.

Xamarin.Android SDK 8.1.0.25 (HEAD/d8c6e504f)
Xamarin.Android Reference Assemblies and MSBuild support.

Other

My naive assumptions is that this is a bug in GC, we'll definitely try SGEN in next release to check that.
Will try new GC bridge as well

@Belorus Belorus changed the title Random NullRef crashes in SSL/TLS stack Random NullRef crashes in SSL/TLS (not only) stack Jan 12, 2018
@marek-safar
Copy link
Contributor

/cc @brendanzagaeski could Mono part of the problem be addressed by your latest fix

@brendanzagaeski
Copy link
Contributor

Hmm. These look a bit different. I quickly checked to see how unobserved exceptions from Tasks look on Xamarin.Android. As it turns out, exception telemetry coming from the UnobservedTaskException event would by default not include the android.runtime.JavaProxyThrowable Java type because the exception would be sent before the transition out of the C# runtime. I think the presence of android.runtime.JavaProxyThrowable in the exceptions in the initial comment means the exceptions have propagated to a Java thread by the time they are sent, and that wouldn't happen by default for the unobserved exceptions targeted by my recent fix.

@jonpryor
Copy link
Member

@brendanzagaeski, @marek-safar: Related: Issue #1144.

I think the presence of android.runtime.JavaProxyThrowable in the exceptions in the initial comment means the exceptions have propagated to a Java thread by the time they are sent

Not quite; the presence of JavaProxyThrowable means that the exception was propagated to a Java stack frame.

For example, consider Android calling an Activity.OnCreate() override:

[Activity]
partial class BadExample : Activity {
    protected override void OnCreate(Bundle bundle) {
        throw new InvalidOperationException("Bwa-ha-ha-ha");
    }
}

In the above, we'll have a Java stack frame calling BadExample.OnCreate() (through a delegate/JNI), and thus a stack of:

Java Method > Managed Method (`BadActivity.OnCreate()`)

When BadActivity.OnCreate() throws InvalidOperationException, it is wrapped into a JavaProxyThrowable and then JNIEnv::Throw() is called, making the JavaProxyThrowable a "pending" exception. When control returns to Java Method, the pending JavaProxyThrowable is thrown in the JVM, allowing normal catch and finally blocks to execute.

Unfortunately (Issue #1144) we neglected to "unwrap" the JavaProxyThrowable before raising the AppDomain.UnhandleException event, which is probably why it's showing up in all of those stack traces...IFF a Java > Managed transition occurred, anyway.

@jonpryor
Copy link
Member

@Belorus wrote:

we'll definitely try SGEN in next release

Xamarin.Android doesn't support Boehm. You're already using SGen -- it's the only GC that Xamarin.Android supports.

Crashes on Androids:

How are you collecting those stack traces? They seem oddly truncated, in that I would expect more than two calling frames. Maybe even the Java-side stack trace?

@Belorus
Copy link
Author

Belorus commented Jan 16, 2018

@jonpryor Via HockeyApp SDK 4.1.5
https://snag.gy/3JqLQY.jpg
https://snag.gy/reazTl.jpg

@jonpryor
Copy link
Member

@Belorus: I don't recognize that interface. :-(

I can only assume it's a web UI to some crash reporter, which means I need to ask what the crash reporter is, how it works, and (most importantly) where is the rest of the stack trace?

:-/

@jonpryor
Copy link
Member

I can only assume it's a web UI to some crash reporter

I can only assume I need to spend more time on reading comprehension, as you said it was "HockeyApp SDK 4.1.5".

Doh!

Unfortunately that doesn't immediately answer the "how does it work", though a bit of spelunking suggests that AppDomain.UnhandledException is used as well as java.lang.Thread.setDefaultUncaughtExceptionHandler().

The possible answer to "where is the rest of the stack trace" is probably "< insert gnashing of teeth here >" ^H^H^H I mean "They use Throwable.printStackTrace(), with no calls to Throwable.getMessage()."

JavaProxyThrowable provides the result of innerException.ToString() as the "message" for the base class, accessible via Throwable.getMessage(). (This is because I didn't know of a way to add stack frames to a java.lang.Throwable, apparently having completely overlooked Throwable.setStackTrace().

At present, Throwable.getMessage() is the only way to get the inner/managed stack trace held by the JavaProxyThrowable instance, and it appears that (conjecture!) -- while Throwable.getMessage() is seemingly called -- which is presumably why we get the System.NullReferenceException: Object reference not set to an instance of an object message -- it's being truncated to one line.

Which isn't particularly helpful.

@Belorus
Copy link
Author

Belorus commented Jan 17, 2018

Today i saw the winner (https://snag.gy/9R6ewZ.jpg):

Xamarin caused by: android.runtime.JavaProxyThrowable: System.NullReferenceException: Object reference not set to an instance of an object

We have realtime logs on client devices, there i saw the following (Still truncated, although received not via HockeyApp):

AppDomain.CurrentDomain.UnhandledException. UtcNow is 2018-01-15T11:40:31. --> System.NullReferenceException: Object reference not set to an instance of an object

In code it looks like (simplified):

AppDomain.CurrentDomain.UnhandledException += OnCurrentDomainUnhandledException;

// ...

private static void OnCurrentDomainUnhandledException(object sender, UnhandledExceptionEventArgs e)
{
    if (e.ExceptionObject is Exception exception)
    {
        Log.Fatal($"AppDomain.CurrentDomain.UnhandledException. UtcNow is {DateTime.UtcNow:s}. --> " + exception.ToString());
    }
}

Looks like exception that we get in event handler is already very lets say... consise.

@baulig
Copy link
Contributor

baulig commented Jan 17, 2018

I'm not exactly sure why, but this is using the legacy TLS stack, which is no longer supported and will soon be deleted.

Looking over your stack traces:

  1. Legacy TLS, unsupported and scheduled for removal
  2. same
  3. We would need a full stack trace here - where's the SslStream used, what may have disposed it.
  4. Client Certificates are not supported - though we should only throw NotImplemented / NotSupported in the setter and not throw at all in the getter.
  5. This property returns an instance field - I don't see how we could null-ref here:
    internal ServerCertValidationCallback ServerCertValidationCallback {
        get { return certValidationCallback; }
    }
  1. Legacy TLS, see 1.

I am very confused about these stack traces and don't really understand what's happening here. If this is related to TLS at all, then we should try to at least use BTLS on Android. Testing with the new web stack (once it gets integrated into Android) would be helpful as well.

Let me know if there's anything I can help you with - at the moment, I'm not sure what I could possibly do because I just don't understand what's happening.

@Belorus Belorus changed the title Random NullRef crashes in SSL/TLS (not only) stack Random NullReference and InvalidCast crashes in mono class libraries Jan 17, 2018
@Belorus
Copy link
Author

Belorus commented Jan 17, 2018

@baulig Issue isn't in C# implementation of TLS. Issue is in runtime or GC, or bridge.
I've checked mono C# code as you did, and didn't find reasonable explanations for the crashes.

In app version with native TLS we have huge amount of nullref crashes as well. But in other places.
I've renamed issue, and added more stacktraces.

@Belorus
Copy link
Author

Belorus commented Jan 19, 2018

image
This is crash chart of one of our apps after transition from XA7.x (don't remember exactly) to 8.1
As you see most crashes are not logic related, these are issues of runtime. We didn't have any of these before.

image

@Belorus
Copy link
Author

Belorus commented Jan 19, 2018

Is there any way to downgrade Xamarin Andoid to 7.4 inside VS2017?
I've downloaded https://xamarin.azureedge.net/XamarinforVisualStudio/Windows/Xamarin.VisualStudio_4.6.3.4.msi but it doesn't change anything after installation.

@joj
Copy link
Contributor

joj commented Jan 19, 2018

@Belorus the msi is only installing for VS2015. There is no easy way to downgrade an installed package in VS2017 AFAIK. I may be wrong, though.

@brendanzagaeski
Copy link
Contributor

I'm not sure if it will be helpful for this particular scenario (Xamarin.Android 7.1.0.43 might be too old for what you need), but for reference, Visual Studio 2017 version 15.0 is available from the Downloads section of my.VisualStudio.com to align with the Visual Studio servicing guidelines.

(Related: Installing an earlier release of Visual Studio 2017.)

@akravch
Copy link

akravch commented Apr 10, 2019

Hi, sorry for necroposting, but are there any updates regarding this? We are still experiencing this issue with the crashes around TLS, even thought we've updated Xamarin and VS since then:

Android SDK Tools 1.16
Visual Studio Professional 2017 15.9.28307.222
Microsoft .NET Native SDK 15.0.24211
Xamarin.Android 9.1.4.2 (HEAD/8255f42fc)
SSL/TLS implementation Native TPS 1.2+

Here is a few of the crashes:

Xamarin caused by: android.runtime.JavaProxyThrowable: System.NullReferenceException: Object reference not set to an instance of an object
  at (wrapper xdomain-invoke) Mono.Security.Protocol.Tls.TlsStream.Write(byte[])
  at (wrapper remoting-invoke-with-check) Mono.Security.Protocol.Tls.TlsStream.Write(byte[])
Xamarin caused by: android.runtime.JavaProxyThrowable: System.NullReferenceException: Object reference not set to an instance of an object
  at (wrapper xdomain-invoke) Mono.Security.Protocol.Tls.TlsStream.get_EOF()
  at (wrapper remoting-invoke-with-check) Mono.Security.Protocol.Tls.TlsStream.get_EOF()
Xamarin caused by: android.runtime.JavaProxyThrowable: System.NullReferenceException: Object reference not set to an instance of an object
  at (wrapper xdomain-invoke) System.Net.HttpWebRequest.get_TlsSettings()
  at (wrapper remoting-invoke-with-check) System.Net.HttpWebRequest.get_TlsSettings()

@jonathanpeppers
Copy link
Member

Would anyone be able to comment if this issue is solved in .NET 6 Android projects? Thanks!

@jonathanpeppers jonathanpeppers self-assigned this Jun 1, 2022
@jonathanpeppers jonathanpeppers added the need-info Issues that need more information from the author. label Jun 1, 2022
@ghost
Copy link

ghost commented Jun 1, 2022

Hi @Belorus. We have added the "need-info" label to this issue, which indicates that we have an open question for you before we can take further action. This issue will be closed automatically in 7 days if we do not hear back from you by then - please feel free to re-open it if you come back to this issue after that time.

@jonathanpeppers jonathanpeppers added this to the Under Consideration milestone Jun 1, 2022
@jonathanpeppers jonathanpeppers added Area: Mono Runtime Mono-related issues: BCL bugs, AOT issues, etc. Area: App Runtime Issues in `libmonodroid.so`. labels Jun 1, 2022
@ghost
Copy link

ghost commented Jun 8, 2022

Hi @Belorus. Due to inactivity, we will be closing this issue. Please feel free to re-open this issue if the issue persists. For enhanced visibility, if over 7 days have passed, please open a new issue and link this issue there. Thank you.

@ghost ghost closed this as completed Jun 8, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Jul 9, 2022
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Area: App Runtime Issues in `libmonodroid.so`. Area: Mono Runtime Mono-related issues: BCL bugs, AOT issues, etc. need-info Issues that need more information from the author.
Projects
None yet
Development

No branches or pull requests

8 participants