635 beaconfqdn slow #638
Closes #635
In pkg/beaconfqdn/dissector.go, there appears to be a hardcoded IP:
It's part of the example Mongo query for that piece of code. The IP is from the dnscat2-ja3-strobe-agent set. It's not actually used in the code; it's just there so that the query is valid without modification. Good check though!
Approach looks good. We need to tweak one thing regarding the network_names.
Great work y'all!
pkg/beaconfqdn/dissector.go
@@ -95,7 +191,7 @@ func (d *dissector) start() {
 "icerts": bson.M{"$anyElementTrue": []interface{}{"$dat.icerts"}},
 }},
 {"$group": bson.M{
-"_id": "$src",
+"_id": bson.M{"src": "$src", "uuid": "$src_network_uuid", "network": "$src_network_name"},
We cannot group on network name as multiple network names may be associated with the same network uuid.
This is a downstream result of how winlogbeat identifies Sysmon agents.
Instead, please use "network_name": bson.M{"$first": "$src_network_name"}. Unfortunately, this will need to be added to most of the clauses below.
In the past, queries used $last when grouping on network UUIDs so that we'd be referencing the last known NetBIOS name in case it ever changed. Is there a reason why we should use $first here instead?
Nope. Great catch. I'm just a big dummy. 🤡 Please use $last.
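To make the reasoning in this thread concrete, here is a minimal Go sketch of the two grouping choices. It is not the code from dissector.go: `M` is a local stand-in for `bson.M`, and `conn`, `hostKey`, `groupStage`, `groupWithName`, and `groupLastName` are hypothetical names used only for illustration. The in-memory functions mimic what the `$group` stage would do, assuming `$last` sees documents in input order.

```go
package main

import "fmt"

// M is a local stand-in for bson.M so this sketch needs no MongoDB driver.
type M map[string]interface{}

// groupStage sketches the suggested $group stage: the network name is kept
// out of _id and carried along with a $last accumulator instead.
func groupStage() M {
	return M{"$group": M{
		"_id":          M{"src": "$src", "uuid": "$src_network_uuid"},
		"network_name": M{"$last": "$src_network_name"},
	}}
}

// conn is a hypothetical, simplified connection record.
type conn struct {
	Src         string
	NetworkUUID string
	NetworkName string
}

// hostKey mirrors an _id of {src, uuid}.
type hostKey struct {
	Src         string
	NetworkUUID string
}

// groupWithName groups on (src, uuid, name) and counts records per group.
// A host whose network name changed ends up in more than one group.
func groupWithName(records []conn) map[conn]int {
	out := make(map[conn]int)
	for _, r := range records {
		out[r]++
	}
	return out
}

// groupLastName groups on (src, uuid) only and keeps the last network name
// seen for each group, in input order.
func groupLastName(records []conn) map[hostKey]string {
	out := make(map[hostKey]string)
	for _, r := range records {
		out[hostKey{r.Src, r.NetworkUUID}] = r.NetworkName
	}
	return out
}

func main() {
	// Made-up sample data: one host, one UUID, two names over time.
	records := []conn{
		{"10.0.0.5", "uuid-a", "OLD-NAME"},
		{"10.0.0.5", "uuid-a", "NEW-NAME"},
	}
	fmt.Println("groups when name is part of _id:", len(groupWithName(records)))
	fmt.Println("groups with $last-style name:", groupLastName(records))
	fmt.Println("stage:", groupStage())
}
```

With the name in the group key, the same host is split across groups whenever its name changes; keeping only the source and UUID in the key and carrying the name separately keeps one row per host.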
Merged master back into this branch. Solved merge conflicts by removing the icerts calculations in fqdn beacons since it was removed in master. Also, the fqdn analysis is like wicked fast yo 🙌
Believe I have the comments addressed and this is ready for the final approval. Thanks!
pkg/beaconfqdn/dissector.go
"tbytes": {"$sum": "$dat.tbytes"},
}},
{"$group": {
"_id": {"src": "$src", "uuid": "$src_network_uuid", "network": "$src_network_name"},
Looks like this comment needs to be changed to match the recent changes.
I believe this PR is ready to pull in otherwise.
Think I've got it fixed, thanks!
LGTM. Tested and it seemed to work well.
I have a question on this. I just tried this on a large dataset, and it is really slow: 2 hours of Bro data took over 30 minutes, so the full run might finish in 6-8 hours; I'm not sure yet. Are there any configuration changes I can make? I tried upping the Threshold to 80 and even 200, and got a little improvement. Is it possible to run all the analysis besides FQDN Beacon, then come back and run just the FQDN Beacon, so that I can start looking at the other data while the FQDN Beacon is still processing? Thank you,
Hello, thank you for letting us know about the issue. The code patch here is set to be released as part of RITA v4.3.0 https://github.com/activecm/rita/releases/download/v4.3.0/install.sh. We are currently running quality control testing on the new release, but it is available as a pre-release. We expect that v4.3.0 will have a formal release sometime in the coming week. Would you be interested in giving version 4.3.0 a try and letting us know if the amount of time required to process the FQDN Beacons goes down? Note: RITA v4.3.0 requires upgrading MongoDB to version 4.2. Unfortunately, this means that previous versions of RITA will no longer continue to work as the maximum version of MongoDB supported by previous versions is 3.6.
Sure, I can test. I might need to figure out a backout plan in case I need to go back to MongoDB 3.6. I have all my current RITA data indexed into Splunk, so I typically don't need to go back to previous data.
I installed 4.3.0 and tested. I had some problems with the Zeek and MongoDB installs, but since I had already installed MongoDB and I don't run Bro or Zeek on this system, I was able to run the install with no Zeek and no Mongo and it worked. The beacons-fqdn analysis is really fast now! I will see what the runtime is in the morning for a full day's worth of logs. One difference between beacons-fqdn and beacons: beacons-fqdn doesn't have the total bytes like beacons does now. Also, I notice that I have almost 10 times as many rows in beacons-fqdn as I do in beacons. Is there some way to put all the rows for the same beacon activity into one row with a multi-value field, like the "Port:Protocol:Service" field in long-conns? Maybe you could even add the dest IP. Unless I am missing something about what beacons-fqdn is doing, I think it just helps show what the possible DNS request behind the traffic was, using a reverse lookup; for example, if it was login.microsoftonline.com you might be able to filter out the result. However, for one connection to Google I see all these FQDNs, all from one source IP, with the same values in every other field: Maybe give an option to include the dest IP and another option to condense the data with a multi-value field. Thank you,
So the Import command ran in a reasonable time, however the rita show-beacons-fqdn took almost 6 hours to run and is 5.1G in size:
Does this seem right? Thank you,
Credits to @lisaSW as well
We modified how we are performing queries when performing beacon FQDN analysis.
Previously, we would:
Now, we do the following:
This subtle difference resulted in over an 8x speedup compared with the initial implementation.