IP scrubber #2990

Merged
merged 21 commits into from
Sep 5, 2023
Changes from 5 commits
5 changes: 5 additions & 0 deletions account/account.go
@@ -103,6 +103,11 @@ func GetAccount(ctx context.Context, cfg *config.Configuration, fetcher stored_r
return nil, errs
}

errs = account.IpMasking.Validate(errs)
if len(errs) > 0 {
return nil, errs
Contributor

Is there a reasonable way to add a test that throws this error? Or is it challenging with this Validate function?

Contributor Author

Yes, added!

}
Contributor

Nitpick: Could simplify the syntax...

if errs := account.Privacy.IPv6Config.Validate(nil); len(errs) > 0 {
	account.Privacy.IPv6Config.AnonKeepBits = iputil.IPv6BitSize
}

if errs := account.Privacy.IPv4Config.Validate(nil); len(errs) > 0 {
	account.Privacy.IPv4Config.AnonKeepBits = iputil.IPv4BitSize
}

Contributor

These lines show as having test coverage, but there doesn't seem to be an assertion on setting their default values. I made a typo locally where IPv4Config.AnonKeepBits = iputil.IPv6BitSize, which sets the ipv6 value for the ipv4 default, and this mistake wasn't caught by a test.

Contributor Author

Yes, this is tested in a general GetAccount test.
I added an extra flag to assert the IP config for account id "invalid_acct_ipv6_ipv4".
I also realized I had set both ipv6 and ipv4 to the max value instead of the default mask bits. I added 2 new constants:

IPv4DefaultMaskingBitSize = 24
IPv6DefaultMaskingBitSize = 56

For the simplified syntax, I added new error variables to make it explicit that we don't want to append IP config validation errors to errs.

Contributor Author

I also tested this update end to end, including a test with a publisher account ID.
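
The fallback discussed in this thread could be sketched roughly as follows. This is only an illustration, not the PR's code: the `ipConfig` and `privacy` types, the `valid` helper, and `applyIPDefaults` are hypothetical stand-ins, and the 32/128 bit sizes are assumed to match `iputil.IPv4BitSize` and `iputil.IPv6BitSize`. Only the two default constants (24 and 56) come from the comment above.

```go
package main

import "fmt"

const (
	// Assumed to mirror iputil.IPv4BitSize and iputil.IPv6BitSize.
	IPv4BitSize = 32
	IPv6BitSize = 128

	// Defaults quoted in the review thread.
	IPv4DefaultMaskingBitSize = 24
	IPv6DefaultMaskingBitSize = 56
)

// ipConfig is a hypothetical stand-in for the per-version account config.
type ipConfig struct {
	AnonKeepBits int
}

// valid reports whether the configured bit count fits the address width.
func (c ipConfig) valid(maxBits int) bool {
	return c.AnonKeepBits >= 0 && c.AnonKeepBits <= maxBits
}

// privacy is a hypothetical stand-in for the account privacy section.
type privacy struct {
	IPv4Config ipConfig
	IPv6Config ipConfig
}

// applyIPDefaults replaces invalid values with the per-version default
// instead of failing the whole account.
func applyIPDefaults(p *privacy) {
	if !p.IPv4Config.valid(IPv4BitSize) {
		p.IPv4Config.AnonKeepBits = IPv4DefaultMaskingBitSize
	}
	if !p.IPv6Config.valid(IPv6BitSize) {
		p.IPv6Config.AnonKeepBits = IPv6DefaultMaskingBitSize
	}
}

func main() {
	p := privacy{
		IPv4Config: ipConfig{AnonKeepBits: 40},  // invalid for IPv4, valid for IPv6
		IPv6Config: ipConfig{AnonKeepBits: 200}, // invalid for IPv6
	}
	applyIPDefaults(&p)
	fmt.Println(p.IPv4Config.AnonKeepBits, p.IPv6Config.AnonKeepBits) // prints "24 56"
}
```

Asserting the IPv4 and IPv6 results separately, with an invalid IPv4 input that would still be valid for IPv6, is the kind of check that would catch the swapped-constant typo described earlier in the thread.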


// set the value of events.enabled field based on deprecated events_enabled field and ensure backward compatibility
deprecateEventsEnabledField(account)

39 changes: 39 additions & 0 deletions config/account.go
@@ -21,6 +21,11 @@ const (
ChannelWeb ChannelType = "web"
)

const (
Ipv4Bits = 32
Ipv6Bits = 128
)

// Account represents a publisher account configuration
type Account struct {
ID string `mapstructure:"id" json:"id"`
@@ -41,6 +46,7 @@ type Account struct {
DefaultBidLimit int `mapstructure:"default_bid_limit" json:"default_bid_limit"`
BidAdjustments *openrtb_ext.ExtRequestPrebidBidAdjustments `mapstructure:"bidadjustments" json:"bidadjustments"`
Privacy *AccountPrivacy `mapstructure:"privacy" json:"privacy"`
IpMasking IpMasking `mapstructure:"ip_masking" json:"ip_masking"`
}

// CookieSync represents the account-level defaults for the cookie sync endpoint.
@@ -297,3 +303,36 @@ func (a *AccountChannel) IsSet() bool {
type AccountPrivacy struct {
AllowActivities AllowActivities `mapstructure:"allowactivities" json:"allowactivities"`
}

type IpMasking struct {
IpV6 IpMasks `mapstructure:"ipv6" json:"ipv6"`
IpV4 IpMasks `mapstructure:"ipv4" json:"ipv4"`
}

type IpMasks struct {
ActivityLeftMaskBits int `mapstructure:"activity_left_mask_bits" json:"activity_left_mask_bits"`
GdprLeftMaskBitsLowest int `mapstructure:"gdpr_left_mask_bits_lowest" json:"gdpr_left_mask_bits_lowest"`
GdprLeftMaskBitsHighest int `mapstructure:"gdpr_left_mask_bits_highest" json:"gdpr_left_mask_bits_highest"`
}
Contributor

Nitpick: Let's use underscores to be consistent with the rest of this object: anon-keep-bits -> anon_keep_bits


func (ipMasking *IpMasking) Validate(errs []error) []error {
errs = ipMasking.IpV6.validate(errs, Ipv6Bits, "ipv6")
Contributor

You need to handle the case of ipMasking being nil.

Contributor Author

Technically ipMasking is never nil here. Should I modify it to a non-pointer?

Contributor

Yes, please. Let's add some defensive coding since nil is allowed here. Alternatively, you may be able to change this to a value type receiver.

errs = ipMasking.IpV4.validate(errs, Ipv4Bits, "ipv4")
return errs
}
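
The two options raised in the thread above (a nil guard on the pointer receiver, or a value receiver) could look roughly like this. It is a sketch under assumptions: `maskCfg`, its fields, and both method names are hypothetical stand-ins for `IpMasking` and `Validate`, and the range checks are simplified.

```go
package main

import (
	"errors"
	"fmt"
)

// maskCfg is a hypothetical stand-in for config.IpMasking.
type maskCfg struct {
	V4Bits int
	V6Bits int
}

// ValidatePtr keeps the pointer receiver but tolerates a nil receiver.
func (m *maskCfg) ValidatePtr(errs []error) []error {
	if m == nil {
		return errs // nothing configured, nothing to validate
	}
	return m.ValidateVal(errs)
}

// ValidateVal uses a value receiver, so the method body never sees nil.
func (m maskCfg) ValidateVal(errs []error) []error {
	if m.V4Bits < 0 || m.V4Bits > 32 {
		errs = append(errs, errors.New("ipv4 bits out of range"))
	}
	if m.V6Bits < 0 || m.V6Bits > 128 {
		errs = append(errs, errors.New("ipv6 bits out of range"))
	}
	return errs
}

func main() {
	var missing *maskCfg
	fmt.Println(len(missing.ValidatePtr(nil))) // 0: the nil guard returns early

	bad := maskCfg{V4Bits: 40, V6Bits: 56}
	fmt.Println(len(bad.ValidateVal(nil))) // 1: only the ipv4 value is out of range
}
```

With a value receiver the nil case moves to the caller: invoking a value-receiver method through a nil pointer panics at the call site, so this variant only fits when the field itself is stored as a value.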

func (ipMasks *IpMasks) validate(errs []error, maxBits int, ipVersion string) []error {
if ipMasks.ActivityLeftMaskBits > maxBits || ipMasks.ActivityLeftMaskBits < 0 {
err := fmt.Errorf("activity left mask bits cannot exceed %d in %s address, or be less than 0", maxBits, ipVersion)
errs = append(errs, err)
}
if ipMasks.GdprLeftMaskBitsLowest > maxBits || ipMasks.GdprLeftMaskBitsLowest < 0 {
err := fmt.Errorf("gdpr left mask bits lowest cannot exceed %d in %s address, or be less than 0", maxBits, ipVersion)
errs = append(errs, err)
}
if ipMasks.GdprLeftMaskBitsHighest > maxBits || ipMasks.GdprLeftMaskBitsHighest < 0 {
err := fmt.Errorf("gdpr left mask bits highest cannot exceed %d in %s address, or be less than 0", maxBits, ipVersion)
errs = append(errs, err)
}
return errs
}
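
The three range checks above follow the same pattern, so a table of cases is a natural way to exercise them. The sketch below is illustrative only: `leftMasks` and `checkRange` are hypothetical names, and the error wording is not the PR's. It re-implements the bounds logic locally so that it runs on its own.

```go
package main

import "fmt"

// leftMasks is a hypothetical stand-in mirroring the fields of IpMasks.
type leftMasks struct {
	ActivityLeftMaskBits    int
	GdprLeftMaskBitsLowest  int
	GdprLeftMaskBitsHighest int
}

// checkRange appends an error when value falls outside [0, maxBits].
func checkRange(errs []error, field string, value, maxBits int, ipVersion string) []error {
	if value < 0 || value > maxBits {
		errs = append(errs, fmt.Errorf("%s must be between 0 and %d for %s, got %d", field, maxBits, ipVersion, value))
	}
	return errs
}

// validate runs the same bounds check on every field.
func (m leftMasks) validate(errs []error, maxBits int, ipVersion string) []error {
	errs = checkRange(errs, "activity_left_mask_bits", m.ActivityLeftMaskBits, maxBits, ipVersion)
	errs = checkRange(errs, "gdpr_left_mask_bits_lowest", m.GdprLeftMaskBitsLowest, maxBits, ipVersion)
	errs = checkRange(errs, "gdpr_left_mask_bits_highest", m.GdprLeftMaskBitsHighest, maxBits, ipVersion)
	return errs
}

func main() {
	cases := []struct {
		name    string
		masks   leftMasks
		maxBits int
		want    int
	}{
		{"ipv4 in range", leftMasks{24, 24, 24}, 32, 0},
		{"ipv4 one field too large", leftMasks{33, 24, 24}, 32, 1},
		{"ipv6 negative and too large", leftMasks{56, -1, 129}, 128, 2},
	}
	for _, c := range cases {
		got := len(c.masks.validate(nil, c.maxBits, "ip"))
		fmt.Printf("%s: got %d errors, want %d\n", c.name, got, c.want)
	}
}
```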
7 changes: 7 additions & 0 deletions config/config.go
@@ -161,6 +161,8 @@ func (cfg *Configuration) validate(v *viper.Viper) []error {

errs = cfg.Experiment.validate(errs)
errs = cfg.BidderInfos.validate(errs)
errs = cfg.AccountDefaults.IpMasking.Validate(errs)

return errs
}

@@ -1062,6 +1064,11 @@ func SetupViper(v *viper.Viper, filename string, bidderInfos BidderInfos) {
v.SetDefault("request_validation.ipv4_private_networks", []string{"10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "169.254.0.0/16", "127.0.0.0/8"})
v.SetDefault("request_validation.ipv6_private_networks", []string{"::1/128", "fc00::/7", "fe80::/10", "ff00::/8", "2001:db8::/32"})

v.SetDefault("ip_masking.ipv6.activity_left_mask_bits", 56)
v.SetDefault("ip_masking.ipv6.gdpr_left_mask_bits_lowest", 112)
v.SetDefault("ip_masking.ipv6.gdpr_left_mask_bits_highest", 96)
v.SetDefault("ip_masking.ipv4.left_mask_bits", 24)

bindDatabaseEnvVars(v)

// Set environment variable support:
2 changes: 1 addition & 1 deletion exchange/utils.go
@@ -212,7 +212,7 @@ func (rs *requestSplitter) cleanOpenRTBRequests(ctx context.Context,
applyFPD(auctionReq.FirstPartyData[bidderRequest.BidderName], bidderRequest.BidRequest)
}

privacyEnforcement.Apply(bidderRequest.BidRequest)
privacyEnforcement.Apply(bidderRequest.BidRequest, &auctionReq.Account.IpMasking)
allowedBidderRequests = append(allowedBidderRequests, bidderRequest)

// GPP downgrade: always downgrade unless we can confirm GPP is supported
9 changes: 6 additions & 3 deletions privacy/enforcement.go
@@ -1,6 +1,9 @@
package privacy

import "github.com/prebid/openrtb/v19/openrtb2"
import (
"github.com/prebid/openrtb/v19/openrtb2"
"github.com/prebid/prebid-server/config"
)

// Enforcement represents the privacy policies to enforce for an OpenRTB bid request.
type Enforcement struct {
@@ -27,8 +30,8 @@ func (e Enforcement) AnyActivities() bool {
}

// Apply cleans personally identifiable information from an OpenRTB bid request.
func (e Enforcement) Apply(bidRequest *openrtb2.BidRequest) {
e.apply(bidRequest, NewScrubber())
func (e Enforcement) Apply(bidRequest *openrtb2.BidRequest, ipMasking *config.IpMasking) {
e.apply(bidRequest, NewScrubber(ipMasking))
}

func (e Enforcement) apply(bidRequest *openrtb2.BidRequest, scrubber Scrubber) {
68 changes: 21 additions & 47 deletions privacy/scrubber.go
@@ -2,8 +2,9 @@ package privacy

import (
"encoding/json"
"github.com/prebid/prebid-server/config"
"github.com/prebid/prebid-server/util/ptrutil"
"strings"
"net"

"github.com/prebid/openrtb/v19/openrtb2"
)
@@ -76,14 +77,18 @@ type Scrubber interface {
ScrubUser(user *openrtb2.User, strategy ScrubStrategyUser, geo ScrubStrategyGeo) *openrtb2.User
}

type scrubber struct{}
type scrubber struct {
ipMasking *config.IpMasking
Contributor

Does this need to be passed as a pointer?

Contributor Author

@VeronikaSolovei9 Aug 1, 2023

My main idea here was to load these values to a non-pointer config struct, because we always need them. And then pass it as a pointer downstream to avoid copying the struct.
But I can see now why this was not a good idea. I replaced it with a non-pointer.

Contributor

And then pass it as a pointer downstream to avoid copying the struct.

That is a surefire premature optimization. Go has optimizations for passing arguments as values, which may be much more efficient than using a pointer. Passing a pointer pays off more reliably for large data structures like the BidRequest, but this struct is relatively small.

}

// NewScrubber returns an OpenRTB scrubber.
func NewScrubber() Scrubber {
return scrubber{}
func NewScrubber(ipMasking *config.IpMasking) Scrubber {
return &scrubber{
ipMasking: ipMasking,
}
}
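
The value-typed alternative settled on in the thread above might look like the sketch below. All names here (`ipMasking`, `scrubber`, `newScrubber`, and the two fields) are hypothetical simplifications of the types in the diff. It does not measure performance; it only shows the behavioural side of the choice: a scrubber built from a value holds its own copy of the config.

```go
package main

import "fmt"

// ipMasking is a hypothetical, simplified stand-in for config.IpMasking.
type ipMasking struct {
	V4KeepBits int
	V6KeepBits int
}

// scrubber stores the config by value rather than by pointer.
type scrubber struct {
	ipMasking ipMasking
}

// newScrubber copies the small config struct into the scrubber.
func newScrubber(m ipMasking) scrubber {
	return scrubber{ipMasking: m}
}

func main() {
	accountCfg := ipMasking{V4KeepBits: 24, V6KeepBits: 56}
	s := newScrubber(accountCfg)

	// Later changes to the caller's config do not reach the scrubber,
	// and the scrubber has no nil pointer to guard against.
	accountCfg.V4KeepBits = 32
	fmt.Println(s.ipMasking.V4KeepBits) // prints "24"
}
```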

func (scrubber) ScrubRequest(bidRequest *openrtb2.BidRequest, enforcement Enforcement) *openrtb2.BidRequest {
func (s *scrubber) ScrubRequest(bidRequest *openrtb2.BidRequest, enforcement Enforcement) *openrtb2.BidRequest {
var userExtParsed map[string]json.RawMessage
userExtModified := false

@@ -166,8 +171,8 @@ func (scrubber) ScrubRequest(bidRequest *openrtb2.BidRequest, enforcement Enforc
if deviceCopy.Geo != nil {
deviceCopy.Geo = scrubGeoPrecision(deviceCopy.Geo)
}
deviceCopy.IP = scrubIPV4Lowest8(deviceCopy.IP)
deviceCopy.IPv6 = scrubIPV6Lowest32Bits(deviceCopy.IPv6)
deviceCopy.IP = scrubIp(deviceCopy.IP, s.ipMasking.IpV4.GdprLeftMaskBitsLowest, config.Ipv4Bits)
deviceCopy.IPv6 = scrubIp(deviceCopy.IPv6, s.ipMasking.IpV6.ActivityLeftMaskBits, config.Ipv6Bits)
}
}

@@ -176,7 +181,7 @@ func (scrubber) ScrubRequest(bidRequest *openrtb2.BidRequest, enforcement Enforc
return bidRequest
}

func (scrubber) ScrubDevice(device *openrtb2.Device, id ScrubStrategyDeviceID, ipv4 ScrubStrategyIPV4, ipv6 ScrubStrategyIPV6, geo ScrubStrategyGeo) *openrtb2.Device {
func (s *scrubber) ScrubDevice(device *openrtb2.Device, id ScrubStrategyDeviceID, ipv4 ScrubStrategyIPV4, ipv6 ScrubStrategyIPV6, geo ScrubStrategyGeo) *openrtb2.Device {
if device == nil {
return nil
}
@@ -196,14 +201,14 @@ func (scrubber) ScrubDevice(device *openrtb2.Device, id ScrubStrategyDeviceID, i

switch ipv4 {
case ScrubStrategyIPV4Lowest8:
Contributor

Why do we still need ScrubStrategyIPV4Lowest8?

Contributor Author

@VeronikaSolovei9 Aug 28, 2023

We need this to indicate we want to scrub IPv4. There are just 2 options:

const (
	// ScrubStrategyIPV4None does not remove any part of an IPV4 address.
	ScrubStrategyIPV4None ScrubStrategyIPV4 = iota

	// ScrubStrategyIPV4Lowest8 zeroes out the last 8 bits of an IPV4 address.
	ScrubStrategyIPV4Lowest8
)

I think we should rename it to something like ScrubStrategyIPV4Default or better.

Contributor

Ah ok. Yes, renaming would be good. Perhaps ScrubStrategyIPV4Subnet to match IPv6 naming?

Contributor Author

Right!

deviceCopy.IP = scrubIPV4Lowest8(device.IP)
deviceCopy.IP = scrubIp(device.IP, s.ipMasking.IpV4.GdprLeftMaskBitsLowest, config.Ipv4Bits)
}

switch ipv6 {
case ScrubStrategyIPV6Lowest16:
deviceCopy.IPv6 = scrubIPV6Lowest16Bits(device.IPv6)
deviceCopy.IPv6 = scrubIp(device.IPv6, s.ipMasking.IpV6.GdprLeftMaskBitsLowest, config.Ipv6Bits)
case ScrubStrategyIPV6Lowest32:
deviceCopy.IPv6 = scrubIPV6Lowest32Bits(device.IPv6)
deviceCopy.IPv6 = scrubIp(device.IPv6, s.ipMasking.IpV6.GdprLeftMaskBitsHighest, config.Ipv6Bits)
}
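
After this change the strategy constants no longer encode a fixed bit count; they select which configured value to use. A rough sketch of that mapping for IPv6 follows. The lowercase type, constants, and `keepBits` helper are hypothetical stand-ins, and the 112 and 96 values are the `SetupViper` defaults shown in this diff (128 minus 16 and 128 minus 32, matching the old "lowest 16" and "lowest 32" behaviour).

```go
package main

import "fmt"

// scrubStrategyIPV6 is a hypothetical stand-in for ScrubStrategyIPV6.
type scrubStrategyIPV6 int

const (
	scrubStrategyIPV6None scrubStrategyIPV6 = iota
	scrubStrategyIPV6Lowest16
	scrubStrategyIPV6Lowest32
)

// ipv6Masks holds the two configured GDPR bit counts for IPv6.
type ipv6Masks struct {
	GdprLeftMaskBitsLowest  int
	GdprLeftMaskBitsHighest int
}

// keepBits returns the number of leading bits to keep for a strategy,
// and false when the strategy asks for no scrubbing.
func keepBits(strategy scrubStrategyIPV6, masks ipv6Masks) (int, bool) {
	switch strategy {
	case scrubStrategyIPV6Lowest16:
		return masks.GdprLeftMaskBitsLowest, true
	case scrubStrategyIPV6Lowest32:
		return masks.GdprLeftMaskBitsHighest, true
	default:
		return 0, false
	}
}

func main() {
	defaults := ipv6Masks{GdprLeftMaskBitsLowest: 112, GdprLeftMaskBitsHighest: 96}

	bits, ok := keepBits(scrubStrategyIPV6Lowest16, defaults)
	fmt.Println(bits, ok) // prints "112 true"

	bits, ok = keepBits(scrubStrategyIPV6None, defaults)
	fmt.Println(bits, ok) // prints "0 false"
}
```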

switch geo {
@@ -241,44 +246,13 @@ func (scrubber) ScrubUser(user *openrtb2.User, strategy ScrubStrategyUser, geo S
return &userCopy
}

func scrubIPV4Lowest8(ip string) string {
i := strings.LastIndex(ip, ".")
if i == -1 {
func scrubIp(ip string, ones, bits int) string {
if ip == "" {
return ""
}

return ip[0:i] + ".0"
}

func scrubIPV6Lowest16Bits(ip string) string {
ip = removeLowestIPV6Segment(ip)

if ip != "" {
ip += ":0"
}

return ip
}

func scrubIPV6Lowest32Bits(ip string) string {
ip = removeLowestIPV6Segment(ip)
ip = removeLowestIPV6Segment(ip)

if ip != "" {
ip += ":0:0"
}

return ip
}

func removeLowestIPV6Segment(ip string) string {
i := strings.LastIndex(ip, ":")

if i == -1 {
return ""
}

return ip[0:i]
ipv6Mask := net.CIDRMask(ones, bits)
ipMasked := net.ParseIP(ip).Mask(ipv6Mask)
return ipMasked.String()
}
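
The CIDR-mask approach used by `scrubIp` can be tried in isolation with the standard library. The sketch below is not the PR's function: `maskIP` is a hypothetical name, and the extra nil checks are an assumption about how one might handle bad input. They are there because `net.ParseIP` returns nil for unparsable input, `IP.Mask` returns nil when the mask and address lengths are incompatible, and `String()` on a nil `net.IP` yields the literal text `<nil>`.

```go
package main

import (
	"fmt"
	"net"
)

// maskIP keeps the leftmost `ones` bits of ip and zeroes the rest.
// bits is the address width: 32 for IPv4, 128 for IPv6.
// It returns "" when ip cannot be parsed or does not fit the mask.
func maskIP(ip string, ones, bits int) string {
	parsed := net.ParseIP(ip)
	if parsed == nil {
		return ""
	}
	mask := net.CIDRMask(ones, bits)
	if mask == nil {
		// CIDRMask returns nil for an out-of-range ones or an unsupported width.
		return ""
	}
	masked := parsed.Mask(mask)
	if masked == nil {
		// Mask returns nil when the address and mask lengths do not match.
		return ""
	}
	return masked.String()
}

func main() {
	fmt.Println(maskIP("192.168.1.77", 24, 32))                            // 192.168.1.0
	fmt.Println(maskIP("2001:db8:85a3:1234:5678:9abc:def0:1234", 56, 128)) // 2001:db8:85a3:1200::
	fmt.Println(maskIP("not-an-ip", 24, 32) == "")                         // true
}
```

Unlike the removed string-splitting helpers, masking on parsed bytes works for any bit count, not only whole octets or 16-bit groups, which is what makes the configurable defaults (24 for IPv4, 56 for IPv6) possible.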

func scrubGeoFull(geo *openrtb2.Geo) *openrtb2.Geo {