Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Diff is extremely slow #335

Open
seehuhn opened this issue Aug 15, 2023 · 3 comments
Open

Diff is extremely slow #335

seehuhn opened this issue Aug 15, 2023 · 3 comments

Comments

@seehuhn
Copy link

seehuhn commented Aug 15, 2023

I tried to use cmp.Diff in a unit test, but for my data, the function is so slow that the unit tests time out. The data in question are two decoded versions of the "Go Regular" font (as described in the Go blog ). The comparison takes more than 16 seconds on a fast laptop.

Here is code to reproduce the problem:

package main

import (
	"fmt"
	"log"

	"github.com/google/go-cmp/cmp"

	"seehuhn.de/go/sfnt/glyph"

	"seehuhn.de/go/pdf"
	"seehuhn.de/go/pdf/font"
	"seehuhn.de/go/pdf/font/charcode"
	"seehuhn.de/go/pdf/font/gofont"
	"seehuhn.de/go/pdf/font/opentype"
)

func main() {
	otf, err := gofont.OpenType(gofont.GoRegular)
	if err != nil {
		log.Fatal(err)
	}

	encoding := make([]glyph.ID, 256)
	encoding[65] = otf.CMap.Lookup('A')
	encoding[66] = otf.CMap.Lookup('C')

	toUnicode := map[charcode.CharCode][]rune{
		65: {'A'},
		66: {'C'},
	}

	info1 := &opentype.EmbedInfoSimpleCFF{
		Font:      otf,
		SubsetTag: "UVWXYZ",
		Encoding:  encoding,
		ToUnicode: toUnicode,
	}

	rw := pdf.NewData(pdf.V1_7)
	ref := rw.Alloc()
	err = info1.Embed(rw, ref)
	if err != nil {
		log.Fatal(err)
	}

	dicts, err := font.ExtractDicts(rw, ref)
	if err != nil {
		log.Fatal(err)
	}
	info2, err := opentype.Extract(rw, dicts)
	if err != nil {
		log.Fatal(err)
	}

	d := cmp.Diff(info1, info2)
	fmt.Println(d)
}

The following go.mod file shows the exact versions I used:

module seehuhn.de/go/issues/001

go 1.21.0

require (
	github.com/google/go-cmp v0.5.9
	seehuhn.de/go/pdf v0.3.5-0.20230815144317-1dc53998ea9b
	seehuhn.de/go/sfnt v0.3.5-0.20230814123027-5af1767be82b
)

require (
	github.com/xdg-go/stringprep v1.0.4 // indirect
	golang.org/x/exp v0.0.0-20230713183714-613f0c0eb8a1 // indirect
	golang.org/x/image v0.7.0 // indirect
	golang.org/x/text v0.9.0 // indirect
	seehuhn.de/go/dag v0.0.0-20230612165854-b02059e84ec5 // indirect
	seehuhn.de/go/dijkstra v0.9.3 // indirect
	seehuhn.de/go/postscript v0.3.5-0.20230811095027-86e1857cf919 // indirect
)

Since the font data is of moderate size (a few hundred glyphs, each containing a few tens of control points), I would expect the comparison to be much faster than 16 seconds.

@dsnet
Copy link
Collaborator

dsnet commented Aug 15, 2023

Thank you for the bug report and reproduction. This is a good bug report.

It's known that there are a few edge-cases that cause cmp to have pathological behavior (e.g., excessive branching and re-checking of results in certain nested structures). I was able to reproduce a execution time of 25s on my Ryzen 5900x. I won't have time to dig into what's going on in the near future, but can hopefully can fix this.

@Carrotman42
Copy link

I've worked around slowness in cmp.Diff (with no cmpopts applied) by checking reflect.DeepEqual first; in the 5-ish times I've done this I've found it to always be correct and performant. Would it be reasonable to include that in the implementation of cmp.Diff? i.e. if reflect.DeepEqual returns true, have a fast-path return "", else do the full diffing that callers expect.

@dsnet
Copy link
Collaborator

dsnet commented Feb 1, 2024

The semantics of reflect.DeepEqual is not identical to cmp.Equal even without any options specified, so that approach won't work in the general case (it might work for your use case). What we can do though is to analyze the set of options passed in and figure out exactly what information needs to be tracked and avoid doing that if it's unnecessary. For example, tracking the entire cmp.Path is unnecessary if cmp.FilterPath is not used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants