Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: structured logging #182

Merged
merged 5 commits into from
Aug 3, 2022
Merged

Feature: structured logging #182

merged 5 commits into from
Aug 3, 2022

Conversation

cap10morgan
Copy link
Contributor

@cap10morgan cap10morgan commented Aug 2, 2022

I recently added an optional JSON log format so that our logs would be more useful in places like AWS CloudWatch (which has powerful search and filtering options if you send it JSON).

However, in my initial attempt the JSON logs just looked like this:

{
  "timestamp": "2022-08-02T20:26:15.197Z",
  "level": "INFO",
  "thread": "Thread-9",
  "logger": "fluree.db.connection",
  "message": "connection closed",
  "context": "default"
}

or sometimes like this:

{
  "timestamp": "2022-08-02T20:26:09.939Z",
  "level": "INFO",
  "thread": "async-dispatch-6",
  "logger": "fluree.db.ledger.consensus.raft",
  "message": "Ledger group leader change: {:cause :become-leader, :server \"myserver\", :new-leader \"myserver\", :old-leader nil, :event :become-leader, :message \"This server, myserver, received the majority of the votes to become leader. New term: 21, latest index: 1305.\"}",
  "context": "default"
}

...with an opaque (to a JSON parser) blob of EDN inside the "message" value.

But then I had a 💡! If we're just sending JSON to CloudWatch, couldn't we send arbitrary data serialized as JSON too and then be able to search and filter based on our own domain concepts?

So I started searching around and I discovered a) yes, yes we can and b) this is apparently called "structured logging" (I was today years old when I learned...).

And thus here it is. This PR does a few things:

  1. It replaces my first naive attempt at JSON logging with a library called cambium that helps with structured logging in Clojure.
  2. It adds some examples of using cambium to log some data in the server startup code. (To really see the magic here, set the env var FDB_LOG_FORMAT=json). Note that its a good idea to wrap the data in a map w/ a namespaced key like in those examples b/c they get merged into the JSON object w/ all of the logging-related keys (e.g. "message", "timestamp", etc.).
  3. It adds a new var %mdc in our default text logging patterns that will contain any data we log out w/ the cambium logging fns. Note that by default these get formatted in a key=val pattern in the logs. If you want to retain the pr-str EDN format, you can just keep using the fluree.db.util.log fns instead (but then you don't get the JSON goodness in environments that can process it like CloudWatch!).
    1. We can revisit this later to see if that text output format is configurable and potentially make it use pr-str from cambium too.
    2. If you're curious, that %mdc thingy stands for Mapped Diagnostic Context and it is something built into logback for this purpose. That highfalutin name tho. Java devs, man...

So what good is this? Well I could imagine a few ways we could put it to use:

  1. Put a user-id in the structured log data to filter on that and trace an individual user's interactions across our systems.
  2. Same as above but network and ledger identifiers.
  3. Also transactions vs. queries. Query types (SPARQL, SQL, etc.)
  4. Bring the data portion of ex-info exceptions into the structured JSON output when errors occur and then search and filter on that info (cambium might already do this; not sure).
  5. Process logs with tools like jq.
  6. Everything is better as data. Code, logs, etc.

Oh right, how about an example JSON structured log message:

;; code in fluree.db.server/startup
(slog/info {:fluree/settings
            (cond-> (into (sorted-map) settings) ;; hide encryption secret from logs
              (:fdb-encryption-secret settings)
              (assoc :fdb-encryption-secret "prying eyes want to know..."))}
           "Starting with config:")

outputs this in JSON logging mode (prettified with jq):

{
  "timestamp": "2022-08-02T20:26:06.734Z",
  "level": "INFO",
  "thread": "main",
  "ns": "fluree.db.server",
  "line": 152,
  "column": 4,
  "fluree/settings": {
    "fdb-storage-file-directory": null,
    "fdb-join?": false,
    "fdb-storage-s3-group-prefix": "group/",
    "fdb-memory-reindex-max": "2mb",
    "fdb-group-private-key-file": "default-private-key.txt",
    "fdb-command": null,
    "fdb-storage-file-group-path": "group/",
    "fdb-group-config-path": "./",
    "fdb-ledger-private-keys": null,
    "fdb-group-catch-up-rounds": "10",
    "fdb-consensus-type": "raft",
    "fdb-storage-file-root": "./data/",
    "fdb-stats-report-frequency": "1m",
    "fdb-storage-file-ledger-path": "ledger/",
    "fdb-encryption-secret": null,
    "fdb-group-log-history": "5",
    "fdb-query-peer-servers": "localhost:8090",
    "fdb-group-private-key": null,
    "fdb-pw-auth-signing-key": null,
    "fdb-pw-auth-jwt-max-renewal": "1y",
    "fdb-group-timeout": "2000",
    "fdb-pw-auth-enable": "true",
    "fdb-api-open": "true",
    "fdb-group-snapshot-threshold": "200",
    "fdb-license-key": null,
    "fdb-memory-cache": "200mb",
    "fdb-group-port": null,
    "fdb-pw-auth-jwt-secret": null,
    "fdb-storage-s3-bucket": null,
    "fdb-storage-type": "file",
    "fdb-group-heartbeat": null,
    "fdb-mode": "dev",
    "fdb-api-port": 8090,
    "fdb-group-snapshot-path": "data/group/snapshots",
    "fdb-json-bigdec-string": "true",
    "fdb-pw-auth-secret": "fluree",
    "fdb-ledger-servers": null,
    "fdb-group-log-directory": "./data/group",
    "fdb-memory-reindex": "1mb",
    "fdb-pw-auth-jwt-max-exp": "1y",
    "fdb-group-this-server": "myserver",
    "fdb-group-servers": "myserver@localhost:9790",
    "fdb-storage-s3-ledger-prefix": "ledger/"
  },
  "logger": "fluree.db.server",
  "message": "Starting with config:",
  "context": "default"
}

@cap10morgan cap10morgan requested review from a team and jakep36 August 2, 2022 20:55
Copy link
Contributor

@dpetran dpetran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Structured logging is fantastic and I think this will be really useful.

@cap10morgan cap10morgan merged commit 60f2af3 into main Aug 3, 2022
@cap10morgan cap10morgan deleted the feature/structured-logs branch August 3, 2022 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants