Embed sdk versions #68

Merged 5 commits on May 30, 2024
1 change: 1 addition & 0 deletions .pulumi.version
@@ -0,0 +1 @@
3.116.1
Contributor

@danielrbradley I don't think this is necessary if you bump the pu/pu version in go.mod. I'm using this provider as a bit of a testing ground to tighten up our dependency management, and getting rid of magic files like this is part of that goal.

Member Author

It was pulling the wrong dependency for me locally - was running a very old version of the language plugins. This is how we do it elsewhere, so standardisation seems like a good win to be able to understand different providers.

Member Author

If you specifically want to pin codegen and pkg to the same version, you're probably better off copying Ian's more complete approach, which uses the proper installer:

$(PULUMI): HOME := $(WORKING_DIR)
$(PULUMI): provider/go.mod
	@ PULUMI_VERSION="$$(cd provider && go list -m github.com/pulumi/pulumi/pkg/v3 | awk '{print $$2}')"; \
	if [ -x $(PULUMI) ]; then \
		CURRENT_VERSION="$$($(PULUMI) version)"; \
		if [ "$${CURRENT_VERSION}" != "$${PULUMI_VERSION}" ]; then \
			echo "Upgrading $(PULUMI) from $${CURRENT_VERSION} to $${PULUMI_VERSION}"; \
			rm $(PULUMI); \
		fi; \
	fi; \
	if ! [ -x $(PULUMI) ]; then \
		curl -fsSL https://get.pulumi.com | sh -s -- --version "$${PULUMI_VERSION#v}"; \
	fi

The downside of this is that it's a little harder to debug if it breaks compared to:

.pulumi/bin/pulumi: PULUMI_VERSION := $(shell cat .pulumi.version)
.pulumi/bin/pulumi: HOME := $(WORKING_DIR)
.pulumi/bin/pulumi: .pulumi.version
	curl -fsSL https://get.pulumi.com | sh -s -- --version "$(PULUMI_VERSION)"

Additionally, we have needed in the past to use different versions of codegen and the pkg reference, which was one of the driving forces for introducing pulumi package gen-sdk. If we're unable to upgrade pkg for any reason, we might still need to take fixes to codegen.
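
The decision logic in the first Makefile rule above (the one that reads the version from provider/go.mod) can be sketched in Python to make it easier to follow. This is only an illustration under stated assumptions: the function names are hypothetical, the sample `go list -m` output is made up from the 3.116.1 pin in this PR, and it assumes `pulumi version` prints the version with a leading `v`, as the shell comparison in the rule implies.

```python
from typing import Optional


def parse_go_list_version(go_list_output: str) -> str:
    """Return the version field from `go list -m <module>` output.

    Mirrors `awk '{print $2}'`: the output is "<module path> <version>".
    """
    fields = go_list_output.split()
    if len(fields) < 2:
        raise ValueError(f"unexpected go list output: {go_list_output!r}")
    return fields[1]


def needs_reinstall(current: Optional[str], wanted: str) -> bool:
    """True when no CLI is installed yet or its version differs from the pin."""
    return current is None or current.strip() != wanted


def installer_version(wanted: str) -> str:
    """Strip one leading "v", like the shell expansion ${PULUMI_VERSION#v}."""
    return wanted[1:] if wanted.startswith("v") else wanted


# Hypothetical sample output for the module pinned in provider/go.mod.
wanted = parse_go_list_version("github.com/pulumi/pulumi/pkg/v3 v3.116.1")
print(wanted, needs_reinstall("v3.100.0", wanted), installer_version(wanted))
```

The second, simpler rule skips the comparison entirely and relies on make's timestamp check against .pulumi.version, which is why it is easier to debug.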

Contributor

> The downside of this is that it's a little harder to debug if it breaks compared to:

The downside I see with these solutions is that it's not straightforward to test against pre-release versions. Everything needs to already be published to get.pulumi.com, so working with local or remote branches means awkward environment variable overrides. I 100% agree that standardization should be the goal, but I also think in this case we've converged on some unnecessary complexity.

> It was pulling the wrong dependency for me locally - was running a very old version of the language plugins. This is how we do it elsewhere, so standardisation seems like a good win to be able to understand different providers.

I remember now that when I was initially putting this together I ran into an issue with the python language plugin requiring an exec script. I punted on fixing that, hence why this was using your ambient plugins.

We can hack around that issue by vendoring the script for now. Here's what that looks like.

> If we're unable to upgrade pkg for any reason, we might still need to take fixes to codegen.

I might not fully understand what you mean here, but pkg and sdk (as well as the language plugins) are still versioned separately.

Member Author

> The downside I see with these solutions is that it's not straightforward to test against pre-release versions.

Working with testing local versions of codegen is a fairly unusual case on the providers team generally, though we should have this as part of the playbook. I'd suggest an alternative approach for locally testing specific codegen changes would be to run the dev build directly and not via make at all:

~/.pulumi-dev/bin/pulumi package gen-sdk bin/pulumi-resource-xyz --language python --version 2.0.0-dev

> We can hack around that issue by vendoring the script for now.

From your link I see you're installing each plugin manually? e.g.

GOBIN=${WORKING_DIR}/bin go install github.com/pulumi/pulumi/sdk/nodejs/cmd/pulumi-language-nodejs/v3

This seems like it should work for the time being if you'd prefer that option, though I think it would be worth discussing deviating from using the official installer with the rest of the team as this setup looks quite unusual for any maintainers coming from other providers which we generally try to avoid.

Contributor

> Working with testing local versions of codegen is a fairly unusual case on the providers team generally,

I must be unlucky then :)

> though we should have this as part of the playbook. I'd suggest an alternative approach for locally testing specific codegen changes would be to run the dev build directly and not via make at all:
>
> ~/.pulumi-dev/bin/pulumi package gen-sdk bin/pulumi-resource-xyz --language python --version 2.0.0-dev
>
> From your link I see you're installing each plugin manually? e.g.
>
> GOBIN=${WORKING_DIR}/bin go install github.com/pulumi/pulumi/sdk/nodejs/cmd/pulumi-language-nodejs/v3
>
> This seems like it should work for the time being if you'd prefer that option, though I think it would be worth discussing deviating from using the official installer with the rest of the team as this setup looks quite unusual for any maintainers coming from other providers which we generally try to avoid.

I strongly agree with you re: deviation, but what's more unusual to me is that we're creating work for ourselves by inventing a new way to manage Go dependencies when the native toolchain should suffice.

  • Need to use a different version locally? Use a replace directive.
  • Need to upgrade your local version? Use go get.
  • Need to run a CLI pinned to a particular version? Use go run.
  • Need to periodically bump a version? Use a tool like Dependabot which understands go.mod (but not .pulumi.version files).

A playbook would help but still has a discoverability problem. If I'm brand new to the team, why should I expect there to be two ways to manage dependencies?

Member Author

The purpose of downloading the CLI for codegen was to reduce our dependency on internal parts of the pulumi/pulumi codebase as these have caused issues in the past relating to breaking changes. Instead, the providers team consume the Pulumi CLI in the same way as any external users which makes compatibility easier to reason about.

Installing the Pulumi CLI & language plugins via the go toolchain is not a supported installation method and is likely to break. E.g. languages being moved to their own repos.

There are certainly niceties to treating it that way, as you highlight, but it also requires the provider to be coupled to the internal implementation details of the CLI's build process rather than the public interface (the installer) which we advertise to our users.

Additionally, almost all providers don't have a root go.mod file - this would have to be integrated into the provider module.

Member

It sounds like there's a decision to be made across providers about the best way to manage the pulumi version, but let's not let that block us on getting the go version support rolled out.

IIUC, setting RespectSchemaVersion has no strict dependency on avoiding ambient plugins? If so, I think we should just do that to move this forward.

I do agree with Daniel though that if possible, I'd rather have consistency in how we pin and download pulumi CLI across providers than an ideal approach applied to just 1 or 2 providers. That ultimately makes it easier to improve all providers later.

I think you two are on similar timezones this week, maybe you can get 30min to sync tomorrow and hash out a pragmatic approach?

Member Author

Yup, keen to get this in so have adopted the change from your branch @blampe if you're happy with that.

Ran a test locally and those changes fixed it for me, with one extra change for make build, which was failing.

I also timed the pulumi CLI installation process:

  • Using go install: make bin/pulumi 25.44s user 13.49s system 88% cpu 44.224 total
  • Using curl: make .pulumi/bin/pulumi 2.78s user 1.43s system 36% cpu 11.469 total
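
For reference, the two measurements can be compared with a few lines of Python. This is a rough sketch: `total_seconds` is a hypothetical helper, and it assumes the lines follow the summary format printed by zsh's `time` builtin, where the wall-clock value sits immediately before the word "total".

```python
def total_seconds(time_line: str) -> float:
    """Extract wall-clock seconds from a zsh-style `time` summary line.

    The wall-clock value is the number immediately before the word "total".
    """
    fields = time_line.split()
    return float(fields[fields.index("total") - 1])


go_install = total_seconds("25.44s user 13.49s system 88% cpu 44.224 total")
curl_install = total_seconds("2.78s user 1.43s system 36% cpu 11.469 total")

speedup = go_install / curl_install
saved = go_install - curl_install
print(f"curl is {speedup:.2f}x faster, saving {saved:.1f}s of wall-clock time")
```

These are single runs on one machine, so the ratio is indicative only.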

Let's revisit standardisation later.

Contributor

> The purpose of downloading the CLI for codegen was to reduce our dependency on internal parts of the pulumi/pulumi codebase as these have caused issues in the past relating to breaking changes. Instead, the providers team consume the Pulumi CLI in the same way as any external users which makes compatibility easier to reason about.

Totally makes sense. Invoking the CLI this way doesn't make us any more dependent on code internals but does make us sensitive to knowing where the code lives, as you pointed out. The difference between this binary and the published one essentially boils down to the ldflag for Version, which can be somewhat alleviated with ReadBuildInfo.

golangci-lint is an example of a similar hermeticity problem. The linter and the code need to be updated together in cases when new breaking rules are added, but the linter's version is currently driven by ci-mgmt -- so workflow updates become blocked until someone takes the time to fix lint errors. It would be nice to instead let the repo decide the lint version it's compatible with, so its linter can be updated independently. One way to do that would be with a similar .golangci-lint.version file, but by leveraging go.mod instead we would have an approach that works for all tooling like this.

> Additionally, almost all providers don't have a root go.mod file - this would have to be integrated into the provider module.

They should have a root go.mod, but that's a conversation for later :)

> I also timed the pulumi CLI installation process:
>
>   • Using go install: make bin/pulumi 25.44s user 13.49s system 88% cpu 44.224 total
>   • Using curl: make .pulumi/bin/pulumi 2.78s user 1.43s system 36% cpu 11.469 total

Yeah, this is expected since it's compiled from source. The tools.go trick used here will be officially supported in go ~1.24 as go get -tool and should have nicer caching semantics.

Anyway, thanks for indulging me again. As I mentioned, I'd like to spend some more time fleshing out a prototype to hopefully make the benefits more obvious.

23 changes: 12 additions & 11 deletions Makefile
@@ -26,6 +26,8 @@ PROVIDER_VERSION ?= 1.0.0-alpha.0+dev
# Use this normalised version everywhere rather than the raw input to ensure consistency.
VERSION_GENERIC = $(shell pulumictl convert-version --language generic --version "$(PROVIDER_VERSION)")

export PULUMI_IGNORE_AMBIENT_PLUGINS = true

.PHONY: ensure
ensure:: tidy lint test_provider examples

@@ -83,6 +85,9 @@ examples/java: ${PULUMI} bin/${PROVIDER} ${WORKING_DIR}/examples/yaml/Pulumi.yam
${PULUMI}: go.sum
GOBIN=${WORKING_DIR}/bin go install github.com/pulumi/pulumi/pkg/v3/cmd/pulumi
GOBIN=${WORKING_DIR}/bin go install github.com/pulumi/pulumi/sdk/go/pulumi-language-go/v3
GOBIN=${WORKING_DIR}/bin go install github.com/pulumi/pulumi/sdk/nodejs/cmd/pulumi-language-nodejs/v3
GOBIN=${WORKING_DIR}/bin go install github.com/pulumi/pulumi/sdk/python/cmd/pulumi-language-python/v3
GOBIN=${WORKING_DIR}/bin go install github.com/pulumi/pulumi-java/pkg/cmd/pulumi-language-java

${GOGLANGCILINT}: go.sum
GOBIN=${WORKING_DIR}/bin go install github.com/golangci/golangci-lint/cmd/golangci-lint
@@ -125,7 +130,7 @@ devcontainer::
cp -f .devcontainer/devcontainer.json .devcontainer.json

.PHONY: build
build:: provider dotnet_sdk go_sdk nodejs_sdk python_sdk
build:: provider sdk/dotnet sdk/go sdk/nodejs sdk/python sdk/java

# Required for the codegen action that runs in pulumi/pulumi
only_build:: build
@@ -192,23 +197,22 @@ go.sum: go.mod
sdk: $(shell mkdir -p sdk)
sdk: sdk/python sdk/nodejs sdk/java sdk/python sdk/go sdk/dotnet

sdk/python: PYPI_VERSION := $(shell pulumictl convert-version --language python -v "$(VERSION_GENERIC)")
# Folders can't be used for up-to-date checks as they will be marked as up-to-date even if the step fails - leading to a broken state.
.PHONY: sdk/*

sdk/python: TMPDIR := $(shell mktemp -d)
sdk/python: $(PULUMI) bin/${PROVIDER}
rm -rf sdk/python
$(PULUMI) package gen-sdk bin/$(PROVIDER) --language python -o ${TMPDIR}
cp README.md ${TMPDIR}/python/
cd ${TMPDIR}/python/ && \
rm -rf ./bin/ ../python.bin/ && cp -R . ../python.bin && mv ../python.bin ./bin && \
sed -i.bak -e 's/^ version = .*/ version = "$(PYPI_VERSION)"/g' ./bin/pyproject.toml && \
rm ./bin/pyproject.toml.bak && \
python3 -m venv venv && \
./venv/bin/python -m pip install build && \
cd ./bin && \
../venv/bin/python -m build .
mv -f ${TMPDIR}/python ${WORKING_DIR}/sdk/.

sdk/nodejs: NODE_VERSION := $(shell pulumictl convert-version --language javascript -v "$(VERSION_GENERIC)")
sdk/nodejs: TMPDIR := $(shell mktemp -d)
sdk/nodejs: $(PULUMI) bin/${PROVIDER}
rm -rf sdk/nodejs
@@ -217,9 +221,7 @@ sdk/nodejs: $(PULUMI) bin/${PROVIDER}
cd ${TMPDIR}/nodejs/ && \
yarn install && \
yarn run tsc && \
cp README.md LICENSE package.json yarn.lock bin/ && \
sed -i.bak 's/$${VERSION}/$(NODE_VERSION)/g' bin/package.json && \
rm ./bin/package.json.bak
cp README.md LICENSE package.json yarn.lock bin/
mv -f ${TMPDIR}/nodejs ${WORKING_DIR}/sdk/.

sdk/go: TMPDIR := $(shell mktemp -d)
@@ -233,14 +235,13 @@ sdk/go: $(PULUMI) bin/${PROVIDER}
go mod tidy
mv -f ${TMPDIR}/go ${WORKING_DIR}/sdk/go

sdk/dotnet: DOTNET_VERSION := $(shell pulumictl convert-version --language dotnet -v "$(VERSION_GENERIC)")
sdk/dotnet: TMPDIR := $(shell mktemp -d)
sdk/dotnet: $(PULUMI) bin/${PROVIDER}
rm -rf sdk/dotnet
$(PULUMI) package gen-sdk bin/${PROVIDER} --language dotnet -o ${TMPDIR}
cd ${TMPDIR}/dotnet/ && \
echo "$(DOTNET_VERSION)" > version.txt && \
dotnet build /p:Version=${DOTNET_VERSION}
echo "$(VERSION_GENERIC)" > version.txt && \
dotnet build
mv -f ${TMPDIR}/dotnet ${WORKING_DIR}/sdk/.

sdk/java: PACKAGE_VERSION := $(shell pulumictl convert-version --language generic -v "$(VERSION_GENERIC)")
204 changes: 204 additions & 0 deletions bin/pulumi-language-python-exec
@@ -0,0 +1,204 @@
#!/usr/bin/env python
# Copyright 2016-2018, Pulumi Corporation. All rights reserved.

import argparse
import asyncio
from typing import Optional
import logging
import os
import sys
import traceback
import runpy
from concurrent.futures import ThreadPoolExecutor

# The user might not have installed Pulumi yet in their environment - provide a high-quality error message in that case.
try:
    import pulumi
    import pulumi.runtime
except ImportError:
    # For whatever reason, sys.stderr.write is not picked up by the engine as a message, but 'print' is. The Python
    # langhost automatically flushes stdout and stderr on shutdown, so we don't need to do it here - just trust that
    # Python does the sane thing when printing to stderr.
    print(traceback.format_exc(), file=sys.stderr)
    print("""
It looks like the Pulumi SDK has not been installed. Have you run pip install?
If you are running in a virtualenv, you must run pip install -r requirements.txt from inside the virtualenv.""", file=sys.stderr)
    sys.exit(1)

# use exit code 32 to signal to the language host that an error message was displayed to the user
PYTHON_PROCESS_EXITED_AFTER_SHOWING_USER_ACTIONABLE_MESSAGE_CODE = 32

def get_abs_module_path(mod_path):
    path, ext = os.path.splitext(mod_path)
    if not ext:
        path = os.path.join(path, '__main__')
    return os.path.abspath(path)


def _get_user_stacktrace(user_program_abspath: str) -> str:
    '''grabs the current stacktrace and truncates it to show the only stacks pertaining to a user's program'''
    tb = traceback.extract_tb(sys.exc_info()[2])

    for frame_index, frame in enumerate(tb):
        # loop over stack frames until we reach the main program
        # then return the traceback truncated to the user's code
        cur_module = frame[0]
        if get_abs_module_path(user_program_abspath) == get_abs_module_path(cur_module):
            # we have detected the start of a user's stack trace
            remaining_frames = len(tb)-frame_index

            # include remaining frames from the bottom by negating
            return traceback.format_exc(limit=-remaining_frames)

    # we did not detect a __main__ program, return normal traceback
    return traceback.format_exc()

def _set_default_executor(loop, parallelism: Optional[int]):
    '''configure this event loop to respect the settings provided.'''
    if parallelism is None:
        return
    parallelism = max(parallelism, 1)
    exec = ThreadPoolExecutor(max_workers=parallelism)
    loop.set_default_executor(exec)

if __name__ == "__main__":
    # Parse the arguments, program name, and optional arguments.
    ap = argparse.ArgumentParser(description='Execute a Pulumi Python program')
    ap.add_argument('--project', help='Set the project name')
    ap.add_argument('--stack', help='Set the stack name')
    ap.add_argument('--parallel', help='Run P resource operations in parallel (default=none)')
    ap.add_argument('--dry_run', help='Simulate resource changes, but without making them')
    ap.add_argument('--pwd', help='Change the working directory before running the program')
    ap.add_argument('--monitor', help='An RPC address for the resource monitor to connect to')
    ap.add_argument('--engine', help='An RPC address for the engine to connect to')
    ap.add_argument('--tracing', help='A Zipkin-compatible endpoint to send tracing data to')
    ap.add_argument('--organization', help='Set the organization name')
    ap.add_argument('PROGRAM', help='The Python program to run')
    ap.add_argument('ARGS', help='Arguments to pass to the program', nargs='*')
    args = ap.parse_args()

    # If any config variables are present, parse and set them, so subsequent accesses are fast.
    config_env = pulumi.runtime.get_config_env()
    if hasattr(pulumi.runtime, "get_config_secret_keys_env") and hasattr(pulumi.runtime, "set_all_config"):
        # If the pulumi SDK has `get_config_secret_keys_env` and `set_all_config`, use them
        # to set the config and secret keys.
        config_secret_keys_env = pulumi.runtime.get_config_secret_keys_env()
        pulumi.runtime.set_all_config(config_env, config_secret_keys_env)
    else:
        # Otherwise, fallback to setting individual config values.
        for k, v in config_env.items():
            pulumi.runtime.set_config(k, v)

    # Configure the runtime so that the user program hooks up to Pulumi as appropriate.
    # New versions of pulumi python support setting organization, old versions do not
    try:
        settings = pulumi.runtime.Settings(
            monitor=args.monitor,
            engine=args.engine,
            project=args.project,
            stack=args.stack,
            parallel=int(args.parallel),
            dry_run=args.dry_run == "true",
            organization=args.organization,
        )
    except TypeError:
        settings = pulumi.runtime.Settings(
            monitor=args.monitor,
            engine=args.engine,
            project=args.project,
            stack=args.stack,
            parallel=int(args.parallel),
            dry_run=args.dry_run == "true"
        )

    pulumi.runtime.configure(settings)

    # Finally, swap in the args, chdir if needed, and run the program as if it had been executed directly.
    sys.argv = [args.PROGRAM] + args.ARGS
    if args.pwd is not None:
        os.chdir(args.pwd)

    successful = False

    try:
        # The docs for get_running_loop are somewhat misleading because they state:
        # This function can only be called from a coroutine or a callback. However, if the function is
        # called from outside a coroutine or callback (the standard case when running `pulumi up`), the function
        # raises a RuntimeError as expected and falls through to the exception clause below.
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)

    # Configure the event loop to respect the parallelism value provided as input.
    _set_default_executor(loop, settings.parallel)

    # We are (unfortunately) suppressing the log output of asyncio to avoid showing to users some of the bad things we
    # do in our programming model.
    #
    # Fundamentally, Pulumi is a way for users to build asynchronous dataflow graphs that, as their deployments
    # progress, resolve naturally and eventually result in the complete resolution of the graph. If one node in the
    # graph fails (i.e. a resource fails to create, there's an exception in an apply, etc.), part of the graph remains
    # unevaluated at the time that we exit.
    #
    # asyncio abhors this. It gets very upset if the process terminates without having observed every future that we
    # have resolved. If we are terminating abnormally, it is highly likely that we are not going to observe every single
    # future that we have created. Furthermore, it's *harmless* to do this - asyncio logs errors because it thinks it
    # needs to tell users that they're doing bad things (which, to their credit, they are), but we are doing this
    # deliberately.
    #
    # In order to paper over this for our users, we simply turn off the logger for asyncio. Users won't see any asyncio
    # error messages, but if they stick to the Pulumi programming model, they wouldn't be seeing any anyway.
    logging.getLogger("asyncio").setLevel(logging.CRITICAL)
    exit_code = 1
    try:
        # record the location of the user's program to return user tracebacks
        user_program_abspath = os.path.abspath(args.PROGRAM)
        def run():
            try:
                runpy.run_path(args.PROGRAM, run_name='__main__')
            except ImportError as e:
                def fix_module_file(m: str) -> str:
                    # Work around python 11 reporting "<frozen runpy>" rather
                    # than runpy.__file__ in the traceback.
                    return runpy.__file__ if m == "<frozen runpy>" else m

                # detect if the main pulumi python program does not exist
                stack_modules = [fix_module_file(f.filename) for f in traceback.extract_tb(e.__traceback__)]
                unique_modules = set(module for module in stack_modules)
                last_module_name = stack_modules[-1]

                # we identify a missing program error if
                # 1. the only modules in the stack trace are
                # - `pulumi-language-python-exec`
                # - `runpy`
                # 2. the last function in the stack trace is in the `runpy` module
                if unique_modules == {
                    __file__, # the language runtime itself
                    runpy.__file__,
                } and last_module_name == runpy.__file__ :
                    # this error will only be hit when the user provides a directory
                    # the engine has a check to determine if the `main` file exists and will fail early

                    # if a language runtime receives a directory, it's the language's responsibility to determine
                    # whether the provided directory has a pulumi program
                    pulumi.log.error(f"unable to find main python program `__main__.py` in `{user_program_abspath}`")
                    sys.exit(PYTHON_PROCESS_EXITED_AFTER_SHOWING_USER_ACTIONABLE_MESSAGE_CODE)
                else:
                    raise e

        coro = pulumi.runtime.run_in_stack(run)
        loop.run_until_complete(coro)
        exit_code = 0
    except pulumi.RunError as e:
        pulumi.log.error(str(e))
    except Exception:
        error_msg = "Program failed with an unhandled exception:\n" + _get_user_stacktrace(user_program_abspath)
        pulumi.log.error(error_msg)
        exit_code = PYTHON_PROCESS_EXITED_AFTER_SHOWING_USER_ACTIONABLE_MESSAGE_CODE
    finally:
        loop.close()
        sys.stdout.flush()
        sys.stderr.flush()

    sys.exit(exit_code)