Issue with Asset Updates "Waiting on Upstream Data to Be Up to Date" #26436

Open
hwspuijbroek opened this issue Dec 12, 2024 · 0 comments
Labels
type: bug Something isn't working

hwspuijbroek commented Dec 12, 2024

What's the issue?

Hi Dagster Team,

I am encountering an issue that resembles GitHub issue #20177. I am using Dagster version 1.7.9 on a Windows VM with a SQLite database. Most of the time, asset refreshing works fine, but occasionally some assets fail to update and remain stuck with the status "waiting on upstream data to be up to date."

When I investigate the upstream assets listed as ancestors, they appear to be up-to-date. However, Dagster does not recognize this and prevents the dependent assets from materializing. This results in an unnecessary blockage in the asset pipeline.
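
One way to check this outside the UI is to read the latest materialization timestamps from the instance's event log and compare them. The sketch below rests on several assumptions: `DAGSTER_HOME` points at the affected instance, the asset keys passed in match the real ones, and comparing raw event timestamps only approximates what the daemon evaluates. The helper names `fetch_from_dagster`, `latest_timestamps`, and `parents_newer_than_child` are made up for illustration.

```python
from typing import Callable, Dict, Iterable, List, Mapping, Optional


def fetch_from_dagster(asset_key: str) -> Optional[float]:
    """Return the timestamp of the latest materialization event, or None."""
    # Imported lazily so the pure helpers below work without Dagster installed.
    from dagster import AssetKey, DagsterInstance

    instance = DagsterInstance.get()  # assumes DAGSTER_HOME is set
    event = instance.get_latest_materialization_event(
        AssetKey.from_user_string(asset_key)
    )
    return event.timestamp if event is not None else None


def latest_timestamps(
    asset_keys: Iterable[str],
    fetch: Callable[[str], Optional[float]] = fetch_from_dagster,
) -> Dict[str, Optional[float]]:
    """Map each asset key to its latest materialization timestamp."""
    return {key: fetch(key) for key in asset_keys}


def parents_newer_than_child(
    child: str,
    parents: Iterable[str],
    timestamps: Mapping[str, Optional[float]],
) -> List[str]:
    """List parents whose latest materialization is newer than the child's."""
    child_ts = timestamps.get(child)
    newer = []
    for parent in parents:
        parent_ts = timestamps.get(parent)
        if parent_ts is not None and (child_ts is None or parent_ts > child_ts):
            newer.append(parent)
    return newer
```

If this lists an upstream asset as newer while the UI still reports the downstream asset as waiting, that would support the suspicion that the recorded state and the auto-materialize evaluation disagree.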

I would like to ask the following:

Is this a known issue in version 1.7.9?

If so, is there a fix available or planned for this problem?

Is there a way to manually trigger Dagster to update its metadata to recognize that the upstream assets are up-to-date, even if the pipeline currently does not detect this correctly?

Are there workarounds or recommendations for ensuring that assets refresh correctly when this issue occurs? For example, should I be manually materializing assets more often, or is there a configuration setting I might have overlooked?

What did you expect to happen?

When an asset is refreshed, I expect all of its downstream assets to be refreshed as well. This is what happens on most days.

My policy is:
# Policy for dbt materializations
from dagster import AutoMaterializePolicy, AutoMaterializeRule

parent_updated_policy = AutoMaterializePolicy.eager().with_rules(
    AutoMaterializeRule.skip_on_not_all_parents_updated()
)
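
For readers unfamiliar with the added rule: per its documentation, `skip_on_not_all_parents_updated()` skips an asset when any of its parents has not been updated since the asset's last materialization. A plain-Python model of that reading is sketched below. It makes no Dagster calls and ignores partitions, and the function name `should_skip` and its timestamp arguments are hypothetical, so treat it as an illustration of the intended behavior rather than Dagster's implementation.

```python
from typing import Mapping, Optional


def should_skip(
    child_last_materialized: Optional[float],
    parent_last_updated: Mapping[str, Optional[float]],
) -> bool:
    """Simplified model: skip unless every parent updated after the child."""
    for parent_ts in parent_last_updated.values():
        if parent_ts is None:
            # A parent that was never materialized cannot count as updated.
            return True
        if (
            child_last_materialized is not None
            and parent_ts <= child_last_materialized
        ):
            # This parent has not changed since the child last ran.
            return True
    return False
```

Under this model, a child last materialized at time 100 with parents updated at 150 and 90 is skipped, and it runs once the second parent is updated after 100.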

How to reproduce?

I am currently unsure what causes this behavior. Any insights into potential triggers or diagnostic steps would be greatly appreciated.

Dagster version

1.7.9

Deployment type

Local

Deployment details

pip freeze:

agate==1.7.1
alembic==1.13.1
aniso8601==9.0.1
annotated-types==0.7.0
anyio==4.4.0
attrs==23.1.0
azure-core==1.30.2
azure-identity==1.16.1
azure-keyvault-secrets==4.8.0
azure-storage-blob==12.20.0
azure-storage-file-datalake==12.15.0
Babel==2.15.0
backoff==2.2.1
cachetools==5.3.3
certifi==2024.6.2
cffi==1.15.1
charset-normalizer==3.3.2
click==8.1.7
colorama==0.4.6
coloredlogs==14.0
croniter==2.0.5
cryptography==42.0.8
dagster==1.7.9
dagster-azure==0.23.9
dagster-cloud==1.7.9
dagster-cloud-cli==1.7.9
dagster-databricks==0.23.9
dagster-dbt==0.23.9
dagster-graphql==1.7.9
dagster-msteams==0.23.9
dagster-pipes==1.7.9
dagster-pyspark==0.23.9
dagster-spark==0.23.9
dagster-webserver==1.7.9
databricks-api==0.9.0
databricks-cli==0.18.0
databricks-sdk==0.17.0
databricks-sql-connector==2.9.6
dbt-core==1.7.15
dbt-databricks==1.7.8
dbt-extractor==0.5.1
dbt-semantic-interfaces==0.4.4
dbt-spark==1.7.1
docstring_parser==0.16
et-xmlfile==1.1.0
filelock==3.14.0
fsspec==2024.6.0
github3.py==4.0.1
google-auth==2.30.0
gql==3.5.0
graphene==3.3
graphql-core==3.2.3
graphql-relay==3.2.0
greenlet==3.0.3
grpcio==1.64.1
grpcio-health-checking==1.62.2
h11==0.14.0
httptools==0.6.1
humanfriendly==10.0
idna==3.7
importlib-metadata==6.11.0
isodate==0.6.1
jaraco.classes==3.3.1
Jinja2==3.1.4
jsonschema==4.22.0
jsonschema-specifications==2023.11.2
keyring==24.3.0
leather==0.4.0
Logbook==1.5.3
lz4==4.3.3
Mako==1.3.5
markdown-it-py==3.0.0
MarkupSafe==2.1.5
mashumaro==3.13
mdurl==0.1.2
minimal-snowplow-tracker==0.0.2
more-itertools==10.1.0
msal==1.29.0
msal-extensions==1.2.0
msgpack==1.0.8
multidict==6.0.5
networkx==3.3
numpy==1.26.4
oauthlib==3.2.2
openpyxl==3.1.2
orjson==3.10.3
packaging==24.1
pandas==2.1.4
parsedatetime==2.6
pathspec==0.11.1
pendulum==2.1.2
pex==2.3.3
portalocker==2.10.0
prompt-toolkit==3.0.36
protobuf==4.25.3
psutil==5.9.8
py4j==0.10.9.7
pyarrow==15.0.0
pyasn1==0.6.0
pyasn1_modules==0.4.0
pycparser==2.22
pydantic==2.7.3
pydantic_core==2.18.4
Pygments==2.18.0
PyJWT==2.8.0
pyodbc==5.1.0
pyreadline3==3.4.1
pyspark==3.5.1
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-slugify==8.0.4
pytimeparse==1.1.8
pytz==2024.1
pytzdata==2020.1
pywin32==306
pywin32-ctypes==0.2.1
PyYAML==6.0.1
questionary==2.0.1
referencing==0.35.1
requests==2.32.3
requests-toolbelt==1.0.0
rich==13.7.1
rpds-py==0.18.1
rsa==4.9
shellingham==1.5.4
six==1.16.0
sniffio==1.3.1
SQLAlchemy==1.4.51
sqlglot==25.0.3
sqlglotrs==0.2.5
sqlparams==6.0.1
sqlparse==0.5.0
starlette==0.37.2
structlog==24.2.0
tabulate==0.9.0
text-unidecode==1.3
thrift==0.16.0
tomli==2.0.1
toposort==1.10
tqdm==4.66.4
typer==0.12.3
typing_extensions==4.12.2
tzdata==2024.1
universal_pathlib==0.2.2
uritemplate==4.1.1
urllib3==1.26.19
uvicorn==0.30.1
watchdog==4.0.1
watchfiles==0.22.0
wcwidth==0.2.13
websockets==12.0
yarl==1.9.4
zipp==3.19.2

Additional information

SQLite database

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.
By submitting this issue, you agree to follow Dagster's Code of Conduct.

@hwspuijbroek hwspuijbroek added the type: bug Something isn't working label Dec 12, 2024