Store and print Lineage information in variable in code page #455
Replies: 0 comments 2 replies
-
There are several dispatchers available; check the docs. If none of them meets your needs, you can create your own.
A post-processing filter may also be useful for this. It is applied before the dispatcher sends or stores the lineage data.
-
The short answer is yes, as Adam said, but there are some caveats. Let me elaborate. The Spline agent is called by the Spark driver from a separate thread, so to pull the lineage content into a variable in your notebook page or shell you need to do it in a concurrent manner. With Spark 2.x it's easier because the actions were blocking: by the time control is returned, the Spline work has already been done and all the dispatchers have been called, so you can expect the lineage to be captured. In Spark 3+, however, the event listeners are processed asynchronously to the actions, so you need to implement some sort of synchronization and wait until the lineage content is ready and written into your variable. This is not that straightforward, but it is doable. We do it in our integration tests. Take a look at the LineageCaptor class and its usage in some tests, e.g. BasicIntegrationTests.
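To make the synchronization idea concrete, here is a minimal sketch of the waiting pattern only. It does not use the Spline API at all: `LineageHolder`, `publish`, `wait_for_lineage` and `fake_agent_callback` are hypothetical names, the background timer merely stands in for the agent's listener thread, and the JSON payload is made up rather than the real lineage schema. In a real setup the `publish` call would have to come from your own custom dispatcher.

```python
import json
import threading


class LineageHolder:
    """Hypothetical hand-off point between a listener thread and the notebook thread."""

    def __init__(self):
        self._ready = threading.Event()
        self._lineage = None

    def publish(self, lineage_json):
        # Called from the background (listener) thread once the lineage is available.
        self._lineage = lineage_json
        self._ready.set()

    def wait_for_lineage(self, timeout_seconds):
        # Called from the notebook thread; blocks until publish() runs or the timeout expires.
        if not self._ready.wait(timeout_seconds):
            raise TimeoutError("lineage was not captured in time")
        return self._lineage


def fake_agent_callback(holder):
    # Stand-in for the agent calling a dispatcher; the payload is invented for illustration.
    payload = {"operations": {"write": {"outputSource": "file:/tmp/out"}}}
    holder.publish(json.dumps(payload))


holder = LineageHolder()

# Simulate the asynchronous listener: the callback fires 0.1 s later on another thread.
worker = threading.Timer(0.1, fake_agent_callback, args=(holder,))
worker.start()  # in real use, a Spark action would be what triggers the callback

# The notebook thread waits here, then the lineage is an ordinary variable.
lineage = json.loads(holder.wait_for_lineage(timeout_seconds=5))
worker.join()
```

The timeout matters: if no lineage is ever produced (for example, the action does not write anything), the notebook fails with an error instead of hanging forever.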
-
Hi Team,
I wanted to know whether I can write the lineage information to a variable in the code page itself. Please explain in detail.
I want to store the complete lineage JSON in a variable and do further cleaning after my code executes.
Any sample code reference would be a great help.
Thank you.