Description
Expected Behavior
A deployment with dbx deploy [...] --write-specs-to-file=spec.json with a named_parameters definition like so:
named_parameters:
  conf-file: "file:fuse://path/to/some-file.yaml"
should result in a specs file with the following named_parameters object:
"named_parameters": {
  "conf-file": "/dbfs/[...]/artifacts/path/to/some-file.yaml"
}
And only conf-file should be created as a keyword argument parameter in the workflow task on the web GUI.
Current Behavior
Instead, I get this:
"named_parameters": {
  "conf-file": "/dbfs/[...]/artifacts/path/to/some-file.yaml",
  "named_parameters": "/dbfs/[...]/artifacts/path/to/some-file.yaml"
}
And the workflow task on the web GUI ends up with two keyword argument parameters: conf-file and named_parameters.
Steps to Reproduce (for bugs)
Run a dbx deploy with a workflow that has python_wheel_task tasks with named_parameters, at least one of which has a value that starts with file:// or file:fuse://. Then check the keyword argument parameters of the task in the web GUI.
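As an alternative to checking the web GUI, the file written by --write-specs-to-file can be scanned for the stray key. The following is a minimal sketch, not part of dbx: the function name find_duplicated_named_parameters is hypothetical, and it makes no assumption about the overall layout of the specs file beyond it being JSON that contains named_parameters objects somewhere.

```python
import json


def find_duplicated_named_parameters(node, path=""):
    """Collect paths of named_parameters objects that contain a stray 'named_parameters' key.

    Hypothetical helper: walks any JSON-like structure of dicts and lists.
    """
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}/{key}"
            # A named_parameters dict that has itself as a key shows the symptom.
            if key == "named_parameters" and isinstance(value, dict) and "named_parameters" in value:
                hits.append(child_path)
            hits.extend(find_duplicated_named_parameters(value, child_path))
    elif isinstance(node, list):
        for i, value in enumerate(node):
            hits.extend(find_duplicated_named_parameters(value, f"{path}/{i}"))
    return hits


def check_spec_file(spec_path):
    """Load a specs file (e.g. spec.json) and return the affected paths."""
    with open(spec_path) as fh:
        return find_duplicated_named_parameters(json.load(fh))
```

Calling check_spec_file("spec.json") after the deploy should return an empty list when the specs are as expected, and one path per affected task otherwise.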
Context
I've traced the problem to this function:
dbx/dbx/api/adjuster/adjuster.py
Lines 165 to 169 in 34bd186
def file_traverse(self, workflows, file_adjuster: FileReferenceAdjuster):
    for element, parent, index in self.traverse(workflows):
        if isinstance(element, str):
            if element.startswith("file://") or element.startswith("file:fuse://"):
                file_adjuster.adjust_file_ref(element, parent, index)
And this part in PropertyAdjuster.traverse():
dbx/dbx/api/adjuster/adjuster.py
Lines 43 to 48 in 34bd186
if isinstance(_object, dict):
    for key in list(_object.keys()):
        item = _object[key]
        yield item, _object, key
        for _out in self.traverse(item, _object, index_in_parent):
            yield _out
After yielding the correct tuple for conf-file:
('file:fuse://path/to/config.yaml', {'conf-file': 'file:fuse://path/to/config.yaml'}, 'conf-file')
It then attempts to traverse item with index_in_parent, which is named_parameters, and since item is a string, traverse jumps here and terminates:
dbx/dbx/api/adjuster/adjuster.py
Line 66 in 34bd186
yield _object, parent, index_in_parent
This yields essentially a duplicate tuple, except with the wrong index_in_parent:
('file:fuse://path/to/config.yaml', {'conf-file': 'file:fuse://path/to/config.yaml'}, 'named_parameters')
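The behaviour described above can be reproduced outside dbx with a stripped-down generator. This is only a sketch under assumptions: traverse below copies just the dict branch and the fallback yield quoted above, the call passes index_in_parent="named_parameters" by hand to mimic the state described in this issue, and keys_written assumes that adjust_file_ref writes its result to parent[index]. traverse_with_key is a hypothetical variant that forwards key instead of index_in_parent; it is one possible change, not a confirmed fix.

```python
def traverse(_object, parent=None, index_in_parent=None):
    # Simplified stand-in: only the dict branch and the fallback yield.
    if isinstance(_object, dict):
        for key in list(_object.keys()):
            item = _object[key]
            yield item, _object, key
            # index_in_parent is forwarded unchanged instead of `key`.
            for _out in traverse(item, _object, index_in_parent):
                yield _out
    else:
        yield _object, parent, index_in_parent


def traverse_with_key(_object, parent=None, index_in_parent=None):
    # Hypothetical variant: forward `key` so a leaf is reported under its own key.
    if isinstance(_object, dict):
        for key in list(_object.keys()):
            item = _object[key]
            yield item, _object, key
            for _out in traverse_with_key(item, _object, key):
                yield _out
    else:
        yield _object, parent, index_in_parent


def keys_written(traversal):
    # Assumption: the adjuster writes parent[index] for every file reference it sees.
    prefixes = ("file://", "file:fuse://")
    return sorted({index for element, _parent, index in traversal
                   if isinstance(element, str) and element.startswith(prefixes)})


params = {"conf-file": "file:fuse://path/to/config.yaml"}
observed = list(traverse(params, index_in_parent="named_parameters"))
variant = list(traverse_with_key(params, index_in_parent="named_parameters"))

print(keys_written(observed))  # keys touched with the current forwarding
print(keys_written(variant))   # keys touched when `key` is forwarded
```

With the current forwarding, the string is reported once under conf-file and once under named_parameters, which matches the two tuples above. The variant still yields the leaf twice, but both times under conf-file, so a writer that assigns parent[index] would only touch the intended key.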
Your Environment
- dbx version used: 0.8.18
- Databricks Runtime version: 12.2.x-scala2.12