Using the same workspace directory between different environments #862

@WmWessels

Description

Expected Behavior

I would like to define two environments in .dbx/project.json. Both environments should use the same workspace directory, but different artifact locations.

Current Behavior

When I deploy my Python project using dbx in our CI/CD pipeline, I get the following exception:

Exception: Required location of experiment /Shared/dbx/ doesn't match the project defined one.

Steps to Reproduce (for bugs)

Create a dbx project. In project.json, define two environments that share the same workspace directory but use different artifact locations.
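For illustration, a project.json roughly along these lines reproduces the setup. This is only a sketch assuming the dbx 0.8.x environment schema (profile/storage_type/properties); the environment names and paths are placeholders:

    {
      "environments": {
        "train": {
          "profile": "DEFAULT",
          "storage_type": "mlflow",
          "properties": {
            "workspace_directory": "/Shared/dbx/my_project",
            "artifact_location": "dbfs:/Shared/dbx/my_project/train_artifacts"
          }
        },
        "score": {
          "profile": "DEFAULT",
          "storage_type": "mlflow",
          "properties": {
            "workspace_directory": "/Shared/dbx/my_project",
            "artifact_location": "dbfs:/Shared/dbx/my_project/score_artifacts"
          }
        }
      }
    }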

Then create two deployment files (one for training, one for scoring). The first deployment file defines a workflow for the first environment; the second defines a workflow for the second environment.
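A minimal sketch of the training-side deployment file, assuming the dbx 0.8.x YAML deployment format; the workflow name, cluster settings, and file path are hypothetical:

    # conf/deployment_train.yml (hypothetical file name)
    environments:
      train:
        workflows:
          - name: "training-workflow"
            tasks:
              - task_key: "train"
                new_cluster:
                  spark_version: "12.2.x-cpu-ml-scala2.12"
                  node_type_id: "Standard_DS3_v2"
                  num_workers: 1
                spark_python_task:
                  python_file: "file://tasks/train.py"

The scoring-side file would mirror this under the second environment, with each dbx deploy invocation pointed at the matching environment (e.g. via --environment).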

Finally:

  • dbx deploy --deployment-file <deployment_file_train>
  • dbx deploy --deployment-file <deployment_file_score>

Context

We want to version our ML code in production. We currently have a training workflow and a scoring workflow (the training workflow stores the trained models, and the scoring workflow refers to them). As such, we would like the training and scoring workflows to use the same workspace directory. However, we also want different artifact locations, so that we can version our code and avoid having the training and scoring workflows pinned to the same code version.

How would I need to structure my project.json in order to get this to work?

Your Environment

  • dbx version used: 0.8.17
  • Databricks Runtime version: 12.2
