Adding Code to Dagster Cloud#

This section explains how to add your Dagster Python code to Dagster Cloud.

Dagster Cloud Code Requirements#

  • All Dagster code within a code location must be loaded from a single entry point that's either a Python file or a Python package. That entry point can load repositories from other files or packages.

  • Your Dagster Cloud code must run in an environment that has both the dagster and dagster-cloud Python packages installed. The dagster-cloud package is pinned to always be the same version as dagster. If you're porting code from Dagster that was originally written for version 0.11 or later, you can upgrade to the latest version of dagster without needing to change your code. We recommend installing the latest version of dagster to have access to the latest features and improvements, but any Dagster version starting at 0.13.2 will work with Dagster Cloud.

  • If you're using the Kubernetes, ECS, or Docker agents, you'll need to package your code into a Docker image and push it to a registry that your agent can access. Because of Dagster Cloud's hybrid architecture, Dagster's hosted components don't need access to your image; only your agent needs to be able to pull it. The Dockerfile for your image does not need to specify an entry point or command, because your agent specifies those when it runs code using your supplied image.

  • If you're using the local agent, your code must be in a Python environment that can be accessed on the same machine as your agent.

  • Your code does not need to use the same version of Dagster as your agent, and different code locations can use different versions of Dagster.
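
The package requirements above can be checked before you deploy. The following is a minimal sketch, not part of Dagster Cloud: the function names `parse_version` and `check_environment` are hypothetical, and the check assumes that "same version" means the two installed version strings are identical. It uses only the Python standard library.

```python
import re
from importlib import metadata

# Minimum Dagster version that works with Dagster Cloud, as stated above.
MIN_VERSION = (0, 13, 2)


def parse_version(text):
    """Return the leading dotted numeric part of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def check_environment(get_version=metadata.version):
    """Return a list of problems; an empty list means the environment looks fine.

    get_version is injectable so the check can be exercised without
    dagster installed (hypothetical helper, for illustration only).
    """
    versions = {}
    problems = []
    for package in ("dagster", "dagster-cloud"):
        try:
            versions[package] = get_version(package)
        except metadata.PackageNotFoundError:
            problems.append(f"{package} is not installed")
    if problems:
        return problems
    if versions["dagster"] != versions["dagster-cloud"]:
        problems.append("dagster and dagster-cloud versions differ")
    if parse_version(versions["dagster"]) < MIN_VERSION:
        problems.append("dagster is older than 0.13.2")
    return problems
```

Calling `check_environment()` with no arguments inspects the current Python environment, so it could run as an early step in an image build or CI job.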

Configuring your Code Locations#

To load your code, your agent needs to know where to find it. In Dagster Cloud, you can configure your code locations from the Dagit UI or by using the dagster-cloud CLI. You can also use the Dagster Cloud GitHub action, which updates your code locations using the Dagster Cloud API on each commit to your GitHub repo.

When you add or update a code location, the agent uses the location configuration to load your code and upload metadata about your jobs to Dagster Cloud, allowing you to operate Dagit and launch runs. Each Dagster Cloud deployment has its own independent set of code locations.

Note that, unlike open-source Dagster, Dagster Cloud does not require you to create a workspace.yaml file. Instead, you use the Dagster Cloud API to configure your workspace. You can still create a workspace.yaml file if you want to also be able to load your code in an open-source Dagit instance, but doing so won't affect how your code is loaded in Dagster Cloud.

Using the Workspace Tab in Dagit#

You can add a new code location by navigating to the Workspace tab in Dagit and pressing the "Add Code Location" button. This will open a YAML editor with a schema that indicates the acceptable fields.

[Image: Add Code Location config editor]

Every code location must set the code_source: key, which contains either a python_file: or a package_name: entry that specifies where to find your code.

If you're not using the local agent, you'll also need to specify an image: key.

You can also set the working_directory: key to specify what directory should be used to resolve relative Python imports while loading your code, and the attribute: key to specify that only a specific Dagster repository should be loaded.

Finally, you can set the executable_path: key to a specific Python executable if your code should run in a particular Python environment. If executable_path: is not set, the code will run using the default dagster command-line entry point.

For example, you can use this configuration to load our public example image:

location_name: cloud-examples
image: dagster/dagster-cloud-examples:latest
code_source:
  package_name: dagster_cloud_examples
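
The rules for these keys can also be expressed as a quick pre-flight check, for example before submitting a config from a script. This is only a sketch under stated assumptions: `validate_location` is a hypothetical helper, not a Dagster Cloud API. It treats location_name as required (every example in this section includes it) and reads "either python_file: or package_name:" as "exactly one of the two".

```python
# Keys that may identify the entry point inside code_source.
CODE_SOURCE_KEYS = ("python_file", "package_name")


def validate_location(config, uses_local_agent=False):
    """Return a list of problems with a code location config dict (empty if none).

    Hypothetical helper for illustration; Dagster Cloud performs its own
    validation when a location is added.
    """
    problems = []
    # Assumption: location_name is mandatory.
    if not config.get("location_name"):
        problems.append("location_name is required")
    code_source = config.get("code_source") or {}
    present = [key for key in CODE_SOURCE_KEYS if key in code_source]
    # Assumption: exactly one entry point key must be set.
    if len(present) != 1:
        problems.append("code_source must set exactly one of python_file or package_name")
    # An image is needed for every agent type except the local agent.
    if not uses_local_agent and not config.get("image"):
        problems.append("image is required unless you are using the local agent")
    return problems


example = {
    "location_name": "cloud-examples",
    "image": "dagster/dagster-cloud-examples:latest",
    "code_source": {"package_name": "dagster_cloud_examples"},
}
print(validate_location(example))
```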

Once you add a code location, your agent will attempt to load your code and send metadata about it to Dagit. Depending on your execution environment, it may take some time for the agent to pull any needed images and load your code.

Once your code has loaded, the location will show a green "Loaded" status, and your jobs will immediately appear in Dagit. If the agent is unable to load your code, the code location will show an error with more information.

You can modify your code location configuration (for example, to update the image to a newer tag) by opening the dropdown menu in the right-hand column next to your location and selecting "Modify". Once you update a code location, your agent will perform a rolling update of your code, and your jobs will update in Dagit. Updating your code will not interrupt any currently launched runs.

You can also press the Redeploy button to tell your agent to reload your code and upload the metadata about your jobs to Dagit, without needing to modify the code location. For example, if the agent was unable to pull your image due to a permissions issue and you've fixed the problem, pressing Redeploy will tell it to attempt to pull the same image again.

Environment-specific config#

If your agent is at version 0.14.9 or higher, you can set the optional container_context key in your code location config to customize the code location for a specific execution environment (like Kubernetes, Docker, or ECS). For example, you could use the following config to specify that a code location should include a secret called my_secret and run in a K8s namespace called my_namespace whenever the Kubernetes agent creates a pod for that location:

location_name: cloud-examples
image: dagster/dagster-cloud-examples:latest
code_source:
  package_name: dagster_cloud_examples
container_context:
  k8s:
    namespace: my_namespace
    env_secrets:
      - my_secret

See the Agent Manuals section for information about the configuration options available for each agent type.

Using the dagster-cloud CLI#

You can also use the dagster-cloud workspace CLI commands to add, update, and delete code locations. These commands perform the same underlying operations as editing your code locations in the Workspace tab in Dagit. If you're not able to use the Dagster Cloud GitHub action, you can integrate these commands into a CI/CD process to continually update the code in your Dagster Cloud deployment.

In order to use the CLI, you must have the dagster-cloud Python package installed in your command-line Python environment:

pip install dagster-cloud

See the Using the dagster-cloud CLI section for more information about configuring the dagster-cloud CLI.

Adding a location#

You can add or update a code location with the add-location command. For example, to add our public example image, you can run:

# Set up YAML file for example location
cat > example_location.yaml <<EOL
location_name: cloud-examples
image: dagster/dagster-cloud-examples:latest
code_source:
  package_name: dagster_cloud_examples
EOL

dagster-cloud workspace add-location --from example_location.yaml

Instead of creating a YAML file, you may also pass these values inline as command line options:

dagster-cloud workspace add-location test_location \
    --image dagster/dagster-cloud-examples:latest \
    --package-name dagster_cloud_examples

You may also selectively overwrite parts of your YAML input using inline options:

dagster-cloud workspace add-location \
    --from example_location.yaml \
    --image dagster/dagster-cloud-examples:1d9c5d

The arguments to the add-location and update-location commands are similar to the keys in the YAML config editor in the Workspace tab. You can run dagster-cloud workspace add-location --help to see the full set of available options. You must specify the full set of information about your code location when updating an existing location, even if you're only changing one piece of configuration.
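
If you drive the CLI from a Python-based CI script, it can help to assemble the command as an argument list rather than a shell string. The sketch below is an assumption about how you might structure that; `add_location_command` is a hypothetical helper, and it uses only the options shown in the examples above (--from, --image, --package-name).

```python
import shlex


def add_location_command(location_name=None, from_file=None, image=None, package_name=None):
    """Build the argument list for `dagster-cloud workspace add-location`.

    Hypothetical helper: it only knows the options shown in this section.
    """
    command = ["dagster-cloud", "workspace", "add-location"]
    if location_name:
        command.append(location_name)
    if from_file:
        command += ["--from", from_file]
    if image:
        command += ["--image", image]
    if package_name:
        command += ["--package-name", package_name]
    return command


# Reuse the YAML file but override the image tag, as in the example above.
command = add_location_command(
    from_file="example_location.yaml",
    image="dagster/dagster-cloud-examples:1d9c5d",
)
print(shlex.join(command))
```

To execute the command, you could pass the list to `subprocess.run(command, check=True)` in an environment where the dagster-cloud CLI is installed and configured.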

Deleting a location#

You can delete an existing code location from Dagster Cloud using the delete-location command:

dagster-cloud workspace delete-location test_location

Syncing the workspace#

You can also keep the YAML configuration for your entire workspace in a file, and use the dagster-cloud workspace sync command to reconcile the workspace config in Dagster Cloud with that local file (adding, updating, or removing any locations as needed). For example, if you have the following cloud_workspace.yaml file:

locations:
  - location_name: machine-learning
    image: myregistry/dagster-machine-learning:mytag
    code_source:
      package_name: dagster_cloud_machine_learning
    executable_path: /my/folder/python_executable
    attribute: my_repo
  - location_name: data-eng
    image: myregistry/dagster-data-eng:myothertag
    code_source:
      python_file: repo.py
    working_directory: /my/folder/working_dir/

You can reconcile it with Dagster Cloud's remote workspace by running:

dagster-cloud workspace sync -w cloud_workspace.yaml
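
To make the reconciliation concrete, here is a conceptual sketch of what "adding, updating, or removing any locations as needed" means. It is not the CLI's implementation, and the real command may behave differently in its details (for example, it might also redeploy unchanged locations). `plan_sync` is a hypothetical function, and both inputs are assumed to be dicts mapping each location_name to its config.

```python
def plan_sync(local_locations, remote_locations):
    """Compare local and remote workspace configs and return the planned changes.

    Both arguments map location_name -> config dict. Illustration only;
    this is not how dagster-cloud itself is implemented.
    """
    # Locations defined in the file but missing from the deployment.
    to_add = sorted(name for name in local_locations if name not in remote_locations)
    # Locations present in the deployment but absent from the file.
    to_remove = sorted(name for name in remote_locations if name not in local_locations)
    # Locations present on both sides whose config differs.
    to_update = sorted(
        name
        for name in local_locations
        if name in remote_locations and local_locations[name] != remote_locations[name]
    )
    return {"add": to_add, "update": to_update, "remove": to_remove}


local = {
    "machine-learning": {"image": "myregistry/dagster-machine-learning:mytag"},
    "data-eng": {"image": "myregistry/dagster-data-eng:myothertag"},
}
remote = {
    "data-eng": {"image": "myregistry/dagster-data-eng:oldtag"},
    "cloud-examples": {"image": "dagster/dagster-cloud-examples:latest"},
}
print(plan_sync(local, remote))
```

In this made-up scenario, machine-learning would be added, data-eng updated, and cloud-examples removed, which is why a sync file should always list every location you want to keep.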

Example#

For an example of a GitHub repo set up to run in Dagster Cloud, see our example repo. This repo also uses the Dagster Cloud GitHub action, described in the next section, to automatically redeploy the jobs to Dagster Cloud on each commit.