Runs and Processes

Runs and Processes lets a user inspect processes, get logs, resume failed jobs, and kill processes.

The Runs and Processes window gives an in-depth view of the Tensorleap processes started by the user: their progress, status, and logs. It also lets you extract logs from failing processes for debugging, re-run failed evaluations, and kill processes.

Overview

The Runs and Processes window presents all of the processes that were started in the platform. The table lists:

The Model Name that is used by the process
The Model Run that is used by the process
The Code Integration Name, branch, and version that are used by the process
The Type and status of the process
The creation time and total duration of the process

The bar at the top of the Runs and Processes window supports filtering by process type, stopping all jobs, and deleting all logged processes from the window.

Each process can be expanded by clicking the process name to show additional process attributes.

Process Types in Tensorleap

Import Model: The process of uploading a new model to the platform via the CLI or UI.
Dataset Parse: The process of parsing an integration script uploaded via the CLI or UI.
Graph Validate: The process of validating the assets of a mapping.
Evaluate: The process of running an evaluation of a model.
Population Exploration: The process of analyzing the model's latent space. This is triggered when:
- An evaluation completes
- The state of the Population Exploration dashlet changes
- The dashboard filters change
Visualizers Calculation: The process of visualizing the different samples within the population exploration using the provided visualizers.
Fetch Similar: The process of fetching similar samples from within the population exploration.
Sample Analysis: The process of analyzing a specific sample from within the population exploration.
Import Project: The process of importing a gallery project to the platform.

Filtering, killing, and removing process logs

Filtering a process

To view only some process types, click the filter icon at the top of the Runs and Processes window and select the relevant process types.

Killing a process

Clicking the top "skull" icon next to the filter kills all processes, which is useful when multiple processes are stuck.
Hovering over an active process reveals an in-line "skull" icon. Pressing it kills that process.

Clearing logs and previous processes

The right-most button removes all previous process logs from the platform.

Inspecting a process

Inspecting a process allows a user to monitor logs from the process (live) or download a .tar.gz of the most recent logs once a process is terminated.

How to inspect a process

To inspect a process, click the process in the Runs and Processes window and then click "Inspect Process".

The process inspection view

In the process inspection view, you can download the logs (the cloud icon at the top right downloads a .tar.gz file) or review them within the platform.
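
Once the .tar.gz file is downloaded, it can be unpacked locally for closer inspection. The following is a minimal sketch using Python's standard tarfile module. The function name extract_process_logs and the file names in the usage note are hypothetical, and the internal layout of the archive is not documented here, so the code only assumes that it contains regular files.

```python
import tarfile
from pathlib import Path


def extract_process_logs(archive_path, dest_dir):
    """Extract a downloaded logs archive and return the extracted file paths.

    Hypothetical helper: the archive layout is an assumption, so this only
    extracts regular files and ignores anything else.
    Requires Python 3.9+ (Path.is_relative_to).
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    root = dest.resolve()
    with tarfile.open(archive_path, "r:gz") as archive:
        # Keep regular files only, and skip members that would be written
        # outside the destination directory.
        members = [
            m for m in archive.getmembers()
            if m.isfile() and (dest / m.name).resolve().is_relative_to(root)
        ]
        archive.extractall(dest, members=members)
    return sorted(dest / m.name for m in members)
```

For example, extract_process_logs("process-logs.tar.gz", "logs") would unpack the archive into a local logs directory and return the list of extracted files, which can then be opened in any text editor.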

The logs are divided into tabs. Each tab shows the logs from a Kubernetes pod that takes part in the work needed for this process to complete.

There are two tabs per pod, named describe-POD-NAME and POD-NAME.

The describe-POD-NAME tab is essentially the output of a kubectl describe pod command. It provides information on the status of the pod, whether it encountered an OOM (out-of-memory) condition, and other high-level configuration (limits, etc.).
The POD-NAME tab contains the most recent logs from the pod.
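
When a container is killed for exceeding its memory limit, the output of kubectl describe pod reports the termination reason as OOMKilled. As a minimal sketch, assuming the text of a describe-POD-NAME tab has been saved or copied into a string, a check for that reason could look like this. The function name was_oom_killed is hypothetical.

```python
import re

# Matches a "Reason: OOMKilled" line as printed by `kubectl describe pod`
# under a container's State or Last State.
OOM_REASON = re.compile(r"^\s*Reason:\s+OOMKilled\s*$", re.MULTILINE)


def was_oom_killed(describe_text):
    """Return True if the describe output reports an OOMKilled container."""
    return OOM_REASON.search(describe_text) is not None
```

If the check returns True, the limits listed in the same output are a reasonable next place to look.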

Debugging process errors

Most process errors result in a meaningful notification that points to the specific issue to resolve. To debug the issue further, examine the logs and review any errors that appear.
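
For long logs, a quick keyword scan can narrow down where to look. The sketch below is an illustration only: the function name find_error_lines and the default keywords are assumptions, not part of Tensorleap, and the actual log format may call for different search terms.

```python
def find_error_lines(log_text, keywords=("error", "exception", "traceback")):
    """Return (line_number, line) pairs for lines that mention any keyword.

    The keyword list is an assumption; adjust it to the logs being reviewed.
    Matching is case-insensitive and line numbers start at 1.
    """
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append((number, line.strip()))
    return hits
```

The returned line numbers make it easy to jump to the matching lines in the downloaded log file and read the surrounding context.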


