Aggressors Detection
This page describes how to detect aggressors within the Tensorleap platform.
What Are Aggressors and Why They Matter
In deep learning workflows, understanding why a model fails is often harder than spotting where it fails. While metrics like accuracy or loss may flag issues, they rarely reveal their underlying causes. This is where aggressors come in.
What Is a Model Aggressor?
An aggressor is a semantic subgroup of data where your model consistently underperforms. These patterns—often called error slices—might relate to lighting conditions, phrasing, or edge-case user behaviors. They’re rarely visible through aggregate metrics but can lead to persistent failure modes, especially in production.
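To make the idea concrete, here is a minimal, framework-agnostic sketch (not the Tensorleap API) of how an error slice can be surfaced from per-sample losses and a metadata field. The column names `lighting` and `per_sample_loss` are illustrative assumptions, as are the numbers.

```python
import pandas as pd

# Hypothetical per-sample validation results; "lighting" and "per_sample_loss"
# are illustrative column names, not Tensorleap fields.
results = pd.DataFrame({
    "lighting":        ["day", "day", "night", "night", "dusk", "night"],
    "per_sample_loss": [0.21, 0.18, 0.95, 1.10, 0.30, 0.88],
})

# The aggregate metric looks reasonable and hides the problem.
print("overall mean loss:", results["per_sample_loss"].mean())

# Slicing by the metadata field surfaces a candidate aggressor:
# the "night" slice consistently underperforms.
per_slice = (
    results.groupby("lighting")["per_sample_loss"]
    .agg(["mean", "count"])
    .sort_values("mean", ascending=False)
)
print(per_slice)
```

The same pattern generalizes to any semantic attribute, whether it comes from dataset metadata or is derived after the fact.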
Aggressors are important because they:
Reveal systematic model weaknesses
Often go unnoticed during standard evaluation
Tend to reappear in real-world deployment
Are difficult to catch without deeper analysis
Why They’re Worth Addressing
Ignoring aggressors can lead to:
Passing benchmarks but failing on edge cases
Long debug cycles with little progress
Production failures and wasted resources
Treating them early can:
Speed up model development
Improve generalization
Reduce the cost of failure
The Aggressor Lifecycle
Most teams—explicitly or not—follow the same general process when addressing model failures. This lifecycle includes:
Identifying a problematic behavior and measuring its severity (see the sketch after this list)
Forming a root cause hypothesis
Validating that hypothesis
Solving the issue through model or data changes
Tracking the outcome over time
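As a rough illustration of the first and last steps, the sketch below uses plain NumPy rather than the Tensorleap API; the slice mask, version names, and loss values are invented for the example. It measures how severe a hypothesized slice is relative to the rest of the data, then tracks that gap across model versions to check whether a fix holds.

```python
import numpy as np

def slice_severity(losses: np.ndarray, in_slice: np.ndarray) -> dict:
    """Compare a hypothesized slice against the rest of the validation set.

    losses   -- per-sample losses from a validation run (illustrative input)
    in_slice -- boolean mask selecting samples matching the hypothesis,
                e.g. "night-time images" (illustrative input)
    """
    slice_loss = float(losses[in_slice].mean())
    rest_loss = float(losses[~in_slice].mean())
    return {
        "slice_loss": slice_loss,
        "rest_loss": rest_loss,
        "severity_ratio": slice_loss / rest_loss,  # > 1 means the slice underperforms
        "support": int(in_slice.sum()),            # number of samples backing the claim
    }

# Track the same slice across model versions to confirm the fix holds.
# Version names and numbers are made up for illustration.
runs = {
    "v1-baseline":       np.array([0.2, 0.9, 1.1, 0.3]),
    "v2-augmented-data": np.array([0.2, 0.4, 0.5, 0.3]),
}
night_mask = np.array([False, True, True, False])
for version, losses in runs.items():
    print(version, slice_severity(losses, night_mask))
```

Keeping the slice definition fixed while the model changes is what makes the last step meaningful: the same mask is evaluated on every run.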
One Lifecycle, Two Ways to Execute It
This process is often carried out manually, which is slow and hard to scale. Tensorleap supports the same lifecycle but makes each step faster, more structured, and easier to explain.
The next sections walk through each stage, comparing traditional approaches with how Tensorleap streamlines them, and conclude with a complete end-to-end example of tackling an aggressor in the platform.