Toxicity

The toxicity metric is another referenceless metric that evaluates toxicity in your LLM's outputs. This is particularly useful for fine-tuning use cases.

Installation

Toxicity in deepeval requires an additional installation:

pip install detoxify
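
If the install succeeded, you should be able to load a detoxify model directly. The snippet below is an optional sanity check (the first call downloads model weights, so it may take a moment):

from detoxify import Detoxify

# Optional sanity check: load a detoxify model and score a harmless sentence.
# This prints a dict of toxicity probabilities.
print(Detoxify("original").predict("Hello, world!"))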

Required Arguments

To use the NonToxicMetric, you'll have to provide the following arguments when creating an LLMTestCase:

  • input
  • actual_output

Example

Also being referenceless like the UnBiasedMetric, the NonToxicMetric similarly requires an extra parameter named evaluation_params. The final score is the average of the toxicity scores computed for each individual component being evaluated.

from deepeval import evaluate
from deepeval.metrics import NonToxicMetric
from deepeval.test_case import LLMTestCase, LLMTestCaseParams

# Replace this with the actual output from your LLM application
actual_output = "We offer a 30-day full refund at no extra cost."

metric = NonToxicMetric(
    evaluation_params=[LLMTestCaseParams.INPUT, LLMTestCaseParams.ACTUAL_OUTPUT],
    threshold=0.5
)
test_case = LLMTestCase(
    input="What if these shoes don't fit?",
    actual_output=actual_output,
)

metric.measure(test_case)
print(metric.score)

# or evaluate test cases in bulk
evaluate([test_case], [metric])
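
To build intuition for the averaging step described above, here is a rough sketch that calls detoxify directly. It is illustrative only, not deepeval's internal implementation, and the variable names are purely hypothetical.

from detoxify import Detoxify

# Illustrative sketch only; not deepeval's internal implementation.
model = Detoxify("original")

# The two components selected via evaluation_params above.
components = {
    "input": "What if these shoes don't fit?",
    "actual_output": "We offer a 30-day full refund at no extra cost.",
}

# detoxify returns a toxicity probability for each piece of text.
toxicity_scores = {
    name: model.predict(text)["toxicity"] for name, text in components.items()
}

# The final score is the average across the evaluated components.
average_toxicity = sum(toxicity_scores.values()) / len(toxicity_scores)
print(toxicity_scores, average_toxicity)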