Squared log error objective function produces NaN values during training #11210

heptaflar opened this issue Feb 6, 2025 · 1 comment

heptaflar commented Feb 6, 2025

The squared log error objective function produces NaN values during training if the predicted value x is less than or equal to -1.0, as you correctly state in your documentation. However, this is not a mathematical necessity but a consequence of your implementation. You may exploit the identity

log(1 + x) == 0.5 * log((1 + x) * (1 + x))

or as you may prefer

log1p(x) == 0.5 * log1p(x * (2.0 + x))

which yields NaN if and only if x == -1.0, but a meaningful value otherwise. The gradient and the Hessian (and hence the training) also become stable for values x < -1.0. I have included my custom implementation of the squared log error objective (and its associated metric) in the code below. It works well in my case and I would like to share it with you. My implementation also uses the identity

log(a) - log(b) == log(a / b)

But this is not essential. Take it or leave it. Best wishes, Ralf.
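
For a quick numerical check of the stabilized form (the sample values below are illustrative only):

```python
import numpy as np

x = np.array([-1.5, -0.5, 2.0])

# Naive form: undefined (NaN) for x < -1; NumPy also emits a RuntimeWarning.
naive = np.log1p(x)

# Stabilized form: finite for every x except x == -1.
stable = 0.5 * np.log1p(x * (2.0 + x))

print(naive)   # approx. [   nan -0.693  1.099]
print(stable)  # approx. [-0.693 -0.693  1.099]
```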

"""
This module defines custom objectives.
"""

from abc import ABC
from abc import abstractmethod

import numpy as np
import xgboost as xgb


class Objective(ABC):
    """
    The interface for a custom objective and its associated metric.
    """

    @abstractmethod
    def gradient(self, pred: np.ndarray, data: xgb.DMatrix) -> np.ndarray:
        """
        Returns the gradient of the objective.

        :param pred: The predicted values.
        :param data: The predictor values.
        :return: The gradient.
        """

    @abstractmethod
    def hessian(self, pred: np.ndarray, data: xgb.DMatrix) -> np.ndarray:
        """
        Returns the Hessian of the objective.

        :param pred: The predicted values.
        :param data: The predictor values.
        :return: The Hessian.
        """

    @abstractmethod
    def metric(
        self, pred: np.ndarray, data: xgb.DMatrix
    ) -> tuple[str, float]:
        """
        Returns the metric associated with the objective.

        :param pred: The predicted values.
        :param data: The predictor values.
        :return: The name and the value of the metric.
        """

    def obj(
        self, pred: np.ndarray, data: xgb.DMatrix
    ) -> tuple[np.ndarray, np.ndarray]:
        """
        The objective function.

        :param pred: The predicted values.
        :param data: The predictor values.
        :return: The gradient and the Hessian of the objective.
        """
        return self.gradient(pred, data), self.hessian(pred, data)


def le(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Returns the logarithmic error terms."""
    return 0.5 * np.log(np.square((1.0 + x) / (1.0 + y)))


def rms(e: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Returns the root (weighted) mean squared error."""
    return np.sqrt(
        np.average(np.square(e), weights=w if w.shape == e.shape else None)
    )


class SLE(Objective):
    """
    The squared logarithmic error objective.

    This objective shall replace the internal XGB squared logarithmic
    error objective.
    """

    def gradient(self, pred: np.ndarray, data: xgb.DMatrix) -> np.ndarray:
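        # First derivative of 0.5 * le(x, y)**2 with respect to x.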
        return le(pred, data.get_label()) / (1.0 + pred)

    def hessian(self, pred: np.ndarray, data: xgb.DMatrix) -> np.ndarray:
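        # Second derivative of 0.5 * le(x, y)**2 with respect to x.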
        return (1.0 - le(pred, data.get_label())) / np.square(1.0 + pred)

    def metric(
        self, pred: np.ndarray, data: xgb.DMatrix
    ) -> tuple[str, float]:
        return (
            "rmsle",
            rms(le(pred, data.get_label()), data.get_weight()).item(),
        )
```
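
A minimal usage sketch (assuming the SLE class above is in scope; the synthetic data and training parameters are illustrative only):

```python
import numpy as np
import xgboost as xgb

# Synthetic regression data with positive targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 4))
y = np.exp(X @ rng.normal(size=4))

dtrain = xgb.DMatrix(X, label=y)

sle = SLE()
booster = xgb.train(
    {"max_depth": 3, "eta": 0.1},
    dtrain,
    num_boost_round=50,
    obj=sle.obj,                # custom gradient/Hessian
    custom_metric=sle.metric,   # reported as "rmsle"
    evals=[(dtrain, "train")],
)
```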

@trivialfis (Member) commented

Thank you for sharing, will look into this.
