recommendation.metric

fairdiverse.recommendation.metric.AUC_score(y_scores, y_true)

AUC (Area Under the Curve) evaluates a two-class (binary) model. It is the area under the ROC curve.

Note

This metric does not calculate group-based AUC, which averages per-user AUC scores, and it is not truncated at k. Instead, it is computed over the entire set of prediction results, regardless of user. The implementation calls the scikit-learn interface; the computation is a variant of the following formula.

\[\mathrm {AUC} = \frac {{{M} \times {(N+1)} - \frac{M \times (M+1)}{2}} - \sum\limits_{i=1}^{M} rank_{i}} {{M} \times {(N - M)}}\]

\(M\) denotes the number of positive items. \(N\) denotes the total number of user-item interactions. \(rank_i\) denotes the descending rank of the i-th positive item.
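The rank formula above can be checked by hand with a small pure-Python sketch. The function name `auc_from_ranks` is hypothetical and not part of the library, and the sketch assumes there are no tied scores.

```python
def auc_from_ranks(y_scores, y_true):
    """AUC from the descending-rank formula (illustrative; assumes no tied scores)."""
    n = len(y_scores)          # N: total number of interactions
    m = sum(y_true)            # M: number of positive items
    if m == 0 or m == n:
        raise ValueError("AUC needs at least one positive and one negative item")
    # Sort indices by score, highest first, so rank 1 is the top-scored item.
    order = sorted(range(n), key=lambda i: y_scores[i], reverse=True)
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if y_true[i] == 1)
    return (m * (n + 1) - m * (m + 1) / 2 - rank_sum) / (m * (n - m))


scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
auc = auc_from_ranks(scores, labels)  # 3 of the 4 positive/negative pairs are ordered correctly
```

Here the positives sit at descending ranks 1 and 3, so the numerator is 2 × 5 − 3 − 4 = 3 and the denominator is 2 × 2 = 4, giving 0.75.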

fairdiverse.recommendation.metric.Entropy(utility_list, weights=None, group_mask=None)

Calculate the entropy of a distribution given by utility_list, optionally weighted by weights and filtered by group_mask. Entropy measures the disorder or uncertainty in the distribution.

Parameters:

utility_list – list or array-like A list or array representing utility values for each item.
weights – list or array-like, optional A list or array of weights corresponding to each utility value. If not provided, all utilities are considered equally weighted. Defaults to None.
group_mask – list or array-like, optional A boolean mask indicating which utilities to include in the calculation. If not provided, all utilities are included. Defaults to None.

Returns:

float: The calculated entropy of the (potentially weighted and masked) distribution.

Notes:

- Entropy is calculated as H = -sum(p * log2(p)), where p is the probability of each event.
- Probabilities are normalized to ensure their sum equals 1.
- To avoid taking the log of zero, a small constant (1e-9) is added to each probability before calculating the entropy.
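The notes above can be illustrated with a minimal sketch. The name `entropy_sketch` is hypothetical, weighting and masking are omitted, and exactly where the 1e-9 constant enters the computation is an assumption rather than the library's confirmed behaviour.

```python
import math


def entropy_sketch(utility_list, eps=1e-9):
    """Shannon entropy (base 2) of utilities normalized to a distribution."""
    total = sum(utility_list)
    # Normalize to probabilities, then add eps so log2 never sees zero (assumed placement).
    probs = [u / total + eps for u in utility_list]
    return -sum(p * math.log2(p) for p in probs)


uniform = entropy_sketch([1, 1, 1, 1])       # close to 2 bits: four equally likely outcomes
concentrated = entropy_sketch([1, 0, 0, 0])  # close to 0 bits: all utility on one item
```

A uniform distribution over four items approaches the maximum of log2(4) = 2 bits, while a fully concentrated one approaches 0.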

fairdiverse.recommendation.metric.Gini(utility_list, weights=None, group_mask=None)

This function computes the Gini coefficient, a measure of statistical dispersion intended to represent income inequality within a nation or social group. The Gini coefficient is calculated based on the cumulative distribution of values in utility_list, which can optionally be weighted and masked.

Parameters:

utility_list – array_like A 1D array representing individual utilities. The utilities are used to compute the Gini coefficient.
weights – array_like, optional A 1D array of weights corresponding to utility_list. If provided, each utility value is multiplied by its respective weight before calculating the Gini coefficient. Defaults to None, implying equal weighting.
group_mask – array_like, optional A 1D boolean array used to selectively include elements from utility_list. If provided, only the elements where the mask is True are considered in the calculation. Defaults to None, meaning all elements are included.

Returns:

float: The computed Gini coefficient, ranging from 0 (perfect equality) to 1 (maximal inequality).
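As an illustration, the Gini coefficient of a list of utilities can be computed with the standard sorted-values formula. This is a sketch under assumptions: `gini_sketch` is a hypothetical name, weights and masks are left out, and the library may use a different but equivalent formulation.

```python
def gini_sketch(utility_list):
    """Gini coefficient via the sorted-values formula (illustrative only)."""
    values = sorted(utility_list)  # ascending order
    n = len(values)
    total = sum(values)
    if total == 0:
        return 0.0                 # no utility at all: treat as perfectly equal
    # G = sum_i (2i - n - 1) * x_i / (n * sum(x)), with i starting at 1.
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(values, start=1))
    return weighted / (n * total)


equal = gini_sketch([1, 1, 1, 1])   # 0.0
skewed = gini_sketch([0, 0, 0, 1])  # 0.75
```

With this sample formula, a list of n values where one item holds all the utility yields (n − 1) / n, which approaches 1 as n grows.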

fairdiverse.recommendation.metric.HR(ranking_list, label_list, k)

HR (also known as truncated Hit-Ratio) measures how often a K-sized list of ranked items contains a hit. A list counts as a hit if at least one of its items falls in the ground-truth set.

\[\mathrm {HR@K} = \frac{1}{|U|}\sum_{u \in U} \delta(\hat{R}(u) \cap R(u) \neq \emptyset),\]

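\(U\) is the set of users, \(\hat{R}(u)\) is the top-K list recommended to user \(u\), and \(R(u)\) is the ground-truth item set of \(u\). \(\delta(\cdot)\) is an indicator function: \(\delta(b) = 1\) if \(b\) is true and 0 otherwise. \(\emptyset\) denotes the empty set.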
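The HR@K definition can be sketched as follows. The input format is an assumption: each ranking is taken to be a list of item IDs ordered best first, each label list holds the ground-truth item IDs, and the helper names are hypothetical.

```python
def hit_at_k(ranking_list, label_list, k):
    """1.0 if any of the top-k ranked items is in the ground-truth set, else 0.0."""
    relevant = set(label_list)
    return 1.0 if any(item in relevant for item in ranking_list[:k]) else 0.0


def hr_at_k(rankings, labels, k):
    """Average the per-user hit indicator over all users."""
    hits = [hit_at_k(r, l, k) for r, l in zip(rankings, labels)]
    return sum(hits) / len(hits)


rankings = [[3, 1, 2], [5, 6, 7]]  # one ranked list per user
labels = [[2], [9]]                # one ground-truth set per user
hr3 = hr_at_k(rankings, labels, k=3)  # first user hits at position 3, second never hits
hr2 = hr_at_k(rankings, labels, k=2)  # truncating at 2 drops the first user's hit
```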

fairdiverse.recommendation.metric.MMF(utility_list, ratio=0.5, weights=None, group_mask=None)

Calculate the Max-min Fairness (MMF) index based on a given utility list.

Parameters:

utility_list – array-like A list or array representing the utilities of resources or users.
ratio – float, optional The fraction of the minimum utilities to consider for the MMF calculation. Defaults to 0.5.
weights – array-like, optional An optional list or array of weights corresponding to each utility in utility_list. If provided, utilities are multiplied by their respective weights before sorting. Defaults to None, implying equal weighting.
group_mask – array-like, optional An optional list or array used to selectively apply weights. If provided, it must have the same length as utility_list and weights. Defaults to None, indicating no group-based weighting.

Returns:

The computed MMF index, representing the fairness of the allocation.
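One plausible reading of the parameters above is that the index is the share of total utility held by the lowest `ratio` fraction of entries. The sketch below implements that reading only. It is an assumption, not the library's confirmed formula, and `mmf_sketch` is a hypothetical name.

```python
def mmf_sketch(utility_list, ratio=0.5):
    """Share of total utility held by the worst-off fraction (assumed definition)."""
    values = sorted(utility_list)        # ascending: worst-off entries first
    count = int(ratio * len(values))     # how many of the smallest utilities to keep
    total = sum(values)
    if total == 0:
        return 0.0
    return sum(values[:count]) / total


mmf = mmf_sketch([4, 1, 3, 2], ratio=0.5)  # (1 + 2) / 10
```

Under this reading, a perfectly even allocation with `ratio=0.5` gives 0.5, and the value shrinks as the worst-off entries receive less.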

fairdiverse.recommendation.metric.MRR(ranking_list, k)

MRR (Mean Reciprocal Rank) averages, over all users, the reciprocal rank of the first relevant item found by an algorithm.

\[\mathrm {MRR@K} = \frac{1}{|U|}\sum_{u \in U} \frac{1}{\operatorname{rank}_{u}^{*}}\]

\(\operatorname{rank}_{u}^{*}\) is the rank position of the first relevant item found by an algorithm for a user \(u\).
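The MRR@K formula can be sketched as below. The input format is an assumption: each user's list is taken to hold binary relevance flags in ranked order, which may differ from what `ranking_list` contains in the library. The helper names are hypothetical.

```python
def reciprocal_rank_at_k(relevance, k):
    """1 / position of the first relevant entry within the top k, or 0.0 if none."""
    for position, rel in enumerate(relevance[:k], start=1):
        if rel:
            return 1.0 / position
    return 0.0


def mrr_at_k(relevance_lists, k):
    """Mean of the per-user reciprocal ranks."""
    return sum(reciprocal_rank_at_k(r, k) for r in relevance_lists) / len(relevance_lists)


users = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]  # relevance flags per user, best-ranked first
mrr = mrr_at_k(users, k=3)                 # (1/2 + 1 + 0) / 3
```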

fairdiverse.recommendation.metric.MinMaxRatio(utility_list, weights=None, group_mask=None)

This function computes the minimum-to-maximum ratio of a list of utilities, optionally weighted and grouped.

Parameters:

utility_list (list of float) – A list containing numerical utility values.
weights (list, optional) – A list of weights corresponding to the utilities in utility_list. If provided, each utility is multiplied by its respective weight. If None, all utilities are considered with equal weight.
group_mask (list of int or bool, optional) – A mask indicating groups within the utility_list. If provided, it must be of the same length as utility_list. Groups are defined by consecutive True or 1 values. If None, no grouping is applied.

Returns:

float: The computed minimum-to-maximum ratio of the (weighted) utilities.
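A minimal sketch of the weighted minimum-to-maximum ratio follows. The name `min_max_ratio_sketch` is hypothetical and the group mask is omitted, so this illustrates the idea rather than reproducing the library's code.

```python
def min_max_ratio_sketch(utility_list, weights=None):
    """Smallest (weighted) utility divided by the largest one."""
    values = list(utility_list)
    if weights is not None:
        # Multiply each utility by its weight, as the parameter description states.
        values = [u * w for u, w in zip(values, weights)]
    return min(values) / max(values)


plain = min_max_ratio_sketch([2, 4, 8])                        # 2 / 8
weighted = min_max_ratio_sketch([2, 4, 8], weights=[2, 1, 1])  # 4 / 8
```

A value of 1 means every utility is equal, and values near 0 mean the worst-off entry receives almost nothing relative to the best-off one.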

fairdiverse.recommendation.metric.NDCG(ranking_list, label_list, k)

NDCG (also known as normalized discounted cumulative gain) is a measure of ranking quality, where positions are discounted logarithmically. It accounts for the position of the hit by assigning higher scores to hits at top ranks.

\[\mathrm {NDCG@K} = \frac{1}{|U|}\sum_{u \in U} (\frac{1}{\sum_{i=1}^{\min (|R(u)|, K)} \frac{1}{\log _{2}(i+1)}} \sum_{i=1}^{K} \delta(i \in R(u)) \frac{1}{\log _{2}(i+1)})\]

\(\delta(\cdot)\) is an indicator function.
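For a single user, the NDCG@K formula with binary relevance can be sketched as follows. The helper names are hypothetical, and the input format (ranked item IDs plus a list of ground-truth IDs) is an assumption.

```python
import math


def dcg_at_k(relevance, k):
    """Sum of relevance / log2(position + 1) over the top k positions."""
    return sum(rel / math.log2(i + 1) for i, rel in enumerate(relevance[:k], start=1))


def ndcg_at_k(ranking_list, label_list, k):
    """DCG of the ranking divided by the DCG of an ideal ranking."""
    relevant = set(label_list)
    hits = [1 if item in relevant else 0 for item in ranking_list[:k]]
    # The ideal ranking places min(|R(u)|, K) relevant items at the top.
    ideal = [1] * min(len(relevant), k)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(hits, k) / idcg if idcg > 0 else 0.0


late_hit = ndcg_at_k([3, 1, 2], [2], k=3)   # hit at position 3: (1 / log2(4)) / 1
early_hit = ndcg_at_k([2, 1, 3], [2], k=3)  # hit at position 1: ideal ordering
```

Averaging `ndcg_at_k` over all users gives the quantity in the formula above.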

fairdiverse.recommendation.metric.dcg(scores, k)

Calculate the Discounted Cumulative Gain (DCG) at rank k.

fairdiverse.recommendation.metric.mask_utility(utility, group_mask)

Mask the utility values based on the provided group mask.

This function filters out the utility values where the corresponding group mask element is zero, effectively removing them from the output.

Parameters:

utility – array of item/user utilities
group_mask – boolean array indicating which groups' utilities to include in the computation

Returns:

masked utility array

fairdiverse.recommendation.metric.reconstruct_utility(utility_list, weights, group_mask)

Reconstruct the utilities by re-weighting them and masking out the utilities of unused groups.

Parameters:

utility_list – array of item/user utilities
weights – array of item/user utility weights
group_mask – boolean array indicating which groups' utilities to include in the computation

Returns:

re-constructed utility array
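The two helpers `mask_utility` and `reconstruct_utility` can be pictured with the pure-Python sketch below. The `_sketch` names are hypothetical, the default arguments are added for illustration, and applying weights before the mask is an assumption about the order of operations.

```python
def mask_utility_sketch(utility, group_mask):
    """Keep only the utilities whose mask entry is truthy (non-zero)."""
    return [u for u, keep in zip(utility, group_mask) if keep]


def reconstruct_utility_sketch(utility_list, weights=None, group_mask=None):
    """Re-weight the utilities, then drop the masked-out groups (assumed order)."""
    utilities = list(utility_list)
    if weights is not None:
        utilities = [u * w for u, w in zip(utilities, weights)]
    if group_mask is not None:
        utilities = mask_utility_sketch(utilities, group_mask)
    return utilities


rebuilt = reconstruct_utility_sketch([1.0, 2.0, 3.0], [2, 2, 2], [True, False, True])
```

In this example the second group is masked out, so only the re-weighted first and third utilities remain.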