utils.utils¶

fairdiverse.search.utils.utils.get_metrics_20(csv_file_path)[source]¶

Retrieves evaluation metrics from a CSV file for the top 20 documents.

Parameters: csv_file_path – The path to the CSV file containing evaluation results.
Returns: A tuple containing the mean values of alpha-nDCG@20, NRBP@20, ERR-IA@20, and strec@20.
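
As a rough illustration of what averaging the four metrics could look like, the sketch below reads a CSV with one row per query and returns the column means as a tuple. The helper name `mean_metrics_at_20`, the column labels, and the sample scores are assumptions made for this example; the real evaluation file may be laid out differently.

```python
import csv
import io
from statistics import mean

# Hypothetical column labels; the actual evaluation CSV may name them differently.
METRIC_COLUMNS = ("alpha-nDCG@20", "NRBP@20", "ERR-IA@20", "strec@20")


def mean_metrics_at_20(csv_file):
    """Return the per-column means of the four @20 metrics as a tuple."""
    rows = list(csv.DictReader(csv_file))
    return tuple(mean(float(row[col]) for row in rows) for col in METRIC_COLUMNS)


# Made-up scores for two queries, for illustration only.
sample = io.StringIO(
    "topic,alpha-nDCG@20,NRBP@20,ERR-IA@20,strec@20\n"
    "1,0.50,0.25,0.25,0.75\n"
    "2,1.00,0.75,0.75,0.25\n"
)
means = mean_metrics_at_20(sample)
print(means)
```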

fairdiverse.search.utils.utils.get_rel_feat(path)[source]¶

Loads and scales the relevance features from a CSV file.

Parameters: path – Path to the CSV file containing the relevance features.
Returns: A dictionary where the key is a tuple (query, doc) and the value is a list of features.
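
The shape of the returned dictionary can be sketched as follows. This example assumes a header row, `query` and `doc` in the first two columns, and per-column min-max scaling; the scaling method, the column layout, and the helper name `load_scaled_features` are all assumptions, since the text above only says the features are scaled.

```python
import csv
import io


def load_scaled_features(csv_file):
    """Build {(query, doc): [scaled features]} using per-column min-max scaling."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    keys, rows = [], []
    for query, doc, *values in reader:
        keys.append((query, doc))
        rows.append([float(v) for v in values])
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    # A constant column has zero range; fall back to 1.0 to avoid dividing by zero.
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return {
        key: [(v - lo) / span for v, lo, span in zip(row, lows, spans)]
        for key, row in zip(keys, rows)
    }


# Made-up feature values, for illustration only.
sample = io.StringIO("query,doc,f1,f2\nq1,d1,0,10\nq1,d2,5,20\nq2,d3,10,30\n")
features = load_scaled_features(sample)
print(features[("q1", "d2")])
```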

fairdiverse.search.utils.utils.load_embedding(filename, sep='\t')[source]¶

Loads embeddings from a file.

Parameters:

filename – The embedding file name.
sep – The character used as the separator.

Returns:

A dictionary with the item name as key and the embedding vector as value.
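
A minimal sketch of parsing such a file is shown below. It assumes each line holds an item name followed by the vector components, all separated by `sep`; that layout and the helper name `read_embeddings` are assumptions for illustration, not the library's confirmed format.

```python
import io


def read_embeddings(lines, sep="\t"):
    """Map each item name to its embedding vector (a list of floats)."""
    embeddings = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        name, *values = line.split(sep)
        embeddings[name] = [float(v) for v in values]
    return embeddings


# Made-up two-dimensional vectors, for illustration only.
sample = io.StringIO("doc1\t0.5\t-1.0\ndoc2\t2.0\t0.25\n")
vectors = read_embeddings(sample)
print(vectors["doc1"])
```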

fairdiverse.search.utils.utils.pkl_load(filename)[source]¶

Loads a pickle file and returns the data inside it.

fairdiverse.search.utils.utils.pkl_save(data_dict, filename)[source]¶

Saves a dictionary to a compressed pickle file.

Parameters:

data_dict – The dictionary to save.
filename – The path of the compressed pickle file to write.
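
The save and load pair can be pictured as a round trip. The sketch below assumes gzip compression, which the text above does not confirm (it only says "compressed"); the helper names `save_compressed` and `load_compressed` are hypothetical stand-ins.

```python
import gzip
import os
import pickle
import tempfile


def save_compressed(data, filename):
    """Pickle `data` into a gzip-compressed file (gzip is an assumption)."""
    with gzip.open(filename, "wb") as fh:
        pickle.dump(data, fh)


def load_compressed(filename):
    """Read back an object written by save_compressed."""
    with gzip.open(filename, "rb") as fh:
        return pickle.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "features.pkl.gz")
    original = {"q1": [0.1, 0.2]}
    save_compressed(original, path)
    restored = load_compressed(path)

print(restored == original)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.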

fairdiverse.search.utils.utils.read_rel_feat(path)[source]¶

Reads relevance features from a CSV file and returns them in a nested dictionary format.

Parameters: path – Path to the CSV file containing the relevance features.
Returns: A nested dictionary where the key is a query and the value is another dictionary of documents and features.
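
The nested layout described above (query, then document, then features) might be built as in the following sketch. The header row, the column order, and the helper name `read_nested` are assumptions for illustration.

```python
import csv
import io


def read_nested(csv_file):
    """Build {query: {doc: [features]}} from rows of query, doc, feature values."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    nested = {}
    for query, doc, *values in reader:
        nested.setdefault(query, {})[doc] = [float(v) for v in values]
    return nested


# Made-up feature values, for illustration only.
sample = io.StringIO("query,doc,f1,f2\nq1,d1,0,10\nq1,d2,5,20\nq2,d3,10,30\n")
nested = read_nested(sample)
print(nested["q1"])
```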

fairdiverse.search.utils.utils.remove_duplicate(input_path, output_path)[source]¶

Removes duplicate documents in the ranking list.

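Parameters:

input_path – The path to the input ranking list.
output_path – The path where the deduplicated ranking list is written.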
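
The core deduplication step can be illustrated on an in-memory ranking. This sketch keeps the first occurrence of each document and preserves rank order; the file format handled by the real function is not described here, so reading and writing are left out, and the name `drop_duplicates` is hypothetical.

```python
def drop_duplicates(ranking):
    """Keep the first occurrence of each document, preserving rank order."""
    # dict preserves insertion order, so later repeats are dropped.
    return list(dict.fromkeys(ranking))


deduped = drop_duplicates(["d3", "d1", "d3", "d2", "d1"])
print(deduped)
```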

fairdiverse.search.utils.utils.restore_doc_ids(order_str, id_dict)[source]¶

Restores document IDs based on an ordered list of indices and a dictionary of document IDs.

Parameters:

order_str – The ordered list of indices.
id_dict – The dictionary of document IDs.

Returns:

A list of document IDs in the restored order.
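
One way to picture the restoration is sketched below. It assumes `order_str` is a whitespace-separated string of integer indices and that `id_dict` maps each index to a document ID; both are guesses about the input format, and the name `restore_ids` and the sample IDs are hypothetical.

```python
def restore_ids(order_str, id_dict):
    """Translate a string of indices into document IDs, keeping the given order."""
    return [id_dict[int(token)] for token in order_str.split()]


# Made-up mapping from index to document ID, for illustration only.
restored = restore_ids("2 0 1", {0: "doc-a", 1: "doc-b", 2: "doc-c"})
print(restored)
```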

fairdiverse.search.utils.utils.split_list(origin_list, n)[source]¶

Splits the input list into smaller sublists of size n (or close to n).

Parameters:

origin_list – The list to split.
n – The target size of each sublist.

Returns:

A list of sublists.
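
A minimal sketch of splitting into chunks, assuming that "close to n" means the final sublist may be shorter than `n`. The actual function may balance the sublists differently; the name `chunk` is hypothetical.

```python
def chunk(items, n):
    """Split `items` into consecutive sublists of at most `n` elements."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return [items[i:i + n] for i in range(0, len(items), n)]


chunks = chunk(list(range(7)), 3)
print(chunks)
```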