utils.utils
- fairdiverse.search.utils.utils.get_metrics_20(csv_file_path)
Reads evaluation results from a CSV file and returns the mean of each metric at rank cutoff 20.
- Parameters:
csv_file_path – The path to the CSV file containing evaluation results.
- Returns:
A tuple containing the mean values of alpha-nDCG@20, NRBP@20, ERR-IA@20, and strec@20.
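The averaging step described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name `mean_metrics_20` and the column headers are assumptions, since the layout of the evaluation CSV is not documented here.

```python
import csv
import io

# Assumed column names; the real evaluation CSV may label them differently.
METRIC_COLUMNS = ("alpha-nDCG@20", "NRBP@20", "ERR-IA@20", "strec@20")


def mean_metrics_20(csv_file):
    """Return the mean of each metric column as a tuple (hypothetical helper)."""
    rows = list(csv.DictReader(csv_file))
    if not rows:
        raise ValueError("no evaluation rows found")
    return tuple(
        sum(float(row[col]) for row in rows) / len(rows)
        for col in METRIC_COLUMNS
    )


# Two made-up per-topic rows, held in memory instead of on disk.
sample = io.StringIO(
    "topic,alpha-nDCG@20,NRBP@20,ERR-IA@20,strec@20\n"
    "1,0.50,0.20,0.30,0.60\n"
    "2,0.70,0.40,0.50,0.80\n"
)
means = mean_metrics_20(sample)
```

The result is a four-element tuple in the same order as the metrics listed in the return description.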
- fairdiverse.search.utils.utils.get_rel_feat(path)
Loads and scales the relevance features from a CSV file.
- Parameters:
path – Path to the CSV file containing the relevance features.
- Returns:
A dictionary where the key is a tuple (query, doc) and the value is a list of features.
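To make the shape of the returned dictionary concrete, here is a small sketch. The scaling method is not stated above, so column-wise min-max scaling is an assumption, as are the helper name `load_scaled_features` and the column layout (query, doc, then feature values).

```python
import csv
import io


def load_scaled_features(csv_file):
    """Map (query, doc) to min-max scaled features (hypothetical helper)."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row (assumed to be present)
    keys, rows = [], []
    for query, doc, *values in reader:
        keys.append((query, doc))
        rows.append([float(v) for v in values])
    if not rows:
        return {}
    # Column-wise minimum and maximum across all rows.
    lows = [min(col) for col in zip(*rows)]
    highs = [max(col) for col in zip(*rows)]
    scaled = {}
    for key, row in zip(keys, rows):
        # A constant column has no range, so it is mapped to 0.0.
        scaled[key] = [
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ]
    return scaled


sample = io.StringIO(
    "query,doc,f1,f2\n"
    "q1,d1,1.0,10.0\n"
    "q1,d2,3.0,10.0\n"
    "q2,d3,2.0,10.0\n"
)
features = load_scaled_features(sample)
```

Each key is a `(query, doc)` tuple and each value is a plain list of floats, matching the return description.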
- fairdiverse.search.utils.utils.load_embedding(filename, sep='\t')
Loads embeddings from a file.
- Parameters:
filename – The embedding file name.
sep – The character used as the separator.
- Returns:
A dictionary with the item name as key and the embedding vector as value.
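A possible reading of the embedding file format is one item per line, with the item name first and the vector components after it, all separated by `sep`. That layout is an assumption, and `parse_embeddings` below is a hypothetical helper written only to illustrate the resulting dictionary.

```python
import io


def parse_embeddings(lines, sep="\t"):
    """Build {item name: embedding vector} from delimited lines (sketch)."""
    embeddings = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        name, *values = line.split(sep)
        embeddings[name] = [float(v) for v in values]
    return embeddings


# In-memory stand-in for an embedding file with two items.
sample = io.StringIO("doc1\t0.1\t0.2\ndoc2\t0.3\t0.4\n")
emb = parse_embeddings(sample)
```

Passing a different `sep`, such as `" "` or `","`, would handle files that use another delimiter.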
- fairdiverse.search.utils.utils.pkl_load(filename)
Loads a pickle file and returns the data inside it.
- Parameters:
filename – Path to the pickle file.
- Returns:
The loaded data from the pickle file.
- fairdiverse.search.utils.utils.pkl_save(data_dict, filename)
Saves a dictionary to a compressed pickle file.
- Parameters:
data_dict – The dictionary to be saved.
filename – The path where the pickle file should be saved.
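A save and load round trip in the spirit of pkl_save and pkl_load might look like the sketch below. The compression scheme is not specified above, so gzip is an assumption, and `save_compressed` and `load_compressed` are hypothetical stand-ins rather than the library functions.

```python
import gzip
import os
import pickle
import tempfile


def save_compressed(data_dict, filename):
    """Write a dictionary to a gzip-compressed pickle file (sketch)."""
    with gzip.open(filename, "wb") as fh:
        pickle.dump(data_dict, fh)


def load_compressed(filename):
    """Read back whatever object was stored by save_compressed (sketch)."""
    with gzip.open(filename, "rb") as fh:
        return pickle.load(fh)


# Round trip through a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "features.pkl.gz")
    save_compressed({"q1": [0.1, 0.2]}, path)
    restored = load_compressed(path)
```

As with any pickle-based storage, files should only be loaded from trusted sources, because unpickling can execute arbitrary code.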
- fairdiverse.search.utils.utils.read_rel_feat(path)
Reads relevance features from a CSV file and returns them in a nested dictionary format.
- Parameters:
path – Path to the CSV file containing the relevance features.
- Returns:
A nested dictionary where the key is a query and the value is another dictionary of documents and features.
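The nested return structure can be illustrated with a short sketch. The column layout (query, doc, then feature values) and the helper name `read_nested_features` are assumptions made for the example.

```python
import csv
import io


def read_nested_features(csv_file):
    """Build {query: {doc: [features]}} from CSV rows (hypothetical helper)."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row (assumed to be present)
    nested = {}
    for query, doc, *values in reader:
        # Create the inner dictionary the first time a query is seen.
        nested.setdefault(query, {})[doc] = [float(v) for v in values]
    return nested


sample = io.StringIO(
    "query,doc,f1,f2\n"
    "q1,d1,0.1,0.2\n"
    "q1,d2,0.3,0.4\n"
    "q2,d3,0.5,0.6\n"
)
nested = read_nested_features(sample)
```

Looking up `nested["q1"]["d2"]` then yields the feature list for that query and document pair.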
- fairdiverse.search.utils.utils.remove_duplicate(input_path, output_path)
Removes duplicate documents in the ranking list.
- Parameters:
input_path – The path to the input file containing the ranking list.
output_path – The path where the cleaned ranking list will be saved.
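One way to deduplicate a ranking list is to keep the first occurrence of each document per query and drop later repeats. The sketch below assumes whitespace-separated lines whose first two fields are the query and document IDs; both that format and the helper name `dedupe_ranking` are assumptions, and file reading and writing are left out for brevity.

```python
def dedupe_ranking(lines):
    """Keep the first occurrence of each (query, doc) pair, in order (sketch)."""
    seen = set()
    kept = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        key = (fields[0], fields[1])
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept


ranking = ["q1 d1", "q1 d2", "q1 d1", "q2 d1"]
cleaned = dedupe_ranking(ranking)
```

The same document may still appear under different queries, since duplicates are tracked per query.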
- fairdiverse.search.utils.utils.restore_doc_ids(order_str, id_dict)
Restores document IDs based on an ordered list of indices and a dictionary of document IDs.
- Parameters:
order_str – A string representing the order of document indices.
id_dict – A dictionary mapping indices to document IDs.
- Returns:
A list of document IDs in the restored order.
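The mapping from an index order back to document IDs can be sketched in one line. The exact format of `order_str` is not documented above, so whitespace-separated integer indices and integer dictionary keys are assumptions, and `restore_ids` is a hypothetical stand-in for the library function.

```python
def restore_ids(order_str, id_dict):
    """Translate an index order string into document IDs (sketch)."""
    # Assumes indices are whitespace-separated integers that exist in id_dict.
    return [id_dict[int(token)] for token in order_str.split()]


ids = restore_ids("2 0 1", {0: "doc-a", 1: "doc-b", 2: "doc-c"})
```

Here the order string ranks index 2 first, so `ids` starts with the document stored under key 2.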