ranklib_ranker
- class fairdiverse.search.ranker_model.ranklib_ranker.RankLib(configs, dataset)
Bases: Ranker
Wrapper class for running the ranking models available in the RankLib library. For details on the available models and their parameters, see the official documentation: https://sourceforge.net/p/lemur/wiki/RankLib%20How%20to%20use/
- assign_judgement(x, th, cols)
Assigns a judgement score to each document of a single query, based on the document's relevance to that query.
- :param x: pandas.DataFrame
The subset of data belonging to a single query.
- :param th: float
The threshold for classifying relevance.
- :param cols: list
The list of feature columns.
- :return: pandas.DataFrame
The data with the assigned judgement scores.
- create_ranklib_data(cols, data, out_dir, split)
Formats the data according to RankLib’s required input format and writes it to a text file.
- :param cols: list
The list of feature columns.
- :param data: pandas.DataFrame
The data to be written to a text file.
- :param out_dir: Path
The output directory where the file will be saved.
- :param split: str
The data split, either “train” or “test”.
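As a rough illustration of the target format: RankLib reads a LETOR/SVMlight-style layout with one document per line, `<label> qid:<query id> <feature index>:<value> ... # <comment>`. The sketch below builds such lines from plain dictionaries (for a DataFrame, `data.to_dict("records")` yields the same shape). The helper name `to_ranklib_lines` and the column names `judgement`, `qid`, and `doc_id` are assumptions made for illustration, not the names fairdiverse uses.

```python
def to_ranklib_lines(records, cols, label_col, query_col, id_col):
    """Build RankLib input lines from a list of row dictionaries.

    Hypothetical helper: the label, query, and id column names are
    assumptions for this sketch, not fairdiverse's actual columns.
    """
    lines = []
    for rec in records:
        # RankLib feature indices are 1-based and follow the order of `cols`.
        feats = " ".join(f"{i}:{rec[c]}" for i, c in enumerate(cols, start=1))
        # The document id goes in the trailing comment so that scores can be
        # matched back to rows later.
        lines.append(f"{rec[label_col]} qid:{rec[query_col]} {feats} # {rec[id_col]}")
    return lines


rows = [
    {"qid": 1, "doc_id": "a", "f1": 0.5, "f2": 1.0, "judgement": 1},
    {"qid": 1, "doc_id": "b", "f1": 0.25, "f2": 2.0, "judgement": 0},
]
lines = to_ranklib_lines(rows, ["f1", "f2"], "judgement", "qid", "doc_id")
print("\n".join(lines))
```

Writing the result to disk is then a matter of joining the lines with newlines and saving them under the output directory, one file per split.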
- generate_ranklib_data(data_train, data_test, run)
Prepares the training and testing data for RankLib by generating the feature matrix and label information in a format that RankLib can process.
- :param data_train: pandas.DataFrame
The training dataset.
- :param data_test: pandas.DataFrame
The testing dataset.
- :param run: str
The identifier for the current run.
- predict(data, run, file_name)
Generates predictions using the trained RankLib model.
This method reads the predictions produced by the trained model and saves them as a CSV file.
- :param data: pandas.DataFrame
The dataset on which predictions need to be made.
- :param run: str
The identifier for the current run.
- :param file_name: str
The name of the CSV file in which the predictions are saved.
- :return: pandas.DataFrame
A DataFrame containing the predictions.
- read_predictions(data, run)
Retrieves the learning-to-rank (LTR) predictions for the dataset.
This method loads the predictions produced by the trained RankLib model.
- :param data: pandas.DataFrame
The dataset for which predictions need to be made.
- :param run: str
The identifier for the run.
- :return: pandas.DataFrame
The dataset with predictions added.
- train(data_train, data_test, run)
Trains ranking models using RankLib.
This method generates RankLib-compatible training data and then runs the RankLib training script.
- :param data_train: pandas.DataFrame
The training dataset to be used for training the ranking model.
- :param data_test: pandas.DataFrame
The testing dataset to be used for evaluating the ranking model.
- :param run: str
The identifier for the current training run.
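For orientation, a RankLib training run is a Java command line. The sketch below only assembles such a command; it is not the training script that fairdiverse ships, and the helper name `build_train_command`, the jar location, and all file paths are hypothetical. The flags `-train`, `-test`, `-ranker`, `-metric2t`, and `-save` are standard RankLib options, and ranker id 6 selects LambdaMART.

```python
def build_train_command(jar_path, train_file, test_file, model_file,
                        ranker_id=6, metric="NDCG@10"):
    """Assemble a RankLib training command as an argument list.

    Hypothetical helper; every path passed in is supplied by the caller.
    """
    return [
        "java", "-jar", str(jar_path),
        "-train", str(train_file),    # training data in RankLib format
        "-test", str(test_file),      # held-out data evaluated after training
        "-ranker", str(ranker_id),    # 6 = LambdaMART
        "-metric2t", metric,          # metric optimised on the training data
        "-save", str(model_file),     # where the trained model is written
    ]


cmd = build_train_command("RankLib.jar", "train.txt", "test.txt", "model.txt")
print(" ".join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it, provided Java and the RankLib jar are available.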
- fairdiverse.search.ranker_model.ranklib_ranker.get_LTR_predict(data, out_dir, ranker, score_col, query_col, id_col)
Fetches RankLib prediction scores.
This function loads the prediction scores produced by the model and merges them with the provided dataset.
- :param data: pandas.DataFrame
The dataset that needs the predictions.
- :param out_dir: Path
The directory where the RankLib predictions are stored.
- :param ranker: str
The name of the ranking model used.
- :param score_col: str
The name of the score column in the dataset.
- :param query_col: str
The name of the column that identifies queries.
- :param id_col: str
The name of the column holding the unique identifier of each data point.
- :return: pandas.DataFrame
The dataset with added prediction scores.
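The merge step can be pictured as a lookup keyed by query and document. The sketch below uses plain dictionaries and assumes scores keyed by `(query, id)` pairs; the helper name `attach_scores` and this key layout are assumptions, since the source does not describe how fairdiverse performs the merge. With DataFrames, a `DataFrame.merge` on the query and id columns would play the same role.

```python
def attach_scores(records, scores, score_col, query_col, id_col):
    """Return copies of the rows with a predicted score added.

    Hypothetical helper: `scores` is assumed to map (query, id) pairs to
    floats. Rows without a prediction receive None.
    """
    merged = []
    for rec in records:
        key = (rec[query_col], rec[id_col])
        # Copy the row so the caller's data is left untouched.
        merged.append({**rec, score_col: scores.get(key)})
    return merged


rows = [{"qid": 1, "doc_id": "a"}, {"qid": 1, "doc_id": "b"}]
predicted = {(1, "a"): 0.9}
result = attach_scores(rows, predicted, "score", "qid", "doc_id")
print(result)
```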
- fairdiverse.search.ranker_model.ranklib_ranker.get_prediction_scores(pred_path)
Retrieves prediction scores from the latest RankLib experiment.
This function reads the predictions generated by the latest experiment and returns them.
- :param pred_path: Path
The directory containing the prediction files.
- :return: dict
A dictionary mapping document IDs to predicted scores.
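A minimal sketch of reading such scores, assuming the three-column, whitespace-separated layout that RankLib's `-score` option writes (query id, document index within the query, score). The helper name `parse_score_lines` is hypothetical, and how fairdiverse turns these entries into document IDs, or selects the latest experiment, is not specified in the source.

```python
def parse_score_lines(lines):
    """Parse RankLib score lines into {(qid, doc_index): score}.

    Assumes each non-empty line holds: query id, document index, score.
    """
    scores = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        qid, doc_index, score = line.split()
        scores[(qid, int(doc_index))] = float(score)
    return scores


sample = ["1\t0\t0.75", "1\t1\t-0.2", ""]
parsed = parse_score_lines(sample)
print(parsed)
```

To read from disk, the lines of an open file object can be passed in directly, since the function only iterates over its argument.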