Parameter settings for data pre-processing¶
(Default values are in ~/recommendation/properties/dataset.yaml)
The benchmark provides several arguments that describe the basic settings of the data pre-processing step.
See below for details:
Required parameters¶
Cache setup¶
reprocess (bool)
: Whether to redo the preprocessing with the new parameters instead of reusing the cached files in ~/recommendation/process_dataset.
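The effect of reprocess can be pictured with a minimal sketch. This is not the benchmark's own code: the helper name should_run_preprocessing and the rule "an empty or missing cache directory forces preprocessing" are assumptions made for illustration only.

```python
from pathlib import Path


def should_run_preprocessing(reprocess: bool, cache_dir: Path) -> bool:
    """Hypothetical helper: decide whether preprocessing has to run.

    Assumption: a cache is considered usable when cache_dir exists
    and contains at least one entry.
    """
    if reprocess:
        # The user asked for a fresh run, so any cached files are ignored.
        return True
    cache_is_usable = cache_dir.is_dir() and any(cache_dir.iterdir())
    return not cache_is_usable


if __name__ == "__main__":
    cache = Path("~/recommendation/process_dataset").expanduser()
    print(should_run_preprocessing(reprocess=False, cache_dir=cache))
```

With reprocess set to true the cache is never consulted, which is the safe choice after changing any of the parameters below.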
Data directory setup¶
ground_truth (str)
: The path of the ground truth data; defaults to ground_truth.
doc_content_dir (str)
: The path of the document content; defaults to clueweb09_doc_content_dir.
query_suggestion (str)
: The path of the query suggestions; defaults to query_suggestion.xml.
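Before launching a run it can help to check that the three data paths actually exist. The sketch below uses the defaults listed above; the helper missing_data_paths and the idea of resolving the paths against a base directory are assumptions, since this page does not say what the paths are relative to.

```python
from pathlib import Path

# Defaults as documented on this page.
DATA_PATH_DEFAULTS = {
    "ground_truth": "ground_truth",
    "doc_content_dir": "clueweb09_doc_content_dir",
    "query_suggestion": "query_suggestion.xml",
}


def missing_data_paths(base_dir, overrides=None):
    """Hypothetical helper: return the names of parameters whose path is absent.

    Assumption: relative paths are resolved against base_dir;
    absolute paths are used unchanged.
    """
    settings = {**DATA_PATH_DEFAULTS, **(overrides or {})}
    base = Path(base_dir).expanduser()
    return sorted(
        name for name, location in settings.items()
        if not (base / location).exists()
    )


if __name__ == "__main__":
    # The base directory here is only an example value.
    print(missing_data_paths("~/recommendation"))
```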
Embedding setup¶
embedding_dir (str)
: The path of the query and document embeddings; defaults to embedding.
embedding_type (str)
: The embedding type of the queries and documents; defaults to doc2vec.
embedding_length (int)
: The embedding length of the queries and documents; defaults to 100.
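As an illustration of how the embedding parameters and their types fit together, here is a small sketch that merges user overrides into the documented defaults and checks the types stated above. The function resolve_embedding_settings is hypothetical, and the positivity check on embedding_length is an assumption rather than documented behavior.

```python
# Defaults and types as documented on this page.
EMBEDDING_DEFAULTS = {
    "embedding_dir": "embedding",
    "embedding_type": "doc2vec",
    "embedding_length": 100,
}
EXPECTED_TYPES = {
    "embedding_dir": str,
    "embedding_type": str,
    "embedding_length": int,
}


def resolve_embedding_settings(overrides=None):
    """Hypothetical helper: apply overrides on top of the defaults."""
    settings = dict(EMBEDDING_DEFAULTS)
    for key, value in (overrides or {}).items():
        if key not in EXPECTED_TYPES:
            raise KeyError(f"unknown embedding parameter: {key}")
        expected = EXPECTED_TYPES[key]
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError(f"{key} must be of type {expected.__name__}")
        settings[key] = value
    # Assumption: a non-positive embedding length is never meaningful.
    if settings["embedding_length"] <= 0:
        raise ValueError("embedding_length must be positive")
    return settings


if __name__ == "__main__":
    print(resolve_embedding_settings({"embedding_length": 300}))
```

Whatever embedding_length is chosen presumably has to match the dimensionality of the vectors stored under embedding_dir, so the two are best changed together.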