Database-Fulltext-Search

Improving the effectiveness of keyword search on relational data using effective retrieval subsets.

Project Structure:

  • Cache Enhancement: indexes all documents in a directory and its subdirectories, filters the parts of the resulting index that satisfy a given condition, and measures query difficulty metrics against the selected part of the index. The main modules are:
    • build.py: used to build an index based on a directory of documents;
    • partition.py: used to build a virtual partition on top of the main index;
    • querydifficulty.py: implements our main query difficulty metrics;
    • enhancer/describe.py: used to compare two different virtual partitions, treating them as two giant documents;
    • enhancer/solutions.py: used to recursively refine a virtual partition by removing documents from it to increase its difference from another base partition.
  • MSLR: applies learning to rank to the MSLR dataset;
  • rrank-analysis: analyzes the effect of cache size on reciprocal ranks;
  • ML-evaluate: evaluates machine learning models as the cache selection algorithm;
  • ML-prepare
  • Cluster Analysis
  • text-classification
  • wiki13
  • wikipagecount
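
The index, virtual partition, and query difficulty steps described under Cache Enhancement can be illustrated with a minimal sketch. It is not the code in build.py, partition.py, or querydifficulty.py: the function names `build_index`, `partition`, and `avg_idf` are hypothetical, documents are held in memory rather than read from a directory, and average IDF stands in as one common pre-retrieval difficulty predictor, which may differ from the metrics the project actually uses.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted index {term: {doc_id: term frequency}}.

    `docs` is an in-memory {doc_id: text} mapping (an assumption made
    here for brevity; the project indexes files under a directory).
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def partition(index, keep):
    """Return a filtered view of the index (a "virtual partition").

    Only postings whose doc_id satisfies the predicate `keep` survive;
    terms left with no postings are dropped.
    """
    view = {}
    for term, postings in index.items():
        kept = {d: tf for d, tf in postings.items() if keep(d)}
        if kept:
            view[term] = kept
    return view


def avg_idf(query, index, n_docs):
    """Average smoothed IDF of the query terms within `index`.

    Used here as a stand-in difficulty metric (hypothetical choice).
    """
    terms = query.lower().split()
    idfs = [math.log((n_docs + 1) / (len(index.get(t, {})) + 1)) for t in terms]
    return sum(idfs) / len(idfs) if idfs else 0.0


# Toy corpus with made-up document ids and text.
docs = {
    "a/1.txt": "cache search index",
    "a/2.txt": "search ranking",
    "b/3.txt": "cache size",
}
index = build_index(docs)
part = partition(index, lambda doc_id: doc_id.startswith("a/"))

full_score = avg_idf("cache", index, n_docs=3)
part_score = avg_idf("cache", part, n_docs=2)
```

Comparing `full_score` with `part_score` shows how the same query can score differently against the full index and against a selected subset, which is the kind of measurement the partition-level metrics are meant to support.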