apt-xapian-index: Everything You Always Wanted to Index About Debian Packages, But Were Afraid to Ask

Enrico Zini, enrico@debian.org, 23 February 2008


  1. apt-xapian-index: Everything You Always Wanted to Index About Debian Packages, But Were Afraid to Ask
     Enrico Zini, enrico@debian.org, 23 February 2008

  2. Outline
     1. Introduction (Please help me with the notes; Introduction)
     2. The solution (A tour of apt-xapian-index; Code examples)
     3. Interesting bits still to figure out
     4. HELP!

  3. Please help me with the notes
     1. apt-get install gobby
     2. Run gobby
     3. Connect to the session at 192.168.42.217, port 6522, password enrico
     4. Join document notes.txt

  5. The problem
     What I want to see happening: build smart interfaces to browse the large Debian archive.
     The first problem I think needs solving:
     - The only fast package index we have at the moment is APT
     - The task of the APT index is to solve dependencies
     - APT shouldn’t be expanded (bloated) to do much more
     Solution: create another index to complement APT

  6. What the new index should have
     - Fast full text searches
     - Fast tag searches
     - Extensible, to accommodate new ideas for data to index

  8. A tour of apt-xapian-index: The technology
     - Sits in /var/lib/apt-xapian-index/index
     - Based on Xapian
     - Indexes text as well as numbers and dates
     - Decent bindings in all sorts of languages
     - Stretchable and abusable to great lengths
     - Self documented in /var/lib/apt-xapian-index/README

  9. A tour of apt-xapian-index: Indexing
     - Done by /usr/sbin/update-apt-xapian-index
     - Can be run interactively
     - Runs in a weekly cron job
     - Packages can inject extra data by adding plugins in /usr/share/apt-xapian-index/plugins
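
The slides do not show what a plugin looks like. As a rough sketch only, a plugin could be a small Python class that adds extra terms to each package's document while the index is built. Everything below is hypothetical and not taken from apt-xapian-index itself: the class name, the `index` method, the `XPOP` term prefix and the installation counts are all made up for illustration. A stand-in document class is used so the sketch runs without Xapian; its `add_term` method mirrors the method of the same name on a Xapian document.

```python
class FakeDocument:
    """Stand-in for a Xapian document: it only records the terms added to it."""

    def __init__(self):
        self.terms = []

    def add_term(self, term):
        self.terms.append(term)


class PopularityPlugin:
    """Hypothetical plugin: marks each package with a coarse popularity bucket."""

    def __init__(self, installs):
        # package name -> number of installations (made-up input data)
        self.installs = installs

    def index(self, document, package_name):
        count = self.installs.get(package_name, 0)
        if count >= 10000:
            bucket = "high"
        elif count >= 100:
            bucket = "medium"
        else:
            bucket = "low"
        # "XPOP" is an invented prefix, used here only to keep the term distinct
        document.add_term("XPOP" + bucket)


plugin = PopularityPlugin({"goplay": 250, "libc6": 90000})
doc = FakeDocument()
plugin.index(doc, "goplay")
print(doc.terms)  # ['XPOPmedium']
```

The actual plugin interface may look quite different; the point is only that each dataset contributes its own terms or values without touching the core indexer.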

  10. A tour of apt-xapian-index: Searching
      - You just need the plain Xapian API
      - /var/lib/apt-xapian-index/README documents the index layout
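
As an illustration of searching with the plain Xapian API, here is a minimal sketch using the Xapian Python bindings. It rests on several assumptions that should be checked against the README: that the index lives in an `index` directory next to the README, that Debtags tags are stored as terms with an `XT` prefix, and that each document's data holds the package name. The helper names `build_terms` and `search` are made up for this example.

```python
INDEX_PATH = "/var/lib/apt-xapian-index/index"  # assumed location of the index


def build_terms(keywords, tags):
    """Turn user keywords and Debtags tags into index terms.

    Assumption: tags are stored with an "XT" prefix; the README is the
    authority on the real layout.
    """
    terms = [word.lower() for word in keywords]
    terms += ["XT" + tag for tag in tags]
    return terms


def search(keywords, tags=(), limit=10, path=INDEX_PATH):
    """Return (percent, document data) pairs for the best matches."""
    import xapian  # imported here so build_terms works without the bindings

    db = xapian.Database(path)
    enquire = xapian.Enquire(db)
    # Require every keyword and every tag to match
    query = xapian.Query(xapian.Query.OP_AND, build_terms(keywords, tags))
    enquire.set_query(query)
    return [(m.percent, m.document.get_data()) for m in enquire.get_mset(0, limit)]


print(build_terms(["Image", "editor"], ["role::program"]))
# ['image', 'editor', 'XTrole::program']
```

On a system with the index and the bindings installed, `search(["image", "editor"], ["role::program"])` would then list the best-matching documents. Keywords are matched unstemmed here; a fuller version would run them through the same stemmer that was used at indexing time.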

  11. Tools using it
      - goplay (golearn, goadmin, ...)
      - debtags.debian.net (just started)

  13. This page is sneakily left blank to divert your attention elsewhere.

  14. Getting more data into the system
      My proposal:
      - One package per dataset to get
      - Ship a copy of the dataset in the package, to use if everything fails
      - A tool that can be run to fetch the data, or a plugin system to fetch the data using a single tool instead?
      - Download new versions using a cron job
      - Provide the data somewhere under /var
      - Add an apt-xapian-index plugin to index it
      For example: popcon, bts statistics, iterating.org
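
The "ship a copy, download newer versions" part of the proposal can be sketched as a simple fallback rule: use the copy refreshed by the cron job if it exists, otherwise use the one shipped in the package. This is only an illustration; the function name `pick_dataset` and the file names are hypothetical, and temporary files stand in for the real locations under /var and in the package.

```python
import os
import tempfile


def pick_dataset(downloaded_path, shipped_path):
    """Prefer the copy refreshed by the cron job; fall back to the packaged one."""
    if os.path.exists(downloaded_path) and os.path.getsize(downloaded_path) > 0:
        return downloaded_path
    return shipped_path


# Demonstration with temporary files instead of real system paths
with tempfile.TemporaryDirectory() as tmp:
    shipped = os.path.join(tmp, "shipped-popcon.txt")
    downloaded = os.path.join(tmp, "downloaded-popcon.txt")

    with open(shipped, "w") as f:
        f.write("goplay 250\n")  # made-up data
    before = pick_dataset(downloaded, shipped)  # nothing downloaded yet

    with open(downloaded, "w") as f:
        f.write("goplay 300\n")  # made-up data
    after = pick_dataset(downloaded, shipped)  # the fresh copy now wins

print(os.path.basename(before), os.path.basename(after))
# shipped-popcon.txt downloaded-popcon.txt
```

An indexing plugin would then read whichever file this returns, so indexing keeps working even if the download never succeeded.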

  15. More indexing ideas: Debian specific stemming
      - “lib foo” becomes “library” and “foo”; “deb foo” becomes “debian” and “foo”
      - “cvsdelta”, “cvsgraph”, “gnomecatalog”, “gnomeradio”, “gnu something” (but not “gnustep”), “kde something”...
      - More generally, how to index “Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz”?
      - How to provide the same stemming algorithm at query time?
      - Compensate with improved descriptions?
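
To make the prefix-splitting idea concrete, here is a naive sketch in plain Python. The "lib" and "deb" expansions and the "gnustep" exception come from the slide; the prefix table, the exception list and the function name `debian_terms` are assumptions made for this example and are not part of apt-xapian-index.

```python
import re

# Expansions named on the slide
EXPANSIONS = {"lib": "library", "deb": "debian"}
# Prefixes to split off; an illustrative list, not an exhaustive one
PREFIXES = ("lib", "deb", "cvs", "gnome", "gnu", "kde")
# Words that must stay whole even though they start with a known prefix
WHOLE_WORDS = {"gnustep", "debian", "library"}


def debian_terms(name):
    """Split a package-name-like word into search terms."""
    terms = []
    for word in re.split(r"[^a-z0-9]+", name.lower()):
        if not word:
            continue
        if word in WHOLE_WORDS:
            terms.append(word)
            continue
        for prefix in PREFIXES:
            if word.startswith(prefix) and len(word) > len(prefix):
                terms.append(EXPANSIONS.get(prefix, prefix))
                terms.append(word[len(prefix):])
                break
        else:
            # No known prefix: keep the word unchanged
            terms.append(word)
    return terms


print(debian_terms("libfoo"), debian_terms("gnomeradio"), debian_terms("gnustep"))
# ['library', 'foo'] ['gnome', 'radio'] ['gnustep']
```

The sketch also shows why this is hard: a word such as "debug" would be wrongly split unless it is added to the exception list. Whatever rules are chosen, the same function has to be applied to the user's query as well, which is the query-time question raised on the slide.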

  16. More indexing ideas: What else to index?
      - popcon
      - bts statistics
      - iterating.com
      - more ideas?

  17. i18n
      - How about searching translated descriptions?
      - Xapian already supports stemming for many languages
      - Is it useful, with such short descriptions?
      - One index per language?
      - How about disk space, and indexing time?

  18. Index update
      Can it be improved?
      - Incremental updates
      - Need to track what’s new after an apt-get update
      - Increases index size
      - Suid update script to run goplay right after installing it
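
One way to think about "tracking what's new" is to compare the package versions recorded at the last indexing run with those available after an apt-get update, and re-index only the differences. The sketch below is an assumption about how that could work, not a description of the existing update script; the function name `plan_update` and the version numbers are made up.

```python
def plan_update(indexed, available):
    """Compare package -> version maps and list what needs work.

    Returns (packages to index or re-index, packages to remove).
    """
    to_index = sorted(
        name for name, version in available.items()
        if indexed.get(name) != version
    )
    to_remove = sorted(name for name in indexed if name not in available)
    return to_index, to_remove


# Made-up snapshots: what the index knows vs. what APT now offers
indexed = {"goplay": "0.3", "gobby": "0.4.5", "oldpkg": "1.0"}
available = {"goplay": "0.4", "gobby": "0.4.5", "newpkg": "2.1"}

print(plan_update(indexed, available))
# (['goplay', 'newpkg'], ['oldpkg'])
```

The cost noted on the slide follows from this: the previous snapshot has to be stored somewhere, which takes extra space on top of the index.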
