Danny Bembibre (dbembibre) Crawlability Inc.

Setup Sphinx search in vBulletin

by Danny Bembibre, 11-16-2007 at 10:07 PM
To get Sphinx search working on my forums, I followed the steps posted in this thread: http://www.vbulletin.org/forum/showp...postcount=3877


Download the latest stable Sphinx release (0.9.7 at the time of writing) and unpack it
Code:
$ wget http://www.sphinxsearch.com/downloads/sphinx-0.9.7.tar.gz

$ tar xzf sphinx-0.9.7.tar.gz

$ cd sphinx-0.9.7
Configure, build, and install Sphinx

Code:
$ ./configure --prefix=/usr

checking build environment
--------------------------

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether to enable maintainer-specific portions of Makefiles... no

configuring Sphinx
------------------

checking whether to compile with MySQL support... (cached) yes
checking whether to compile with PostgreSQL support... (cached) no
checking MySQL includes... (cached) /usr/include/mysql
checking MySQL libraries... (cached) /usr/lib/mysql


$ make 
$ make install
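After make install, a quick sanity check is to confirm that the three Sphinx programs used in the rest of this entry (indexer, search, and searchd) can be found on the PATH. This is only a minimal sketch; the helper name check_tools is made up for illustration and is not part of Sphinx.

```shell
#!/bin/sh
# check_tools is a hypothetical helper, not part of Sphinx.
# It prints every named command that cannot be found on the PATH
# and returns non-zero if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage after installing Sphinx:
#   check_tools indexer search searchd && echo "Sphinx binaries found"
```

If anything is reported missing, the --prefix passed to configure is the first place to look, since the binaries should end up in its bin directory.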
Create the needed directories

Code:
$ mkdir -p /var/log/sphinx
$ mkdir -p /var/data/sphinx
Time to configure Sphinx

Code:
$ touch /etc/sphinx.conf
$ vi /etc/sphinx.conf

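Paste in the contents of the attached sphinx.conf.txt, replacing user, password, and forum_db with your own values.

Then create the sphinx_counter table in MySQL (presumably in the forum database that sphinx.conf points to). In the main+delta scheme from the Sphinx documentation, a table like this stores the highest document ID covered by the main index.
Code: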
mysql> CREATE TABLE sphinx_counter 
          (counter_id INTEGER PRIMARY KEY NOT NULL,       
           max_doc_id INTEGER NOT NULL );

Query OK, 0 rows affected (0.01 sec)
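The attached sphinx.conf is not reproduced in this entry, so the following is context only. In a main+delta setup, the main source typically records the highest document ID in the counter table before indexing, and the delta source selects only rows above that ID. The sketch below writes an illustrative fragment to a scratch file; the source names (post_main, post_delta), the column list, and the output path are assumptions and may not match the attached config.

```shell
#!/bin/sh
# Illustrative only. Source names, column list, and output path are assumptions;
# connection settings (sql_host, sql_user, sql_pass, sql_db) are left out.
out="${TMPDIR:-/tmp}/sphinx-main-delta-example.conf"

cat > "$out" <<'EOF'
source post_main
{
    type          = mysql
    # record the highest post ID that the main index will cover
    sql_query_pre = REPLACE INTO sphinx_counter SELECT 1, MAX(postid) FROM post
    sql_query     = SELECT postid, title, pagetext FROM post WHERE postid <= (SELECT max_doc_id FROM sphinx_counter WHERE counter_id = 1)
}

source post_delta
{
    type          = mysql
    # the delta covers only posts newer than the recorded ID
    sql_query     = SELECT postid, title, pagetext FROM post WHERE postid > (SELECT max_doc_id FROM sphinx_counter WHERE counter_id = 1)
}
EOF

echo "wrote $out"
```

The point of the split is that the small delta can be rebuilt often and cheaply, while the large main index is rebuilt only occasionally.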
Time to index

Code:
$ indexer --config /etc/sphinx.conf --all
Sphinx 0.9.7
Copyright (c) 2001-2007, Andrew Aksyonoff

using config file '/etc/sphinx.conf'...
indexing index 'post'...
collected 3570597 docs, 1008.4 MB
sorted 91.5 Mhits, 100.0% done
total 3570597 docs, 1008376598 bytes
total 351.204 sec, 2871197.49 bytes/sec, 10166.73 docs/sec
indexing index 'postdelta'...
collected 7 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 7 docs, 1133 bytes
total 0.014 sec, 79778.85 bytes/sec, 492.90 docs/sec
indexing index 'thread'...
collected 222712 docs, 6.8 MB
sorted 0.7 Mhits, 100.0% done
total 222712 docs, 6778203 bytes
total 2.806 sec, 2415407.71 bytes/sec, 79363.26 docs/sec
indexing index 'threaddelta'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.010 sec, 0.00 bytes/sec, 0.00 docs/sec
distributed index 'fulltext' can not be directly indexed; skipping.
distributed index 'threadtitles' can not be directly indexed; skipping.
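The indexer output above is verbose, and for a quick look at how many documents went into each index it can be condensed. A minimal sketch, assuming the output keeps the line shapes shown above ("indexing index 'name'..." followed by "total N docs, ..."); summarize_indexer_output is a hypothetical helper, not a Sphinx tool.

```shell
#!/bin/sh
# summarize_indexer_output is a hypothetical helper, not a Sphinx tool.
# It reads indexer output on stdin and prints "<index> <docs>" per indexed index.
summarize_indexer_output() {
    awk -F"'" '
        /^indexing index /    { name = $2 }   # text between the quotes
        /^total [0-9]+ docs,/ { split($0, w, " "); print name, w[2] }
    '
}

# Usage:
#   indexer --config /etc/sphinx.conf --all | summarize_indexer_output
```

A delta index that keeps growing between full reindexes is a hint that the full reindex is not running.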
Now it's time to check that Sphinx works. To do so, run a test search for any word:

Code:
$ search bmwfaq.com --config /etc/sphinx.conf
Sphinx 0.9.7
Copyright (c) 2001-2007, Andrew Aksyonoff

index 'post': query 'bmwfaq.com ': returned 1000 matches of 63297 total in 0.012 sec

displaying matches:
1. document=309409, weight=2, forumid=7, threadid=26258, userid=3222, postuserid=3222, dateline=Tue Nov  2 16:51:19 2004
2. document=309420, weight=2, forumid=27, threadid=26260, userid=3222, postuserid=3222, dateline=Tue Nov  2 16:51:19 2004

...
Start the searchd daemon
Code:
$ searchd --config /etc/sphinx.conf
Sphinx 0.9.7
Copyright (c) 2001-2007, Andrew Aksyonoff

using config file '/etc/sphinx.conf'...

Then rotate postdelta and threaddelta

Code:
$ indexer --config /etc/sphinx.conf --rotate postdelta threaddelta
Sphinx 0.9.7
Copyright (c) 2001-2007, Andrew Aksyonoff

using config file '/etc/sphinx.conf'...
indexing index 'postdelta'...
collected 51 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 51 docs, 14809 bytes
total 0.010 sec, 1480900.03 bytes/sec, 5100.00 docs/sec
indexing index 'threaddelta'...
collected 4 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 4 docs, 125 bytes
total 0.010 sec, 12500.00 bytes/sec, 400.00 docs/sec
rotating indices: succesfully sent SIGHUP to searchd (pid=29995).
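As the last output line shows, indexer --rotate works by signalling the running searchd, so it is worth checking that the daemon is alive before relying on rotation. A minimal sketch that tests the process recorded in a PID file; searchd_running is a hypothetical helper, and the PID file path in the usage line is an assumption (use the pid_file value from your own sphinx.conf).

```shell
#!/bin/sh
# searchd_running is a hypothetical helper. It returns 0 when the PID stored in
# the given file belongs to a live process, non-zero otherwise.
searchd_running() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # reject empty or non-numeric content
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # kill -0 sends no signal; it only tests whether the process can be signalled
    kill -0 "$pid" 2>/dev/null
}

# Usage (the path is an assumption; take it from pid_file in sphinx.conf):
#   searchd_running /var/log/sphinx/searchd.pid || searchd --config /etc/sphinx.conf
```

Note that kill -0 also fails when the process belongs to another user, so run the check as the same user that started searchd, or as root.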
Copy sphinxapi.php from the api directory of the Sphinx source tree to your forum root, where global.php lives

Copy the attached sphinx.php to forums-root/includes
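The two copy steps can be sketched as a small shell function. All three paths are arguments because they differ per install; install_sphinx_php is a hypothetical helper, and it refuses to copy anything if the target directory does not contain global.php.

```shell
#!/bin/sh
# install_sphinx_php is a hypothetical helper that performs the two copy steps.
#   $1: unpacked Sphinx source directory (contains api/sphinxapi.php)
#   $2: path to the downloaded sphinx.php attachment
#   $3: forum root (the directory that holds global.php)
install_sphinx_php() {
    src=$1
    attachment=$2
    forum_root=$3
    if [ ! -f "$forum_root/global.php" ]; then
        echo "no global.php in $forum_root" >&2
        return 1
    fi
    cp "$src/api/sphinxapi.php" "$forum_root/" &&
    cp "$attachment" "$forum_root/includes/sphinx.php"
}

# Usage (every path here is an example, not a known location):
#   install_sphinx_php ~/sphinx-0.9.7 ~/sphinx.php /var/www/forum
```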

Create cron scripts to reindex and rotate the thread and post indexes
These scripts are taken from: http://www.vbulletin.org/forum/showp...&postcount=336

Code:
$ cd /usr/local/etc
$ vi rotate.sh (with this content)
#!/bin/sh

LOCKFILE=/var/lock/sphinx.cron.lock

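# Skip this run if another indexer job still holds the lock.
[ -f "$LOCKFILE" ] && exit 0

# Remove the lock when the script exits, without forcing exit status 255.
trap 'rm -f "$LOCKFILE"' EXIT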

touch $LOCKFILE

indexer --config /etc/sphinx.conf --rotate postdelta threaddelta

$ vi reindex.sh (with this content)
#!/bin/sh

LOCKFILE=/var/lock/sphinx.cron.lock

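# Skip this run if another indexer job still holds the lock.
[ -f "$LOCKFILE" ] && exit 0

# Remove the lock when the script exits, without forcing exit status 255.
trap 'rm -f "$LOCKFILE"' EXIT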

touch $LOCKFILE

indexer --all --rotate --config /etc/sphinx.conf  >/dev/null 2>&1
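One limitation of the lock-file check in both scripts is that testing for the file and creating it are two separate steps, so two jobs starting at the same moment could both pass the test. A common alternative (not what the original scripts do) is to lock with mkdir, which creates the directory and fails if it already exists in a single atomic step. A minimal sketch; the helper names and the lock path are assumptions.

```shell
#!/bin/sh
# Hypothetical helpers showing a mkdir-based lock.
acquire_lock() {
    # mkdir fails if the directory already exists, so only one caller wins
    mkdir "$1" 2>/dev/null
}

release_lock() {
    rmdir "$1" 2>/dev/null
}

# Usage inside a cron script (the lock path is an assumption):
#   LOCKDIR=/var/lock/sphinx.cron.lock.d
#   acquire_lock "$LOCKDIR" || exit 0
#   trap 'release_lock "$LOCKDIR"' EXIT
#   indexer --config /etc/sphinx.conf --rotate postdelta threaddelta
```

As with the lock file, a script killed hard can leave the lock behind, in which case it has to be removed by hand.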
Make both scripts executable (chmod +x), then edit /etc/crontab to run them: rotate.sh every 20 minutes, and reindex.sh (indexer --all) every day at 4:59 AM
Code:
*/20 * * * * root /usr/local/etc/rotate.sh
59 4 * * * root /usr/local/etc/reindex.sh
Finally, modify search.php according to the search.php.txt attached to this post: http://www.vbulletin.org/forum/showp...&postcount=387

Comments

  1. TECK
    The disadvantage with this setup, Mike, is that all search results (except for Search in Title, with results displayed as Threads) are highly inaccurate.

    I will explain why soon in my blog entry.
  2. Danny Bembibre
    Yes, Florence, this is true. Have you finished your Sphinx product?
  3. TECK
    Yes, but I am already working on a new release. Here is a teaser:
    sphinx search 2.0.0 nearly completed | why . queued

    Floren
  4. Firestar
    How accurate is this still? Is it outdated or will it still be useful? I'm a complete noob, and I need a step by step like this to do this integration. Anything more current around that I haven't found yet?
