sfLucenePlugin
0.0.6alpha
for symfony 1.4, 1.3, 1.2, 1.1, 1.0 (MIT license)
License
--------------------------------------------------------------------------------
sfLucenePlugin
--------------------------------------------------------------------------------
Copyright (c) 2007 Carl Vondrick
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
--------------------------------------------------------------------------------
Zend Search Lucene
--------------------------------------------------------------------------------
Copyright (c) 2005-2007, Zend Technologies USA, Inc.
All rights reserved.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of Zend Technologies USA, Inc. nor the names of its
contributors may be used to endorse or promote products derived from this
software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Introduction
sfLucenePlugin integrates symfony and Zend Search Lucene to instantly add a search engine to your application. The plugin will auto-detect your ORM layer, but currently only supports Propel. (A Doctrine port is coming.)
Requirements
Main Features
- Configured all by YAML files
- Complete integration with symfony
- i18n ready
- Keyword highlighting
- Stop words, short words
- Index optimization
- Custom indexers

Development Status
This plugin is in active development and is constantly evolving. If you use it, expect some problems (some bigger than others) and many compatibility-breaking changes. You have been warned.
Installation
Install the plugin:
symfony plugin-install http://plugins.symfony-project.com/sfLucenePlugin
Initialize configuration files (ignore this if you are upgrading):
symfony lucene-init myapp
Clear the cache:
symfony cc
Configure sfLucene per the instructions below.
Configuring Lucene
The entire plugin is configured by search.yml files placed throughout your project. Keep track of which search.yml file you are editing, because each one has a different purpose: the project-level search.yml controls the engine as a whole and declares the models to index, an application's search.yml defines routes and result partials, and a module's search.yml defines which actions are indexed.
Open your project's search.yml file, located in myproject/config/search.yml. If you followed the installation instructions above, you will see at the bottom:
index:
  name: MyIndex
  encoding: UTF-8
Enter a name for the index. This is used internally by the plugin and does not matter much. If you require a different encoding to be used, enter it. Note, however, that UTF-8 is generally the best charset to store your indexes in.
If you require i18n support, you must define the cultures that you support under index. Use the following syntax:
index:
  cultures: [en_US, fr_FR]
(If you receive an exception saying "Culture XXX is not enabled" then define the culture even if you do not use i18n.)
By default, the plugin will not index or search on common words, such as "the" and "a". Further, it ignores single characters. If you require different behavior, you can define them like so:
index:
  stop_words: [the, an, it]
  short_words: 2
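To make the effect of these two settings concrete, here is a minimal sketch of stop-word and short-word filtering. It is not the plugin's or Zend Search Lucene's code: filterTokens() is a hypothetical helper, and the sketch assumes that short_words is a minimum length, so tokens with fewer characters than the threshold are dropped.

```php
<?php
// Illustrative only: filterTokens() is a hypothetical helper, not plugin API.
function filterTokens(array $tokens, array $stopWords, $shortWords)
{
    $kept = array();
    foreach ($tokens as $token) {
        $token = strtolower($token);
        // Assumption: tokens shorter than $shortWords characters are dropped.
        if (strlen($token) < $shortWords) {
            continue;
        }
        // Drop configured stop words (compared in lower case).
        if (in_array($token, $stopWords, true)) {
            continue;
        }
        $kept[] = $token;
    }
    return $kept;
}

$tokens = array('The', 'fox', 'ate', 'it', 'in', 'a', 'den');
print_r(filterTokens($tokens, array('the', 'an', 'it'), 2));
```

Under that assumption, a threshold of 2 removes only single characters, which matches the default behavior described above.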
Indexing
sfLucene currently supports two ways to add information to the index:
1. Through the ORM layer
2. Through symfony actions
Indexing through the ORM layer is the recommended method, because the plugin can then keep the index synchronized with your data. Indexing through symfony actions is intended only for static content, such as a privacy policy.
ORM layer method
Open your project's search.yml file and you will find a model declaration towards the top. This is where you put the models you wish to index. For each model, you define the fields you want to index and other parameters. The syntax is:
models:
  BlogPost:
    fields:
      id: unindexed
      title:
        boost: 1.5
        type: text
      content: unstored
      description: text
  BlogComment:
    fields:
      id: unindexed
      summary: text
      message: text
    description: message
    title: summary
In the above example, two models are indexed: BlogPost and BlogComment. In BlogPost, the fields title, content, and description are all searchable (id is stored but not searchable, and content is searchable but not stored), and the title field carries the most weight with a boost factor of 1.5.
When search results are displayed, the system intelligently guesses which field should be displayed as the result "title" and which field is the result "description." However, to be explicit, you can specify a description and title field, as in BlogComment.
Note that the fields do not have to exist in your database. As long as the model has a getter for a field, you can use it in your index.
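As a sketch of a field that has a getter but no database column, consider the summary field of BlogComment. The class below is a hypothetical stand-in (a real Propel model would extend its generated base class), and the 40-character cutoff is an arbitrary choice for illustration.

```php
<?php
// Hypothetical stand-in for a Propel model; a real one would extend BaseBlogComment.
class BlogComment
{
    private $message;

    public function __construct($message)
    {
        $this->message = $message;
    }

    // Backed by a real column in this example.
    public function getMessage()
    {
        return $this->message;
    }

    // No "summary" column exists; the value is derived from the message.
    // strlen()/substr() count bytes, so multibyte text would need mb_* functions.
    public function getSummary()
    {
        $text = trim($this->message);
        return strlen($text) > 40 ? substr($text, 0, 40) . '...' : $text;
    }
}

$comment = new BlogComment('Great post, thanks for sharing!');
echo $comment->getSummary(), "\n";
```

Because getSummary() exists, summary can be listed under fields just like a regular column.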
See the Zend_Search_Lucene documentation for more about the field types.
Next, you must tell your application where to route the model when it is returned. You do this by opening your application's config/search.yml file and defining a route:
models:
  BlogPost:
    route: blog/showPost?id=%id%
  BlogComment:
    route: blog/showComment?id=%id%
In routes, %xxx% is a token and will be replaced by the appropriate field value. So, %id% will be the value returned by the ->getId() method. Warning: You must also define the field in the project's search.yml to be indexed or unexpected results will occur!
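The token substitution described above can be sketched as follows. This is an illustration of the idea, not the plugin's implementation: fillRouteTokens() and StubBlogPost are hypothetical names, and the getter lookup simply camel-cases the token name.

```php
<?php
// Illustrative sketch; fillRouteTokens() and StubBlogPost are hypothetical names.
function fillRouteTokens($route, $model)
{
    return preg_replace_callback('/%(\w+)%/', function ($m) use ($model) {
        // "id" becomes "getId", "post_id" becomes "getPostId".
        $getter = 'get' . str_replace(' ', '', ucwords(str_replace('_', ' ', $m[1])));
        // URL-encode the value so it is safe inside a query string.
        return rawurlencode((string) $model->$getter());
    }, $route);
}

class StubBlogPost
{
    public function getId()
    {
        return 42;
    }

    public function getSlug()
    {
        return 'hello-world';
    }
}

echo fillRouteTokens('blog/showPost?id=%id%', new StubBlogPost()), "\n";
```

A token without a matching getter would fail in this sketch, which mirrors the warning above: only reference fields that you have actually declared.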
Finally, you must register the model with the system. If you are using Propel, you must use Propel's behaviors.
Propel
You can do this by opening up the model's file and putting
sfLucenePropelBehavior::getInitializer()->setupModel('MyModel');
after the class declaration. So, for a blog, you would open project/lib/model/BlogPost.php and append the above, replacing "MyModel" with "BlogPost".
Doctrine
Doctrine support is coming soon.
symfony actions method
To set up an action to be indexed, you must create a file named search.yml in the module's config directory. Inside this file, you define the actions you want indexed:
privacy:
tos:
  security:
    authenticated: true
    credentials: [admin]
disclaimer:
  params:
    advanced: true
  layout: true
As you can see, it is possible to define request parameters, manipulate authentication, and toggle decorating the response. By default, the response is not decorated, the user is not authenticated and has no credentials, and there are no request parameters.
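One way to picture these defaults is as a base configuration that each action's entry overrides. The sketch below is an assumption about how such merging could work, not the plugin's code; withActionDefaults() is a hypothetical helper, and the key names are taken from the example above.

```php
<?php
// Hypothetical helper; the defaults mirror the prose above, the function is not plugin API.
function withActionDefaults($config)
{
    // An entry with no settings, such as "privacy:", parses to null.
    $config = is_array($config) ? $config : array();
    $defaults = array(
        'security' => array('authenticated' => false, 'credentials' => array()),
        'params'   => array(),
        'layout'   => false,
    );
    $merged = array_merge($defaults, $config);
    // Merge the nested security block separately so partial overrides keep the rest.
    $security = isset($config['security']) ? $config['security'] : array();
    $merged['security'] = array_merge($defaults['security'], $security);
    return $merged;
}

// The "tos" entry from the example above.
$tos = array('security' => array('authenticated' => true, 'credentials' => array('admin')));
print_r(withActionDefaults($tos));
```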
Building the Index
After you have defined the indexing parameters, you must build the initial index. You do this on the command line:
$ symfony lucene-rebuild myapp
replacing myapp with the name of the application you want to rebuild. This builds the index for all cultures.
Searching
sfLucene ships with a basic search interface that you can use in your application. Like the rest of the plugin, it is i18n ready and all you must do is define the translation phrases.
To enable the interface, open your application's settings.yml file and add "sfLucene" to the enabled_modules section:
all:
  .settings:
    enabled_modules: [default, sfLucene]
You are free to define your own routes in the routing.yml file.
Customizing the Interface
As every application is different, the search interface can be customized to fit the look and feel of your site. All you must do is override the templates and actions.
To get started, simply run the following on the command line:
$ symfony lucene-init-module myapp
If you look in myapp's modules folder, you will see a new sfLucene module. Use this to customize your interface.
Often, when writing a search engine, you need to display a different result template for each model. For instance, a blog post should show differently than a forum post. You can easily customize your results by changing the "partial" value in your application's search.yml file. For example:
models:
  BlogPost:
    route: blog/showPost?slug=%slug%
    partial: blog/searchResult
  ForumPost:
    route: forum/showThread?id=%id%
    partial: forum/searchResult
The partial that you specify is given a $result object that you can use to build that result. The API for this object is pretty simple:
$result->getInternalTitle() returns the title of the search result.
$result->getInternalRoute() returns the route to the search result.
$result->getScore() returns the score / ranking of the search result.
$result->getXXX() returns the XXX field.
In addition to the $result object, the partial is also given a $query string containing what the user searched for. This is useful for highlighting the results.
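A minimal sketch of what a result partial might do with $result and $query follows. To keep it runnable on its own, it uses a hypothetical StubResult class in place of the real object and plain functions (escHtml(), highlightWords(), renderResult(), all hypothetical names) in place of a template; in a real partial the route would normally be passed to symfony's link helpers rather than printed as an attribute.

```php
<?php
// StubResult is a hypothetical stand-in for the $result object handed to the partial.
class StubResult
{
    public function getInternalTitle() { return 'Symfony & Lucene tips'; }
    public function getInternalRoute() { return 'blog/showPost?id=7'; }
    public function getScore() { return 0.87; }
}

function escHtml($text)
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Escape $text for HTML and wrap every occurrence of a query word in <strong>.
function highlightWords($text, $query)
{
    $words = preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);
    if (!$words) {
        return escHtml($text);
    }
    $quoted = array();
    foreach ($words as $word) {
        $quoted[] = preg_quote($word, '/');
    }
    // With PREG_SPLIT_DELIM_CAPTURE, odd-numbered pieces are the matched words.
    $pattern = '/(' . implode('|', $quoted) . ')/i';
    $pieces = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    $html = '';
    foreach ($pieces as $i => $piece) {
        $html .= ($i % 2) ? '<strong>' . escHtml($piece) . '</strong>' : escHtml($piece);
    }
    return $html;
}

function renderResult($result, $query)
{
    return sprintf(
        '<li data-route="%s">%s <small>(score %s)</small></li>',
        escHtml($result->getInternalRoute()),
        highlightWords($result->getInternalTitle(), $query),
        number_format($result->getScore(), 2)
    );
}

echo renderResult(new StubResult(), 'lucene'), "\n";
```

Splitting the raw title first and escaping each piece afterwards keeps the highlighting from cutting through HTML entities such as the escaped ampersand.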
Highlighting Pages
The plugin has an optional highlighter that will attempt to highlight keywords from searches. The highlighter hooks into this search engine and also attempts to hook into external search engines, such as Google and Yahoo!.
To enable this feature, open the application's config/filters.yml file and add the highlight filter before the cache filter:
rendering: ~
web_debug: ~
security: ~

# generally, you will want to insert your own filters here

highlight:
  class: sfLuceneHighlightFilter

cache: ~
common: ~
flash: ~
execution: ~
By default, the highlighter will also attempt to display a notice to the user that automatic highlighting occurred. The filter will search the result document for <!--[HIGHLIGHTER_NOTICE]--> and replace it with an i18n-ready notice (note: this is case sensitive).
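The placeholder replacement can be pictured with a plain string substitution. This is only a sketch of the behavior described above, not the filter's code; insertHighlightNotice() is a hypothetical name and the notice text is made up.

```php
<?php
// Illustrative only; insertHighlightNotice() is a hypothetical name, not plugin API.
function insertHighlightNotice($html, $notice)
{
    // str_replace() is case sensitive, matching the behavior described above.
    return str_replace('<!--[HIGHLIGHTER_NOTICE]-->', $notice, $html);
}

$page = '<body><!--[HIGHLIGHTER_NOTICE]--><p>Content</p></body>';
echo insertHighlightNotice($page, '<div class="notice">Keywords were highlighted.</div>'), "\n";
```

A placeholder written in a different case, such as <!--[highlighter_notice]-->, would be left untouched.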
To highlight a keyword, it must meet the following criteria:
* must be X/HTML response content type
* response must not be headers only
* must not be an ajax request
* be inside the
tag
* be outside of