<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

 <title>Dave Dash</title>
 <link href="http://davedash.com/tag/zend-search-lucene/atom.xml" rel="self"/>
 <link href="http://davedash.com/tag/zend-search-lucene"/>
 <updated>2012-01-17T21:54:19-08:00</updated>
 <id>http://davedash.com/</id>
 <author>
   <name>Dave Dash</name>
   <email>dd+atom1@davedash.com</email>
 </author>

 
 <entry>
   <title>symfonyCamp </title>
   <link href="http://davedash.com/2007/09/11/symfonycamp/"/>
   <updated>2007-09-11T00:00:00-07:00</updated>
   <id>http://davedash.com/2007/09/11/symfonycamp</id>
   <content type="html">&lt;p&gt;[tags]symfony, symfonyCamp, sensio, dop, zend search lucene, zsl[/tags]&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;photoright&quot;&gt;
&lt;img src=&quot;http://farm2.static.flickr.com/1096/1354502708_97f225a078_m.jpg?v=s&quot; alt=&quot;Tents&quot; /&gt;
&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;Well Katie and I are back from &lt;a href=&quot;http://symfonycamp.com/&quot;&gt;symfonyCamp&lt;/a&gt; and it was great.  I opted to socialize on the business day, except to hear Fabien Potencier's brief overview of what's to come in symfony 1.1/2.0.&lt;/p&gt;

&lt;p&gt;I personally know only a handful of adept symfony developers, so it was nice to go to camp and see 50 or so people who knew symfony to varying degrees.  I am a fan of the small successes, such as &lt;a href=&quot;http://www.tempus-vivit.net/&quot;&gt;Fabian Lange's historical reenactment site&lt;/a&gt;, which paid the way for two developers to attend the conference.&lt;/p&gt;

&lt;p&gt;One of the most interesting talks was Fabien's overview of symfony 2.0.  It was a 12-step process of going from vanilla PHP to a strong framework in about 200 lines.  A leaner, more robust symfony sounds very appealing; it's also what appeals to me about being a developer.  Every developer builds off of existing technologies and is able to create something great.  I frequently have to dig into the symfony core code to see how things &quot;really&quot; work, and everything is made of simple, easy-to-follow building blocks that add up to a nice framework.  symfony 2.0 seems to be a leaner, more flexible framework.&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;photoleft&quot;&gt;
&lt;img src=&quot;http://farm2.static.flickr.com/1283/1341189131_1bfe60a945_m.jpg&quot; alt=&quot;Zend talk&quot; /&gt;
&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;Later I spoke on Zend Search Lucene as well as Ajax.  Both are somewhat difficult to speak on for 45 minutes.  Zend Search Lucene really only takes 15 minutes to explain, and maybe just as long to implement.  Sure it can be tweaked quite a bit, but it's straightforward - that's the point.&lt;/p&gt;

&lt;p&gt;Ajax, on the other hand, is hard to explain in terms of symfony.  Sure there's a helper layer, but the JavaScript layer is very independent of the PHP layer.  Anything can support Ajax.  So I tried to cover not just the standard helpers, but also show a few demos and how easy it is with the helper system.  Unfortunately I don't really code this way; I try to use UJS and the jQuery plugins for new work, but covering that would have made the talk a more advanced topic.&lt;/p&gt;

&lt;p&gt;The next day we ended up cleaning up the symfony project.  We split into teams to take care of some house-work.  Some people cleaned up the wiki.  Some cleaned up tickets, some wrote new modules for the site.  Our team worked on plugins and it went rather well.  We closed a number of tickets, created a good deal of patches (which I have yet to apply), but overall the plugins are all a bit better.&lt;/p&gt;

&lt;p&gt;Overall everything was great.  &lt;a href=&quot;http://dop.nu/&quot;&gt;Dutch Open Projects&lt;/a&gt; was great, especially Stefan for arranging so much, Guido for making sure everyone was comfortable and Floris for the great food.&lt;/p&gt;

&lt;p&gt;You can also read &lt;a href=&quot;http://www.symfony-project.com/blog/2007/09/07/symfony-camp&quot;&gt;Fabien's overview&lt;/a&gt; of the camp.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Symfony Camp: Ajax and Zend, what would you like to know?</title>
   <link href="http://davedash.com/2007/08/16/symfony-camp-ajax-and-zend-what-would-you-like-to-know/"/>
   <updated>2007-08-16T00:00:00-07:00</updated>
   <id>http://davedash.com/2007/08/16/symfony-camp-ajax-and-zend-what-would-you-like-to-know</id>
   <content type="html">&lt;p&gt;[tags]symfonyCamp, symfony, netherlands, ajax, zend search lucene, zsl, jquery[/tags]&lt;/p&gt;

&lt;p&gt;I've been asked to speak at &lt;a href=&quot;http://www.symfonycamp.com/&quot;&gt;SymfonyCamp&lt;/a&gt; (&lt;code&gt;symfony['camp']&lt;/code&gt;) next month (you should all go if you can) and I thought I'd present as well as I could on Ajax and the Zend Framework Bridge (including Zend Search Lucene).&lt;/p&gt;

&lt;p&gt;If you're attending the camp and/or would like to hear about these topics please let me know any specific questions you might have about &quot;&lt;a href=&quot;http://symfony-project.com/&quot;&gt;symfony&lt;/a&gt; and Ajax&quot; and &quot;symfony and Zend&quot; and I'll try to address them in my presentations.&lt;/p&gt;

&lt;p&gt;If you are unable to go fear not, I'll try to post my notes on this site.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Boosting terms in  Zend Search Lucene</title>
   <link href="http://davedash.com/2007/05/29/boosting-terms-in-zend-search-lucene/"/>
   <updated>2007-05-29T00:00:00-07:00</updated>
   <id>http://davedash.com/2007/05/29/boosting-terms-in-zend-search-lucene</id>
   <content type="html">&lt;p&gt;[tags]Zend, Zend Search Lucene, Search, Lucene, php, symfony, zsl[/tags]&lt;/p&gt;

&lt;h3&gt;Boosting terms &amp;mdash; some fields are better than others&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;http://framework.zend.com/manual/en/zend.search.html&quot;&gt;Lucene&lt;/a&gt; supports boosting or weighting terms.  For example, if I search for members of a web site, and I type in &lt;q&gt;Dash&lt;/q&gt;, I want people with the name &lt;q&gt;Dash&lt;/q&gt; to take precedence over somebody who has a hobby of running the 50-yard Dash.&lt;/p&gt;

&lt;p&gt;If we look at the &lt;code&gt;generateZSLDocument()&lt;/code&gt; method we defined, we just need to adjust a few lines:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;

        $doc-&gt;addField(Zend_Search_Lucene_Field::Text('firstname', $this-&gt;getFirstname()));
        $doc-&gt;addField(Zend_Search_Lucene_Field::Text('lastname', $this-&gt;getLastname()));
&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;Should be turned into:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;

        $field = Zend_Search_Lucene_Field::Text('firstname', $this-&gt;getFirstname());
        $field-&gt;boost = 1.5;
        $doc-&gt;addField($field);
        $field = Zend_Search_Lucene_Field::Text('lastname', $this-&gt;getLastname());
        $field-&gt;boost = 1.5;
        $doc-&gt;addField($field);

&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;This is a pretty straightforward way to add weight (1.5 times the weight of a normal term) and you can customize it to the needs of your site.&lt;/p&gt;
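
&lt;p&gt;To see why a boost changes the ordering, here is a deliberately simplified sketch in plain PHP.  It is not Lucene's real scoring formula (which also weighs things like term frequency and field length); the &lt;code&gt;toyScore()&lt;/code&gt; function and the sample data are made up for illustration.  Each field that contains the term simply contributes its boost to the score:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;

    // Hypothetical helper: sum the boost of every field that contains the term.
    // Fields without an explicit boost count as 1.0.
    function toyScore(array $fields, array $boosts, $term)
    {
        $score = 0.0;
        foreach ($fields AS $name =&gt; $value)
        {
            if (stripos($value, $term) !== false)
            {
                $score += isset($boosts[$name]) ? $boosts[$name] : 1.0;
            }
        }
        return $score;
    }

    // Made-up sample data
    $boosts = array('firstname' =&gt; 1.5, 'lastname' =&gt; 1.5);
    $dave   = array('firstname' =&gt; 'Dave', 'lastname' =&gt; 'Dash', 'hobbies' =&gt; 'cycling');
    $runner = array('firstname' =&gt; 'Pat', 'lastname' =&gt; 'Smith', 'hobbies' =&gt; '50-yard dash');

    // toyScore($dave, $boosts, 'dash')   is 1.5 (boosted lastname match)
    // toyScore($runner, $boosts, 'dash') is 1.0 (unboosted hobbies match)

&lt;/textarea&gt;&lt;/div&gt;

&lt;p&gt;Without the boosts both users would score the same, which is exactly the tie we wanted to break.&lt;/p&gt;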
</content>
 </entry>
 
 <entry>
   <title>Finding things using Zend Search Lucene in symfony</title>
   <link href="http://davedash.com/2007/05/23/finding-things-using-zend-search-lucene-in-symfony/"/>
   <updated>2007-05-23T00:00:00-07:00</updated>
   <id>http://davedash.com/2007/05/23/finding-things-using-zend-search-lucene-in-symfony</id>
   <content type="html">&lt;p&gt;[tags]Zend, Zend Search Lucene, Search, Lucene, php, symfony, zsl[/tags]&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;notice&quot;&gt;This is part of an &lt;a href=&quot;http://spindrop.us/tag/zsl&quot;&gt;ongoing series&lt;/a&gt; about the Zend Search Lucene libraries and symfony.  We'll pretty everything up when we're done =)&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;We now know how to &lt;a href=&quot;http://spindrop.us/2007/04/24/creating-updating-deleting-documents-in-a-lucene-index-with-symfony/&quot;&gt;manipulate the index via our model classes&lt;/a&gt;.  But let's actually do something useful with our search engine... let's search!&lt;/p&gt;

&lt;!--more--&gt;



&lt;p&gt;At the time of this writing we're dealing with Propel, which uses &lt;code&gt;Peer&lt;/code&gt; classes that are meant for dealing with multiple objects&lt;sup id=&quot;#fnr_1&quot;&gt;&lt;a href=&quot;#fn_1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;.  This is the perfect place for a &lt;code&gt;::search()&lt;/code&gt; method.  In other words, &lt;code&gt;UserPeer::search('dave');&lt;/code&gt; should query Lucene for users matching &quot;dave&quot;.  Let's make that happen:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;

    public static function search($query)
    {
        $index = self::getLuceneIndex();
        
        $hits = $index-&gt;find(strtolower($query));
        $pks = array();
    
        foreach($hits AS $hit)
        {
            // 'uid' is the field name we stored in generateZSLDocument()
            $pks[] = $hit-&gt;uid;
        }
        
        return self::retrieveByPks($pks);
    }

&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;What we're doing is retrieving our Lucene index.  Somewhere between tutorials we wrote this &lt;code&gt;Peer&lt;/code&gt; function to handle that:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    public static function getLuceneIndex($autoIndex = true)
    {
        try 
        {
            return $index = Zend_Search_Lucene::open(sfConfig::get(self::$luceneIndex));
        } 
        catch (Exception $e) 
        {
            $index = $autoIndex ? self::reindex() : null;
            return $index;
        }
    }
&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;If our index is missing we'll conveniently create it on the fly.  We then use the Zend Search Lucene API to retrieve the matching hits in this index and then use some Propel trickery to retrieve by an array of primary keys.&lt;/p&gt;

&lt;p&gt;It's now simple to use &lt;code&gt;::search()&lt;/code&gt; functions in the same manner as you use &lt;code&gt;::doSelect()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;At this point you should be able to create a basic symfony app that can utilize a Lucene index.&lt;/p&gt;
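
&lt;p&gt;One thing to watch: I'd expect &lt;code&gt;retrieveByPks()&lt;/code&gt; to hand rows back in whatever order the database likes, not in Lucene's relevance order (that's an assumption, so check it against your Propel version).  Here's a minimal sketch of putting the objects back in hit order.  &lt;code&gt;sortByPks()&lt;/code&gt; is a hypothetical helper, and it assumes each object has a &lt;code&gt;getId()&lt;/code&gt; method:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;

    // Hypothetical helper: reorder $objects to follow the order of $pks.
    // Objects whose id is not in $pks are dropped.
    function sortByPks(array $pks, array $objects)
    {
        // index the objects by primary key
        $byId = array();
        foreach ($objects AS $object)
        {
            $byId[$object-&gt;getId()] = $object;
        }

        // walk the keys in hit order and pick up the matching objects
        $sorted = array();
        foreach ($pks AS $pk)
        {
            if (isset($byId[$pk]))
            {
                $sorted[] = $byId[$pk];
            }
        }

        return $sorted;
    }

&lt;/textarea&gt;&lt;/div&gt;

&lt;p&gt;With that in place, the last line of &lt;code&gt;search()&lt;/code&gt; could become &lt;code&gt;return sortByPks($pks, self::retrieveByPks($pks));&lt;/code&gt;.&lt;/p&gt;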

&lt;div id=&quot;footnotes&quot;&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
        &lt;li id=&quot;fn_1&quot;&gt;The examples refer to using Propel, but it's trivial to adapt this to sfDoctrine. &lt;a href=&quot;#fnr_1&quot; class=&quot;footnoteBackLink&quot;  title=&quot;Jump back to footnote 1 in the text.&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
&lt;/div&gt;

</content>
 </entry>
 
 <entry>
   <title>Creating, Updating, Deleting documents in a Lucene Index with symfony</title>
   <link href="http://davedash.com/2007/04/24/creating-updating-deleting-documents-in-a-lucene-index-with-symfony/"/>
   <updated>2007-04-24T00:00:00-07:00</updated>
   <id>http://davedash.com/2007/04/24/creating-updating-deleting-documents-in-a-lucene-index-with-symfony</id>
   <content type="html">&lt;p&gt;Previously we covered &lt;a href=&quot;http://spindrop.us/2007/04/23/the-lucene-search-index-and-symfony/&quot;&gt;an all-at-once approach&lt;/a&gt; to indexing objects in your symfony app.  But for some reason, people find the need to allow users to sign up, or change their email addresses and then all of a sudden our wonderful Lucene index is out of date.&lt;/p&gt;

&lt;p&gt;Here lies the strength of using &lt;a href=&quot;http://framework.zend.com/manual/en/zend.search.html&quot;&gt;Zend Search Lucene&lt;/a&gt; in your app: you get the flexibility of interacting with a Lucene index, no matter how it was created, and can add, update and delete documents in it.&lt;/p&gt;

&lt;!--more--&gt;


&lt;p&gt;The last thing you want to do is have a cron job in charge of making sure your index is always up to date by reindexing regularly.  This is an inelegant and inefficient process.&lt;/p&gt;

&lt;p&gt;A smarter method would be to trigger an update of the index each time you update your database.  Luckily the &lt;acronym title=&quot;Object Relational Mapping&quot;&gt;ORM&lt;/acronym&gt; layer allows us to do this using objects (in our case Propel objects).&lt;/p&gt;

&lt;p&gt;If we look at our &lt;a href=&quot;http://spindrop.us/2007/04/23/the-lucene-search-index-and-symfony/&quot;&gt;user example from before&lt;/a&gt;, we did set ourselves up to easily do this using our &lt;code&gt;User::generateZSLDocument()&lt;/code&gt; function, which did most of the heavy lifting.&lt;/p&gt;

&lt;p&gt;We can make a few small changes to the &lt;code&gt;User&lt;/code&gt; class:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    var $reindex = false;
    public function setUsername ( $v )
    {
        parent::setUsername($v);
        $this-&gt;reindex = true;
    }
    public function setFirstname ( $v )
    {
        parent::setFirstname($v);
        $this-&gt;reindex = true;
    }
    public function setLastname ( $v )
    {
        parent::setLastname($v);
        $this-&gt;reindex = true;
    }
    public function setEmail ( $v )
    {
        parent::setEmail($v);
        $this-&gt;reindex = true;
    }
&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;We have an attribute called &lt;code&gt;$reindex&lt;/code&gt;.  When it is false we don't need to worry about the index.  When something significant changes, like an update to your name or email address, then we set &lt;code&gt;$reindex&lt;/code&gt; to &lt;code&gt;true&lt;/code&gt;.  Then, when we save, an overridden &lt;code&gt;save()&lt;/code&gt; method updates the index:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    public function save ($con = null)
    {
        parent::save($con);
      
        if ($this-&gt;reindex) 
        {
            $index = $this-&gt;removeFromIndex();
            $doc   = $this-&gt;generateZSLDocument();
            $index-&gt;addDocument($doc);
        }
    }

    public function removeFromIndex() 
    {
        $index = Zend_Search_Lucene::open(sfConfig::get('app_search_user_index'));  

        // remove old documents
        // 'uid' is the field name we stored in generateZSLDocument()
        $term  = new Zend_Search_Lucene_Index_Term($this-&gt;getId(), 'uid');
        $query = new Zend_Search_Lucene_Search_Query_Term($term);
        $hits  = array();
        $hits  = $index-&gt;find($query);

        foreach ($hits AS $hit) 
        {  
            $index-&gt;delete($hit-&gt;id);  
        }

        return $index;      
    }
&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;Now we've got the &lt;em&gt;exact&lt;/em&gt; same data that we created during &lt;a href=&quot;http://spindrop.us/2007/04/23/the-lucene-search-index-and-symfony/&quot;&gt;our original indexing&lt;/a&gt;.  This handles creating and updating objects, but we still miss updating the index when deleting objects.&lt;/p&gt;

&lt;p&gt;Luckily we already made a function &lt;code&gt;User::removeFromIndex()&lt;/code&gt; to remove any related documents from the index, so our delete function can be pretty simple:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    public function delete($con = null)
    {
        parent::delete($con);
        $this-&gt;removeFromIndex();
    }
&lt;/textarea&gt;&lt;/div&gt;
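
&lt;p&gt;Stripped of Propel and Lucene, the pattern here is just a dirty flag.  The following sketch uses two made-up classes (&lt;code&gt;FakeIndex&lt;/code&gt; and &lt;code&gt;Person&lt;/code&gt;, neither is part of symfony or Zend) to show the flow.  It also resets the flag after indexing, which the &lt;code&gt;User&lt;/code&gt; code above doesn't do, so saving twice only writes to the index once:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    // Stand-in for the Lucene index: just remembers documents by id.
    class FakeIndex
    {
        public $docs   = array();
        public $writes = 0;

        public function replace($id, $doc)
        {
            $this-&gt;docs[$id] = $doc;
            $this-&gt;writes++;
        }
    }

    // Stand-in for a Propel object with one indexed attribute.
    class Person
    {
        private $id;
        private $email;
        private $reindex = false;

        public function __construct($id)
        {
            $this-&gt;id = $id;
        }

        public function setEmail($v)
        {
            $this-&gt;email   = $v;
            $this-&gt;reindex = true;   // something searchable changed
        }

        public function save(FakeIndex $index)
        {
            // (the real class would call parent::save() here)
            if ($this-&gt;reindex)
            {
                $index-&gt;replace($this-&gt;id, array('email' =&gt; $this-&gt;email));
                $this-&gt;reindex = false;   // nothing to do until the next change
            }
        }
    }
&lt;/textarea&gt;&lt;/div&gt;

&lt;p&gt;Saving an untouched &lt;code&gt;Person&lt;/code&gt; never touches the index; saving after &lt;code&gt;setEmail()&lt;/code&gt; writes exactly one document.&lt;/p&gt;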

</content>
 </entry>
 
 <entry>
   <title>The Lucene Search Index and symfony</title>
   <link href="http://davedash.com/2007/04/23/the-lucene-search-index-and-symfony/"/>
   <updated>2007-04-23T00:00:00-07:00</updated>
   <id>http://davedash.com/2007/04/23/the-lucene-search-index-and-symfony</id>
   <content type="html">&lt;p&gt;[tags]Zend, Zend Search Lucene, Search, Lucene, php, symfony, zsl, index[/tags]&lt;/p&gt;

&lt;p&gt;This article is meant to followup &lt;a href=&quot;http://spindrop.us/2007/04/10/sfzendplugin/&quot;&gt;sfZendPlugin&lt;/a&gt; where we learn a newer way of obtaining the &lt;a href=&quot;http://framework.zend.com/&quot;&gt;Zend Framework&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In this tutorial we're going to delve into the Lucene index.  &lt;a href=&quot;http://framework.zend.com/manual/en/zend.search.html&quot;&gt;Zend Search Lucene&lt;/a&gt; relies on building a Lucene index.  This is a directory that contains files that can be indexed and queried by Lucene or other ports.  In our example we'll be creating a search for user profiles.&lt;/p&gt;

&lt;!--more--&gt;


&lt;p&gt;We'll want to store in our &lt;code&gt;app.yml&lt;/code&gt; the precise location of this index file so we can refer to it in our app&lt;sup id=&quot;#fnr_lucene_index1&quot;&gt;&lt;a href=&quot;#fn_lucene_index1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;Here's an example:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;all:
  search:
    user_index: /tmp/myapp.user.lucene.index
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now when we need to refer to the index we can do &lt;code&gt;sfConfig::get('app_search_user_index')&lt;/code&gt;.&lt;/p&gt;
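
&lt;p&gt;The key name comes from how the nested &lt;code&gt;app.yml&lt;/code&gt; keys are joined with underscores behind an &lt;code&gt;app&lt;/code&gt; prefix.  Here's a rough illustration of that naming convention in plain PHP.  It is not symfony's actual config loader (which may treat deeper nesting differently), and &lt;code&gt;flattenConfig()&lt;/code&gt; is a made-up name:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    // Hypothetical helper: turn nested settings into flat, prefixed keys.
    function flattenConfig(array $settings, $prefix = 'app')
    {
        $flat = array();
        foreach ($settings AS $key =&gt; $value)
        {
            $name = $prefix . '_' . $key;
            if (is_array($value))
            {
                // descend one level, carrying the joined name along
                $flat = array_merge($flat, flattenConfig($value, $name));
            }
            else
            {
                $flat[$name] = $value;
            }
        }
        return $flat;
    }

    // The 'all' section of the app.yml above, written as a PHP array
    $all  = array('search' =&gt; array('user_index' =&gt; '/tmp/myapp.user.lucene.index'));
    $flat = flattenConfig($all);
    // $flat['app_search_user_index'] is '/tmp/myapp.user.lucene.index'
&lt;/textarea&gt;&lt;/div&gt;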

&lt;h3&gt;Index Something&lt;/h3&gt;

&lt;p&gt;Let's try a user search where we can find a user by their name or email address.  It's fairly simple to accomplish, and hardly requires the use of &lt;a href=&quot;http://framework.zend.com/manual/en/zend.search.html&quot;&gt;&lt;acronym title=&quot;Zend Search Lucene&quot;&gt;ZSL&lt;/acronym&gt;&lt;/a&gt;, but by using &lt;acronym title=&quot;Zend Search Lucene&quot;&gt;ZSL&lt;/acronym&gt; we can easily extend it to do a full-text search of a user's profile or any other textual data.&lt;/p&gt;

&lt;p&gt;Each &quot;thing&quot; stored in the index is a Lucene &quot;document&quot;.  Each document then consists of several &quot;fields&quot; (&lt;code&gt;Zend_Search_Lucene_Field&lt;/code&gt; objects).  In our example, each document will be an individual user and the fields will be relevant attributes of the user (username, first name, last name, email, the text of their profile).&lt;/p&gt;

&lt;p&gt;Initially we'll want to populate our index.  We may also want to regularly reindex all the users at once to optimize the search performance.  Since reindexing involves multiple users it would make sense to have a static &lt;code&gt;reindex&lt;/code&gt; method in our &lt;code&gt;UserPeer&lt;/code&gt; class&lt;sup id=&quot;#fnr_lucene_index2&quot;&gt;&lt;a href=&quot;#fn_lucene_index2&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
class UserPeer extends BaseUserPeer
{
    public static function reindex()
    {
        $index = Zend_Search_Lucene::create(sfConfig::get('app_search_user_index'));

        $users = UserPeer::doSelect(new Criteria());
        foreach ($users AS $user)
        {
            $index-&gt;addDocument($user-&gt;generateZSLDocument());
        }
        $index-&gt;commit();

        return $index;
    }
}
&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;Very simply, we're creating a new index, getting all the users, adding a document to the index and then committing the index (to disk).  You might have noticed that there's a strange function, &lt;code&gt;User::generateZSLDocument()&lt;/code&gt;.  This function contains all the magic.  In order to not repeat ourselves we keep the internals of making a document for the Lucene index in the &lt;code&gt;User&lt;/code&gt; class itself.  Let's look at it:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    public function generateZSLDocument()
    {
        $doc = new Zend_Search_Lucene_Document();
        $doc-&gt;addField(Zend_Search_Lucene_Field::Keyword('uid', $this-&gt;getId()));
        $doc-&gt;addField(Zend_Search_Lucene_Field::Keyword('username', $this-&gt;getUsername()));
        $doc-&gt;addField(Zend_Search_Lucene_Field::Keyword('email', $this-&gt;getEmail()));
        $doc-&gt;addField(Zend_Search_Lucene_Field::Text('firstname', $this-&gt;getFirstname()));
        $doc-&gt;addField(Zend_Search_Lucene_Field::Text('lastname', $this-&gt;getLastname()));
        /* An unstored contents field as an aggregate 
          * of all data is no longer needed in *ZEND* Lucene 
          * But it's here.
          */
        $doc-&gt;addField(Zend_Search_Lucene_Field::Unstored('contents', implode(' ', array($this-&gt;getEmail(), $this-&gt;getFirstname(), $this-&gt;getLastname(), $this-&gt;getUsername()))));
        return $doc;
    }
&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;We're really just dumping the relevant search terms into this document.  The beauty of keeping this code internalized in the &lt;code&gt;User&lt;/code&gt; class is we can reuse it later if we need to index a single &lt;code&gt;User&lt;/code&gt; at a time.&lt;/p&gt;

&lt;p&gt;A couple things to note.  &lt;code&gt;Zend_Search_Lucene_Field::Keyword&lt;/code&gt; allows us to store data that we can lookup later.  We store the &lt;code&gt;User::id&lt;/code&gt; in a field called &lt;code&gt;uid&lt;/code&gt; since &lt;code&gt;id&lt;/code&gt; is a reserved word for the index and we can't access it from &lt;a href=&quot;http://framework.zend.com/manual/en/zend.search.html&quot;&gt;Zend Search Lucene&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In a batch script or a reindex action we can now just call &lt;code&gt;UserPeer::reindex()&lt;/code&gt; and have a working search index for our users.&lt;/p&gt;

&lt;div id=&quot;footnotes&quot;&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
        &lt;li id=&quot;fn_lucene_index1&quot;&gt;Storing things in &lt;code&gt;app.yml&lt;/code&gt; is great for indexes that don't need to be searched in multiple applications. &lt;a href=&quot;#fnr_lucene_index1&quot; class=&quot;footnoteBackLink&quot;  title=&quot;Jump back to footnote 1 in the text.&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/li&gt;
        &lt;li id=&quot;fn_lucene_index2&quot;&gt;
Since we're using a Lucene index, which has an open documented structure, we aren't limited to just using Zend Search Lucene or Apache Lucene (java).  We can mix and match and read and write to the same index file.  For very large indexes (65,000+ documents), I rewrote a Java application to index all the documents at once as PHP would time out during such a task.
&lt;a href=&quot;#fnr_lucene_index2&quot; class=&quot;footnoteBackLink&quot;  title=&quot;Jump back to footnote 2 in the text.&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
&lt;/div&gt;

</content>
 </entry>
 
 <entry>
   <title>sfZendPlugin</title>
   <link href="http://davedash.com/2007/04/10/sfzendplugin/"/>
   <updated>2007-04-10T00:00:00-07:00</updated>
   <id>http://davedash.com/2007/04/10/sfzendplugin</id>
   <content type="html">&lt;p&gt;[tags]Zend, Zend Search Lucene, Search, Lucene, php, symfony, zsl, plugins[/tags]&lt;/p&gt;

&lt;p&gt;I originally intended to rewrite &lt;a href=&quot;http://spindrop.us/2006/08/25/using-zend-search-lucene-in-a-symfony-app/&quot;&gt;my Zend Search Lucene tutorial&lt;/a&gt;, but &lt;a href=&quot;http://archivemati.ca/2007/03/08/zend-search-lucene-symfony-and-the-ica-atom-application/&quot;&gt;Peter Van Garderen&lt;/a&gt; covered the bulk of what's changed and I was too busy developing search functionality for &lt;a href=&quot;http://lyro.com/&quot;&gt;lyro.com&lt;/a&gt; (not to mention finding inconsistencies with the Zend Search Lucene port and Lucene) to finish the tutorial.  So I broke it up into smaller pieces.&lt;/p&gt;

&lt;p&gt;I packaged &lt;a href=&quot;http://framework.zend.com/&quot;&gt;Zend Framework&lt;/a&gt; into a &lt;a href=&quot;http://www.symfony-project.com/trac/browser/plugins/sfZendPlugin&quot;&gt;symfony plugin&lt;/a&gt;.  &lt;a href=&quot;http://symfony-project.com/&quot;&gt;symfony&lt;/a&gt; is easily extended using plugins.&lt;/p&gt;

&lt;p&gt;You can obtain this from subversion with the following command (from your &lt;code&gt;/plugins&lt;/code&gt; directory):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;svn export http://svn.symfony-project.com/plugins/sfZendPlugin
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;a href=&quot;http://symfony-project.com/&quot;&gt;symfony&lt;/a&gt; has a &lt;a href=&quot;http://www.symfony-project.com/book/trunk/17-Extending-Symfony#Bridges%20to%20Other%20Framework%20Components&quot;&gt;Zend Framework Bridge&lt;/a&gt; which lets us autoload the framework by adding the following to &lt;code&gt;settings.yml&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;.settings:
  zend_lib_dir:   %SF_ROOT_DIR%/plugins/sfZendPlugin/lib
  autoloading_functions:
    - [sfZendFrameworkBridge, autoload]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;First we point &lt;code&gt;sf_zend_lib_dir&lt;/code&gt; at our plugin's &lt;code&gt;lib&lt;/code&gt; directory.  Then we register the bridge's autoload function.&lt;/p&gt;

&lt;p&gt;After setting this up, all the Zend classes will be available and auto-loaded from elsewhere in your code.&lt;/p&gt;
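
&lt;p&gt;Autoloading works because Zend Framework classes follow the PEAR naming convention: underscores in the class name map to directory separators.  Here's a rough sketch of that mapping.  It is not the bridge's actual code, and &lt;code&gt;zendClassToPath()&lt;/code&gt; is a name I made up:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    // Hypothetical helper: map a PEAR-style class name to a file under $libDir.
    function zendClassToPath($class, $libDir)
    {
        return rtrim($libDir, '/') . '/' . str_replace('_', '/', $class) . '.php';
    }

    // zendClassToPath('Zend_Search_Lucene', '/plugins/sfZendPlugin/lib')
    //   gives '/plugins/sfZendPlugin/lib/Zend/Search/Lucene.php'
&lt;/textarea&gt;&lt;/div&gt;

&lt;p&gt;An autoload function only has to compute that path and include the file if it exists.&lt;/p&gt;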
</content>
 </entry>
 
 <entry>
   <title>Coming soon to reviewsby.us</title>
   <link href="http://davedash.com/2006/10/02/coming-soon-to-reviewsbyus/"/>
   <updated>2006-10-02T00:00:00-07:00</updated>
   <id>http://davedash.com/2006/10/02/coming-soon-to-reviewsbyus</id>
   <content type="html">&lt;p&gt;In August I took a break from &lt;a href=&quot;http://reviewsby.us/&quot;&gt;reviewsby.us&lt;/a&gt; only to be plagued by spam.  In September, I relinquished portions of the project planning to my wife.  We haven't released anything publicly, yet, but there's a lot in development.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I updated the development framework to &lt;a href=&quot;http://symfony-project.com/&quot;&gt;symfony&lt;/a&gt; 1.0 alpha and took care of a whole slew of bugs.&lt;/li&gt;
&lt;li&gt;Katie and I came up with a &lt;a href=&quot;http://flickr.com/photos/davedash/251518309/&quot;&gt;wireframe&lt;/a&gt; that details some of the upcoming changes.&lt;/li&gt;
&lt;li&gt;I upgraded the user logic to take advantage of sfGuardUser, a user management plugin for symfony.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;I'm in the process of writing location-specific searches, but I've been slow to implement them.  It seems that this month is far busier than I'd like, and I can rarely get a big enough block of time to just crank this out.  The problem with geographic searches is that MySQL supports those types of queries, but it's not as easy as I'd like.  Zend Search Lucene with some support from PHP, however, may yield some promising results.  As always, I'll share my findings in a forthcoming tutorial.&lt;/p&gt;

&lt;p&gt;Anyway, no visible updates on the actual site, since I didn't want to put alpha software on the live site.  I'm sure by next month symfony will be ready.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Using Zend Search Lucene in a symfony app</title>
   <link href="http://davedash.com/2006/08/25/using-zend-search-lucene-in-a-symfony-app/"/>
   <updated>2006-08-25T00:00:00-07:00</updated>
   <id>http://davedash.com/2006/08/25/using-zend-search-lucene-in-a-symfony-app</id>
   <content type="html">&lt;p&gt;[tags]zend, search, lucene, zend search lucene, zsl, symfony,php[/tags]&lt;/p&gt;

&lt;p&gt;If you're like me you've probably followed the &lt;a href=&quot;http://symfony-project.com/askeet/21&quot;&gt;Askeet tutorial on Search&lt;/a&gt; in order to create a decent search engine for your web app.  It's fairly straightforward, but they hinted that when &lt;a href=&quot;http://framework.zend.com/manual/en/zend.search.html&quot;&gt;Zend Search Lucene&lt;/a&gt; (&lt;acronym title=&quot;Zend Search Lucene&quot;&gt;ZSL&lt;/acronym&gt;) is released, that might be the way to go.  Well, we are in luck: &lt;a href=&quot;http://framework.zend.com/manual/en/zend.search.html&quot;&gt;&lt;acronym title=&quot;Zend Search Lucene&quot;&gt;ZSL&lt;/acronym&gt;&lt;/a&gt; is available, so let's just dive right in.&lt;/p&gt;

&lt;!--more--&gt;


&lt;p&gt;If you aren't using &lt;a href=&quot;http://symfony-project.com/&quot;&gt;symfony&lt;/a&gt; have a look at &lt;a href=&quot;http://devzone.zend.com/node/view/id/91&quot; title=&quot;Roll Your Own Search Engine with Zend_Search_Lucene&quot;&gt;this article&lt;/a&gt; from the &lt;a href=&quot;http://devzone.zend.com/&quot;&gt;Zend Developer Zone&lt;/a&gt;.  It covers just enough to get you started.  If you are using &lt;a href=&quot;http://symfony-project.com/&quot;&gt;symfony&lt;/a&gt;, just follow along and we'll get you where you need to go.&lt;/p&gt;

&lt;h3&gt;Obtaining Zend Search Lucene&lt;/h3&gt;

&lt;p&gt;First &lt;a href=&quot;http://framework.zend.com/download&quot; title=&quot;Zend Framework Download&quot;&gt;download&lt;/a&gt; the &lt;a href=&quot;http://framework.zend.com/&quot;&gt;Zend Framework&lt;/a&gt; (&lt;acronym title=&quot;Zend Developer Framework&quot;&gt;ZF&lt;/acronym&gt;).  The &lt;a href=&quot;http://framework.zend.com/&quot;&gt;Zend Framework&lt;/a&gt;  is supposed to be fairly &quot;easy&quot; in terms of installation.  So let's put that to the test.  Open your &lt;a href=&quot;http://framework.zend.com/&quot;&gt;&lt;acronym title=&quot;Zend Developer Framework&quot;&gt;ZF&lt;/acronym&gt;&lt;/a&gt; archive.  Copy &lt;code&gt;Zend.php&lt;/code&gt; and &lt;code&gt;Zend/Search&lt;/code&gt; to your &lt;a href=&quot;http://symfony-project.com/&quot;&gt;symfony&lt;/a&gt; project's library folder:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;cp Zend.php $SF_PROJECT/lib              
mkdir $SF_PROJECT/lib/Zend
cp -r Zend/Search $SF_PROJECT/lib/Zend
cp Zend/Exception.php $SF_PROJECT/lib/Zend                 
chmod -R a+r $SF_PROJECT/lib/Zend*
&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Index Something&lt;/h3&gt;

&lt;p&gt;We'll deviate slightly from &lt;a href=&quot;http://spindrop.us/category/reviewsbyus&quot; title=&quot;ReviewsBy.Us category of Spindrop&quot;&gt;food themed&lt;/a&gt; tutorials and do something generic.  Let's try a user search where we can find a user by their name or email address.  It's fairly simple to accomplish, and hardly requires the use of &lt;a href=&quot;http://framework.zend.com/manual/en/zend.search.html&quot;&gt;&lt;acronym title=&quot;Zend Search Lucene&quot;&gt;ZSL&lt;/acronym&gt;&lt;/a&gt;, but by using &lt;acronym title=&quot;Zend Search Lucene&quot;&gt;ZSL&lt;/acronym&gt; we can easily extend it to do a full-text search of a user's profile or any other textual data.&lt;/p&gt;

&lt;p&gt;Each &quot;thing&quot; stored in the index is a &quot;document&quot; in &lt;acronym title=&quot;Zend Search Lucene&quot;&gt;ZSL&lt;/acronym&gt;, specifically a &lt;code&gt;Zend_Search_Lucene_Document&lt;/code&gt;.  Each document then consists of several &quot;fields&quot; (&lt;code&gt;Zend_Search_Lucene_Field&lt;/code&gt; objects).  In our example, our document will be an individual user and the fields will be relevant attributes of the user (username, first name, last name, email, the text of their profile).&lt;/p&gt;

&lt;p&gt;We're going to write a general re-indexing tool.  Something that will index all users.&lt;/p&gt;

&lt;p&gt;In our &lt;code&gt;userActions&lt;/code&gt; class let's add the following action:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    public function executeReindex()
    {
        require_once 'Zend/Search/Lucene.php';
        $index = new Zend_Search_Lucene(sfConfig::get('app_search_user_index_file'),true);
        
        $users = UserPeer::doSelect(new Criteria());
        foreach ($users AS $user)
        {
            $doc = new Zend_Search_Lucene_Document();
            $doc-&gt;addField(Zend_Search_Lucene_Field::Keyword('id', $user-&gt;getId()));
            $doc-&gt;addField(Zend_Search_Lucene_Field::Keyword('username', $user-&gt;getUsername()));
            $doc-&gt;addField(Zend_Search_Lucene_Field::Keyword('email', $user-&gt;getEmail()));
            $doc-&gt;addField(Zend_Search_Lucene_Field::Text('firstname', $user-&gt;getFirstname()));
            $doc-&gt;addField(Zend_Search_Lucene_Field::Text('lastname', $user-&gt;getLastname()));
            $doc-&gt;addField(Zend_Search_Lucene_Field::Unstored('contents', &quot;{$user-&gt;getEmail()} {$user-&gt;getFirstname()} {$user-&gt;getLastname()} {$user-&gt;getUsername()}&quot;));
            $index-&gt;addDocument($doc);
        }
        
        $index-&gt;commit();
    }
&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;The code should be fairly easy to follow.  First we require the necessary Lucene library.  On the next line we create the index:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    $index = new Zend_Search_Lucene(sfConfig::get('app_search_user_index_file'),true);
&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;&lt;code&gt;app_search_user_index_file&lt;/code&gt; is a symfony configuration setting that you define in your &lt;code&gt;app.yml&lt;/code&gt;.  It defines the path where your index is stored.  &lt;code&gt;/tmp/lucene.user.index&lt;/code&gt; works for our purposes.  The second parameter, &lt;code&gt;true&lt;/code&gt;, tells Lucene we are creating a new index rather than opening an existing one.&lt;/p&gt;
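
&lt;p&gt;A minimal sketch of the matching &lt;code&gt;app.yml&lt;/code&gt; entry, assuming symfony's usual convention of prefixing &lt;code&gt;app.yml&lt;/code&gt; settings with &lt;code&gt;app_&lt;/code&gt; and joining nested keys with underscores.  The path is just the example value from above; adjust it for your setup:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;yaml&quot;&gt;
    # read back as sfConfig::get('app_search_user_index_file')
    all:
      search:
        user_index_file: /tmp/lucene.user.index
&lt;/textarea&gt;&lt;/div&gt;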

&lt;p&gt;We then loop through all the users and create a document for each one.  For each search-relevant attribute a user might have, we add a field to the document.  Note the last field:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    $doc-&gt;addField(Zend_Search_Lucene_Field::Unstored('contents', &quot;{$user-&gt;getEmail()} {$user-&gt;getFirstname()} {$user-&gt;getLastname()} {$user-&gt;getUsername()}&quot;));
&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;By default, searches are run against the &quot;contents&quot; field.  So in this example we combine the name, email and username into &quot;contents&quot;, letting people type in any of them without having to specify which field to search.&lt;/p&gt;

&lt;h3&gt;Find those users&lt;/h3&gt;

&lt;p&gt;Finding users is equally straightforward.  We make a new action called &lt;code&gt;search&lt;/code&gt;:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    public function executeSearch()
    {
        require_once('Zend/Search/Lucene.php');
        $query = $this-&gt;getRequestParameter('q');
    
        $this-&gt;getResponse()-&gt;setTitle('Search for \'' . $query . '\' &amp;laquo; ' . sfConfig::get('app_title'), true);
    
        $hits = array();
    
        if ($query)
        {
            $index = new Zend_Search_Lucene(sfConfig::get('app_search_user_index_file'));
            $hits = $index-&gt;find(strtolower($query));
        }
        $this-&gt;hits = $hits;
    }

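&lt;/textarea&gt;&lt;/div&gt;

&lt;p&gt;The magic happens in our &lt;code&gt;if&lt;/code&gt; statement:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;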
    if ($query)
    {
        $index = new Zend_Search_Lucene(sfConfig::get('app_search_user_index_file'));
        $hits = $index-&gt;find(strtolower($query));
    }
&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;If we have a query, we open the &lt;a href=&quot;http://framework.zend.com/manual/en/zend.search.html&quot;&gt;ZSL&lt;/a&gt; index (note that we pass only one parameter here, so the existing index is opened rather than recreated).  We then run the &lt;code&gt;find&lt;/code&gt; method and store the results in the &lt;code&gt;$hits&lt;/code&gt; array.  Note that our query was lowercased with &lt;code&gt;strtolower&lt;/code&gt;, since &lt;a href=&quot;http://framework.zend.com/manual/en/zend.search.html&quot;&gt;ZSL&lt;/a&gt; is case sensitive.&lt;/p&gt;

&lt;p&gt;The template takes care of the rest:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    &lt;?php use_helper('Form');?&gt;
    &lt;?php echo form_tag('@search_users') ?&gt;
    &lt;?php echo input_tag('q'); ?&gt;
    &lt;?php echo submit_tag() ?&gt;
    &lt;/form&gt;
    &lt;?php foreach ($hits as $hit): ?&gt;
      &lt;?php echo $hit-&gt;score ?&gt;
      &lt;?php echo $hit-&gt;firstname ?&gt;
      &lt;?php echo $hit-&gt;lastname ?&gt;
      &lt;?php echo $hit-&gt;email ?&gt;
    &lt;?php endforeach ?&gt;
&lt;/textarea&gt;&lt;/div&gt;
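
&lt;p&gt;The form above points at a route named &lt;code&gt;@search_users&lt;/code&gt;, which isn't shown here.  A minimal &lt;code&gt;routing.yml&lt;/code&gt; sketch might look like this; the URL is hypothetical, and the module name &lt;code&gt;user&lt;/code&gt; is an assumption based on the &lt;code&gt;userActions&lt;/code&gt; class:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;yaml&quot;&gt;
    # hypothetical URL; module name assumed from userActions
    search_users:
      url:   /users/search
      param: { module: user, action: search }
&lt;/textarea&gt;&lt;/div&gt;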


&lt;p&gt;Fairly simple... but it could use some cleaning up (enjoy).&lt;/p&gt;

&lt;h3&gt;What about new users?&lt;/h3&gt;

&lt;p&gt;Regular reindexing might be nice in terms of having an optimized search index, but it's lousy if you want to be able to search the network immediately when new people join.  So why not automatically re-index each user every time they are created or every time one of their indexed fields changes?&lt;/p&gt;

&lt;p&gt;This is fairly simple to do by adding the following to the &lt;code&gt;User&lt;/code&gt; class:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    // set to true by the setters below whenever an indexed field changes
    protected $reindex = false;
    public function setUsername ( $v )
    {
        parent::setUsername($v);
        $this-&gt;reindex = true;
    }
    public function setFirstname ( $v )
    {
        parent::setFirstname($v);
        $this-&gt;reindex = true;
    }
    public function setLastname ( $v )
    {
        parent::setLastname($v);
        $this-&gt;reindex = true;
    }
    public function setEmail ( $v )
    {
        parent::setEmail($v);
        $this-&gt;reindex = true;
    }
&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;We have an attribute called &lt;code&gt;$reindex&lt;/code&gt;.  When it is false we don't need to worry about indexes.  When something significant changes, like an update to your name or email address, then we set &lt;code&gt;$reindex&lt;/code&gt; to &lt;code&gt;true&lt;/code&gt;.  Then when we save:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    public function save ($con = null)
    {
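        // save first so that a new user has an id to index
        $affectedRows = parent::save($con);
        if ($this-&gt;reindex) {
            require_once 'Zend/Search/Lucene.php';
            $index = new Zend_Search_Lucene(sfConfig::get('app_search_user_index_file'));
            // first find any references to this user and delete them
            $hits = $index-&gt;find('id:'. $this-&gt;getId());
            foreach ($hits AS $hit) {
                // $hit-&gt;id is the hit's internal document number, which is what delete() expects
                $index-&gt;delete($hit-&gt;id);
            }
        
            $doc = $this-&gt;generateZSLDocument();
            $index-&gt;addDocument($doc);
            $index-&gt;commit();
            // the index is current again, so clear the flag
            $this-&gt;reindex = false;
        }
        return $affectedRows;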
    }
&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;We're calling a new function called &lt;code&gt;generateZSLDocument&lt;/code&gt;.  It might look familiar:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    public function generateZSLDocument()
    {
    
        require_once 'Zend/Search/Lucene.php';
        $doc = new Zend_Search_Lucene_Document();
        $doc-&gt;addField(Zend_Search_Lucene_Field::Keyword('id', $this-&gt;getId()));
        $doc-&gt;addField(Zend_Search_Lucene_Field::Keyword('username', $this-&gt;getUsername()));
        $doc-&gt;addField(Zend_Search_Lucene_Field::Keyword('email', $this-&gt;getEmail()));
        $doc-&gt;addField(Zend_Search_Lucene_Field::Text('firstname', $this-&gt;getFirstname()));
        $doc-&gt;addField(Zend_Search_Lucene_Field::Text('lastname', $this-&gt;getLastname()));
        $doc-&gt;addField(Zend_Search_Lucene_Field::Unstored('contents', &quot;{$this-&gt;getEmail()} {$this-&gt;getFirstname()} {$this-&gt;getLastname()} {$this-&gt;getUsername()}&quot;));
        return $doc;
    }
&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;Now, whenever a user is updated, so is our index.  Additionally we can modify our reindex action:&lt;/p&gt;

&lt;div&gt;&lt;textarea name=&quot;code&quot; class=&quot;php&quot;&gt;
    public function executeReindex()
    {
        require_once('Zend/Search/Lucene.php');
        $index = new Zend_Search_Lucene(sfConfig::get('app_search_user_index_file'),true);
        
        $users = UserPeer::doSelect(new Criteria());
        foreach ($users AS $user)
        {
            
            $index-&gt;addDocument($user-&gt;generateZSLDocument());
        }
        
        $index-&gt;commit();
    }
&lt;/textarea&gt;&lt;/div&gt;


&lt;p&gt;That's a &lt;strong&gt;lot&lt;/strong&gt; easier to deal with.&lt;/p&gt;

&lt;h3&gt;...and beyond&lt;/h3&gt;

&lt;p&gt;Hope this article helps some of you jumpstart your &lt;a href=&quot;http://symfony-project.com/&quot;&gt;symfony&lt;/a&gt; apps.  Really cool, easy-to-implement search is here.  We no longer have to stick with shoddy solutions like HT://Dig or spend time rolling our own full-text search, as the &lt;a href=&quot;http://symfony-project.com/askeet/21&quot;&gt;symfony team diligently showed us we could&lt;/a&gt;.  But there is a lot more ground to cover, including optimization techniques and best practices.&lt;/p&gt;

&lt;p&gt;Let me know what you think, and if you use this in any of your apps.&lt;/p&gt;
</content>
 </entry>
 

</feed>

