How to add search with Solr

Discover how to set up Solr to index data and return search results via a PHP-based application

Search is the lifeblood of the web, but on many sites it can be a decidedly underwhelming experience, lacking the richness and accuracy that users have come to expect from Google.

If you are developing sites using PHP and MySQL, search can be very difficult to do well – especially if you need it to be quick as well as comprehensive.
So, what is the solution that will give your sites a search facility that will work quickly and accurately?

Enter Apache Solr: an open source search tool providing fast and accurate results on a very scalable platform, with a whole host of advanced features: faceting, spellchecking, hit highlighting, result boosting and more. It runs as a Java servlet, so needs a container like Tomcat or Jetty – to get started, we’ll be using the build of Jetty supplied with the Solr download. It’ll run on OS X, Windows or Linux at the command line.

We’ll introduce the key Solr schema file, set one up for our sample application, then walk through using a PHP library to connect to our Solr server, index data from our database and return search results.


Start with a table

Let’s start by creating the database, which has a single table to store products. Fire up a MySQL client (MySQL Workbench, phpMyAdmin or the command line are all good choices) and run these commands:

CREATE DATABASE solrtut;
USE solrtut;
CREATE TABLE products (
  product_id INT NOT NULL AUTO_INCREMENT,
  name VARCHAR(100),
  description TEXT,
  price DECIMAL(6,2), -- exact two-decimal prices up to 9999.99
  PRIMARY KEY (product_id)
);

Fill the table up

Now let’s add some data – you can either use the command line, or run the supplied product-data.sql file. This table will get more sophisticated in the next tutorial, but for now this will give us five vintage-computing-themed products to start indexing and searching, with a range of prices.

INSERT INTO products (product_id, name, description, price) VALUES (1, 'Amstrad CPC 464', 'Vintage computing, horrible design', 29.99);
INSERT INTO products (product_id, name, description, price) VALUES (2, 'Vic 20', 'Now you are talking – pure retro joy', 14.99);
INSERT INTO products (product_id, name, description, price) VALUES (3, 'ZX Spectrum', 'One for the rubber enthusiasts. 48 whole k.', 49.99);
INSERT INTO products (product_id, name, description, price) VALUES (4, 'Commodore 64', 'Games Central', 39.99);
INSERT INTO products (product_id, name, description, price) VALUES (5, 'BBC Micro Model B', 'State of the art turtle navigation.', 129.99);

Start Solr

Download the 4.x release from the Solr website and unzip it to a convenient location on your machine (we used /opt/solr-4.2.0/). Solr comes supplied with an example application, ready to run – in a subdirectory of the unzipped Solr build called ‘example’. In the terminal in OS X, a shell in Linux or the command line in Windows, go to the example directory and start Solr:

java -jar start.jar

The Solr admin panel

You can now browse to the Solr admin panel from any web browser at localhost:8983/solr/. You should see the Solr admin panel, which we’ll look at in more detail a little later. If you can’t browse to it, look at the terminal output to see if there are any error messages. If all is well, stop Solr running by hitting Ctrl+C in your terminal.

The Solr schema

In the example app we’re looking at, the schema.xml file lives at /example/solr/collection1/conf/schema.xml. Browse to this file and open it in your text editor of choice. Next you’re going to add some fields to your schema – these correspond directly to the fields you have already set up in the database. Solr is very flexible, allowing you to define data types with a huge variety of filters, but here we’re using some stock ones from the example schema.
Find the field definition for ‘id’ (<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />) and underneath it, add these lines:

<field name="product_name" type="text_general" indexed="true" stored="true"/>
<field name="product_description" type="text_general" indexed="true" stored="true"/>
<field name="product_price" type="float" indexed="true" stored="true"/>

Check Solr schema

After saving schema.xml, start the Solr instance again with java -jar start.jar, run from the example directory as before. Browse to the admin panel at localhost:8983/solr/. If you get any error messages, stop Solr by hitting Ctrl+C, check that the schema file has no typos and is well formed, then try again.

A blank application

Now it’s time to set up your PHP application. You should have a web server set up on your development machine, and a directory that we can use for this tutorial (either the site root, a virtual host or a subdirectory). We’ll refer to this location as the website home directory. We’ll also assume you can browse to the site at http://localhost – if you are using a different hostname, or a subdirectory, just use that instead. Now go to your website home directory and create four empty files with the following names:

export.php
search.php
constants.php
results.php

Time for constants

There are some variables we’ll need to use repeatedly – database connection parameters, and the host, name and port of our Solr instance. To make life easy, we’ll define those in our constants file. Open constants.php in your text editor and add the following lines, making sure you add your own database username and password where appropriate:

<?php
define('DBUSER', 'root'); //change this value to your DB username
define('DBPASS', 'password'); //change this value to your DB password
define('DBHOST', 'localhost');
define('DBSCHEMA', 'solrtut');
define('SOLRNAME', '/solr');
define('SOLRPORT', '8983');
define('SOLRHOST', 'localhost');
?>

Solr PHP clients

There are a number of clients available for PHP that work well with Solr, including the excellent Solarium. The simplest to get up and running with is the Solr PHP Client. Download the latest archive from the project’s site and unzip it into a subdirectory of your website home directory called SolrPHPClient. Once the archive is extracted, you can remove the zip file.

Fixing Solr PHP client

There is an issue with the Solr PHP Client that prevents it from working with the current version of Solr (4.2), due to some Solr commands being deprecated. To make sure everything is working as it should, we need to apply a patch file. Download the Service.php.patch file and save it in the same directory as Service.php in the client – /SolrPHPClient/Apache/Solr/Service.php.patch. Then you can apply it either from the command line with the patch utility (for example, patch Service.php < Service.php.patch, run from that directory) or with any patching tool (NetBeans has one built in – just select Tools>Apply Diff Patch from the main menu).

Preparing data

Now let’s add data from the database to the Solr index. We’ll start by writing a skeleton file to connect to the database and grab the records. Open export.php in your text editor and add the following lines:

<?php
require('constants.php');
//connect using the settings defined in constants.php
$mysqli = new mysqli(DBHOST, DBUSER, DBPASS, DBSCHEMA);
if ($mysqli->connect_errno) {
    echo "Failed to connect to MySQL: (" . $mysqli->connect_errno . ") " . $mysqli->connect_error;
    exit; //nothing more to do without a database connection
}
/* Select queries return a resultset */
if ($result = $mysqli->query("SELECT * FROM products")) {
    printf("Select returned %d rows.\n", $result->num_rows);
    /*We're going to add rows to Solr here*/
    /* free result set */
    $result->close();
}
?>

Indexing data

You should now be able to browse to localhost/export.php and see a result count. Now we know we can connect to the database and access our records, let’s loop through our results and send them to Solr. In export.php, replace the line ‘/*We’re going to add rows to Solr here*/’ with this code:

//declare an empty array to hold our data to send to Solr
$documents = array();
require_once('SolrPHPClient/Apache/Solr/Service.php');
$solr = new Apache_Solr_Service(SOLRHOST, SOLRPORT, SOLRNAME);
//$result is the mysqli result set opened by the SELECT query
while ($row = $result->fetch_object())
{
    // For each row, create a new Solr doc
    $document = new Apache_Solr_Document();
    //field names must match the ones defined in schema.xml
    $document->id = $row->product_id;
    $document->product_description = $row->description;
    $document->product_name = $row->name;
    $document->product_price = $row->price;
    //add document to array
    $documents[] = $document;
}
if (!empty($documents))
{
    $solr->addDocuments($documents);
    $solr->commit();
    $solr->optimize();
}
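
The schema fields are named product_name, product_description and product_price, while the table columns are name, description and price, so the two are easy to mix up. As a minimal sketch, the mapping could be kept in one place. The function rowToSolrFields below is a hypothetical helper, not part of the Solr PHP Client, and it returns a plain array so that it can be tried without the library or a database.

```php
<?php
// Hypothetical helper: maps one products row (an object such as the one
// returned by mysqli_result::fetch_object()) to the schema.xml field names.
function rowToSolrFields(object $row): array
{
    return array(
        'id'                  => (string) $row->product_id,
        'product_name'        => $row->name,
        'product_description' => $row->description,
        'product_price'       => (float) $row->price,
    );
}

// Example row, built by hand instead of being fetched from MySQL.
$row = (object) array(
    'product_id'  => 2,
    'name'        => 'Vic 20',
    'description' => 'Now you are talking',
    'price'       => '14.99',
);

print_r(rowToSolrFields($row));
```

In export.php, each key and value from this array could then be assigned to the Solr document inside the loop, in the same way the individual properties are set above.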

Pushing data to Solr

After making sure the Solr server is running (start it if it isn’t), refresh localhost/export.php in your browser. Solr will now be populated with the records from our database. There are many ways to get data into Solr, including simply sending it an appropriately formed XML file using cURL. Using the technique above, however, allows us to do some simple error checking and is very effective for replacing the contents of an entire Solr index.
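
For comparison, the XML route mentioned above sends Solr an add message containing one doc element per record. The sketch below only builds that message as a string; buildSolrAddXml is a hypothetical helper name, and the field names are the ones from our schema.

```php
<?php
// Hypothetical helper: builds a Solr XML <add> message from an array of
// documents, where each document is an array of field name => value.
function buildSolrAddXml(array $documents): string
{
    $flags = ENT_XML1 | ENT_QUOTES;
    $xml = '<add>';
    foreach ($documents as $fields) {
        $xml .= '<doc>';
        foreach ($fields as $name => $value) {
            // Escape both the field name and the value for use in XML.
            $xml .= '<field name="' . htmlspecialchars((string) $name, $flags, 'UTF-8') . '">'
                  . htmlspecialchars((string) $value, $flags, 'UTF-8')
                  . '</field>';
        }
        $xml .= '</doc>';
    }
    return $xml . '</add>';
}

echo buildSolrAddXml(array(
    array('id' => '4', 'product_name' => 'Commodore 64', 'product_price' => 39.99),
)), "\n";
```

The resulting string would be posted to Solr’s update handler with an XML content type and followed by a commit; the exact URL depends on your Solr setup, so treat that part as an assumption to check against your own instance.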

Solr query syntax

Now we can check that the data is actually present by running some simple queries against our Solr instance. The query syntax is rather different to SQL – you start with the field name you want to query, then a colon, then the value you want to match, like ‘product_name:Vic’ (put a multi-word phrase in double quotes, as in ‘product_name:"Vic 20"’). To start with, we just want a match-all query to make sure our data is present and correct: ‘*:*’ will do that. We run it by sending an HTTP GET request to our Solr instance with the query string in a parameter called ‘q’ – so when we browse to localhost:8983/solr/select?q=*:* we should see our five records.
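
The same request can be assembled in PHP, which also takes care of URL-encoding the query. This is a rough sketch: buildSelectUrl is a hypothetical helper, its defaults mirror the values in constants.php, and the wt=json parameter simply asks Solr for a JSON response instead of XML.

```php
<?php
// Hypothetical helper: builds the URL of a Solr select request.
function buildSelectUrl(string $query, string $host = 'localhost', int $port = 8983, string $path = '/solr'): string
{
    // http_build_query() URL-encodes each parameter value for us.
    $params = http_build_query(array(
        'q'  => $query,  // the Solr query string, for example *:*
        'wt' => 'json',  // response format
    ));
    return 'http://' . $host . ':' . $port . $path . '/select?' . $params;
}

echo buildSelectUrl('*:*'), "\n";
echo buildSelectUrl('product_name:"Vic 20"'), "\n";
```

Pasting either printed URL into a browser while Solr is running is a quick way to see the raw response before any PHP display code is involved.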

Search page skeleton

So we have a Solr index with some data in; now we want to make a form and results page so we can search it from our PHP application. We’ll start by making the form, which in this case is about as basic as it’s possible to get – open search.php in your text editor and add the following code:

<html>
  <body>
    <form action="results.php" method="get">
      <label for="query">Search:</label>
      <input id="query" name="query" placeholder="Enter your search" />
      <input type="submit"/>
    </form>
  </body>
</html>

Search page details

Now you have a query form, let’s make a page to get some results. Open results.php in your text editor, then put the following comment skeleton in place:

<?php
//1. check that a query has been submitted, send user back to search page otherwise
//2. if we have a query term, connect to Solr, query and grab the result
//3. check the results – are there any? If not, display an appropriate message
//4. if there are results, iterate through them and display
?>

Get the query term

Some basic control here: check that the query string has been submitted, and if not, redirect back to our search page. Obviously all the usual advice about sanitising user input applies – in production, you should treat the user input that you’re passing to Solr with the same caution you’d use with anything being passed to a database server. So, in results.php, enter the following code under the comment that starts ‘//1. check that a query…’

if(!isset($_REQUEST['query']) || empty($_REQUEST['query']))
{
    header("Location: http://localhost/search.php");
    exit; //stop here so the rest of the page does not run without a query
}
else
{
    $query = $_REQUEST['query'];
}
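
One way to act on the sanitising advice is to backslash-escape the characters that have a special meaning in Solr’s query syntax, so that user input is matched literally. The following is a minimal sketch of that idea; escapeSolrTerm is a hypothetical helper written for this illustration.

```php
<?php
// Hypothetical helper: puts a backslash in front of every character that
// has a special meaning in the Solr/Lucene query syntax.
function escapeSolrTerm(string $input): string
{
    // Character class covering + - & | ! ( ) { } [ ] ^ " ~ * ? : \ /
    $pattern = '/[+\-&|!(){}\[\]^"~*?:\\\\\/]/';
    return preg_replace($pattern, '\\\\$0', $input);
}

echo escapeSolrTerm('Vic 20'), "\n";           // prints: Vic 20
echo escapeSolrTerm('product_name:Vic'), "\n"; // prints: product_name\:Vic
```

Note the trade-off: once the colon is escaped, visitors can no longer type field-specific queries such as ‘product_name:Vic’ into the form, so escaping everything suits a plain keyword search box rather than the raw query box used in this tutorial.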

Query Solr

We have a query, so we are going to connect to Solr using the same mechanism as we used when populating the index. Then we’ll use the search method of the Solr PHP library, which accepts a query string, an offset and a limit as parameters – the offset and limit work exactly as they do with MySQL queries, determining the starting record and the number of results respectively. So, in results.php, enter the following code under the comment that starts ‘//2. if we have a query term…’

//our required includes
require_once('constants.php');
require_once('SolrPHPClient/Apache/Solr/Service.php');
//instantiate a Solr object
$solr = new Apache_Solr_Service(SOLRHOST, SOLRPORT, SOLRNAME);
//run the query
$results = $solr->search($query, 0, 10);
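
Because the offset and limit behave like their MySQL counterparts, paging through results only needs a little arithmetic. As an illustration, the hypothetical helper pageToOffset below converts a page number into the offset to pass to the search call; the page size of 10 matches the limit used above.

```php
<?php
// Hypothetical helper: converts a 1-based page number into a result offset.
function pageToOffset(int $page, int $perPage = 10): int
{
    $page = max(1, $page); // treat zero or negative pages as page 1
    return ($page - 1) * $perPage;
}

echo pageToOffset(1), "\n"; // prints: 0
echo pageToOffset(3), "\n"; // prints: 20
```

With this in place, the second page of results would be requested by passing pageToOffset(2) as the offset and 10 as the limit.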

Checking results

Now you have a results object, which is stored in $results. First, check that the query ran successfully by testing that $results is not empty, since a failed query should not leave a usable result object in it. If all is well, get the number of results, which is stored in $results->response->numFound, and display it appropriately. This check belongs in results.php under the comment that starts ‘//3. check the results…’
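
A rough sketch of that check follows. The helper describeResults and its messages are hypothetical; the only thing it assumes about the results object is the response->numFound property described above, and the example uses a hand-built stand-in object instead of a live Solr response.

```php
<?php
// Hypothetical helper: turns a results object into a message for the user.
function describeResults($results): string
{
    // An empty or false value means the query did not run successfully.
    if (!$results) {
        return 'Sorry, the search could not be completed.';
    }
    $found = (int) $results->response->numFound;
    if ($found === 0) {
        return 'No products matched your search.';
    }
    return $found . ' product(s) found.';
}

// Stand-in for a Solr response, built by hand for illustration.
$fake = (object) array(
    'response' => (object) array('numFound' => 5, 'docs' => array()),
);
echo describeResults($fake), "\n"; // prints: 5 product(s) found.
```

In results.php, the returned message would be echoed before the results table, and the table itself skipped when nothing was found.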

Display results

By now you’ll know whether the query ran and if it has produced any results, so the next stage is to iterate through them and display them to the user. We are passing the results through htmlspecialchars() to make sure any special characters in the Solr output are converted to appropriate HTML entities. So, in results.php enter the following under the comment that starts ‘//4. if there are results…’

echo '<table>';
echo '<tr><th>ID</th>' .
     '<th>Name</th>' .
     '<th>Description</th>' .
     '<th>Price</th></tr>';
//one table row per document returned by Solr
foreach ($results->response->docs as $doc)
{
    echo '<tr><td>' . htmlspecialchars($doc->id) . '</td>' .
         '<td>' . htmlspecialchars($doc->product_name) . '</td>' .
         '<td>' . htmlspecialchars($doc->product_description) . '</td>' .
         '<td>' . htmlspecialchars($doc->product_price) . '</td></tr>';
}
echo '</table>';

Error handling

Finally, you can add a little error handling around the Solr statement (this can also be applied to the export.php file). As we’re connecting to an external service (our Solr server), there is an obvious risk that the service may be unavailable, causing a fatal error. So here we can wrap the connection in a try/catch block, to handle the error and display an appropriate message.

try
{
    //instantiate a Solr object
    $solr = new Apache_Solr_Service(SOLRHOST, SOLRPORT, SOLRNAME);
    //run the query
    $results = $solr->search($query, 0, 10);
}
catch(Exception $e)
{
    //you would probably want to log this error and display an appropriate
    //(user friendly) message on a production site
    echo($e->__toString());
}
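
The comment in the catch block suggests logging the error and showing a friendlier message on a production site. One possible shape for that is sketched below; safeSearch is a hypothetical wrapper, the message text is only an example, and the failing search is simulated with a callback that throws, so no Solr server is needed to try it.

```php
<?php
// Hypothetical wrapper: runs a search callback and converts any exception
// into a log entry plus a user-friendly message.
function safeSearch(callable $search): array
{
    try {
        return array('results' => $search(), 'error' => null);
    } catch (Exception $e) {
        // Keep the technical detail in the error log, not on the page.
        error_log('Solr search failed: ' . $e->getMessage());
        return array(
            'results' => null,
            'error'   => 'Search is temporarily unavailable. Please try again later.',
        );
    }
}

// Simulate an unavailable Solr server with a callback that throws.
$outcome = safeSearch(function () {
    throw new RuntimeException('Connection refused');
});
echo $outcome['error'], "\n";
```

In results.php, the callback would instead create the Apache_Solr_Service object and return the value of its search call, and the page would display the error entry whenever it is not null.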

Running a search

You can now browse to localhost/search.php and try a search like ‘product_name:Vic’ or ‘*:*’. Matching products are listed in the results table, each with its ID, name, description and price.