Local Search – what powers who

9 Jul 2007

There are a few headaches with local. While my current headache is as ambiguous as you can get, the persistent headache has always been search.

Search isn’t fun. Returning relevant results is a painful task, really – you take what the user typed, try to map it onto whatever categorization you have, factor in other variables (user rating, user favorites, distance from the end user, etc.), and try to surface the most relevant results.
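To make that juggling act concrete, here is a minimal Python sketch of blending those signals into one score. It is not how our engine works: the `Listing` fields, the weights, and the helper names are all made up for illustration, and a real system would tune every one of them.

```python
import math
from dataclasses import dataclass


@dataclass
class Listing:
    # Hypothetical record shape, for illustration only.
    name: str
    category: str
    rating: float     # average user rating, 0 to 5
    favorites: int    # how many users favorited the listing
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def text_match(query, listing):
    """Fraction of query terms found in the listing's name or category."""
    terms = query.lower().split()
    if not terms:
        return 0.0
    words = set((listing.name + " " + listing.category).lower().split())
    return sum(term in words for term in terms) / len(terms)


def score(query, user_lat, user_lon, listing):
    """Blend text match, rating, favorites and distance (assumed weights)."""
    text = text_match(query, listing)
    if text == 0.0:
        return 0.0  # no textual match at all: drop the listing
    rating = listing.rating / 5.0
    popularity = min(math.log1p(listing.favorites) / math.log1p(1000), 1.0)
    distance = haversine_km(user_lat, user_lon, listing.lat, listing.lon)
    proximity = 1.0 / (1.0 + distance)  # 1.0 next door, approaching 0 far away
    return 0.5 * text + 0.2 * rating + 0.1 * popularity + 0.2 * proximity


def search(query, user_lat, user_lon, listings, limit=10):
    """Return the best-scoring listings, highest score first."""
    scored = [(score(query, user_lat, user_lon, item), item) for item in listings]
    scored = [pair for pair in scored if pair[0] > 0.0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:limit]]
```

Note that the score depends on where the user is standing, so the same query gives a different ordering for every user.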

Not only that, but you have to scale it.

Simple search solutions break down once you hit 100,000+ unique records. More advanced search solutions start to groan under 1,000,000+ records. And at 10,000,000+ records, you have to be smart about it. The ‘where’ isn’t a simple text matching issue (which MySQL can handle for you) – there are multiple variables that are unique in each search. So pre-caching popular searches is also a dead-end solution.
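A toy Python illustration of why caching popular searches stops paying off once the ‘where’ is part of the query. The coordinates, the `cache_key` helper, and the rounding precision are all assumptions made up for this sketch.

```python
def cache_key(query, lat, lon, sort_by, precision=3):
    """Hypothetical cache key: everything that changes the result set."""
    return (query.strip().lower(), round(lat, precision), round(lon, precision), sort_by)


def hit_rate(requests, precision=3):
    """Fraction of requests whose key was already seen by an earlier request."""
    seen = set()
    hits = 0
    for query, lat, lon, sort_by in requests:
        key = cache_key(query, lat, lon, sort_by, precision)
        if key in seen:
            hits += 1
        else:
            seen.add(key)
    return hits / len(requests) if requests else 0.0


# 100 users all searching for "pizza", each from a slightly different spot.
requests = [("pizza", 43.60 + i * 0.002, -79.40 + i * 0.002, "distance") for i in range(100)]

# Ignoring location, every request after the first is a cache hit.
text_only = hit_rate([(query, 0.0, 0.0, "") for query, *_ in requests])

# Including location, no two requests share a key.
with_location = hit_rate(requests)
```

Rounding coordinates more coarsely brings the hit rate back up, but only by serving users results ranked for somewhere they are not.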

Then you want to add in extra features. Metaphones and stemming, while relatively trivial (in basic implementation), become a bit of a headache when dealing with a large number of records.
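For a sense of what the basic implementation looks like, here is a deliberately crude Python sketch of stemming at index time. The suffix list is invented for illustration and is far weaker than a real stemmer such as Porter's, and Metaphone (phonetic matching) is not shown at all.

```python
from collections import defaultdict

# Assumed suffix list, checked in order; a real stemmer has many more rules.
SUFFIXES = ("ing", "ers", "er", "es", "s")


def crude_stem(word):
    """Strip one common suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(records):
    """Map each stem to the set of record ids whose text contains it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in text.split():
            index[crude_stem(word)].add(record_id)
    return index


def lookup(index, query):
    """Return ids of records matching every stemmed query word."""
    ids = None
    for word in query.split():
        matches = index.get(crude_stem(word), set())
        ids = matches if ids is None else ids & matches
    return set(ids) if ids else set()


records = {
    1: "Plumbers downtown",
    2: "Plumbing supplies",
    3: "Pizza delivery",
}
index = build_index(records)
plumber_ids = lookup(index, "plumber")  # matches both "Plumbers" and "Plumbing"
```

The function itself is trivial; the cost comes from running it over every word of every record when the index is built, and again whenever the rules change.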

When it is all said and done, generating our search engine (one time) takes about 72-96 hours. This is the new version – the current one on iBegin Source is quite slow, and the new one should clock in at roughly 5x-10x faster. It should finally be live sometime next week.

I’ve always wondered what our competition uses.

A while ago I accidentally stumbled upon Judy’s Book search partner – Launch 21. The blog is quite detailed about the process behind Coupon Looker – interesting stuff.

So today I was intrigued when I saw Yelp’s Jobs page (with the title ‘About us’ for some reason). The first job posting was for a Senior Software Engineer – and the word that caught my eye was Lucene.

I’ve spent a lot of time researching and understanding larger search systems, like Lucene, Xapian, Sphinx Search, and others. All are quite nice (and pack power), but molding an existing search system for local search is neither an easy nor a fun task.

I’m curious what other companies are doing (too many to name) – any clue? Are they custom (like us), outsourced/custom (like JB’s Coupon Looker), or an existing system modified (like Yelp)?

2 Responses to Local Search – what powers who

GeoSign or eMedia? - Tech Soapbox

October 8th, 2007 at 2:56 pm

[...] UPDATE 2: Sort of an on-going discovery of mine, it seems like TrueLocal also uses lucene. Same thing that Yelp uses. [...]

Local SEO Guide

December 18th, 2007 at 7:16 pm

At InsiderPage we started out with a modified version of Verity which was not so great. Then we went for the gold-plated solution of FAST which was ok but not so great. Apparently one of the engineers who is insanely talented built the current search engine in about four weeks.
