Local data – categories, tags, structure, and taxonomy.

24 Sep 2007

I read an excellent post on structured vs unstructured data in the local space.

The problem with local data is an impossible human problem: people think differently. What is beautiful to me could be ugly to you. What is a kebab to me could be a skewer to you. A car could be a piece of trash, and so on.

On a related blog post, there was a discussion on building a better database. I’m not sure what Yellowbot was doing there (they just use Localeze data), but I am glad they were.

The entire argument for using a tagging system as your ‘base’ is shortsighted, mostly because (as I explained) people don’t see things the same way. My previous examples were generic – it gets even more confusing at the local data level. Is it a ‘gas station’ or a ‘service station’? A ‘doctor’ or a ‘medical practitioner’? And so on.

We were doing tagging in local space before anyone else (over 18 months now). You can see that users have taken it upon themselves to tag. Yet the same user can use different words when tagging an identical business (‘dry cleaners’ vs ‘laundry’ – even when they provide the exact same service).

Our team has been slogging through the categories used in iBegin Source for roughly the last month, and I’ve never come across a bigger headache. Our task was relatively simple – merge, rename, and prune the categories so that they are simpler to use and more obvious. But the breadth of business listings is enormous. Even getting it down to 10,000 categories is not a task for the faint-hearted (talk about constant cross-referencing to possible matching categories).

So – where do we end?

The core data needs structure. At iBegin we had originally attempted extremely loose categories – 8 in total, with tagging to control the rest. Even that caused problems – what about the establishment that is a restaurant until 10 pm, and then exclusively a bar from 10 pm to 2 am? And tagging was great in two ways – it allowed users to participate in a simple way (adding a word or two is relatively trivial), and it improved our metadata (the most important quality in local search). Multiple categories (e.g. the place is both a restaurant and a bar) + tagging = where you want to be.
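
A minimal sketch of the "multiple categories + tagging" idea, in Python. Everything here is hypothetical and for illustration only (the `Listing` class, the tiny `CATEGORIES` vocabulary, and the `search` helper are assumptions, not iBegin's actual schema): categories come from a fixed, curated list, while tags are free-form words added on top.

```python
from dataclasses import dataclass, field

# Hypothetical curated vocabulary; a real category list is far larger.
CATEGORIES = {"Restaurant", "Bar", "Dry Cleaner"}


@dataclass
class Listing:
    name: str
    categories: set = field(default_factory=set)  # curated, top-down
    tags: set = field(default_factory=set)        # free-form, user-supplied

    def add_category(self, category):
        # Only categories from the curated vocabulary are accepted.
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self.categories.add(category)

    def add_tag(self, tag):
        # Tags are unrestricted; only case and outer whitespace are normalized.
        self.tags.add(tag.strip().lower())


def search(listings, term):
    """Return names of listings whose categories or tags match the term exactly."""
    needle = term.strip().lower()
    return [
        listing.name
        for listing in listings
        if needle in {c.lower() for c in listing.categories}
        or needle in listing.tags
    ]


# A place that is a restaurant early and a bar late simply carries both categories.
joes = Listing("Joe's")
joes.add_category("Restaurant")
joes.add_category("Bar")
joes.add_tag("Late Night")
```

With this shape, a search for either 'bar' or 'restaurant' finds the same listing, and the tags add extra metadata without changing the core classification.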

So what’s the conclusion?

Categories are needed from a top-down level in order to classify businesses properly. A purely user-based system cannot work because too much freedom leads to a mess that cannot be properly organized (much less properly monetized). Tagging on top is a great way to build up a taxonomy – cheap metadata creation that augments your core classification.

12 Responses to Local data – categories, tags, structure, and taxonomy.

Marc Miles

September 24th, 2007 at 3:58 pm

Well, we can’t wait to see what you have. I tried doing this with your data just for WA state. I made good progress and stopped when I heard you were doing it, but even a taxonomist will flatten it out a ton in the end. One thing though, I do like Yellowpages.com categories.

Specifically I’m speaking of the categories in the middle of a city page… http://www.yellowpages.com/Seattle-WA. Automotive can be a blast, maybe flatten that to anything that can move with an engine (airplanes, cars, trucks, trains, RVs etc) and two sub cats “dealers” and “repairs/services”.

Ahmed

September 24th, 2007 at 4:10 pm

Well – my fiancee is a biology major who took a class in taxonomy, and even she finds business categorization a nightmare :)

What YellowPages.com has are super-categories – essentially a middleman that relates several different categories to one primary category. We intend on releasing this, but a few weeks after the primary category-cleanup is done.
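
A super-category layer of this kind can be sketched as a simple lookup, in Python. The category names below are made up for illustration (loosely following the "Automotive" example above) and are not the actual YellowPages.com or iBegin lists.

```python
# Hypothetical super-categories, each grouping several primary categories.
SUPER_CATEGORIES = {
    "Automotive": ["Auto Dealers", "Auto Repair", "RV Dealers"],
    "Food & Dining": ["Restaurants", "Bars"],
}

# Invert the grouping so each primary category points at its super-category.
PARENT_OF = {
    child: parent
    for parent, children in SUPER_CATEGORIES.items()
    for child in children
}


def super_category(category):
    """Return the super-category for a primary category, or None if ungrouped."""
    return PARENT_OF.get(category)
```

Because the layer sits beside the primary categories rather than inside them, it could be added or revised later without touching the listings themselves.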

emad

September 25th, 2007 at 11:03 am

Ahmed,

I’m pretty sure you weren’t the first to do tagging…just google it…in fact, you copied us since some of us were the first ones to build a local search site…like I said, just do your homework!

We will be hosting a tech meeting shortly (tentatively scheduled for Oct 12) at our offices in Burbank, CA. I’d like to send you an open invitation to show up so we can go over our products and our entire staff’s (not just executives) experience since you called that into question. We can have the people vote! I can pick at your post here but, instead, I’d like to invite you to come to our offices and discuss this in an open forum and squash all yellowbot bashing by you on this blog! :-) I’d invite the Loladex guy as well…but he hasn’t finished building his competing local search product yet.

Ahmed

September 25th, 2007 at 11:12 am

Oh I’m wrong – a lot. I don’t deny that.

If you can show me a local-search site using tagging to build up taxonomy, I will be *more* than happy to update the post and point it out :)

You seem to be thinking this Yellowbot grudge is personal (when it isn’t). From my experience I believe tagging as the foundation is a bad idea – it creates too much ambiguity (and I say this as someone who has tried that very idea).

My original post on Yellowbot was critical for a reason – I was helping you out. I didn’t just say ‘oh this is stupid’ or ‘oh that is crap’ – I specifically pointed out aspects that could be much improved (with suggestions on how). More than anything else I was helping *you* out by pointing out errors I picked up. If you prefer I never give any critical feedback again, let me know.

emad

September 25th, 2007 at 11:30 am

Ahmed,

Interesting that you only touched on a few of my points…not all of them. Were you the first to build a local search site, for example? ;-) In regards to tagging, I seem to recall Yahoo doing it a while ago (but I don’t remember when). I also remember a Local.com/Furl relationship that allowed tagging of businesses (this one seems to be a couple of years old according to google)…there are plenty of others.

I never thought this was personal…we don’t even know each other (I’d love to meet you…perhaps at the November Kelsey conference?)…but you do call our experience into question in your post without researching who we are (when we meet, I’ll be glad to go over that with you if you prefer not to Google it yourself).

Your original post on yellowbot stated: “And so another chapter in my ‘reviewing and criticizing local search sites’.” which implies you like to pick on competing local search sites.

So, would you like to take us up on our offer to come to our offices? The meeting will have product, UI, developers, entrepreneurs, VCs, and the like…people I would consider our peers. Who better to judge?

In any case, I wish you the best of luck with iBegin. We should really take this offline at some point. Email me for my phone number. I’d love to meet with you sometime…preferably to discuss some basketball or something. :-)

Ahmed

September 25th, 2007 at 11:39 am

Us? Local search first? Not even close. When did I claim that? :)

Like I said – I have found that using tag as the base taxonomy leads to messy results. Warren Buffett could disagree with me and that wouldn’t matter – it is my opinion. My point about Yellowbot in the specific context of the seminar was while the other three are data collectors/sellers, you guys aren’t. I would have said the same if Yelp or InsiderPages or anyone else was there – it is a whole different ballgame when you are dealing with the raw data.

As for reviewing and criticizing local search sites – I think you missed the point. I am posting about this on my blog, which quite a few customers of ours (and followers of the local space) read. I am giving you exposure, and also pointing out things you could do to make your site better. All win-win imo.

I would love to visit the Yellowbot office. I was just in LA last weekend, and I’ve actually lived in Burbank before (right next to the WB studio) :) I don’t know if I myself will be at the Kelsey conference (color me shy), but someone from iBegin will be.

This isn’t a zero sum game. I prefer smaller agile companies (like yourself) to win compared to bigger monolithic companies – always feels better.

As for basketball – I’ve been a fan of the Toronto Raptors since the days of Stoudamire. And I think for once the Raptors > Lakers | Clippers :)

You are always welcome to email me – ahmed at ibegin.com

Ask Bjørn Hansen

September 25th, 2007 at 2:14 pm

Hi Ahmed,

“My point about Yellowbot in the specific context of the seminar was while the other three are data collectors/sellers, you guys aren’t.”

We’re both users of data (as you know, most feeds out there require some “post processing” to be usable) and believe me we are collectors. While Localeze is our “base feed”, we have hundreds of feeds that are being munged and merged. :-)

Do let us know next time you are in L.A. and we’ll buy you lunch.

– ask

Ahmed

September 26th, 2007 at 10:33 am

I know that, Ask – but then you can also include Yelp/InsiderPages/CitySearch/etc in the equation.

Matt

October 2nd, 2007 at 8:19 am

We came up against the problem of connecting similar but not identical tags a few months ago. We thought the best thing to do, particularly when we have a small pool of data (for example, a local site with 100 users instead of 100,000) was to give the search some brains to connect the tags. Has anybody ever built in a thesaurus to their search so that a user can find results tagged “cappuccino” when they run a search for “coffee”?

Ahmed

October 2nd, 2007 at 10:51 am

Yeah Matt – there are taxonomy companies that basically sell you lists that associate various phrases with whatever categories you have.

As in your example (and using our data) – searching for ‘cappuccino’ or ‘coffee’ or ‘hot chocolate’ (etc) would get mapped to ‘Restaurants – Coffee House’

We intend on publishing such a list eventually, but not yet.
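
A phrase-to-category list of this kind can be sketched as a small lookup table, in Python. The three phrases and the category name come from the example above; the table layout and the `resolve` helper are assumptions for illustration, not a published iBegin list or any vendor's format.

```python
# Hypothetical phrase list; a licensed taxonomy would hold many thousands of entries.
PHRASE_TO_CATEGORY = {
    "cappuccino": "Restaurants – Coffee House",
    "coffee": "Restaurants – Coffee House",
    "hot chocolate": "Restaurants – Coffee House",
}


def resolve(query):
    """Map a search phrase to a category, or None if the phrase is not listed."""
    # Lowercase and collapse runs of whitespace before the lookup.
    phrase = " ".join(query.lower().split())
    return PHRASE_TO_CATEGORY.get(phrase)
```

A search for 'cappuccino' and a search for 'coffee' then land on the same category, which is the thesaurus behaviour Matt asked about, handled at the category level instead of the tag level.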

Understanding Google Maps & Yahoo Local Search » Local Links of Interest

October 4th, 2007 at 10:45 am

[...] an older (but important) post from Ahmed at TechSoapbox on Local data – categories, tags, structure, and taxonomy. I meant to reference earlier, but still well worth a read « Google Maps to Navteq: [...]

Mark

November 8th, 2007 at 4:33 pm

My company, WAND, Inc., has developed a local search taxonomy that we are licensing and that can be mapped to any category structure you are using, including the iBegin categories. I’d be happy to speak to anybody about it if you get in touch. We can easily be found on the web under ‘local search taxonomy’.
