<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Think Locally &#187; Yahoo</title>
	<atom:link href="http://www.loladex.com/tag/yahoo/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.loladex.com</link>
	<description>The Loladex Blog</description>
	<lastBuildDate>Thu, 22 Sep 2011 18:07:11 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Structured vs. unstructured</title>
		<link>http://www.loladex.com/2007/09/19/structured-vs-unstructured/</link>
		<comments>http://www.loladex.com/2007/09/19/structured-vs-unstructured/#comments</comments>
		<pubDate>Wed, 19 Sep 2007 14:56:00 +0000</pubDate>
		<dc:creator>lhooper</dc:creator>
				<category><![CDATA[Local search]]></category>
		<category><![CDATA[Acxiom]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[InfoUSA]]></category>
		<category><![CDATA[Kelsey]]></category>
		<category><![CDATA[Localeze]]></category>
		<category><![CDATA[Marchex]]></category>
		<category><![CDATA[Yahoo]]></category>
		<category><![CDATA[YellowBot]]></category>
		<category><![CDATA[Yelp]]></category>

		<guid isPermaLink="false">http://loladex.wordpress.com/2007/09/19/structured-vs-unstructured/</guid>
		<description><![CDATA[Yellow Pages folks surely do love structure — especially when it comes to data. Here at the latest Kelsey conference, where YP folks abound, the only good datum is a structured datum. Consider the title of yesterday&#8217;s most interesting panel: &#8230; <a href="http://www.loladex.com/2007/09/19/structured-vs-unstructured/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Yellow Pages folks surely do love structure — especially when it comes to data. Here at the latest <a href="http://www.kelseygroup.com/ddc2007/">Kelsey conference</a>, where YP folks abound, the only good datum is a structured datum.</p>
<p>Consider the title of yesterday&#8217;s most interesting panel:</p>
<blockquote><p><em>Building a Better Database: Acquiring Content in a Dysfunctional Environment</em></p></blockquote>
<p>The title is a bit grad school, but &#8220;dysfunctional&#8221; is a strong word that caught my eye. Here it mostly means &#8220;resistant to structure.&#8221;</p>
<p>And them&#8217;s fightin&#8217; words in the world of Yellow Pages.</p>
<p>By now I&#8217;ve gone to a bunch of YP-oriented conferences. All of them featured a discussion about how to gather structured data. But I&#8217;m starting to suspect that this isn&#8217;t the most important problem to solve — and not just because these conference discussions never go anywhere.</p>
<p>Here&#8217;s my thinking:</p>
<p>In what a YPer would call a <em><strong>functional</strong></em> environment, every business location, small or large, would authorize a regularly updated master version of its &#8220;attributes&#8221; (hours, certifications, parking facilities, etc.), and would post this information in some microformat on its Web site, or supply it directly to each data vendor, or send it to an industry-wide data clearinghouse that&#8217;ll probably never exist.</p>
<p>In addition, lots of other data sources — licensing bodies, rating sites, whatever — would distribute structured information that&#8217;s already normalized and can be correlated perfectly to these master records.</p>
<p>All this data would then be collated by data vendors such as <a href="http://www.localeze.com/">Localeze</a> and sold to Web companies such as Google or, for that matter, Loladex.</p>
<p>Finally, the Web companies would build applications that use the structured data for searching by consumers (input) and display to consumers (output).</p>
<p>This worldview may be summarized thus:</p>
<blockquote><p><em>More structured data in → Better answers out.</em></p></blockquote>
<p>Or as <a href="http://www.marchex.com">Marchex</a>&#8217;s Matthew Berk (who&#8217;s a smart guy) said at the panel here: &#8220;We think local search is about structured search.&#8221;</p>
<p>Berk gave a very good example, which I also use when discussing Loladex: If you&#8217;re looking for a doctor, you need to know whether he takes your insurance. That&#8217;s true, without a doubt.</p>
<p>But here&#8217;s the problem I have:</p>
<p>The majority of information available about any company, and particularly about any small company, will never be structured. It&#8217;ll exist only on the general Web, where it must be searched on its own terms — that is, as unstructured text.</p>
<p>To me, this suggests that the most pressing data problem isn&#8217;t how to gather more structured data, but how to search unstructured data (on Web pages) and return structured answers.</p>
<p>I live on both sides of this equation, by the way. My wife runs <a href="http://www.lolacookies.com/">a small cookie bakery</a>, and I&#8217;m in charge of distributing her data to online sources.</p>
<p>Because of my background, I&#8217;m more informed and motivated than most small business owners. And yet, to be honest, just keeping her Web site up-to-date is a chore. On <a href="http://www.yelp.com/biz/d0jV1XtWELN07tU9p2epKQ">Yelp</a> right now, I&#8217;m sorry to say, her hours are incorrect. I should update them, but I just haven&#8217;t.</p>
<p>Accuracy on our own Web site is always my #1 priority, because that&#8217;s our official voice. Also it&#8217;s where most people land when they search for &#8220;Lola Cookies.&#8221;</p>
<p>Keeping Yahoo Local accurate is on my list, too, but it&#8217;s lower down. Ditto Google and YellowPages.com and the other big sites.</p>
<p>I never think about the data vendors one layer back, like InfoUSA, unless they happen to call the store. (Which InfoUSA does, to its credit.)</p>
<p>Meanwhile, plenty of interesting and searchable information about the bakery exists in other places on the Web, in formats that aren&#8217;t even addressed by the concept of &#8220;attributes.&#8221;</p>
<p>A <a href="http://www.myfoxdc.com/myfox/pages/Home/Detail?contentId=4313924&amp;version=3&amp;locale=EN-US&amp;layoutCode=VSTY&amp;pageId=1.1.1&amp;sflg=1">TV broadcast from the bakery</a> aired live on the local morning news recently, for instance. If you watched the show, you might search for us with a term like &#8220;fox 5 cookies virginia.&#8221; Where does <em><strong>that</strong></em> fit in the world of structured data?</p>
<p>I raised this general issue at yesterday&#8217;s panel. What were the panelists doing about this wealth of unstructured Web data, which right now is the dark matter of the local-search universe?</p>
<p>The answer I got was, basically, &#8220;Not much.&#8221;</p>
<p>Most panelists said they do only highly targeted crawls, focusing on sites that have structured data that can &#8220;extend or validate&#8221; their own data, in the words of Localeze&#8217;s Jeff Beard. An example might be the site of a professional group such as the <a href="http://www.aoa.org/">American Optometric Association</a>.</p>
<p>No panelist was ready to start indexing the sites of individual businesses, or locally focused blogs, or any other sites that are unstructured but potentially rich in content.</p>
<p>The only (mild) exception was Erron Silverstein of <a href="http://www.yellowbot.com">YellowBot</a>, who also said his company limits itself to targeted crawls — but included local media, such as newspapers, among his targets.</p>
<p>A few players <em><strong>are</strong></em> indexing the broader Web and then associating pages with specific businesses (which is the important part). Most notable are Google and Yahoo, who do it for their local search products.</p>
<p>Of course, they&#8217;re already indexing the entire Web. It&#8217;s less of a stretch for them.</p>
<p>Google and Yahoo also buy structured data from InfoUSA, Localeze and others, so it&#8217;s not like such data is obsolete. But they&#8217;re getting the same info directly from some businesses, and those updates are likely more timely, more accurate, and more complete.</p>
<p>Meanwhile, their Web indices are opening up a realm of data that traditional vendors like Acxiom &#8212; represented by Jon Cohn on yesterday&#8217;s panel &#8212; simply don&#8217;t care to address.</p>
<p>I suspect that, sooner than you&#8217;d imagine, Google and Yahoo will be buying structured data not so that users can search it directly, but for two less-flattering reasons:</p>
<ol>
<li>To help find Web pages they can associate with each business</li>
<li>To fill ever-smaller gaps in the coverage that results from #1</li>
</ol>
<p>Matthew Berk of Marchex argued that a good local search must be structured to &#8220;help someone walk down the decision trail&#8221; by using filters to narrow their search progressively:</p>
<blockquote><p><em>I need an orthopedist in Boston &#8230; in the Back Bay &#8230; who accepts United Healthcare.</em></p></blockquote>
<p>I think users are more likely to learn that they can go to Google and type &#8220;orthopedist back bay united healthcare&#8221; &#8212; particularly if it produces a <a href="http://www.bostonexerciseworks.com/about/Kenneth_Ditzian.html">good top result</a> the first time they try.</p>
<p>The burden of local search, it seems to me, is to do something that Google can&#8217;t match with an unstructured Web search.</p>
<p>In any case, the search portals will ultimately use their indexed Web pages to extract and cross-check structured data directly. Over time &#8212; probably just a couple of years &#8212; such automated processes will yield data that&#8217;s more current and detailed than anything that&#8217;s produced by scanning phone books or calling stores.</p>
<p>The resulting search functionality, integrating both structured and unstructured data, will be sold to other companies as a Web service, and data vendors such as InfoUSA will become irrelevant to local search.</p>
<p>Now <em><strong>that</strong></em> would be a dysfunctional environment for many of the Kelsey attendees.</p>
<p>I&#8217;m not sure exactly how companies like InfoUSA and Acxiom should tackle the unstructured Web. It&#8217;ll demand a new way of thinking, and probably a new way of selling.</p>
<p>But I&#8217;m certain that they ignore unstructured data at their peril.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.loladex.com/2007/09/19/structured-vs-unstructured/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

