Semantic Web

Web 3.0 is Official With Schema.org

The three main search engines (Google, Bing, and Yahoo) have finally agreed on a standard for structured data and on how to mark it up. Schema.org has been released, and it documents how publishers can standardize the structure of their content.

The potential effect on users is huge: machines get a big boost in intelligence simply because we feed them knowledge, not because they have become intelligent enough to understand things the way a human being can.

None of this is actually new, and there is no technological breakthrough. It is simply a human convention that we agree upon, so that we can feed computers data together with the "meanings" of that data. Hence the name 'semantic web'.

Imagine how much processing and statistical analysis a computer or a search engine needs in order to decide when to treat "apple" as a fruit and when to treat it as a computer company. Imagine something more dramatic like "banana republic": two unrelated words, each from a different field, and yet the phrase means something completely different.

What will happen now is that when someone writes an article, they will mark it up with the relevant semantic tags, so that the computer understands how to treat each string of letters. Users reading the article will see the same content, but the computer will see structured data.
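To make this concrete, here is a minimal sketch of how one piece of markup can serve both audiences. The page fragment, the recipe, and the `ItemPropParser` class are all made up for illustration; the `itemscope`, `itemtype`, and `itemprop` attributes are the microdata attributes that Schema.org documents. It assumes each tagged element contains plain text only, which a real extractor could not rely on.

```python
from html.parser import HTMLParser

# Hypothetical article fragment marked up with schema.org microdata.
PAGE = """
<div itemscope itemtype="http://schema.org/Recipe">
  <h1 itemprop="name">Banana Bread</h1>
  <p>Category: <span itemprop="recipeCategory">Dessert</span></p>
  <p>Makes <span itemprop="recipeYield">1 loaf</span></p>
</div>
"""


class ItemPropParser(HTMLParser):
    """Collects the visible text and the itemprop values in one pass."""

    def __init__(self):
        super().__init__()
        self.item_type = None   # what kind of thing the page describes
        self.props = {}         # structured data, as the machine sees it
        self.text = []          # raw text pieces, as the reader sees them
        self._current = None    # itemprop currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemtype" in attrs:
            self.item_type = attrs["itemtype"]
        if "itemprop" in attrs:
            self._current = attrs["itemprop"]
            self.props[self._current] = ""

    def handle_data(self, data):
        self.text.append(data)
        if self._current is not None:
            self.props[self._current] += data

    def handle_endtag(self, tag):
        # Sketch assumption: tagged elements contain no nested tags.
        self._current = None


parser = ItemPropParser()
parser.feed(PAGE)

# What the human reads: the tags are invisible.
visible_text = " ".join("".join(parser.text).split())

print(visible_text)
print(parser.item_type, parser.props)
```

The reader's view is just the sentence-like text, while the program ends up with a typed record it can use without guessing what "Banana Bread" refers to.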

The effect can easily be seen in search engine results, and this is where search engines graduate to the next level of usefulness and become, as Bing claimed to want to be, decision engines.

They will give us meaningful information about what we are searching for, so that we can better decide where to go.

A big category of content is recipes:

 

[Image: Structured Data Search Results]

Traditionally, search engines would try to figure out which part of the page is most relevant to your search query, and then show the closest match in the search result snippets. This required the search engine to understand what you meant by the query in order to give you something useful. Semantic markup helps the search engine tremendously, especially if the user includes something helpful like "recipe" in the query, and it gives users relevant results in two ways:

  1. Displaying relevant information: this is usually presented in gray, right under the headline or title of each result. Because the search engine has figured out which pages you might be interested in, it shows you information from each page that helps you decide which result is best for you. In the recipe case above, you are probably interested in the calorie count and the preparation time when choosing between the options.
  2. Special search options: since each type of content has its own set of attributes and uses, we need a different set of tools to refine results for each type of information. In a different example, after Google figured out that "laptop" is probably a shopping query, it gave me shopping-specific options. It asked for my location so that it could give me results "nearby", and it also presented a whole set of attributes that I could use to filter for the right laptop.

[Image: Structured Data Search Results]

The really important thing here is that the search engine doesn't need deep knowledge of the different options a laptop has, or of how to display them. That would require a lot of processing and intelligence.

The content publishers, the sellers of laptops in this case, have already done the homework for Google: when they published their products, they semantically tagged each attribute. Now all the search engine has to do is pick up these attributes and display them in a structured way, as filters that are useful to the user.
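Once the attributes arrive already tagged, building the filters is mostly bookkeeping. The sketch below assumes a tiny made-up catalog; the `Laptop` fields, the product names, and the `facets` and `filter_by` helpers are hypothetical and not taken from any search engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Laptop:
    """A product whose attributes were tagged by the publisher (hypothetical fields)."""
    name: str
    brand: str
    screen_inches: float
    price_usd: int


# Made-up catalog standing in for data collected from tagged product pages.
CATALOG = [
    Laptop("Model A", "Acme", 13.3, 899),
    Laptop("Model B", "Acme", 15.6, 1299),
    Laptop("Model C", "Globex", 13.3, 1099),
]


def facets(items, attribute):
    """Return the distinct values of one attribute: the filter options to display."""
    return sorted({getattr(item, attribute) for item in items})


def filter_by(items, **wanted):
    """Keep only the items whose attributes equal every requested value."""
    return [
        item for item in items
        if all(getattr(item, key) == value for key, value in wanted.items())
    ]


print(facets(CATALOG, "brand"))
print([laptop.name for laptop in filter_by(CATALOG, brand="Acme", screen_inches=13.3)])
```

Nothing here needs to "understand" laptops: the code only lists and compares values that the publishers supplied.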

Instead of going to ten different pages, going back, refining your query, and searching for the best way to phrase it, you can do much of the filtering and choosing before you ever visit a page, and this dramatically increases your chances of finding what you are looking for.

This is a crucial step in how we find information.

Web 3.0 and Search Engine Optimization

The coolest applications of Web 3.0 will come when our machines start talking to each other in a smart way, making decisions on our behalf and suggesting meaningful things based on past data and our preferences. One of the first steps to get there, though, is simply structuring data in a way that computers can deal with immediately, instead of having to extract meaning and patterns from arbitrary text.

Semantic search engines extract meaning by "reading" the text and inferring that France is a country, Nescafe is a coffee brand, and the Dalai Lama is a person. This is great, but it requires a lot of computing power, and it runs into many challenges in understanding different kinds of text and the different meanings the same word can have in different contexts.

The simple way to help search engines "understand" content is to extract those entities ourselves and give them to the search engine.

Structured data simply means that certain "entities" are tagged in a way that describes them as the entities they are. For example, instead of writing "I live in Dubai, United Arab Emirates" as plain text, you can mark up the same sentence with tags that make "Dubai" a city, and not just the letters D-U-B-A-I, as follows:

I live in
<div class="adr">
  <span class="locality">Dubai</span>,
  <span class="country-name">United Arab Emirates</span>
</div>

The user will still read the same sentence, but search engines and other sites that work with structured data will find your content much more easily because your entities are identified. Moreover, you can easily export your reviews, products, and any other information to other sites that classify that kind of information.
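As a rough illustration of what a consuming site might do with an address tagged this way, the sketch below reads the `locality` and `country-name` classes out of the markup. The `AdrParser` class and the `ADR_FIELDS` set are hypothetical names, and the sketch handles only these two fields.

```python
from html.parser import HTMLParser

# The tagged sentence, embedded as a string so the example is self-contained.
SNIPPET = """
I live in
<div class="adr">
  <span class="locality">Dubai</span>,
  <span class="country-name">United Arab Emirates</span>
</div>
"""

# Only the two address fields used in the sentence (sketch assumption).
ADR_FIELDS = {"locality", "country-name"}


class AdrParser(HTMLParser):
    """Maps each recognized class name to the text inside its element."""

    def __init__(self):
        super().__init__()
        self.address = {}
        self._field = None  # address field currently being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in ADR_FIELDS:
                self._field = name

    def handle_data(self, data):
        if self._field is not None:
            # Drop surrounding whitespace and a trailing comma, if present.
            value = data.strip().rstrip(",")
            self.address[self._field] = self.address.get(self._field, "") + value

    def handle_endtag(self, tag):
        self._field = None


adr = AdrParser()
adr.feed(SNIPPET)
print(adr.address)
```

The consumer never has to guess that "Dubai" is a place; the class name says so.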

For example, if your site offers product reviews and your reviews are tagged properly, other shopping sites or shopping engines will be able to extract the relevant data from your pages, and thus make your products available without much effort on your part.

This is clearly going to become an essential part of search engine optimization, and like anything else in technology, it will only pick up when a large enough number of websites start using it. Then we will witness a transformation of our web experience.