Web Information Retrieval is a fairly new science, and identifying models and behaviors on the Web is still very much an open question. The amount of active research in Information Retrieval, and the effort spent on analyzing data, suggests that the hidden semantics in this unstructured network of data, if harnessed properly, can lead to amazing results and conclusions. But this is far easier said than done. The very term "Unstructured Information" hints at a combinatorial explosion of potential problems in analysis.
Last semester's course on Web IR did shed some light on these issues and motivated us to look into a few of them, but the main problem is that "we don't know what we don't know" :)
Google is synonymous with search for any internet user now; people don't search anymore, they just "Google it". What would it take for a company to establish such branding? And how does it all work, after all? That is itself an open question. Anyway, I don't intend this post to be a guide to how a search engine works; somebody trying to figure that out probably won't even land on this very page. So how does one land on a web page? Is it merely hyperlinks, which form the edges of the network that is the WWW? Or is it something way beyond that? The presence of a link to Page B on Page A is not random; it is put there on purpose. If we look deeper, we realize that whether the endorsement is positive or negative, a link from Page A to Page B increases the probability of a user landing on Page B. :) And if "hits" are the way to measure the popularity of a web page, Page B climbs toward the top.
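To make the point about links concrete, here is a minimal sketch. It assumes a toy web where every link carries a tone label, and it uses the in-link count as a crude stand-in for "hits". The page names, the `LINKS` data, and the function names are all made up for illustration; this is not how any real search engine ranks pages.

```python
from collections import Counter

# Hypothetical toy web: (source page, target page, tone of the mention).
LINKS = [
    ("A", "B", "positive"),
    ("C", "B", "negative"),
    ("D", "B", "negative"),
    ("A", "C", "positive"),
]

def inlink_counts(links):
    """Count incoming links per page, ignoring the tone of each link."""
    return Counter(target for _source, target, _tone in links)

def rank_by_inlinks(links):
    """Return pages sorted by in-link count, most linked first."""
    return [page for page, _count in inlink_counts(links).most_common()]

if __name__ == "__main__":
    print(inlink_counts(LINKS))
    print(rank_by_inlinks(LINKS))
```

Two of the three links to Page B are negative, yet B still comes out first, because the tone is never looked at. That is the "any link is an endorsement" assumption in its simplest form.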
This is a well-established idea, and I am sure most people familiar with IR would more or less agree with me on it. Incidentally, this morning I met the faculty member who taught me the Web IR course, and a very interesting discussion came up: the endorsement model on a social network like Twitter. Hmmm, so what's the big deal? It's the same underlying concept again; that's what I suggested. He thought otherwise: he proposed that in a model like Twitter, whether an endorsement is positive or negative has more relevance. Thinking about it more, I was actually surprised.
Consider a situation where I receive a tweet from my friend @abc: "What the heck? I can't adjust with anybody in the world". Now if I retweet this, am I actually endorsing him? Probably not; the message is tightly coupled with the user and his social image. Giving away such an "endorsement" is not going to add any value for @abc. In fact, instead of working as a recommendation, the retweet would adversely affect him. Isn't that the opposite of our conventional knowledge and assumptions? Now that's yet another open question :)
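Here is a rough sketch of how the two views differ, assuming each retweet could somehow be tagged with a sign (+1 if it helps the author's image, -1 if it hurts). The handles, the `RETWEETS` data, and both scoring functions are hypothetical, and deciding the sign in the first place is exactly the hard, open part.

```python
from collections import defaultdict

# Hypothetical retweets: (retweeter, original author, sign).
# sign is +1 when the retweet helps the author's image, -1 when it hurts.
RETWEETS = [
    ("me", "abc", -1),
    ("xyz", "abc", -1),
    ("me", "pqr", +1),
]

def unsigned_score(retweets):
    """Web-style view: every retweet counts as one endorsement."""
    scores = defaultdict(int)
    for _retweeter, author, _sign in retweets:
        scores[author] += 1
    return dict(scores)

def signed_score(retweets):
    """Signed view: a negative retweet subtracts from the author's score."""
    scores = defaultdict(int)
    for _retweeter, author, sign in retweets:
        scores[author] += sign
    return dict(scores)

if __name__ == "__main__":
    print(unsigned_score(RETWEETS))
    print(signed_score(RETWEETS))
```

Under the unsigned count @abc looks like the most endorsed user; once the sign is taken into account he drops below @pqr, which is the reversal the discussion was about.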
The thought process that motivated me to write this post came from the discussion I had today with Mandar. Thanks to him :), he has always been kind enough to share such wonderful insights with me. Now I "knowingly" insert a hyperlink to his page. This was not at all random :) Was it?
Mandar on Twitter