The Online Plague 2.0: The very low context-to-content ratio of information on the web (and its bimodal distribution)

Here’s a very good depiction of online / web content:

GIGO: Garbage in, Garbage out

Google’s approach is to skim the top of a huge one-size-fits-all pool of undifferentiated bits (actually, Google’s search engine only searches through text, not images or movies; it also only searches a very small amount of text; and besides, the algorithm is so poor that it has become quite useless, unless you want to get to Amazon by simply typing “amazon” into Google ;) ).

A more advanced approach to information would be to separate different types of information into different “buckets” (this is known as “classification”), or to point to individual pieces of information from different points of view (this is called “indexing”). There is a long tradition of abstracting, indexing and classification in the field of information science, but Google fanboys often argue that such tried-and-true methods are now superfluous, because “just Google it”! :P
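To make the difference between the two concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the three documents, their bucket names and their subject terms are hypothetical, and the functions `classify` and `build_index` are just my own names, not anybody's official method. Classification puts each document into exactly one bucket; indexing lets several terms point at the same document.

```python
from collections import defaultdict

# Hypothetical mini-collection: titles, buckets and terms are invented for illustration.
documents = {
    "doc1": {"title": "Pruning apple trees", "bucket": "gardening", "terms": ["apples", "pruning"]},
    "doc2": {"title": "Apple pie basics", "bucket": "cooking", "terms": ["apples", "baking"]},
    "doc3": {"title": "Sourdough at home", "bucket": "cooking", "terms": ["baking", "bread"]},
}


def classify(docs):
    """Classification: every document lands in exactly one bucket."""
    buckets = defaultdict(list)
    for doc_id, doc in docs.items():
        buckets[doc["bucket"]].append(doc_id)
    return dict(buckets)


def build_index(docs):
    """Indexing: every term points to all documents it describes."""
    index = defaultdict(list)
    for doc_id, doc in docs.items():
        for term in doc["terms"]:
            index[term].append(doc_id)
    return dict(index)


buckets = classify(documents)
index = build_index(documents)

print(buckets)
print(index["apples"])
```

Note that the apple pie recipe sits in only one bucket ("cooking"), but it can be reached from two different index terms ("apples" and "baking"). Each term is a separate access point to the same piece of content.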

Each of the different access points to information can be viewed as a different perspective, a different filter or a different lens. I usually use the term “context” (to underscore how context and content are related by the container; roughly, the container is the channel, or the medium Marshall McLuhan referred to in his famous quote: “The medium is the message”).

At present, online media are very much like a small number of huge garbage heaps: whether that’s Google’s copy of the web stored on Google servers, or Facebook’s proprietary stash of media stored on Facebook’s computers, or maybe a handful of other companies that try to suck up as much “user generated content” as possible, functioning much like big industrial vacuum cleaners or mutant mega-aardvarks. The vast amounts of data are piled high and deep in proprietary databases (basically, these companies function as gigantic data-collection robots). These companies manage vast amounts of content: undifferentiated datastreams that have, by and large, little or no contextual information. Indeed, most of these datastreams are simply duplicated, and a copy is placed on each and every one of these huge heaps ad nauseam.

At the moment, there is really only one exception to the rule: Many individuals maintain individual (“personal”) blogs. These are intended to be very particular log-books of individual people — a sort of “daily diary” documenting anything that is particular or peculiar about that person… written as an autobiography. In this case, the context is minute: each blog post is essentially just one individual’s ideas. Most blogs are not intended to be collaborative efforts, though there are also exceptions to the rule (ranging from a company like “Mashable” to the very large group blogging system “Tumblr”).

I have always thought that such group / collaborative efforts are the future of the web — and it is with great reluctance that I have to admit that they apparently still are! 8O
