Information overload

An edited version of this article was published in the February 2004 issue of Spider Magazine.

Information overload is nothing new; it has been around ever since mankind first started to speak. Here in Pakistan, where technology is almost a form of magic for many, the very word information doesn’t have much meaning. As in most third world countries, the Internet and the use of computers are not widespread. The total number of internet users is 1.4 million according to a recent estimate, while government estimates over the years put it anywhere from 2 to 5 million.

Before talking about information overload, it’s important to first understand what exactly we mean by the word information. The dictionary definition of information:

  • Knowledge derived from study, experience, or instruction.
  • Knowledge of specific events or situations that has been gathered or received by communication; intelligence or news.
  • A collection of facts or data: statistical information.

Information is everywhere, and it’s not just digital. A pedestrian trying to cross a crowded intersection is overwhelmed by data from all sides, some of which he has to discard (the clothes people are wearing) and some of which he has to concentrate on (cars, traffic lights) and process to decide when to cross. In daily life, our brain subconsciously sorts, discards and processes the raw data we encounter without our becoming aware of it. With tens of thousands of years of evolution behind it, the brain is well adapted to deal with the massive amounts of analog data we encounter. When it comes to digital data, though, we are at a loss. If you think of the computer as a sort of detached mini brain outside your body, with the keyboard and screen as the only means of interface, then you can understand the problem.

We have incredibly primitive and vestigial mechanisms to help us transcribe it from world to idiot-savant computer companion. We’re stuck in a middle-period between the emergence of useful computer processing power and the computer’s upcoming ability to self-annotate, transcribe and create metadata simply, elegantly (and in vast amount) in the background all the time. In the meantime our transcription processes are tedious and long, our computers eager but clueless – and the amounts of metadata available for any given thing trivial compared to the richness of information and association you could get from a genuinely interested and knowledgeable person. A fragment of a world full of metadata…

Around five exabytes (5 billion gigabytes) of information was created in 2002, up from around two exabytes in 1999, according to the latest How Much Information survey produced by the School of Information Management and Systems at UC Berkeley. This is equivalent to half a million libraries the size of America’s Library of Congress, or about 800 megabytes per person per year.
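The figures above are easy to sanity-check. The short Python sketch below is only a back-of-the-envelope calculation, not the survey's own method: it assumes decimal units (1 exabyte = 10^18 bytes) and a world population of roughly 6.3 billion in 2002, which is an outside assumption and not a number from this article. The function names are made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumptions (not from the article): decimal units and a 2002 world
# population of about 6.3 billion.

GIGABYTE = 10**9
TERABYTE = 10**12
MEGABYTE = 10**6
EXABYTE = 10**18

TOTAL_BYTES_2002 = 5 * EXABYTE          # "around five exabytes"
WORLD_POPULATION_2002 = 6_300_000_000   # assumed, approximate
LIBRARIES_QUOTED = 500_000              # "half a million libraries"


def per_person_mb(total_bytes, population):
    """Average megabytes of new information per person."""
    return total_bytes / population / MEGABYTE


def implied_library_tb(total_bytes, libraries):
    """Terabytes per library implied by the quoted comparison."""
    return total_bytes / libraries / TERABYTE


if __name__ == "__main__":
    print(TOTAL_BYTES_2002 // GIGABYTE, "gigabytes in total")
    print(round(per_person_mb(TOTAL_BYTES_2002, WORLD_POPULATION_2002)),
          "MB per person per year")
    print(implied_library_tb(TOTAL_BYTES_2002, LIBRARIES_QUOTED),
          "TB per library (implied)")
```

Under these assumptions the per-person figure comes out just under 800 megabytes, and the library comparison implies a collection of about 10 terabytes each, so the quoted numbers hang together.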

Even in Pakistan, we’re moving to an ever more digital society. Many government offices and records have already moved into the computer age, and the rest are converting. The US government had largely computerized its records back in the 1970s, and most of what was left over then has since been digitized. Not only that, many government documents are available only online, and in the future more and more will be digital only.

As everyone now knows, computers automate work and give humans the ability to process and deal with large quantities of information and data. Anyone who still doubts the virtue of computerization need only visit any driving license office in the country and watch the staff try to dig out any file whatsoever.

Until the computer age, information was bound up in books and people. Over the centuries, mankind has built up such a vast amount of information that it is impossible for anyone to learn more than a tiny fraction of it. The major problem with a physical library is the amount of effort required to search, locate, learn and use the reams of information stored within.

When we walk into a library or a bookstore, there is an overwhelming amount of information to deal with. What makes computers so useful is the ability they provide to process large amounts of information. Amazon now allows you to do full text searches of over 100,000 books, and they aim to index every single book they sell! For a researcher, and just about everyone else, this is amazing. It takes hours to look through a few books, while this allows you to zip through potentially millions of books in seconds. Google is also aiming to archive and make searchable just about every book ever published. There are other pay services like Questia, Lexis-Nexis and Elibrary available, which provide growing digital libraries.

The problem, increasingly so in the future, is how a normal human being is going to process all that information. To research the topic, I searched for information overload on Google and got back 642,000 results! There must have been a few thousand results worth reading, yet I could hardly manage to go through the first few pages. So what, then, is the utility of these hundreds of thousands of results?

Researchers have been working on this issue for years now, and new tools and technologies are being developed to overcome it. The aim is not to have to go through all these thousands of results, but instead to use tools which filter and sift these thousands of pages into a more manageable size, while retaining all the essential information. We are still in the infancy of the internet, and as with all other advances, we have to struggle with shortcomings that our grandchildren will not even be able to imagine: ‘Papa, you mean the computer wouldn’t even talk to you? What good was it then!’ Children will be doing computations in a day which the best minds of today struggle to do in a lifetime. The possibilities are endless.

I’m constantly surprised by people’s misperceptions of the internet in Pakistan. The older generation seems to view it as something incomprehensible, and at most good for emailing and/or wasting time. The younger generation seems to think that the internet consists solely of chat and Hotmail. I have never come across a teenager using a PC who wasn’t either on MSN Messenger or Hotmail, or both. While this is a very informal/personal/biased survey, in the last 6 years I must have come across at least a thousand teens using the internet, and all of them so far have been emailing, chatting or playing some game or the other. It’s sad, when arguably the sum of all human knowledge resides at our collective fingertips, that so many don’t even give it a passing glance. The key is to be learned enough to separate the wheat from the chaff, which is becoming increasingly difficult. These teens are lucky in a sense, for they certainly don’t have a problem with information overload, but they remain in the digital dark ages.

In the third world, where libraries hardly exist and public schools teach junk, the Internet is a godsend. Let me rephrase that. Proper usage of the Internet is a godsend. Every school should be teaching how to research using libraries and the Internet, how to organize and sort, and most important, how to evaluate and process all this into something original using the student’s own thoughts. The Pakistani government has been promoting software parks, call centers, and other white elephants, but when it comes to putting its money where its mouth is, it falls short. That might be a bit harsh, as it has set up the Virtual University of Pakistan, but that is not going to help the masses. By the time people get out of public schools (the few who can go), most will have received such a bad education (if any) that they no longer think creatively.

A hundred years from now, Google will have reached the Star Trek future: you talk into the air and the computer processes your question, figures out its context, figures out what response you’re looking for, searches a giant database in who-knows-how-many languages, translates/analyses/summarizes all the results, and presents them back to you in a pleasant voice.

For the present we have to make do with what we have, yet even now being able to use the Internet properly is like learning to read once more. Suddenly your horizons open up to a limitless world of knowledge, people, opinions, and most important of all, the ability to connect with people from all over the world. The heroes of the computer age opened up the world of the internet to us, and now we wait for the giants of the information age to develop the tools which the rest of us can use. And it won’t be long now – already researchers at the Fraunhofer Institute for Computer Architecture and Software Technology (FIRST) in Berlin have developed a promising new way to control computers by thought alone. Others are working on speech recognition software which can better understand the inflected meanings behind human speech. Artificial Intelligence research has been ongoing since well before the computer age, and sooner or later more intelligent software agents are going to be developed.

How much has using the Internet expanded your information horizon? With books, TV and magazines, you remain limited to what you have available, and it takes a lot longer to look into new areas. Using the Internet, anything and everything I want to find out about is readily there in a few seconds, in more detail than I’d ever want.

The recently concluded World Summit on the Information Society in Geneva emphasized the importance of an information society, and concluded that it is such an important advance that every man, woman and child on the planet should have access to the Internet as a basic right. The proceedings and other documents are all available at the summit’s website. If nothing else, the fact that leaders from all over the world, even our PM himself (whose expertise in all this remains a state secret), were present says how important the Internet has become in today’s world. Just a few years ago, the very idea of an information summit with world leaders attending was ludicrous. Today, we have the entire world talking about how to bring everyone into the Information Age.


2 thoughts on “Information overload”

  1. 800 MBs per person per annum is a lot. And you can see that much of the information is redundant: the same thing said over and over in different ways. You will notice that, at least, from the search results that end up on our blogs.

  2. I don’t think all the weblogs in all the worlds make a perceivable impact on that 800MB figure… so while most posts on most weblogs might be redundant, that doesn’t apply to the 800MB figure. Videos, pictures, databases etc. take up the bulk of this data, certainly not html websites.

