Untangling the Web/"Every Angle of the Universe"

Untangling the Web
the National Security Agency
"Every Angle of the Universe"

"Every Angle of the Universe"


One wag has suggested that the Internet is an "electronic Boswell," the chronicler of our age. It is that and more because the Internet chronicles not only a time and place but all times and all places, known and unknown, real and imaginary. The Internet is the closest thing to the fantastical "Aleph" imagined by the great Argentine story-teller Jorge Luis Borges, an object whose diameter is "little more than an inch" but which nonetheless contains all space, "actual and undiminished," and in which one can see "every angle of the universe."

While the comparison with the mythical Aleph may strike you as a bit whimsical, it is in fact not an altogether unfair metaphor. There has never been anything that approaches the Internet's reach (to almost every part of the globe in less than thirty years), its size (estimated at 532,897 terabytes way back in 2003[1]), and its ability to link us together in a new kind of world community (words, pictures, sounds, ideas beyond imagining). But, as with all new technologies, it comes at a cost (many costs, in fact). We pay for the benefits of the Internet less in terms of money and more in terms of the currencies of our age: time, energy, and privacy.

The goal of this book is to help you save some of each of these valuable resources: time, by making your searches more efficient; energy, by reducing the frustration using the Internet often entails; and privacy, by pointing out some simple measures to take to lower your cyber-profile and enhance your security.

I cannot emphasize strongly enough that this book was already out of date by the time it was published. Even though I have checked and rechecked every link in this book, some addresses are bound to have changed, some sites will have shut down, and some tips and techniques, such as search engine rules and syntax, will no longer be accurate. This is a testament to the changeable nature of the Internet and I must beg your forbearance for any such errors. Writing about the Internet is much like trying to catch Proteus:[2] as with the mythical prophet, it keeps changing and escaping our grasp.

"The Internet has often been called the world's largest library with all of the books on the floor."

Curtin, M., Ellison, G., Monroe, D., "What's Related? Everything But Your Privacy," 7 October 1998, Revision: 1.5, <http://www.interhack.net/pubs/whatsrelated/> (14 November 2006).



What Will I Learn?

To achieve these goals, this book will:

  • help you understand how to use the Internet more efficiently to find useful information and, in so doing…
  • make clear why the Internet is an invaluable resource.

This year I have reorganized the book to make it more logical and easier to use. The first part of the book still focuses on the ins and outs of searching: how search engines work, types of search tools, how to handle different types of searches. The next section has expanded to offer in-depth tutorials on six major search engines. Next, the book covers specialized search tools and techniques, including a new section devoted to Wikipedia. I have also moved the discussion of maps and mapping to this section. This is followed by "invisible" web research to include the changes to A9 and Amazon's search inside the book option. Next is the international search and language tools section, followed by specialized research tools, including new sections on video, audio, and podcast searches. The next section covers specific topical research, such as news, telecommunications, blogs, and RSS feeds. This is followed by a series of "how to" guides, culminating with tips and techniques for more effective searching. The book then delves into using the Internet to research the Internet, with the final section still addressing crucial privacy and security issues.


Why Do I Need Help?

There are no Internet research experts.


There are people who make a living using the Internet for research and who know more than others about what is on the Internet, how to find what they want on the Internet, and how to do this with relative efficiency. But no one knows what is truly "out there" for two fundamental reasons:

  • The Internet changes constantly. By that I mean daily, hourly, minute-to-minute, incessantly.
  • It's too darned big! If we can't accurately size the Internet (which we can't), you can be sure we don't know what is available via this resource with any degree of accuracy or completeness.

This doesn't mean you can't ever hope to find anything on the Internet. You often can find what you're looking for (and usually a lot more) with comparative ease, but no one should be deluded into believing he has a good grasp of the entire world of information available on the Internet. Realistically, the best search engines index only a fraction of all webpages and keyword searching is at best an art that routinely misses relevant sites while loading you down with dross. Are you discouraged? Don't be…novices often have more luck finding something arcane than seasoned researchers because of the power of creative thinking and serendipity. I've learned never to underrate luck and intuition when doing Internet research, but I think the two most important tools for successful Internet research are:

  1. a good set of bookmarks
  2. other people with experience searching the Web

Never assume others are already aware of some website, tool, or technique you find particularly useful. The sheer quantity of data, information, and knowledge associated with the Internet is so enormous that no one can know more than a fraction of what's on it. While we're talking size, let me mention an important distinction. The Internet and the web are not one and the same, though the web is what most people think of when you say "Internet."

In fact, as huge as it is, the World Wide Web is actually a subset of the Internet. The Internet is the network of networks, all the world's servers connected by routers, to put it in semi-technical terms. The web is that portion of the Internet that uses a browser (typically Netscape or Firefox, both built upon Mozilla, or Microsoft's Internet Explorer) and some type of hypertext language (usually HTML) to move around. This book focuses primarily on the web because tackling the web by itself is a big enough challenge.

As you have no doubt guessed by now, the World Wide Web does not come with an instruction manual or user's guide, which means much if not most of what you learn about researching using the Internet will come from hard-won experience. On the up side, you probably will not be able to break anything on the Internet. More than likely, no matter how lost or hopelessly confused you become, you will only damage your own computer and/or network, and perhaps your good humor and sanity. However, because of the almost astronomical growth of malicious activity, the Internet has become a dangerous place, and users have discovered that they have inadvertently spread malicious software (malware) such as viruses, worms, and Trojan horses. That is why I have devoted the last section of the book to personal computer security and privacy. We are all at risk from the rising tide of bad and in some cases criminal behavior, so we must take responsibility for protecting ourselves and our computers from the ruses and attacks that grow in number and sophistication each year.

This book will expand on simple "rules" of Internet research, rules that are really more in the nature of friendly suggestions. These rules are the fruit of my own experiences as an Internet user and may prevent you from repeating all the mistakes I made that gave rise to the rules in the first place. Some of these suggestions may at first strike you as odd or inconsistent, but the rationale for each I hope will become clear as we go along.[3] The fact is that today we are drowning in information and starving for knowledge. The goal of Untangling the Web is to help rescue users from the ocean of information (and misinformation) by throwing them a virtual lifeline.


What's New This Year

Most people probably have not thought about or been very much affected by the changing search landscape because, as is only natural, most people have one or two sites they routinely use for search and research, regardless of the nature of the inquiry. However, virtually all search professionals will agree that knowing where to look for information is the key to successful searching. Yet few venture beyond the comfortable confines of the familiar search engine. While the major search engines continue to improve each year, they are far from the be-all and end-all of search. The problem with general search tools is that they cannot provide targeted or tailored results, certainly not without a lot of work on the part of the user. For this reason, a large part of Untangling the Web is devoted to other ways to uncover information, be it subject guides, "deep web" resources, targeted search tools, or unusual tips and techniques for revealing what is hidden.

Again this year, I have included detailed information on how to use Google, Yahoo, Gigablast, and Live Search (formerly MSN Search) to find very specific data. I have also updated and expanded the section on Exalead and added Ask to the major search engines. However, unless you spend a fair amount of time using each of these search tools, you will probably find their many options too complicated and cumbersome for everyday use. A different approach is to use specialized search tools, which raises the question of how to find these tools. Untangling the Web maps a number of the Internet's less-traveled roads, i.e., excellent but unheralded specialty search tools such as Fagan Finder, Amazon's A9 multipurpose search, and ThomasGlobal's business search. Also, the section on international search is substantially larger than before.

In recognition of the growing importance and influence of collaborative websites, there are several new sections in this year's book. One is a separate section devoted to Wikipedia, contributed in part by my colleague Diane White. Video and audio search exploded during 2006, and this year's edition contains a new and extensive examination of video search sites as well as a new section on audio search and podcasting. Two other new sections are devoted to custom search engines and book search, neither of which is an entirely new technology but both of which spread in popularity and improved in quality in the past year. Custom search is fast becoming a replacement for web directories, which continue their slide into irrelevance.

The section on researching and understanding the Internet now begins with a new section on "internetworking." This tutorial is a response to a number of requests from people such as myself who need basic knowledge and understanding of how the Internet works without too much technical jargon or expertise. I hope you find that it falls in a comfortable middle ground between simplistic and abstruse.

Once again, the section on privacy and security grows in proportion to concerns about protecting our privacy and security on the Internet. Fortunately, as the problems increase and the malicious users become more enterprising, so do the ways and means for protecting ourselves. However, home computer security is a personal responsibility few people take seriously until it is too late. Untangling the Web's privacy and security information is designed to help users avoid becoming victims and instead take the offense in protecting themselves, their families, and by extension, the Internet community from the Internet's evil-doers. The 2007 edition includes new sections on clearing private data in Firefox, encrypting files in Windows, pretexting, protecting yourself from search engine leaks, whether or not you can really opt out of online directories, and a brief discussion of wireless Internet use.

I have also reorganized Untangling the Web to make it easier to use. The new section on "Specialized Search Tools & Techniques" brings together some already existing topics, such as Google hacking, with the new sections on Wikipedia and Custom Search Engines. I also moved maps up to this section because they have become integral to basic search. Specialized Research Tools now include the video and audio search sections as well as telephone and email search. Basically, all types of search comprise the first two-thirds of the book, while the remainder focuses on the Internet itself and privacy and security.

As was true of last year's edition, I can again say with confidence that the 2007 Untangling the Web was already out of date before it reached your desk. Experienced Internet users understand the Internet is truly a river of information that is impossible to step into twice. Still, the basic concepts for using the Internet to research topics of interest to our community of readers are sound despite changes in websites, links, and technology.

Web Tip

Web links often change. In case of a bad link for a news article, use the site's search facility and search by the headline, author, or date. In the case of a bad link inside a website, try going to the site's homepage and working your way down to the page, which may still be there, only in a different location.
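
The "work your way down from the homepage" advice in the tip above can also be approached from the other direction: starting from the dead link and trimming it back one level at a time. What follows is only a minimal sketch of that idea, not a tool described in this book. The function name parent_urls and the example.com address are hypothetical, and the code only builds the list of addresses to try by hand; it does not visit any of them.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_urls(url):
    """Return addresses to try for a dead link, nearest parent first, homepage last."""
    parts = urlsplit(url)
    # Split the path into non-empty segments, e.g. "/a/b/c.html" -> ["a", "b", "c.html"].
    segments = [s for s in parts.path.split("/") if s]
    candidates = []
    # Drop one trailing segment at a time until only the site root is left.
    while segments:
        segments.pop()
        path = "/" + "/".join(segments) + ("/" if segments else "")
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    # A link that already points at the homepage has no parent; return the homepage itself.
    if not candidates:
        candidates.append(urlunsplit((parts.scheme, parts.netloc, "/", "", "")))
    return candidates


if __name__ == "__main__":
    # Hypothetical dead link, used purely for illustration.
    for candidate in parent_urls("http://www.example.com/news/2006/story.html"):
        print(candidate)
```

Each address in the list is a place where the moved page might be linked from; the last one is always the site's homepage, where the site's own search facility is usually found.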


  1. School of Information Management and Systems, University of California at Berkeley, "How Much Information? 2003," 27 October 2003, <http://www.sims.berkeley.edu/research/projects/how-much-info-2003/internet.htm> (October 2005), Executive Summary.
  2. "Proteus: i.e. full of shifts, aliases, disguises, etc. Proteus was Neptune's herdsman, an old man and a prophet... There was no way of catching him but by stealing upon him during sleep and binding him; if not so captured, he would elude anyone who came to consult him by changing his shape, for he had the power of changing it in an instant into any form he chose." "Proteus," Brewer's Dictionary of Phrase and Fable, 1898, <http://www.bartleby.com/81/13723.html> (14 November 2006).
  3. If you are using the hypertext version of this book online, the links in the paper may not load correctly. Try the refresh button, copy and paste the URL, or type in the URL directly.