The Web knows everything there is to know; we just have to find ways of getting at it while separating the real knowledge from the junk. The current trend in natural language processing is to use existing corpora to guide the development of text-understanding systems. Taken to its natural limit, what if we assumed that data were everything, i.e., when in doubt, just throw in more data? I'll show how we could harness the World Wide Web, the single largest collection of textual information known to humans, to help solve interesting natural language processing problems such as question answering, word sense disambiguation, coreference resolution, and even commonsense reasoning.
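To make the "just throw in more data" idea concrete, here is a minimal sketch of count-based word sense disambiguation. It is purely illustrative and not any particular system from the talk: the tiny `web_text` snippet, the `hits` helper, and the cue-word lists are all stand-ins; in practice the counts would come from web-scale corpora or search-engine result counts.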
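```python
import re

# Tiny stand-in for web-scale text. In the data-driven approach sketched
# above, these counts would instead come from web-scale corpora or
# search-engine result counts; this snippet and helper are illustrative only.
web_text = """
The bank approved the loan and set a new interest rate for the mortgage.
She went to the bank to deposit a check and withdraw some cash.
The children played on the grassy bank beside the river all afternoon.
Fishermen lined the muddy bank, casting into the slow water.
"""
tokens = re.findall(r"[a-z]+", web_text.lower())

def hits(*words, window=8):
    """Crude proxy for web hit counts: the number of occurrences of 'bank'
    whose surrounding token window contains all of the given words."""
    count = 0
    for i, tok in enumerate(tokens):
        if tok == "bank":
            ctx = set(tokens[max(0, i - window): i + window + 1])
            if all(w in ctx for w in words):
                count += 1
    return count

# Disambiguate a new occurrence of "bank" by asking which sense's cue
# words co-occur with its context words most often in the "web" data.
context = ["water", "casting"]                  # words around the new "bank"
senses = {"financial": ["loan", "deposit"],
          "riverside": ["river", "fishermen"]}

scores = {sense: sum(hits(c, cue) for cue in cues for c in context)
          for sense, cues in senses.items()}
print(scores)                       # evidence counts per sense
print(max(scores, key=scores.get))  # more supporting data -> riverside
```

The point of the sketch is that nothing clever happens inside the model: the decision falls out of raw co-occurrence counts, and the quality of the answer scales with how much text sits behind those counts.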