1. Automating Content Extraction of HTML Documents. Gupta, Suhit; Kaiser, Gail E.; Grimm, Peter; Starren, Justin. 2004. Reports. Computer science.