The Whitehouse.gov Website’s Robots.txt File Has 1839 Lines In It

By Daniel Miessler on January 23rd, 2007: Tagged as Internet | Security
  • http://www.whitehouse.gov/robots.txt Some Joker

    Looking at most of those entries, it looks like they’re excluding pages designed for text-only browsers/screen readers. Nearly every directory ends in /text

    Disallow: /asia/2005/photoessay/china/text
    Disallow: /asia/2005/photoessay/japan/text
    Disallow: /asia/2005/photoessay/korea/text
    Disallow: /asia/2005/photoessay/mongolia/text
    Disallow: /asia/2005/photoessay/mrsbush1/text
    Disallow: /asia/2005/photoessay/mrsbush2/text

    and if you browse up one directory, you get the same story with pictures..

    I’d say it looks like they are doing it to work around a poor file structure, or possibly to keep search engines from indexing duplicate text (the same pages, just without pictures).

    shrugs I’m all for pointing out when the administration does something crooked, but I can’t see fault in this one.. (granted, I’ve only checked out 20 or so of the links.. the only one that didn’t go anywhere for me was /video/text )

  • sergei

    A Google search for ‘robots.txt’ shows whitehouse.gov at position 5

  • ghost16825

    Ooooh, /secret/ directories. nods head

  • http://puggy.symonds.net/~deep/ Deepak

    Yup, even I noted it sometime back as an excellent sitemap. ;-)
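
For anyone who wants to check the "/text" observation from the comments for themselves, here is a minimal Python sketch that pulls the Disallow rules out of a robots.txt file and reports what share of them end in /text. The function names (disallow_paths, text_suffix_share, fetch) are made up for illustration, the parsing is deliberately simplistic (it ignores User-agent grouping), and the SAMPLE text mixes two entries quoted in the comments with one hypothetical path, so treat it as a rough tool rather than a faithful robots.txt parser.

```python
from urllib.request import urlopen


def disallow_paths(robots_txt: str) -> list[str]:
    """Return the path of every non-empty Disallow rule in the file."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


def text_suffix_share(paths: list[str], suffix: str = "/text") -> float:
    """Fraction of paths ending in `suffix` (0.0 when there are no paths)."""
    if not paths:
        return 0.0
    hits = sum(1 for p in paths if p.rstrip("/").endswith(suffix))
    return hits / len(paths)


def fetch(url: str) -> str:
    """Download a robots.txt file; the caller supplies the URL."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


# Two entries quoted in the comments plus one hypothetical non-/text path.
SAMPLE = """\
User-agent: *
Disallow: /asia/2005/photoessay/china/text
Disallow: /asia/2005/photoessay/japan/text
Disallow: /example/other
"""

if __name__ == "__main__":
    paths = disallow_paths(SAMPLE)
    print(len(paths), "Disallow rules,", f"{text_suffix_share(paths):.0%} end in /text")
```

To run it against a live file instead of the sample, pass the result of fetch("http://www.whitehouse.gov/robots.txt") to disallow_paths; the file served today may look nothing like the 2007 version discussed here.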

