Information and Vulnerabilities in Content

The first thing to realize about content is that it takes many forms. A typical Web page will obviously contain HTML that is rendered in the browser, but additional information in the page source can be valuable to a hacker or penetration tester. JavaScript, comments, and hidden form fields all yield clues and can even be manipulated to actively test the application. Page-scraping techniques, such as those covered throughout this book, can be used to extend the results of a search to get to this type of data.
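As a rough illustration of the page-source clues mentioned above, the following sketch pulls comments, hidden form fields, and external script references out of a page using Python's standard `html.parser` module. The class name `SourceClues`, the helper `extract_clues`, and the sample page are all hypothetical and invented for this example; they do not come from the book.

```python
from html.parser import HTMLParser


class SourceClues(HTMLParser):
    """Collects comments, hidden form fields, and script URLs from HTML."""

    def __init__(self):
        super().__init__()
        self.comments = []        # text of each <!-- ... --> comment
        self.hidden_fields = []   # (name, value) for each hidden input
        self.script_sources = []  # src attribute of each external script

    def handle_comment(self, data):
        self.comments.append(data.strip())

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and (attributes.get("type") or "").lower() == "hidden":
            self.hidden_fields.append(
                (attributes.get("name"), attributes.get("value"))
            )
        elif tag == "script" and attributes.get("src"):
            self.script_sources.append(attributes["src"])


def extract_clues(html_text):
    """Parse html_text and return the populated SourceClues instance."""
    parser = SourceClues()
    parser.feed(html_text)
    parser.close()
    return parser


# Hypothetical page source, made up for illustration only.
SAMPLE_PAGE = """
<html><body>
<!-- TODO: remove /old-admin/ before launch -->
<form action="/order" method="post">
  <input type="hidden" name="price" value="19.99">
  <input type="text" name="qty">
</form>
<script src="/js/app.js"></script>
</body></html>
"""

clues = extract_clues(SAMPLE_PAGE)
print(clues.comments)
print(clues.hidden_fields)
print(clues.script_sources)
```

A hidden field such as the made-up `price` above is the kind of value a tester would look at more closely, since the browser never displays it but still sends it back to the server.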

However, beyond the page source, a great deal of information is available in the raw HTTP itself: status codes, headers, and POST data are all valuable areas that are not exposed in the browser. Typically, a crawl is the starting point to discover as much of the site as possible. Additional work will almost always yield more content to scrutinize; this could be a dictionary attack that simply requests a list of files, or it could involve manually poking around and requesting files. More often than not, it's a combination of the two. Although actual vulnerabilities can be discovered in content, for the most part the biggest value comes from information disclosures.
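To make the point about raw HTTP concrete, here is a minimal sketch that splits a captured response into its status code, headers, and body, then picks out headers that often reveal server software. The function names, the list of "interesting" headers, and the sample response are assumptions chosen for illustration; a real assessment would work from traffic captured with whatever proxy or client is in use.

```python
def parse_raw_response(raw):
    """Split a raw HTTP/1.x response into (status_code, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, _, header_block = head.partition("\r\n")

    # Status line looks like: HTTP/1.1 403 Forbidden
    parts = status_line.split(" ", 2)
    status_code = int(parts[1])

    headers = {}
    for line in header_block.split("\r\n"):
        if not line:
            continue
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize to lowercase.
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


# Assumed shortlist of headers that commonly disclose software details.
DISCLOSURE_HEADERS = ("server", "x-powered-by", "via")


def disclosed_software(headers):
    """Return only the headers from the shortlist that are present."""
    return {k: v for k, v in headers.items() if k in DISCLOSURE_HEADERS}


# Hypothetical response, invented for this example.
SAMPLE_RESPONSE = (
    "HTTP/1.1 403 Forbidden\r\n"
    "Server: ExampleServer/1.0\r\n"
    "X-Powered-By: ExampleFramework\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html>Forbidden</html>"
)

status, headers, body = parse_raw_response(SAMPLE_RESPONSE)
print(status, disclosed_software(headers))
```

Note that a 403 on a requested path can itself be informative: unlike a 404, it suggests the resource exists even though access is denied.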

The Fast Road to Directory Enumerations

Some files save a hacker a lot of reconnaissance work by giving him or her a complete list of additional content to analyze. Some of the most obvious files that yield lots of good directory names and/or filenames are the robots.txt...
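For the robots.txt file named above, a short sketch shows how its rules translate into a list of paths worth reviewing. The function name `paths_from_robots` and the sample file contents are hypothetical; the parsing relies only on the common `field: value` line layout and `#` comments used by robots.txt files.

```python
def paths_from_robots(text):
    """Return the unique Allow/Disallow paths listed in a robots.txt body."""
    paths = []
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, _, value = line.partition(":")
        path = value.strip()
        # An empty Disallow means "nothing is disallowed", so skip it.
        if field.strip().lower() in ("allow", "disallow") and path:
            if path not in paths:
                paths.append(path)
    return paths


# Hypothetical robots.txt contents, invented for this example.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/   # old dumps
Allow: /public/
Disallow:
Sitemap: https://example.com/sitemap.xml
"""

print(paths_from_robots(SAMPLE_ROBOTS))
```

Each returned path is only a lead: the file says what the site owner asked crawlers to avoid, not whether the path still exists or what it contains.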


Google Hacking for Penetration Testers
