It can be useful to crawl a site and review every link that appears on it. This is especially handy when you are trying to work out how your favorite search engine keeps finding an obscure URL, one that has no human-readable alias or that should be blocked through your Apache configuration or link suppression. The attached script does exactly that. Besides crawling the links on a site, it lets you save the retrieved links to a file, so for your next query you can load the saved data instead of crawling the site again.
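To make the idea concrete, here is a minimal sketch of the crawl, save, and reload workflow described above. It is not the attached script; it is an illustration written in Python using only the standard library. Every name in it (`LinkParser`, `fetch_html`, `crawl`, `save_links`, `load_links`, `pages_linking_to`) is hypothetical. The JSON file format, the `max_pages` limit, and the rule of following only links on the starting host are likewise assumptions made for the example.

```python
import json
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Default fetcher: return the page text, or "" on failure or non-HTML."""
    try:
        with urlopen(url, timeout=10) as resp:
            if "html" not in resp.headers.get("Content-Type", ""):
                return ""
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return ""


def crawl(start_url, fetch=fetch_html, max_pages=100):
    """Breadth-first crawl of one host; returns {page_url: [links on that page]}."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    site_links = {}
    while queue and len(site_links) < max_pages:
        page = queue.popleft()
        parser = LinkParser()
        parser.feed(fetch(page))
        links = []
        for href in parser.hrefs:
            # Resolve relative links and drop any #fragment.
            absolute, _fragment = urldefrag(urljoin(page, href))
            links.append(absolute)
            # Record every link, but only follow those on the starting host.
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
        site_links[page] = links
    return site_links


def save_links(site_links, path):
    """Write the crawl result to a JSON file (format is an assumption)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(site_links, fh, indent=2)


def load_links(path):
    """Read a previously saved crawl result instead of crawling again."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def pages_linking_to(site_links, target):
    """List the crawled pages that contain a link to the target URL."""
    return sorted(page for page, links in site_links.items() if target in links)
```

A typical session under these assumptions would call `crawl("http://www.example.com/")` once, pass the result to `save_links`, and on later runs call `load_links` followed by `pages_linking_to` with the obscure URL in question to see which pages expose it. The sketch does not honor robots.txt or rate limits, so it is best pointed only at a site you control.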