Crawling


Cheat sheet

  • Check for /robots.txt
  • Check for well-known URIs under /.well-known/

| URI Suffix | Description | Status | Reference |
|---|---|---|---|
| security.txt | Contains contact info for security researchers to report vulnerabilities. | Permanent | RFC 9116 |
| change-password | Standard URL to direct users to a password change page. | Provisional | W3C Draft |
| openid-configuration | Provides config details for OpenID Connect over OAuth 2.0. | Permanent | Spec |
| assetlinks.json | Verifies ownership of digital assets (like apps) tied to a domain. | Permanent | Spec |
| mta-sts.txt | Defines SMTP MTA Strict Transport Security policy to secure email delivery. | Permanent | RFC 8461 |
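
The checks above can be scripted. The following is a minimal sketch, not part of ReconSpider: it builds the list of URLs worth requesting for a target and pulls the Disallow paths out of a robots.txt body that has already been downloaded. The names WELL_KNOWN_SUFFIXES, candidate_urls, and disallowed_paths are hypothetical helpers chosen for illustration, and the sketch assumes every suffix is served from the target host itself (mta-sts.txt is normally published on the mta-sts subdomain instead, so treat that entry as a hint only).

```python
from urllib.parse import urljoin

# Suffixes copied from the table above (assumed to live on the target host).
WELL_KNOWN_SUFFIXES = [
    "security.txt",
    "change-password",
    "openid-configuration",
    "assetlinks.json",
    "mta-sts.txt",
]


def candidate_urls(base_url):
    """Return the robots.txt URL followed by each /.well-known/ URL."""
    # A leading slash makes urljoin resolve against the site root,
    # whatever path base_url itself carries.
    urls = [urljoin(base_url, "/robots.txt")]
    for suffix in WELL_KNOWN_SUFFIXES:
        urls.append(urljoin(base_url, "/.well-known/" + suffix))
    return urls


def disallowed_paths(robots_txt):
    """Collect the path of every non-empty Disallow line in a robots.txt body."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, _, value = line.partition(":")
        if key.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths
```

Calling candidate_urls("http://inlanefreight.com") yields the robots.txt URL first and then one URL per table row. Disallow entries are often worth a manual look, since they tend to name paths the site owner would rather keep out of search indexes.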

ReconSpider

pip3 install scrapy --break-system-packages
wget -O ReconSpider.zip https://academy.hackthebox.com/storage/modules/144/ReconSpider.v1.2.zip
unzip ReconSpider.zip

python3 ReconSpider.py http://inlanefreight.com

Crawling, often called spidering, is the automated process of systematically browsing the World Wide Web.
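
To make the definition concrete, here is a minimal breadth-first crawler sketch using only the Python standard library. It is an illustration of the general technique, not how ReconSpider works internally. LinkParser, extract_links, and crawl are hypothetical names, and fetch is a caller-supplied function (URL in, HTML string or None out), so the download step is left as an assumption.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(page_url, html):
    """Return absolute links found in html, with #fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    return [urldefrag(urljoin(page_url, href)).url for href in parser.hrefs]


def crawl(start_url, fetch, max_pages=50):
    """Visit pages breadth-first, staying on the start URL's host."""
    host = urlparse(start_url).netloc
    seen = {start_url}          # URLs already queued, to avoid loops
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:        # fetch failed or page unknown
            continue
        visited.append(url)
        for link in extract_links(url, html):
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Passing a dictionary's get method as fetch lets the crawl logic be tried against canned pages before pointing it at a real target. The seen set and the max_pages cap are what keep the crawl from looping or running unbounded.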