Table of Contents for
Python Web Penetration Testing Cookbook

Python Web Penetration Testing Cookbook by Dave Mound. Published by Packt Publishing, 2015.
  1. Cover
  2. Table of Contents
  3. Python Web Penetration Testing Cookbook
  4. Python Web Penetration Testing Cookbook
  5. Credits
  6. About the Authors
  7. About the Reviewers
  8. www.PacktPub.com
  9. Disclaimer
  10. Preface
  11. What you need for this book
  12. Who this book is for
  13. Sections
  14. Conventions
  15. Reader feedback
  16. Customer support
  17. 1. Gathering Open Source Intelligence
  18. Gathering information using the Shodan API
  19. Scripting a Google+ API search
  20. Downloading profile pictures using the Google+ API
  21. Harvesting additional results from the Google+ API using pagination
  22. Getting screenshots of websites with QtWebKit
  23. Screenshots based on a port list
  24. Spidering websites
  25. 2. Enumeration
  26. Performing a ping sweep with Scapy
  27. Scanning with Scapy
  28. Checking username validity
  29. Brute forcing usernames
  30. Enumerating files
  31. Brute forcing passwords
  32. Generating e-mail addresses from names
  33. Finding e-mail addresses from web pages
  34. Finding comments in source code
  35. 3. Vulnerability Identification
  36. Automated URL-based Directory Traversal
  37. Automated URL-based Cross-site scripting
  38. Automated parameter-based Cross-site scripting
  39. Automated fuzzing
  40. jQuery checking
  41. Header-based Cross-site scripting
  42. Shellshock checking
  43. 4. SQL Injection
  44. Checking jitter
  45. Identifying URL-based SQLi
  46. Exploiting Boolean SQLi
  47. Exploiting Blind SQL Injection
  48. Encoding payloads
  49. 5. Web Header Manipulation
  50. Testing HTTP methods
  51. Fingerprinting servers through HTTP headers
  52. Testing for insecure headers
  53. Brute forcing login through the Authorization header
  54. Testing for clickjacking vulnerabilities
  55. Identifying alternative sites by spoofing user agents
  56. Testing for insecure cookie flags
  57. Session fixation through a cookie injection
  58. 6. Image Analysis and Manipulation
  59. Hiding a message using LSB steganography
  60. Extracting messages hidden in LSB
  61. Hiding text in images
  62. Extracting text from images
  63. Enabling command and control using steganography
  64. 7. Encryption and Encoding
  65. Generating an MD5 hash
  66. Generating an SHA 1/128/256 hash
  67. Implementing SHA and MD5 hashes together
  68. Implementing SHA in a real-world scenario
  69. Generating a Bcrypt hash
  70. Cracking an MD5 hash
  71. Encoding with Base64
  72. Encoding with ROT13
  73. Cracking a substitution cipher
  74. Cracking the Atbash cipher
  75. Attacking one-time pad reuse
  76. Predicting a linear congruential generator
  77. Identifying hashes
  78. 8. Payloads and Shells
  79. Extracting data through HTTP requests
  80. Creating an HTTP C2
  81. Creating an FTP C2
  82. Creating a Twitter C2
  83. Creating a simple Netcat shell
  84. 9. Reporting
  85. Converting Nmap XML to CSV
  86. Extracting links from a URL to Maltego
  87. Extracting e-mails to Maltego
  88. Parsing Sslscan into CSV
  89. Generating graphs using plot.ly
  90. Index

Finding e-mail addresses from web pages

Instead of generating your own e-mail list, you may find that a target organization already has addresses published on its web pages. These can prove more valuable than generated addresses, because an e-mail address found on the organization's own website is far more likely to be valid than one you have tried to guess.

Getting ready

For this recipe, you will need a list of pages you want to parse for e-mail addresses. You may want to visit the target organization's website and search for a sitemap. A sitemap can then be parsed for links to pages that exist within the website.
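If the site publishes a standard sitemap.xml (following the sitemaps.org schema, where each page appears in a <loc> element), the URL list can be built with a few lines of XML parsing. A minimal sketch, using a hypothetical example.com sitemap:

```python
# Sketch: extract page URLs from a sitemaps.org-style sitemap.xml.
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def urls_from_sitemap(xml_text):
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc")]

sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/contact</loc></url>
</urlset>"""

for url in urls_from_sitemap(sample):
    print(url)
```

The resulting URLs can be written straight into the urls.txt file that the recipe below reads.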

How to do it…

The following code will parse through responses from a list of URLs for instances of text that match an e-mail address format and save them to a file:

import urllib2
import re
import time
from random import randint

regex = re.compile(("([a-z0-9!#$%&'*+\/=?^_'{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_'"
                    "{|}~-]+)*(@|\sat\s)(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?(\.|"
                    "\sdot\s))+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)"))

tarurl = open("urls.txt", "r")
for line in tarurl:
  output = open("emails.txt", "a")
  time.sleep(randint(10, 100))
  try:
    url = urllib2.urlopen(line).read()
    output.write(line)
    emails = re.findall(regex, url)
    for email in emails:
      output.write(email[0]+"\r\n")
      print email[0]
  except:
    print "error"
  output.close()

How it works…

After importing the necessary modules, you will see the assignment of the regex variable:

regex = re.compile(("([a-z0-9!#$%&'*+\/=?^_'{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_'"
                    "{|}~-]+)*(@|\sat\s)(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?(\.|"
                    "\sdot\s))+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)"))

This attempts to match an e-mail address format, for example victim@target.com, or victim at target dot com. The code then opens up a file containing the URLs:

tarurl = open("urls.txt", "r")

You might notice the use of the r parameter. This opens the file in read-only mode. The code then loops through the list of URLs. Within the loop, a file is opened to save e-mail addresses to:

output = open("emails.txt", "a")

This time, the a parameter is used. This indicates that anything written to the file will be appended rather than overwriting its existing contents. The script utilizes a sleep timer in order to avoid triggering any protective measures the target may have in place to prevent attacks:

time.sleep(randint(10, 100))

This timer will pause the script for a random amount of time between 10 and 100 seconds.
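Note that randint() is inclusive at both endpoints (unlike randrange()), so delays of exactly 10 or exactly 100 seconds are possible. A quick sanity check of the bounds:

```python
from random import randint

# randint(a, b) includes both endpoints, so the recipe's delay
# can be exactly 10 or exactly 100 whole seconds.
MIN_DELAY, MAX_DELAY = 10, 100
delays = [randint(MIN_DELAY, MAX_DELAY) for _ in range(1000)]
print(min(delays) >= MIN_DELAY and max(delays) <= MAX_DELAY)   # prints True
```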

Exception handling around the urlopen() call is essential. If a request fails, for example with a 404 Not Found response, urlopen() raises an exception; without handling it, the script would crash and exit instead of moving on to the next URL.
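The recipe's bare except swallows every error indiscriminately. A slightly narrower sketch (using urllib2 on Python 2, or the equivalent urllib modules on Python 3) keeps the script alive while reporting what actually went wrong:

```python
# Sketch: fetch a URL, returning its body or None instead of crashing.
try:
    from urllib2 import urlopen, URLError        # Python 2
except ImportError:
    from urllib.request import urlopen           # Python 3
    from urllib.error import URLError

def fetch(url):
    try:
        return urlopen(url).read()
    except URLError as e:        # also covers HTTPError, e.g. a 404 response
        print("could not fetch %s: %s" % (url, e))
        return None
    except ValueError as e:      # malformed URL in the input file
        print("bad url %s: %s" % (url, e))
        return None

# A malformed URL is reported and skipped instead of killing the loop:
print(fetch("definitely not a url"))   # prints None after the error message
```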

If there is a valid response, the script will then store all instances of e-mail addresses in the emails variable:

emails = re.findall(regex, url)

It will then loop through the emails variable and write each item in the list to the emails.txt file and also output it to the console for confirmation:

    for email in emails:
      output.write(email[0]+"\r\n")
      print email[0]
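Putting the pattern and this loop together on a small sample string (a hypothetical page body) shows what findall() returns: one tuple of groups per match, with the full address in element 0:

```python
import re

# The recipe's pattern, with the e-book line-wrap spaces removed.
regex = re.compile(("([a-z0-9!#$%&'*+\/=?^_'{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_'"
                    "{|}~-]+)*(@|\sat\s)(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?(\.|"
                    "\sdot\s))+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)"))

page = "Contact victim@target.com or victim at target dot com today."
for email in re.findall(regex, page):
    print(email[0])   # element 0 of each tuple is the whole address
```

This picks up both victim@target.com and victim at target dot com from the sample text.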

There's more…

The regular expression matching used in this recipe matches two common formats used to represent e-mail addresses on the Internet. During the course of your learning and investigations, you may come across other formats that you might like to include in your matching. For more information on regular expressions in Python, you may want to read the documentation on the Python website at https://docs.python.org/2/library/re.html.
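For example, one common obfuscation writes the separators in square brackets, as in victim[at]target[dot]com. A sketch of how the recipe's pattern could be extended to catch this (the two bracketed alternatives are an addition of mine, not part of the recipe):

```python
import re

# Recipe pattern plus \[at\] and \[dot\] alternatives in the separator groups.
regex = re.compile(("([a-z0-9!#$%&'*+\/=?^_'{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_'"
                    "{|}~-]+)*(@|\sat\s|\[at\])(?:[a-z0-9](?:[a-z0-9-]*"
                    "[a-z0-9])?(\.|\sdot\s|\[dot\]))+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)"))

sample = "Write to victim[at]target[dot]com for access."
print(re.findall(regex, sample)[0][0])   # prints victim[at]target[dot]com
```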

See also

Refer to the recipe Generating e-mail addresses from names for more information.