Metagoofil is a tool that uses the Google search engine to extract metadata from the documents available in the target domain. Currently, it supports the following document types:
Word documents (.docx, .doc), spreadsheets (.xlsx, .xls, .ods), presentations (.pptx, .ppt, .odp), and PDF files (.pdf)
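Metagoofil works by searching Google for documents of the preceding types in the target domain, downloading the documents it finds to the local disk, extracting the metadata from the downloaded documents, and saving the result in an HTML report.
The metadata that can be found includes usernames, software versions, paths and server names, and e-mail addresses.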
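The search step can be pictured as one Google query per file type, restricted to the target domain. The following is a minimal sketch of that idea, not Metagoofil's actual code: the `site:` and `filetype:` operators are standard Google search operators, but the function names and the exact query that Metagoofil sends are assumptions made for illustration.

```python
from urllib.parse import urlencode

# File extensions taken from the supported document types listed above.
FILE_TYPES = ["doc", "docx", "xls", "xlsx", "ods", "ppt", "pptx", "odp", "pdf"]


def build_query(domain: str, file_type: str) -> str:
    """Return a Google query restricted to one domain and one file type."""
    return f"site:{domain} filetype:{file_type}"


def build_search_url(domain: str, file_type: str) -> str:
    """Return a search URL for the query (hypothetical helper for illustration)."""
    return "https://www.google.com/search?" + urlencode({"q": build_query(domain, file_type)})


if __name__ == "__main__":
    # example.com is a placeholder domain, not a real target.
    for file_type in ("doc", "pdf"):
        print(build_search_url("example.com", file_type))
```

Each result URL returned by such a query would then be downloaded and passed to a metadata extractor.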
This information can be used later on to help in the penetration testing phase. Metagoofil is not part of the standard Kali Linux v 2.0 distribution. To install, all you need to do is use the apt-get command:
# apt-get install metagoofil
After the installer package has finished, you can access Metagoofil from the command line:
# metagoofil
This displays brief usage instructions and an example on your screen. As an example of Metagoofil usage, we will collect all the DOC and PDF documents (-t doc,pdf) from a target domain (-d hackthissite.org) and save them to a directory named test (-o test). We limit the search for each file type to 20 files (-l 20) and download only five files of each type (-n 5). The generated report will be saved to test.html (-f test.html). We use the following command:
# metagoofil -d hackthissite.org -l 20 -t doc,pdf -n 5 -f test.html -o test
The redacted result of this command is as follows:
[-] Starting online search...
[-] Searching for doc files, with a limit of 20
Searching 100 results...
Results: 5 files found
Starting to download 5 of them:
----------------------------------------
[1/5] /webhp?hl=en
[x] Error downloading /webhp?hl=en
[2/5] /intl/en/ads
[x] Error downloading /intl/en/ads
[3/5] /services
[x] Error downloading /services
[4/5] /intl/en/policies/privacy/
[5/5] /intl/en/policies/terms/
[-] Searching for pdf files, with a limit of 20
Searching 100 results...
Results: 25 files found
Starting to download 5 of them:
----------------------------------------
[1/5] /webhp?hl=en
[x] Error downloading /webhp?hl=en
[2/5] https://mirror.hackthissite.org/hackthiszine/hackthiszine3.pdf
[3/5] https://mirror.hackthissite.org/hackthiszine/hackthiszine12_print.pdf
[4/5] https://mirror.hackthissite.org/hackthiszine/hackthiszine12.pdf
[5/5] https://mirror.hackthissite.org/hackthiszine/hackthiszine4.pdf
processing
[+] List of users found:
--------------------------
emadison
[+] List of software found:
-----------------------------
Adobe PDF Library 7.0
Adobe InDesign CS2 (4.0)
Acrobat Distiller 8.0.0 (Windows)
PScript5.dll Version 5.2.2
[+] List of paths and servers found:
---------------------------------------
[+] List of e-mails found:
----------------------------
whooka@gmail.com
htsdevs@gmail.com
never@guess
narc@narc.net
kfiralfia@hotmail.com
user@localhost
user@remotehost.
user@remotehost.com
security@lists.
recipient@provider.com
subscribe@lists.hackbloc.org
staff@hackbloc.org
johndoe@yahoo.com
staff@hackbloc.org
johndoe@yahoo.com
subscribe@lists.hackbloc.org
htsdevs@gmail.com
hackbloc@gmail.com
webmaster@www.ndcp.edu.phpass
webmaster@www.ndcp.edu.phwebmaster@www.ndcp.edu.ph
[webmaster@ndcp
[root@ndcp
D[root@ndcp
window...[root@ndcp
.[root@ndcp
goods[root@ndcp
liberation_asusual@ya-
pjames_e@yahoo.com.au
The preceding result shows that the collected documents yield a lot of information, such as usernames and path information. The usernames can reveal the naming pattern the target uses for its accounts and can serve as input for a brute-force password attack. Be aware, however, that a brute-force password attack risks locking out the user accounts. The path information can be used to guess the operating system used by the target. We obtained all of this information without browsing the domain's website ourselves.
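The raw lists usually need some cleanup before they are useful: the e-mail list above contains duplicates and truncated fragments, and any paths found have to be interpreted. The following is a minimal sketch of both steps. The function names, the deliberately strict e-mail pattern, and the path heuristics are assumptions made for illustration and are not part of Metagoofil.

```python
import re

# Deliberately strict pattern: local part, "@", and a dotted domain ending in letters.
EMAIL_RE = re.compile(
    r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}$"
)


def clean_emails(raw):
    """Keep well-formed addresses, dropping duplicates but preserving order."""
    seen = set()
    result = []
    for item in raw:
        candidate = item.strip().lower()
        if EMAIL_RE.match(candidate) and candidate not in seen:
            seen.add(candidate)
            result.append(candidate)
    return result


def guess_os(path):
    """Guess the operating system family from the style of a file path."""
    # A drive letter or a UNC prefix suggests Windows.
    if re.match(r"^[A-Za-z]:\\", path) or path.startswith("\\\\"):
        return "Windows"
    # A leading forward slash suggests Linux, macOS, or another Unix-like system.
    if path.startswith("/"):
        return "Unix-like"
    return "unknown"


if __name__ == "__main__":
    found = ["whooka@gmail.com", "never@guess", "user@remotehost.",
             "htsdevs@gmail.com", "htsdevs@gmail.com", "[root@ndcp"]
    print(clean_emails(found))
    # The path below is a hypothetical example, not taken from the output above.
    print(guess_os(r"C:\Documents and Settings\user\report.doc"))
```

A pattern check only rejects malformed strings; a fragment such as webmaster@www.ndcp.edu.phpass is still syntactically valid and would need a manual review.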
Metagoofil is also able to generate information in a report format. The following screenshot shows the generated report in HTML:

The generated report contains the usernames, software versions, e-mail addresses, and server information found for the target domain.
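The same categories also appear in the console output under the "[+] List of ... found:" headers, so they can be collected by a script as well. The following is a minimal sketch that groups the console lines by header. It assumes the output has been captured as text with one item per line, as a terminal prints it, and the `parse_sections` function is a hypothetical helper rather than part of Metagoofil.

```python
HEADER_PREFIX = "[+] List of "
HEADER_SUFFIX = " found:"


def parse_sections(output: str) -> dict:
    """Group output lines under each "[+] List of ... found:" header."""
    sections = {}
    current = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith(HEADER_PREFIX) and line.endswith(HEADER_SUFFIX):
            # The text between prefix and suffix names the section, e.g. "users".
            current = line[len(HEADER_PREFIX):-len(HEADER_SUFFIX)]
            sections[current] = []
        elif current is not None and line and set(line) != {"-"}:
            # Skip blank lines and the dashed separator under each header.
            sections[current].append(line)
    return sections


if __name__ == "__main__":
    captured = "\n".join([
        "[+] List of users found:",
        "--------------------------",
        "emadison",
        "[+] List of software found:",
        "-----------------------------",
        "Adobe PDF Library 7.0",
    ])
    print(parse_sections(captured))
```

Sections that the tool prints with no entries, such as the paths and servers list in the preceding run, come back as empty lists.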