I'm learning pentesting with the book "Violent Python" (highly recommended), and one of the exercises is to write a script that brute-forces a password-protected ZIP file.
The script works very well against ZIP 2.0 (portable) encryption; archives encrypted with 128-bit or 256-bit AES are more secure and take longer to crack. The problem is that, when run with a dictionary, it not only finds the password quickly but also reports more than one valid password. I don't understand why this happens.
The test setup is the following:
- ZIP file encrypted with the password: yoda
- TXT file with 2300 single words
Script used:
    import zipfile
    from threading import Thread

    def extractFile(zFile, password):
        try:
            zFile.extractall(pwd=password)
            print '[+] Found password ' + password + '\n'
        except:
            pass

    def main():
        zFile = zipfile.ZipFile('archivo.zip')
        passFile = open('passwords.txt')
        for line in passFile.readlines():
            password = line.strip('\n')
            t = Thread(target=extractFile, args=(zFile, password))
            t.start()

    if __name__ == '__main__':
        main()
Result
    >>>
    [+] Found password Carrie
    [+] Found password cocacola
    [+] Found password eagle1
    [+] Found password jean
    [+] Found password panda
    [+] Found password Grover
    [+] Found password cfi
    [+] Found password beautifu
    [+] Found password yoda    <- I put this one at the end of the dictionary
    >>>
As you can see, it finds more than one password and, worst of all, every one of them works: the ZIP file can be decrypted with any of these passwords.
To investigate further, my questions are:
- What algorithm does ZIP use for this encryption?
- How is it possible to decrypt with more than one password?
Just because several passwords are accepted doesn't mean they are all valid ( https://security.stackexchange.com/q/33081/13877 ). Only the true password gives a consistent result when decompressing; with any of the others the output is meaningless data. The script cannot tell meaningful files from files whose content is garbage, so it reports several passwords as valid.
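One way to let a script tell the difference is to read every member back and rely on the CRC-32 check that `zipfile` performs on the decompressed data. The sketch below is written for Python 3 (the script in the question is Python 2), so the candidate must be `bytes`. The function name `password_opens_archive` is made up for illustration, and I have only thought it through for legacy ZIP encryption with stored or deflated members.

```python
import zipfile
import zlib


def password_opens_archive(archive, candidate):
    """Return True only if every member decrypts, decompresses and
    passes its CRC-32 check with the candidate password (bytes).

    `archive` may be a path or a seekable file-like object.
    """
    try:
        # Open the archive afresh for each candidate instead of
        # sharing a single ZipFile object between callers.
        with zipfile.ZipFile(archive) as zf:
            for info in zf.infolist():
                # read() works in memory and raises if the password is
                # rejected, the data cannot be decompressed, or the
                # CRC-32 of the output does not match the stored value.
                zf.read(info, pwd=candidate)
        return True
    except (RuntimeError, zipfile.BadZipFile, zlib.error):
        # RuntimeError: password rejected by the header check.
        # BadZipFile / zlib.error: decrypted bytes were garbage.
        return False
```

With a check like this, a candidate is reported only when the whole archive decrypts to data whose checksums match, which is very unlikely for a wrong password. Reading in memory also means nothing is written to disk for rejected candidates.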
Any short password is easy to crack with a brute-force attack, and any password that is a common word is easy to crack with a dictionary attack (what you're doing is really a dictionary attack rather than brute force). The password "yoda" would be easy to crack whatever the cipher, not just in a ZIP file.
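To make the distinction concrete, here is a minimal Python 3 sketch of the two ways of generating candidates. The names `dictionary_candidates` and `brute_force_candidates` are hypothetical, and the lowercase-only alphabet is an assumption chosen because it covers "yoda".

```python
import itertools
import string


def dictionary_candidates(words):
    """Yield each non-empty word of a wordlist as a bytes candidate."""
    for word in words:
        word = word.strip()
        if word:
            yield word.encode("utf-8")


def brute_force_candidates(alphabet, max_length):
    """Yield every string over `alphabet` of length 1..max_length."""
    for length in range(1, max_length + 1):
        for letters in itertools.product(alphabet, repeat=length):
            yield "".join(letters).encode("utf-8")


# Number of lowercase passwords of 1 to 4 characters: 26 + 26^2 + 26^3 + 26^4
search_space = sum(len(string.ascii_lowercase) ** n for n in range(1, 5))
print(search_space)  # 475254
```

A dictionary attack tries only the words in the list (2300 in the question), while brute force over lowercase letters reaches "yoda" within at most 475254 attempts, a number a script can get through quickly.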