Repository mode in Tortazo

Generation of onion addresses in Tor.

Every hidden service in Tor is identified by an address that users need in order to reach the service. Unlike ordinary Internet addresses, these are not easy to remember: they have a fixed length (16 chars) and are generated automatically by Tor when you set the “HiddenServiceDir” and “HiddenServicePort” options in your torrc configuration file. Tor uses public key cryptography here: it generates an RSA-1024 key pair for the hidden service, computes a SHA-1 hash of the public key, and encodes the first half of that hash (80 bits) in Base32. As a result, an onion address can only contain the digits 2 to 7 and the letters “a” to “z”. This means there are 32 possible values for each of the 16 chars in the onion address.
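The derivation described above can be sketched in a few lines of Python. This is only an illustration, not code taken from Tor or Tortazo: the function name `onion_address` is made up for the example, and the input is placeholder bytes standing in for a real DER-encoded RSA-1024 public key.

```python
import base64
import hashlib

def onion_address(public_key_der: bytes) -> str:
    """Derive a 16-char onion label from public key bytes (illustrative)."""
    digest = hashlib.sha1(public_key_der).digest()  # 20 bytes (160 bits)
    first_half = digest[:10]                        # 10 bytes (80 bits)
    # 80 bits / 5 bits per Base32 symbol = 16 chars, so no padding appears.
    return base64.b32encode(first_half).decode("ascii").lower()

# Assumption: placeholder bytes, NOT a real hidden service key.
example = onion_address(b"placeholder bytes standing in for a public key")
print(example + ".onion")
```

Because Base32 packs 5 bits into each symbol, the 80-bit half of the hash always yields exactly 16 chars drawn from “a” to “z” and 2 to 7.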

Problem: a huge, very huge list of possible combinations!

The number of combinations can be enormous, depending on how many chars of the onion address are unknown. For example, if you have the first 15 chars of an onion address, the number of combinations is small: 32^1 = 32. If you have the first 14 chars, you need to discover two chars, each of which can take 32 different values, so you compute the cartesian product (http://en.wikipedia.org/wiki/Cartesian_product) of those two positions. In this case the number of combinations is still quick to calculate: 32^2 = 1024. But what happens if you are missing 6 or 8 chars of an onion address? And what happens if you don’t have any character at all? Then you have a problem, because the number of combinations quickly grows beyond what any computer on earth can process. This is why a hidden service is really “hidden”. If you have the first 10 chars, then:

32^6 = 1073741824

If you don’t have any character of the onion address at all:

32^16 = 1208925819614629174706176

Do you see the problem we have here? The number of combinations is more than we can process on our computer (or even on a cluster). We don’t have enough processing capacity or enough time. For example, assume you have a powerful machine that can process 10 million 16-character strings per second. How many years would it take to process all 32^16 addresses? Let’s calculate it:

>>> 32**16 / (10000000*60*60*24*365)
3833478626L

OK, almost 4 billion years. Do you dare? ;-)
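The same arithmetic can be wrapped in two small helpers to try other cases. This is a sketch for experimenting with the numbers above; the function names and the default rate of 10 million strings per second are only the assumptions used in this section, not part of Tortazo.

```python
ALPHABET_SIZE = 32    # letters a-z plus digits 2-7
ADDRESS_LENGTH = 16   # chars in an onion address

def combinations_left(known_chars: int) -> int:
    """Number of candidate addresses when `known_chars` chars are known."""
    return ALPHABET_SIZE ** (ADDRESS_LENGTH - known_chars)

def years_to_enumerate(candidates: int, rate_per_second: int = 10_000_000) -> int:
    """Whole years needed to process `candidates` strings at the given rate."""
    seconds_per_year = 60 * 60 * 24 * 365
    return candidates // (rate_per_second * seconds_per_year)

for known in (15, 14, 10, 0):
    print(known, combinations_left(known))
print(years_to_enumerate(combinations_left(0)))
```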

Purpose of Onion Repository mode:

By its nature, this is a problem we can’t solve completely with programming techniques alone. That doesn’t mean we can’t generate a lot of onion addresses and check whether each one has a hidden service behind it. Although we can’t cover the full space of possible onion addresses, for obvious reasons, we can generate and process a lot of them using multiprocessing and a good computer (or computers). The onion repository mode does not try to generate all possible onion addresses, just subsets of them, and it tries to connect to those addresses to verify whether a hidden service is up and running. To do that, Tortazo uses a classical “Producer-Consumer” model built on the multiprocessing module included in Python. Some processes generate valid onion addresses using the random or incremental mechanism (you’ll see the difference in the next sections), and other processes perform connections to the onion addresses generated by the producer.

Producer part

The producer part starts a generator of onion addresses in incremental or random mode, and every generated onion address is inserted into a shared resource: a queue. The user can set the number of processes used by the generator. The default number is 10.

Consumer part

The consumer part starts the same number of processes as the generator part, but those processes are created and terminated dynamically to process every onion address in the shared queue. When the producer inserts an element into the queue, the consumer processes are notified and the defined callback function is triggered, which performs a connection to the address obtained from the queue. If the response is valid, the processor inserts the details of the response into a separate queue, which is processed in parallel. The callback function for the queue of valid responses tries to insert every response into the database. Both queues are processed separately, without blocking or interfering with the generator of onion addresses or the processor itself.
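The two queues described above can be illustrated with a minimal producer-consumer sketch. It is not Tortazo’s code: to stay short and portable it uses threads and `queue.Queue` instead of the multiprocessing module, and the names `run` and `fake_probe` are hypothetical. The probe is a stub that makes no network connection.

```python
import queue
import threading

SENTINEL = None  # tells a consumer that no more addresses are coming

def producer(addresses, pending, n_consumers):
    """Put every address on the shared queue, then one sentinel per consumer."""
    for address in addresses:
        pending.put(address)
    for _ in range(n_consumers):
        pending.put(SENTINEL)

def consumer(pending, responses, probe):
    """Take addresses from the queue and record the ones that answer."""
    while True:
        address = pending.get()
        if address is SENTINEL:
            break
        result = probe(address)
        if result is not None:
            responses.put((address, result))  # second queue: valid responses

def run(addresses, probe, n_consumers=4):
    pending, responses = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=consumer, args=(pending, responses, probe))
               for _ in range(n_consumers)]
    for worker in workers:
        worker.start()
    producer(addresses, pending, n_consumers)
    for worker in workers:
        worker.join()
    found = []
    while not responses.empty():  # safe here: all workers have finished
        found.append(responses.get())
    return sorted(found)

def fake_probe(address):
    """Hypothetical stub: pretends that exactly one address is alive."""
    return "200 OK" if address == "aaaa2222bbbb3333" else None

print(run(["aaaa2222bbbb3333", "cccc4444dddd5555"], fake_probe))
```

In a real run, the second queue would feed the code that stores valid responses in the database, and the probe would attempt a connection through Tor.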

To get a clear idea of the internals of Tortazo in this mode, the following image shows the components explained before.

_images/OnionRepository.png

Onion Repository in Incremental mode

Because this problem is really about computer science and processing power, there are few programming techniques you can apply to reduce the complexity that comes from the number of addresses. However, the Incremental mode of the “Onion Repository” in Tortazo tries to “divide and conquer”. How? If the user enters some chars of the onion address, Tortazo calculates the number of unknown chars and splits them into groups of 4 (quartets). Although the number of possible combinations is the same, the memory used by the cartesian product is smaller for blocks of 32^4 than for blocks of 32^n (where “n” is a value between 1 and 16). For example, if you have this:

dfrh5uig61u6

Tortazo detects that 4 chars are missing and generates just 1 quartet, which is used for the cartesian product. If you have this:

dfrh5uig

Tortazo detects that 8 chars are missing and generates 2 quartets: the first generates the combinations for the chars between 8 and 12, and the second generates the combinations for the chars between 12 and 16. If you have this:

dfrh5ui

Tortazo detects that 9 chars are missing and generates 3 quartets: the first generates the combinations for the single leftover char, the second for the chars between 8 and 12, and the third for the chars between 12 and 16. The user can also enter a limited set of chars to work with. This considerably reduces the number of combinations and the amount of data, but also the number of addresses tested. For example, if the user enters the chars “2defrtg46”, there are 9^n combinations, where “n” is the number of unknown chars in the partial onion address entered by the user. This mechanism reduces memory usage but, sadly, not the complexity or the number of onion addresses. As I’ve said before, this is a problem of computer science and processing power, not of programming.
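The grouping into quartets can be sketched as follows. This is an illustration under stated assumptions, not Tortazo’s actual generator: `group_sizes` and `candidates` are hypothetical names, and combining the groups with an outer `itertools.product` is just one way to do it.

```python
from itertools import product

BASE32_CHARS = "abcdefghijklmnopqrstuvwxyz234567"
ADDRESS_LENGTH = 16

def group_sizes(unknown: int, block: int = 4):
    """Split the unknown length into a leftover group plus full blocks."""
    leftover, full = unknown % block, unknown // block
    return ([leftover] if leftover else []) + [block] * full

def candidates(partial: str, chars: str = BASE32_CHARS):
    """Lazily yield every full address that starts with `partial`."""
    sizes = group_sizes(ADDRESS_LENGTH - len(partial))
    blocks = [product(chars, repeat=size) for size in sizes]
    # The outer product stores each block's combinations (at most
    # len(chars)**4 entries per block), but the full list of addresses
    # is never held in memory: addresses are yielded one at a time.
    for combo in product(*blocks):
        yield partial + "".join("".join(block) for block in combo)

print(group_sizes(4), group_sizes(8), group_sizes(9))
# A reduced char set shrinks the search: 2 chars, 2 unknown positions.
print(list(candidates("a" * 14, chars="ab")))
```

With a reduced set such as “2defrtg46”, the generator visits 9^n addresses instead of 32^n, which is the trade-off described above.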

Usage examples for the Onion Repository in Incremental mode

Try to generate and analyze the combinations using the partial onion address dfrh5uig61u6

* -R / --onion-repository: Activates the onion repository in Tortazo and performs HTTP connections.
* -O / --onionpartial-address: Specifies the partial onion address for the Incremental mode in the Onion repository mode.
* -W / --workers-repository: Specifies the number of worker processes to use in the onion repository.

python Tortazo -v -R http -O dfrh5uig61u6 -W 15

Try to generate and analyze the combinations using the partial onion address dfrh5uig

* -R / --onion-repository: Activates the onion repository in Tortazo and performs HTTP connections.
* -O / --onionpartial-address: Specifies the partial onion address for the Incremental mode in the Onion repository mode.
* -W / --workers-repository: Specifies the number of worker processes to use in the onion repository.

python Tortazo -v -R http -O dfrh5uig -W 15

Onion Repository using Random mechanism:

If you just want to gather onion addresses, without any pattern or information about a concrete hidden service, the Incremental mode can be a very expensive task, and your computer will probably hang trying to guess every char of the onion address. So, if you’re a little curious and want to test arbitrary onion addresses, the random mode is for you. In this mode, the onion address generator produces random 16-char onion addresses. It is like firing a gun into a sky full of ducks: there is no guarantee of success, but if you’re lucky, you’ll hit one. To activate the random mode, use the keyword “RANDOM” as the value of the switch “-O / --onionpartial-address”, and use the switch “-R / --onion-repository” to specify the service type.

python Tortazo -v -R ftp -O RANDOM -W 10
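To give an idea of what the random mechanism produces, here is a minimal sketch of a random address generator. It is not Tortazo’s generator: the function names are hypothetical, and the optional seed exists only to make the example reproducible.

```python
import random

BASE32_CHARS = "abcdefghijklmnopqrstuvwxyz234567"

def random_onion_address(rng=random) -> str:
    """Pick 16 chars independently from the 32-symbol onion alphabet."""
    return "".join(rng.choice(BASE32_CHARS) for _ in range(16))

def random_addresses(count: int, seed=None):
    """Return `count` random addresses; a fixed seed repeats the same list."""
    rng = random.Random(seed)
    return [random_onion_address(rng) for _ in range(count)]

for address in random_addresses(3):
    print(address + ".onion")
```

Each generated address is just one of the 32^16 possibilities, so almost all of them will not correspond to a running hidden service.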

Load Known Onion Addresses:

Onion addresses are essential for performing enumerations or attacks with the plugins available in Tortazo. Since many onion addresses are already known on the Internet or listed by deep web search engines, it makes sense to load them into the database so that any plugin can use them. In Tortazo, the file <TORTAZO_DIR>/db/knownOnionAddresses.txt contains 400+ onion addresses, which are loaded into the local database by default when the user activates the onion repository mode with the switch “-R / --onion-repository”. If you want to disable this behaviour, set the property “loadKnownOnionSites” to False in the configuration file <TORTAZO_DIR>/config/config.py

Searching for specific services:

The onion repository mode can perform connections to services such as HTTP, SSH and FTP, so you can discover hidden services that use those protocols. Also, if you specify “onionup” as the argument of the “-R / --onion-repository” switch, Tortazo will send HTTP requests to the service https://onionup.com to check whether a hidden service is running at the specified address.