George Malandrakis: Protecting your anonymity on the internet

Greek version

(Original title: How to stay anonymous on the internet)

Introduction

The purpose of this article is to give advice and tips to whoever is interested in protecting their data and identity and staying anonymous on the internet. Or, to be precise, almost anonymous, since -as we will see with several examples- it is always possible that those who want to deprive you of your anonymity employ more advanced techniques than you.

In the present article the term "anonymity" is used in two ways: the first is the typical one -preventing attackers from tracing internet activity back to specific persons or corporations. The second is the prevention of data theft in general, since the data stored in a computer can lead to the identification of an otherwise anonymous user.

Instead of limiting ourselves to general and well-known techniques (use Tor, reject cookies, etc.) we will introduce the reader to some more technical and useful details, including some methods attackers use to identify users of anonymity networks, the role of the operating system, and the newly developed browser-fingerprinting techniques, along with some prevention methods.

The reader should be aware that attackers may always use more advanced techniques than someone who wants to protect his anonymity. It should also be noted that the techniques described here were in use as of 2018.

Anonymity, crime, and the "rationalization" of mass-surveillance

"Why should I want to stay anonymous? I have nothing to hide. Anonymity can be useful only for criminals and perverts".

A typical "argument" of those who claim to have no problem with mass surveillance whatsoever -that is, authorities and third parties being aware of everything we do online. But how would something similar for freedom of speech sound?

"Why should I want freedom of speech? I have nothing to say. Freedom of speech can be useful only for insulters and defamers."

Even if you have nothing to say, others do. And what they have to say may be unrelated to insults and defamation -it may just be something that your government does not want you to know. The same goes for anonymity: even if most of us have nothing to hide, others do. And it may be unrelated to crime and perversion -it may be something against the interests of a government or a corporation.

It does not take a rocket scientist to understand that the data collected from our use of the internet can be used against us. The foremost purpose of mass surveillance is not the "protection of society from criminals and terrorists", but the protection of our governments from dissidents. And in the extreme case of a totalitarian regime gaining power in a country, the collected data will be used to detect people with particular political views, religious beliefs, or even sexual preferences.

History is full of plot-twists, and there is no guarantee that we won't find ourselves under a totalitarian regime some day -no matter where we live. It would be foolish if we had given the regime the right and power to control and access all of our information in advance. The "logic" of "I have nothing to hide -therefore we do not need anonymity" is ridiculous.

Furthermore, we must always take into account the factor called "human nature": even if you have nothing to hide when it comes to what you do in your bed and bathroom, you would not feel OK if someone installed a camera there and watched you. There is no particular reason for that; it is just human nature. Those who would not want a camera recording their most private moments owe no explanation -just like those who do not want to be watched on the internet. They are simply not comfortable with it.

What is funny here is that those who speak about "human nature" to justify the avarice of the rich suddenly remember "logic" and "reason" when it comes to the right not to be surveilled. Always wanting more even if you have enough is justified because it is "human nature" -but if you don't like being watched for the same reason, it is "illogical" and "irrational".

To finish, here are some people who need anonymity without being perverts or terrorists:

Independent journalists
Atheists of Iran and Saudi Arabia
Non-communists of China and North Korea
Political dissidents of Turkey, Russia, and other semi-dictatorships
People fighting against the drug lords and corrupt governments of South America

Remaining anonymous!

1. Use Tor or any anonymity network (with caution)

It is well known that an IP address is extremely easy to hide by using a proxy server. In essence, a device connected to the internet connects to a system with a different IP address, and its subsequent visits to websites seem to come from the latter (provided that cookies are disabled). Yet even with cookies disabled, it is easy for the authorities to figure out who did what through the proxy server: the connections to it are recorded in logs, so even if the website one visits gets "fooled", the authorities will not be once they obtain those logs with a search warrant. This problem can be solved with the use of the Tor network, which you can access by downloading the Tor Browser.

We ought to elaborate on how Tor works and on its potential vulnerabilities.

Instead of connecting directly to the requested website, the user connects to a router belonging to the Tor network, called the entry node. The user's device now "sees" the entry node instead of the website and carries out all communications through it. The entry node then connects to a middle node, which is unaware of the IP or any other information related to the user. The middle node receives data and requests from the entry node rather than from the user, and relays traffic in both directions without ever dealing with the user directly. Unless the middle node is connected to yet another node, forming a longer "chain", it is connected directly to the exit node, which in turn connects to the website. That means that the website has information only about the exit node, and not about the user requesting the information.

The nodes and the user form a Tor circuit, within which data is sent and received. The user can change the circuit at will -an action that terminates the current connection to the website.

The strength of this system lies in the fact that the nodes keep no log of their connections, so it is not possible to find out which node connected to which. The only thing the authorities can find is the IP address of the exit node, without being able to locate the others. The data is encrypted with a different layer for every node, so even if someone somehow intercepts it, it will be of no use.
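The layering described above can be sketched in a few lines of Python. This is only a toy: the XOR keystream derived from SHA-256 stands in for real encryption and is not the cipher Tor uses, and the three keys are hypothetical placeholders for keys the client would negotiate with each node. It merely illustrates that each node removes one layer and that only the exit node ends up with the readable request.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive a toy keystream by hashing key + counter (illustration only)."""
    blocks = []
    counter = 0
    while 32 * len(blocks) < length:
        blocks.append(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    return b"".join(blocks)[:length]

def xor_layer(data: bytes, key: bytes) -> bytes:
    """Add or remove one layer; XOR with the same keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def wrap(message: bytes, keys: list[bytes]) -> bytes:
    """Client side: wrap the message once per node, innermost layer for the exit."""
    for key in reversed(keys):
        message = xor_layer(message, key)
    return message

# Hypothetical per-node keys (assumption, not real Tor key material)
keys = [b"entry-key", b"middle-key", b"exit-key"]
onion = wrap(b"GET /index.html", keys)

# Each node peels exactly one layer and forwards the rest
after_entry = xor_layer(onion, keys[0])
after_middle = xor_layer(after_entry, keys[1])
after_exit = xor_layer(after_middle, keys[2])  # readable only at this point
```

In this sketch the entry and middle nodes only ever handle bytes that still carry at least one layer they cannot remove.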

The strongest proof of the trustworthiness of the Tor network is the extensive (and unfortunate) use of it by drug lords and terrorists, along with the fact that both the US and the Russian government offer millions to whoever comes up with a way to hack into the Tor network. Yet we ought to be very careful when using it and have some doubt as to whether it is perfectly anonymous for everyone and under all circumstances.

Known methods of deanonymization of Tor users

One method of deanonymizing Tor users is based on surveilling the connections to and from the network, and then comparing the timing, the size of the data packets, and the transmission patterns of a suspect's communication with the entry node against those of the exit nodes' communication with websites (traffic analysis). It is even possible to interfere with the data in order to find out to whom it is heading, despite the encryption.

The simplified idea is this: if a user receives a file of 15.4 MB divided into 513 packets, and at nearly the same time there is a node transmitting 513 packets of the same total size, it is highly probable that the user is connected to that node. If the attacker also alters the transmission pattern -for instance, by transmitting the packets with particular variations of the transmission delays- the certainty increases considerably. A hacker can obtain control over several nodes in the network and keep track of the data traffic through them. The nodes would then be classified as "bad relays".
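The simplified idea can be written down as a small Python sketch. The `Flow` record, the tolerances, and the three observed exit-side flows are all hypothetical; real traffic analysis is statistical and far more involved, so this only shows the kind of comparison being made.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    label: str
    start: float       # seconds since an arbitrary reference time
    packets: int
    total_bytes: int

def correlates(a: Flow, b: Flow, max_skew: float = 2.0, size_tol: float = 0.02) -> bool:
    """True when two flows start close together and carry about the same volume."""
    close_in_time = abs(a.start - b.start) <= max_skew
    same_count = a.packets == b.packets
    largest = max(a.total_bytes, b.total_bytes)
    similar_size = abs(a.total_bytes - b.total_bytes) <= size_tol * largest
    return close_in_time and same_count and similar_size

# What the attacker sees near the suspect (made-up numbers from the example above)
user_side = Flow("suspect <- entry node", start=100.0, packets=513, total_bytes=15_400_000)

# What the attacker sees at the exit side of the network (made-up numbers)
exit_side = [
    Flow("exit A -> site 1", start=99.6, packets=513, total_bytes=15_400_000),
    Flow("exit B -> site 2", start=100.2, packets=87, total_bytes=1_200_000),
    Flow("exit C -> site 3", start=340.0, packets=513, total_bytes=15_400_000),
]

matches = [f.label for f in exit_side if correlates(user_side, f)]
```

Here only the first exit-side flow matches on timing, packet count, and size; the third has the same volume but happens minutes later.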

Bad relays in the Tor network are a real problem, and even though several techniques have been developed to locate them, they are often removed later than they should be. One defense against traffic analysis attacks is the addition of noise (that is, packets with no real data) to the network, which means that the nodes send "nonsensical" packets to one another, making it hard for attackers to carry out a traffic analysis. Another defense relies on the fact that, even if an attacker controls several nodes, he certainly does not control all of them, which means that if someone changes his circuit, a bad relay will be able to surveil him only for a short timeframe.

Another method was implemented in 2011 during Operation Torpedo, leading to the arrest of several suspects. In particular, after the authorities managed to take control of some Tor services, they sent all visitors a script that pinged a server belonging to the FBI, using the real IP address of the victim. The authorities then proceeded to "visit" all suspects whose IP pinged the server. This attack would not have been possible without Adobe Flash, yet similar methods may be used with other tools.

In 2013, the widely known website called Freedom Hosting, the servers of which contained child pornography among other things, went down along with several other websites of the Tor network, when the authorities exploited a vulnerability of Firefox, using JavaScript to make the devices of the visitors send their real IP and MAC address to the FBI.

The common denominator of all those attacks was that they were based on Flash and JavaScript, both of which are disabled when one uses Tor with the proper security settings.

The Tor Browser gives the user the ability to disable everything that can put his anonymity in danger by selecting the high security level -even though that makes browsing harder, since large parts of most websites are based on JavaScript and Flash.

From time to time, various vulnerabilities are discovered -a recent example being TorMoil-, as well as some methods of deanonymization unrelated to the Tor network itself -a typical instance being the so-called evercookies, which are examined later.

Another rather peculiar method of deanonymization is the use of the patterns of one's mouse movement! The way one scrolls down and moves the mouse on the screen differs from person to person. So, it is highly recommended to... use your mouse differently when on Tor (or just disable JavaScript, since it is once again to blame).

Despite the possible attacks, Tor is still a powerful tool for whoever wants to protect himself on the internet, and identifications of its users are rather the exception, even after thorough investigation. The extensive use of Tor by criminals is the ultimate proof of that -just like the fact that the so-called "whistleblowers", who publish highly classified documents, make extensive use of the Tor network.

In any case, we ought to be very careful with Tor. For the reasons we have elaborated, it is strongly recommended that you use it with the high security level*, which among other things means that JavaScript is disabled by default, along with all plug-ins. Changing your Tor circuit periodically is also of great importance. For reasons to be elaborated later, it is also recommended to use it on systems that run Linux, BSD, or Qubes OS, preferably in a Virtual Machine. Generally, it is better to use Tor after taking all the measures described in this article.

A noteworthy alternative to the Tor network is Freenet.

*During the translation of this article, a zero-day exploit for the high security level was revealed. So far, the extent to which it was used in the wild is unclear, as is whether there have been any arrests because of it, but the fact remains: you have to be very careful with Tor. And always make sure you have the most recent version of it.

2. Use a secure Operating System

It is well known that everything is surveilled in the commercial Operating Systems (such as Windows and MacOS). Windows 10, for instance, has the capability of tracking everything you type, as if there were a keylogger enabled by default. Of course, the user is given the option to disable this kind of tracking, and even if he does not, not all of his data would be sent to Microsoft (that would result in an awfully large amount of data, costly to store and difficult to use, with the greatest part of it being literally useless). Yet the fact remains the same: they are capable of watching everything you do.

It is also widely known that the vast majority of viruses -some of which could serve the purpose of deanonymizing a user of an anonymity network- are designed for Windows. This is so both because the majority of users run Windows, and because it would be more difficult to design a program capable of running on more secure Operating Systems without the consent of the administrator.

There is no guarantee that Windows does not track everything you do on the Tor network (even though that seems implausible, given how few visitors and owners of illegitimate websites get caught), just like no one can be sure about what kind of information may be sent to Microsoft without you being aware of it. The source code of Windows, just like that of MacOS, besides being a little... hidden, is also highly intricate -which means that even if we gained access to it we would have a very tough time finding all the parts concerned with stealing the user's data.

Hence, if someone wants to protect his data and anonymity, it would be prudent to abandon commercial OSes and switch to another one -preferably an open source one, so that every change in it can be checked by numerous independent developers capable of finding backdoors. Let's take a look at our main alternatives, focusing on the most paranoid ones in terms of security.

α) Linux (generally)

Linux is modeled on Unix, and was first released during the early 90s. There have been innumerable distributions (versions) of it ever since, among others Ubuntu, Debian, Slackware, Fedora, Mint, and Arch, to name some of the most widely used. Its core characteristics are that it lets the user customize it to his needs -if he is experienced enough with programming-, and that it is Open Source -which means that anyone can access and edit its code, and any updates are checked not only by numerous experts, but by anyone interested.

Viruses for Linux are barely existent, and the fact that its code is open source makes the deliberate planting of backdoors in it far harder to hide.

β) OpenBSD

Considered to be the safest OS developed so far (even though Qubes OS may be safer). Based on the Unix-like Berkeley Software Distribution, OpenBSD is one of the toughest operating systems you may ever encounter. The fact that even the GUI must be manually installed is indicative of the toughness. As someone who has some experience with it, I can assure you that it is utterly unsuitable for anyone not infinitely patient with computers.

Nevertheless, it is the best choice for those who seek near-perfect security, since in the twenty-three years OpenBSD has been in use only two remote holes have been detected, while the source code is under constant auditing by experienced developers checking it for bugs and backdoors. Even experts who consider BSDs untrustworthy due to the lack of a large number of developers working on them recognize that OpenBSD is a powerful and safe OS.

Yet we ought to be slightly suspicious even when it comes to an OS with such a reputation since, as previously mentioned, not many programmers work on it, and if there were more of them it is highly probable that more security holes would have been found. Keep in mind that there have been accusations that the FBI paid developers working on OpenBSD to plant backdoors in its cryptographic framework -even though it was never proved that they were implemented, and it has been a decade since the rumor appeared.

OpenBSD's security is a result of the simplicity of its code, and extensive experimentation in a VM is recommended before one installs it on his system.

γ) Qubes OS

Another Unix-based OS. Even though it is not (yet) reputed to be the safest out there, since it has been tested for only about a fifth of the time OpenBSD has held that reputation, it can certainly be considered the most paranoid OS developed so far. Qubes OS implements a radically different approach from the other security-oriented OSes, assuming that the user is already hacked!

With that assumption as a starting point, the system is divided into cubes (hence the name), fully separated from one another. Those "cubes" are basically Virtual Machines (VMs): each uses its own part of the RAM, fully separated from the others, along with its own part of the CPU and cache. The "cube" of the file-system is distinct from the "cube" for the browser or for the programs.

The idea behind this is that even if someone successfully hacks into one part of the user's system, he will still be unable to hack into the other "cubes" -so, if someone hacks the cube of the internet, it will be impossible to get access to the files, since the internet's "cube" has no connection to that of the files. In Qubes OS, even copy-pasting from one cube to another is impossible without special commands. Such a system structure, if properly implemented, makes it technically impossible to steal data or plant evercookies.

Even though it has been shown from time to time that VMs may not in all cases be so perfectly separated from one another (e.g. the Venom vulnerability), generally speaking they are separated, and the structure of Qubes OS leaves little room for security holes. The only serious disadvantage is that it cannot run on all types of hardware.

δ) Whonix

A Linux distro widely used for security and anonymity, which implements an approach reminiscent of that of Qubes OS: it uses two distinct VMs, the first of which is connected to the internet (using Tor), while the other is an offline "workstation".

Its creators assert that even malware gaining root privileges would be incapable of leaking the real IP of the user, since the applications themselves are unaware of it -something that makes this OS very useful for those who want to stay anonymous.

Strongly recommended for experimentation.

ε) Tails

Tails is a live Linux distro (that is, it can be run from a USB stick or CD-ROM without being installed), characterized by an extensive use of the Tor network and by the fact that it does not leave any kind of footprint on the system on which it is used, since everything is deleted after the shutdown.

Tails uses RAM exclusively, and does not write anything on any disk unless explicitly requested to do so -but for extra safety one can physically disconnect any HDDs to prevent programs from leaving footprints on the disks, in case hackers somehow make that possible.

3. Prevent device fingerprinting

The idea of having every device recognized by a unique identifier is certainly neither recent nor revolutionary. Ever since the first days of the internet, it has been necessary to have an identifier that distinguishes devices from one another, so that each receives the data packets meant for it: whenever one downloads a file from a website, the website has to know where to send the data. Ideally, each device has a unique identifier -and on the internet that would, theoretically, be the IP address.

Yet the number of devices connected to the internet is larger than the number of possible IP addresses -at least for the time being, since IPv4 is still in extensive use (IPv6 covers an astronomical number of addresses)-, so we are forced to give a unique address only to particular devices (modems, routers, etc.) and connect to the internet through them rather than directly (the NAT technique). So, an IP address cannot be used as a unique identifier, since hundreds of devices may be connected to the internet through the same IP, assigned to e.g. a router.
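A minimal sketch of the NAT idea, assuming a hypothetical `Nat` class and made-up addresses (the public one is taken from a range reserved for documentation): the router keeps a table that maps each internal connection to a port on its single public address, so outside observers see one IP for every device behind it.

```python
class Nat:
    """Toy address translator: many private endpoints share one public IP."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        # (private_ip, private_port) -> public port handed out by the router
        self.table: dict[tuple[str, int], int] = {}

    def translate(self, private_ip: str, private_port: int) -> tuple[str, int]:
        """Return the (public_ip, public_port) the outside world will see."""
        key = (private_ip, private_port)
        if key not in self.table:
            self.table[key] = self.next_port
            self.next_port += 1
        return (self.public_ip, self.table[key])

nat = Nat("203.0.113.7")
laptop = nat.translate("192.168.1.10", 51000)
phone = nat.translate("192.168.1.11", 51000)
# Both devices appear under the same public IP; only the port differs.
```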

An IP Address cannot be the unique identifier of a device.

Even with IPv6, there is no guarantee that this technique will cease to exist, and certainly there is no guarantee that Tor would stop being in use. So, even with an over-adequate number of IP addresses, the authorities would not be able to use them as identifiers.

The impracticality of using the MAC address -which is by definition unique- for purposes other than routing packets in home networks made it necessary to develop other methods to uniquely identify users on the internet. A typical example is the so-called cookie. Yet it soon became clear that every device has plenty of characteristics that, if not unique, are certainly rare and fixed for each device, and that using several of them at the same time could identify it uniquely.

There are numerous devices using Windows, and numerous devices using Google Chrome, but how many have both of them installed? Among them, what percentage uses e.g. Greek as a language? Of the latter, how many would use a particular set of fonts and how many would have a particular plug-in installed? It is obvious that by minimizing the probability that other devices have the same set of properties, we can identify a device uniquely or almost uniquely.
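The narrowing-down argument can be illustrated with a short calculation. The shares below are invented for the example (they are not measurements), and the sketch assumes the properties are independent of one another, which real properties are not.

```python
from math import prod

# Hypothetical share of devices having each property (assumed, not measured)
shares = {
    "Windows": 0.60,
    "Chrome": 0.55,
    "Greek language": 0.002,
    "this exact font list": 0.01,
    "this exact plug-in": 0.05,
}

def expected_matches(population: int, shares: dict[str, float]) -> float:
    """Expected number of devices matching every property, assuming independence."""
    return population * prod(shares.values())

# Out of a hypothetical billion devices, how many share all five properties?
remaining = expected_matches(1_000_000_000, shares)
```

With these made-up numbers, a billion devices shrink to a few hundred candidates after only five properties, and every further property shrinks the set again.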

This is where the browser (or device) fingerprinting methods come in: they use such properties of each device in order to create a uniquely identifiable set of properties -a fingerprint- for it. The user is tracked more reliably with it, and it can be used by advertisers to serve him personalized ads, or by the authorities to obtain more information about him.

Anyone can see some of the system's characteristics that can be used for fingerprinting on AmIUnique, where a list of the collectable properties is available. The image below, taken from this exceptionally interesting research paper by Yinzhi Cao, Song Li, and Erik Wijmans, lists some of the properties that can be used.

The entropy of a property, in this case, measures how much the property narrows down the set of possible devices. A property with an entropy of H bits distinguishes about 2 to the power of H equally likely values, so roughly one device in 2^H shares any given value -for instance, an entropy of 10 means that about one device in 2^10=1024 has the same value for that property.
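As a rough sketch of how the numbers relate, reading the entropy as a number of bits: a value shared by a fraction p of devices carries -log2(p) bits, and H bits correspond to about one device in 2^H. The function names are my own; the one-in-170 figure is the one this article quotes for screen resolution.

```python
from math import log2

def surprisal_bits(share: float) -> float:
    """Bits of identifying information in a value shared by `share` of devices."""
    return -log2(share)

def one_in(bits: float) -> float:
    """How many devices, on average, you must look at to find one with the same value."""
    return 2 ** bits

bits_for_resolution = surprisal_bits(1 / 170)  # roughly 7.41 bits
group_size = one_in(10)                        # 1024
```

For independent properties the bits simply add up, which is why combining many modest properties can single out one device.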

The terms single-browser and cross-browser refer to a single browser of a system and a set of browsers used on it, respectively. It is useful for the particular paper, but not of much use in this article. The term "stability" needs no explanation.

The use of this table is merely to show the wide range of properties that may be checked by an attacker in order to deduce the fingerprint of a device. A user seeking to remain anonymous ought to keep in mind that he should regularly change a considerable number of those properties in order to achieve this purpose.

Explaining all of those characteristics would far exceed the purposes of this article, so we will examine only some of them -along with methods to prevent their collection.

a) User agent: can be used to identify, among other things, the Operating System and the version of the browser you use. You may check what it can reveal about your system by using whatsmyua. An example of a user agent as it appears in the browser is as follows:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36

indicating that the user has Windows 10 (Windows NT 10.0) installed on his system, along with Chrome/67.
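To illustrate how little effort it takes to read these details, here is a minimal sketch that pulls the platform and the Chrome major version out of the string above with two regular expressions. The function name and the returned keys are my own, and real user-agent parsing has many more cases than this handles.

```python
import re

UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36")

def parse_user_agent(ua: str) -> dict[str, str]:
    """Extract the first parenthesised platform block and the Chrome major version."""
    platform = re.search(r"\(([^)]*)\)", ua)
    chrome = re.search(r"Chrome/(\d+)", ua)
    return {
        "platform": platform.group(1) if platform else "unknown",
        "chrome_major": chrome.group(1) if chrome else "unknown",
    }

info = parse_user_agent(UA)
# info["platform"] holds the operating system details, info["chrome_major"] the browser version
```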

A modification of the user agent would trick the site into believing that the user has a very different system than he really has. So, if someone uses e.g. Firefox on OpenBSD, he can make the websites he visits, just like any potential attacker, treat his browser as e.g. Chrome running on Windows 8. The user agent can, theoretically, be used by someone intending to send a virus or script (recall the attacks against Tor users) -in which case he would send the proper virus or script based on the OS and the browser. Since not all viruses and scripts run on all OSes and browsers, a modification of the user agent can potentially save you from various forms of attacks -theoretically speaking, the attacker would send something that does not work on our system.

The user agent can be modified in Firefox as follows:

After typing about:config in the address bar and hitting enter, search for useragent. Right click and add a String named general.useragent.override, and set the user agent you want to present as its value. For Chrome 60 (most popular browser in 2018) and Windows 10 (most popular OS in 2018), the user agent is as follows:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36

You may find a list of user agents on this webpage.

b) Screen resolution and color depth: the screen resolution and the color depth can be used for the identification of a device. Obviously the combination is not unique to each system, and if we take the entropy of 7.41 at face value, about one out of 170 devices has the same resolution and color depth.

According to this research, zooming in a browser changes the reported value of the screen resolution, so what is taken into account is the ratio of width to height, which remains stable, along with the availHeight, availWidth, availLeft, availTop, and screenOrientation values. With JavaScript disabled, none of them is returned, so the determination of screen resolution and color depth is prevented.

c) List of fonts: It is exceptionally rare for two systems to have the same list of fonts, especially if unusual fonts have been installed on a system. So, it would be prudent to leave the fonts of the system as they were after the installation. An additional and perhaps more effective countermeasure is disabling Flash -since it is Flash that returns the list of fonts.

We ought to keep in mind that, according to this paper by a compatriot of mine named Nikiforakis (which can also be found here), there is a side-channel attack capable of returning the list of fonts even if Flash is disabled. In particular, a script requests that a string be displayed in a particular font, lets the browser render it with whatever fonts are available on the computer, and then compares the sizes of the results to tell whether the requested font is installed. This can be prevented by disabling JavaScript.

d) List of plugins: Something we must be very careful with, especially if we use them extensively. The writer uses only three additional plugins, yet the similarity ratio as shown on AmIUnique is just 0.49%, and it is clear that the more plugins one uses, the less probable it is to find another system with the very same list. Disabling JavaScript should suffice here -along with avoiding installing plugins, perhaps.

e) Device and content language: The language of the OS and the installed writing scripts of a system (e.g. Greek). It would be wise to use English exclusively on your anonymous device. Alternatively, you may trick attackers into believing that you live in another country by installing writing scripts for e.g. Chinese. Or you may just disable Flash and JavaScript.

Besides the language of the OS and the writing scripts, the attacker can use the content language -that is, the main languages of the content we view. The content language of the writer is as follows:

"el,el-GR;q=0.9,en;q=0.8,sv;q=0.7,de;q=0.6,it;q=0.5"

and he shares it with less than 0.1% of all users! It is, therefore, apparent that, at least on the device we use for anonymity, we ought to view our content exclusively in English. The content language is extracted from our HTTP header, which can be modified with a plug-in, but it would be wiser to take security measures at the most fundamental level possible (e.g. a VM or even a whole system that visits only particular websites in particular languages, along with taking all other measures available).
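To show what a website can read from that header, here is a small sketch that splits the content-language string quoted above into language tags and their preference weights (the q values; a tag without one counts as 1.0). The function name is my own, and the parsing is deliberately simplified.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Split a content-language header into (tag, weight) pairs, strongest first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        weight = 1.0  # a tag with no q parameter has the highest preference
        params = params.strip()
        if params.startswith("q="):
            weight = float(params[2:])
        prefs.append((tag.strip(), weight))
    # sorted() is stable, so equal weights keep their original order
    return sorted(prefs, key=lambda p: p[1], reverse=True)

prefs = parse_accept_language("el,el-GR;q=0.9,en;q=0.8,sv;q=0.7,de;q=0.6,it;q=0.5")
languages = [tag for tag, _ in prefs]
```

The resulting ordered list of six languages is exactly the kind of rare combination that makes a visitor stand out, whereas a plain "en" yields a single, very common entry.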

f) Canvas fingerprinting: Something that puts our anonymity in great danger, since the properties extracted with canvas fingerprinting are almost unique, with more than 99% of the users having their own distinct fingerprint.

Canvas is an HTML5 element that allows a web page to draw and display objects using the GPU of our system. The basic idea behind canvas fingerprinting is to have the system draw an object as requested by the website and return the message digest of the result. But since every GPU, together with its drivers and the rest of the graphics stack, renders slightly differently from the others -in the same way that every weapon is slightly different from the others, thus enabling the authorities to identify it using ballistics-, the returned values are, in the majority of cases, different on each system.
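The "message digest" step can be sketched as follows. The pixel data is invented: it stands in for the RGBA bytes that two machines might produce for the same drawing, differing in a single colour channel of a single pixel. SHA-256 is used here as an example digest; the article does not say which one real fingerprinting scripts use.

```python
import hashlib

def canvas_digest(pixels: bytes) -> str:
    """Hash the rendered pixel data into a short, comparable fingerprint."""
    return hashlib.sha256(pixels).hexdigest()

# Hypothetical RGBA output of the same drawing on machine A: 16 orange pixels
machine_a = bytes([255, 102, 0, 255] * 16)

# Machine B renders the same drawing, but one channel of one pixel is off by one level
tweaked = bytearray(machine_a)
tweaked[5] ^= 1
machine_b = bytes(tweaked)

fp_a = canvas_digest(machine_a)
fp_b = canvas_digest(machine_b)
# The two digests differ completely even though the images look identical to the eye.
```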

If other identifiers are taken into account along with the canvas fingerprint, it is highly improbable that two systems with the same values will be found. Fortunately for our anonymity, there are security measures that we can take. Among them (as you may have already guessed) is the use of the Tor Browser, which notifies us if a website uses HTML5 canvas elements, and returns a blank image to them. Another method would be the use of a tool such as Canvas Defender, which modifies the fingerprint. It is recommended that you use it only sparingly, since its very use can be an identifier.

The collection of the majority of the aforementioned properties can be prevented with three basic measures: disabling JavaScript, disabling Flash, and using Tor with Torbutton so that many of the properties are normalized to the most common values. For even greater safety, the use of a VM with the proper settings can help prevent deanonymization, even if the three measures are somehow bypassed.

4. Beware of the (ever)cookies!

-Do you accept the cookies?

A cookie is nothing but an identification number for the visitor of a website. It is fully independent of the IP address, and even if the IP address changes, the website is still able to identify the user through the cookie. Cookies are saved as files on our computer, and are used thereafter by websites and advertisers in order to serve us the most suitable advertisements based on our internet activity.
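A minimal sketch of that independence, using Python's standard `http.cookies` module to read the cookie. The `handle_request` function, the `uid` cookie name, and the in-memory `visitors` table are hypothetical; the IP addresses come from ranges reserved for documentation.

```python
import secrets
from http.cookies import SimpleCookie

# cookie id -> list of IP addresses this visitor has been seen from
visitors: dict[str, list[str]] = {}

def handle_request(ip: str, cookie_header: str = "") -> str:
    """Recognise a returning visitor by cookie, or hand out a fresh identifier."""
    jar = SimpleCookie(cookie_header)
    if "uid" in jar and jar["uid"].value in visitors:
        uid = jar["uid"].value          # known visitor, whatever the IP is
    else:
        uid = secrets.token_hex(8)      # new visitor: random identification number
        visitors[uid] = []
    visitors[uid].append(ip)
    return f"uid={uid}"                 # value the browser will send back next time

first = handle_request("198.51.100.4")
second = handle_request("203.0.113.9", cookie_header=first)
# Same identifier on both visits, although the IP address changed in between.
```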

Theoretically (and most often practically), cookies are deleted from our device upon our request. Even though deleting cookies is no guarantee that the next cookies will not be "connected" to the old ones (e.g. with the use of the e-mail as identifier), the deletion is possible and a matter of seconds.

Yet there is a barely legal form of cookie that resembles a virus -the so-called evercookie. Unlike normal cookies, evercookies are saved not once but multiple times in several parts of the system, making it almost impossible to locate and delete them all. Not only are they tough to locate, but they actively "resist" all attempts to get rid of them, as they copy themselves as soon as you delete one of them, thus restoring the deleted copies.

Here is a list of the mechanisms used to save evercookies, as posted both on Wikipedia and on the personal webpage of their creator.

- Standard HTTP Cookies
- HTTP Strict Transport Security (HSTS) Pinning
- Local Shared Objects (Flash Cookies)
- Silverlight Isolated Storage
- Storing cookies in RGB values of auto-generated, force-cached PNGs using HTML5 Canvas tag to read pixels (cookies) back out
- Storing cookies in Web History
- Storing cookies in HTTP ETags
- Storing cookies in Web cache
- window.name caching
- Internet Explorer userData storage
- HTML5 Session Storage
- HTML5 Local Storage
- HTML5 Global Storage
- HTML5 Database Storage via SQLite
- HTML5 IndexedDB
- Java JNLP PersistenceService
- Java CVE-2013-0422 exploit (applet sandbox escaping)

The evercookie is arguably the worst nightmare of whoever wants to cover his internet tracks, since once it is stored on a system, it is likely to remain there (at least) until the next HDD format. Even if one thinks he got rid of the cookies, the websites tracking him will still know who he is. Yet that is not the worst part: evercookies may also be used to deanonymize Tor users!

Even if during our everyday computer use we have to (and are often forced to) use cookies, we ought to take paranoid security measures when it comes to anonymity networks, and the existence of evercookies is among the reasons. The most secure measure I can come up with is to use a VM when surfing with Tor, so that if someone manages to send us evercookies, they will be stored in the VM instead of our own machine. That's why one must never use the same VM he uses with Tor for "eponymous" surfing. We will elaborate on why in the next chapter.

5. Use Virtual Machines extensively

A Virtual Machine (VM) is, in essence, a computer within a computer. It is very easy to allocate a part of the computational power of our computer to a distinct system -a virtual one-, capable of running even other OSs.

A computer with Mac OS running a VM with Windows 10.

Technically, it is a second computer within the first one.

The main OS of our computer (host OS) runs a program -the virtual machine application- that reserves some of our system's resources (e.g. RAM, CPU cores, etc) so that a Virtual Machine can use them exclusively, like a distinct computer. As you may recall, this is the logic upon which Qubes OS was built -thus becoming one of the safest OSs available.

The pros of a VM include that even if it gets infected by a virus or evercookies, they are trapped in the VM: the attacker has no way of knowing that your machine is a virtual one. And even if he somehow finds out (it depends on your VM's specifications, as we will elaborate), there isn't much he can do to affect the host system. Theoretically at least.

A typical example of a so-called Virtual Machine Escape is what is now known as the VENOM vulnerability. It may not have affected all VM programs and has since been patched, but it remains a fairly convincing argument that there may always be parameters we are not aware of.

A VM Escape is possible if the VM shares some folder with the host OS, as well as through the network adapter, if there is no security on the network level. What can we do to prevent a VM escape? Be very careful with what we download, always use a different OS in the VM than on the host machine, periodically delete our VMs and create others (with different specifications to prevent fingerprinting), and always stay informed about any bugs found.

Keep in mind that it is highly unlikely that an attacker finds out he is attacking a VM -especially if it is not somehow apparent in the fingerprint (e.g. if one uses 3741 MB of RAM instead of 4096 or another power of 2)-, so we can always count on the VM as an effective security measure against viruses, evercookies and fingerprinting.
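
The power-of-two rule mentioned above is easy to check before creating a VM. The Python sketch below is only an illustration of that rule as stated in this article; the function name `is_power_of_two` is our own, and real machines may also ship with other RAM sizes.

```python
def is_power_of_two(megabytes: int) -> bool:
    """Return True if the RAM size in MB is an exact power of two."""
    # A power of two has a single bit set, so n & (n - 1) clears it to zero.
    return megabytes > 0 and (megabytes & (megabytes - 1)) == 0


for ram in (3741, 4096):
    print(ram, is_power_of_two(ram))
# 3741 False
# 4096 True
```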

As you may have guessed, it is prudent to use a secure OS in the host machine, since we cannot be sure that the commercial OSes don't keep track of what we are doing in any VM running within them. Even with a commercial OS that kind of tracking would be exceptionally impractical -yet still we ought to take all measures available.

The use of Tor within a VM with a secure OS is strongly recommended.

6. Be careful with Google, social media, and smartphones!

It is well known that Google not only surveils everything, but also keeps the data for an arbitrarily long time. You can instead use search engines and services that protect your anonymity. A typical example is the DuckDuckGo search engine, which provides results similar to Google's without tracking its users. It is highly recommended that you use it for all searches that may reveal sensitive data (e.g. medical history, sexual orientation, political views, etc), as well as for all searches that could prompt security agencies to keep an eye on you.

It is also needless to say that one ought to be very careful with social media, when it comes to posts and likes as well as messages. It is prudent to avoid any reference to your anonymous activity in chatrooms -and never to use them via the VM you use for anonymous surfing. The same goes for Google's services.

Leaving the most paranoid advice for the end, we remind you that it is possible for corporations to record your real-life conversations through your smartphone's microphone, and that you should make sure not a single smartphone is anywhere around if you have something extreme to hide -e.g. when a group of journalists discusses a forthcoming revelation. That is extremely paranoid, yet wise. Whoever has enough money may use a GSMK Cryptophone as well.

Setting up an anonymity machine

Keeping in mind all of the above, let's see some steps towards an anonymity machine:

1. The machine has to be based on a high quality laptop. The portability of a laptop means that the user will be able to carry the anonymity machine and connect to any public network -e.g. at a cafe. It also means that he won't have a tough time getting rid of it or giving it to a friend if it's possible that the authorities will ring his bell -e.g. in case a journalist obtains top secret data and suspects someone has found out.

2. The machine ought to run a Linux distro. OpenBSD, Qubes OS, and Whonix, previously suggested as the most secure OSes, are very hard to use and have very few programs available, hence they are more suitable for our anonymity VMs. That's not the case for most Linux distros -e.g. Fedora or Debian-, so they can be used as host OSes. It is recommended that you encrypt your hard drive with a strong password. When it comes to the latter, why not use a normal password combined with the verse of a song to make it extra long?

3. Whichever OS you choose should be used exclusively as the host OS. Everything else should be done within VMs, to avoid getting viruses and evercookies and having your system fingerprinted. To exclude that possibility, you should not even open a browser within the host OS, and you should add all the files your VMs are going to need using a USB stick.

4. Internet connections for general use should be made within a different VM from the one(s) you use for the Tor network. The purpose is once again to ensure that viruses, fingerprinting, and evercookies all turn out to be useless to an attacker trying to track your activity. It is recommended that you do not use a single VM for too long, and that you replace the deleted ones with others that have different characteristics, to be safe from fingerprinting.

5. When connecting to anonymity networks, take all the possible security measures previously elaborated.

6. Take all possible measures to avoid browser-fingerprinting even when using distinct VMs. Every single VM ought to have different specifications, so that they will never be treated as the same machine, even if the attackers find a way to bypass your security measures. We also remind you to set your VM up so that the attacker won't be able to figure out that he is attacking a VM -e.g. the available RAM in MB should be a power of two, so that an attacker won't realize he is dealing with a VM because of its 3786 MB of RAM, and won't use VM escape techniques.
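
One way to keep track of which specifications have already been used is sketched below in Python. The candidate values, the function `new_vm_spec`, and the choice of RAM, CPU cores and screen resolution as the varied properties are all assumptions made for the example; the resulting values would still have to be entered by hand in the settings of whichever VM program is used.

```python
import random

# Hypothetical pools of plausible values; RAM sizes are powers of two.
RAM_MB = [2048, 4096, 8192]
CPU_CORES = [1, 2, 4]
SCREENS = ["1366x768", "1600x900", "1920x1080"]


def new_vm_spec(used: set, rng: random.Random) -> tuple:
    """Pick a (ram, cores, screen) combination that no earlier VM has used."""
    candidates = [
        (ram, cores, screen)
        for ram in RAM_MB
        for cores in CPU_CORES
        for screen in SCREENS
        if (ram, cores, screen) not in used
    ]
    if not candidates:
        raise ValueError("every combination has already been used")
    spec = rng.choice(candidates)
    used.add(spec)  # remember it so the next VM gets a different one
    return spec


used_specs = set()
rng = random.Random()
specs = [new_vm_spec(used_specs, rng) for _ in range(5)]
for spec in specs:
    print(spec)
```

Each call returns a combination that differs from all previous ones, so no two VMs created this way share the same set of values.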

7. Always change the MAC address "seen" by the OS of the VM, since attackers may try to uniquely identify you using the MAC instead of the IP, as was the case with the Freedom Hosting take-down. Some ways to change it may be found here and here. It is not clear whether this would lure all attackers into believing that you have a different MAC address from your actual one, but it might, as the attackers would very likely fetch the MAC that the OS "sees" instead of the real one, depending on their methods.
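
As an illustration, the Python sketch below generates a random MAC address that could then be entered in the network settings of the VM. The function name `random_mac` is our own, and the sketch only produces the address; it does not apply it. It marks the address as locally administered and unicast, which is the standard convention for addresses not assigned by a manufacturer. Note, as an assumption worth checking for your own setup, that such an address may itself hint that it was set by hand.

```python
import secrets


def random_mac() -> str:
    """Generate a random, locally administered, unicast MAC address."""
    octets = bytearray(secrets.token_bytes(6))
    # Set bit 1 (locally administered) and clear bit 0 (unicast) of the first octet.
    octets[0] = (octets[0] | 0b00000010) & 0b11111110
    return ":".join(f"{octet:02x}" for octet in octets)


print(random_mac())
```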

8. It is recommended that you use public networks as much as possible -without being identified. That way, even if an attacker obtains the real IP that e.g. a researcher used to reveal something he wasn't supposed to, it will lead to... a cafe, or to any device that does not allow the attacker to trace the activity back to a specific person. Obviously, measures unrelated to computers also have to be taken for this method to be of any use (e.g. not being seen by a camera).

9. We remind you that all web-browsers developed by large corporations are more likely to keep track of everything you are doing. Tor browser is recommended for anonymous surfing, and any open-source browser for your everyday internet activity.

10. It is imperative that you set up the VMs and all of the above before even connecting your anonymity machine to the internet -something easy to achieve if you set up your system without being connected to any network. If possible, buy your laptop with cash so that your name won't appear in the transaction; that way, even if there are system fingerprints we are unaware of, the machine will not be traced back to a particular person. It is also strongly recommended that you periodically shut down your system, to prevent effective cold boot attacks in case of an unexpected visit from the authorities.

Further reading

Tor network and vulnerabilities

[1]https://www.torproject.org/

[2]https://en.wikipedia.org/wiki/Tor_(anonymity_network)

[3]https://thehackernews.com/2017/11/tor-browser-real-ip.html

[4]https://motherboard.vice.com/en_us/article/kb7kza/the-fbi-used-a-non-public-vulnerability-to-hack-suspects-on-tor

[5]https://www.theguardian.com/world/2013/oct/04/tor-attacks-nsa-users-online-anonymity

[6]https://en.wikipedia.org/wiki/Operation_Torpedo

[7]https://wccftech.com/research-discovers-rogue-tor-nodes/

[8]https://threatpost.com/tor-browser-users-urged-to-patch-critical-tormoil-vulnerability/128769/

[9]https://news.softpedia.com/news/tor-users-can-be-tracked-based-on-their-mouse-movements-501602.shtml

Browser fingerprinting