Wednesday, 9 January 2019

How to effectively protect sensitive data. Even from the CIA.

Update 19/2/2019: I apologize for not warning you in the introduction earlier. This article assumes that you do not use commercial OSs (e.g. Windows) that track your behavior and activity and even sell information about them to third parties. That was mentioned in the sixth tip.

It is also assumed -though not explicitly mentioned- that you use full-drive encryption and/or work in encrypted Virtual Machines, or work with live OSs that run from RAM instead of writing to a hard disk (e.g. Tails), thereby preventing attackers from recovering digital footprints of the data you have worked with (e.g. images cached in thumbs.db, if you have been unwise enough not to deactivate thumbnails). The digital footprints that can put your secrets in danger will be described in a forthcoming article.

------------------------------------------------

Introduction 
The tools used by states to protect their secrets are double-edged swords: they allow dissidents -and, unfortunately, common criminals- to protect theirs, just as they enable you to do so.

The common perception that security agencies and corporations can access all of your data is a myth. That may be the case for someone who uses commercial OSs that track everything one does, but not for someone who is deeper into computer security. There are certain methods to hide your sensitive data from literally everyone.

I boldly and proudly proclaim that if you carefully use the methods to be described, nobody will ever access your secrets, unless quantum-computing methods are developed in the future (a scenario to be described in the final section of the article).

Who would want to protect data even from the agencies of their own country or from corporations without being a common criminal?
  • Journalists about to reveal something they are not supposed to.
  • Politicians hiding data disastrous for governing opponents.
  • Dissidents of oppressive regimes and dictatorships. 
  • Atheists of radical Islamic countries.
  • Scientists wanting to protect their ideas and/or inventions.
  • Technology researchers wanting to protect their work from competitors.

1. Don't just encrypt them.
There is no guarantee that the algorithms of an encryption tool are properly implemented in code. Even if algorithms such as Rijndael and Serpent are secure and powerful from a mathematical point of view, how can you be sure that their implementation in code is not buggy, thus allowing attackers to execute attacks similar to the simple yet ingenious padding oracle attack?

Furthermore, who can assure you that the software company that developed the encryption tool has not deliberately left backdoors that enable the authorities to bypass the encryption? Or even that it does not send the passwords you type to an agency? If you are, e.g., a journalist intending to publish something you are not supposed to, you might want to be sure that the related data would be secure even if the police were informed about your plan.

A possible way to prevent such scenarios is to encrypt your sensitive data multiple times, with different encryption algorithms in different encryption tools. The odds that two distinct tools developed by different companies are both flawed are negligible. An easy solution would be to encrypt your data with WinRAR first, thus securing them with AES and a 256-bit key, and then re-encrypt the .rar using e.g. Serpent or Kuznyechik with another tool (e.g. VeraCrypt).


VeraCrypt provides an exceptional variety of encryption algorithms even for the most paranoid. But you can always add some more layers of encryption.

When it comes to .rar encryption, even though WinRAR uses plain AES, the format does not allow for a quick brute-force attack, since many computations must be carried out before each candidate password can be checked. Tom's Hardware carried out an experiment in 2011, and no more than 15,000 passwords per second could be tried against a .rar even with the use of GPUs. In comparison, about 500,000 passwords per second could be tried against an encrypted .zip. Even though computational power has more than tripled since then, a brute-force attack on RAR5 files would still be extremely slow, since RAR5 derives the key with an iterated function (PBKDF2 with HMAC-SHA256) that takes considerable time to compute for every guess.
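
To put the two rates quoted above in perspective, the sketch below estimates how long an exhaustive search would take at each of them. The password policy (8 characters drawn from a-z and 0-9) is an assumption chosen purely for illustration; it is not a figure from the Tom's Hardware experiment.

```python
# Rough worst-case brute-force time at the guess rates quoted above.
# The password policy (8 chars, a-z and 0-9) is a hypothetical example.

def seconds_to_exhaust(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Time needed to try every password of exactly `length` characters."""
    return alphabet_size ** length / guesses_per_second

RAR_RATE = 15_000    # guesses per second reported for .rar
ZIP_RATE = 500_000   # guesses per second reported for .zip

for name, rate in (("rar", RAR_RATE), ("zip", ZIP_RATE)):
    days = seconds_to_exhaust(36, 8, rate) / 86_400
    print(f"{name}: about {days:,.0f} days to try every candidate")
```

Every extra character multiplies these times by the alphabet size, which is why a long password matters more than the exact guess rate.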

Update 19/2/2019: Keep in mind that RAR keeps unencrypted copies of your data in temporary files, which means that it is secure if and only if you open the archives on encrypted drives or VMs, or in OSs such as Tails that use RAM exclusively -thus not writing anything to your disks. See the 19/2/2019 update above the introduction.

When it comes to serious encryption tools, such as VeraCrypt or TrueCrypt (the former being the successor of the latter), it is imperative that you use an algorithm other than AES for the second layer. You may prefer Serpent, which has a larger security margin than Rijndael but was too slow to be chosen as the AES. Twofish and Camellia are also good alternatives. If you belong to the conspiracists who think that algorithms developed by westerners are deliberately flawed, you can trust the Russians and use Kuznyechik instead (even though it is not that powerful).

There is always the possibility that some OSs or even encryption tools keep track of your passwords and perhaps even send them to some authority -no matter how paranoid it sounds. It is highly unlikely, yet not impossible. Other paranoid measures you may take to prevent it, besides the aforementioned use of several different tools, are to avoid commercial OSs and to work without being connected to the internet.

The only problem with the multiple encryption method is that it creates a "matryoshka" of encrypted files which is technically unbreakable even with all the "futuristic" quantum-computing methods known as of 2019. That means that if you forget a password or the data gets corrupted, your files are lost forever.


2. Use a message digest instead of a plain password.

This is more of a smart way to create passwords for the encryption than a protection method.

We are all fully aware that the use of passwords is nowadays widely considered ineffective and insecure, and many corporations actively try to replace them with bioidentifiers (e.g. fingerprints or a photo of the user's face). The reason is not only that the agencies need a database of our bioidentifiers without us being aware of it, but also that passwords really are regarded as a weak point. Yet there is a minor detail we have to take into account: it is the way passwords are used that is insecure, not passwords per se.

The reason is that many users choose easy-to-guess and easy-to-remember passwords, which allows attackers to carry out successful brute-force attacks without having to wait a long time. Since computational power is greater than ever, allowing brute-forcing software to try literally billions of passwords per second, it is easier than ever for an attacker to recover e.g. a 7zip password. Yet there is a solution if you want a strong password that is tough to guess and impossible to memorize, yet easy to regenerate if you remember how you produced it: message digests!

Message digests are the results of hash functions such as SHA1, MD5, or BLAKE2. The passwords you use on a website are normally stored as message digests, and whenever you try to log in, the message digest of the password you typed is computed and compared to the one saved in the DB. It is computationally infeasible to recover the initial passwords from strong message digests other than by guessing, since hash functions are constructed deliberately to prevent reversing the results.

By using hash functions as password generators, you can turn simple passwords into ones that are practically impossible to brute-force. It is recommended that you use an offline hashing tool, such as HashCalc (contact me if it is no longer available), so that you won't transfer your passwords online. The password "123456" gives the message digests below:
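
The store-and-compare login check described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are made up for this example, and the use of a random salt with PBKDF2 (100,000 iterations) reflects common practice rather than what any particular website does.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 100_000  # assumed work factor, chosen for illustration only

def make_record(password: str) -> Tuple[bytes, bytes]:
    """What a site would store in its DB: a random salt and the derived digest."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def check_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Recompute the digest of the typed password and compare it to the stored one."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)
```

The site never needs the password itself after registration: only the salt and the digest are kept, and a login succeeds when the recomputed digest matches.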

Why use a password when you can use its message digest, or a message digest of one of its message digests?

Imagine using the SHA-512 digest as a password! Pretty hard to break with brute force, isn't it? Even for an easy password such as "123456" there are 14 different possible passwords from the well-known message digests alone, and the attacker has no way of knowing which one you used, or even whether you used a message digest as a password at all.
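
For readers without a hashing tool at hand, a few of these digests can also be computed offline with Python's standard hashlib module. This is a minimal sketch; the helper name digest_of is made up for this example, and only four of the available hash functions are shown.

```python
import hashlib

def digest_of(text: str, algorithm: str) -> str:
    """Return the hex message digest of `text` under the named hash function."""
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()

# Print a few digests of the weak password used as an example in this section.
for algorithm in ("md5", "sha1", "sha256", "sha512"):
    print(f"{algorithm:>6}: {digest_of('123456', algorithm)}")
```

The SHA-512 line is 128 hexadecimal characters long, which is the kind of password this section has in mind.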

  • Tips: You can use even harder passwords to generate message digests, and may feed the input as a HEX string instead of a text string (even though that prevents you from including characters other than 0-9 and A-F in your password). Another method would be to use one of the message digests to generate other message digests (e.g. taking the SHA-256 digest and using its MD5 digest as the password). You may also add some symbols at the end, for extra strength.
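
The chaining tip above can be sketched like this. The function name chained_password and the example suffix are hypothetical; the sketch assumes each intermediate digest is fed to the next hash function as lowercase hex text.

```python
import hashlib
from typing import Sequence

def chained_password(seed: str, algorithms: Sequence[str], suffix: str = "") -> str:
    """Hash `seed` with each algorithm in turn, feeding every hex digest
    (as lowercase text) into the next hash, then append optional symbols."""
    value = seed
    for algorithm in algorithms:
        value = hashlib.new(algorithm, value.encode("utf-8")).hexdigest()
    return value + suffix

# SHA-256 of the seed, then MD5 of that digest, plus two extra symbols.
print(chained_password("123456", ["sha256", "md5"], suffix="!?"))
```

Note that the result changes if the intermediate digest is fed in uppercase instead of lowercase, so remember exactly which tool and settings you used, or the password cannot be regenerated.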


3. Hide them in encrypted Virtual Machines and change the VM's keys.

As I was messing with my VirtualBox files one day, I noticed that there was this parameter for my encrypted VM drives:

Change the KeyStore parameter, and lure the attackers into believing that your encrypted VM is damaged.

This property, named "CRYPT/KeyStore", can be found in the Virtual Machine Definition file, which is stored in the same folder as the virtual hard disk. It is nothing but the key used to check the validity of the password. If you try messing with that property (always make sure that all VirtualBox-related processes have terminated, even those running in the background, else the initial key will mysteriously pop up again) you will realize that it has a well-defined structure, allowing someone to edit only particular parts of it without making VirtualBox mark the whole drive as "inaccessible".

That key is generated once you choose a password to encrypt a VM, and the passwords you enter to decrypt it thereafter produce a key that is checked against the one appearing in that string. If they match, the system proceeds to decrypt the drive using your password. But what if you are a Turkish hacker suspected of stealing super top secret files from Erdogan using a virtual machine? It wouldn't be nice if the Millî İstihbarat Teşkilatı got into your VM with a brute-force or a cold boot attack and saw what you did, would it?

The solution is to edit the key above so that even if they somehow found your password they would never be able to decrypt the drive; the password's digest wouldn't match the string and the system would not proceed to the decryption. You will have to re-edit the file before and after opening the VM, but at least you'll be safe.


  • Method to lure the attackers into believing that the VM is useless: change the key so that it matches a password such as "123456". As a result, the password appears correct and the Virtual Drive appears damaged beyond repair -the data is decrypted with a wrong password, thus resulting in unreadable gibberish. Give it a try but be careful -I crashed several systems with such "experiments" as a youngster.

The string matching the password "123456" appears below:

          <Property name="CRYPT/KeyStore" value="U0NORQABQUVTLVhUUzI1Ni1QTEFJTjY0AAAAAAAAAAAAAAAAAABQQktERjItU0hB&#13;&#10;MjU2AAAAAAAAAAAAAAAAAAAAAAAAAEAAAADp5hcZ/RTjQcIScDhrfS1BFeiar2va&#13;&#10;jQWkl0b+b6y8oyAAAABQ8ToJ8c3Omh3p555zN26b02znT4k+akUfjN7A7vxxLyBO&#13;&#10;AACeV7TMA9AzllKd94T8EQQyIpTNLuztie5hOgN1WDp3qOAiAgBAAAAAjFNc1+WL&#13;&#10;kGEMx+IquRdDPJ4bJ/umShtj/7q+46NayOfkIZlTtjLol+vM3Z/J10V8trHIL8Mb&#13;&#10;mI2+wUYhsREupQ=="/>
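
The well-defined structure of that value becomes visible once its beginning is Base64-decoded. The sketch below decodes only the first 68 characters of the string shown above and lists the readable text fields it finds; reading those fields as a cipher name and a key-derivation name is my interpretation of the decoded text, and the rest of the binary layout is not covered here.

```python
import base64
import re
from typing import List

# First 68 Base64 characters of the CRYPT/KeyStore value shown above.
# The run of 18 "A" characters is written as a repetition to avoid miscounting.
KEYSTORE_PREFIX = (
    "U0NORQABQUVTLVhUUzI1Ni1QTEFJTjY0" + "A" * 18 + "BQQktERjItU0hB" + "MjU2"
)

def printable_fields(b64_text: str) -> List[str]:
    """Decode Base64 and return every run of 4+ printable ASCII characters."""
    raw = base64.b64decode(b64_text)
    return [match.decode("ascii") for match in re.findall(rb"[ -~]{4,}", raw)]

print(printable_fields(KEYSTORE_PREFIX))
```

The remaining part of the value is binary data, which is why only particular parts of the string can be edited without VirtualBox rejecting it outright.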


4. Never store them on multiple devices.

This does not need much elaboration, I guess. Storing a file on multiple devices can make it extremely difficult to get rid of if needed. One backup device containing them is enough -what are the odds of it getting lost or damaged on the same day as the original?

Yet since it is always possible that errors will occur while copying the files, thus rendering them useless because of the encryption, it would be prudent to keep at least two distinct copies of them on your backup device -as well as on your main HDD.
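
One way to catch such copy errors early is to compare checksums of the original and the copy right after copying. This is a minimal sketch using only the standard library; the function names are made up for this example, and the operating system's cache may hide some read errors, so treat a match as a good sign rather than a guarantee.

```python
import hashlib
import shutil
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large archives fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

def copy_and_verify(source: Path, destination: Path) -> bool:
    """Copy the file, then report whether both copies have the same digest."""
    shutil.copyfile(source, destination)
    return sha256_of(source) == sha256_of(destination)

# Example (hypothetical paths):
# copy_and_verify(Path("secrets.rar"), Path("/mnt/backup/secrets.rar"))
```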


5. Always wipe the RAM after opening or editing them.

All of your passwords and even the files you opened are stored in your RAM, which could allow an attacker to recover them despite the encryption. The most effective way to wipe your RAM is to completely shut down your computer for at least five minutes: it has been shown that RAM can retain data for thirty seconds to two minutes after shutdown, thus allowing agencies to execute a cold boot attack and recover it.

Another method to wipe your RAM would be to launch a fork bomb on your own system until all of your RAM has been overwritten and exhausted, but that would result in a crash.


6. Shred the initial unencrypted files.

OK, this may sound trite, but it should always be mentioned. It is well known that when a user deletes a file the data on the disk stays as it was. It is not the data itself that gets deleted -it is the space in which the data is saved that is marked as free. Until other data is written over it, it can be recovered. That means you should always shred the files -that is, actually overwrite their data- before the space is marked as "free".

If you use Linux, the "shred" command is very useful for that purpose -unless you use an SSD. If you use Windows... switch to Linux, because everything you are doing is tracked.

  • Tips: Change the files' names before deleting them, as the names can serve as evidence that the files were on your system. And clear your command history so that no one will find out what you've done.
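
The idea behind shredding (overwrite first, rename, then delete) can be sketched in a few lines. This is an illustration only, not a replacement for the shred command: the function name and the three passes are assumptions, and on SSDs or on journaling and copy-on-write filesystems old copies of the data may survive regardless.

```python
import os
import secrets
from pathlib import Path

def overwrite_and_delete(path: Path, passes: int = 3, chunk: int = 1 << 20) -> None:
    """Overwrite a file in place with random bytes, rename it, then delete it."""
    size = path.stat().st_size
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            remaining = size
            while remaining > 0:
                block = secrets.token_bytes(min(chunk, remaining))
                handle.write(block)
                remaining -= len(block)
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to push the data to the disk
    # Give the file a meaningless name before unlinking it, as the tip suggests.
    anonymous = path.with_name(secrets.token_hex(8))
    path.rename(anonymous)
    anonymous.unlink()
```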

7. Avoid SSDs.

SSDs are based on flash memory. Unlike an HDD, a flash drive can erase and rewrite data at the same physical address only a limited number of times before that part wears out: repeatedly rewriting the same part of the drive considerably diminishes its life-span, and worn-out parts can no longer store data reliably.

Corporations selling SSDs have resorted to a technique called wear-leveling, which prevents the OS from knowing the actual physical address of the data and returns a logical one instead. Once the user demands that the data at an address be rewritten, the SSD's controller writes the new data to a different physical location, marks the old location as free, and edits the address table to make the logical address point to the new physical one. The old data stays on the chip until the controller eventually erases it.

That means that to actually get rid of sensitive files on an SSD you may have to overwrite the whole drive -and no one can call that an effective way. If you want to be sure you have gotten rid of unwanted data, always use an HDD.
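
A toy model makes the problem easier to see. The class below is a deliberately simplified sketch of a wear-leveling address table (no garbage collection, no real controller behaviour); all names in it are made up for this example.

```python
class ToyFlashDrive:
    """Toy model of wear-leveling: logical addresses map to changing physical pages."""

    def __init__(self, physical_pages: int) -> None:
        self.pages = [None] * physical_pages  # raw contents of the flash chip
        self.mapping = {}                     # logical address -> physical page
        self.next_free = 0

    def write(self, logical_address: int, data: bytes) -> None:
        if self.next_free >= len(self.pages):
            raise RuntimeError("toy drive is full (garbage collection is not modelled)")
        self.pages[self.next_free] = data     # new data goes to a fresh page
        self.mapping[logical_address] = self.next_free
        self.next_free += 1

    def read(self, logical_address: int) -> bytes:
        return self.pages[self.mapping[logical_address]]

drive = ToyFlashDrive(physical_pages=4)
drive.write(0, b"secret")
drive.write(0, b"\x00" * 6)       # the OS "overwrites" logical address 0
print(drive.read(0))              # the OS now sees only zeros
print(b"secret" in drive.pages)   # yet the old page is still physically present
```

From the OS's point of view the overwrite succeeded, which is exactly why a file shredder cannot be trusted on such a drive.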


8. Beware of the quantum computers!

So far, the brilliant and powerful Grover's algorithm, which is to be run on quantum computers, threatens to make 128-bit keys as breakable as 64-bit ones. Always use encryption with keys of at least 256 bits: even when halved by Grover's algorithm, they retain an effective strength of 128 bits, which is not easy to break and will not be in the foreseeable future. A cipher supporting 512-bit keys, such as RC5 or Kalyna, might be more secure against Grover's algorithm.

Yet it is possible that other similar methods will be developed, and quantum computing threatens to render useless all -or at least most- of our current cryptography. Still, we have good reasons to hope that more powerful cryptographic methods, such as lattice-based cryptography, will prevent even quantum computers from accessing our data.

Always stay well-informed about new bugs in encryption tools and in software in general, as well as about newly developed quantum decryption techniques. And add a layer of lattice-based cryptography to your files as soon as it is available!
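
The halving effect can be written down as simple arithmetic: a Grover search over 2^n keys needs roughly (pi/4) * 2^(n/2) iterations. The sketch below assumes an idealized quantum computer and ignores all practical overheads; the function names are made up for this example.

```python
import math

def grover_effective_bits(key_bits: int) -> float:
    """Effective security level against an ideal Grover search: half the key length."""
    return key_bits / 2

def grover_iterations_log2(key_bits: int) -> float:
    """log2 of the approximate iteration count, (pi / 4) * 2 ** (key_bits / 2)."""
    return key_bits / 2 + math.log2(math.pi / 4)

for bits in (128, 256, 512):
    print(bits, grover_effective_bits(bits), round(grover_iterations_log2(bits), 2))
```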

-------------------------------------------------
©George Malandrakis
All rights reserved