Deleting Files in the Cloud

The public perception is that, when something is deleted, it no longer exists. IT in general prefers the fuzzier idea of the trashcan, where the deleted compromising documents can blow around a digital landfill site for years. The Cloud takes this further, and so the data you serve up to the cloud can be stored out there indefinitely, no matter how hard you try to delete it. Rob Sheldon investigates, and finds the cloud a worryingly public place.

No Nudes Is Good News

When it comes to the cloud, there’s a lot more at stake than just naked celebrity photos. Individuals and organizations alike are turning to the cloud in record numbers to share and store data, much of it sensitive and confidential. There might be medical records and personal profiles and credit card numbers and legal documents and a host of other information.

That’s not to diminish the violation of privacy experienced by the likes of Jennifer Lawrence and Kate Upton, or the harm done to Apple’s reputation (on the eve of its September 9 hoopla, no less), but it does point to a serious concern of the cloud that’s often overlooked. What happens to those files you thought were deleted?

Actress Mary Elizabeth Winstead, one of the celebrities targeted by the drooling hordes, claims she had deleted her hacked photos years ago. And therein lies the problem. Deleting a file in the cloud does not always mean the file is deleted. In fact, you’d be hard pressed to guarantee that anything you send to the cloud ever disappears completely, no matter what procedures you follow to ensure that it does.

Send in the Clouds

For you multi-device mavericks always on the move, being able to stick files up there in the cloud has proven a popular mechanism for making those files available whenever you need them and on whatever device you’re using. The cloud service houses your data on its own servers or on leased servers and facilitates the process of data exchange. From any place and any time, you just reach into the Internet ethers and, voila, your files are there.

The device you use to access those files might store a copy of them locally or simply provide an interface for accessing them from the cloud servers. If files are stored locally, the service provides a mechanism for automatically backing them up to the cloud servers, whether you realize it or not, and keeping that data in sync among all your devices.

For example, a user with a Dropbox account might access files from a desktop, laptop, smartphone, and tablet. The desktop and laptop will each host a local copy of the files. If the user modifies a file on the desktop, Dropbox automatically syncs the file to the cloud servers and from there syncs the file to the laptop. If the file is updated on the laptop, then the syncing is reversed. The Dropbox syncing process is ongoing and almost instantaneous and for the most part invisible to the user. The user should be able to access the updated file from any device in relatively short order. If that device is a smartphone or tablet, the file is not stored locally on the device. The user accesses the file via the Internet directly from the Dropbox servers, which contain the most recent file versions, along with earlier versions.
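The arrangement described above can be sketched in a few lines of Python. This is a toy model rather than Dropbox's actual implementation: the class names (CloudServer, SyncedComputer, MobileClient) and the file name are invented for illustration, and real syncing is triggered automatically rather than by an explicit pull.

```python
class CloudServer:
    """Hypothetical server that keeps every version of every file."""

    def __init__(self):
        self.versions = {}  # file name -> list of contents, oldest first

    def upload(self, name, content):
        # Append rather than overwrite, so earlier versions stay on the server.
        self.versions.setdefault(name, []).append(content)

    def latest(self, name):
        return self.versions[name][-1]


class SyncedComputer:
    """Desktop or laptop: holds a local copy that is kept in sync."""

    def __init__(self, server):
        self.server = server
        self.local = {}  # file name -> content stored on this machine

    def save(self, name, content):
        self.local[name] = content
        self.server.upload(name, content)  # push the change to the cloud

    def pull(self):
        # Fetch the newest version of every file the server knows about.
        for name in self.server.versions:
            self.local[name] = self.server.latest(name)


class MobileClient:
    """Phone or tablet: no local copy, reads straight from the server."""

    def __init__(self, server):
        self.server = server

    def open(self, name):
        return self.server.latest(name)


server = CloudServer()
desktop = SyncedComputer(server)
laptop = SyncedComputer(server)
phone = MobileClient(server)

desktop.save("notes.txt", b"draft 1")  # desktop -> cloud
laptop.pull()                          # cloud -> laptop
laptop.save("notes.txt", b"draft 2")   # laptop -> cloud (syncing reversed)
desktop.pull()                         # cloud -> desktop

print(phone.open("notes.txt"))         # the phone sees the newest version
print(len(server.versions["notes.txt"]), "versions kept on the server")
```

Note that the server in this sketch never throws anything away: both drafts are still sitting there after the sync, which is exactly the behavior that matters once you start deleting things.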

A service provider such as Dropbox has a lot at stake when it comes to delivering reliable services. Users have come to expect their files to be kept in sync and available from all devices at all times. No provider wants to develop a reputation for being unreliable (or unsecure, but that’s a separate issue). To ensure that users can access their files when they need them, the provider builds redundancy into its systems so that if a component goes down, another is in place to take over. This can apply to servers or power supplies or network switches or any number of components. It also applies to the data itself.

Cloud services routinely copy customer data to multiple data centers, often in separate parts of the world, to ensure the data is always available in the quickest way possible and to maintain backup copies should a disaster occur. That means all those “artistic” photos you took in your Las Vegas hotel room could very well have made their way from your smartphone to the nearest North American data center to other North American data centers to centers in Europe and beyond. To complicate matters, your cloud service might actually be turning to another provider for computing resources, and that provider might be contracting with yet another company to provide the actual hardware and data storage. The original provider might not even know where the computing resources are coming from.

Despite this obscure chain of command, there is no shortage of cloud services ready to receive and store your data across geographically dispersed data centers. We have, for example:

  • Document sharing and storage services such as Dropbox and Box
  • Online documentation services such as Google Docs and Office 365
  • Photo and video services such as Flickr and Instagram
  • Cloud backup services such as Backblaze and CrashPlan
  • Social media services such as Twitter and Facebook
  • Web hosting services such as GoDaddy and WordPress
  • Email services such as Gmail and Hotmail

Users are embracing the cloud in droves for good reason. They can get to their data whenever they want, regardless of location, as long as they have Internet access, like magic, really, without worrying about those pesky disks or jump drives or workarounds such as emailing files to themselves.

The flip side to this is that the cloud also translates to a loss of control. Among other issues, it’s not always clear who owns the data out there. The user? The provider? The company housing the servers? And who gets to access the data? Administrators? Engineers? The NSA? Even the holder of the encryption key can be the subject of controversy.

Although these questions have been asked before, and answered, those answers are not always satisfactory, which brings us back to the issue of the deleted file. If a file is deleted and, in theory, gone for good, does cloud storage still pose a risk?

The Naked Truth

For most backup and file sharing services, you can delete files either locally (on the device through which you access the files) or directly on the cloud server, usually through a browser or app. However, when you delete a file, you don’t actually remove it from the cloud server. Instead, the service merely marks the file as deleted, but keeps it around in case you change your mind. That said, most services also support the concept of “permanent deletion,” the idea being that the user can irrevocably expunge the file from the system, rather than only pretending to delete it.

For example, when you delete a file in SugarSync, a cloud file sharing and storage service, the file is moved to the Deleted Files folder, which is similar to the Recycle Bin in Windows or the Trash folder in Mac OS. From the Deleted Files folder, you can either restore the file or permanently delete it. If you choose to permanently delete it, according to SugarSync documentation, you can never restore or recover the data. A permanent deletion is irreversible. A permanent deletion is forever.

You’ll find similar processes in place for other services, such as OneDrive or Google Drive. In some cases, the service will purge deleted files after a specified amount of time. For example, Dropbox and Box will purge those files after 30 days. Regardless of the form the purging takes, most services have a simple mechanism in place to permanently delete your files, with iCloud being the notable exception (although iCloud Drive could change that). What is more difficult to pinpoint with any of these services is what they mean by “permanent.” Just because the user can no longer view or restore a file does not mean that file doesn’t exist.
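To see why "deleted" and "gone" are different things, consider this minimal Python sketch of a soft delete with a purge window. It is an assumption about how such a scheme could work, not any provider's documented design: the FileStore class, its method names, and the file name are hypothetical, and only the 30-day window comes from the text above.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # the 30-day window mentioned above


class FileStore:
    """Hypothetical store that marks files as deleted before purging them."""

    def __init__(self):
        self.files = {}       # name -> content
        self.deleted_at = {}  # name -> time the user "deleted" the file

    def put(self, name, content):
        self.files[name] = content

    def delete(self, name, now):
        # Soft delete: record a marker only; the content is untouched.
        if name not in self.files:
            raise KeyError(name)
        self.deleted_at[name] = now

    def restore(self, name):
        # Changing your mind just removes the marker.
        self.deleted_at.pop(name, None)

    def visible(self):
        # What the user sees: everything without a deletion marker.
        return sorted(n for n in self.files if n not in self.deleted_at)

    def purge_expired(self, now):
        # Remove files whose marker is at least RETENTION old.
        expired = [n for n, t in self.deleted_at.items() if now - t >= RETENTION]
        for name in expired:
            del self.files[name]
            del self.deleted_at[name]
        return expired


store = FileStore()
t0 = datetime(2014, 9, 1)
store.put("photo.jpg", b"...")
store.delete("photo.jpg", t0)

hidden_from_user = store.visible() == []
still_stored = "photo.jpg" in store.files
purged_day_29 = store.purge_expired(t0 + timedelta(days=29))
purged_day_30 = store.purge_expired(t0 + timedelta(days=30))
print(hidden_from_user, still_stored, purged_day_29, purged_day_30)
```

For the whole retention window the file is invisible to the user yet fully intact on the server. Even the purge in this sketch only touches one copy in one place, which says nothing about replicas or backups held elsewhere.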

You might recall from a couple years back the well-publicized story of the Facebook photos that took over three years to disappear from public view. The user had deleted them in 2009, but they were still showing up (via direct link) in 2012. After months of bad press, Facebook eventually removed the files, or at least hid them from view. And that’s the problem. We don’t know what happened to the photos. We also don’t know whether those photos still exist in secondary backups or archived snapshots or on some forgotten data center on the other side of the world.

And it’s not just Facebook in which the delete process is not quite what it seems. Earlier this year, a Flickr user thought she had deleted her account, including all the images associated with that account. Upon completing the deletion process, she confirmed that she was perfectly happy to have all her photos discarded. In fact, that’s what she wanted. After she was done, she could no longer log on to the account, as she would have expected. What she hadn’t expected was that her photos would still be publicly accessible and continue to be so for a while.

Eventually, her account was deleted, but it points to the fact that deleting files and accounts from the public cloud is never what it seems, and just because they disappear from view doesn’t mean they’re gone forever. If you consider how carefully cloud services build redundancy into their systems, you can be sure that eradicating data from existence is no small matter, if possible at all. Even if a user tries to delete a file or account shortly after adding it, the data has probably already been copied to other locations. Once data is sent to the cloud, it is usually replicated to multiple servers and geographic regions. Replication, along with other types of archiving and backup operations, can be complex processes, but necessary to ensure reliability and the ability to recover data in the event of disaster.

A permanent deletion might make a file or account disappear from the user’s view, but there’s no telling what happens to the file behind the scenes. It could be flagged as deleted but still exist on the primary servers and archived to others. It might actually be deleted in some places, but still exist in others. At some point, it might exist only in archived backups. In addition, if any of the data has been offloaded to third-party storage services, the existence of deleted data becomes more uncertain.
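The scenarios in the preceding paragraph can be made concrete with a small Python sketch. It assumes a simplified setup of one primary store, lagging replicas, and periodic backup snapshots; the Cluster class, its methods, and the region names are all hypothetical and do not describe any real provider's architecture.

```python
class Cluster:
    """Hypothetical primary store with lagging replicas and backup snapshots."""

    def __init__(self, replica_names):
        self.primary = {}
        self.replicas = {name: {} for name in replica_names}
        self.backups = []  # frozen copies of the primary, never edited
        self.pending = []  # changes not yet applied to the replicas

    def write(self, key, data):
        self.primary[key] = data
        self.pending.append(("write", key, data))

    def delete(self, key):
        # The "permanent" delete the user sees: only the primary changes now.
        self.primary.pop(key, None)
        self.pending.append(("delete", key, None))

    def replicate(self):
        # Apply queued changes to every replica, in order.
        for op, key, data in self.pending:
            for store in self.replicas.values():
                if op == "write":
                    store[key] = data
                else:
                    store.pop(key, None)
        self.pending.clear()

    def snapshot(self):
        self.backups.append(dict(self.primary))

    def locations(self, key):
        # Every place where the data can still be found.
        found = ["primary"] if key in self.primary else []
        found += [n for n, s in self.replicas.items() if key in s]
        found += [f"backup-{i}" for i, b in enumerate(self.backups) if key in b]
        return found


cluster = Cluster(["europe", "asia"])
cluster.write("photo.jpg", b"...")
cluster.replicate()
cluster.snapshot()

cluster.delete("photo.jpg")
after_delete = cluster.locations("photo.jpg")  # replicas have not caught up
cluster.replicate()
after_sync = cluster.locations("photo.jpg")    # only the backup remains
print(after_delete)
print(after_sync)
```

In this model the delete reaches the replicas eventually, but nothing ever reaches back into the snapshot. Unless someone writes a process that rewrites or expires old backups, the file outlives its own deletion.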

A service provider’s documentation might state that a permanent deletion prevents a user from viewing or restoring a file, but you’d be hard pressed to find any provider that claims the file is immediately and irrevocably destroyed across all systems and obliterated from existence. In all likelihood, your deleted files are out there somewhere. The only questions are where they are and how safe they are.

What Could Possibly Go Wrong?

In this murky world of privacy and file safety, it can be difficult to tell the bad guys from the good guys. Users are seldom given detailed specifics about who can legally gain access to their data, under what circumstances, and whether that access includes deleted files. In the past, many of the larger cloud providers have been fairly cooperative with agencies such as the NSA, although Edward Snowden’s revelations have dampened their spirits somewhat. Even so, law enforcement and security agencies continue to have access to private data.

Only recently, Dropbox reported that they received 268 government requests for user information in the first half of 2014. That might seem a small number compared to their 300 million users, but for those whose data was seized, it’s plenty. The requests included subpoenas, search warrants, and court orders and varied from one to the next in terms of what information was wanted. Some requests came from outside the US. However, not all of them resulted in actual user files being handed over. Fewer than half, in fact. What we don’t know, however, is how much of that content might have been files thought to be permanently deleted. Would Dropbox have given over those files if they had the ability to do so? Would an agency such as the NSA be satisfied with receiving content that did not include deleted files? Details about such requests are few and far between, but with Condoleezza Rice now on the Dropbox board, privacy advocates generally fear the worst.

Requests for data, however, represent only one method of accessing private information. Governments and cybercriminals alike have at their disposal an arsenal of tools aimed at accessing confidential data. Case in point: Elcomsoft Phone Password Breaker (EPPB) is a tool ostensibly available for governments and law enforcement agencies to enable forensic access to password-protected iOS and BlackBerry backup files. All you need is the tool and the file to dig out its hidden treasures. If pulling that file off the cloud is a problem, you can turn to a brute force password hacking tool to get into the cloud account where the file is located. Once you download the backup file, you have access to everything contained inside, including account information to other services, where deleted files might be lurking.

That doesn’t mean getting at deleted files is necessarily easy, especially if the files exist only in some obscure archive in an unknown part of the world, yet hackers and forensic experts are a persistent lot, and with the right tools and right access, nearly anything is possible, even finding those files you thought you’d erased five years earlier. If nothing else, criminals can resort to breaking into data centers and physically stealing the files. We live in a world with few impenetrable barriers.

Users who don’t know quite what they’re doing when interfacing with the cloud can certainly fuel these fires. They might continue to use inadequate passwords or reuse passwords or answer security questions with obvious answers. They might not even realize their devices are being backed up to a cloud, assuming that if they delete those X-rated pictures locally they’re gone for good. Yet even with the most careful users, it’s possible that a hacker will find the electronic backdoor that provides access to confidential data, including all those supposedly deleted files.

Another area of concern is the proliferation of APIs for integrating apps and services with each other. Not all APIs are created equal, nor are the apps that support them. They can include flaws related to a number of issues, including security. Poorly designed apps and APIs can bring with them faulty data storage, weak server-side controls, and insufficient transport layer protection, among other concerns. When you start connecting services using APIs, you increase a system’s attack surface area and consequently increase the risks of revealing information that can lead hackers down a path to your deleted data. In such cases, however, users are likely to be much more concerned with immediate issues than they are with deleted files, unless those files happen to contain government secrets.

Organizations contending with data governance issues have to be particularly careful when it comes to deleting files. Regional laws, industry standards, and in-house policies can mandate how long files should be retained as well as when they should be deleted. And these restrictions are constantly changing. Cloud services have gotten good at the retention part, but the deletion part is still a bit fuzzy. An organization that stores data in the cloud must be able to ensure that the data can also be effectively destroyed. Without that guarantee, an organization can easily find itself out of compliance.

Other concerns are looming out there as well. Imagine if the service provider storing your files goes belly up. What happens to your data? How do you get it back? Who has it? Who can access it? Even if you were lucky enough to delete all those humiliating Halloween pictures before the service went defunct, chances are, they’re still out there somewhere, sitting idly in one of the backups or snapshots or now defunct replicated servers.

The past has an uncanny way of coming back to haunt us. Remember MySpace? Despite the assumption by many that this was a service long dead (and all those photos long gone), MySpace appears to be experiencing a rebirth. The company has started sending to their old users the photos they had posted 10 years ago, to “re-engage them through a personalized experience,” interpreted by many bloggers as being more likely in hopes of embarrassing them back into submission. If this is the case, MySpace has plenty of ammunition: over 15 billion images. That’s a lot of data and a lot of potentially red-faced users.

The Cloud Conundrum

The cloud is a terrific and terrifying beast. It can enhance flexibility and efficiency. It can facilitate communication and collaboration. It can lead to humiliation and exposure in all sorts of ways. For that reason, careful users take great care in protecting themselves. They use strong passwords, implement two-factor authentication, encrypt their files, and take other precautions. However, they can’t control what service providers do with their data, especially the deleted stuff.

Providers, of course, want to stay in business, and that means reliable and secure reputations, built on trust and a maze of redundant systems that can span the globe. They want their users to be satisfied and happy. Even so, purging files from their systems can seem a low priority when compared to the task of keeping those systems running and that data flowing and secure.

Any data you serve up to the cloud can be stored out there indefinitely, no matter how hard you try to delete it, so give careful consideration before sending sensitive information into the void, whether it includes your company’s trade secrets or the photos you took on your Jamaican holiday to Hedonism II. Even encrypted data can be compromised. Yes, the cloud can be an amazing tool, but know that anything you put out there can come back to haunt you for a long time to come. Just ask Jennifer Lawrence.