How I want my Data: Locality & Cloud Aware

Boy, do I wish I had written this down first. The op-ed talks about 'syncing,' or more appropriately, about getting your data onto any of your devices (smartphones, laptops, desktops, DVRs, etc.) whenever you want, without any hassle.

This is something that I (and I'm sure many others) have been struggling with for a while. I own a boatload of devices (several computers, portable devices, and such), all of which I want to stay in sync. I've got a theoretical solution to the problem, which I'll detail below.

Jon Stokes's article points out several obvious problems: there is currently no easy way to sync your data, the stop-gap solutions that exist now are poorly done and anything but set-and-forget, and computers should be very good at performing a simple, repetitive task like syncing files. However, Stokes forgets to tackle one important issue. Yes, I want my data to be synced across all my devices. It's horribly inconvenient to have to deal with multiple versions of files, and it's a tragedy when I realize that the file I need is not on the machine I'm working on right now. Yet the data that I want on each device is not identical.

The bigger issue

For instance, I've got a rather large iTunes library. Last I checked, it's over 115GB. It lives happily on my desktop, which has more than enough storage to handle that amount of frivolous data. The obvious problem is that I do not want all that music on my laptop, which only has an 80GB hard drive, and it certainly won't all fit on my iPod Nano.

Or take my photo library. It's as large as my music library and then some. I do want some of it on my laptop: current work, my portfolio, the images I have up on my site. But some random shoot from 3 years ago? There's no need to carry that with me (at least until SSDs reach multi-terabyte capacities).


The problem is not that I need a solution to have all my data all the time. If I did or could, the solution would be fairly easy: the technology to sync two equally sized devices together already exists. No, the problem is that I want certain data to go to certain devices and not to others. My solution, and my suggestion for those who can make this happen, is this:

  1. Time Machine by Apple is a ready-made solution for syncing two devices. It already syncs one hard drive to another and keeps versions as it goes. It can even do it across a network. All that needs to be done is to transition it from a backup utility to a sync utility. I'm no engineer, but I don't imagine it would be a very hard task to accomplish, especially if you consider the underlying technology behind Time Machine.
  2. As Stokes suggests, the 'drive' paradigm is very cumbersome once you get more than a couple of hard drives floating around. Changing this to the 'cloud' paradigm would be fantastic.

Think about it like this: Google has tons (literally) of hard drives in its servers, which provide the space for all the services it offers its millions of users. But as a user, you have no idea where your data is physically stored. Your Gmail messages are somewhere, but you don't know where, nor do you care.

Introducing a similar solution at the personal level is a logical evolution. No, it's not really applicable to the 'average user' who only has a laptop and perhaps one external hard drive, but for small businesses or power users it could be a godsend.

Again, I'm no engineer, but I imagine it working something like this:

Your home has a network set up, or at least one computer that is 'cloud-aware.' When you plug a drive into the system for the first time, it will ask you if you'd like to make this disk a part of the cloud, akin to the way Time Machine asks if you'd like to make a new disk a Time Machine backup drive. This is to ensure that portable storage (like a hard drive that's meant to travel with your laptop, or a flash card) won't get data stored on it that belongs in the cloud.

Once configured, the cloud would behave much like a RAID 5 array. It handles all of your backup for you and presents itself as one drive. If you unplug a drive from the system, or one dies, the cloud automatically compensates, all without you worrying about it.
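To make the RAID 5 comparison a little more concrete, here is a minimal sketch of the parity trick that lets such an array survive the loss of one drive. It is only a toy: the function names are made up for illustration, each "drive" is a short byte string, and a real RAID 5 array stripes data and rotates parity across the drives rather than XOR-ing whole blocks like this.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """The parity block is simply the XOR of every data block."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity):
    """XOR-ing the survivors with the parity block yields the lost block."""
    return xor_blocks(surviving_blocks + [parity])


# Three pretend drives, each holding one 4-byte block.
drives = [b"musi", b"phot", b"docs"]
parity = make_parity(drives)

# The second drive dies; rebuild its contents from the other two plus parity.
rebuilt = rebuild_missing([drives[0], drives[2]], parity)
print(rebuilt)  # b'phot'
```

The point of the sketch is the "automatically compensates" part: as long as only one drive is missing, its contents can be recomputed from what is left.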

Furthermore, the cloud is accessible to all of your devices. It's online, so if you have a password, you can get your data anywhere. Think Back to My Mac, but simpler. If your device is connected to the Internet, it automatically connects to your cloud.

Not into dealing with a mess of hard drives? Perhaps third-party companies can offer cloud storage for a price. Amazon and Google (among others) are already perfectly positioned to do this. Apple's iDisk already offers an omnipresent storage space and comes close to this paradigm.

Cloud storage can help to solve many of the syncing problems: your data is always accessible, no matter where you are.

  3. There is one glaring disadvantage, though: what if you're offline? Or what if the files are huge and would take too long (even over broadband) to access remotely? Here's where my solution gets a new twist. I propose a new kind of metadata that I'll call 'Locality.'
[Image: what I imagine the interface looking like.]

Every file has metadata attached to it. It's how the computer knows what date it was created, who last opened it, etc. What I'm proposing is an addition that keeps track of which devices a file is supposed to be stored on. All of your devices will know about all of your other devices. Your smartphone, iPod, laptop, DVR, desktop, and so on will all know that the others exist. That way, they will be able to keep track of what data is supposed to be on each device automatically.
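As a rough illustration of what 'Locality' metadata could look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `FileRecord` class, the `files_for` helper, and the device names are hypothetical, and a real system would store this alongside the file's other metadata rather than in a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """A file plus the hypothetical 'locality' metadata proposed here."""
    name: str
    size_gb: float
    # Names of the devices that should hold a copy; defaults to the cloud only.
    locality: set = field(default_factory=lambda: {"cloud"})


def files_for(device, records):
    """List the names of the files that belong on the given device."""
    return [r.name for r in records if device in r.locality]


library = [
    FileRecord("portfolio.jpg", 0.02, {"cloud", "laptop"}),
    FileRecord("old-shoot.raw", 0.03),  # cloud only
    FileRecord("song.mp3", 0.005, {"cloud", "desktop", "ipod"}),
]

print(files_for("laptop", library))  # ['portfolio.jpg']
print(files_for("ipod", library))    # ['song.mp3']
```

Because every record names its devices, any device that can see the list can work out which files it is supposed to hold.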

The 'Save' dialog in every OS functions in basically the same way. It asks you what you want to name the file, where you want to put it, and the file type. A system that is 'locality-aware' would ask you one more thing: which devices you want your data on. It could, for instance, default to storing everything on your cloud. If you're offline, then it stores the file locally until you can connect. However, the user can also decide which other devices get the data stored locally. In other words, you can decide right in the save dialog where you always want to be able to access your data.
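The save-time decision described above can be sketched in a few lines of Python. This is one possible reading, not a specification: `plan_save` and its parameters are hypothetical names, and the rule that an offline device keeps a temporary holding copy is my interpretation of "stores the file locally until you can connect."

```python
def plan_save(this_device, chosen=(), online=True):
    """Decide where a new file is written now and where it is sent later.

    Returns (write_now, send_later) as two sets of device names.
    """
    # The cloud is always part of the locality; the user may add devices.
    locality = {"cloud"} | set(chosen)
    # Only this device, plus the cloud when online, can be written right away.
    reachable = {this_device} | ({"cloud"} if online else set())
    write_now = locality & reachable
    if not write_now:
        # Offline and not a chosen device: keep a temporary holding copy here.
        write_now = {this_device}
    send_later = locality - write_now
    return write_now, send_later


now, later = plan_save("laptop", chosen=["smartphone"], online=True)
print(sorted(now), sorted(later))  # ['cloud'] ['smartphone']

now, later = plan_save("laptop", chosen=["smartphone"], online=False)
print(sorted(now), sorted(later))  # ['laptop'] ['cloud', 'smartphone']
```

In the offline case the laptop's copy is only a stand-in; once a connection comes back, the file still owes copies to the cloud and the smartphone.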

For example, if you're saving a new Word (or Pages) document, you can set it to save on your cloud and also go to your smartphone. The next time your computer comes into contact with either your smartphone or your cloud, it sends that file off. That way, if your smartphone contacts your cloud (more likely than it contacting your laptop), it gets the file right away.
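Here is a small Python sketch of that hand-off, under some loud assumptions: the `Device` class, its `outbox`, and the `contact` function are invented for illustration, and "contact" is reduced to a function call instead of real networking. The idea it tries to capture is that the cloud can relay a file to the smartphone without the laptop being involved.

```python
class Device:
    """A hypothetical locality-aware device."""

    def __init__(self, name):
        self.name = name
        self.stored = set()  # names of files held on this device
        self.outbox = {}     # file name -> devices still waiting for a copy

    def save(self, filename, locality):
        """Keep a copy here and note every other device that needs one."""
        self.stored.add(filename)
        self.outbox[filename] = set(locality) - {self.name}


def contact(a, b):
    """When two devices meet, each hands over the files the other is owed."""
    for sender, receiver in ((a, b), (b, a)):
        for filename, waiting in list(sender.outbox.items()):
            if receiver.name not in waiting:
                continue
            receiver.stored.add(filename)
            waiting.discard(receiver.name)
            if waiting:
                # The receiver can now relay the file to whoever is left.
                receiver.outbox.setdefault(filename, set()).update(waiting)
            else:
                del sender.outbox[filename]


laptop, cloud, phone = Device("laptop"), Device("cloud"), Device("smartphone")
laptop.save("report.doc", {"cloud", "smartphone"})

contact(laptop, cloud)  # the laptop reaches the cloud first
contact(phone, cloud)   # later, the phone checks in with the cloud

print("report.doc" in phone.stored)  # True
```

In this toy version the laptop still lists the smartphone as waiting after the cloud has delivered the file, so a later laptop-to-phone contact would send it again. That is harmless here, but a real system would want the devices to share delivery receipts.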

The advantage of this system is that you can decide to keep a huge library of pictures safely on your cloud, where it can be accessed by any of your devices but isn't stored on them, so space isn't an issue. If you choose, however, you can use any of your locality-aware devices to set the locality of a file however you like. You can set some songs to go to your desktop and your iPod, but not to your smartphone.

Apple is most likely going to rebrand .Mac, probably calling it "mobile me." In theory, it will tie the iPhone more closely to Mac computers via a cloud interface. Who knows, maybe Apple is on the right track.

I seriously doubt that Apple will present a solution as complete as what I've just suggested. I'm certain that my solution requires a re-wiring of an OS. Nonetheless, here's hoping that it's a step in the right direction.


Joey Baker


I write code most days. Previously: photojournalist, EMT. Somewhat obsessed with journalism.