“Conquered Data” - Data that is optimally organized and managed according to an overall strategic plan.
Caveats:
- This is not the perfect solution. In fact, a perfect solution probably doesn't exist: technology changes so quickly that a data-conquering system must evolve over time.
- This is a solution I have come up with to fit my data needs; it might not satisfy yours.
- Don't think by using this strategy you will have solved all of your data management problems. Part of conquering data requires personal discipline in following the strategy you set forth.
- I have done my best to make my solution operating system agnostic.
- I have only tried to solve the problem of managing data, not applications (as these differ widely across operating systems).
- The system is not meant to be set up in a day. It should take serious thought, effort, and time on your part to get one up and running. Hopefully, I've cleared some of the bigger hurdles.
- I have already implemented everything I write about here in my own system, to make sure it is practical. If you have more specific questions about the details, feel free to ask.
To conquer your data you must be able to properly manage it, which means it must be organized, backed up, synchronized across devices, and secured. Organization is the critical step in this process because it determines what files need to be backed up, synced, and secured. Thus, I will spend most of this post describing my organizational scheme.
I'm not the first person to organize files. In fact, a great Lifehacker article got me on the right track.
The goal of organizing my data was to make it:
- Canonical – any file in my system has a single logical location.
- Fast – storing and retrieving information should both be quick.
- Robust – the system can easily accommodate changes to existing data sets or the introduction of new ones.
Four properties of a file govern how it is organized:
- Activity – Is the file accessed regularly? Is it an old high school paper, or a to-do list you use every day?
- Ownership – Who owns this information? Is it an original piece of work such as your college thesis, a purchased mp3 song, or part of your company's confidential papers?
- Privacy – Can the file be made public or is it sensitive and/or confidential information that no one but you should be able to access?
- Size – How large is the file? Although this is more of a practical attribute, it is currently a big factor in how a file is managed.
Here is the high-level organizational structure of the system, with each directory named after a real-world object to make it more intuitive:
- Archives [properties: low activity, original content]
- Briefcase [properties: high activity]
- Media [properties: purchased content, usually large]
- Vault [properties: sensitive information]
Archives sub-directories:
- Career – a place for your past resumes, employers, etc.
- Communications – email history, contact information, etc.
- Education – your schoolwork over the years
- Life Management – real world type stuff such as: housing papers, legal documents, insurance, financial documents, etc.
- Projects – past projects that you have worked on
Media sub-directories:
- Audio – all types of audio files from music to audio books
- Games – video games for the desktop or other devices
- Images – where pictures and art go
- Programs – a place for computer programs
- Text – electronic books and magazines can be found here
- Video – podcasts, movies, and TV shows live here
The Briefcase consists of all regularly accessed files. The directory structure of the Briefcase should mirror the system's high-level structure, consisting of documents (as opposed to archives), media, and vault directories. In fact, the Archives directory should also mirror the system with similar sub-directories.
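To make the placement rules concrete, here is a minimal Python sketch of how a file's properties could map to a directory. The `FileTraits` record, the `suggested_location` function, and the precedence it applies (privacy first, then ownership) are my own assumptions for illustration, not a fixed part of the scheme. Size is left out because it affects how a file is handled rather than where it lives.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass
class FileTraits:
    """Hypothetical record of the file properties described above."""
    high_activity: bool  # is the file accessed regularly?
    purchased: bool      # bought content rather than original work?
    sensitive: bool      # must it stay private?


def suggested_location(traits: FileTraits) -> PurePosixPath:
    """Suggest a directory for a file (assumed precedence: privacy, then ownership)."""
    if traits.sensitive:
        kind = "vault"
    elif traits.purchased:
        kind = "media"
    else:
        kind = "documents"

    if traits.high_activity:
        # Regularly used files live in the Briefcase, which mirrors the top level.
        return PurePosixPath("Briefcase") / kind

    # Low-activity files go to the matching top-level directory;
    # original, non-sensitive work ends up in Archives.
    top_level = {"vault": "Vault", "media": "Media", "documents": "Archives"}
    return PurePosixPath(top_level[kind])


# A to-do list used every day, and a purchased album that is rarely played.
print(suggested_location(FileTraits(high_activity=True, purchased=False, sensitive=False)))
print(suggested_location(FileTraits(high_activity=False, purchased=True, sensitive=False)))
```

The first call prints `Briefcase/documents` and the second prints `Media`. In practice the decision is a judgment call, but writing the rules down like this is a quick way to check that every file really has a single logical location.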
An optional directory one might consider is a “projects,” “workspace,” or “office space” directory: a place to put sets of related files too substantial to store in a briefcase. Since I am a software engineer, most of my coding projects fall into this category, but other things could apply, such as the current year's classwork. Once completed, these projects can migrate to the archives directory.
The entire top level hierarchy can be found at the end of the post.
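If you want to create the skeleton in one go, a short script can do it. This is only a sketch: the `LAYOUT` table and the `create_skeleton` function are hypothetical names, and I am assuming that the Career through Projects entries sit under Archives, that Audio through Video sit under Media, and that the Briefcase uses the documents, media, and vault mirror described above.

```python
from pathlib import Path

# Directory names come from the lists earlier in the post. Which parent each
# sub-directory belongs to is an assumption, as noted in the text.
LAYOUT = {
    "Archives": ["Career", "Communications", "Education", "Life Management", "Projects"],
    "Briefcase": ["documents", "media", "vault"],
    "Media": ["Audio", "Games", "Images", "Programs", "Text", "Video"],
    "Vault": [],
}


def create_skeleton(root: Path) -> list[Path]:
    """Create the directory tree under root and return every directory, sorted."""
    created = []
    for top, children in LAYOUT.items():
        top_dir = root / top
        # exist_ok=True makes the script safe to re-run on an existing tree.
        top_dir.mkdir(parents=True, exist_ok=True)
        created.append(top_dir)
        for child in children:
            child_dir = top_dir / child
            child_dir.mkdir(exist_ok=True)
            created.append(child_dir)
    return sorted(created)
```

Calling `create_skeleton(Path.home() / "data")` would build the tree under a `data` folder in your home directory; the folder name is just an example. Existing directories are left untouched, so nothing is overwritten.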
Synchronization [data property: high activity]:
With this type of data hierarchy, it is clear what information needs to be synchronized across all of your devices: the Briefcase. As the most accessed directory by definition, it should always be kept in sync. I use Dropbox to sync it to all of my computers and it works wonderfully.
Security [data property: sensitive]:
Knowing that the files in the Vault are sensitive, it is important to secure them against unauthorized eyes. A sensible security scheme for me* is to encrypt the content, especially when backing it up over the Internet. I use TrueCrypt for this purpose.
Backup [data property: owned content]:
Whether data is critical to back up really comes down to whether it is original content. If you lose your first paper from your freshman-year Philosophy class, you can't email your teacher for a backup copy; it's gone forever. For everything else, copies exist. However, I still like keeping my own records of important data, including email, tax forms, legal papers, etc. Essentially, anything I am willing to archive should have a backup. So what does this leave out? Mostly media. Although it took hours of downloading to amass a music collection, it's not the end of the world if it is lost. It might hurt financially, but it can be restored. Since media files are usually large, it's quite a relief not to keep multiple backup copies of them anyway.
I back up my files using a scheme based on a Lifehacker article. SyncBack performs automated daily, weekly, and monthly backups onto a local external hard drive. I also have an automated daily remote backup to the Mozy web service. My local backup stores all of my data excluding the media directory, but due to storage limits on my free Mozy account, I only back up my Briefcase there.
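The selection rules above are simple enough to write down as code. The following Python sketch only decides which top-level directories go to which destination; it does not call SyncBack or Mozy, and the `backup_sources` function and the "local" and "remote" labels are names I made up for illustration.

```python
TOP_LEVEL = ["Archives", "Briefcase", "Media", "Vault"]


def backup_sources(destination: str) -> list[str]:
    """Return the top-level directories to copy to a destination ("local" or "remote")."""
    if destination == "local":
        # The local drive takes everything except the large, replaceable Media directory.
        return [d for d in TOP_LEVEL if d != "Media"]
    if destination == "remote":
        # Remote storage is limited, so only the Briefcase is sent.
        return ["Briefcase"]
    raise ValueError(f"unknown destination: {destination!r}")


for dest in ("local", "remote"):
    print(dest, backup_sources(dest))
```

This prints `local ['Archives', 'Briefcase', 'Vault']` followed by `remote ['Briefcase']`. Keeping the rules in one small table like this also makes them easy to copy into your backup notes.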
Since backup settings can be complicated, I recommend writing a backup readme file, containing your specific backup instructions. A couple of years down the road this can come in handy when trying to upgrade your system. The same goes for the organization, synchronization, and encryption schemes.
I can report that it feels great to use a conquered data management system. Many everyday tasks on my machine have become simpler, and I would not be completely frightened if my hard drive were to crash tomorrow. However, my mission is not complete, and it may never be, as systems are constantly in flux. I must remain vigilant, keep the system up to date, and allow it to evolve over time.
I recommend you do the same.
*I am not claiming this is an unbreakable system. Absolute security requires a lot of sophistication and would take a blog of its own to explain.
Top Level Hierarchy: