Thursday 31 December 2015

Happy New Year! Have some links.

So, 2015 has come and will be gone in a few hours (or already has gone if you're in Australia).  Happy New Year to all my readers! I hope you have a great 2016.  As a present to see in the New Year, here are some links that you might find interesting if you're a documentation person.  Happy reading :-) 

N.B. Apropos of nothing, this is also the 50th post here on Agile Documentation.  That's gone quickly.  When I started the blog back in February after about a year of procrastinating, I hoped I'd have the self-discipline to keep it going for a few months.  I didn't expect that 10 months later I'd be constantly thinking about blog articles, with 50 articles published and another 50 or so in my backlog.  My free time is tight but (family aside, obviously) Agile Documentation has become a priority and I really enjoy writing it.  Thanks to everyone who's taken the time to read an article, and especially to those who've posted a comment - I really appreciate every one of you.

An Alternative to a Hierarchy

When building a document management system (DMS) there is a temptation to put all of the content into a hierarchy of one kind or another.  This isn't always a mistake, but it shouldn't be the default structure.  In fact, I'd go so far as to suggest that it should only be used when there are compelling reasons to do so (and "I'm the most important person and I want a hierarchy" is not compelling).

A note for those who might not have a DMS in place yet, or those who want to learn more about them: A DMS is more properly called an EDMS, or Electronic Document Management System, but the "E" is often left off on the assumption that people are talking about managing documents on computer systems.  A non-electronic DMS would nowadays probably be akin to archival work.  Document management is not carried out using a normal file manager like Windows Explorer (at least, not by professionals), but rather with specialised software tools that offer functionality like version control, check in/check out and permissions.  Examples would be Documentum and SharePoint, but there are many others.
 

So, if not a hierarchy then the obvious question becomes: What structure should be used instead?

The obvious answer is: A tagged heap.

By a tagged heap, I mean a heap, a pile, a bucket, a single folder/list/library that contains all of your documents, with each document tagged, labelled, indexed, categorised or having otherwise had metadata applied.  Retrieval becomes a matter of searching, filtering and ordering rather than navigating, drilling down and visually scanning.  In most cases it's quicker for users and easier for administrators to maintain in the long term.
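To make the idea concrete, here is a minimal Python sketch of a tagged heap. It is only an illustration, not how any particular DMS works: the `Document` class, the `find` function, the sample documents and the tag names are all made up for the example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Document:
    # Hypothetical metadata fields; a real DMS would define its own.
    name: str
    author: str
    created: date
    tags: set[str] = field(default_factory=set)


# The "heap": one flat list, no folders.
heap = [
    Document("Expenses policy", "Priya", date(2015, 3, 2), {"finance", "policy"}),
    Document("Onboarding guide", "Sam", date(2015, 9, 14), {"hr", "guide"}),
    Document("Travel policy", "Priya", date(2015, 11, 30), {"finance", "hr", "policy"}),
]


def find(docs, tags=(), author=None):
    """Return documents carrying every requested tag, newest first."""
    wanted = set(tags)
    matches = [
        d for d in docs
        if wanted <= d.tags and (author is None or d.author == author)
    ]
    return sorted(matches, key=lambda d: d.created, reverse=True)


policy_names = [d.name for d in find(heap, tags=["policy"])]
hr_names = [d.name for d in find(heap, tags=["hr"])]
```

Note that "Travel policy" turns up in both the policy search and the HR search without being copied or moved anywhere. It lives in one place and is simply found by whichever tags the user asks for.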

This is a big claim, especially if you've never been exposed to the power of searching a tagged heap. Also, hierarchies are popular, and the power of control they offer is a hard thing to give up, so I might have some convincing to do.  Let's start that by looking at the pros and cons of hierarchies:

Pros of folder hierarchy:

  • Already understood and used by computer users - little to no learning curve for functionality;
  • Easy to permission a folder (and by extension all of the files inside it);
  • Low cost of setup;
  • High level of control for administrators.

Cons of folder hierarchy:

  • Content can only be stored in one folder (and by extension one place in the hierarchy);
  • High learning curve for large or complex hierarchies;
  • Only someone who knows the whole hierarchy can move through it efficiently or quickly;
  • High cost of moving content in the event of a change of hierarchy (e.g. due to organisational change);
  • "Logical" for the hierarchy creator will not be the same as "logical" for a lot of the users;
  • Very inflexible;
  • No standardised metadata between different file types (or versions, such as .doc and .docx);
  • Searching is difficult if you don't already know the name and location, leading to user frustration;
  • Slow to click through folders, read the next list of folders, click the next folder, etc.

What can we take from this? Well, folder hierarchies might be familiar and easy to use, but they're inefficient, inflexible and illogical for a lot of users.  Familiarity and ease of use are all well and good, but they come with a significant lifelong cost that is hard to justify. The two costs that stick out immediately are the high cost of administration AND the high cost of use.  Isn't that actually a worst-case scenario?  That's only one step above not having a DMS at all.

Still, familiarity breeds both comfort and contempt, and maybe I've gone too far towards the latter.  After all, there's a reason folder hierarchies are so popular and ubiquitous, and it can't be because people actively like things that are rubbish (no X Factor jokes, thank you very much).  So people must value the control a hierarchy gives them and the low cost of setting up and permissioning folders, because those are the big benefits, right? Or, is it because there is no obvious alternative? Or that people can't see the benefits of an alternative?

A tagged heap is a strong alternative, and has a lot of benefits.  Let's look at the pros and cons.

Pros of tagged heaps:

  • Content can be viewed based on any number of searches, orders and filters, so location doesn't matter;
  • No learning curve as search is as ubiquitous as folders;
  • A system novice can find something as quickly as a system expert;
  • No cost of hierarchy change;
  • No logic gap between creator and user;
  • Infinitely flexible, as views can be created in any way the system allows;
  • Standardised metadata across all file types (using things like mandatory columns, labels, etc);
  • Searching is easy because of the metadata (such as date of creation, author, file type, description, etc);
  • Extremely fast file retrieval (modern search algorithms can search and return results from tens of thousands of records in less time than it takes to blink your eye).
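The flexibility and "no cost of hierarchy change" points can be sketched in a few lines of Python. This is an illustration under my own assumptions rather than any real product's API: a view is treated as nothing more than a saved set of metadata criteria, and the field names, view names and documents are hypothetical.

```python
# Each document is a plain dict of metadata; the field names are hypothetical.
documents = [
    {"title": "Q3 budget", "department": "Finance", "type": "spreadsheet", "year": 2015},
    {"title": "Style guide", "department": "Docs", "type": "guide", "year": 2014},
    {"title": "Q4 budget", "department": "Finance", "type": "spreadsheet", "year": 2015},
    {"title": "Release notes", "department": "Docs", "type": "guide", "year": 2015},
]

# A view is a named set of metadata criteria plus a field to sort by.
views = {
    "Finance 2015": {"where": {"department": "Finance", "year": 2015}, "order_by": "title"},
    "All guides": {"where": {"type": "guide"}, "order_by": "year"},
}


def run_view(name):
    """Return the titles matching a saved view, in the view's sort order."""
    view = views[name]
    rows = [
        d for d in documents
        if all(d.get(key) == value for key, value in view["where"].items())
    ]
    return [d["title"] for d in sorted(rows, key=lambda d: d[view["order_by"]])]


# Adding a view changes how documents are presented, not where they are stored.
views["Everything from 2015"] = {"where": {"year": 2015}, "order_by": "title"}
```

When the organisation changes, nothing has to be moved: you add or edit a view (or retag the affected documents) and the heap itself stays exactly where it was.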

Cons of tagged heaps:

  • Higher learning curve for users who are adding documents, especially for adding metadata;
  • Permissions can require more design;
  • Higher initial setup cost.

Unlike a folder hierarchy, a tagged heap gives more back the more a user puts in. This applies specifically to metadata.  As every administrator of every system ever built knows, not every user is as diligent as they should be, so any modern DMS can be set up to make metadata entry mandatory.  The more metadata is entered, the more powerful the views and searches become.  This does require an intelligent, thoughtful approach to metadata design though, hence the higher initial cost of a tagged heap.
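As a rough sketch of what "mandatory metadata" means in practice, the Python below refuses to add a document until every required field is filled in. The list of required fields and the function names are my own assumptions for illustration; a real DMS would enforce this through its own configuration rather than through code like this.

```python
# Hypothetical list of fields the metadata design has made mandatory.
REQUIRED_FIELDS = ("title", "author", "document_type", "description")


def missing_metadata(metadata):
    """List the required fields that are absent or blank."""
    return [f for f in REQUIRED_FIELDS if not str(metadata.get(f) or "").strip()]


def add_document(store, metadata):
    """Add to the heap only when every required field is filled in."""
    missing = missing_metadata(metadata)
    if missing:
        raise ValueError("Missing required metadata: " + ", ".join(missing))
    store.append(dict(metadata))


store = []
add_document(store, {"title": "Style guide", "author": "Sam",
                     "document_type": "guide", "description": "House style"})

try:
    # A blank author and two absent fields: this upload is rejected.
    add_document(store, {"title": "Untitled", "author": "  "})
    rejected = False
except ValueError:
    rejected = True
```

The design work is in choosing the required fields: too few and searches get weaker, too many and users will resent every upload.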

My argument would be that ease of setup and ongoing administration is not as important as ease of use for the people who'll be using the system on a daily basis.  A balance does need to be struck: a system that's incredible for the 50 people who use it, but which needs 50 administrators, isn't a sensible purchase.  In general, though, the users of the system are the ones who'll determine its success (or not).  We've all worked in places where a decision has been made to implement a system that the directors and/or administrators love, but which is execrable for the users.  And what happens? People use it as little as they can get away with, thus removing a lot of the intended benefit.  There's no point implementing a system that has such a poor ROI.



Although using a tagged heap does have a higher initial cost, the benefits for the users are legion and compelling.  On that basis, I'd use a tagged heap and consign complicated hierarchies to the recycle bin by default.

If you've got experiences that confirm or contradict the benefits and ROI of a tagged heap, drop a comment below and let everyone know which you prefer.


Sunday 13 December 2015

Corollary to A Note on Latin Phrases

The morning after A Note on Latin Phrases was posted, an article dropped into my feed reader with the serendipitous timing that normally occurs when you find your lost wallet just as you finish cancelling the final credit card it contained. McWhorter's erudite and deeply engrossing homily to the mongrel that is the English language conjures an image of a hotch-potch of invaders, nouveaux speakers and gnarled locals mangling the mother tongue time and again until it loses any single family resemblance and becomes a child of most of Northern Europe.

The overlay of one language on another, with the resultant twisting and changing and joining of the vocabulary and grammar, is followed by the overlaying of another, and another, intertwined with pockets of native resistance and other languages that become, briefly or otherwise, bedfellows with our increasingly individual and unrecognisable language.  This commingling and inbreeding leads to English's - apparently well-deserved - reputation for illogic and difficulty of mastery.

But aside from being a fine read (and I whole-heartedly recommend reading it) McWhorter has provided a mine of interesting gems about the effect of Latin on the English language, hence the poor timing for me.  I would have gladly folded in some quotes, but, alas, I clicked Publish too soon.  You'll just have to read the article yourselves, but here are a few points of interest:

"[S]tarting in the 16th century, educated Anglophones developed a sense of English as a vehicle of sophisticated writing, and so it became fashionable to cherry-pick words from Latin to lend the language a more elevated tone."

"One result was triplets allowing us to express ideas with varying degrees of formality. Help is English, aid is French, assist is Latin. Or, kingly is English, royal is French, regal is Latin – note how one imagines posture improving with each level: kingly sounds almost mocking, regal is straight-backed like a throne, royal is somewhere in the middle, a worthy but fallible monarch."

"Nevertheless, the Latinate invasion did leave genuine peculiarities in our language. For instance, it was here that the idea that ‘big words’ are more sophisticated got started. .... The English notion that big words are fancier is due to the fact that French and especially Latin words tend to be longer than Old English ones – end versus conclusion, walk versus ambulate."

So, if you want to know why there is an air of the common man about exhortations to "Never use a long word where a short word will do", it's because longer words are associated with Latin, for a long time the language of the only educated man in the village: the priest.  Shorter words are, quite literally, more Anglo-Saxon, and associated as such with the less educated.  If you want to be understood by the widest possible audience, you use the most commonly understood language, and that is not Latin.

(In light of this and the previous post on Latin phrases, comments on the irony of my gratuitous over-use of adjectives and long sentences on this blog are more than welcome.)