Monday 30 May 2016

How to Scale Documentation, Part 4

My Input phase is now as efficient as I can make it.  What about the Delivery phase?

Already?  That was quick....

Ok, it's not quite as efficient as it could be, but we're getting there.  It takes time to get authorisation for new licenses, a better build server, etc, etc, etc.....

Tell me about it.  But you're making progress, which is positive.  Let's talk about Delivery then.

Yes please.

In Part 3 I said that you need to remove the following things from your processes and replace them with efficient and effective tools, processes, procedures and standards:
  1. Unnecessary/complicated process steps;
  2. Bottlenecks;
  3. Anything done by humans which is better done by a machine;
  4. Decision making.
I seem to remember you also said that I'd spend more time on automation in the Delivery phase?

Good memory, nice to see you're taking this in. Yes, whereas Input - the creation and maintenance of your documentation - requires creativity and (a certain level of) decision making, Delivery should be a "dumb" process.  By that I mean that it shouldn't require much, if any, thought once you've got an effective process in place.

That seems a bit odd.  You said that all 3 parts of the cycle were equally critical; now you're saying that Delivery is just a "dumb process"?

No, I'm saying Delivery is a "dumb" process.  The placement of the quote marks is important!  A "dumb process" is a process that seems stupid.  A "dumb" process is one that doesn't require any thought to follow.  If Input is the phase where you carefully build something, Delivery is the phase where you cover it in bubble wrap and hand it over to a courier to deliver.  You're not interested in the mechanics of the courier's van, or whether it's a petrol or diesel van, or the route they take, or what sat-nav they use, or how often they have to take a break to comply with legal regulations for couriers.  You're just interested in the end result, which is that the customer gets the correct package, on time, and undamaged.

(For those writers who moonlight as couriers, I'm not suggesting your second job doesn't require any thought, just that the people who use your services are paying you so that they don't have to think about this aspect of their business.)

Nice back-covering at the end there.

Thanks, I've learnt the hard way not to annoy couriers.  Anyway, the point is that once your Delivery system is set up, you should be able to hit a button, metaphorically and preferably literally, and then not worry about it. That doesn't mean you don't need to design the Delivery phase carefully, but it does mean that you can rely on automation for a lot more of the Delivery phase than you can with the Input phase.

This has the significant advantage that it's normally much easier to scale an automated process than a manual one, because computers are cheap and ubiquitous whereas people to run a manual process are not.

That makes sense, but when does the Delivery phase start?  What's the transition point from Input to Delivery?

Good question.   The final part of the Input phase is the proofreading and review stage.  The Delivery phase starts when you have content that you're ready to release to customers.  That's not internal stakeholders - e.g. product management, support, etc - who might point out errors or omissions, that's your actual customers who are getting the finished, can't-be-changed, officially-versioned content that people who've paid you money will be getting with a release of the software.  Obviously if all of your customers are internal they won't have paid, but the important distinction is between the final version of the content that hasn't been officially released, and the version of the content that will be officially released.

Why have you started saying "content" instead of "documentation"?

Because the Delivery phase consists of 2 parts: Build and Deploy.  The content will be built into documentation in whatever form you use (PDF, HTML, CHM, etc) and then deployed to the customers. Whilst there might be a manual check on the final documentation between building and deploying, the proofreading and review stage of the Input phase, along with your automated build stage in the Delivery phase, should be robust enough that the release build will only need a minimal paranoia check. 

Assume I'm at that stage with content ready to be built.  How do I make sure my Delivery can scale?

Because there are only Build and Deploy stages, there shouldn't be many steps to go through for Delivery.  Therefore you can focus on making both stages as automated as possible, whilst removing any potential bottlenecks.  The good news is that if you have created a fully-automated Delivery phase the only bottlenecks should be ones that can be solved with additional resources:
  • Appropriate licenses for tools that build and deploy the documentation;
  • Bandwidth for uploading/downloading the documentation;
  • Processing power to build the documentation.
Ideally your Delivery automation will all be initiated in an automatic chain based on an entry action such as uploading your content to a specific folder on the build server. For example, your automation could be:
  1. You upload content to "Auto-Build" folder on build server.
  2. Process that polls "Auto-Build" folder finds the new content and kicks off a documentation build via a script that uses the command line commands for your help authoring tool.
  3. Once the build is completed the script deploys the contents of the newly created documentation folder to the live web server.
  4. Your customers are told about the new documentation either by an RSS feed or an automatically generated email that is sent whenever something is uploaded to the live web server.
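To make step 2 of that chain a bit less abstract, here's a minimal Python sketch of the "poll a folder" part. It's an illustration under assumptions rather than a recommendation: the function name find_new_content is made up for this example, a temporary folder stands in for the "Auto-Build" folder, and a real poller would call the function in a loop with a pause between polls.

```python
import tempfile
from pathlib import Path


def find_new_content(watch_dir: Path, already_seen: set[str]) -> list[Path]:
    """Return files in watch_dir that no earlier poll has reported."""
    new_files = []
    for path in sorted(watch_dir.iterdir()):
        if path.is_file() and path.name not in already_seen:
            # Remember the file so the next poll does not report it again.
            already_seen.add(path.name)
            new_files.append(path)
    return new_files


if __name__ == "__main__":
    # A temporary folder stands in for the "Auto-Build" folder (assumption).
    with tempfile.TemporaryDirectory() as tmp:
        auto_build = Path(tmp)
        seen: set[str] = set()
        (auto_build / "user-guide.zip").write_text("content")
        print(find_new_content(auto_build, seen))  # first poll reports the upload
        print(find_new_content(auto_build, seen))  # second poll reports nothing new
```

Anything the function returns would be handed to the build step; anything it has already reported is ignored, so the same upload doesn't trigger two builds.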

I've got so many questions! A process that polls a folder? Command line commands? Automatically deploys content to a live web server? How do I do all of that??

Yeah, you're just a writer, how are YOU supposed to do all this stuff?  

Exactly! This is way out of my league!

Good news: No, it's not, stop being so defeatist!  You can use PowerShell to watch a folder and kick off an action - such as a command line script - when the contents change.  Writing a script (also known as a batch file) to run on the command line is not hugely difficult, and help authoring tools normally provide command line tools for building documentation, like this for Flare or this for RoboHelp.  Finally, you can copy files and folders from the command line; here's the syntax.

I won't say that any of that can be done in 5 minutes if you've never done it before, but if you're willing to put the effort in then the ROI will be immense and your scalability will be massively increased.  Plus you work with developers, and every software house has plenty of people who can write batch files so you should be able to get help, especially when you're trying to troubleshoot a problem.  
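If it helps to see the shape of it, here's a rough Python sketch of the "build, then copy" part. Treat all of it as assumption: build_and_deploy is a name invented for this example, the build command is a stand-in that writes one HTML file (swap in whatever command line your own help authoring tool documents), and copying to a local folder stands in for deploying to a web server.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def build_and_deploy(build_command: list[str], output_dir: Path, live_dir: Path) -> None:
    """Run the build, then copy the built documentation to the live folder."""
    # check=True raises CalledProcessError if the build exits with an error,
    # so a failed build never gets as far as the deploy step.
    subprocess.run(build_command, check=True)
    # dirs_exist_ok=True (Python 3.8+) allows copying over an existing folder.
    shutil.copytree(output_dir, live_dir, dirs_exist_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        output_dir = Path(tmp) / "output"
        live_dir = Path(tmp) / "live"
        output_dir.mkdir()
        # Stand-in for a real build command (assumption): it just writes
        # a single HTML file into the output folder.
        fake_build = [
            sys.executable,
            "-c",
            "import pathlib, sys; pathlib.Path(sys.argv[1]).write_text('<h1>Docs</h1>')",
            str(output_dir / "index.html"),
        ]
        build_and_deploy(fake_build, output_dir, live_dir)
        print((live_dir / "index.html").read_text())
```

The one design choice worth copying is the order: the deploy only runs if the build succeeded, which is what lets you hit the button and not worry about it.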

Well....ok, maybe I overreacted a bit there....

No problem, we've all panicked a bit at one point or another.  But if you don't try, you'll never learn!

Agreed.  Now, what was that about a live web server? 

As much as possible your documentation should be O/S and application agnostic, so the best way to provide it is in web format.  This means some combination of HTML, XML, XHTML, CSS and JavaScript, which any and all browsers can display.

However, this is a surprisingly big topic so I'll deal with it in the next post.  What's important right now is that you understand that automation can make your Delivery phase quicker, easier and more scalable.  We'll look at the specifics of what to deliver to customers, and where delivering in a certain way benefits both you and the customer, once you've got your head round automation.

It does look a bit scary, but I can see how automation might help a lot.

Excellent, give it a go.  You know it makes sense.

I'm willing to try.  Although it would help if I understood more about why I need a live web server and why I can't just build whatever targets I'm asked for.

I can understand that.  Stand by, we'll look at deployment options in Part 5.

Write the Docs NA 2016

Sarah Maddox has recently been at Write the Docs NA 2016, and as usual she's posted a series of informative articles about the sessions she attended.  If you're not already following her blog - and you should be if you're at all serious about being a better technical writer - here's the list of relevant posts:

I can't stress enough how much useful information Sarah manages to pack into a blog post, so get reading, get learning, get better at what you do.

Sunday 22 May 2016

How to Scale Documentation, Part 3

I'm glad you're starting with Input processes, it's always better to start with the meat of the subject.

You're a writer, aren't you?

Yes.  Well, Information Engineer, technically, but writer will do.  How did you know? 

The people that write and maintain the documentation (i.e. provide the Input) always think that the Input is not only the most important part, but also logically the first stage. 

Creation and maintenance of the documentation IS the first stage, surely?

Not normally.  In Part 1 we talked about the Input > Delivery > Feedback cycle, if you remember. When you create documentation for the first time you might think that Input is the starting point of that cycle, but in actual fact there is no starting point.  The Input will come from the Feedback, be that a customer request for documentation, a legislative need, a stakeholder requirement from a user story, or something else.  The Feedback in turn comes from Delivery, which came from the previous Input, and so on. 

Ah, so Input isn't as important?

Not at all, I'm just illustrating a point: All 3 parts of the cycle have equal value.  As mentioned at the start of this series, you can't scale past a bottleneck.  If you have a bottleneck in any one of those 3 stages then you can't properly scale your documentation.  Besides, if you don't create and maintain your documentation there'll be no Delivery and the Feedback will be all negative, so let's not denigrate the Input.  It's absolutely critical.

Glad to hear it.  Let's cut to the chase: how do you scale the Input?

As mentioned in Part 2, scalability is the ability of your processes to handle a lot more work and still run at the same level of efficiency.  This means that you need to remove the following things from your processes where possible:
  1. Unnecessary/complicated process steps;
  2. Bottlenecks;
  3. Anything done by humans which is better done by a machine;
  4. Decision making.
You need to replace all of those things with efficient and effective tools, processes, procedures and standards.

Remove decision making?  I'm not a robot, thank you very much!

No you're not, but there are some decisions you don't need to make, especially not again and again.  I'll come back to that though, don't worry.

Hmm.  Carry on then....

Much obliged. The reason you need to remove these things is that the more steps in a process, and the more complicated those steps, the more difficult it is to scale that process.  
  • Every step has a resource cost, so the more steps there are, the more resources that process burns up.  The more resources a single process needs, the more difficult it is to scale it without the cost becoming too high.
  • The more complicated a step (or series of steps), the higher the cost of performing that step because a) specialist skills or knowledge are required which means expensive people, and b) "complicated" almost always means "time-consuming and prone to error".
Therefore you get rid of every unnecessary step, remove bottlenecks that channel everything through one individual or tool, stop making the same decisions over and over, and automate whatever you can.  We'll look at some specific examples of each of these next.

Hang on, are these Input-specific, or do they count for all 3 stages of the cycle?

They count for all 3 stages of the cycle to a greater or lesser extent.  However, decision making is much more critical in the Input phase than anywhere else in the cycle, because the Input phase is the most creative part and as such there are many more decisions to be made.  Decision making scales terribly! We'll be looking at Delivery and Feedback mechanisms later in the series but as a sneak peek, once those mechanisms have been decided you'll spend more time on automation (for Delivery) and removing unnecessary process steps (for Feedback) than removing decision making.

Shall we start with unnecessary/complicated process steps then?

Yes.  Process steps for the Input phase - that is, creation and maintenance of documentation - start at the point where a story is added to the backlog ready to be groomed.  However, at this point you'll still be working within your scrum team process to groom and estimate the backlog items rather than within your documentation process.  Agile processes scale well already, so we'll start on Day 1 of your sprint when you're working under your documentation processes and standards.

An example of an unnecessary process step would be having to create your skeleton documentation from scratch every time.  Aside from the fact that this provides an opportunity to make a mistake and/or make your documentation inconsistent, it takes time that could be better used actually writing content.

Templates, then?

Absolutely.  I've talked about this before so I won't rewrite it all again but I'll quote this one sentence:

"Anywhere you have to perform a documentation task more than once is an opportunity to create something that will save you time in the future."

Apart from the ROI benefits, using templates, single sourcing, variables and so on will remove unnecessary steps from your process and thus make it more scalable.

As far as complicated processes go, the definition of "complicated" is very contextual.  The fact that something is complicated to me doesn't mean it's complicated to you, so I'm hesitant to give a concrete example.  As a general principle, if you can't fairly easily teach a colleague to follow that process correctly, then it's too complicated to scale well.  Break it down into smaller steps or find a less complicated way of doing it (see automation, below).

Interesting. Ok, what about bottlenecks?

A bottleneck is any step in the process where only one document can go through that step at a time at the normal speed.  This generally takes the form of a lack of resources in one of the following areas:
  • People - an example would be a process step that only one person can perform because of skills, authority, permissions, etc;
  • Tools - an example would be a process step that requires a tool for which you only have one license;
  • Bandwidth/processing power - an example would be a build server that builds your help output but only has enough processing power to build one help output at a time without degradation of performance. Or having a network without sufficient bandwidth to upload/download multiple help outputs concurrently.
There are a myriad possible bottlenecks; the key to finding them is to run your whole process end-to-end for several documents concurrently.  If at any process step you develop a queue, or could have developed a queue in any easily-obtainable circumstances, you've got a bottleneck.
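As a back-of-an-envelope version of that test, here's a small Python sketch. It assumes you can put a number on how many documents each step can handle at once; the step names and capacities below are invented for illustration, and find_bottlenecks is a hypothetical helper rather than anything from a real tool.

```python
def find_bottlenecks(step_capacity: dict[str, int], concurrent_documents: int) -> list[str]:
    """Return the steps that cannot take every concurrent document at once."""
    return [
        step
        for step, capacity in step_capacity.items()
        if capacity < concurrent_documents
    ]


# Hypothetical process: how many documents each step can handle at the same time.
STEP_CAPACITY = {
    "write": 5,    # five writers
    "review": 1,   # only one person is authorised to sign off
    "build": 1,    # only one license for the build tool
    "upload": 5,
}

print(find_bottlenecks(STEP_CAPACITY, 1))  # [] - no queue with a single document
print(find_bottlenecks(STEP_CAPACITY, 3))  # ['review', 'build'] - queues appear
```

With one document nothing queues, which is exactly why a process can look fine until you run several documents through it at the same time.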

Remove bottlenecks by providing what you need - trained/authorised people, tools, bandwidth/processing power - or by removing/amending that process step.

Which presumably brings us on to automation, right?

Right.  Well figured out, by the way.  Yes, automation can be a very useful tool for preventing bottlenecks, although there are plenty of other benefits to automation. Computers are very good at rule-based calculations, whereas humans aren't.  This makes computers particularly useful for performing steps that involve checking things - like a spellchecker - or steps that involve changing or selecting many things based on a rule.

A simple example would be using variables in your help authoring tool for pieces of text that occur in many places.  If the text changes, you change the variable and it gets changed everywhere in your documentation.  Similarly, you can use tags to choose what documentation will be included in a compilation (this is part of single sourcing).
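Your help authoring tool does this for you, but the underlying idea is simple enough to sketch in a few lines of Python. Everything here is made up for illustration (the product name, the tags, the compile_output function); it only shows the principle of "change it once, filter by tag".

```python
from string import Template

# One place to change text that appears in many topics (hypothetical values).
VARIABLES = {"product": "Acme Reporter", "version": "4.2"}

# Each topic carries tags that say which outputs it belongs in.
TOPICS = [
    {"tags": {"admin", "user"}, "text": "Welcome to $product $version."},
    {"tags": {"admin"}, "text": "Configure $product from the admin console."},
    {"tags": {"user"}, "text": "Open a report from the $product home page."},
]


def compile_output(topics: list[dict], variables: dict[str, str], target_tag: str) -> list[str]:
    """Keep the topics tagged for this output and fill in the variables."""
    return [
        Template(topic["text"]).substitute(variables)
        for topic in topics
        if target_tag in topic["tags"]
    ]


for line in compile_output(TOPICS, VARIABLES, "user"):
    print(line)
```

Change the value of "product" in VARIABLES and every topic picks it up on the next build; change the target tag and you get a different document from the same source.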

A more complicated example would be writing a batch script that compiles a documentation project, or using VBA to write formatting macros in a Word document.  Automation can be as complicated as you want it to be, and it's up to you to determine what the ROI on the initial investment in automation needs to be to make it worthwhile.  But computers and the programs they run typically scale a lot more easily than people and their skills, so if you can automate a step in your process, even partially, do it.  

Sounds like that requires a decision.  Didn't you say to remove decision making?

Yes, although I said "where possible" and I was referring to decision making within individual process steps.  You still need to - and should! - make decisions about those process steps.

Why would I want to stop making decisions within the process?  Writing is a creative skill, and that means decisions.....

You're partially right, because writing is partially a creative process.  But it's also a prescribed process, at least when you're writing technical documentation.  It's those prescribed bits, like writing standards and responsibilities, that you shouldn't be making decisions about.

The lack of decision making is the biggest efficiency saving that you get from a production line and this can be used in the documentation process to allow the writer to concentrate on the creative part of their job.  I've covered this in a fair amount of detail before, and I'd encourage you to read that post to fully understand this.  One of the key takeaways is that:

"The single biggest time sink is the decision making process."

Your Input phase will not scale if each individual writer is making decisions about writing standards, terminology, responsibilities, and so on.  These things need to be documented outside of the day-to-day documentation process and applied to the Input phase for all documentation.

So we should have prescriptive and general standards to allow us to focus on the specifics of what we're trying to write?

Exactly.  Writing is a hard-won and specialist skill set and writers should be given as much time to write as possible. Any process step that requires a writer to make a decision that in theory should apply to all documentation - e.g. the correct terminology to use - is a process step that can't be scaled because decision making can't be reliably replicated at the same level of efficiency.  And it's a waste of time for writers to keep having to make these decisions again and again, as well as leading to inconsistent and therefore poor quality documentation. 
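Terminology is a good example of a decision you can make once and then hand to a machine. Here's a minimal Python sketch of that idea; the term list is invented for illustration and check_terminology is a hypothetical helper, not a feature of any particular tool.

```python
import re

# Decided once, outside the day-to-day writing process (hypothetical list):
# term to avoid -> term the standard says to use instead.
PREFERRED_TERMS = {"log-in": "login", "e-mail": "email", "click on": "click"}


def check_terminology(text: str, preferred_terms: dict[str, str]) -> list[tuple[int, str, str]]:
    """Return (line number, term found, preferred term) for each breach."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for avoided, preferred in preferred_terms.items():
            # Whole-word, case-insensitive match on the term to avoid.
            pattern = r"\b" + re.escape(avoided) + r"\b"
            if re.search(pattern, line, flags=re.IGNORECASE):
                findings.append((line_number, avoided, preferred))
    return findings


draft = "Click on Save.\nWe will send you an e-mail."
for line_number, avoided, preferred in check_terminology(draft, PREFERRED_TERMS):
    print(f"Line {line_number}: use '{preferred}' instead of '{avoided}'")
```

The writer never has to decide between "e-mail" and "email" again; the standard decides, and the check just applies it to every document in the same way.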

I've read that article you linked to, makes a lot of sense. 

Stop it, you're making me blush.

Uh-huh.  Moving on, is there anything else I can do to scale my Input?

As long as you remember the principle that you can't scale past a bottleneck, and cover the 4 points I mentioned above, you should be able to run your Input phase with the same efficiency no matter how many documents are being worked on concurrently.

Excellent.  See you next time to talk about Delivery?

I'm already looking forward to it.


Saturday 14 May 2016

How to Scale Documentation, Part 2

Right then, let's get on with this.  How do I scale my documentation?

Did you think about where your processes fit in the Input > Delivery > Feedback cycle, and what bottlenecks you've got, like I asked you to in Part 1?

Yes, although I thought bottlenecks affected efficiency more than scalability...

They can do, but not necessarily. Efficiency and scalability are closely related.  Would you like me to clarify the difference for you?


It's OK, lots of people are confused by this.

Really?  Alright then, what's the difference between efficiency and scalability?

Efficiency is a measure of what you can get done, with minimum waste, given certain requirements.  The requirements normally relate to the end product, so in the case of documentation the requirements might be the accuracy, consistency and completeness.  If you've reached a situation where there is no way to go through the Input > Delivery > Feedback loop cheaper (in resource terms) or quicker without breaching the requirements then your processes are 100% efficient.

So efficiency is relative to requirements?

If you're talking about the mathematical definition of efficiency, no. Efficiency is essentially the ratio of output to input, and this can be measured very precisely (Warning: Link takes you to a PDF.  Of maths.).

In the practical terms we're interested in though, yes, efficiency is relative because, for example, some waste in your documentation processes might be an acceptable trade-off for significantly increased speed.  It depends on what requirements you measure the efficiency against.

Right, I see.  Carry on then.

A process is generally linear.  This means that you perform one step at a time, one after the other, until you've completed all of the steps in the process.  Although the order of the steps may not matter, and you might be able to start some steps before you've completed the previous step, the general concept of a process is that it is a sequence of steps to achieve a goal.

Efficiency can be thought of as minimising both the number of steps and the time it takes to complete each step whilst still meeting the requirements.  The elimination of the un-required steps and speeding up of the required steps is the removal of waste from the process.  (This concept should be familiar if you've ever seen the Lean process at work.)  

Gotcha.  What's scalability then?

Scalability - in this context - is the ability of your processes to handle a lot more work and still run at the same level of efficiency.  In other words, it doesn't matter whether you have 1 document for 1 release or 100 documents spread over 10 releases, your processes should still work just as efficiently.  

Included in this is the ability for multiple resources to work through those processes concurrently.  This is why bottlenecks need to be addressed. If only 1 person can perform part of the process then whilst your process might be very efficient for 1 document for 1 release, it isn't a scalable process because only 1 document can go through that process at any one time.  That means your process doesn't allow multiple resources to work concurrently.  

Ahhh, so bottlenecks might only become apparent when you try to scale, right?

Spot on, gold star for you! Removing un-required steps and speeding up slow steps will make your linear process more efficient, but not necessarily help you with scaling if your process has a bottleneck that only affects concurrent resources.
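Here's a deliberately crude Python model of that point, in case the numbers help. The steps, hours and capacities are all invented, and the model assumes every document finishes one step before any document starts the next, which real processes don't do. It's only meant to show how a single-capacity step hides when you have 1 document and hurts when you have 10.

```python
import math


def elapsed_hours(steps: list[tuple[str, float, int]], documents: int) -> float:
    """Estimate the hours needed to push `documents` through every step.

    Each step is (name, hours per document, documents it can handle at once).
    """
    total = 0.0
    for _name, hours, capacity in steps:
        # A step with capacity 1 has to be repeated once per document.
        batches = math.ceil(documents / capacity)
        total += batches * hours
    return total


# Hypothetical process: ten writers, one reviewer, plenty of build capacity.
PROCESS = [("write", 4, 10), ("review", 1, 1), ("build", 1, 10)]

print(elapsed_hours(PROCESS, 1))   # 6.0 hours for one document
print(elapsed_hours(PROCESS, 10))  # 15.0 hours for ten: the review step queues
```

Give the review step a capacity of 10 and ten documents take the same 6 hours as one, which is the definition of scalability we've been using.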

I see, I see. I might need to go back and look at my processes again....

Take your time, I'll be here.  When you're ready, we'll start going through the Input processes of your documentation cycle in part 3 and make some suggestions for making sure it can scale.

Tuesday 3 May 2016

How to Scale Documentation

Firstly, what does "scale" mean?

Scaling applies to systems and processes.  It is most frequently associated with computer hardware systems, networks and algorithms, although it applies to anything which can grow to accommodate more work.  As an example, a network is said to be scalable if it can handle an increased number of nodes and increased traffic without requiring anything more than perhaps some additional hardware to be bolted on. 

In practice, if there are a large number of things (n) that affect scaling, then resource requirements (for example, algorithmic time-complexity) must grow less than n² as n increases.

Interesting.  Can you scale documentation?

Yes.  The processes that you use to generate and deliver your documentation can either be scaled - i.e. you can take on the production and delivery of more documentation with no loss of speed and quality - or they can't.  Where a computing process may need more disc space or processor power in order to scale, your documentation process might need more writers in order to scale.  That's fine, a process that can handle more work if you add more people is scalable.  A process that can't handle more work by adding people is not scalable.  (There are other things like bandwidth you might need more of to scale your documentation and we'll talk about that a bit later on.)

Great, ok, like it.  What processes need to scale?

There are 3 documentation processes that need to be scaled:

  • Input (creation and maintenance)
  • Delivery (released to customers)
  • Feedback (bugs, change requests, stakeholder enhancements)
This can be visualised as a cycle: Input > Delivery > Feedback, then back to Input.

That's it? Just those 3 processes?

Yup.  Those 3 processes cover the whole documentation lifecycle.  Everything that happens to documentation comes under one of those 3 processes.  The end of life of a piece of documentation - e.g. when a product is discontinued - is covered by Delivery: you deliver the last iteration of the documentation, place the documentation into the standard repository and then start work on something else.  In this specific case the cycle ends there as there will be no feedback.

Fair enough, sounds good.  Any high-level principles or rules of thumb to follow?

Only one:

                         You can't scale past a bottleneck.

This means that if there is a bottleneck in one of the 3 processes, your documentation won't scale. Even if the Input and Delivery both scale with perfect efficiency, and 99% of the Feedback scales perfectly, if that 1% causes a bottleneck then your processes don't scale.  It's all or nothing.  That doesn't mean you have to have an entire system set up to be 100% scalable from the start (that's unrealistic) but until all 3 processes scale completely, you haven't scaled your documentation.  Nature abhors a vacuum, scalability abhors a bottleneck.  Both will get filled extremely quickly.

Are you quoting an Ancient Greek philosopher just to look clever?

Partly.  But there's a serious point to be made about bottlenecks, which is that they will fill up very quickly as you start to scale and thus prevent you delivering more documentation, even if you have more resources.  I'll say it again: You can't scale past a bottleneck.

Right, bottlenecks are bad, I understand, stop banging on about it..... 

As long as you're sure you get it. It's critical.

Critical, roger that.  Let's move on.  How do you make your processes scale?

We'll start looking at that in the next part.  For now, have a think about your own processes and try to identify where they sit in the Input > Delivery > Feedback cycle, and what bottlenecks you've got in those processes.  That'll help you apply what we're going to talk about to your own situation.

You're giving me homework?

I am.  Get on with it, I'll see you in part 2.