Saturday 25 June 2016

A Closer Look at .docx Files

The default file type for Word documents has been .docx since Word 2007.  It's part of Microsoft's Office Open XML family, which also contains .xlsx (Excel spreadsheets) and .pptx (PowerPoint presentations).

Something that comes as a surprise to many people is that the .docx file type - along with the other file types in the Office Open XML family - is a compressed file type that can be opened with a file archiver like 7-Zip.  (If you have problems opening the file then change the .docx extension to .zip, e.g. rename example.docx so that it ends in .zip.)
If you do this you'll see that a .docx is actually a wrapper for XML files, primarily of type .xml, with a few .rels files as well (.rels are XML relationship files that tell a program how the .xml files relate to each other).

So a .docx file isn't one file, it's a whole bunch of files that are parsed by Microsoft Word into a coherent whole for the user to work on. Word is essentially a GUI front end for XML. 
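If you'd rather not install an archiver, the same peek inside the package can be sketched in a few lines of Python using the standard zipfile module.  This is only an illustration: list_parts and build_demo_package are names made up for this example, and the demo package is a tiny stand-in built in memory, not a valid Word document.

```python
import io
import zipfile


def list_parts(docx):
    """Return the names of every file stored inside a .docx package."""
    with zipfile.ZipFile(docx) as package:
        return sorted(package.namelist())


def build_demo_package():
    """Build a tiny stand-in package in memory (not a valid Word document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("_rels/.rels", "<Relationships/>")
        package.writestr("word/document.xml", "<document/>")
    buffer.seek(0)
    return buffer


demo_parts = list_parts(build_demo_package())
for name in demo_parts:
    print(name)
```

To try it on a real document, pass the path to the file instead, e.g. list_parts("example.docx").  Note that names inside the archive use forward slashes rather than backslashes.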

The correct technical term for a Word document is a package (the .docx file) which is comprised of parts (the .xml files) which are connected using relationships (the .rels files). 
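To make the idea of relationships a bit more concrete, here is a rough sketch that reads the entries out of a .rels part with Python's standard XML parser.  The sample XML is hand-written for illustration and much shorter than a real file, and read_relationships is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by relationship parts in Open Packaging Conventions packages.
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def read_relationships(rels_xml):
    """Return (Id, Type, Target) tuples from the text of a .rels part."""
    root = ET.fromstring(rels_xml)
    return [
        (rel.get("Id"), rel.get("Type"), rel.get("Target"))
        for rel in root.findall(f"{{{RELS_NS}}}Relationship")
    ]


# Hand-written sample for illustration only; real files list more relationships.
SAMPLE_RELS = """<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1"
    Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles"
    Target="styles.xml"/>
</Relationships>"""

relationships = read_relationships(SAMPLE_RELS)
for rel_id, rel_type, target in relationships:
    print(rel_id, "->", target)
```

Each relationship simply says "this part is connected to that target, and here is what kind of connection it is".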

The list of files and folders in a new, blank Word 2016/365 document is as follows:
  • \docProps\app.xml - Metadata from the application.
  • \docProps\core.xml - Metadata from the document.
  • \word\theme\theme1.xml - The default theme for the document.
  • \word\_rels\document.xml.rels - The relationship file that contains the relations between document.xml and other parts of the package.
  • \word\document.xml - The actual content in your document.
  • \word\fontTable.xml - A list of all the fonts used in the document.
  • \word\settings.xml - Various document settings, including defaults and security settings.
  • \word\styles.xml - The styles that are available in the document (not the styles that have been used, all the styles that are available).
  • \word\webSettings.xml - The web-specific settings used by the document.
  • \_rels\.rels - The package level relationship for the document.
  • [Content_Types].xml - The various types of parts that are used in the package.

If you add things to your document then other folders and files will be created by Word. 
For example, embedded media such as images will be in the \word\media folder and embedded files such as spreadsheets will be in the \word\embeddings folder.  

An important issue to note is that when you embed an image and then edit it, the original image is stored in the \word\media folder, and the edits are stored in  \word\document.xml, in the <w:pict> element.  If you've cropped or otherwise obfuscated private information using Word's inbuilt editing tools then anyone who opens the .docx package using something like 7-Zip can still see the original picture.
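To show how little effort this takes, here is a minimal sketch that pulls everything out of the media folder of a package.  It assumes the images live under word/media/ as described above; original_images is a made-up name, and the demo package contains fake bytes rather than a real picture.

```python
import io
import zipfile


def original_images(docx):
    """Return {name: bytes} for every file stored under word/media/."""
    with zipfile.ZipFile(docx) as package:
        return {
            name: package.read(name)
            for name in package.namelist()
            if name.startswith("word/media/") and not name.endswith("/")
        }


# Stand-in package for illustration; image1.png is fake bytes, not a real picture.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as package:
    package.writestr("word/document.xml", "<document/>")
    package.writestr("word/media/image1.png", b"fake image bytes")
demo.seek(0)

images = original_images(demo)
print(sorted(images))
```

Running something like this against your own documents before you send them out is a quick way to check what a recipient could recover.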

There is a lot more that can be said about the Open XML format, so rather than rewrite the wheel here are some links to explore:
MSDN contains plenty of useful OOXML information for users and developers, and the specifications and XSD information linked to above will tell you everything you could ever want to know about how the .docx structure works.

Finally, if you're thinking that this is very interesting but not particularly useful for writing topic-based documentation then you should know that Yes, you can write DITA in Word!

Sunday 19 June 2016

How to Scale Documentation, Part 6

Ok, I've got me a web server and an efficient, scalable automated build.  How do I get feedback?

No chit-chat or flim-flam, straight down to business.  I like it.  How's that for feedback?

That's payback for me mocking the Star Wars joke at the end of Part 5, isn't it?

Yes, but I'm not one to hold a grudge so let's leave it there and talk about Feedback.

That would be awesome.....

Good.  Feedback is the part of the cycle where work for the writer enters the pipeline.  This could be things like change requests and bug reports, all of which come under the "common" understanding of the word Feedback. However, in terms of the documentation cycle of Input > Delivery > Feedback, Feedback can also mean legislative changes, statutory changes, paid enhancements, and anything else that enters the pipeline and requires you to make an addition or change to the documentation.

But you said in Part 5 that having the documentation on a web server allowed you to gather feedback.  How does that cover all of those Feedback types?

It doesn't.  But real live customer feedback and usage data are tough to get without a central help site where you can track a user's movements through the documentation.  The other Feedback - legislative/statutory changes, paid enhancements, that sort of thing - will normally be tracked by Product Managers and Account Managers, and the Product Owner will feed them into the backlog in priority order.

This gives us 2 types of Feedback:

  • Product-focused - Anything that will require a change to the product (legislative/statutory changes, enhancements, software bugs, application maintenance, etc) and therefore a change to the documentation.
  • Documentation-focused - Anything that relates to the documentation itself, such as spelling/grammar errors, lack of clarity, omissions, requests for additional documentation, etc.
Product-focused feedback is something that requires a change to the documentation to accommodate work done by development.  We'll leave that feedback to Support and Product Managers and only look at the documentation-focused feedback, as this is the feedback that a writer should be responsible for tracking and responding to.  A lot of this feedback will come from the documentation on your web server.

I understand that distinction, but how do you get the feedback from the website? And how can you scale it?

Both fair questions.  As far as getting feedback, you've got several options, such as:

  • Google Analytics, which will supply you with more data than you thought possible about who, when and how people are using your site;
  • MadCap Pulse, if you're using MadCap Flare to create your documentation, which will provide analytics and also the ability for users to provide topic ratings and feedback;
  • Wufoo (or other such services), which allows you to create forms and surveys that you can embed in your site;
  • Comments, if you're using a wiki like Confluence, which allow users to give you any kind of feedback they want.
Of course, you can embed your own feedback mechanisms into your website if you wish; your choices are limited only by the usual constraints of price and time.

I'll look into those.  Don't they scale already though?

Yes, they do.  Any of those options will work whether you get 100 visitors a week or 100,000 a day.  When it comes to scaling your feedback processes, you need to focus on how you deal with that feedback.

Ideally, you will:

  1. Triage the feedback as it comes in (or at least triage it in bulk a couple of times a day);
  2. Delete the spam/junk feedback;
  3. Answer any positive feedback with some form of thank you;
  4. Answer all negative feedback, regardless of whether you can do anything or not (because an acknowledgement is often surprisingly good at defusing unhappiness, even if you can't actually fix the problem);
  5. Add all feedback that requires you to make a change to the backlog in priority order.
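If it helps to see those steps as logic rather than prose, here is a rough sketch.  Everything in it is hypothetical: the Feedback class, the "spam"/"positive"/"negative" labels and the reply names are assumptions made for the example, and the priority ordering from step 5 is left out to keep it short.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    text: str
    kind: str  # assumed labels: "spam", "positive" or "negative"
    needs_change: bool = False


def triage(items):
    """Split feedback into replies to send and changes for the backlog."""
    replies, backlog = [], []
    for item in items:
        if item.kind == "spam":
            continue  # step 2: delete the spam/junk
        if item.kind == "positive":
            replies.append(("thank you", item.text))  # step 3
        elif item.kind == "negative":
            replies.append(("acknowledgement", item.text))  # step 4
        if item.needs_change:
            backlog.append(item.text)  # step 5 (ordering not shown here)
    return replies, backlog


items = [
    Feedback("Buy cheap watches", "spam"),
    Feedback("Great guide", "positive"),
    Feedback("Step 3 is wrong", "negative", needs_change=True),
]
replies, backlog = triage(items)
print(replies)
print(backlog)
```

The point is not the code itself but that every item gets exactly one decision path, so nobody has to stop and think about what to do with it.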
Bearing in mind that you can't scale past a bottleneck, there are several areas you need to make sure you can scale:
  1. An alert system should be in place so that it's at least obvious to the whole team when feedback has been received, even if you don't want a popup notification on your screen every time feedback arrives;
  2. Keep a rota of who's responsible for dealing with feedback in any given time period;
  3. Make sure that multiple people have the authority and permissions to access the feedback and reply to it;
  4. Make sure that the same people have the authority and permissions to add things to the backlog;
  5. Provide templates for responses to different types of feedback.  These shouldn't be "robo-replies" because people don't like getting a pro forma that obviously no-one has put any thought into, but enough of a skeleton to include contact information, company header, standardised greetings and sign off, that sort of thing;
  6. Decide on the prioritisation order of feedback.  This could be severity, number of users affected, speed/complexity of fixing, or anything else that's most important to you.  This order should be used by everyone when prioritising the feedback that's added to the backlog.
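An agreed prioritisation order can be written down as a single sort rule that everyone uses.  The sketch below assumes one possible order (severity first, then number of users affected, then quickest fix); the field names and the sample items are invented for illustration, so swap in whatever matters most to you.

```python
def priority_key(item):
    """Sort key: higher severity first, then more users affected, then quicker fixes."""
    return (-item["severity"], -item["users_affected"], item["effort_hours"])


# Invented sample items; the fields are assumptions, not a required schema.
backlog = [
    {"title": "Typo on login page", "severity": 1, "users_affected": 500, "effort_hours": 0.5},
    {"title": "Wrong install command", "severity": 3, "users_affected": 200, "effort_hours": 1},
    {"title": "Missing API topic", "severity": 3, "users_affected": 50, "effort_hours": 8},
]

ordered = sorted(backlog, key=priority_key)
titles = [item["title"] for item in ordered]
print(titles)
```

Because the rule lives in one place, whoever is on the rota that day prioritises in exactly the same way as everyone else.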
Once all this is done then you can use your automation of the build and deployment to fix documentation issues and make the new version live on the web server on the same day.  This is the documentation equivalent of continuous integration, and it is the ultimate aim of your feedback system - find a problem in the documentation, fix it, build and deploy the new documentation.  All done efficiently and scalably.

That sounds awesome!

Yes, it is :-)

Is it really that simple?

Conceptually, yes.  Practically? Maybe, maybe not.  It depends on your company culture towards documentation, customer satisfaction, continuous integration (which some people see as a risk), and so on.  But this kind of system is eminently possible.

Hmm.  There's a lot to think about there.

Yes there is.  But the bones of everything you need to scale your documentation are in this and the 5 previous posts on the subject.  I hope they've helped.

They sure have.  I was expecting a bit more on Feedback though.... 

If you bear in mind the principles we've covered in the previous posts - the removal of unnecessary steps and bottlenecks, automation and cutting out repeated decision making - you should be able to see that you can apply them to the Feedback processes.  I reckon you know enough to be able to take these principles and apply them to your situation without me holding your hand any more.

Bit scary, but nothing ventured, nothing gained.  I'm going for it!  Thanks for talking to me about this.

My pleasure, and I'm sure you'll succeed.  Good luck!

Saturday 11 June 2016

6 Reasons You Should Hire More Working Mothers

Over the years of working in, reading about and chatting to people within the tech industry, there have been a few common threads that come up a lot.  These threads cross companies, specific fields, countries and continents, and many of them have to do with the hard problems of documentation (as you would expect, given that I'm a technical communicator).

But one thread has absolutely nothing to do with documentation itself: Working mothers find it harder to get a job, and harder to be taken seriously when they do get a job (or return to their old job).  This is very relevant to our field at a time when the American Medical Writers Association has a 4:1 ratio of female to male technical writers.  The Journal of Business and Technical Communication has a paper that discusses the "socioeconomic influences that contribute to women's dominance of the technical writing profession".  My personal experiences (for what they're worth) support this; my opinion is that there are more female writers than male ones.

The reasons for this female preponderance, whatever they are, would make a very interesting study, but that's not the aim of this post.  Nor, despite this ratio, is this article specific to technical writers.  The fact that there are so many more female writers than male writers gives me a justification to speak about it on a blog about documentation, but when many women feel discriminated against when they become mothers, and when there are proportionally so few women in tech, let alone tech leadership, this is an issue that affects our industry as a whole, not just the technical communication part of it.

Now, if you live in a country in the Western world it's pretty likely that not only is it illegal to discriminate against somebody based on either their gender or family situation, but the popular consensus is likely to be that it's (at best) morally dubious to hire or promote someone based on their gender. (We're talking hiring for an office role such as a writer in the tech industry here, not jobs where there may be a good reason for gender discrimination in hiring, such as a bra fitter for a department store.)  There is plenty of information about your legal obligations around discrimination, and if you want to read about the ethical implications of discrimination then Google is your friend.  This article is not aimed at people who are ignorant of the law or so strongly misogynistic that they'll happily discriminate against a woman just because they hate women (these people are beyond rational argument).

This article is in fact aimed at people who don't hire or promote women based on what they feel are sound business reasons. 

As a slight aside, please don't assume this is about male sexism or patriarchy.  There's good evidence to suggest that women are more likely than men to discriminate against other women, because gender bias is not limited to men.  There's plenty of anecdata, such as here, here and here to support this; it's called Queen Bee syndrome, and I know quite a few women who feel they've suffered from it.  Obviously there are plenty of examples of men discriminating against women as well.  The point is that this is about a manager, regardless of gender, discriminating against women, specifically working mothers, for what they feel are good business reasons.

There are pros and cons to hiring or promoting anyone, because no-one's perfect.  Also, I wouldn't say "all working mothers are great employees" any more than I'd say "all degree-educated people are great employees", because groups of people just aren't that homogeneous.  But I've seen a few attributes that working mothers in general seem to share, and these attributes are very business-friendly:

  1. Their children come first.  Which means they can stay calm and keep things in proportion, so when a "crisis" happens at work it's not the worst thing in the world.  Great leaders know how to stay calm, partly because freaking out is not helpful, partly because both calmness and stressing out are contagious.  A parent will know that a "crisis" at work is just a minor inconvenience in the bigger scheme of things compared to a real crisis involving their child.  Ask the parent of a child with a serious medical issue whether finding a serious bug the day before a big release is a crisis.  Then feel like crap for even considering that the two things might be in the same ballpark.
  2. Working mothers are often more flexible about when they work.  This might seem odd because often they have fixed start and end times due to childcare/nursery/school demands, but outside of normal working hours is sometimes the best time for a working mother to get stuff done.  You need someone to log in remotely and triage the overnight change requests onto the backlog before everyone else comes into the office?  Who better than the working mother who has to be up anyway to supervise the insomniac toddler who's awake at 5am and glued to Peppa Pig on the tablet for the next hour?  Better for you, better for her, insomniac toddler doesn't care either way.
  3. Mothers are used to dealing with whiny, inexperienced people who are prone to childish anger when they can't get something to work and think that their ideas are the most important thing in the world.  So that's "tech bros" dealt with, then.
  4. They're organised.  Working mothers have 2 choices:  Be organised, or....actually, no, they've only got 1 choice.  Organised people make better employees: They're punctual, they respond to communications in a timely fashion, they're good at prioritisation, they make their deadlines.  What's not to like about an employee who's organised?
  5. They work damn hard.  Working mothers often feel a serious motivation to work hard, not just to earn money or to get respect from their colleagues, but because they want to teach their child that hard work is valuable and important.  Mothers are role models, and a lot of them take that very seriously. 
  6. They've got a different perspective to non-parents, especially the young men who predominate in tech.  Businesses with diverse teams have better decision-making and problem-solving skills. Diverse perspectives correlate with greater profitability.  If that's not a sound business reason then nothing is.
These attributes are generalised, of course; not every working mother has all of them (or indeed any).  But the point is that mothers can bring plenty to the table because they have children, aside from their specific technical skills and experiences.  Being a parent can improve a person's value to the company.  Here's a great quote from a tech CEO:

"There’s a saying that “if you want something done then ask a busy person to do it.” That’s exactly why I like working with mothers now.

Moms tell me when a project can be done, and they give me very advanced notice when they have to take time off work. If they work from home, it doesn’t matter if a kid gets sick. Yes, they might not be able to Skype with me as often through that day, but they can still be productive because they can work from home while keeping an eye on their child. (And, like me, many have childcare. There’s no way you can work from home without support, usually from another woman.) Moms work hard to meet deadlines because they have a powerful motivation—they want to be sure they can make dinner, pick a child up from school, and yes, get to the gym for themselves.

But, I know there are still a lot of people like my 28-year-old self — they undervalue mothers’ contributions because they count hours logged in the office and not actual work. Most mothers lose if that’s the barometer for productivity."

Yes, working mothers are not the "perfect employee".  But no-one is.  So they might need to work from home occasionally because their child's sick.  So what?  That's what Slack and GitHub and VPNs are for.  What difference does it make why they're working from home, as long as - like everyone else - they get their work done?  And I've seen plenty of young devs at their desk unable to work properly because they're hungover or because they've only had 45 minutes' sleep after an all-nighter on Xbox Live.  Not seen many working mothers do that.  There are pros and cons to hiring or promoting just about anybody.

But ruling out a significant part of the talent pool just because you think a child is some form of baggage to a woman is dumb.  Not illegal or immoral (although it is those things), dumb.  You're missing out on a whole group of motivated, organised, hard-working people based on the fact that these people have something in their lives which is more important to them than your company.  News flash: Children are more important to men too.  We're just not discriminated against for having them, so the bias against us being parents hasn't formed as strongly or as pervasively.

When you hire or promote someone, choose the person with the skills, experience, work ethic and perspective that will allow them to do the job well - and remember that quite often that'll be a working mother.

Sunday 5 June 2016

How to Scale Documentation, Part 5

So, deployment....

Yes.  How's the build part going?

I think I'm getting my head around the automation that you talked about in your last post, but without knowing what the end goal is, it's hard to design it properly.

If you've never done any form of automation before then it's important to understand the concept and to try it out, even if you don't know the final goal yet.  A musician can play anything once they've mastered the instrument!

Well, I haven't mastered automation yet, but I'm getting there.  I've also got deadlines to hit, so if we could talk about deployment now that would be great....

I hear you, loud and clear.  As I said in Part 4, your documentation should be O/S and application agnostic as much as possible.  This means that the format you choose needs to be as open and non-proprietary as possible, which means you should use web format, i.e. HTML/XML, CSS and JavaScript.

Wait, why is being O/S and application agnostic so important?

Because if you create your documentation in a format that is primarily designed for one O/S - e.g. CHM - then no matter how good the viewers are in other operating systems you can't always be sure that your readers are a) able to read it on every machine without having to install software to read it, which sucks for them, and b) going to see the same layout and formatting on every machine they read it on.

Secondly, you're dependent on the features that the proprietor decides to give you when using that format, and those features can be taken away or changed at a moment's notice without you having any choice. Believe me when I say that upgrading to the new version of the build tool and finding that they've deprecated functionality that you relied on is not fun.  At all.

Thirdly, proprietary formats can be withdrawn without notice at any time, plus as long as you use that format you're tied to that supplier.  The older a proprietary format gets, the more likely it is that it will be superseded by newer formats and therefore the more likely that both the supplier and others involved in the format ecosystem who provide tools to read and edit the format will eventually stop supporting it in favour of something more profitable.

All good points.  But why web format?  Why not PDF?  Why not web format AND PDF?

There are 2 questions there:
  1. Why web format as opposed to anything else, and
  2. Why a single format?
To answer in reverse order, the benefits of a single format are:
  1. Only 1 set of documentation to lay out, format, review, proofread, build, deploy and maintain;
  2. Only 1 help authoring tool (or help authoring suite) is needed to cover the whole of the documentation;
  3. Customers never have to work out which is the right format for them.
Part of scalability is keeping each step of your process as simple as possible, and then keeping the number of steps down to a minimum.  Having more than one build target - e.g. HTML, CHM, PDF - increases both the number of steps and the complexity of the steps (especially when it comes to reviewing and proofreading the documentation in more than one format).  

The benefits of using web format, as opposed to PDF or any other open format, are that: 
  1. Every computer, including servers, comes with a browser that allows them to read web format without installing any software, plugins or extensions;
  2. Web format can be read either from an external website or from an internal server (if there are limitations that a customer places on the external connectivity of their database or application server, for example);
  3. Browsers are backwards-compatible for older versions of HTML/XML, CSS and JavaScript (at least up to a point) and newer versions of HTML have appeared on average about once every 4.5 years, so there's little danger of your documentation being unreadable for - realistically - a decade, and probably a lot longer than that;
  4. Because web format is so widely-used there are a myriad tools you can use to write/generate/build/deploy it, and that's before we get onto the benefits of using DITA, which is an XML format and therefore ripe for web deployment.

Are there scalability benefits to using web format though? 

Yes, and this is where the live web server that we mentioned in Part 4 comes into play.  The most scalable solution is to provide your documentation in web format and put it on the internet for your customers to access (behind a login if necessary).  If you have customers that might need the documentation locally, you can also provide a copy with the software that can be deployed on the customer's intranet.  This means you still only build one target, you just copy it to the live web server and also to the software build server.

This gives you:

  1. Smaller, quicker install (if the customer doesn't need the documentation locally);
  2. The ability to embed files, e.g. video, of any size;
  3. Dynamic updating of the documentation rather than waiting for a release;
  4. Gathering of stats to identify pain points in your product.
All of these points will help you scale your documentation by giving you a single documentation set to maintain.  You also get the additional benefits of identifying pain points in your product by gathering usage stats, and you get lots of scope for gathering both implicit and explicit feedback.
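As a rough sketch of "build one target, copy it to both places", something like the following would do the copying part.  The deploy function and the folder names are made up for this example, temporary folders stand in for the real servers (which would more likely be reached over a network share, rsync or similar), and dirs_exist_ok needs Python 3.8 or later.

```python
import shutil
import tempfile
from pathlib import Path


def deploy(build_dir, targets):
    """Copy one built documentation set to every target directory."""
    for target in targets:
        shutil.copytree(build_dir, target, dirs_exist_ok=True)


# Temporary folders stand in for the real servers in this illustration.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build = root / "build"
    build.mkdir()
    (build / "index.html").write_text("<h1>Help</h1>", encoding="utf-8")

    live_site = root / "live_web_server"
    installer = root / "software_build_server"
    deploy(build, [live_site, installer])

    deployed = [
        (target / "index.html").read_text(encoding="utf-8")
        for target in (live_site, installer)
    ]

print(deployed)
```

There is still only one build; adding another destination is just one more entry in the list of targets.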

Feedback as in the 3rd part of the documentation cycle?

Well spotted.  Yes, if you've scaled Input and Delivery then the 3rd and final part to complete the loop is Feedback.  Having your documentation online for customers to access will definitely help you with that because your customers are already coming to your documentation, you've just got to take advantage of that and get the feedback from them whilst they're there.

Ahhhh, you cunning fox.  It's all starting to get a bit clearer now.

This kind of system isn't just knocked together in 5 minutes on the back of a napkin you know.  There's some actual thought gone into this.

So I'm starting to realise.  How do I start getting feedback then?

Easy, my young padawan.  The force is strong with you, but you're not a Jedi yet.

Star Wars? Really?

Just trying to provide a bit of levity, but suit yourself.  We'll take a look at Feedback next time.

Yes, Master.

Stop taking the mick.


Saturday 4 June 2016

Muhammad Ali: The Passing of an Icon

Normally I steer clear of writing about things that aren't related to documentation or agile methodologies.  The one exception to that so far has been the death of Leonard Nimoy, as Spock was such an inspiration to me (and many, many others).  How can I not comment on the passing of The Greatest, Muhammad Ali?

Muhammad Ali training in his gym in 1965
It's been a bad year for deaths amongst famous people you've never met but feel like you've spent your life with.  But, without denigrating any of these people and what they meant to their fans, none of them were Muhammad Ali.  None of them were even in the same ballpark.  There are people all over the world who had never heard of David Bowie or Prince or Alan Rickman, at least until they died and social media went nuts.

But Ali?  Everyone knew who Ali was.  Everyone.  He was famous and revered around the world on a level that perhaps only Usain Bolt can approach amongst the modern generation.  But fame tells only a tiny fraction of how big Ali was.

He was more than famous; he was, in the truest sense of the word, legendary.  Not that he was from a long time ago and possibly mythical, but that he was a touchstone for courage, bravery, power, heroism. He used his physical skills and his charisma to stand up for equal rights and against a war he believed America had no business being involved in, and paid the price. People believed in him.

There's no doubt that Ali had his faults.  His merciless taunting of Joe Frazier left his opponent bitter and angry for the rest of his life and there have been many, many allegations of marital infidelity.  This is not a revisionist hagiography.  Ali was no saint.

And yet there is something so captivating, so thrilling, so visceral about his performances in the 60's and 70's, whether it was in or out of the ring.  Inside the roped arena he could be spell-binding, one moment gliding just out of reach like a Shaolin monk, all grace and speed, the next moment thundering an anvil right into his opponent like an angry diesel engine, all power and venom.  He wasn't the single greatest boxer that ever lived and he wasn't the single greatest fighter.  But his combination of a cruiserweight's speed and a heavyweight's power, together with a virtuoso's self-belief and a granite chin, made him in his prime an untouchable force of nature.

Outside of the ring his poet's tongue and showman's timing meant audiences were left eating out of his hand.  His charisma was luminous, lighting up not only his own physical beauty but everyone and everything around him. Even now you can look at videos of him being interviewed from 40 or 50 years ago and you can see the electricity coursing through him and into everyone around him.  The difference between him and almost anyone else you've ever seen is so huge it's sometimes hard to comprehend that he was just a man.

Perhaps the most appropriate comparison would be Nelson Mandela, both as the only person who could justifiably claim to have the same level of global fame and respect as Ali, but also as someone whose past was by no means uncontroversial.  Both men approached their dotage without losing any capacity to inspire, Mandela with his call for all men to put aside their past hatred and live as brothers, Ali with his dignified and typically powerful refusal to be destroyed by Parkinson's.  Both left this world to a global fanfare of mourning and sadness, the like of which we may not see again, certainly in my lifetime.  That's what's truly sad about this passing, the sense that the last of the great men has finally been laid to rest.

Perhaps the last word should go to someone who started as his greatest opponent and ended as one of his greatest friends, George Foreman:

"If you put Ali in boxing, you won’t get what he really was. The life he lived 
outside of the ring, what he had to say, the bravery he had, made him what he was: 
a prophet, a hero, a revolutionary — much more than a boxer. It really brings 
him down to rate him as a boxer. He only did boxing to run his mouth. 
Whatever message he was destined to get across, he used boxing to do it. 
I mean, he could do the shuffle and occasionally throw a good jab, even get a 
few knockouts, but that doesn’t put him in boxing. 

Forget about boxing, he’s been a gift to the world."

Muhammad Ali, boxer, poet, fighter, hero, January 17, 1942 – June 3, 2016. 
