“Microsoft Word 2010 Developer Building Blocks” and other wordy links…

Eric White: “The altChunk functionality of the Open XML file formats enables easy merging of documents.  You can merge content from multiple sources (other Open XML documents, HTML, plain text, and more) into a single document.  After using the Open XML SDK to set up the document that imports alternative content, if you want to convert the document so that the new content is transformed to typical Open XML WordprocessingML, you need to open and save the document using Word 2010.  Alternatively, you can use Word Automation Services to process the document and import the alternative content.”

“For your OOXML Conspiracy Theories”

Miguel de Icaza: “The energy that went into stopping OOXML could have been better used in actually completing the formula spec for ODF, which almost four years later is still not part of the ISO spec. In the eyes of the ISO world, it remains an "implementation specific" work. But "advocacy" is a little bit like watching the TV, it is relatively easy. While actually working on improving open source, or open standards is equivalent to going to work. It requires skills, time and long hours of difficult work (particularly if you are working on the OpenOffice code base).”

listsync.codeplex.com

“Sync between file folder and SharePoint list for large file scenario. Huge file (such as media file, CAD, etc.) was not recommended to be directly stored in SharePoint document library. This project is focused on Huge file storing problem.”

writers.stackexchange.com

“Writers is a collaboratively edited question and answer site for people who love writing. It’s 100% free, no registration required.”

“Better Handwriting For You: Book 4”

I was raised on this book!

My First Post Pasted from WordWalkingStick

WordWalkingStick 10-21-2010 10-42-07 AM

There are a happy few people on this planet who expressed concern about the next version of CleanXHTML. So today finds me pasting this Blog post into WordPress from my next version of CleanXHTML: WordWalkingStick. This project will be released on CodePlex.com (along with my other projects) ‘soon.’ It will be released under the same license Eric White uses for PowerTools for Open XML, the Ms-PL, because portions of WordWalkingStick depend on PowerTools for Open XML.

Here are some random points about WordWalkingStick, written just before midnight:

  • WordWalkingStick is based on .NET 4, using MEF, WPF and PowerTools for Open XML.

  • This application has a scope larger than CleanXHTML as it provides a small framework for rolling up all of my Office Word customizations. Previously, my practice depended on customizing Normal.dot.

  • This move puts me out of the commercial software business (based on a business model from the early 21st century).

Using my “Swiss Army Knife” (my stick) for Office Word is supposed to make customizing faster and easier to migrate to future versions of Office (until it moves entirely to the cloud or VSTO is discontinued as we now know it).

Word 2010 Allows Nesting of Content Controls!

The shot below shows a Word 2010 Rich Text Content Control, nesting a Plain Text Content Control:

This idea of nesting content controls comes to me from Eric White’s “Using Nested Content Controls for Data and Content Extraction from Open XML WordprocessingML Documents.” Eric mentions this very important bit:

Important note: In order to nest content controls, the containing content control must be a rich-text content control.  You create one of these using the upper-left button in the Controls section of the Developer tab.  Thanks, Darin.

Another important bit: you cannot nest content controls in Word 2007 or earlier! This new feature in Word 2010 effectively replaces the functionality of “Custom XML” that has been removed by a court order from Word 2010. I daresay nested content controls are not as conceptually embarrassing as some critics of Microsoft have claimed. The Content Control does not require the use of an external schema file (which was technically entertaining to me—but not to many, many others).
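For the curious, nested content controls show up in the document markup as one w:sdt element inside the w:sdtContent of another. Here is a minimal Python sketch (standard library only) that lists which controls are nested inside which. The sample markup, the tag values “Outer” and “Inner,” and the function names are my own illustrative assumptions, not something taken from Eric’s article.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical sample: a rich-text control tagged "Outer" that contains
# a plain-text control tagged "Inner".
SAMPLE = f"""<w:document xmlns:w="{W}"><w:body>
<w:sdt>
  <w:sdtPr><w:tag w:val="Outer"/></w:sdtPr>
  <w:sdtContent><w:p>
    <w:sdt>
      <w:sdtPr><w:tag w:val="Inner"/><w:text/></w:sdtPr>
      <w:sdtContent><w:r><w:t>Hello</w:t></w:r></w:sdtContent>
    </w:sdt>
  </w:p></w:sdtContent>
</w:sdt>
</w:body></w:document>"""


def sdt_tag(sdt):
    """Return the w:tag value of a content control, or None if it has none."""
    tag = sdt.find("w:sdtPr/w:tag", NS)
    return tag.get(f"{{{W}}}val") if tag is not None else None


def nested_controls(xml_text):
    """Map each containing control's tag to the tags of the controls inside it."""
    root = ET.fromstring(xml_text)
    result = {}
    for outer in root.iter(f"{{{W}}}sdt"):
        # iter() also yields the element itself, so skip it.
        inner = [sdt_tag(s) for s in outer.iter(f"{{{W}}}sdt") if s is not outer]
        if inner:
            result[sdt_tag(outer)] = inner
    return result


print(nested_controls(SAMPLE))
```

Controls without a w:tag would all land under the key None, so a real tool would probably want to key on something else (the w:id, perhaps).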

It is very, very important (to me) to see nested content controls in Design View (above). However, most writing about this subject shows them in print/layout view (below):

Without the news in Eric’s article, I would be essentially doomed. Yes, ‘doomed’ is a strong word so let the research of Peter Sefton help me be a bit more articulate. He has a 2008 article entitled “Embedding metadata and other semantics in word processing documents” and the title speaks clearly to me. Modern word processing file formats need a standard way to store metadata. And, no, there is no quiet, elegant Open Source program out there that saves the day. Anyone out there who considers their documents first-class entities for any data management system cannot dismiss Word 2010 with a bunch of Microsoft player-hating. I keep trying to get rid of Word and I keep going back.

BTW: In case you can’t get that Peter Sefton article, try the slide deck “Embedding Metadata In Word Processing Documents” (or the PDF).

Putting Together Open XML and VSTO

My limited research informs me that Eric White has gone the longest way toward consistently (almost daily at times) and explicitly applying contemporary .NET technologies with Microsoft Office. Surely Eric would suggest that he deals in Microsoft Office file formats, not Office itself (VSTO). Moreover, Eric might say that he had very little to do with the Document Reflector, arguably the most important tool written for the VSTO world seen through the lens of Open XML.

The Document Reflector is part of version 2.0 of the Open XML SDK, bundled in a GUI application called “Open XML SDK Development Productivity Tools.” BTW: since I am unknown for “beating up” on Microsoft’s Brian Jones, it must be said that Brian Jones mentions Document Reflector more than Eric White, according to my last search.

Hey, Eric is a busy guy—he’s been writing or pointing us to articles like:

Transforming Flat OPC Format to Open XML Documents: Eric White makes no mention of VSTO in this article, yet this is the one that suggests (to me) how to use the full power of the Open XML SDK inside of Microsoft Word. The official priority, by the way, appears to be that Open XML tools are written for processing documents outside of Word (for massive, long-awaited, server-based solutions).
The Flat OPC Format: “Note that the Flat OPC format is not the same as the ‘Word 2003 XML Document’ format.  Those documents have a schema that is very different from the Flat OPC format.”
Using Open XML to Improve Automation Performance in Word 2010 for Large Amounts of Data

“The Range.WordXml object returns a Flat OPC XML document for that range as a string. You use this to prepare an in-memory package so that your code can access necessary parts such as the main document part, the styles part, and the numbering part.”

The Word object model is a moving target with regard to Open XML. The Range.WordXml object mentioned in the quote has been replaced by the Range.WordOpenXML property.
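To make the “in-memory package” idea a little more concrete, here is a minimal Python sketch (not the C#/VSTO code the article uses) that turns a Flat OPC string into a lookup of XML parts by part name. The sample string and the name parts_by_name are illustrative assumptions on my part; in a real add-in the string would come from Range.WordOpenXML.

```python
import xml.etree.ElementTree as ET

PKG = "http://schemas.microsoft.com/office/2006/xmlPackage"
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical, heavily trimmed Flat OPC string with two XML parts.
SAMPLE = f"""<pkg:package xmlns:pkg="{PKG}">
  <pkg:part pkg:name="/word/document.xml"
            pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml">
    <pkg:xmlData>
      <w:document xmlns:w="{W}"><w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body></w:document>
    </pkg:xmlData>
  </pkg:part>
  <pkg:part pkg:name="/word/styles.xml"
            pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml">
    <pkg:xmlData>
      <w:styles xmlns:w="{W}"/>
    </pkg:xmlData>
  </pkg:part>
</pkg:package>"""


def parts_by_name(flat_opc):
    """Return {part name: root element} for every XML part in a Flat OPC string."""
    root = ET.fromstring(flat_opc)
    parts = {}
    for part in root.findall(f"{{{PKG}}}part"):
        xml_data = part.find(f"{{{PKG}}}xmlData")
        # Binary parts (pkg:binaryData) are skipped in this sketch.
        if xml_data is not None and len(xml_data):
            parts[part.get(f"{{{PKG}}}name")] = xml_data[0]
    return parts


parts = parts_by_name(SAMPLE)
main = parts["/word/document.xml"]
print([t.text for t in main.iter(f"{{{W}}}t")])
```

With the parts in a dictionary, the main document part, the styles part and the numbering part (when present) are each one lookup away.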

Transforming Open XML WordprocessingML to XHtml: A “map” listing 18 articles on the subject of Open XML and XHTML. Truly groundbreaking for Microsoft!

Open XML for Word 2010 VSTO links:

Open XML, the “Custom XML” litigation and Content Controls

So, I’ve talked about what appears to be my “Custom XML” problem earlier. It may be the right place and time to add a few flippant remarks.

Microsoft’s recognition of this Texan ruling lies in “Utility to manage custom XML markup feature availability for customers outside the United States and its territories”; the title speaks for itself. Articles like “Associating Data with Content Controls” from the TechNet world (Gray Knowlton) go deeper into this “Custom XML” issue (and back to Eric White).

Eric White is my personal replacement for Brian Jones…

Okay! The title does sound “negative” but I’m the dude that wrote “Good Stuff about Brian Jones” in 2007. The issue here is that Eric White kicks the asses I’m thinking should be kicked—and I’m willing to childishly sing his praises at the expense of getting petty with Brian Jones.

Without Eric White, I would have no ‘sane’ way of dealing with this Microsoft litigation issue mentioned earlier. Here’s the plan:

  • Build a tiny, thin VSTO project that is a façade/gateway into the Open XML SDK. This stub would replace CleanXHTML.
  • Use the Open XML SDK to process Flat OPC strings obtained directly from any given Word Range object.
  • The ‘process’ is based entirely on Eric White’s code, the static method FlatToOpc, in “Transforming Flat OPC Format to Open XML Documents.”
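
The plan above leans on Eric White’s FlatToOpc, which is C#. Purely to illustrate the shape of that conversion, here is a rough Python sketch using only the standard library. It is not his code: the function name flat_to_opc, the sample string and the way [Content_Types].xml is assembled are my own assumptions, and I have not verified that Word opens the result. One known limitation: ElementTree rewrites namespace prefixes, so markup that refers to prefixes by name (mc:Ignorable, for example) may not survive the trip.

```python
import base64
import io
import xml.etree.ElementTree as ET
import zipfile
from xml.sax.saxutils import quoteattr

PKG = "http://schemas.microsoft.com/office/2006/xmlPackage"
CT = "http://schemas.openxmlformats.org/package/2006/content-types"

# Hypothetical, heavily trimmed Flat OPC string.
SAMPLE = f"""<pkg:package xmlns:pkg="{PKG}">
  <pkg:part pkg:name="/_rels/.rels"
            pkg:contentType="application/vnd.openxmlformats-package.relationships+xml">
    <pkg:xmlData>
      <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
        <Relationship Id="rId1" Target="word/document.xml"
          Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"/>
      </Relationships>
    </pkg:xmlData>
  </pkg:part>
  <pkg:part pkg:name="/word/document.xml"
            pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml">
    <pkg:xmlData>
      <w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
        <w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body>
      </w:document>
    </pkg:xmlData>
  </pkg:part>
</pkg:package>"""


def flat_to_opc(flat_opc):
    """Return the bytes of a ZIP package built from a Flat OPC string."""
    root = ET.fromstring(flat_opc)
    entries = []          # Default/Override entries for [Content_Types].xml
    rels_declared = False
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        for part in root.findall(f"{{{PKG}}}part"):
            name = part.get(f"{{{PKG}}}name")
            content_type = part.get(f"{{{PKG}}}contentType")
            xml_data = part.find(f"{{{PKG}}}xmlData")
            if xml_data is not None:
                # XML parts: serialize the single child of pkg:xmlData.
                data = ET.tostring(xml_data[0], encoding="utf-8", xml_declaration=True)
            else:
                # Binary parts: Base64 text inside pkg:binaryData.
                data = base64.b64decode(part.findtext(f"{{{PKG}}}binaryData"))
            package.writestr(name.lstrip("/"), data)
            if name.endswith(".rels"):
                if not rels_declared:
                    entries.append(f'<Default Extension="rels" ContentType={quoteattr(content_type)}/>')
                    rels_declared = True
            else:
                entries.append(
                    f"<Override PartName={quoteattr(name)} ContentType={quoteattr(content_type)}/>"
                )
        content_types = (
            f'<?xml version="1.0" encoding="UTF-8"?><Types xmlns="{CT}">'
            + "".join(entries)
            + "</Types>"
        )
        package.writestr("[Content_Types].xml", content_types)
    return buffer.getvalue()


docx_bytes = flat_to_opc(SAMPLE)
print(zipfile.ZipFile(io.BytesIO(docx_bytes)).namelist())
```

Building the package in a BytesIO keeps the sketch free of temp files; writing docx_bytes to disk is what would produce the little OPC documents mentioned below.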

My little OPC documents will litter the writeable temp folders as the Open XML SDK does its magic. One little reward for this revitalized interest in Open XML in general, and the .DOCX file format in particular, is stumbling upon the way to get WPF to work in Office Solutions (VSTO):

Update: actually the two WPF articles mentioned above are not necessary to get WPF working with VSTO. All that is needed is a reference to a WPF Window object inside the VSTO project or as an external reference (my preference). More on this later…