gogreennomad.blogg.se - Why does word 13 default to microsoft open xml converter


I have found the trick to be to just cut to the chase by extracting just what you need: essentially paragraphs, plus tables, which convert pretty straightforwardly to HTML tables. I demand that my source documents are built with Styles, and when it's just direct formatting (bold, italics, font size, stuff like that), I will not try to preserve all that exactly. I care about content, and the HTML formatting can be rather different.

So, this is all fairly doable with XSLT, especially with the old Word XML. However, with docx there is one major loss of a really useful feature: the wx namespace.
- which gives you the section heading numbering strings for numbered sections.
- wx:sub-section - which you can transform to elements to have nested sections instead of a flat list of headings and paragraphs.

I find the reconstruction of the section numbers in particular an immensely hard task if I want to do it correctly. The principles are described in Wordprocessing Numbering, Levels & Lists, and the principle is not hard to understand. But it is pretty hard to implement: you have to chase through levels of styles and their w:basedOn parent styles, concrete number formats and abstract number formats until you have really gathered the number format, and then you must also keep track of the counters for all the levels so that you have the number for each level to format. I have done this sort of inheritance scheme in XSLT, and it is even fun to do, but it is hard and would take me several days, time which I don't have.

The recovery of the nesting levels (wx:sub-section) is also non-trivial, and you have to sort of break out of normal XSLT workflows to make that happen. I have done such things too, but it's another few days I'd need to invest.

I often wonder when people say "oh, that wx namespace has been dropped, because the developers understand that it is redundant". Yeah, but I doubt most of the people who say that so lightly have ever done these transformations. I think docx is designed to be obtuse so that most of us foot-soldiers are intimidated and that the software companies like Microsoft and that Aspex Words, etc.
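
To make the section-number reconstruction discussed above a bit more concrete, here is a minimal sketch in Python (not the XSLT the text talks about). It assumes the styles and numbering definitions have already been pulled out of the docx into plain dictionaries. The names styles, nums, abstract_nums, resolve_num_pr and Numberer are made up for this illustration. It only handles a few number formats and ignores start values, level overrides and restart rules, so treat it as an outline of the idea rather than a complete implementation.

```python
def resolve_num_pr(style_id, styles):
    """Walk the w:basedOn chain until a style carrying numbering properties is found.

    `styles` is a hypothetical, pre-extracted mapping:
    styleId -> {"basedOn": parentId, "numId": ..., "ilvl": ...} (all keys optional).
    Returns (numId, ilvl), or None if no style in the chain is numbered.
    """
    seen = set()
    while style_id is not None and style_id not in seen:
        seen.add(style_id)  # guard against circular basedOn references
        style = styles[style_id]
        if "numId" in style:
            return style["numId"], style.get("ilvl", 0)
        style_id = style.get("basedOn")
    return None


def format_counter(value, num_fmt):
    """Format one counter; only a few w:numFmt values are covered in this sketch."""
    if num_fmt == "lowerLetter":
        return chr(ord("a") + (value - 1) % 26)
    if num_fmt == "upperLetter":
        return chr(ord("A") + (value - 1) % 26)
    return str(value)  # "decimal" and, as a simplification, everything else


class Numberer:
    """Keeps one counter per level for each concrete numbering instance."""

    def __init__(self, styles, nums, abstract_nums):
        self.styles = styles                # styleId -> style properties
        self.nums = nums                    # numId -> abstractNumId
        self.abstract_nums = abstract_nums  # abstractNumId -> list of level definitions
        self.counters = {}                  # numId -> list of counters, one per level

    def label(self, style_id):
        """Return the numbering string for the next paragraph with this style."""
        resolved = resolve_num_pr(style_id, self.styles)
        if resolved is None:
            return ""
        num_id, ilvl = resolved
        levels = self.abstract_nums[self.nums[num_id]]
        counters = self.counters.setdefault(num_id, [0] * len(levels))
        counters[ilvl] += 1
        # A new item on this level resets all deeper levels.
        for deeper in range(ilvl + 1, len(counters)):
            counters[deeper] = 0
        # Fill the %1, %2, ... placeholders of the level text.
        text = levels[ilvl]["lvlText"]
        for i in range(ilvl + 1):
            text = text.replace(f"%{i + 1}",
                                format_counter(counters[i], levels[i]["numFmt"]))
        return text
```

Calling label() once per paragraph, in document order, yields the heading number strings. A style that only inherits its numbering through w:basedOn is resolved by the loop in resolve_num_pr.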

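The recovery of nested sections from a flat list of headings and paragraphs, which wx:sub-section used to provide, can be sketched with a simple stack. This is a Python illustration under assumptions: the input is taken to be a list of (level, text) pairs that has already been computed, with level 0 for body text and 1, 2, ... for heading levels. The function name nest_sections and the dictionary layout are hypothetical.

```python
def nest_sections(blocks):
    """Turn a flat list of (level, text) pairs into nested sections.

    level >= 1 marks a heading of that outline level; level 0 marks body text.
    Returns a root dictionary with "content" (texts) and "sections" (children).
    """
    root = {"title": None, "level": 0, "content": [], "sections": []}
    stack = [root]  # currently open sections, innermost last
    for level, text in blocks:
        if level == 0:
            # Body text belongs to the innermost open section.
            stack[-1]["content"].append(text)
            continue
        # Close every open section at the same or a deeper level.
        while stack[-1]["level"] >= level:
            stack.pop()
        section = {"title": text, "level": level, "content": [], "sections": []}
        stack[-1]["sections"].append(section)
        stack.append(section)
    return root
```

The stack is the part that does not map naturally onto a template-per-element XSLT workflow, which is presumably why the text calls this step non-trivial there.
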

I have done this with the older Word XML output. I also did some study comparing the old Word XML with the new docx format. The fact that docx is a multi-file archive is not a problem for me, because I use Saxon XSLT running in Java and I can use jar file URLs to open the word/document.xml file and from there get to all the other files with the XSLT document() function.
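
As a rough illustration of why the multi-file archive is not an obstacle, here is a small sketch that opens a part of a docx directly and pulls out just the paragraph text. It uses Python's standard zipfile and ElementTree modules instead of the Saxon and jar-URL setup described above, so it is a stand-in for the idea and not the same technique. The function names read_part and paragraph_texts are made up for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace used in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_part(docx, part_name):
    """Return the raw bytes of one part (e.g. 'word/document.xml') of a docx archive.

    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as archive:
        return archive.read(part_name)


def paragraph_texts(document_xml):
    """Return the plain text of every w:p in document order,
    including paragraphs that sit inside table cells."""
    root = ET.fromstring(document_xml)
    return [
        "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        for p in root.iter(f"{{{W_NS}}}p")
    ]
```

For a file on disk this would be called as paragraph_texts(read_part("report.docx", "word/document.xml")), where report.docx is a placeholder name. Other parts such as word/styles.xml or word/numbering.xml can be read the same way.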