Open XML Translator for Microsoft Word Available

Posted by Zonk on Fri Feb 02, 2007 05:06 PM
from the translate-the-night-away dept.
narramissic writes "The first phase of a Microsoft-funded project to create software that can convert Microsoft Word documents between Open XML and Open Document Format (ODF) has been completed. As a result, the Open XML Translator is now available for download in version 1.0 from SourceForge.net. A ComputerWorld article details the history of the project, discussing the work of companies like CleverAge and AztecSoft, as well as community efforts to bring this project to realization."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Aw... (Score:5, Funny)

    by darkhitman (939662) on Friday February 02 2007, @05:11PM (#17866186)
    Please no clippy, please no clippy...

    "It looks like you're trying to convert to a non-Microsoft proprietary format. I can't let you do that, Dave"
  • Relation to Linux? (Score:4, Insightful)

    by Anonymous Coward on Friday February 02 2007, @05:13PM (#17866204)
    And how's this related to Linux? It is just an ODF - OpenXML converter for Windows.
    • by mtenhagen (450608) on Friday February 02 2007, @05:29PM (#17866486) Homepage
      Software Requirements

      Before installing the add-in, make sure you have one of the following...

              * Microsoft Word XP
              * Office Compatibility Pack
              * .NET framework 2.0

      or

              * Microsoft Word 2003
              * Office Compatibility Pack
              * .NET framework 2.0*

      or

              * Word 2007 with .NET Programmability Support activated
              * .NET framework 2.0*

      Minimum Software Requirements

      To compile the source distribution, you will need Microsoft Visual Studio 2005.
      • by albalbo (33890) on Saturday February 03 2007, @04:36AM (#17871372) Homepage
        Michael Meeks made a version of this converter available which compiles using mono, see entry 2007-01-29 on http://www.gnome.org/~michael/ [gnome.org] .

        Realistically, there's no reason it even needs to be in C# - the various bits of wrapper could be rewritten into other languages, and the main work is done by an XSLT. The OpenDocument Fellowship might include a similar tool in future tool sets, translated to be a bit more native.
        • That's just for the Word plugin; the command line tool will merely require you to install the MS patent litigation timebomb known as "Mono". Doubtless there are thousands of Steve Irwin types who won't flinch at being asked to insert their member in the crocodile's mouth, but common sense should prevail for the rest of us.

          Mono is built from Shared Source code (see: http://www.microsoft.com/sharedsource/ [microsoft.com]) and is perfectly legally licensed to the Mono team. Microsoft cannot open lawsuits against people using code under terms Microsoft and the developers mutually agreed to.

  • Why is this such a big thing considering that OpenOffice has the ability to import from and export to MS Word format? It even allows you to email the document in Word format without having to explicitly save it in that format.
    • The idea is that it would let Microsoft Word users do the conversion, and save their documents in ODF, rather than leaving them in DOC and requiring OpenOffice users to do the conversion.

      The big difference is which format the documents get stored in. If they're being stored in DOC, then you're still mostly at the mercy of Microsoft; it's easy for someone to open the document in some new version of Word, save it, and silently move it into some new MS-created "binary blob" format, breaking backwards compatibility.

      So basically, a converter would let states like Massachusetts start to move away from DOC as the de facto standard format for electronic documents. They'd probably still use it as an editing format, because I don't see them tossing Word for OO.org anytime soon, but it would help get rid of the huge "silos" of DOC stuff that's sitting around, getting silently migrated from one version of Microsoft's formats to the next.
        • Re: (Score:2, Informative)

          It's open source. If there is some piece of code that causes it to produce bad ODF files, you can fix it yourself, and make the fix available for anyone else. If they refuse to merge it back into the main branch, you can fork it and then fix it, and again make it available to anyone else.

          As for the profit motive, more and more governments are starting to talk about mandating non-proprietary file formats. Microsoft doesn't want to include this in Word, obviously, but if a city, state, or even nationa
        • Re: (Score:2, Interesting)

          Just ask yourself, what profit motives does Microsoft have in making this work?

          I'll second that emotion.

          This initiative is at odds with Microsoft's decision to use Open XML for the Office suite. If they really think folks are going to be stuck with Open XML-format Office documents that they need converted into ODF (say, for distribution reasons) what is it that stops them from saving the documents as ODF directly out of the Office app?

          I think Microsoft is feeling a little shaky on this issue. They'
    • IIRC (correct me if i'm wrong), but i think OpenOffice can only handle the word-processor part of Open XML, not the rest of it (spreadsheet, presentation, etc).

      and, since many large organizations/governments have already switched to ODF, those groups wouldn't really be able to switch back to Microsoft without a conversion tool, preferably built into Office. this could be MS's attempt to get them to switch back.
      • IIRC (correct me if i'm wrong), but i think OpenOffice can only handle the word-processor part of Open XML, not the rest of it (spreadsheet, presentation, etc).

        OpenOffice.org is a complete office suite, comprising Word Processor (Writer), Presentations (Impress), Spreadsheets (Calc) and Vector Graphics/Diagrams (Draw). The Open Document Format (ODF) is able to encapsulate all these document types.

        Whether the Office Open XML (OOXML) to ODF convertor can handle all of these transformations, I don't know. I'm not holding my breath for a complete converter from OOXML to ODF either - 6000+ pages of OOXML spec is going to be hard to read, let alone code all the diff

        • yeah, i'm an idiot. in my head "Microsoft Word" and "Microsoft Office" are interchangeable. didn't occur to me that they were just talking about Word, not all of Office.
    • Why is this such a big thing considering that OpenOffice has the ability to import from and export to MS Word format?

      Some governments have conflicting directives including support for ODF and a contract to buy MS Word. Many tools designed to allow the blind to use computers work only with specific products, like MS Word. As a result, some governments asked for a converter that would move documents back and forth between these formats and for some reason they asked that MS not contribute or control the co

    • When OpenOffice imports the Word document and it does not look exactly like it did in Word, people will blame OpenOffice. You can cry till you are hoarse that it was because of bugs in Word, and no one would even listen to you.

      Now you export a Word doc to ODF using a Microsoft Certified Export tool, and if the user complains that it does not look exactly as it did in Word, we can 1. Blame Microsoft for insincere export. 2. Read it into Word and see if the document survives round-tripping. If it does not, we h

  • by User 956 (568564) on Friday February 02 2007, @05:13PM (#17866214) Homepage
    The first phase of a Microsoft-funded project to create software that can convert Microsoft Word documents between Open XML and Open Document Format (ODF) has been completed.

    Unfortunately, when you run it, it starts off with, "Hi! It looks like you're trying to convert a Microsoft Word Document! Would you like some help?"
  • by mandelbr0t (1015855) on Friday February 02 2007, @05:13PM (#17866220) Journal
    Anyone else feel chills? Remember how good the Import/Export of .WPD files was in Word? I'm guessing that this will be of similar quality. At least it's OSS. But I wouldn't hold my breath waiting for this to bridge the gap between ODF and OpenXML. Best is to use OpenOffice and save as .DOC if you have to. Here's the Microsoft Press Release [microsoft.com] about it.
    • At least it's OSS
      Maybe I'm reading too much between the lines here, but it said the project was OSS *and* either sponsored or blessed by Microsoft. Again, maybe it's nothing, but I see MS setting a trap here so that anybody with DOC and ODT conversion becomes a target for future lawsuits by MS (kinda like the SCO deal - "Oh, you must have our code, so we're gonna sue the crap outta you")...
    • This MS-sponsored converter converts between ODF and OOXML, both of which are publicly spec'ed. So it can be *perfect* in theory.
      OO.o's conversion converts between ODF and OO.o's best guess as to what the binary .DOC format is. And OO.o's best guess is pretty poor for anything but the simplest of documents.
  • by Ace905 (163071) on Friday February 02 2007, @05:17PM (#17866266) Homepage
    Can I ask, since the article doesn't seem to really explain -- what good is this? I know converting to XML is supremely important _in theory_ so that your documents can be easily parsed and used among other software applications - but say for example:

    I have a document
    I convert it to XML

    then what? Is this excellent news in theory, or is there a demand for this?

    I honestly don't know, I'm not claiming there isn't. Please tell me.

    ---
    this isn't xml [douginadress.com]
    • by Iphtashu Fitz (263795) on Friday February 02 2007, @05:21PM (#17866330)
      I have a document
      I convert it to XML

      then what?


      The latest and greatest(?) versions of the MS Office programs save natively in XML. This converter lets you convert to ODF, which lets you read the files into OpenOffice on any operating system, or any other application that supports ODF. It basically lets you get out from under the MS proprietary format and into an open standard.
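To make the "saves natively in XML" point concrete: both ODF and Open XML files are ZIP packages holding XML parts. The sketch below is only an illustration using Python's standard library. It builds a toy ODF-like package in memory (not a complete, valid .odt) and reads the paragraph text back out of content.xml. The helper names make_toy_odt and read_paragraphs are made up for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs used by ODF text documents.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def make_toy_odt(paragraphs):
    """Build a toy ODF-like ZIP package in memory (hypothetical helper).

    A real .odt also needs a manifest, styles and more; this only shows
    the "ZIP of XML parts" idea.
    """
    ET.register_namespace("office", OFFICE_NS)
    ET.register_namespace("text", TEXT_NS)
    root = ET.Element(f"{{{OFFICE_NS}}}document-content")
    body = ET.SubElement(root, f"{{{OFFICE_NS}}}body")
    text = ET.SubElement(body, f"{{{OFFICE_NS}}}text")
    for paragraph in paragraphs:
        ET.SubElement(text, f"{{{TEXT_NS}}}p").text = paragraph

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        zf.writestr("content.xml", ET.tostring(root, encoding="utf-8"))
    return buf.getvalue()


def read_paragraphs(package_bytes):
    """Open the ZIP package and return the text of every text:p element."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


if __name__ == "__main__":
    package = make_toy_odt(["I have a document", "I convert it to XML"])
    print(read_paragraphs(package))
```

Any tool that can unzip a file and parse XML can get at the text this way, which is the practical sense in which the format is "open", whatever one thinks of the converter itself.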
      • Ok, my mistake - it converts TO ODF from XML ; but aside from the incredible sense of freedom you experience getting out from under the crushing weight of MS proprietary format (haha) -- does this actually, at the moment, open up any new avenues for anybody?

        Like someone else commented, OpenOffice can already import and export to Microsoft Word. So is this really a practical utility, or does it just make everybody happy that hates Microsoft but still actually uses them?

        ---
        Open What? [douginadress.com]
        • The problem is that OO will screw the formatting for anything that's a little more complex. If whenever you open something, everything is out of place, or you can't be sure that somebody will be able to open the document how you saved it, it's best just to use MS Office.

          It means that now, I will be able to send ODT's to people who seem to think OO is somehow inferior to the "real" MS Office.
          • Re: (Score:3, Interesting)

            LordVader717 wrote as part of a post:

            The problem is that OO will screw the formatting for anything that's a little more complex. If whenever you open something, everything is out of place, or you can't be sure that somebody will be able to open the document how you saved it, it's best just to use MS Office.

            The problem is, this is not even viable in a pure MS Word environment. An often-heard complaint is that MS Word documents will look different on different computers, even if both users are using MS W

            • Even other microsoft products will fail badly...
              The mac versions of word have many compatibility problems with the windows versions, and just try loading/saving word documents with ms publisher (publisher even comes bundled with the more expensive versions of office)
              • Bert64 wrote:

                Even other microsoft products will fail badly...
                The mac versions of word have many compatibility problems with the windows versions, and just try loading/saving word documents with ms publisher (publisher even comes bundled with the more expensive versions of office)

                Maybe the solution is to move away from the entire WYSIWYG-while-editing method of document preparation, to an instruction-based system like LaTeX where the document will print on paper exactly as instructed regardless of the

    • XML (and open for that matter) is just a synonym for "good." So read "OOXML format" simply as "(MS) Office format." The fact that it is sort of based on XML is irrelevant.
    • Can I ask, since the article doesn't seem to really explain -- what good is this? I know converting to XML is supremely important _in theory_ so that your documents can be easily parsed and used among other software applications - but say for example:

      I have a document I convert it to XML then what? Is this excellent news in theory, or is there a demand for this? I honestly don't know, I'm not claiming there isn't. Please tell me.


      Can some one please correct me if I'm wrong, but I thought MS's new format was
    • You save the document in XML. 6 months down the road, you think, gee, Bob edited a document I worked on, but I don't remember which one. Because XML is basically a text flat file, you do a search for edits to your documents done by Bob. It checks all documents, including the files done in Word, WordPerfect that the legal team uses, OO that the engineers use, etc. Or, again, because it's not a proprietary format, you send the same document out to 10 people to edit.. they all email back changes.. very, ver
      • What are you talking about?

        Say I create a document in Word 2007. Say it's my resume. I would save it "resume.docx". Say I create it on OpenOffice. I would name it "resume.odt". Six months from now, I would see one of those files and know what it is based on the name, extension, and icon.

        It's not any different than going by doc, xls, ppt, etc now. Why would it being a zipped XML make a difference? To a user the only difference is the extension and what programs can open/save it.

        Open Office Extensions:
      • I'm not going to argue semantics because you're right - but I know what XML is. My _point_ once again is what is the practical use? OpenOffice and KWord import and export from Microsoft Office format.

        Speaking of something end users shouldn't care about, my whole question is, why care about this?

        ---
        Definitely don't care about this [douginadress.com]
          • Anonymous Coward wrote and included with a post:

            OpenOffice and KWord import and export from Microsoft Office format.

            Imperfectly. Microsoft have a better track record (though not perfect) with writing software that can understand Word documents.

            One of the problems with trying to write a program that can accurately read and write MS Word files is the way that information is stored in an MS Word file. The article "In Depth With StarOffice Filters" by Brian Proffitt http://www.linuxplanet.com/linuxpla [linuxplanet.com]

  • by Anonymous Coward
    Surprised? Seems Microsoft just see this as another way to infect the better platforms with their CLR, an attempt to start the countdown on the patent timebomb.

    If you're writing cross platform code at least have the decency to use C, C++ or Java, requiring a CLR is insulting.
    • Surprised? Seems Microsoft just see this as another way to infect the better platforms with their CLR, an attempt to start the countdown on the patent timebomb.

      If you're writing cross platform code at least have the decency to use C, C++ or Java, requiring a CLR is insulting.


      OMG! Mono & Rotor! WTFBBQLOL!?!?!
  • by schwaang (667808) on Friday February 02 2007, @05:25PM (#17866412)
    From the Microsoft press release:

    The second phase of the translator project, including translators for Spreadsheet (Microsoft Office Excel®) and Presentation (Microsoft Office PowerPoint®), will begin in February. Regular customer technology previews will be posted to SourceForge.net beginning in May 2007, and the final versions are scheduled to be available for customers in November 2007.
    One thing I'm wondering is how to automatically keep the OpenXML translator up to date on windows. If you install it from the MS Office Downloads [microsoft.com] site, will WindowsUpdate just keep it updated for you?
    • No...
      windows update only really updates what comes bundled with the os, and not any additional apps, even if those additional apps came from microsoft
  • WTF?! (Score:3, Funny)

    by TheWoozle (984500) on Friday February 02 2007, @05:27PM (#17866448)
    I just tried to use it, and here's what I got:

          This is not a winning document. Better luck next time.
     
  • What's the point? (Score:4, Insightful)

    by protactin (206817) <(chris) (at) (orr.me.uk)> on Friday February 02 2007, @05:28PM (#17866468) Homepage

    What's the point if the add-in doesn't allow ODF to be set as the default file type, or even used via the Save As menu [robweir.com]?

    Hopefully the Word "interop" API actually allows for this sort of thing to be properly integrated.

    • Re: (Score:2, Informative)

      Actually if you look at where it appears it's right off the root of the File menu. So it stands out more than Save As, which needs to be chosen; then subtype chosen. It looks (to my mind) to be more important in the menu structure.
    • The point? Being able to pretend to support ODF while not actually doing so in any meaningful sense, of course.
      • the last time I tried to open the crappy new .DOCX with Open Office, it did not work.

        Cos OpenOffice doesn't support it yet? Hardly Microsoft's fault.

        (Nor, on the other hand, is it OpenOffice's fault. It's still a new format, for christ's sake.)

  • A while back a state IT Department (I think Massachusetts) decided to only use open-source document formats and talked back and forth with Microsoft. The head of the IT Department (or something similar) privately asked some of Word's programmers, who said an odf/xml feature would be trivial to add, but MS flatly refused to make a plugin for Office to convert to odf/xml, even though it meant losing the state's patronage.

    Microsoft is really determined to strangle open formats.
  • by LibertineR (591918) on Friday February 02 2007, @06:04PM (#17866942)
    IT Admin = "Boss, we can move to Open Office now, we have an XML converter for MS Word!"


    CIO = "What is this 'ribbon' thing I keep hearing about?"

    IT Admin = "Boss, we don't need the ribbon, it's just Microsoft hype."

    CIO = "Have you seen the ribbon? Bring me the ribbon!"

    IT Admin = "Khaaaaaaaannnnn!"

  • I can't wait to see what Vista does with this. "Premium content detected, doc files are proprietary content I have notified Microsoft about your attempt to circumvent the DMCA. Please stay where you are and make no sudden moves".
  • It's XML, but... (Score:3, Interesting)

    by bbtom (581232) on Friday February 02 2007, @07:15PM (#17867776) Homepage Journal
    Try reading Microsoft's documentation for OOXML. It's 6,000 pages long. Seriously. This is a great Microsoft PR stunt - yes, you've gotten your data into XML format, but the XML format is so complicated that only the Microsoft programmers who wrote it can actually understand it. Part of the point about XML is interoperability. There's no way that sane people are going to read a 6,000 page Microsoft specification and write an XSLT to convert Microsoft OOXML into a simpler and saner format. In short, this will not mean any competition with Microsoft. They buy PR in the geek community by saying "Office is going XML! Open data! Whee!" and making an XML format that's so complicated that nobody would ever use it. That's a pretty smart move. And it's a pretty dumb move on the part of ECMA. Congratulations on just giving your dignity away by signing off on a specification that's about nineteen times longer than War and Peace...

    No document in living history is ever going to be so complicated that it needs to be in a format whose specification is 6,000 pages long. Part of the point about XML was that we should be setting up simple, domain-specific markup languages and extending already existing markup languages. OOXML is bad because it's needlessly complicated and obscure. Having visited the OOXML website, I'm missing a lot of things I expect. First, I'm missing schema. If these guys are serious about XML, where are the XSD/RNG schema? Secondly, where are the cross-platform translators - ie. XSLs? I'm missing some kind of high-level summary of how I'm supposed to parse the XML. If the only way of doing anything with OOXML is a closed, black-box Microsoft converter, then we still haven't really got anywhere.

    Well, I'm breaking the cycle. All my documents are going to be either ASCII or a standard, non-obscure XML format like XHTML. Or something home-brewed and simple that can be easily transformed using XSL and XSL-FO. Screw Microsoft's phony attempt at interoperability. The Internet is interoperable by design. (X)HTML is interoperable by design. Let's prove to them that we mean interoperability by sticking to simple, sensible, semantically-based and scalable principles.
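As a rough illustration of the "home-brewed and simple" idea: the sketch below maps a made-up three-element vocabulary (note, title, para) onto XHTML. It is an assumption-laden example, not anything from the post. It uses Python's xml.etree.ElementTree rather than XSL, because the Python standard library has no XSLT processor; an XSLT stylesheet would express the same mapping declaratively.

```python
import xml.etree.ElementTree as ET

# Hypothetical home-brewed vocabulary: <note>, <title>, <para>.
SOURCE = """<note>
  <title>Meeting</title>
  <para>Bring the ODF converter.</para>
  <para>Test round-tripping.</para>
</note>"""

# Mapping from the made-up element names to XHTML element names.
TAG_MAP = {"title": "h1", "para": "p"}


def to_xhtml(xml_text):
    """Convert the toy vocabulary to a minimal XHTML string."""
    src = ET.fromstring(xml_text)
    html = ET.Element("html", {"xmlns": "http://www.w3.org/1999/xhtml"})
    body = ET.SubElement(html, "body")
    for child in src:
        target = TAG_MAP.get(child.tag)
        if target is None:
            continue  # unknown elements are skipped in this sketch
        ET.SubElement(body, target).text = child.text
    return ET.tostring(html, encoding="unicode")


if __name__ == "__main__":
    print(to_xhtml(SOURCE))
```

The whole mapping fits in a two-entry dictionary, which is the kind of simplicity the post is asking for when it compares this to a 6,000 page specification.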
      • The format is so long, because they created new methods of storing data when there are already standard formats in existence, for instance images... MSXML defines an internal way of storing images, while opendocument specifies the use of external formats (png, jpeg etc) the documentation for which is already publicly available, and widely implemented.
          • Yes, they refuse existing standards like MathML and SVG and reinvent the wheel instead.
            If there already exists a standard that is capable of performing some of the functions you require, why not use it? This means that people who have already written code to support these functions won't have wasted their effort, and people implementing OpenDocument will be able to find existing code and expertise to implement several parts of the standard rather than having to write it themselves.
  • Word supports "art borders". Users can choose from a wide selection of images selected from 1980's western culture. Does this get exported properly?

    If I can't get the candycorn borders, no deal!
  • I installed it and tried it in Word 2003 on a small, simple document. I noticed that when the result file was loaded into Writer, it looked at first glance the same as if Writer had loaded a .doc file - e.g. white page background turns red, numbered section headers lost the numbering. I'll have a closer look and do the round-trip test, but it seems that the Windows team has exactly the same difficulties that OOo has in the conversion? Someone(TM) should perhaps compare some code here - not to point any bl
    • Please define your usage of the word "reasonable".

      To the level at which I personally use MS Office as a (non-Windows/MS) computer professional of some 25 years, OO has all the features that I need from a word processor and spreadsheet program. I also do quite a lot of training slides in Powerpoint and find that OO's Impress imports those pretty well, albeit with some minor adjustments afterwards. I have some simple databases in Access but have not yet checked OO's ability to import those. I therefore find