A completely new version of Chem4Word
In the last post, we showed you some of the improvements we’re making in the new version of the Chem4Word add-in, and how they improve the experience of our users. In this post, we’d like to share some of those changes from a developer’s perspective.
Changes to the chemistry storage model!
Chem4Word currently stores everything in Chemical Markup Language (CML). We aren’t going to stop doing this. CML makes all the information readily available to both software and humans. This is a key advantage that Chem4Word has over other tools.
We looked at how the CML was being stored in the document. Currently, Chem4Word wraps the CML in a rather complex way before presenting it to you, the user.
The current storage model is directly coupled to Word objects by their internal ID. This is what made it impossible to copy and paste chemistry between sections or documents.
We have developed a new, loosely coupled storage model. The major benefit is that you can now cut and paste structures directly within a document, or even between documents, without going through the Navigator. The Navigator will still be there, and we think you’ll like the changes we’ve made: as its name suggests, it now lets you navigate around the linked chemistry objects.
Better Performance!
Not only does the current version store the chemistry as CML (a dialect of XML), it also manipulates that CML directly. Every time we draw an atom or bond, we have to go back to the CML document. This clobbers rendering performance, and it gets even worse when you try to change anything. Every little change means that we have to traverse the XML document, locate the relevant information and rewrite it. XML is a rather verbose text format, so reading and changing this text incurs a big performance penalty.
So, in the new version, we built an intermediate layer of chemical objects that sits between the CML stored in the document and the visual chemistry itself. Instead of working directly on the CML, we load it into the object model, make our changes there, and then either save it back or draw it. No longer do we have to work directly with text describing the chemistry. The object model works as a chemist would imagine it would, with bonds connecting atoms.
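To make the idea concrete, here is a minimal sketch in Python of what such an intermediate layer could look like. It is not the Chem4Word object model itself: the class names (`Atom`, `Bond`, `Molecule`), the `load_molecule` helper and the sample molecule are all assumptions made for illustration. The only things taken from CML are the element and attribute names it parses (`atomArray`, `bondArray`, `elementType`, `atomRefs2`) and the CML namespace.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# A tiny hand-written CML fragment (ethanol skeleton, hydrogens omitted).
SAMPLE_CML = """<cml xmlns="http://www.xml-cml.org/schema">
  <molecule id="m1">
    <atomArray>
      <atom id="a1" elementType="C" x2="0.0" y2="0.0"/>
      <atom id="a2" elementType="C" x2="1.5" y2="0.0"/>
      <atom id="a3" elementType="O" x2="2.25" y2="1.3"/>
    </atomArray>
    <bondArray>
      <bond id="b1" atomRefs2="a1 a2" order="S"/>
      <bond id="b2" atomRefs2="a2 a3" order="S"/>
    </bondArray>
  </molecule>
</cml>"""


@dataclass(eq=False)
class Atom:
    """Hypothetical atom object; knows which bonds touch it."""
    id: str
    element: str
    x: float
    y: float
    bonds: list = field(default_factory=list, repr=False)

    def neighbours(self):
        # Follow each bond to the atom at its other end.
        return [bond.other(self) for bond in self.bonds]


@dataclass(eq=False)
class Bond:
    """Hypothetical bond object; holds references to real Atom objects."""
    id: str
    start: Atom
    end: Atom
    order: str

    def other(self, atom):
        return self.end if atom is self.start else self.start


@dataclass(eq=False)
class Molecule:
    atoms: dict  # atom id -> Atom
    bonds: list


def load_molecule(cml_text):
    """Parse the CML once, then work only with objects afterwards."""
    root = ET.fromstring(cml_text)
    mol = root.find("cml:molecule", CML_NS)
    atoms = {}
    for el in mol.findall("cml:atomArray/cml:atom", CML_NS):
        atoms[el.get("id")] = Atom(
            el.get("id"), el.get("elementType"),
            float(el.get("x2")), float(el.get("y2")),
        )
    bonds = []
    for el in mol.findall("cml:bondArray/cml:bond", CML_NS):
        start_id, end_id = el.get("atomRefs2").split()
        bond = Bond(el.get("id"), atoms[start_id], atoms[end_id], el.get("order"))
        # Wire the bond into both atoms so traversal needs no XML lookups.
        bond.start.bonds.append(bond)
        bond.end.bonds.append(bond)
        bonds.append(bond)
    return Molecule(atoms, bonds)


molecule = load_molecule(SAMPLE_CML)
middle_carbon = molecule.atoms["a2"]
print([atom.element for atom in middle_carbon.neighbours()])
```

Once the text has been parsed, a question such as "what is this atom bonded to?" is answered by following object references rather than by searching the XML again, which is the kind of saving described above.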
Initial results are…well, we were about to say ‘encouraging’, but ‘astounding’ would be more accurate. Small structures show a five-fold quicker rendering. Large structures, such as insulin, render up to one hundred and twenty times quicker! What used to be a matter of several minutes to render a large structure now takes a few seconds!
FlexDisplay
Because it’s now much easier and quicker to work with the chemistry directly, this opens up all sorts of exciting new possibilities. On-screen rendering is now much easier than it was, so we don’t have to rely on storing the bitmaps with the structure.
Using the magic of Windows Presentation Foundation (WPF), we’ve been able to replace the old grainy-style bitmaps used for previewing the chemistry with an up-to-date vector graphic. We call this new component the FlexDisplay.
You’ll be seeing a lot more of the FlexDisplay in the new version of the add-in. It’s used in the new Navigator (viz), the Library (which replaces the Gallery), and in the enhanced Search function. The FlexDisplay also shows atom and bond labels on hovering, helping you to identify them in the underlying CML.
The FlexDisplay is a reusable software component in its own right. So are the object model and rendering components. A Windows programmer can make their software chemically capable using these three tools, which are freely available under the permissive Apache License.
New Navigator and Library
We have completely overhauled the Navigator in this new version. Adding the FlexDisplay to the Navigator tool, along with clearer, Office-style buttons, makes the Navigator more user-friendly. All of this is done with data-binding to the document’s chemistry models, which means lightning-fast response as well as better graphics.
[Figure: Old Navigator | New Navigator]
Goodbye Gallery – Hello Library!
The old Gallery stored structures in a Word document template. This inefficient storage model also caused problems for users of the add-in, wiping out any customizations they had made to document styles. It was one of the drivers for us to make significant changes to the add-in.
The Library no longer uses document templates to store its structures. We now use a simple SQLite database to store the chemistry. This approach lets users edit structure names easily and search on them, so the Library can potentially store thousands of structures and you can still find the one you want. It also opens the way to storing metadata with the structures in future, such as text tags, and searching on that too.
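As a rough illustration of why a database suits this job, here is a minimal sketch using Python’s built-in `sqlite3` module. The table layout (`structures` with `id`, `name` and `cml` columns) and the helper functions are assumptions invented for this example; the actual Library schema is not described in this post.

```python
import sqlite3


def open_library(path=":memory:"):
    """Open (or create) a library database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS structures (
               id   INTEGER PRIMARY KEY,
               name TEXT NOT NULL,
               cml  TEXT NOT NULL
           )"""
    )
    return conn


def add_structure(conn, name, cml):
    cur = conn.execute(
        "INSERT INTO structures (name, cml) VALUES (?, ?)", (name, cml)
    )
    conn.commit()
    return cur.lastrowid


def rename_structure(conn, structure_id, new_name):
    # Renaming is a single-row update; the stored CML is untouched.
    conn.execute(
        "UPDATE structures SET name = ? WHERE id = ?", (new_name, structure_id)
    )
    conn.commit()


def search_by_name(conn, text):
    # SQLite's LIKE is case-insensitive for ASCII letters by default.
    rows = conn.execute(
        "SELECT id, name FROM structures WHERE name LIKE ? ORDER BY name",
        (f"%{text}%",),
    )
    return rows.fetchall()


library = open_library()
add_structure(library, "Benzene", "<cml/>")   # placeholder CML strings
toluene_id = add_structure(library, "Toluene", "<cml/>")
add_structure(library, "benzoic acid", "<cml/>")

rename_structure(library, toluene_id, "Methylbenzene")
for row in search_by_name(library, "benz"):
    print(row)
```

Renaming and searching become one-line queries, and adding something like a tags table later would not disturb the stored chemistry.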
Plug-in Architecture
This is the least visible improvement to Chem4Word. It is also the most important. We’ve restructured the program from the bottom up. Plug-in components now provide configurable searching, editing and rendering. If a new online data source comes along, or you want to incorporate a new editor, then it’s simply a case of writing a wrapper to predefined standards and installing it.
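The sketch below shows the general shape of a plug-in arrangement like this, in Python. Every name in it (`SearchPlugin`, `PluginRegistry`, `InMemorySearch`) is hypothetical and chosen only for illustration; it is not the Chem4Word plug-in interface. The point is simply that the host program depends on a small contract, and anything that fulfils the contract can be installed.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class SearchPlugin(ABC):
    """Hypothetical contract that every search plug-in must fulfil."""

    name: str

    @abstractmethod
    def search(self, query: str) -> list[str]:
        """Return the CML of every structure matching the query."""


class PluginRegistry:
    """The host keeps a registry and never needs to know concrete classes."""

    def __init__(self):
        self._plugins = {}

    def register(self, plugin):
        if not isinstance(plugin, SearchPlugin):
            raise TypeError("plug-in does not implement SearchPlugin")
        self._plugins[plugin.name] = plugin

    def get(self, name):
        return self._plugins[name]

    def names(self):
        return sorted(self._plugins)


class InMemorySearch(SearchPlugin):
    """Toy data source standing in for a wrapper around an online service."""

    name = "in-memory"

    def __init__(self, records):
        self._records = dict(records)  # structure name -> CML text

    def search(self, query):
        needle = query.lower()
        return [cml for name, cml in self._records.items()
                if needle in name.lower()]


registry = PluginRegistry()
registry.register(InMemorySearch({"Benzene": "<cml/>"}))
print(registry.names())
print(registry.get("in-memory").search("benz"))
```

Supporting a new data source in this sketch means writing one more subclass and registering it; nothing in the host code changes.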
We will publish these standards as part of the new version release. If you want a preview of this version, then please get involved in the beta testing. We want to make this the best version of Chem4Word yet!