If you saw our last post, you might have worked out that the new buttons allow us to draw simple chemical reactions.
Now you can write documents about the synthesis of caffeine from theobromine!
Chemistry Add-In for Microsoft Word
CML is Cool
by Andy Wright
by Clyde Davies
This is the first in (what I hope will be) a series of occasional postings. Targeted at developers mainly, these will cover some of the problems we come across, and how we solve them.
Chem4Word is about ten years old. It started out as a research project. Such projects typically explore new ways of solving problems. They don’t set out to optimize established approaches. When you keep adding new functions at the expense of making existing ones work better, you end up with cruft.
The cruftiest portion of Chem4Word was its in-memory chemistry handling. Chem4Word’s standard file format is Chemical Markup Language (CML), an XML dialect. There was no logical distinction between the file containing the information and its in-memory representation. The code simply loaded the CML as-is into an XDocument and then manipulated the XML directly. This was very wasteful, as it required the XML to be parsed every time we needed to manipulate something. XDocuments are fine for limited XML editing, but when you’re constantly hitting the document, the conversion overhead becomes huge.
So, we wrote a completely new, in-memory Model class in C#. The Model in essence represents a chemical drawing. A Model contains Molecules, which can contain other Molecule objects or Atom and Bond collections.
Atoms are the building blocks of Molecules. Each Molecule stores its Atoms in a dictionary, which lets us quickly retrieve an Atom by its label, such as ‘a1’. Bonds are held in a simple collection, and each Bond references its StartAtom and EndAtom by text label. So Bond ‘b1’ would link Atoms ‘a1’ and ‘a2’, for example.
Each Atom has to reference a specific Element. There is a limited number of these, stored in a single global PeriodicTable.
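To make the shape of the Model concrete, here is a minimal sketch of the structure described above. It is written in Python for brevity and is not the actual C# code: the class and field names (`Element`, `PERIODIC_TABLE`, `add_atom`, `add_bond` and so on) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Element:
    symbol: str
    atomic_number: int


# Hypothetical stand-in for the single global PeriodicTable (only a few entries).
PERIODIC_TABLE: Dict[str, Element] = {
    "H": Element("H", 1),
    "C": Element("C", 6),
    "N": Element("N", 7),
    "O": Element("O", 8),
}


@dataclass
class Atom:
    id: str
    element: Element  # shared reference into the periodic table


@dataclass
class Bond:
    id: str
    start_atom: str  # text label of the start atom, e.g. "a1"
    end_atom: str    # text label of the end atom, e.g. "a2"
    order: int = 1


@dataclass
class Molecule:
    atoms: Dict[str, Atom] = field(default_factory=dict)       # keyed by label
    bonds: List[Bond] = field(default_factory=list)            # simple collection
    molecules: List["Molecule"] = field(default_factory=list)  # nested molecules

    def add_atom(self, atom_id: str, symbol: str) -> Atom:
        atom = Atom(atom_id, PERIODIC_TABLE[symbol])
        self.atoms[atom_id] = atom
        return atom

    def add_bond(self, bond_id: str, start: str, end: str, order: int = 1) -> Bond:
        # Both labels must resolve to atoms already in this molecule.
        for label in (start, end):
            if label not in self.atoms:
                raise KeyError(f"unknown atom label: {label}")
        bond = Bond(bond_id, start, end, order)
        self.bonds.append(bond)
        return bond


@dataclass
class Model:
    molecules: List[Molecule] = field(default_factory=list)


# Usage: a two-atom C-O fragment, with bond 'b1' linking atoms 'a1' and 'a2'.
fragment = Molecule()
fragment.add_atom("a1", "C")
fragment.add_atom("a2", "O")
fragment.add_bond("b1", "a1", "a2")
model = Model([fragment])
```

Looking up `fragment.atoms["a1"]` is a single dictionary access, and every carbon atom points at the same `Element` object rather than carrying its own copy.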
Each Molecule also contains what we term emergent objects. So, a Molecule has a collection of Rings, and each Ring collects several Atoms. An Atom can be shared by more than one Ring.
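As an illustration of atoms being shared between rings, the sketch below records two fused six-membered rings and works out which atom labels belong to more than one. It is in Python rather than C#, the `Ring` class and `rings_by_atom` helper are hypothetical names, and it assumes the rings have already been detected; ring perception itself is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Ring:
    atom_ids: List[str]  # labels of the atoms that make up the ring


def rings_by_atom(rings: List[Ring]) -> Dict[str, List[int]]:
    """Map each atom label to the indexes of the rings that contain it."""
    membership: Dict[str, List[int]] = {}
    for index, ring in enumerate(rings):
        for atom_id in ring.atom_ids:
            membership.setdefault(atom_id, []).append(index)
    return membership


# A naphthalene-like skeleton: two six-membered rings fused along the a5-a6 bond.
ring_a = Ring(["a1", "a2", "a3", "a4", "a5", "a6"])
ring_b = Ring(["a5", "a6", "a7", "a8", "a9", "a10"])

membership = rings_by_atom([ring_a, ring_b])
shared = sorted(label for label, owners in membership.items() if len(owners) > 1)
```

Here `shared` contains only the two fusion atoms, `a5` and `a6`.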
This simple approach allows us to create compact in-memory representations of quite complex molecules. The XML representation, instead of being central, is now relegated purely to persistence roles. Previewing, in-document rendering, editing and many other functions now work on the Model. A dedicated CMLConverter class handles conversion to and from the CML format, typically when storing this in a Custom XML Part.
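The round trip between CML and an in-memory form can be sketched roughly as follows. This is a simplified Python illustration, not the real CMLConverter: the function names are made up, atoms are reduced to a label-to-symbol dictionary, and the CML namespace is deliberately omitted (namespaced documents would need qualified tag names). The `atomArray`, `bondArray`, `elementType` and `atomRefs2` names are standard CML.

```python
import xml.etree.ElementTree as ET


def cml_to_model(cml_text):
    """Parse a CML string once into plain in-memory structures."""
    root = ET.fromstring(cml_text)
    # Atoms: dictionary keyed by label, e.g. {"a1": "C"}.
    atoms = {a.get("id"): a.get("elementType") for a in root.iter("atom")}
    # Bonds: simple list of (id, start label, end label, order).
    bonds = []
    for b in root.iter("bond"):
        start, end = b.get("atomRefs2").split()
        bonds.append((b.get("id"), start, end, b.get("order", "1")))
    return atoms, bonds


def model_to_cml(atoms, bonds):
    """Serialize the in-memory structures back to a CML string."""
    molecule = ET.Element("molecule")
    atom_array = ET.SubElement(molecule, "atomArray")
    for atom_id, symbol in atoms.items():
        ET.SubElement(atom_array, "atom", id=atom_id, elementType=symbol)
    bond_array = ET.SubElement(molecule, "bondArray")
    for bond_id, start, end, order in bonds:
        ET.SubElement(bond_array, "bond", id=bond_id,
                      atomRefs2=f"{start} {end}", order=order)
    return ET.tostring(molecule, encoding="unicode")


cml = (
    '<molecule id="m1">'
    '<atomArray><atom id="a1" elementType="C"/><atom id="a2" elementType="O"/></atomArray>'
    '<bondArray><bond id="b1" atomRefs2="a1 a2" order="1"/></bondArray>'
    '</molecule>'
)
atoms, bonds = cml_to_model(cml)
round_tripped = cml_to_model(model_to_cml(atoms, bonds))
```

The point of the design is that parsing happens once on load and serialization once on save; everything in between works on the in-memory objects.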
No doubt some purists will take issue with this rather literal implementation but we (and you) are already reaping the benefits. The improved performance is by far the most dramatic effect. Even on a relatively fast machine, rendering insulin in the document used to take six or seven minutes. Now it takes a few seconds. We have seen performance improvements of 400-500% on small molecules and 12,000% on large ones!
One final benefit comes from exposing the atoms and bonds as objects in their own right, and organizing them as collections. We can now data-bind to them. Windows Presentation Foundation makes heavy use of data binding. So, we could theoretically data-bind various visual elements to their corresponding logical ones. And this is precisely what we do in the FlexDisplay component, which we will cover in a later post.
by Clyde Davies
2021 was a challenging year for many of us. Even so, we are glad that we managed to produce a new release of Chem4Word. Most of the effort went into improving ACME, our new molecule sketcher.
ACME opens new horizons for Chem4Word. And I’d like to talk about some of those now.
Chemistry is the study of change, as Walter White memorably points out.
Chem4Word up to now has been more concerned with static molecules. We think it presents these as well as any paid-for package.
But molecules are boring, yeah? You got enthused when you saw some chemistry happening! You have been demanding change. And we’re going to deliver it!
We’ve been working on basic reaction functionality. What’s that? Well, it’s drawing different kinds of chemical reactions and specifying reactants, products, reaction type with reagents and conditions. Here’s a sneak preview:
But Chemical Markup Language, which underpins Chem4Word, is capable of much, much more. CML describes reactions in great detail using CML-React. You can fully specify the reaction type, reagents and conditions. You can also set reactants and products for a reaction.
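For developers who have not seen CML-React, the sketch below builds a bare-bones reaction element with a reactant list and a product list. It is a Python illustration only: `build_reaction` is a hypothetical helper, the molecule ids are placeholders, and the exact attributes Chem4Word will write are an assumption. The `reaction`, `reactantList`, `reactant`, `productList` and `product` element names come from CML-React; check the schema before relying on anything else here.

```python
import xml.etree.ElementTree as ET


def build_reaction(reaction_id, reactant_ids, product_ids):
    """Build a minimal CML-React style <reaction> element (namespace omitted)."""
    reaction = ET.Element("reaction", id=reaction_id)

    reactant_list = ET.SubElement(reaction, "reactantList")
    for mol_id in reactant_ids:
        reactant = ET.SubElement(reactant_list, "reactant")
        # Refer to a molecule defined elsewhere in the document by its id.
        ET.SubElement(reactant, "molecule", ref=mol_id)

    product_list = ET.SubElement(reaction, "productList")
    for mol_id in product_ids:
        product = ET.SubElement(product_list, "product")
        ET.SubElement(product, "molecule", ref=mol_id)

    return reaction


# Usage: theobromine as the reactant, caffeine as the product (placeholder ids).
reaction = build_reaction("r1", ["theobromine"], ["caffeine"])
reaction_xml = ET.tostring(reaction, encoding="unicode")
```

Reagents and conditions would sit alongside these lists in the same element; they are left out here to keep the sketch short.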
We want to treat reagents and solvents as first-class chemical objects in their own right. So, we will add custom dictionaries of these, both from libraries and from online sources like Wikipedia.
We also plan to fully support reaction mechanisms with electron pushers (‘curly arrows’). Chem4Word will be the best free chemistry tool for teaching and understanding chemistry!
We are now beta-testing this new version, so get involved if you can. Download the beta and tell us what you think. All your feedback is valuable!
Clyde Davies
Project Leader.