Understand the Code’s Structure and Model Architecture
In any approach to a new code base, especially one that is complex with many thousands of lines of code and multiple files, it is often useful to start with a visual overview of the architectural structure. Preferably, this understanding includes both the static structure (via class, component or package UML diagrams or equivalent) and data flows through the app. Component views of the file structure are one useful starting point for understanding the physical relationships between the JS, and HTML, XHTML or XUL, and possible data files. Then, a next desirable view would be to run the code base through a code visualizer such as MIT’s Relo, which has the usefulness of enabling interactive and gradual discovery and presentation of the class structure of a code base.
So, what other automated options short of reverse engineering to UML are there for gaining this structural understanding of the code base? Frankly, not many.
When All Else Fails, Go Manual
Manual diagramming, possibly with UML compliance, next begs the question of which drawing program to use. Like program editors, UML and drawing tools are highly varied and a personal preference. I mostly hate selecting new tools since I only want to learn a few, but learn them well. That means tools represent real investments for me, which also means I need to spend substantial time identifying and then testing out the candidates.
Dia is a GTK+ based diagram creation program for Linux, Unix and Windows released under the GPL license. Dia is roughly inspired by the commercial Windows program Visio. It is more geared towards informal diagrams for casual use, but it does have UML drawing templates. Graphviz is a general drawing and modeling program used by some other UML open source programs. I liked this program for its general usefulness and kept it on my system after all evaluations were done, but is really not automated enough alone for UML use.
In the UML realm, one end of the spectrum are “light”, simpler tools that have lower learning thresholds and fewer options. Some candidates here are Violet (which is quite clean and can reside in Eclipse, but has very limited formatting options) and UMLet (which does not snap to grid and also has limited color options). Getting a bit more complex I liked UMLGraph, which allows the declarative specification and drawing of UML class and sequence diagrams. Among its related projects, LightUML integrates UMLGraph into Eclipse and XUG extends UMLGraph into a reverse engineering tool.
Since I use Eclipse as my general IDE, my actual preference is a UML tool integrated into that environment. There are about 40 or UML options that support that platform, about a dozen of which are either free (generally “community versions” of commercial products) or open source. I reviewed all of them and installed and tested most.
LightUML is a little tricky to install, since it has external dependencies on Graphviz and the separate UMLGraph, though those are relatively straightforward to install. The biggest problem with LightUML for my purposes was its requirement to work off of Java packages or classes. I next checked out green. With green, UML diagrams are educational and easy to create and it provides a round-tripping editor. I found it interesting as an academic project but somewhat lacking for my desired outputs. Omondo’s free version of EclipseUML looks quite good, but unfortunately will only work with bona fide Java projects, so it was also eliminated from contention. I found eUML2 to be slow and it seemed to take over the Eclipse resources. I found a similar thing with Blueprint Software Modeler, 85 MB without the Eclipse SDK, which seems rather silly for the degree of functionality I was seeking.
Papyrus is a free, very new, Eclipse plug-in. It has only 5 standard UML diagrams and sparse documentation, but its implementation so far looks clean and its support for the DI2 (diagram interchange) exchange format is a plus (I’ve kept it on my system and will check in on new releases later). Visual Paradigm for UML has a free version usable in Eclipse, which is professionally packaged as is common for commercial vendors. However, this edition limits the number of diagram types per project and the download requires all options and a 117 MB install before the actual free version can be selected, and even then implications of the various choices are obscure. (These not uncommon tactics are not really bait-and-switch, but often times feel so.) The limit of one diagram type per project eliminate this option from consideration, plus it is incredibly complicated and a real resource hog. The community edition of MagicDraw UML operates similarly and with similar limitations.
None of these tools support JS reverse engineering or round-tripping. However, keeping everything in the Eclipse IDE would aid later steps when adding in-line code documentation (see below) to the code base. On the other hand, because of the lack of integration, and because UML diagramming needed to occur before code commenting, I decided I could forego Eclipse integration for a standalone tool so long as it supported the standard XMI (XML metadata interchange) format.
This decision led me back to again try StarUML, which is the tool I ended up using to complete my UML diagramming. I remain concerned about StarUML’s lack of recent development and reliance on Delphi and some proprietary components that limit it as a full open-source tool. On the other hand, it is extensible via user-added profiles and tags using XML, it is very intuitive to use, and it has very flexible export and diagramming options. Because of strong XMI support, should better options later emerge, there is no risk of lock-in.
Looking to the future, because of my Eclipse environment and the growing availability of JS editors in Eclipse (Aptana, which I use, but also Spket and JSEclipse), one project worth monitoring is UML2, an EMF- (and GMF-) based implementation of the UML 2.x metamodel for the Eclipse platform due out in June. This effort is part of the broader Model Development Tools (MDT) Eclipse initiative. UML2 builds are presently available for testing with other bleeding-edge components, but I have not tested these.
Throughout this process of UML tools investigations, I came to discover a number of preferences and criteria for selecting a toolset that include: 1) a free, open-source option; 2) easy to use and modify, with flexible user preferences; 3) support for UML 2.x and most or all UML diagrams; 4) support for XMI import and export; 5) extensible profiles and frameworks via an XML syntax, preferably with some profile building utilities; 6) operation within the Eclipse environment, measured by standard plug-in and update installs; 7) clean functionality and user interface; 8 ) the ability to handle large diagrams, particularly in tiling; 9) a variety of output formats; and 10) strength in the class, component and package diagrams of most use to me. Of course, these factors may differ substantially for you.
Now, Document the Code as You Dive Deeper
The most widely used approach is JSDoc, which is a Perl-based analog to Javadoc (click here for an example report). JSDoc is the most established of the systems and has an interface familiar to most Java developers. The first version is being phased out, with version 2 release apparently pending. There is a Google Group on this option. JSDocumenteris a graphical user interface built for Javadoc-like documenting using JSDoc. Within this family, the JSDoc-2 option appears to be the choice, since the initial developers themselves have moved in that direction and recognize earlier problems with JSDoc they are planning to overcome.
ScriptDoc is an attempt to standardize documentation formats, and is being pushed and backed by the Aptana open source JS IDE. The major differentiator for the ScriptDoc approach is the ability to link IDs in the code base to external .sdoc files that could perhaps have more extensive commenting than what might be desirable in-line. (The other side of that coin is the synchronization and maintenance of parallel files.) To date, ScriptDoc is also tightly linked with Aptana, with no known other platform support. Also, there is a disappointing level of activity on the ScriptDoc standards site.
On this and other bases, I think JSDoc-2 is the best choice of the options above. The general Javadocs form and expansion of necessary tags appears to have overcome earlier JSDoc limits, and the easy parser recognition for the comments appears sufficiently flexible. Plus, output formats can be tailored to be as fancy as desired.
Code Validation and Debugging
Since these tools are beyond the scope of this present review, more detailed discussion of them will be left to another day.