Modeling efforts can usefully range from informal sketches
to formal, notation-heavy specifications.
My preference is to keep things informal and flexible
until the need for more formality becomes evident.
This is in line with some of the ideas of the
Agile software development movement.
Diagrams are very useful for showing connectivity
(e.g., control and data flow, method usage).
If all you're trying to do is capture the basic structure,
use any icons (e.g., boxes, circles) that seem comfortable.
Picking a consistent notation becomes important, however,
if the reader is separated by time or distance.
Similarly, although paper and whiteboards work well
for small diagrams and informal meetings,
diagram generation tools (e.g., Dia, OmniGraffle, Visio)
also have their place.
Aside from handling presentation details
(e.g., icon and arrow styles, fonts, layout),
these tools manage connectivity constraints, etc.
As the size and complexity of the model grow,
the need for even more structure and support will become evident:
Just as general-purpose drawing tools
cannot support architectural, electronic, or mechanical design,
diagram generation tools cannot analyze the diagrams they record.
Tools for generating and analyzing
conceptual schemas must be able
to represent and manage knowledge about the system under study.
Here is an elegant description of
knowledge representation,
taken from John F. Sowa's excellent
book on the topic:
Knowledge representation is a multidisciplinary subject
that applies theories and techniques from three other fields:
Without logic,
a knowledge representation is vague,
with no criteria for determining
whether statements are redundant or contradictory.
Without ontology,
the terms and symbols are ill-defined, confused, and confusing.
And without computable models,
the logic and ontology cannot be implemented in computer programs.
Knowledge representation is the application of logic and ontology
to the task of constructing computable models for some domain.
Most of the modeling methods and tools discussed below
are aimed at assisting with ontology development.
They keep track of the definitions
of classes, instances, relationships, etc.
If our goal were to create an
expert system,
both the ontology and the logic rules
would need to be extremely detailed and precise.
In MBD, however,
most of the reasoning will be done by humans.
The developer will look over the ontology
and decide what to present.
The user will look over the presented material
and decide which parts are currently of interest.
As long as the material is plausibly interesting,
the user is unlikely to complain.
So, these approaches and tools often strive
for more detail and precision than MBD requires.
Don't get caught up in the details;
your main objective is to produce a useful but simplified model!
Also note that knowledge representation is an active research area.
There are many approaches and theories,
a few emerging standards,
and little interoperability between existing tools.
Assorted communities (e.g.,
AI,
DBMS,
Semantic Web)
have developed methods and tools for knowledge representation.
Several of these approaches appear to be quite applicable to MBD,
but I have yet to find a
category killer.
Before we look at the available offerings,
let's consider the general characteristics we're looking for.
The most critical characteristic, from my perspective,
is the model's fundamental organization.
The components of a system can have arbitrary relationships;
the model must be able to encode these,
allow the user to traverse them, etc.
Because the relationships are arbitrary and (initially) unknown,
the approach must not restrict the modeler to, say,
a list-based or even hierarchical organization.
Consequently, most
outliners and
mind mapping tools
aren't suitable.
Because HTML links
can only be traversed in one direction,
most HTML-based approaches
(e.g., typical wikis) are also unsuitable.
(Pairs of links can be used for bi-directional relationships,
but this is tedious and error-prone if done manually.)
In addition, the model should allow the user
to interact with assorted subsets and views of the system.
For these and other reasons,
I believe that the model must be based
on a fully-traversable (and very extensible)
graph-based organization.
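
To make the idea of a fully-traversable graph concrete,
here is a minimal Python sketch.
It assumes a toy, in-memory model;
the class name, method names, and sample nodes are hypothetical
and are not taken from any of the tools discussed here.
Each relationship is recorded once but indexed in both directions,
so the reverse link never has to be maintained by hand,
and a labeled subset of the edges can be pulled out as a separate view.

```python
from collections import defaultdict


class Model:
    """A tiny labeled graph whose edges can be followed in either direction."""

    def __init__(self):
        self._out = defaultdict(set)  # node -> {(label, target)}
        self._in = defaultdict(set)   # node -> {(label, source)}

    def relate(self, source, label, target):
        # Record the edge once; both indexes are updated together,
        # so the two directions cannot drift apart.
        self._out[source].add((label, target))
        self._in[target].add((label, source))

    def outgoing(self, node):
        """Edges that start at this node, as sorted (label, target) pairs."""
        return sorted(self._out.get(node, ()))

    def incoming(self, node):
        """Edges that end at this node, as sorted (label, source) pairs."""
        return sorted(self._in.get(node, ()))

    def view(self, labels):
        """Return a sub-model holding only the edges with the given labels."""
        sub = Model()
        for source, edges in self._out.items():
            for label, target in edges:
                if label in labels:
                    sub.relate(source, label, target)
        return sub


# Hypothetical sample data, for illustration only.
model = Model()
model.relate("parser", "calls", "lexer")
model.relate("parser", "reads", "grammar file")
model.relate("compiler", "calls", "parser")

calls_only = model.view({"calls"})
```

Asking `model.incoming("parser")` answers "what uses the parser?"
without anyone having written a second, reverse link.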
A modeling tool must allow the user to interact with
(e.g., view, navigate, edit) the model.
Although most of the detailed information will be textual in nature,
text is a poor medium for presenting relationships.
So, most modeling tools use some sort of diagramming format.
The design of this format is both critical and challenging.
If the format is too simple,
it won't be able to convey the needed information.
If it is too complex, the user will become confused and frustrated.
Ideally, the tool should allow the user to use simplified notation,
adding details as desired.
Modeling tools should have reliable and convenient ways
to exchange information with other tools.
Unfortunately, existing tools seldom do.
There are many formats for encoding conceptual models,
varying at syntactic, structural, and semantic levels.
Efforts are being made to provide paths between these formats.
For example, the
International Organization for Standardization (ISO)
has a working group which recently proposed a standard:
Common Logic (CL) is an information exchange
and transmission language, based on
first-order logic.
The CL definition allows a variety of different syntactic forms,
called "dialects".
A dialect may use any desired syntax or structure,
but it must be equivalent to the abstract syntax of Common Logic
(and thus, to any other CL dialect), in terms of its semantics.
The World Wide Web Consortium (W3C)
is working on a related, though less formal, standard:
SKOS Core is a model and an RDF vocabulary
for expressing the basic structure and content
of concept schemes such as thesauri, classification schemes,
subject heading lists, taxonomies, 'folksonomies',
other types of controlled vocabulary,
and also concept schemes embedded in glossaries and terminologies.
The SKOS Core Vocabulary is an application of the
Resource Description Framework (RDF),
that can be used to express a concept scheme as an RDF graph.
Using RDF allows data to be linked to and/or merged with other data,
enabling data sources to be distributed across the web,
but still be meaningfully composed and integrated.
Although these efforts are proceeding well,
no adopted standards appear to be imminent.
In the meantime,
although one can hope for a documented interchange format,
about all that one can reasonably require
is a readable (e.g., XML) file format.
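
As a rough illustration of what a "readable file format" buys you,
the following Python sketch writes a small set of relationships to XML
and reads them back, using only the standard library.
The element and attribute names (`model`, `relation`, `source`, `label`, `target`)
are made up for this example;
they do not correspond to any published interchange standard.

```python
import xml.etree.ElementTree as ET


def model_to_xml(relations):
    """Serialize (source, label, target) tuples as a small XML document."""
    root = ET.Element("model")
    for source, label, target in relations:
        ET.SubElement(root, "relation",
                      source=source, label=label, target=target)
    return ET.tostring(root, encoding="unicode")


def xml_to_model(text):
    """Parse the XML produced by model_to_xml back into tuples."""
    root = ET.fromstring(text)
    return [(r.get("source"), r.get("label"), r.get("target"))
            for r in root.findall("relation")]


# Hypothetical sample data, for illustration only.
relations = [("parser", "calls", "lexer"),
             ("compiler", "calls", "parser")]
text = model_to_xml(relations)
restored = xml_to_model(text)
```

Even without a documented schema,
a file like this can be inspected by eye
and converted by a short script if the tool is ever replaced.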
Now, let's look at some of the available offerings...
For informal modeling, I would suggest the use of
concept maps.
They aren't overloaded with notation,
but they have enough structure
to capture the basic entities and relationships of a system.
Unlike mind maps,
concept maps aren't restricted to hierarchies.
Dr. John Sowa has written a nice overview of
Concept Mapping,
covering concept maps, conceptual graphs, topic maps, etc.
The classic reference on concept maps is
Learning How to Learn.
The database community uses assorted variations on
entity-relationship diagrams (ERDs).
Unfortunately, these can cause the modeler
to focus on low-level (e.g., database-related) issues,
rather than high-level concepts.
In addition, ERDs can run into difficulties
when a relationship needs to be treated as an entity.
If we say "Romeo loves Juliet",
how do we discuss the different meanings of "loves"?
Modeling Methodologies is a good introduction to some of these issues.
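
One common way around the "relationship as entity" problem
is to store each relationship as an object in its own right,
so that it can carry its own descriptions.
The Python sketch below shows the idea;
the class names and the `notes` field are assumptions made for this example,
not part of ERD, ORM2, or any particular tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    name: str


@dataclass
class Relationship:
    """A relationship stored as an object, so it can be described in turn."""
    subject: Entity
    verb: str
    target: Entity
    notes: dict = field(default_factory=dict)


def describe(rel):
    """Render a relationship together with whatever has been said about it."""
    details = ", ".join(f"{key}={value}"
                        for key, value in sorted(rel.notes.items()))
    return f"{rel.subject.name} {rel.verb} {rel.target.name} ({details})"


romeo = Entity("Romeo")
juliet = Entity("Juliet")

loves = Relationship(romeo, "loves", juliet)
# Because "loves" is now an object, we can say things about it.
loves.notes["sense"] = "romantic"
loves.notes["reciprocated"] = True

summary = describe(loves)
# summary == "Romeo loves Juliet (reciprocated=True, sense=romantic)"
```

The cost is an extra layer of objects;
the benefit is that different meanings of "loves"
can be recorded and compared, rather than being lost in a bare link.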
Object-Role Modeling (ORM2)
appears to handle these issues nicely,
at some cost in notational complexity.
I don't know of any Open Source ORM2 tools
(though one is promised for 2006),
but some gratis and inexpensive tools have emerged.
Some versions of
Microsoft's
Visio handle various aspects of ORM2 diagrams.
For more information on ORM2, visit
www.orm.net.
Unified Modeling Language (UML)
Class Diagrams are also complex,
but they may appeal to programmers
who are already familiar with this notation.
UML is very well documented
and many supporting tools are available for it.
The Expert Systems community has been working with problems of
Knowledge Engineering for several decades.
Not surprisingly, they have some useful tools to offer.
I'm particularly interested in
Protégé.
As described in
An AI tool for the real world,
Protégé is an Open Source, well-supported,
standards-friendly tool for creating models, ontologies, etc.
Conceptual Graphs (CGs)
are structurally similar to ORM2 diagrams,
but they are based on a form of
predicate calculus known as
first-order logic (FOL).
So, they are a good match for expert system technology.
It's quite likely that Protégé could be augmented
to support CGs, ORM2, or other diagramming notations.
This might ease the recognition and specification
of complex sets of relationships.
The Semantic Web community
is developing standards (e.g.,
Resource Description Framework,
Topic Maps)
for describing concepts, encoding document metadata, etc.
The standards are still "works in progress",
but they are worth watching because
of their large and active developer communities.
Resource Description Framework (RDF)
is based on sets of three-part declarations (i.e., "triples"):
subject, predicate, object.
The apparent simplicity of this approach is balanced
by the need to create large numbers of triples
when complex concepts need to be expressed.
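
The trade-off can be seen in a small Python sketch
that represents triples as plain tuples (no RDF library is assumed).
The `ex:` names are hypothetical;
`rdf:Statement`, `rdf:subject`, `rdf:predicate`, and `rdf:object`
come from RDF's reification vocabulary.
Stating that Romeo loves Juliet takes one triple;
saying something about that statement takes five more.

```python
# Each triple is (subject, predicate, object).
triples = {
    # The basic assertion: one triple.
    ("ex:Romeo", "ex:loves", "ex:Juliet"),
    # Describing the assertion itself: a statement node plus its parts...
    ("ex:stmt1", "rdf:type", "rdf:Statement"),
    ("ex:stmt1", "rdf:subject", "ex:Romeo"),
    ("ex:stmt1", "rdf:predicate", "ex:loves"),
    ("ex:stmt1", "rdf:object", "ex:Juliet"),
    # ...and finally the qualifier we actually wanted to record.
    ("ex:stmt1", "ex:sense", "romantic"),
}


def match(store, s=None, p=None, o=None):
    """Return the triples that match the given pattern; None is a wildcard."""
    return sorted(t for t in store
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))


about_romeo = match(triples, s="ex:Romeo")
about_statement = match(triples, s="ex:stmt1")
```

Here `about_romeo` holds a single triple,
while `about_statement` holds the five needed to qualify it.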
Topic Maps use a much richer vocabulary,
including terms such as association, name, occurrence,
scope, topic, etc.
This allows relatively small expressions
to express complex concepts, conditional assertions, etc.
A gratis tool
(Ontopia Omnigator) is available;
Open Source tools are under development.
Although the use of models is central to MBD,
modeling is a tool, rather than a goal.
If you develop a crystal-clear model,
but generate no documentation,
you haven't really accomplished your objective.
So, curb your desire to generate the "perfect" model.
Instead, try to generate a useful and flexible model,
improving it as you proceed.
As you work with the model,
you'll find areas that could use clarification, expansion, etc.
Your modeling approach should make it easy to make these changes.
Also, avoid the temptation to "start at the bottom",
detailing every data item, field, etc.
This sort of information can be researched as it is needed,
but it doesn't serve the general purposes of the model:
finding useful information, understanding the system, etc.