The Eclipse Modeling Framework (EMF) has an excellent reputation in the Eclipse community and beyond. It is one of the most popular modeling frameworks available. Whether you apply model-driven development using standardized modeling languages, such as UML or BPMN, or use domain-specific languages, EMF provides a solid basis for defining the modeling languages involved and for implementing modeling environments. Of course, in such a modeling environment EMF-based models are developed in teams, which makes appropriate tooling for comparing model versions and merging parallel model changes indispensable.
For source code, text-based tools such as diff, or the similar algorithms built into SVN or Git, are well established. For models, however, a text-based comparison is inadequate. After all, these tools consider changes at the text level of the serialized model (usually in XMI format) and hence are incapable of capturing the logical structure of models. If you have ever tried to apply a text-based comparison to larger XMI files, you probably know why: it is hardly possible to grasp model changes efficiently based on XMI diffs. Moreover, a text-based merge will likely break your models because the logical structure of models is not taken into account. Thus, to achieve comprehensible difference reports and safe merge results, a model comparison and merging tool needs to operate on the same level at which the developers originally performed the changes. Depending on the modeling editor being used, this is the level of diagrams, model elements, references, and attribute values.
EMF Compare is a sub-project of the Eclipse Modeling Project founded back in 2006 and is dedicated to the challenge of collaborative modeling. EMF Compare provides powerful and extensible tools for realizing the comparison of model versions, conflict detection in parallel model changes, as well as automatic and manual merging of model changes. Please note, however, that EMF Compare is in many ways more of a framework than a set of tools. Although EMF Compare provides basic functionality and tools out of the box — e.g., it provides ready-to-use comparison viewers and even dedicated support for UML — for other modeling languages EMF Compare often needs to be tailored to the specific needs of each context. Therefore, EMF Compare provides a number of extension points and customization options to account for the requirements of specific modeling languages.
Technical background of EMF Compare
Before we take a look at EMF Compare in action (see next section), we will briefly discuss some technical details on model comparison in the current section. Note that a solid understanding of these details is only necessary if you want to extend and optimize EMF Compare for specific modeling languages. However, if you are mainly interested in EMF Compare from a user perspective, you can safely skip this section.
The core of EMF Compare is a generic model comparison component. This component is independent of the respective modeling language and can therefore process all EMF-based models using the powerful reflection mechanism of EMF. EMF Compare supports two-way and three-way comparisons. A two-way comparison simply determines the differences between one version and another. This is sufficient for obtaining the changes that have been applied between one model version and its ancestor version. If you instead want to use EMF Compare in the context of a version control system, it is necessary to also merge concurrently performed changes in order to create a common successor version. To this end, we need a three-way comparison. The three-way comparison is composed of two two-way comparisons; two model versions (“left” and “right”) are each compared with their common ancestor version and the results of these comparisons are combined into a list of changes that have been applied on each side. For building a new common successor version, which reflects the changes of both sides, the changes of one side (for example, “left”) are applied to the opposite side (for example, “right”).
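To make the composition of a three-way comparison more tangible, here is a minimal sketch in Python. It is not the EMF Compare API: a model element is reduced to a plain dictionary of feature values, and the function names (two_way_diff, three_way_diff, merge_left_into_right) are hypothetical names chosen for this illustration. The sketch also assumes that None means "feature not set".

```python
def two_way_diff(base, other):
    """Return {feature: (old, new)} for every feature whose value differs."""
    features = set(base) | set(other)
    return {
        feature: (base.get(feature), other.get(feature))
        for feature in sorted(features)
        if base.get(feature) != other.get(feature)
    }


def three_way_diff(ancestor, left, right):
    """A three-way comparison built from two two-way comparisons."""
    return {
        "left": two_way_diff(ancestor, left),
        "right": two_way_diff(ancestor, right),
    }


def merge_left_into_right(right, left_changes):
    """Apply the changes of the left side to a copy of the right side."""
    merged = dict(right)
    for feature, (_old, new) in left_changes.items():
        if new is None:  # assumption of this sketch: None means "unset"
            merged.pop(feature, None)
        else:
            merged[feature] = new
    return merged


ancestor = {"name": "Artist", "isAbstract": False}
left = {"name": "Artist", "isAbstract": True}       # left made the class abstract
right = {"name": "Performer", "isAbstract": False}  # right renamed the class

changes = three_way_diff(ancestor, left, right)
merged = merge_left_into_right(right, changes["left"])
print(merged)
```

Because the two sides touched different features, the merged result carries both the rename and the new isAbstract value. Changes to the same feature on both sides would need conflict handling, which this sketch deliberately leaves out.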
The model comparison itself is carried out in three phases: the matching phase, the diffing phase, and the analysis of the detected differences. Let’s assume that we want to compare the two model versions depicted in the figure above, Version 1 and Version 2. First, in the matching phase, the corresponding model elements of the two versions are identified. The matching can be determined either via unique identifiers of model elements, such as XMI IDs, or it is computed based on similarity measures. In our example, we see that the class Artist from Version 1 corresponds to the class Artist of Version 2. For the class Actor of Version 2, however, no match was found in Version 1.
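The two matching strategies can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: elements are dictionaries with an optional "id" and a "name", similarity is measured on names only with difflib, and the threshold of 0.8 is arbitrary. The real similarity measures of EMF Compare are more elaborate.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Toy similarity measure: compare element names only."""
    return SequenceMatcher(None, a["name"], b["name"]).ratio()


def match_elements(version1, version2, threshold=0.8):
    """Pair elements by identifier first, then by name similarity."""
    remaining = list(version2)
    matches, pending = [], []

    # Pass 1: identifier-based matching (for example, XMI IDs).
    for elem in version1:
        partner = next(
            (c for c in remaining
             if elem.get("id") is not None and c.get("id") == elem["id"]),
            None,
        )
        if partner is None:
            pending.append(elem)
        else:
            remaining.remove(partner)
            matches.append((elem, partner))

    # Pass 2: similarity-based matching for everything still unmatched.
    unmatched_v1 = []
    for elem in pending:
        best = max(remaining, key=lambda c: similarity(elem, c), default=None)
        if best is not None and similarity(elem, best) >= threshold:
            remaining.remove(best)
            matches.append((elem, best))
        else:
            unmatched_v1.append(elem)

    return matches, unmatched_v1, remaining


version1 = [{"id": None, "name": "Artist"}]
version2 = [{"id": None, "name": "Artist"}, {"id": None, "name": "Actor"}]

matches, only_in_v1, only_in_v2 = match_elements(version1, version2)
print(matches, only_in_v1, only_in_v2)
```

As in the example above, Artist is matched with Artist, while Actor remains without a counterpart in Version 1.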
In the second phase, all previously determined pairs of corresponding model elements are examined in more detail for differences in their attribute and reference values. For each detected difference, a so-called diff element is created that accurately describes the variation between the model elements. For instance, when the classes named Artist of Version 1 and Version 2 are examined regarding their attribute and reference values, we see that in Version 1, the attribute isAbstract is set to false, while the same attribute is set to true in Version 2. Moreover, we identify a difference in the values of the reference superClasses: in Version 2 the value list contains the class Actor, whereas the value list is empty in Version 1. Model elements that don’t have a corresponding element in the opposite version were either deleted or added, depending on the direction. In our example in the figure above, this applies to the class Actor. All diff elements are eventually stored in a comparison model alongside the correspondences that have been identified during the matching phase.
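The diffing phase for the Artist and Actor example can be sketched as follows. This is a toy model, not the EMF Compare data structures: a model version is a dictionary from element name to feature values, elements are matched by name for brevity, and a diff element is a tuple (kind, element, feature, value). The function names are hypothetical.

```python
def diff_features(element, old, new):
    """Compare the feature values of two matched elements."""
    diffs = []
    for feature in sorted(set(old) | set(new)):
        before, after = old.get(feature), new.get(feature)
        if isinstance(before, list) or isinstance(after, list):
            # Multi-valued feature: report added and removed values.
            before, after = before or [], after or []
            diffs += [("ADD", element, feature, v) for v in after if v not in before]
            diffs += [("DELETE", element, feature, v) for v in before if v not in after]
        elif before != after:
            # Single-valued feature: report old and new value.
            diffs.append(("CHANGE", element, feature, (before, after)))
    return diffs


def diff_models(v1, v2):
    """Create diff elements for matched and unmatched elements."""
    diffs = []
    for name in sorted(set(v1) | set(v2)):
        if name not in v1:
            diffs.append(("ADD", name, None, None))     # no match in v1
        elif name not in v2:
            diffs.append(("DELETE", name, None, None))  # no match in v2
        else:
            diffs.extend(diff_features(name, v1[name], v2[name]))
    return diffs


version1 = {"Artist": {"isAbstract": False, "superClasses": []}}
version2 = {
    "Artist": {"isAbstract": True, "superClasses": ["Actor"]},
    "Actor": {"isAbstract": False, "superClasses": []},
}

diffs = diff_models(version1, version2)
for diff in diffs:
    print(diff)
```

The result contains three diff elements: the addition of the class Actor, the change of isAbstract in Artist, and the addition of Actor to the superClasses of Artist.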
In the third phase, the diff elements are examined concerning equivalences, dependencies, and conflicts. Equivalent diff elements denote changes that would lead to the same end result in the model when applying either of them. For example, a change of a bi-directional one-to-one reference consists of two equivalent diff elements, one for each side of the reference. Applying either of the two diff elements has the same effect, because EMF updates the opposite side of the reference automatically. Dependencies between diff elements indicate the order in which they must be executed. For example, a reference to a model element can only be added after the referenced model element has been created. In contrast to dependencies, conflicting diff elements do not depend on each other but mutually exclude each other. Users must therefore decide which of the conflicting diff elements is to be rejected or applied in the course of a model merge. Conflicts can, of course, only arise in three-way comparisons, while equivalences and dependencies may occur in both two-way and three-way comparisons. It therefore stands to reason that in our example we don’t have conflicts (it is only a two-way comparison), but we do have a dependency: the addition of the value in superClasses depends on the diff element that represents the creation of the class Actor.
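Dependencies and conflicts can be illustrated with another small Python sketch. It makes several simplifying assumptions: diff elements are represented by plain strings or by (element, feature) keys, dependencies are ordered with the standard graphlib module (Python 3.9 or later), and two sides are considered conflicting only if they set the same feature of the same element to different values. The names merge_order and find_conflicts are hypothetical.

```python
from graphlib import TopologicalSorter


def merge_order(requires):
    """requires maps each diff to the set of diffs that must be applied first."""
    return list(TopologicalSorter(requires).static_order())


def find_conflicts(left, right):
    """left/right map (element, feature) to the new value set on that side."""
    return sorted(
        key for key in left.keys() & right.keys() if left[key] != right[key]
    )


# Dependency from the example: the reference can only be added
# after the referenced class has been created.
requires = {"add Actor to Artist.superClasses": {"create class Actor"}}
order = merge_order(requires)
print(order)

# Conflict in a three-way comparison: both sides renamed Artist differently.
left_changes = {("Artist", "name"): "Performer", ("Artist", "isAbstract"): True}
right_changes = {("Artist", "name"): "Musician"}
conflicts = find_conflicts(left_changes, right_changes)
print(conflicts)
```

The change of isAbstract is not reported as a conflict, because only the left side touched that feature.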
EMF Compare in Action
After this rather theoretical explanation, it is about time to have a closer look at how to use EMF Compare. Once EMF Compare is installed (see https://www.eclipse.org/emf/compare/download.html), models can be compared with their local history by right-clicking on a model file and selecting Compare With → Local History. When selecting two model files, the same context menu allows you to compare the two selected files with each other (two-way comparison). If, however, you select three model files and click on Compare With → Each Other, you have to choose which of the three model files should be used as the common ancestor version, as you are about to start a three-way comparison among the three selected model files.
Once you have started the comparison, the Merge Tool of EMF Compare will be opened. The figure below shows the Merge Tool for a comparison of two versions of an Ecore model. The Merge Tool consists of two parts. The upper part is the so-called Structure Merge Viewer, which depicts the detected differences in a tree that is structured according to the hierarchy of the compared model. In this viewer, we see that three changes have been performed between the compared model versions: the name of the attribute lastName in the class Actor has been changed, a reference named categories has been added to the class Movie, and a new class called Category has been created in the package movies. In the square brackets next to each difference displayed in this viewer, we see the name of the feature (reference or attribute) that has been modified, as well as the change type (added, changed, deleted). Using the buttons in the toolbar, we can apply selected changes to the left-hand or the right-hand side model file. Moreover, there are buttons for navigating through the changes as well as for configuring grouping and filters in the merge viewer.
In the lower area of the Merge Tool, there is the so-called Content Merge Viewer, which shows further details about the item selected in the upper area. If a diff element is selected that represents the addition of a model element (as in the figure above), the Content Merge Viewer displays the location at which the new element was inserted in the containment tree of the model. If, however, a diff element is selected that represents, for instance, a change of a string attribute (such as the change of the attribute name shown below), the corresponding Content Merge Viewer for text changes is shown:
EMF Compare also ships with a dedicated Content Merge Viewer for changes of graphical diagrams that are realized using the Graphical Modeling Framework (GMF). For example, if an element is moved in a diagram, the dedicated Content Merge Viewer displays and highlights the moved element instead of showing the raw changes of the X and Y values.
What’s new in EMF Compare 3.1
The new release of EMF Compare not only comes with a variety of bug fixes and performance improvements, but also includes a handful of new features.
For example, the support for feature-map changes has been significantly improved in EMF Compare. Feature maps are often used in metamodels that have been imported from XML schemas and allow managing reference values of multiple references. For example, feature maps allow you to create a reference, orders, that represents the union of two distinct references named preferredOrders and standardOrders.
In addition, EMF Compare 3.1 now includes special handling of changes in multi-line text attributes. So far, parallel changes of the same attribute value in two model versions have always been considered a conflict, even if the parallel changes affected distinct lines in a multi-line text value and hence could have been merged with a three-way text diff algorithm. To overcome this drawback, a dedicated text comparison component has been integrated so that a conflict is only reported when the parallel changes of a multi-line text attribute actually overlap in the text lines they affect.
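The idea of reporting a conflict only for overlapping line changes can be sketched in Python with the standard difflib module. This is an illustration of the principle, not the text comparison component that EMF Compare integrates. The function names are hypothetical, and the sketch is deliberately conservative: changes that touch directly adjacent lines are also treated as overlapping.

```python
from difflib import SequenceMatcher


def changed_regions(base, other):
    """Return [(start, end, new_lines)] with indices into the base lines."""
    matcher = SequenceMatcher(None, base, other, autojunk=False)
    return [
        (i1, i2, other[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def merge_text(ancestor, left, right):
    """Three-way merge of multi-line text; returns None on a conflict."""
    base = ancestor.splitlines()
    left_regions = changed_regions(base, left.splitlines())
    # Identical changes on both sides are applied only once.
    right_regions = [
        region for region in changed_regions(base, right.splitlines())
        if region not in left_regions
    ]
    for a in left_regions:
        for b in right_regions:
            if a[0] <= b[1] and b[0] <= a[1]:  # overlapping or touching
                return None
    merged = list(base)
    # Apply from the bottom up so earlier line indices stay valid.
    for start, end, lines in sorted(
        left_regions + right_regions, key=lambda region: region[0], reverse=True
    ):
        merged[start:end] = lines
    return "\n".join(merged)


ancestor = "line one\nline two\nline three\nline four\nline five"
left = "LINE ONE\nline two\nline three\nline four\nline five"
right = "line one\nline two\nline three\nline four\nLINE FIVE"

merged = merge_text(ancestor, left, right)
print(merged)

# Both sides change the third line differently: a real conflict.
conflicting = merge_text(
    ancestor,
    "line one\nline two\nleft three\nline four\nline five",
    "line one\nline two\nright three\nline four\nline five",
)
print(conflicting)
```

In the first call the changes affect distinct lines and are merged automatically. In the second call the changes overlap, so the sketch returns None to signal a conflict.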
Another novelty in EMF Compare 3.1 is the support for cross-file model changes. Models frequently span across multiple files. This is realized by means of cross-file paths to model content in other files. Thus, EMF Compare not only needs to consider the contents of the referenced models, but also must consider file renames, as they need to be reflected in the paths in order to maintain the cross-file references. To this end, dedicated types of differences, as well as special mergers for those difference types have been introduced to account for such cases accordingly. These types of changes include not only file renames but also extractions of partial models in new files or the reintegration of extracted models into an existing file.
Model Versioning with EMF Compare and EGit
In addition to these improvements in EMF Compare 3.1, the development team is busy working on another significant improvement: the seamless integration of EMF Compare with EGit. This integration, however, is not included in EMF Compare 3.1, but will hopefully be delivered with Mars SR1 in November. If Mars SR1 is too far away for you and you want to use the EGit integration right away, visit the Collaborative modeling page. On this web page, you can download the current builds that already include the EGit integration.
The integration of EMF Compare and EGit allows you to manage EMF-based models in a Git repository just as conveniently as conventional source code. The cooperation of EMF Compare and EGit is realized via the interfaces and extension points of the Team API.
As a consequence, a model that is managed in a Git repository with EGit can be compared with other versions of the model in the Git repository using EMF Compare via the context menu Compare With → Branch, Tag, or Reference. Of course, you can also check your local changes of a model using Compare With → HEAD revision. When it comes to merging models in the course of a Git merge or rebase, EMF Compare steps in with its dedicated support for EMF models instead of leaving this task to the line-based merge algorithm of Git. Thus, model-based conflicts are also supported. For instance, the figure below shows a model change conflict, which was revealed during a merge of two branches that included commits affecting a UML model. Conflicts between model changes can now be easily resolved by choosing which of the two conflicting changes should be rejected or accepted. Once the conflict is resolved, we can add the model file to the Git index and commit in order to complete the merge, just as for conventional files in EGit.
In this context we would like to highlight another major change in EMF Compare: the support for so-called “Logical Models”. As already mentioned above, EMF-based models often span several files. A well-known example of such a distribution of models across multiple files is the separation of the model and the diagram, as it is also done in Papyrus, which stores the UML model (file with the extension .uml) and the diagram information (file with the extension .notation) in separate files. Of course, the diagram file references the UML model file. Therefore, both model files must always be considered together during a merge to prevent invalid references (“dangling references”) after the merge of parallel model changes. To achieve this, EMF Compare contributes an EMF-specific implementation of the extension point org.eclipse.core.resources.modelProviders and, with this, always groups all model files that depend on each other into logical models. EMF Compare considers not only outgoing but also incoming references among files. To improve performance, it therefore builds a cached index of model file dependencies in the background. The extent to which EMF Compare should record dependencies among model files can be configured via the Eclipse preferences. The logical model mechanism prevents incomplete models from being added to the Git index. If you, for instance, try to add only the UML model to the index (model.uml), you will be notified that the logical model also includes the diagram file (model.notation) and the selection will be adjusted accordingly.
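Conceptually, grouping files into logical models amounts to finding the connected components of the file dependency graph, with references followed in both directions. The following Python sketch illustrates this idea under that assumption. It is not the implementation behind the modelProviders extension point, and the function name logical_models as well as the file other.ecore are made up for the example.

```python
def logical_models(references):
    """Group files connected by incoming or outgoing references.

    references maps each file to the set of files it references.
    """
    # Build an undirected view so incoming references count as well.
    neighbours = {}
    for source, targets in references.items():
        neighbours.setdefault(source, set())
        for target in targets:
            neighbours[source].add(target)
            neighbours.setdefault(target, set()).add(source)

    groups, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        group, stack = set(), [start]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


references = {
    "model.notation": {"model.uml"},  # the diagram references the UML model
    "model.uml": set(),
    "other.ecore": set(),             # hypothetical unrelated model file
}

groups = logical_models(references)
print(groups)
```

Selecting model.uml therefore also pulls in model.notation, even though the reference points from the diagram file to the UML file and not the other way round.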
EMF Compare has been an integral part of the EMF family for a long time, and especially in recent years the community around EMF Compare has been very active. The development team has accomplished several significant improvements and new features in EMF Compare. In this post we only mentioned a few of them. Beyond what has been discussed in this post, many substantial performance improvements, new extension points, as well as optimized support for UML models and better integration with Papyrus have been the focus of the development of EMF Compare. Many of these improvements are developed in the course of the Collaborative Modeling Initiative, a joint initiative with the aim of providing industrial-quality and highly integrated open source tools for collaborative modeling of UML models with Papyrus. In addition to the model versioning functionality of EMF Compare and EGit, this initiative will in the future also address other interesting topics, such as systematic model review on the basis of Gerrit. So we can look forward to exciting new features and improvements in and around EMF Compare, and we are thrilled to be part of this endeavour. To stay up to date, you can follow the Collaborative Modeling Initiative on its Google+ page or on Twitter via @CollabModeling.