Diffing is the process of determining what changed between two sets of objects.
Typically, the two sets of objects are two versions of the same thing (a pulled Revit model, a Structural Model that we want to Push to an Adapter, etc.), in which case Diffing can effectively be used as a Version Control tool.
🤖 Developers: see also the Diffing and Hash: Guide for developers.
The Diffing_Engine gives many ways to perform diffing on sets of objects. Let's see them.
The most versatile method for diffing is the BH.Engine.Diffing.Compute.Diffing() method, also called IDiffing. Ideally, you should always use this Diffing method, although other alternatives exist for specific cases (see Other diffing methods below). A detailed technical explanation of the IDiffing can be found in the guide for developers.
This method can be found in any UI by simply looking for
The method takes three inputs:
pastObject: objects belonging to a past version, a version that precedes the
followingObjects: objects belonging to a following version, a version that was created after the
diffingConfig: configurations for the diffing, where you can set your ComparisonConfig object, see below.
The IDiffing, like all diffing methods, relies on an identifier assigned to each object. The identifier is used to match objects, so the method knows which object to compare with which, even across multiple versions. It is generally a unique number assigned to each object, and it is assumed to stay the same even if the object is modified. The identifier looked for is of type IPersistentAdapterId, searched in the object's Fragments; this is typically stored on objects when they are Pulled from an Adapter. This means that the IDiffing works best with objects pulled from a BHoM Adapter that stores the object Id on the object (most of them do).
In case no Identifier can be found on the objects, the IDiffing attempts other diffing methods on the objects; this is explained in more detail in the diffing guide for developers.
The output of every diffing method is always a diff object, which we will describe in a section below.
A DiffingConfig object can be attached to any Diffing method and allows you to specify options for the Diffing comparison.
The Diffing config has the following inputs:
ComparisonConfig: allows you to specify all the object comparison options; see its dedicated page.
EnablePropertyDiffing: optional, defaults to true. If disabled, Diffing does not check the property-level differences; it runs much faster but may ignore important changes.
IncludedUnchangedObjects: optional, defaults to true. When diffing large sets of objects, you may want to exclude the objects that did not change from the diffing output, to save RAM.
AllowDuplicateIds: optional, defaults to false. The diffing generally uses identifiers to track "who is who" and decide which objects to compare; in such operations duplicates should normally not be allowed, but there could be edge cases where it is useful to keep them.
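
To make the defaults above concrete, here is a minimal Python sketch. It is not the BHoM API: `DiffingConfigSketch` and `check_ids` are hypothetical names, the ComparisonConfig input is left out, and raising an error on duplicate Ids is an assumption made for illustration only.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class DiffingConfigSketch:
    """Hypothetical stand-in mirroring the defaults listed above."""
    enable_property_diffing: bool = True
    include_unchanged_objects: bool = True
    allow_duplicate_ids: bool = False


def check_ids(ids, config):
    """Return the duplicated Ids; reject them unless the config allows duplicates."""
    duplicates = sorted(i for i, count in Counter(ids).items() if count > 1)
    if duplicates and not config.allow_duplicate_ids:
        # Assumption: this sketch fails loudly; the real engine may behave differently.
        raise ValueError(f"Duplicate Ids found: {duplicates}")
    return duplicates
```

With the default configuration, `check_ids([1, 1, 2], DiffingConfigSketch())` raises, while passing `DiffingConfigSketch(allow_duplicate_ids=True)` lets the duplicates through and reports them.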
The Diffing output: the
The output of any Diffing method is an object of type
diff output can be
Exploded to reveal all the available outputs:
AddedObjects: objects present in the second set that are not present in the first set.
RemovedObjects: objects not present in the second set that were present in the first set.
ModifiedObjects: objects that are recognised as present both in the first set and the second set, but that have some property that is different. The rules that were used to recognise modification are in the
UnchangedObjects: objects that are recognised as the same in the first and second set.
ModifiedObjectsDifferences: all the differences found between the two input sets of objects.
DiffingConfig: the specific instance of DiffingConfig that was used to calculate this Diff. Useful in scenarios where a Diff is stored and later inspected.
The ModifiedObjectsDifferences output contains a List of ObjectDifferences objects, one for each modified object, each holding information about that modified object. These can be further
PastObject: the object in the pastObjs set that was identified as modified (i.e., a different version of the same object was found in the
FollowingObject: the object in the followingObjs set that was identified as modified (i.e., a different version of the same object was found in the
Differences: all the differences found between the two versions of the modified object. This is a List of PropertyDifference objects, one for each difference found on the modified object.
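
As a rough illustration of what a list of property-level differences looks like, the Python sketch below walks two versions of an object and reports every leaf value that differs, together with a dotted path to the changed property. It is a simplified assumption-based model (objects are plain nested dictionaries, and `property_differences` is a hypothetical helper), not the BHoM implementation.

```python
def property_differences(past, following, path=""):
    """Yield (name, past_value, following_value) for every leaf value that differs."""
    if isinstance(past, dict) and isinstance(following, dict):
        # Visit the union of keys so that properties present on one side only are reported.
        for key in sorted(set(past) | set(following)):
            child_path = f"{path}.{key}" if path else key
            yield from property_differences(past.get(key), following.get(key), child_path)
    elif past != following:
        yield (path, past, following)


bar_v1 = {"Name": "bar1", "StartNode": {"Position": {"X": 0.0, "Y": 0.0}}}
bar_v2 = {"Name": "bar1", "StartNode": {"Position": {"X": 1.5, "Y": 0.0}}}

differences = list(property_differences(bar_v1, bar_v2))
# differences == [("StartNode.Position.X", 0.0, 1.5)]
```

Each tuple plays the role of one property difference: a name locating the change, the value in the past version, and the value in the following version.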
Finally, exploding the Differences object, we find:
DisplayName: name given to the difference found. This is generally the PropertyName (name of the property that changed), but it can also indicate other things. For example, if a ComparisonInclusion() extension method is defined for some of the input objects (as happens for Revit's RevitParameters), then the DisplayName may also contain some specific naming useful to identify the difference (in the case of RevitParameter, this is the name of the RevitParameter that changed in the modified object).
An example of a DisplayName could be StartNode.Position.X (given a modified object of type
PastValue: the modified value in the
FollowingValue: the modified value in the
FullName: the full name of the modified property. An object difference can always be linked to a precise object property that is different; this is given in the Full Name form, which includes the namespace. An example of this could be BH.oM.Structure.Elements.Bar.StartNode.Position.X. Note that this FullName can be significantly different from the DisplayName (as happens for RevitParameters, where the Full Name will be something like e.g.
Other Diffing methods
In addition to the main Diffing method IDiffing(), there are several other methods that can be used to perform Diffing. These are a bit more advanced and should be used only for specific cases. The additional diffing methods can be found in the Compute folder of Diffing_Engine.
Other than these, Toolkit-specific diffing methods exist to deal with the subtleties of comparing Objects defined in a Toolkit. Users do not generally need to know about these, as Toolkit-specific diffing methods will be automatically called for you if needed by the generic IDiffing method. Just for reference, a Toolkit-specific Diffing method is
These two methods are "ID-based" diffing methods. They simply retrieve an Identifier associated with the input objects, and use it to match objects from the pastObjs set to objects in the followingObjs set, deciding which objects should be compared with which.
DiffWithFragmentId() retrieves object identifiers from the objects' Fragments. You can specify which Fragment you want to get the ID from, and which property of the fragment is the ID.
DiffWithCustomDataKeyId() retrieves object identifiers from the objects' CustomData dictionary. You can specify which dictionary Key you want to get the ID from.
Both methods then call DiffWithCustomIds() to perform the comparison with the extracted Ids, see below.
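
The two ways of retrieving an identifier can be pictured with the short Python sketch below. Everything in it is an assumption for illustration: objects are modelled as dictionaries with "Fragments" and "CustomData" entries, and `id_from_fragment` and `id_from_custom_data` are hypothetical helpers, not BHoM methods.

```python
def id_from_custom_data(obj, key):
    """Read the Id stored under `key` in the object's CustomData; None if absent."""
    return obj.get("CustomData", {}).get(key)


def id_from_fragment(obj, fragment_type, id_property):
    """Read `id_property` from the first Fragment of the given type; None if absent."""
    for fragment in obj.get("Fragments", []):
        if fragment.get("Type") == fragment_type:
            return fragment.get(id_property)
    return None


# Hypothetical object carrying an Id in both places.
bar = {
    "Name": "bar1",
    "CustomData": {"MyId": 42},
    "Fragments": [{"Type": "MyIdFragment", "Id": 42}],
}
```

Either helper yields one Id per object (or `None` when nothing is found), and the resulting Id lists are what an ID-based comparison would then consume.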
DiffWithCustomIds() method allows you to provide:
- Two input objects sets that you want to compare,
- Two input identifiers sets,
followingObjsIds, with the Ids associated to the
You can specify some
null Ids in the
followingObjsIds; however these two lists must have the same number of elements as
The IDs are then used to match the objects from the pastObjs set to objects in the followingObjs set, to decide which objects should be compared with which:
- If an object in the pastObjs set does not have a corresponding object in the followingObjs set, it means that it has been deleted in the following version, so it is identified as "Removed" (old).
- If an object in the followingObjs set does not have a corresponding object in the pastObjs set, it means that it did not exist in the past version, so it is identified as "Added" (new).
- If an object in the pastObjs set matches by ID an object in the followingObjs set, then it is identified as "Modified" (it changed between the two versions). This means that the two objects will be compared and all their differences will be found. This is done by invoking the ObjectDifferences() method, which is explained in detail here.
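
The matching rules above can be sketched in a few lines of Python. This is a conceptual model under stated assumptions, not the BHoM implementation: `DiffSketch` and `diff_with_custom_ids` are hypothetical names, a null (`None`) Id is treated as "never matches", Ids are assumed unique within each set, and matched pairs that compare equal are reported as unchanged rather than modified.

```python
from dataclasses import dataclass, field


@dataclass
class DiffSketch:
    added: list = field(default_factory=list)
    removed: list = field(default_factory=list)
    modified: list = field(default_factory=list)  # (past, following) pairs
    unchanged: list = field(default_factory=list)


def diff_with_custom_ids(past_objs, past_ids, following_objs, following_ids):
    if len(past_objs) != len(past_ids) or len(following_objs) != len(following_ids):
        raise ValueError("Each Id list must have as many elements as its object list.")

    # Index past objects by Id; objects with a None Id cannot be matched.
    past_by_id = {i: o for o, i in zip(past_objs, past_ids) if i is not None}
    result = DiffSketch()
    matched_ids = set()

    for obj, obj_id in zip(following_objs, following_ids):
        if obj_id is not None and obj_id in past_by_id:
            matched_ids.add(obj_id)
            past = past_by_id[obj_id]
            if past == obj:
                result.unchanged.append(obj)
            else:
                result.modified.append((past, obj))
        else:
            result.added.append(obj)  # no counterpart in the past set

    for obj, obj_id in zip(past_objs, past_ids):
        if obj_id is None or obj_id not in matched_ids:
            result.removed.append(obj)  # no counterpart in the following set

    return result
```

For example, diffing past Ids `[1, 2, 3]` against following Ids `[1, 2, 4]` pairs up objects 1 and 2 for comparison, reports object 3 as removed, and reports object 4 as added.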
DiffOneByOne() method simply takes two input lists,
followingObjects, which must contain the objects in exactly the same order. It then compares the objects pairwise, one by one. If the matched objects are equal, they are "Unchanged"; otherwise, they are "Modified" and their property differences are returned.
For this reason, this method is not able to discover "Added" (new) or "Removed" (old) objects.
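
A minimal Python sketch of this positional comparison follows. It is an illustration only: `diff_one_by_one` is a hypothetical function, plain equality stands in for the engine's object comparison, and rejecting lists of different lengths is an assumption.

```python
def diff_one_by_one(past_objs, following_objs):
    """Compare objects by position; return (unchanged, modified) lists."""
    if len(past_objs) != len(following_objs):
        # Assumption: positional matching only makes sense for equally long lists.
        raise ValueError("Both lists must have the same number of objects.")

    unchanged, modified = [], []
    for past, following in zip(past_objs, following_objs):
        if past == following:
            unchanged.append(following)
        else:
            modified.append((past, following))
    return unchanged, modified
```

Because every object is paired with the one at the same index, there is no category left over for objects that exist on one side only.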
DiffWithHash() method simply does a Venn Diagram of the input objects' Hashes:
The Venn Diagram is computed by means of a HashComparer, which simply means that the Hash of all input objects gets computed.
If objects with the same hash are found, they are identified as "Unchanged"; otherwise, objects are either "Added" (new) or "Removed" (old), depending on whether their hash exists exclusively in the following or in the past set. For this reason, this method is not able to discover "Modified" objects.
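
The hash-based Venn Diagram can be sketched as follows in Python. This is not the BHoM hashing: `object_hash` is a hypothetical stand-in that hashes a JSON dump of the object with SHA-256, and `diff_with_hash` is an illustrative name.

```python
import hashlib
import json


def object_hash(obj):
    """Hypothetical hash: SHA-256 of the object's JSON form with sorted keys."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def diff_with_hash(past_objs, following_objs):
    """Return (added, removed, unchanged) based purely on object hashes."""
    past = {object_hash(o): o for o in past_objs}
    following = {object_hash(o): o for o in following_objs}

    added = [o for h, o in following.items() if h not in past]
    removed = [o for h, o in past.items() if h not in following]
    unchanged = [o for h, o in following.items() if h in past]
    return added, removed, unchanged


added, removed, unchanged = diff_with_hash(
    [{"id": 1, "x": 0}, {"id": 2, "x": 0}],
    [{"id": 1, "x": 0}, {"id": 2, "x": 5}],
)
```

In this example the object with `id` 2 changed its `x` value, so its old version shows up in `removed` and its new version in `added`; nothing links the two, which is why a purely hash-based comparison cannot report "Modified" objects.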
The Hash is leveraged by this method so you are able to customise how the diffing behaves by specifying a
ComparisonConfig options in the
This method was designed for the AECDeltas workflow and is currently not widely used.
It essentially expects the input objects to be wrapped into a Revision object, which is useful to attach additional Versioning properties to them.
The Revisions can then be provided as an input to DiffRevisions(), and the logic works very similarly to the other diffing methods seen above.