Diffing: tracking changes between objects

Diffing is the process of determining what changed between two sets of objects.

Typically, the two sets of objects are two versions of the same thing (for example, of a pulled Revit model, or of a Structural Model that we want to Push to an Adapter), in which case Diffing can effectively be used as a Version Control tool.

🤖 Developers: see also the Diffing and Hash: Guide for developers.

The Diffing_Engine gives many ways to perform diffing on sets of objects. Let's see them.

IDiffing method

The most versatile method for diffing is the BH.Engine.Diffing.Compute.Diffing() method, also called IDiffing. Ideally, you should always use this Diffing method, although other alternatives exist for specific cases (see Other diffing methods below). A detailed technical explanation of the IDiffing can be found in the guide for developers.

This method can be found in any UI by simply searching for diffing.

The method takes three inputs:

  • pastObject: objects belonging to a past version, a version that precedes the followingObjects' version.
  • followingObjects: objects belonging to a following version, a version that was created after the pastObject's version.
  • diffingConfig: configurations for the diffing, where you can set your ComparisonConfig object, see below.

The IDiffing, like all diffing methods, relies on an identifier assigned to each object. The identifier is used to match objects, so the method knows which object to compare to which, even across multiple versions of the objects. The identifier is generally a unique number assigned to each object, and this number is assumed to always remain the same even if the object is modified. The identifier looked for is of type IPersistentAdapterId, searched in the object's Fragments; this is typically stored on objects when they are Pulled from an Adapter. This means that the IDiffing works best with objects pulled from a BHoM Adapter that stores the object Id on the object (most of them do).

In case no Identifier can be found on the objects, the IDiffing attempts other diffing methods on the objects; this is explained in more detail in the diffing guide for developers.

The output of every diffing method is always a diff object, which we will describe in a section below.

DiffingConfig (and ComparisonConfig)

The DiffingConfig object can be attached to any Diffing method and allows you to specify options for the Diffing comparison.

The Diffing config has the following inputs:

  • ComparisonConfig allows you to specify all the object comparison options; see its dedicated page.
  • EnablePropertyDiffing: optional, defaults to true. If disabled, Diffing does not check the property-level differences, which makes it run much faster but potentially ignores important changes.
  • IncludedUnchangedObjects: optional, defaults to true. When diffing large sets of objects, you may want to exclude the objects that did not change from the diffing output, to save RAM.
  • AllowDuplicateIds: optional, defaults to false. The diffing generally uses identifiers to track "who is who" and decide which objects to compare; in such operations, duplicates should never be allowed, but there could be edge cases where it is useful to keep them.
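
To make the defaults listed above easier to picture, here is a minimal Python sketch. It is not the BHoM API: `DiffingConfigSketch` and `check_ids` are hypothetical names, the field names are Python-style renderings of the inputs above, and raising an error on duplicate ids is an assumption about how such a flag could be enforced.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class DiffingConfigSketch:
    """Hypothetical stand-in for DiffingConfig; defaults follow the list above."""
    comparison_config: Optional[Any] = None   # object comparison options
    enable_property_diffing: bool = True      # compute property-level differences
    included_unchanged_objects: bool = True   # keep unchanged objects in the output
    allow_duplicate_ids: bool = False         # duplicates are normally rejected


def check_ids(ids: List[Any], config: DiffingConfigSketch) -> List[Any]:
    """Return the non-null ids; reject duplicates unless the config allows them.

    Raising here is an assumption made for this sketch only.
    """
    non_null = [i for i in ids if i is not None]
    if not config.allow_duplicate_ids and len(non_null) != len(set(non_null)):
        raise ValueError("duplicate ids found and allow_duplicate_ids is False")
    return non_null


# A configuration tuned for large sets: skip property-level diffing
# and drop unchanged objects from the output.
fast_config = DiffingConfigSketch(
    enable_property_diffing=False,
    included_unchanged_objects=False,
)
```

In this sketch, `check_ids([1, 1], DiffingConfigSketch())` raises, while the same call with `allow_duplicate_ids=True` returns both ids.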

The Diffing output: the Diff object

The output of any Diffing method is an object of type Diff. The diff output can be Exploded to reveal all the available outputs:

  • AddedObjects: objects present in the second set that are not present in the first set.
  • RemovedObjects: objects not present in the second set that were present in the first set.
  • ModifiedObjects: objects that are recognised as present both in the first set and the second set, but that have some property that is different. The rules that were used to recognise modification are in the DiffingConfig.ComparisonConfig.
  • UnchangedObjects: objects that are recognised as the same in the first and second set.
  • ModifiedObjectsDifferences: all the differences found between the two input sets of objects.
  • DiffingConfig: the specific instance of DiffingConfig that was used to calculate this Diff. Useful in scenarios where a Diff is stored and later inspected.

The ModifiedObjectsDifferences output contains a List of ObjectDifferences objects, one for each modified object, each holding information about that modified object. These can be further Exploded:

  • PastObject: the object in the pastObjs set that was identified as modified (i.e., a different version of the same object was found in the followingObjs set).
  • FollowingObject: the object in the followingObjs set that was identified as modified (i.e., a different version of the same object was found in the pastObjs set).
  • Differences: all the differences found between the two versions of the modified object. This is a List of PropertyDifference objects, one for each difference found on the modified object.

Finally, exploding the Differences object, we find:

  • DisplayName: name given to the difference found. This is generally the PropertyName (name of the property that changed), but it can also indicate other things. For example, if a ComparisonInclusion() extension method is defined for some of the input objects (as happens for Revit's RevitParameters), then the DisplayName may also contain some specific naming useful to identify the difference (in the case of RevitParameter, this is the name of the RevitParameter that changed in the modified object).
    An example of a DisplayName could be StartNode.Position.X (given a modified object of type BH.oM.Structure.Elements.Bar).
  • PastValue: the modified value in the PastObject.
  • FollowingValue: the modified value in the FollowingObject.
  • FullName: the Full Name of the modified property. An object difference can always be linked to a precise object property that is different; this is given in the Full Name form, which includes the namespace. An example of this could be BH.oM.Structure.Elements.Bar.StartNode.Position.X. Note that this FullName can be significantly different from DisplayName (as happens for RevitParameters, where the Full Name will be something like BH.oM.Adapters.Revit.Parameters[3].RevitParameter.Value).
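
The relationship between DisplayName, FullName, PastValue and FollowingValue can be illustrated with a small Python sketch. It is only an illustration, not the Diffing_Engine implementation: objects are modelled as nested dictionaries, and `property_differences` is a hypothetical helper that walks both versions and records one entry per changed leaf property.

```python
def property_differences(past, following, type_name, path=""):
    """Recursively compare two nested dicts; return one record per changed leaf."""
    diffs = []
    for key in sorted(set(past) | set(following)):
        name = f"{path}.{key}" if path else key
        old, new = past.get(key), following.get(key)
        if isinstance(old, dict) and isinstance(new, dict):
            # Descend into nested "objects", extending the property path.
            diffs.extend(property_differences(old, new, type_name, name))
        elif old != new:
            diffs.append({
                "DisplayName": name,                 # path relative to the object
                "FullName": f"{type_name}.{name}",   # path including the type name
                "PastValue": old,
                "FollowingValue": new,
            })
    return diffs


# Two versions of the same (simplified) Bar: only StartNode.Position.X changed.
past_bar = {"Name": "B1", "StartNode": {"Position": {"X": 0.0, "Y": 0.0}}}
following_bar = {"Name": "B1", "StartNode": {"Position": {"X": 1.5, "Y": 0.0}}}

differences = property_differences(
    past_bar, following_bar, "BH.oM.Structure.Elements.Bar"
)
```

Here `differences` holds a single entry whose DisplayName is StartNode.Position.X and whose FullName is BH.oM.Structure.Elements.Bar.StartNode.Position.X, matching the examples given above.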

Other Diffing methods

In addition to the main Diffing method IDiffing(), there are several other methods that can be used to perform Diffing. These are a bit more advanced and should be used only for specific cases. The additional diffing methods can be found in the Compute folder of Diffing_Engine.

Other than these, Toolkit-specific diffing methods exist to deal with the subtleties of comparing Objects defined in a Toolkit. Users do not generally need to know about these, as Toolkit-specific diffing methods will be automatically called for you if needed by the generic IDiffing method. Just for reference, a Toolkit-specific Diffing method is RevitDiffing().

DiffWithFragmentId() and DiffWithCustomDataKeyId()

These two methods are "ID-based" diffing methods. They simply retrieve an Identifier associated with the input objects, and use it to match objects from the pastObjs set to objects in the followingObjs set, deciding which objects should be compared with which.

  • The DiffWithFragmentId() retrieves object identifiers from the objects' Fragments. You can specify which Fragment you want to get the ID from, and which property of the fragment is the ID.
  • The DiffWithCustomDataKeyId() retrieves object identifiers from the objects' CustomData dictionary. You can specify which dictionary Key you want to get the ID from.

Both methods then call DiffWithCustomIds() to perform the comparison with the extracted Ids, see below.
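
The id-extraction step of these two methods can be sketched in Python as follows. This is an illustration under assumptions, not the BHoM code: objects are plain dictionaries with a "Fragments" list and a "CustomData" dictionary, and `ids_from_fragments`, `ids_from_custom_data`, the "Type" key and the "ExampleIdFragment" name are all hypothetical.

```python
def ids_from_fragments(objs, fragment_type, id_property):
    """Read the id from the first fragment of the given type; None if absent."""
    ids = []
    for obj in objs:
        fragment = next(
            (f for f in obj.get("Fragments", []) if f.get("Type") == fragment_type),
            None,
        )
        ids.append(fragment.get(id_property) if fragment else None)
    return ids


def ids_from_custom_data(objs, key):
    """Read the id stored under the given CustomData key; None if absent."""
    return [obj.get("CustomData", {}).get(key) for obj in objs]


objs = [
    {"Fragments": [{"Type": "ExampleIdFragment", "Id": 101}],
     "CustomData": {"MyId": "a"}},
    {"Fragments": [], "CustomData": {}},  # no identifier available
]

fragment_ids = ids_from_fragments(objs, "ExampleIdFragment", "Id")
custom_ids = ids_from_custom_data(objs, "MyId")
```

Both helpers return one id per input object, in order, using None where no id is found, so the resulting lists can be passed on together with the objects to an id-based comparison.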

DiffWithCustomIds()

The DiffWithCustomIds() method allows you to provide:

  • Two input object sets that you want to compare, pastObjs and followingObjs;
  • Two input identifier sets, pastObjsIds and followingObjsIds, with the Ids associated to the pastObjs and followingObjs.

You can specify some null Ids in the pastObjsIds and followingObjsIds; however these two lists must have the same number of elements as pastObjs and followingObjs, respectively.

The IDs are then used to match the objects from the pastObjs set to objects in the followingObjs set, to decide which objects should be compared with which:

  • If an object in pastObjs does not have a corresponding object in the followingObjs set, it means that it has been deleted in the following version, so it is identified as "Removed" (old).
  • If an object in followingObjs does not have a corresponding object in the pastObjs set, it means that it did not exist in the past version, so it is identified as "Added" (new).
  • If an object in pastObjs matches by ID an object in followingObjs, then the two objects are compared and all their differences are found. If any difference is found, the object is identified as "Modified" (it changed between the two versions). The comparison is done by invoking the ObjectDifferences() method, which is explained in detail here.
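
The matching rules above can be sketched in Python as follows. This is a simplified illustration, not the Diffing_Engine implementation: `diff_with_custom_ids` is a hypothetical function, objects are compared with plain equality instead of ObjectDifferences(), objects with a null id are simply skipped (an assumption of this sketch), and matched objects that compare equal are reported as unchanged.

```python
def diff_with_custom_ids(past_objs, past_ids, following_objs, following_ids):
    """Match objects by id and classify them as removed, added, modified or unchanged."""
    if len(past_objs) != len(past_ids) or len(following_objs) != len(following_ids):
        raise ValueError("each id list must have as many elements as its object list")

    # Objects with a null (None) id cannot be matched and are skipped here.
    past_by_id = {i: o for o, i in zip(past_objs, past_ids) if i is not None}
    following_by_id = {
        i: o for o, i in zip(following_objs, following_ids) if i is not None
    }

    removed = [o for i, o in past_by_id.items() if i not in following_by_id]
    added = [o for i, o in following_by_id.items() if i not in past_by_id]

    modified, unchanged = [], []
    for i, past in past_by_id.items():
        if i in following_by_id:
            following = following_by_id[i]
            # Plain equality stands in for a property-level comparison.
            (unchanged if past == following else modified).append((past, following))

    return {"removed": removed, "added": added,
            "modified": modified, "unchanged": unchanged}


result = diff_with_custom_ids(
    [{"n": "a"}, {"n": "b"}, {"n": "c"}], [1, 2, 3],
    [{"n": "a"}, {"n": "B"}, {"n": "d"}], [1, 2, 4],
)
```

In this example, id 3 exists only in the past set (removed), id 4 only in the following set (added), id 2 is matched with a change (modified), and id 1 is matched without changes (unchanged).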

DiffOneByOne()

The DiffOneByOne() method takes two input lists, pastObjs and followingObjs, which are assumed to contain the objects in exactly the same order. It then compares the objects one-by-one, pairing them by position. If two paired objects are equal, they are "Unchanged"; otherwise, they are "Modified" and their property differences are returned.

For this reason, this method is not able to discover "Added" (new) or "Removed" (old) objects.
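
A minimal Python sketch of this position-based comparison is shown below. It is an illustration only: `diff_one_by_one` is a hypothetical function, plain equality stands in for the property-level comparison, and requiring both lists to have the same length is an assumption that follows from pairing objects by position.

```python
def diff_one_by_one(past_objs, following_objs):
    """Pair objects by position and classify each pair as modified or unchanged."""
    if len(past_objs) != len(following_objs):
        raise ValueError("both lists must contain the same number of objects")

    modified, unchanged = [], []
    for past, following in zip(past_objs, following_objs):
        if past == following:
            unchanged.append(following)
        else:
            modified.append((past, following))

    # There is no notion of "added" or "removed": every object has a partner.
    return {"modified": modified, "unchanged": unchanged}


one_by_one = diff_one_by_one(
    [{"n": "a"}, {"n": "b"}],
    [{"n": "a"}, {"n": "B"}],
)
```

Because each object is always paired with the one at the same index, the result can only ever contain modified and unchanged entries.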

DiffWithHash()

The DiffWithHash() method simply computes a Venn Diagram of the input objects' Hashes.

The Venn Diagram is computed by means of a HashComparer, which simply means that the Hash of all input objects gets computed.

If objects with the same hash are found, they are identified as "Unchanged"; otherwise, objects are either "Added" (new) or "Removed" (old), depending on whether their hash exists exclusively in the following or in the past set. For this reason, this method is not able to discover "Modified" objects.

Because this method relies on the Hash, you can customise how the diffing behaves by specifying ComparisonConfig options in the DiffingConfig.
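
The hash-based Venn Diagram can be sketched in Python as follows. This is not the BHoM hashing algorithm: `object_hash` and `diff_with_hash` are hypothetical helpers, the hash is a SHA-256 of a JSON serialisation, and the `ignored_keys` parameter is only a stand-in for the kind of option a ComparisonConfig could provide.

```python
import hashlib
import json


def object_hash(obj, ignored_keys=()):
    """Hash a dict-like object, skipping any keys excluded from the comparison."""
    filtered = {k: v for k, v in obj.items() if k not in ignored_keys}
    payload = json.dumps(filtered, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def diff_with_hash(past_objs, following_objs, ignored_keys=()):
    """Classify objects by whether their hash appears in one set or in both."""
    past = {object_hash(o, ignored_keys): o for o in past_objs}
    following = {object_hash(o, ignored_keys): o for o in following_objs}
    return {
        "removed": [o for h, o in past.items() if h not in following],
        "added": [o for h, o in following.items() if h not in past],
        "unchanged": [o for h, o in following.items() if h in past],
    }


past_set = [{"id": 1, "x": 0}, {"id": 2, "x": 5}]
following_set = [{"id": 1, "x": 0}, {"id": 2, "x": 6}]

strict = diff_with_hash(past_set, following_set)
ignoring_x = diff_with_hash(past_set, following_set, ignored_keys=("x",))
```

With the default settings, the object whose x changed shows up once as removed and once as added, never as modified. When x is excluded from the hash, both objects are reported as unchanged, which illustrates how comparison options change the outcome.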

DiffRevisions()

This method was designed for the AECDeltas workflow and is currently not widely used.

It essentially expects the input objects to be wrapped into a Revision object, which is useful to attach additional Versioning properties to them. The Revisions can then be provided as an input to DiffRevisions(), and the logic works very similarly to the other diffing methods seen above.


Last update: September 20, 2023