
1E 8.1 (on-premises)

Device Deduplication

Several scenarios can lead to the same device being reported more than once within Tachyon, and these duplicates can find their way into Experience and other applications. The main issue is that duplicates have different TachyonGuids. TachyonGuid is the primary distinct identifier of a device, so the total number of devices calculated by the system can easily be higher than the real-world number. It also means that data associated with a device (for example, Experience performance data) will eventually be split amongst multiple TachyonGuids, affecting access to the device's data as well as affecting aggregations.

In most cases, deduplicating using the device FQDN should suffice, in that it is reasonable to assume that multiple devices with the same FQDN are the same device. However, this is not true in all cases. For various reasons some organizations do not use FQDN to identify a device; they may instead uniquely identify devices based on Make, Model and Serial Number. For this reason, the deduplication feature can be configured with one or more device attributes to uniquely identify a distinct device.

Warning

The Device Deduplication feature is not enabled until you have modified the Coordinator configuration settings as described below.

The configuration steps described below for updating the Coordinator configuration file specify False (default) for each of the configuration arguments. You are strongly advised to leave all configuration arguments, except for DeviceIdentifier, set to False (default) for the first run and whenever a new DeviceIdentifier value is being set. Results should then be analysed to ensure that the DeviceIdentifier specified is correct for your environment.

Configuration

Tachyon stores its Device Deduplication configuration in the <INSTALLDIR>\Tachyon\Coordinator\Tachyon.Server.Coordinator.exe.config file.

After any changes, reboot the server or restart the 1E Tachyon Coordinator service.

To change when and how often the Device Deduplication process runs, add the DeviceDeduplication setting and edit its value.

<module assemblyName="Tachyon.Server.Coordinator" providerName="Tachyon.Server.Coordinator.Scheduling.SchedulingModule">
  <settings>
    <add key="Crontab1" value="50 23 * * * LogDevicesSeenInTheLast7Days" />
    <add key="Crontab2" value="15 * * * * MaintainPolicyRulePartitioning" />
    <add key="Telemetry" value="40 23 * * 5 SendTelemetryStats" />
    <add key="DeviceDeduplication" value="0 1 * * 0 DeviceDeduplication {DeleteDevices:'false',UpdateDeviceData:'false',DeleteOrphanedDeviceData:'false',DeviceIdentifier:['FQDN']}" />
  </settings>
</module>

The numbers that you see after value are a crontab schedule expression. The schedule 0 1 * * 0 means "at 01:00 hours, any day of the month, any month of the year, on weekday 0 (Sunday)". In other words, this configuration runs the Device Deduplication process at 01:00 UTC every Sunday, that is, in the early hours of Sunday morning.

You can use an online tool such as https://crontab.guru/ to verify your crontab schedule expression.
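To make the layout of a scheduling value concrete, here is a minimal Python sketch that splits one into its five crontab fields, the task name and the argument text. It is only an illustration of the format shown above, not how the Coordinator parses the setting; the names ScheduledTask and parse_setting_value are hypothetical.

```python
from typing import NamedTuple


class ScheduledTask(NamedTuple):
    minute: str
    hour: str
    day_of_month: str
    month: str
    day_of_week: str
    task: str
    arguments: str  # raw text after the task name, empty if there is none


def parse_setting_value(value: str) -> ScheduledTask:
    """Split a scheduling value into five crontab fields, a task name and arguments."""
    # maxsplit=6 keeps everything after the task name together in one piece.
    parts = value.split(None, 6)
    if len(parts) < 6:
        raise ValueError("expected five crontab fields followed by a task name")
    arguments = parts[6] if len(parts) == 7 else ""
    return ScheduledTask(*parts[:6], arguments)


entry = parse_setting_value(
    "0 1 * * 0 DeviceDeduplication "
    "{DeleteDevices:'false',UpdateDeviceData:'false',"
    "DeleteOrphanedDeviceData:'false',DeviceIdentifier:['FQDN']}"
)
print(entry.hour, entry.minute, entry.day_of_week, entry.task)
```

Reading the fields back this way is a quick check that the schedule you typed is the one you intended before restarting the service.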

Warning

Please ensure there are no spaces in the arguments text inside the braces { }.

Do not change the other crontab keys unless advised by 1E.
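One way to avoid stray spaces is to generate the argument text rather than type it by hand. The following Python sketch assembles it from its parts and refuses to return text containing whitespace. The function name build_arguments is hypothetical, and writing true values as 'true' is an assumption that mirrors the 'false' form shown in the configuration above.

```python
def build_arguments(delete_devices=False, update_device_data=False,
                    delete_orphaned_device_data=False, passes=(("FQDN",),)):
    """Return the argument text in braces, with no spaces anywhere inside it."""
    def flag(value):
        # Assumption: true is written as 'true', mirroring the documented 'false'.
        return "'true'" if value else "'false'"

    # Each pass is a comma-separated attribute list wrapped in single quotes.
    identifier = ",".join("'" + ",".join(attrs) + "'" for attrs in passes)
    text = (
        "{DeleteDevices:" + flag(delete_devices)
        + ",UpdateDeviceData:" + flag(update_device_data)
        + ",DeleteOrphanedDeviceData:" + flag(delete_orphaned_device_data)
        + ",DeviceIdentifier:[" + identifier + "]}"
    )
    if any(ch.isspace() for ch in text):
        raise ValueError("argument text must not contain spaces")
    return text


print(build_arguments())
print(build_arguments(passes=(("SerialNumber", "SMBiosGuid"), ("FQDN",))))
```

With the defaults, the function reproduces the argument text from the DeviceDeduplication setting shown earlier.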

Modifying when Device Deduplication runs

By default, Device Deduplication runs at 01:00 UTC every Sunday. Another process called Experience Synchronization runs at 02:00 UTC daily. It is very important that Experience Synchronization is not running when Device Deduplication starts. If you change the start time of either process, you must ensure that the Experience Synchronization process starts after the Device Deduplication process starts, ideally with a gap of at least 1 hour to avoid overlap. You may need a longer gap for very large systems.
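The gap rule can be checked before editing the file. This Python sketch compares the start times of two schedules and reports whether Experience Synchronization starts at least 60 minutes after Device Deduplication. It is deliberately simple: it assumes the minute and hour fields are plain numbers (no wildcards, lists or steps) and that both start times fall on the same day. The function names are hypothetical.

```python
def start_minute_of_day(crontab: str) -> int:
    """Minutes after midnight for a crontab whose minute and hour fields are plain numbers."""
    minute, hour = crontab.split()[:2]
    return int(hour) * 60 + int(minute)


def gap_minutes(dedup_crontab: str, sync_crontab: str) -> int:
    """Minutes from the Device Deduplication start to the Experience Synchronization start."""
    return start_minute_of_day(sync_crontab) - start_minute_of_day(dedup_crontab)


def is_safe(dedup_crontab: str, sync_crontab: str, minimum_gap: int = 60) -> bool:
    """True when Experience Synchronization starts at least minimum_gap minutes later."""
    return gap_minutes(dedup_crontab, sync_crontab) >= minimum_gap


print(gap_minutes("0 1 * * 0", "0 2 * * *"))
print(is_safe("30 1 * * 0", "0 2 * * *"))
```

For very large systems, pass a larger minimum_gap to reflect the longer run time you expect.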

Arguments

Running with all "Update..." and "Delete..." arguments set to false makes no database changes and is considered a diagnostic/practice run. Following such a run, you can analyze the DeviceDeduplication and DeviceDeduplicationLog tables to determine how many duplicates are in the environment based on the DeviceIdentifier input. This provides a projected impact for a subsequent run with one or more of these arguments set to true. Alternatively, you can modify the DeviceIdentifier value to include more, fewer or different columns so that it more precisely matches what you expect a truly unique device to be.

| Argument | Default | Usage |
| --- | --- | --- |
| Instance | Instance number of last run +1 | This can optionally be added to the list of parameters in the app setting. By default, the orchestration handles this automatically. If you specify it, previous instance numbers can be found in the DeviceDeduplicationLog table. A use case for specifying this is retroactive execution; see the Instance section below for more details. |
| DeleteDevices | False | Delete duplicate devices from both databases, and any management group associations for those devices. |
| UpdateDeviceData | False | Relink TachyonExperience performance data from duplicate devices (that will be deleted) to the singular active device that will be kept. |
| DeleteOrphanedDeviceData | False | Remove TachyonExperience performance data that is associated with a device that no longer exists. |
| DeviceIdentifier | ['FQDN'] | This is an array of one or more passes, with each pass made up of one or more device attributes (column names) from the Device table in the TachyonMaster database. Please refer to Multiple Passes of Deduplication below for more details about using multiple passes. |

You can use any column from the Device table that is a device attribute; however, you should only use attributes that uniquely identify a device. Device attributes that are commonly used to uniquely identify a device are: Fqdn, SMBiosGuid, MAC, SerialNumber, Manufacturer, Model, Domain, Name, User, and Location. Some of these attributes can be used on their own, whilst others must be used in combination. You should avoid using TachyonGuid, or any of the date columns.

Each pass must be wrapped in single quotes. The attributes specified within a pass are used to create a single hash per device. When using multiple passes, or multiple attributes within a pass, they must each be separated by a comma.

For example: DeviceIdentifier:['SerialNumber,SMBiosGuid','Manufacturer,Model,Location','FQDN',]

Note

Any devices that contain identical values across all the specified attributes will be assigned the same hash. Of these devices, the one that has the most recent LastConnUtc value will be marked as ToBeDeleted=0 whilst the others will be marked as ToBeDeleted=1. This is how we identify duplicates as well as the single active device that we want to keep. The CreatedUtc that is retained for the kept device is not necessarily the earliest or the latest of all its duplicates.
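The marking rule in this note can be sketched in a few lines of Python. This is an illustration only: the real hash algorithm is not documented here, so SHA-256 over the joined attribute values is an assumption, as are the function name mark_duplicates, the sample data, and the choice to emit rows only for devices that have duplicates.

```python
import hashlib
from collections import defaultdict


def mark_duplicates(devices, attributes):
    """Group devices by the chosen attributes and flag all but the most recent in each group."""
    groups = defaultdict(list)
    for device in devices:
        key = "|".join(str(device[name]) for name in attributes)
        # Hypothetical hash choice; the product's own algorithm may differ.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        groups[digest].append(device)

    rows = []
    for digest, members in groups.items():
        if len(members) < 2:
            continue  # a device with a unique hash has no duplicates
        # The device with the most recent LastConnUtc is the one to keep.
        keeper = max(members, key=lambda d: d["LastConnUtc"])
        for device in members:
            rows.append({
                "TachyonGuid": device["TachyonGuid"],
                "Hash": digest,
                "ToBeDeleted": 0 if device is keeper else 1,
            })
    return rows


devices = [
    {"TachyonGuid": "A", "Fqdn": "pc1.example.com", "LastConnUtc": "2023-01-01T00:00:00"},
    {"TachyonGuid": "B", "Fqdn": "pc1.example.com", "LastConnUtc": "2023-03-01T00:00:00"},
    {"TachyonGuid": "C", "Fqdn": "pc2.example.com", "LastConnUtc": "2023-02-01T00:00:00"},
]
rows = mark_duplicates(devices, ["Fqdn"])
print({row["TachyonGuid"]: row["ToBeDeleted"] for row in rows})
```

In this sample, devices A and B share an Fqdn, so B (the most recent connection) is kept and A is flagged, while C has no duplicate and does not appear in the result.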

Note

If the column schema of the Device table changes between runs, then the DeviceDeduplication table will have to be backed up and deleted before the next run. This will generate a new version of the table in the next run with up-to-date device columns, which also means new columns can now be specified within the DeviceIdentifier value.

The following table describes the system behavior for every permutation of boolean arguments passed to DeviceDeduplication:

| DeleteDevices | UpdateDeviceData | DeleteOrphanedDeviceData | Device Deduplication process behavior |
| --- | --- | --- | --- |
| F | F | F | This is the default configuration (3 x False). Before you change these settings, you should run a diagnostic/practice run where no changes will be made to the devices or their data. Check the DeviceDeduplication and DeviceDeduplicationLog tables to analyze the results. |
| T | F | F | Delete devices identified as duplicates, do not update or merge any existing performance data. |
| F | T | F | Do not delete devices identified as duplicates, reassign duplicate devices' performance data to the single device that has been identified as the most recent instance of the device. *1 |
| F | F | T | Do not delete devices identified as duplicates, do not update or merge any existing performance data. Delete any performance data that is associated with a TachyonGuid that no longer exists in the TachyonMaster Device table. This might be useful as a database cleanup exercise to remove "unparented data". *1 *2 |
| T | T | F | Delete devices identified as duplicates and merge any existing performance data that was associated with duplicates. |
| F | T | T | Do not delete devices identified as duplicates, reassign duplicate devices' performance data to the single device that has been identified as the most recent instance of the device. Finally, after relinking the data, delete any performance data that is associated with a TachyonGuid that no longer exists in the TachyonMaster Device table. *1 |
| T | F | T | Delete devices identified as duplicates, do not update or merge any existing performance data. Delete any performance data that is associated with a TachyonGuid that no longer exists in the TachyonMaster Device table. *2 |
| T | T | T | Delete devices identified as duplicates, reassign duplicate devices' performance data to the single device that has been identified as the most recent instance of the device. Finally, after relinking the data, delete any performance data that is associated with a TachyonGuid that no longer exists in the TachyonMaster Device table. |

*1 - This could lead to continued inflated device counts in the system

*2 - Be careful as this data will be deleted and cannot be recovered or merged at a later date even if the TachyonGuid that reported it comes online again.
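The eight combinations follow a simple pattern, which the Python sketch below encodes so that a planned configuration can be checked against the table. The function name planned_actions is hypothetical, and the action list follows the column order of the table rather than the order in which the process performs the steps.

```python
def planned_actions(delete_devices, update_device_data, delete_orphaned_device_data):
    """Return (actions, cautions) for one combination of the three boolean arguments."""
    actions = []
    if delete_devices:
        actions.append("delete duplicate devices")
    if update_device_data:
        actions.append("relink performance data to the kept device")
    if delete_orphaned_device_data:
        actions.append("delete orphaned performance data")

    cautions = []
    # *1: duplicates stay in the system, so device counts may remain inflated.
    if not delete_devices and (update_device_data or delete_orphaned_device_data):
        cautions.append("*1")
    # *2: orphaned data is removed without being relinked first and cannot be recovered.
    if delete_orphaned_device_data and not update_device_data:
        cautions.append("*2")
    return actions, cautions


print(planned_actions(False, False, False))
print(planned_actions(False, False, True))
print(planned_actions(True, True, True))
```

An empty action list corresponds to the diagnostic/practice run in the first row of the table.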

Instance

Each run of Device Deduplication will result in entries in the DeviceDeduplicationLog table. All rows for the run will be assigned an instance number. The Instance number is also used to be able to share context between the TachyonMaster and TachyonExperience parts of the process.

Retroactive execution: An old instance can also be passed to the process via the app setting. When an old instance is provided, duplicates are not recalculated; instead, the list of devices from that previous run is used (from the DeviceDeduplication table). This is also logged in the DeviceDeduplicationLog table as an entry that reads "This is a rerun of a previous instance. Previous history from the DeviceDeduplication table will be used for this instance". In an execution such as this, any new value for DeviceIdentifier is ignored, but all other new arguments are applied.

Multiple Passes of Deduplication

The DeviceIdentifier argument is a comma-separated list of passes, each pass itself being a comma-separated list of device attributes. Passes are run in the order in which they are specified. Each pass further refines duplicate devices against the results of the previous pass. There is no limit to the number of passes specified.

For example, the passes listed in the table below would be run if the DeviceIdentifier is set to DeviceIdentifier:['SerialNumber,SMBiosGuid','Manufacturer,Model,Location','FQDN',]:

| Pass number | Device attributes used | Identify duplicates based on... |
| --- | --- | --- |
| 1 | SerialNumber, SMBiosGuid | Devices in the TachyonMaster Device table that have the same SerialNumber AND SMBiosGuid. |
| 2 | Manufacturer, Model, Location | Devices identified as duplicates in pass 1 that also have the same Manufacturer, Model and Location. |
| 3 | FQDN | Devices identified as duplicates in pass 2 that also have the same FQDN. |

The results of pass 3 alone are then used for the rest of the process and considered duplicates going forward.
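The way each pass narrows the result of the previous one can be sketched in Python as repeated grouping. This is an illustrative model of the behavior described above, not the product's implementation; the function names refine and find_duplicates and the sample devices are hypothetical.

```python
from collections import defaultdict


def refine(groups, attributes):
    """Split each candidate group by the given attributes, keeping only groups of 2 or more."""
    refined = []
    for group in groups:
        buckets = defaultdict(list)
        for device in group:
            buckets[tuple(device[name] for name in attributes)].append(device)
        refined.extend(bucket for bucket in buckets.values() if len(bucket) > 1)
    return refined


def find_duplicates(devices, passes):
    """Apply each pass in order; the groups left after the last pass are the duplicates."""
    groups = [list(devices)]
    for attributes in passes:
        groups = refine(groups, attributes)
    return groups


devices = [
    {"TachyonGuid": "A", "SerialNumber": "S1", "SMBiosGuid": "G1", "Fqdn": "pc1.example.com"},
    {"TachyonGuid": "B", "SerialNumber": "S1", "SMBiosGuid": "G1", "Fqdn": "pc1.example.com"},
    {"TachyonGuid": "C", "SerialNumber": "S1", "SMBiosGuid": "G1", "Fqdn": "pc9.example.com"},
    {"TachyonGuid": "D", "SerialNumber": "S2", "SMBiosGuid": "G2", "Fqdn": "pc1.example.com"},
]
result = find_duplicates(devices, [["SerialNumber", "SMBiosGuid"], ["Fqdn"]])
print([[device["TachyonGuid"] for device in group] for group in result])
```

In this sample, pass 1 groups A, B and C together, and pass 2 then drops C because its Fqdn differs. Device D never qualifies, even though it shares an Fqdn with A and B, because it was not identified as a duplicate in pass 1.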

Experience Synchronization

If any of the Boolean arguments are set to true, Device Deduplication may affect existing data in the system, whether through relinking to a new TachyonGuid or removing as orphaned data. For this reason, when the Device Deduplication process completes (with any settings set to true), it triggers a full Experience Synchronization. By default, the regular Experience Synchronization process is incremental, and runs at 2AM UTC daily.

The full Experience Synchronization includes a full process of the TachyonExperience aggregated data, which is initiated regardless of the current ProcessMode option specified (GlobalSetting table in the TachyonExperience database). A full process is required because data that Tachyon typically handles as additive can have its historical values modified by the DeviceDeduplication process, which could otherwise lead to inaccurate counts and aggregations.

For large environments, the full process of the aggregable data is likely to be a long-running background process. It will not impact data displayed in the UI until it has completed. If it conflicts with an ongoing Experience Synchronization process, it will be skipped. Server resource demand is likely to increase during an aggregation process. Be aware that, depending on the resources available, this could affect the query response time of the UI.

The latest Accumulated Hotfix for Tachyon Platform Server allows you to add the ExperienceSync setting to the Coordinator configuration file. You do not need to add or modify this unless advised by 1E.

<module assemblyName="Tachyon.Server.Coordinator" providerName="Tachyon.Server.Coordinator.Scheduling.SchedulingModule">
  <settings>
    <add key="Crontab1" value="50 23 * * * LogDevicesSeenInTheLast7Days" />
    <add key="Crontab2" value="15 * * * * MaintainPolicyRulePartitioning" />
    <add key="Telemetry" value="40 23 * * 5 SendTelemetryStats" />
    <add key="DeviceDeduplication" value="0 1 * * 0 DeviceDeduplication {DeleteDevices:'false',UpdateDeviceData:'false',DeleteOrphanedDeviceData:'false',DeviceIdentifier:['FQDN']}" />
    <add key="ExperienceSync" value="0 2 * * * ExperienceSynchronization" />
  </settings>
</module>

The numbers that you see after value are a crontab schedule expression. The schedule 0 2 * * * means "at 02:00 hours, any day of the month, any month of the year, any day of the week". In other words, this configuration runs the Experience Synchronization process at 2AM UTC daily.

You can use an online tool such as https://crontab.guru/ to verify your crontab schedule expression.

Logging and analysing results

The Device Deduplication process creates logs in the Tachyon.Coordinator.log file.

The DeviceDeduplicationLog table - in the TachyonMaster database - logs more granular detail including the arguments used for the instance, details of each pass if more than one is specified, numbers of duplicates found, as well as rows affected in specific tables by various actions.

The DeviceDeduplication table - in the TachyonMaster database - stores a list of devices to be kept and their duplicates for the instance. The ToBeDeleted column is either 0 (false) for a device to be kept, or 1 (true) for a duplicate to be deleted. Kept and duplicate devices are associated by the Hash column.

With all arguments set to false you can review whether the specified DeviceIdentifier is able to uniquely identify devices and genuine duplicates. Once you are sure the DeviceIdentifier is suitable, you can then change other arguments from false to true according to your needs.
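When reviewing a diagnostic run, a quick summary of the DeviceDeduplication rows helps judge whether the chosen DeviceIdentifier is too broad. The Python sketch below works on rows that have already been exported as dictionaries with the Hash and ToBeDeleted columns described above. The function name summarise and the sample rows are hypothetical.

```python
from collections import Counter


def summarise(rows):
    """Count kept devices and duplicates from DeviceDeduplication-style rows."""
    kept = sum(1 for row in rows if row["ToBeDeleted"] == 0)
    duplicates = sum(1 for row in rows if row["ToBeDeleted"] == 1)
    # The largest group is worth inspecting first: a very large group often
    # means the identifier matches devices that are not genuine duplicates.
    largest = Counter(row["Hash"] for row in rows).most_common(1)
    return {
        "devices_kept": kept,
        "duplicates_to_delete": duplicates,
        "largest_group": largest[0] if largest else None,
    }


rows = [
    {"Hash": "h1", "ToBeDeleted": 0},
    {"Hash": "h1", "ToBeDeleted": 1},
    {"Hash": "h1", "ToBeDeleted": 1},
    {"Hash": "h2", "ToBeDeleted": 0},
    {"Hash": "h2", "ToBeDeleted": 1},
]
print(summarise(rows))
```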

When the DeleteDevices argument is set to true, the Device table is updated for each device where older duplicates of it have been identified. Each such device has a timestamp set in the LastDeduplicationUtc column, and an Experience event is logged. The timestamp is updated when one or more new duplicates are found. There is also a DeduplicationCount column, which represents the number of complete Device Deduplication processes where a device has been found to have duplicates. These values are only set on the device identified as being the most recent device that will be kept, not its duplicates.

Devices with a LastDeduplicationUtc value can be found using the below SQL query.

Genuine devices that have had duplicates identified

SELECT * FROM [TachyonMaster].[dbo].[Device] WHERE LastDeduplicationUtc IS NOT NULL AND DeduplicationCount IS NOT NULL

The Experience event can be viewed within Experience (Devices → {Select a device} → Logs).

The details of the log show which attributes were used when the device was identified as having duplicates. If multiple passes were specified, they are delimited with a '#' in the order of execution. The IdentifyingHash value can be used to query the Hash column of the DeviceDeduplication table in the TachyonMaster database.

This will provide all device details of the device itself as well as all of its duplicate devices.
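The lookup by hash can be rehearsed safely away from the production database. The Python sketch below uses an in-memory SQLite database as a stand-in for TachyonMaster. The table has a reduced, hypothetical set of columns (only Hash and ToBeDeleted are taken from the description above), and the function name devices_for_hash and the sample values are assumptions.

```python
import sqlite3

# In-memory stand-in for the DeviceDeduplication table, with a reduced column set.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DeviceDeduplication "
    "(TachyonGuid TEXT, Fqdn TEXT, Hash TEXT, ToBeDeleted INTEGER)"
)
conn.executemany(
    "INSERT INTO DeviceDeduplication VALUES (?, ?, ?, ?)",
    [
        ("A", "pc1.example.com", "h1", 1),
        ("B", "pc1.example.com", "h1", 0),
        ("C", "pc2.example.com", "h2", 0),
    ],
)


def devices_for_hash(connection, identifying_hash):
    """Return the kept device first, followed by its duplicates, for one hash value."""
    cursor = connection.execute(
        "SELECT TachyonGuid, Fqdn, ToBeDeleted FROM DeviceDeduplication "
        "WHERE Hash = ? ORDER BY ToBeDeleted",
        (identifying_hash,),
    )
    return cursor.fetchall()


print(devices_for_hash(conn, "h1"))
```

Passing the hash as a query parameter, rather than pasting it into the SQL text, avoids quoting mistakes when copying the IdentifyingHash value from the event log.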
