What is Tachyon Agent Historic Data Capture?
On Windows devices the Tachyon Agent captures events continuously, so every significant event is recorded as it happens. This contrasts with polling, which relies to some degree on luck to catch conditions brief enough to fall between polls. In this respect Tachyon Agent Historic Data Capture compares with the Windows Task Manager or Perfmon. Tachyon writes the data to a compressed and encrypted database, which keeps the impact on device performance and security very low.
The data is captured to a local, encrypted persistent store and then periodically aggregated into rolling hourly, daily and monthly windows. This keeps the data secure and minimizes its volume while preserving its usefulness.
Configuration options for each capture source are described in the public documentation reference for Tachyon Agent configuration properties.
What are the data capture sources?
The table below lists currently supported capture sources, and on which OS they are supported.
The Agent has two key mechanisms for knowing when an event of interest occurs: event-based and polling-based.
- Event-based relies on a source external to the Agent (normally the operating system) providing a notification that something has happened.
- Polling-based is where the Agent periodically checks a source of data and works out what has changed by looking at differences in the data returned.
The Windows Agent can be configured to use polling instead of Event Tracing for Windows (ETW) for individual capture sources.
When using the polling method, the Agent polls every 30 seconds.
Historic data source | Description | Windows | MacOS | Linux | Solaris | Android |
---|---|---|---|---|---|---|
DNS resolutions | The Agent captures whenever a DNS address is resolved. | ETW (Windows 8.1 and above), Polling (Windows 8 and below) | Polling | Not yet available | Not yet available | Not yet available |
Process executions | The Agent captures whenever a process starts on the device. | ETW (Windows Vista and above), Polling (Windows XP) | Polling | Polling | Polling | Not yet available |
Software installations | The Agent captures which software is present on a device, and when it is installed and uninstalled. | Polling | Polling | Polling | Polling | Not yet available |
Outbound TCP connections | The Agent captures whenever an outbound TCP connection is made. | | Polling | Polling | Not yet available | Not yet available |
How do I retrieve the data from the Tachyon Agent devices?
Live and aggregated historic data is available in inventory tables.
Historic data source | Live tables | Hourly tables | Daily tables | Monthly tables |
---|---|---|---|---|
DNS resolutions | $DNS_Live | $DNS_Hourly | $DNS_Daily | $DNS_Monthly |
Process executions | $Process_Live | $Process_Hourly | $Process_Daily | $Process_Monthly |
Software installations | $Software_Live | $Software_Hourly | $Software_Daily | $Software_Monthly |
Outbound TCP connections | $TCP_Live | $TCP_Hourly | $TCP_Daily | $TCP_Monthly |
/* Sum the number of connections made per process today */
SELECT SUM(ConnectionCount) AS Connections, ProcessName
FROM $TCP_Daily
WHERE TS = DATETRUNC(STRFTIME("%s", "now"), "day")
GROUP BY ProcessName;

/* List the live process executions of chrome.exe */
SELECT * FROM $Process_Live WHERE ProcessName = "chrome.exe";
Note that because the inventory tables are not created with COLLATE NOCASE, string comparisons against them are case-sensitive. The example above therefore won't match "Chrome.exe" or "chrome.EXE". To work around this, use WHERE ProcessName LIKE "chrome.exe".
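The difference between = and LIKE can be reproduced in any SQLite database. The sketch below uses Python's built-in sqlite3 module and a hypothetical stand-in table called Process_Live; it assumes the Agent's store follows standard SQLite comparison rules, and it is not how inventory tables are queried through Tachyon.

```python
import sqlite3

# In-memory stand-in for a live process table; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Process_Live (ProcessName TEXT)")
conn.executemany(
    "INSERT INTO Process_Live VALUES (?)",
    [("chrome.exe",), ("Chrome.exe",), ("chrome.EXE",), ("notepad.exe",)],
)


def count_matches(where_clause: str) -> int:
    """Count rows matching a WHERE clause that takes one bound parameter."""
    sql = f"SELECT COUNT(*) FROM Process_Live WHERE {where_clause}"
    return conn.execute(sql, ("chrome.exe",)).fetchone()[0]


# Default (BINARY) collation: only the exact-case row matches.
exact = count_matches("ProcessName = ?")
# LIKE ignores ASCII case by default in SQLite.
like = count_matches("ProcessName LIKE ?")
# Explicit case-insensitive equality.
nocase = count_matches("ProcessName = ? COLLATE NOCASE")

print(exact, like, nocase)  # prints: 1 3 3
```

LIKE also treats % and _ as wildcards, so a name containing an underscore may match more rows than intended; an equality test with COLLATE NOCASE avoids that where the query engine accepts it.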
How is the data managed?
The Tachyon Agent automatically aggregates and grooms data in each inventory table. Aggregation intervals and data retention are configurable in the Agent configuration file.
- Default aggregation cycle interval is 60 seconds, so it may take up to a minute before an event appears in an aggregated table.
- Default retention for live tables is 5000 entries, provided at least 3 aggregation cycles have occurred (older entries are deleted to make room for new entries).
- Default retention for hourly tables is 24 hours.
- Default retention for daily tables is 31 days.
- Default retention for monthly tables is 12 months.
Data is stored in a local, encrypted persistent store, which persists during an Agent upgrade, uninstall and re-installation, unless specifically deleted.
If the Agent is unable to write to storage (because the disk is full or there is another file-system problem), the write fails but the Agent continues monitoring, so that capture can resume if the situation improves later.
Historic data capture inventory schema
The following table shows the fields that exist only in the Live tables or only in the Aggregated (Hourly, Daily, Monthly) tables. All other fields exist in every table.
Historic data source | Fields that only exist in Live tables | Fields that only exist in Aggregated tables |
---|---|---|
DNS resolutions | n/a | LookupCount |
Process executions | CommandLine, ProcessId, ParentProcessId | ExecutionCount |
Software installations | IsUninstall | InstallCount, UninstallCount |
Outbound TCP connections | ProcessId | ConnectionCount |
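As an illustration of how a count field such as LookupCount relates to the live rows, the following sketch rolls hypothetical live DNS rows up into hourly summaries. The function name, the row layout as Python dictionaries and the sample FQDNs are assumptions made for the example; the Agent's actual aggregation code is not described in this document.

```python
from collections import Counter


def aggregate_hourly(live_rows):
    """Roll live DNS rows up into one summary row per (Fqdn, hour)."""
    counts = Counter()
    for row in live_rows:
        # Truncate the epoch timestamp to the start of its hour.
        hour_start = row["TS"] - row["TS"] % 3600
        counts[(row["Fqdn"], hour_start)] += 1
    return [
        {"Fqdn": fqdn, "TS": ts, "LookupCount": n}
        for (fqdn, ts), n in sorted(counts.items())
    ]


# Hypothetical live rows: three lookups of one name, one of another.
live = [
    {"Fqdn": "example.com", "TS": 1500756083},
    {"Fqdn": "example.com", "TS": 1500756100},
    {"Fqdn": "example.com", "TS": 1500757300},
    {"Fqdn": "intranet.local", "TS": 1500756083},
]

hourly = aggregate_hourly(live)
for row in hourly:
    print(row)
```

The first two example.com lookups fall in the same hour and collapse into a single row with a LookupCount of 2; the third falls in the next hour and gets its own row.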
Timestamps
The timestamp column (TS) in each table is stored in Unix Epoch format, defined as the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970. To convert it to a readable format use DATETIME(TS, "unixepoch").
Timestamps are truncated in the aggregated tables.
- Hourly - time is truncated to the hour - so an event that occurred at 2017-01-27 18:03:54 would be included in the summary for 2017-01-27 18:00:00
- Daily - time is truncated to midnight on that day - so an event that occurred at 2017-01-27 18:03:54 would be included in the summary for 2017-01-27 00:00:00
- Monthly - time is truncated to midnight on the first day of the month - so an event that occurred at 2017-01-27 18:03:54 would be included in the summary for 2017-01-01 00:00:00
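The truncation rules above can be checked with a short sketch. The helper names truncate_ts and readable are hypothetical, and the sketch assumes truncation is performed in UTC, which this document does not state explicitly.

```python
from datetime import datetime, timezone


def truncate_ts(ts: int, window: str) -> int:
    """Truncate a Unix epoch timestamp to the start of its hour, day or month (UTC assumed)."""
    moment = datetime.fromtimestamp(ts, tz=timezone.utc)
    if window == "hour":
        moment = moment.replace(minute=0, second=0, microsecond=0)
    elif window == "day":
        moment = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    elif window == "month":
        moment = moment.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    else:
        raise ValueError(f"unknown window: {window}")
    return int(moment.timestamp())


def readable(ts: int) -> str:
    """Format an epoch timestamp the way DATETIME(TS, "unixepoch") would."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


# The event used in the examples above: 2017-01-27 18:03:54 UTC.
event = int(datetime(2017, 1, 27, 18, 3, 54, tzinfo=timezone.utc).timestamp())
for window in ("hour", "day", "month"):
    print(window, readable(truncate_ts(event, window)))
# hour 2017-01-27 18:00:00
# day 2017-01-27 00:00:00
# month 2017-01-01 00:00:00
```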
DNS resolutions
- ETW: Windows 8.1 and above
- Polling: Windows 8 and below, MacOS
- Not supported: Linux, Solaris, Android
Field | Datatype | Sample value | Tables | Description |
---|---|---|---|---|
Fqdn | string | | All | The FQDN which is being resolved. |
LookupCount | integer | 1234 | Hourly, Daily, Monthly | Sum of resolutions per FQDN within the hour, day or month. |
TS | integer | 1500756083 | All | See Agent Historic Data Capture. |
The Agent attempts to capture DNS queries at the point that they are made. On non-Windows platforms, and on Windows versions earlier than 8.1, this is not presently possible, so the local DNS cache is queried through polling instead.
When the Agent captures DNS queries, it captures the query, not the result of that query. That is, the Agent will capture a request to resolve an FQDN which may ultimately not be resolvable.
When using ETW, the Agent will not perform an initial poll to establish the contents of the DNS cache. When polling, the Agent will capture all unique FQDNs available in the DNS cache; new entries that appear in the cache are deemed to correspond to resolutions.
Process executions
- ETW: Windows Vista and above
- Polling: Windows XP, MacOS
- Not supported: Linux, Solaris, Android
Field | Datatype | Sample value | Tables | Notes |
---|---|---|---|---|
CommandLine | string | "C:\Windows\system32\VmConnect.exe" "1EUKDEVWKS1231" "TCH-CLI-WXPX86" -G "B2C72520-BBC6-4736-BBBC-5CCF50FE6666" -C "0" | Live | The full command-line of the process, including (on Windows) the executable name. The executable name part of the command-line is sometimes quoted and sometimes not, depending on how the parent process launched the child, so you may see a mix of both forms. |
ExecutableHash | string | dae0bb0a7b2041115cfd9b27d73e0391 | All | The MD5 hash of the process executable. |
ExecutableName | string | vmconnect.exe | All | The filename (including extension) of the process executable. |
ExecutablePath | string | \device\harddiskvolume8\windows\system32\vmconnect.exe | All | The path and filename of the process executable. On Windows, this is the NT-device format version of the path (as a process does not necessarily need to have been launched from a device which has a drive-letter mapping). |
ExecutionCount | integer | 1234 | Hourly, Daily, Monthly | Sum of executions per executable within the hour, day or month. |
ParentExecutableName | string | mmc.exe | All | The filename (including extension) of the executable of the process which spawned this one. |
ParentProcessId | integer | 2088 | Live | The process ID of the process which spawned this one. |
ProcessId | integer | 178 | Live | Operating-system dependent process ID. |
TS | integer | 1500756083 | All | See Agent Historic Data Capture. |
UserName | string | 1E\bill.gates | All | The account name of the user who launched the process (or blank if it is a system-launched process). |
On Windows, the Agent runs as LOCAL SYSTEM, therefore details of almost every process will be available; however some processes may not be accessible because of permissions.
The Agent captures process starts; it does not track how long the process has been running, or how much CPU-time (or user/kernel/active time) the process has used.
Each time the Tachyon Agent starts it does an initial scan of processes before it starts capturing. To prevent double-counting a persistent storage setting called "Inventory.ProcessesLastScan" records the last time the Agent checked for processes. This corresponds to the last time the Agent polled, or if ETW is used it is the time when the Agent inventory module was last terminated.
Software installations
Field | Datatype | Sample value | Tables | Notes |
---|---|---|---|---|
Architecture | string | x64 | All | The platform architecture of the software |
InstallCount | integer | 1234 | Hourly, Daily, Monthly | Sum of installs per product within the hour, day, month. 0 if installed |
IsUninstall | integer | 0 | Live | 0 = install, 1 = uninstall |
Product | string | Google Chrome | All | The title of the software that was installed/uninstalled |
Publisher | string | Google Inc. | All | The publisher of the software that was installed/uninstalled |
TS | integer | 1500756083 | All | See Agent Historic Data Capture. The Agent assumes a "new" installation/uninstallation occurred at the point of polling. |
UninstallCount | integer | 1233 | Hourly, Daily, Monthly | Sum of uninstalls per product within the hour, day, month. |
Version | string | 55.0.2883.87 | All | The version of the software that was installed/uninstalled |
On all platforms, the Agent will poll (via a call to the Software module) the list of installed software, and will use deltas between polls to infer installs and uninstalls.
- The Agent stores in persistent storage (under the "Inventory.SoftwareInstallations" and "Inventory.SoftwareInstallationsLastScan" keys) a JSON representation of the results of the last scan of software, and the time that this scan occurred.
- If these keys are present, the Agent will, on start-up, attempt to identify installs/uninstalls which occurred while the Agent was not capturing data.
- For example, if Adobe Acrobat was present last time the Agent scanned, but is no longer present, it can infer that the program was uninstalled.
- Since the Agent has no way of knowing when this install/uninstall happened, it will mark the event as having occurred "now".
- This may be improved in the future for installs - the Agent can generally derive at least the date on which the install happened (but not the time on Windows).
- Unlike other data captures, the Agent also tracks the "presence" of software on the machine (not just whether it was installed or uninstalled).
- This is described in more detail in the Data Aggregation section.
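The delta approach described above can be sketched as follows. The function names, the JSON layout and the choice of (Product, Version, Architecture) as the identity of an installation are all assumptions made for illustration; the Agent's stored format is not documented here.

```python
import json


def scan_key(item: dict) -> tuple:
    """Identity of an installed product (an assumed choice of fields)."""
    return (item["Product"], item["Version"], item["Architecture"])


def infer_events(last_scan_json: str, current_scan: list, now: int) -> list:
    """Compare the stored scan with the current one and emit install/uninstall events."""
    previous = {scan_key(i): i for i in json.loads(last_scan_json)}
    current = {scan_key(i): i for i in current_scan}
    events = []
    # Present now but not before: treated as an install that happened "now".
    for key in sorted(current.keys() - previous.keys()):
        events.append({**current[key], "IsUninstall": 0, "TS": now})
    # Present before but not now: treated as an uninstall that happened "now".
    for key in sorted(previous.keys() - current.keys()):
        events.append({**previous[key], "IsUninstall": 1, "TS": now})
    return events


# Hypothetical contents of the last stored scan and of the current scan.
last_scan = json.dumps([
    {"Product": "Adobe Acrobat", "Version": "11.0", "Architecture": "x86"},
    {"Product": "Notepad++", "Version": "7.5", "Architecture": "x64"},
])
current_scan = [
    {"Product": "Notepad++", "Version": "7.5", "Architecture": "x64"},
    {"Product": "Google Chrome", "Version": "55.0.2883.87", "Architecture": "x64"},
]

events = infer_events(last_scan, current_scan, now=1500756083)
for event in events:
    print(event["Product"], "uninstall" if event["IsUninstall"] else "install")
```

With this choice of key, a version upgrade shows up as an uninstall of the old version plus an install of the new one. Whether the Agent behaves the same way is not stated in this document.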
The following fields are captured:
Windows
- Software installations are read from the registry from HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall and HKLM\SOFTWARE\Wow6432Node\Microsoft\Windows\CurrentVersion\Uninstall
- Per-user installations are not yet supported
Linux
- The mechanism is like Windows in that it uses polling and "Inventory.SoftwareInstallations" and "Inventory.SoftwareInstallationsLastScan" in persistent storage. However, there are two variants of Linux packages: RPM and Debian-style, the latter also being used for Ubuntu. The data is accessed, as it is for all operating systems, using the Software module's installation enumerator.
- Polls are run every 120 seconds by default.
- For RPM-based Linuxes, we enumerate through the RPM DB using the RPM API, getting the package name, version, release, vendor, and installation time.
- For Debian-style packages, we read through the text file /var/lib/dpkg/status. Only packages that have a Status of "installed" are recorded. There is no recorded package installation time, so that is taken from the modification time of the corresponding /var/lib/dpkg/info/package_name.list file.
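The Debian-style approach can be sketched as below. This is an illustration only, not the Agent's implementation: the function names and the sample text are hypothetical, and the parser handles just the fields it needs. In a real status file the Status field holds three words, for example "install ok installed", and the last word is the package state.

```python
import os

# Hypothetical excerpt in the format of /var/lib/dpkg/status.
SAMPLE_STATUS = """\
Package: curl
Status: install ok installed
Version: 7.58.0-2ubuntu3

Package: oldtool
Status: deinstall ok config-files
Version: 1.0-1
"""


def installed_packages(status_text: str) -> list:
    """Return name and version of every package whose state is 'installed'."""
    packages = []
    for paragraph in status_text.strip().split("\n\n"):
        fields = {}
        for line in paragraph.splitlines():
            if line[:1] in (" ", "\t"):
                continue  # continuation line of a multi-line field
            name, sep, value = line.partition(":")
            if sep:
                fields[name.strip()] = value.strip()
        state = fields.get("Status", "").split()
        if state and state[-1] == "installed":
            packages.append({
                "Product": fields.get("Package", ""),
                "Version": fields.get("Version", ""),
            })
    return packages


def install_time(package: str, info_dir: str = "/var/lib/dpkg/info"):
    """Approximate install time from the .list file's modification time, or None if missing."""
    try:
        return int(os.path.getmtime(os.path.join(info_dir, f"{package}.list")))
    except OSError:
        return None


found = installed_packages(SAMPLE_STATUS)
print(found)
```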
Mac
- The mechanism is like Windows in that it uses polling and "Inventory.SoftwareInstallations" and "Inventory.SoftwareInstallationsLastScan" in persistent storage. The data is accessed, as it is for all operating systems, using the Software module's installation enumerator.
- Polls are run every 120 seconds by default.
- The Mac Agent enumerates through installed packages using the pkgutil utility, getting the package name, version, release, vendor, and installation time.
- The publisher is determined by reversing Product names to produce a URL. So a product com.apple.pkg.CoreADI will produce a Publisher name of apple.com and similarly a product of uk.co.bewhere.chrome.video.osx produces a Publisher of bewhere.co.uk.
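One way to reproduce the two publisher examples above is sketched below. The rule for deciding how many leading labels to reverse (three when the name starts with a two-letter country code followed by a common second-level label, otherwise two) is an assumption chosen to fit those examples; the Agent's actual rule is not documented here, and the function name and label set are hypothetical.

```python
# Second-level labels commonly used under country-code domains (an assumed, partial list).
SECOND_LEVEL = {"co", "com", "org", "net", "ac", "gov"}


def publisher_from_product(product: str) -> str:
    """Derive a publisher domain from a reverse-DNS style product name."""
    labels = product.split(".")
    if len(labels) < 2:
        return product  # nothing to reverse
    # Names such as uk.co.bewhere... need three labels to form the domain.
    uses_country_code = (
        len(labels) >= 3 and len(labels[0]) == 2 and labels[1] in SECOND_LEVEL
    )
    take = 3 if uses_country_code else 2
    return ".".join(reversed(labels[:take]))


print(publisher_from_product("com.apple.pkg.CoreADI"))           # apple.com
print(publisher_from_product("uk.co.bewhere.chrome.video.osx"))  # bewhere.co.uk
```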
Solaris
- The infrastructure is similar to the implementation for Linux (and hence Windows), but works by looking for all files matching the pattern "/var/pkg/publisher/*/pkg/*/*". Each file path itself gives the publisher, package name and version number. The last modification time of such a file is used as the package installation time.
- At the time of writing there is a bug (66121) whereby packages that are "known" but not actually installed are treated as if they were installed.
Outbound TCP connections
Field | Datatype | Sample value | Tables | Notes |
---|---|---|---|---|
IpAddress | string | 132.245.77.18 [2001:4860:4860::8888] | All | The target remote IP address of the connection, either an IPv4 or IPv6 address. Windows support for IPv6 is limited; the Agent will capture the connections, but the format used to represent the target IPv6 address may differ slightly depending on the mechanism used, and may be subject to change in future versions of the Windows Agent. |
Port | integer | 443 | All | The target remote port of the connection. |
ProcessId | integer | 11828 | Live | The operating-system specific identifier of the process which instigated the connection. Not supported for Mac OSX earlier than Mac OSX Lion (10.7). |
ProcessName | string | chrome.exe | All | The executable filename of the process which instigated the connection. Connections originated from system-oriented processes are captured as "(system)". |
The Agent captures TCP connections, not UDP connections - as UDP is inherently connectionless (each packet sent is effectively a new connection).
Each time the Tachyon Agent starts it does an initial scan of connections before it starts capturing. A limitation of the Windows API means that all established TCP connections, whether inbound or outbound, are captured; there is no way to distinguish between the two. This also means that the Agent can double-capture a connection if that connection was established before the Agent stopped monitoring and still exists when the Agent starts monitoring again, for example across an Agent restart. Unlike other capture sources, there is no persistent storage setting to prevent double-counting.
The Agent captures initial "connect" requests, not just successful connection establishment. This means that an attempt to perform a connection will be captured, even if that connection does not complete, for example, because of a timeout, or the server-side does not permit the connection.