Everyone likes a pretty graph
Presenting data in a tabular way is all well and good, but as the volume of data grows, it gets easier to lose the big picture. Aggregating data helps to present it more concisely, but to let the user get a good picture of the situation in just a few seconds, you need a graph.
Visualizations are Tachyon’s way of letting you present data in graphical form. In a way, they are a fancy way of saying “put this in a chart for me”.
Tachyon supports several chart types: Pie, SmartBar, Line, Bar, Column, Area and StackedArea.
Pie and Area are designed as “2d” or “single series” charts, while the rest are designed to be “3d” or “multi series” charts. Some of the latter (like Bar and Column) can be used as 2d, but they don’t look quite right: the data is correct but the colours are wrong.
But the charts themselves are not enough. You need a way to process the responses so that they can be displayed in one of the charts. The processing is done by something we call a response processor. Tachyon has several processors built-in, like date-time single series, date-time multi series, default single series and default multi series. You can also provide your own code to do the processing but that is beyond the scope of this article.
The examples below include the raw configuration JSON that each graph uses, so let’s quickly go over the fields in that JSON to give you an idea of what they mean.
Each chart should be uniquely identified by its Id field. It is up to you what that unique value is. You can put a Title on the chart and choose from one of several chart types available.
X defines which column from the schema to use as the X axis, and Y does the same for the value. The Z axis is available on the 3d charts and lets you pick the column over which to aggregate the data. PostProcessor defines which processing function to use. Multiple charts can use the same function and essentially plot the same data differently if you so desire, or each chart can use its own processing function. The parameters of each function can be found in the documentation and are outside the scope of this article.
Row defines which row the chart should be in. This value becomes important if you have more than one chart and helps you define if the charts should be side by side, in which case they would have the same row number, or one over another, where they would have different row numbers.
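To make the fields a bit more concrete, here is a minimal Python sketch that mirrors them in memory and works out the layout from the Row values. It is not Tachyon code: the `ChartConfig` class, the `validate` and `layout` helpers and the processor names are all made up for illustration, and treating a Z value on a single series chart as a problem is my assumption based on the description above.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

# Chart types named in this article; the single/multi split follows the text.
SINGLE_SERIES_TYPES = {"Pie", "Area"}
MULTI_SERIES_TYPES = {"SmartBar", "Line", "Bar", "Column", "StackedArea"}


@dataclass
class ChartConfig:
    """Hypothetical in-memory mirror of one chart entry (not a Tachyon type)."""
    id: str
    title: str
    type: str
    x: str
    y: str
    post_processor: str
    row: int
    z: Optional[str] = None  # only meaningful for multi series charts


def validate(charts):
    """Return a list of readable problems; an empty list means nothing was found."""
    problems = []
    seen = set()
    for chart in charts:
        if chart.id in seen:
            problems.append(f"duplicate Id: {chart.id}")
        seen.add(chart.id)
        if chart.type not in SINGLE_SERIES_TYPES | MULTI_SERIES_TYPES:
            problems.append(f"{chart.id}: unknown chart type {chart.type}")
        # Assumption: Z on a single series chart is treated as a mistake here.
        if chart.z is not None and chart.type in SINGLE_SERIES_TYPES:
            problems.append(f"{chart.id}: Z set on a single series chart")
    return problems


def layout(charts):
    """Group chart ids by Row: same row means side by side, rows stack downwards."""
    rows = defaultdict(list)
    for chart in charts:
        rows[chart.row].append(chart.id)
    return [rows[r] for r in sorted(rows)]


charts = [
    ChartConfig("dns-line", "Resolutions over time", "Line",
                "Timestamp", "Count", "hypotheticalProcessorA", 1, z="Fqdn"),
    ChartConfig("services-pie", "Disabled services", "Pie",
                "Caption", "Count", "hypotheticalProcessorB", 1),
    ChartConfig("software-column", "Installed software", "Column",
                "Product", "Count", "hypotheticalProcessorC", 2, z="Publisher"),
]
print(validate(charts))
print(layout(charts))
```

The first two charts share row 1, so they would sit side by side, while the third would sit underneath them on row 2.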
To aggregate or not to aggregate.
Aggregated data is readily available, relatively small in volume and quick to process, even if the processor has to do extra aggregations. Dealing with raw data would mean the processor potentially has to process hundreds of thousands of rows to produce the data for the graph. That would put an undesired strain on the server and should be avoided at all costs.
So, to all intents and purposes, graphs deal with aggregated data. If you want to use a visualization in one of your instructions, that instruction must have an aggregation schema, and you should use columns from that schema as your X, Y and Z axes.
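To illustrate why the aggregated form is so much cheaper to chart, here is a small Python sketch that collapses raw per-device rows into one row per distinct value. The data and the `aggregate_counts` helper are invented for the example, and this is not how Tachyon performs its aggregation. It only shows the shape of the result a chart would consume.

```python
from collections import Counter


def aggregate_counts(raw_rows, key):
    """Collapse raw rows into one {key, Count} row per distinct key value."""
    counts = Counter(row[key] for row in raw_rows)
    return [{key: value, "Count": count} for value, count in sorted(counts.items())]


# Made-up raw rows: one per device and service.
raw_rows = [
    {"Device": "pc-01", "Caption": "Print Spooler"},
    {"Device": "pc-02", "Caption": "Print Spooler"},
    {"Device": "pc-02", "Caption": "Remote Registry"},
    {"Device": "pc-03", "Caption": "Print Spooler"},
]

for row in aggregate_counts(raw_rows, "Caption"):
    print(row)
```

However many devices respond, the aggregated result has only one row per distinct Caption, which is what keeps the work left for the chart processor small.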
2d vs 3d
In short, a 2d chart displays a single series of name-value elements, while a 3d chart groups entries and then displays a series of name-value elements for each grouping.
This does sound a bit convoluted but in a moment, I’ll show you some examples and it will become much clearer.
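Before we get to the charts, the difference can be sketched in a few lines of Python. The `single_series` and `multi_series` helpers and the sample rows are hypothetical and only illustrate the two data shapes. They are not Tachyon's built-in processors.

```python
from collections import defaultdict


def single_series(rows, x, y):
    """2d: one series, a list of (name, value) pairs."""
    return [(row[x], row[y]) for row in rows]


def multi_series(rows, x, y, z):
    """3d: one series per distinct z value, each a list of (name, value) pairs."""
    series = defaultdict(list)
    for row in rows:
        series[row[z]].append((row[x], row[y]))
    return dict(series)


# Made-up aggregated rows.
rows = [
    {"Group": "north", "Name": "Mon", "Value": 3},
    {"Group": "north", "Name": "Tue", "Value": 5},
    {"Group": "south", "Name": "Mon", "Value": 2},
]

print(single_series(rows, "Name", "Value"))
print(multi_series(rows, "Name", "Value", "Group"))
```

The 2d result is a flat list of pairs, while the 3d result has one such list per group, which is where the separate coloured series in the multi series charts come from.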
Single series (2d) charts
Let’s look at a few single series charts. First, I’ll show you a time-based chart that uses the date-time processor.
As an example, we’ll use an instruction that looks at DNS resolutions over time. This instruction has a simple aggregated schema with two columns: TimeStamp and Count.
The graph above has the following configuration:
Moving away from the time-based data, we have everyone’s favourite – the pie chart.
The instruction we’re looking at shows the count of disabled services and has two columns: Caption and Count.
And here’s the configuration:
Multi series (3d) charts
Now let’s look at a multi series chart that plots data over time.
Here we have an instruction that returns aggregated data with the following schema:
The aggregation is performed over the Fqdn and Timestamp fields. Based on that data, we’ll plot two graphs.
The first one shows domain name resolutions over time.
This was achieved by using the following template configuration:
The same data is then used to plot total resolutions per day:
Using this configuration:
And the processor definition that serves both looks like this:
As you can see, we’re grouping over the Fqdn column, which gives us our series. Then, within each grouping, we have a series consisting of a value for each date.
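The grouping just described can be approximated in Python as follows. This is a rough sketch and not the built-in date-time multi series processor: the helper names are mine, and I’m assuming a Count column and ISO-formatted Timestamp strings, neither of which is confirmed above.

```python
from collections import defaultdict
from datetime import datetime


def series_per_fqdn_by_day(rows):
    """Group rows by Fqdn, then sum Count per calendar day within each group."""
    series = defaultdict(lambda: defaultdict(int))
    for row in rows:
        # Assumption: Timestamp is an ISO 8601 string such as 2019-03-01T09:00:00.
        day = datetime.fromisoformat(row["Timestamp"]).date().isoformat()
        series[row["Fqdn"]][day] += row["Count"]
    return {fqdn: dict(sorted(days.items())) for fqdn, days in series.items()}


def totals_per_day(series):
    """Sum the per-Fqdn series into a single total for each day."""
    totals = defaultdict(int)
    for days in series.values():
        for day, count in days.items():
            totals[day] += count
    return dict(sorted(totals.items()))


# Made-up aggregated rows.
rows = [
    {"Fqdn": "example.com", "Timestamp": "2019-03-01T09:00:00", "Count": 3},
    {"Fqdn": "example.com", "Timestamp": "2019-03-01T17:30:00", "Count": 2},
    {"Fqdn": "example.com", "Timestamp": "2019-03-02T08:00:00", "Count": 1},
    {"Fqdn": "example.org", "Timestamp": "2019-03-01T10:00:00", "Count": 4},
]

series = series_per_fqdn_by_day(rows)
print(series)
print(totals_per_day(series))
```

Each Fqdn becomes its own series of date-value pairs, and summing those series per day gives the kind of data the second graph needs.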
The entire response visualization configuration looks like this:
But of course, we can plot a multi series chart that doesn’t necessarily use time on the X axis. We can have any value there.
As an example we’ll use an instruction that returns all software installed on the system. This instruction is aggregated by publisher, product and the product’s version and gives a simple count.
The chart above shows us the 5 most common apps from the 5 most common publishers. To achieve that, we grouped over Publisher and used Product as our X axis, with the count being the value. It’s worth noting that this includes all versions of a given product, since breaking the counts down by version would require another level in the chart.
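If you want to reason about that “top 5 of top 5” selection outside Tachyon, it can be sketched in Python like this. The `top_products_by_publisher` helper is hypothetical, the column names Publisher, Product, Version and Count are assumed from the description of the instruction, and ranking publishers by their total count is my assumption about what “most common” means.

```python
from collections import Counter, defaultdict


def top_products_by_publisher(rows, top_n=5):
    """Pick the top_n publishers by total count, then the top_n products of each."""
    per_publisher = defaultdict(Counter)
    for row in rows:
        # Version is ignored on purpose, so all versions of a product are summed.
        per_publisher[row["Publisher"]][row["Product"]] += row["Count"]
    publisher_totals = Counter(
        {publisher: sum(products.values())
         for publisher, products in per_publisher.items()}
    )
    return {
        publisher: per_publisher[publisher].most_common(top_n)
        for publisher, _ in publisher_totals.most_common(top_n)
    }


# Made-up aggregated rows.
rows = [
    {"Publisher": "Contoso", "Product": "Editor", "Version": "1.0", "Count": 3},
    {"Publisher": "Contoso", "Product": "Editor", "Version": "2.0", "Count": 2},
    {"Publisher": "Contoso", "Product": "Viewer", "Version": "1.0", "Count": 1},
    {"Publisher": "Fabrikam", "Product": "Sync", "Version": "4.1", "Count": 4},
    {"Publisher": "Northwind", "Product": "Agent", "Version": "0.9", "Count": 1},
]

print(top_products_by_publisher(rows, top_n=2))
```

Each publisher in the result becomes one series, and each (product, count) pair becomes one bar within it.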
Here’s the configuration for the chart above:
Just to wrap things up, here’s an example that uses multiple graphs arranged over two rows.
Each graph has its own processor to go over data that is aggregated on the Agents column and has four other columns, each representing the sum of all occurrences of a specific state (Success, Error, NotImplemented, PayloadTooLarge):
And here’s the full configuration JSON:
And the charts themselves: