There are several ways to group events. The most common approach uses either the transaction or stats command. But when should you use transaction and when should you use stats?
The rule of thumb: If you can use stats, use stats. It’s faster than transaction, especially in a distributed environment. With that speed, however, comes some limitations. You can only group events with stats if they have at least one common field value and if you require no other constraints. Typically, the raw event text is discarded.
Like stats, the transaction command can group events based on common field values, but it can also use more complex constraints such as total time span of the transaction, delays between events within the transaction, and required beginning and ending events. Unlike stats, transaction retains the raw event text and field values from the original events, but it does not compute any statistics over the grouped events, other than the duration (the delta of the _time field between oldest and newest events in the transaction) and the eventcount (the total number of events in the transaction).
The transaction command is most useful in two specific cases:
- When unique field values (also known as identifiers) are not sufficient to discriminate between discrete transactions.
- When it is desirable to see the raw text of the events rather than an analysis on the constituent fields of the events.
Again, when neither of these cases is applicable, it is a better practice to use stats, as search performance for stats is generally better than transaction. Often there is a unique identifier, and stats can be used.
For example, to compute statistics on the duration of trades identified by the unique identifier trade_id, the following searches yield the same answer:
… | transaction trade_id | chart count by duration … | stats range(_time) as duration by trade_id | chart count by duration The second search is more efficient.
However, if trade_id values are reused but the last event of each trade is indicated by the text “END”, the only viable solution is:
… | transaction trade_id endswith=END | chart count by duration
If, instead of an end condition, trade_id values are not reused within 10 minutes, the most viable solution is:
… | transaction trade_id maxpause=10m | chart count by duration
Finally, a brief word about performance. No matter what search commands you use, it’s imperative for performance that you make the base search as specific as possible. Consider this search:
sourcetype=x | transaction field=ip maxpause=15s | search ip=188.8.131.52
Here we are retrieving all events of sourcetype=x, building up transactions, and then throwing away any that don’t have an ip=184.108.40.206. If all your events have the same ip value, this search should be:
sourcetype=x ip=220.127.116.11 | transaction field=ip maxpause=15s
This search retrieves only the events it needs to and is much more efficient.
7.2.1 Unifying Field Names
You need to build transactions from multiple data sources that use different field names for the same identifier.
Typically, you can join transactions with common fields like:
… | transaction username
But when the username identifier is called different names (login, name, user, owner, and so on) in different data sources, you need to normalize the field names.
If sourcetype A only contains field_A and sourcetype B only contains field_B, create a new field called field_Z which is either field_A or field_B, depending on which is present in an event. You can then build the transaction based on the value of field_Z.
sourcetype=A OR sourcetype=B | eval field_Z = coalesce(field_A, field_B) | transaction field_Z
7.2.2 Finding Incomplete Transactions
You need to report on incomplete transactions, such as users who have logged in but not logged out.
Suppose you are searching for user sessions starting with a login and ending with a logout:
… | transaction userid startswith=”login” endswith=”logout”
You would like to build a report that shows incomplete transactions—users who have logged in but not logged out. How can you achieve this?
The transaction command creates an internal boolean field named closed_txn to indicate if a given transaction is complete or not. Normally incomplete transactions are not returned, but you can ask for these “evicted” partial transactions by specifying the parameter keepevicted=true. Evicted transactions are sets of events that do not match all the transaction parameters.
For example, the time requirements are not met in an evicted transaction. Transactions that fulfill all the requirements are marked as complete by having the field closed_txn set to 1 (rather than 0 for incomplete transactions). So the pattern for finding incomplete transactions would generally be:
… | transaction <conditions> keepevicted=true | search closed_txn=0
In our case, however, there’s a wrinkle. An endswith condition not matching will not set the closed_txn=0 because events are processed from newest to oldest. Technically, the endswith condition starts the transaction, in terms of processing. To get around this, we need to filter transactions based on the closed_txn field, as well as make sure that our transactions don’t have both a login and a logout:
… | transaction userid startswith=”login” endswith=”logout” keepevicted=true | search closed_txn=0 NOT (login logout)
7.2.3 Calculating Times within Transactions
You need to find the duration times between events in a transaction.
The basic approach is to use the eval command to mark the points in time needed to measure the different durations, and then calculate the durations between these points using eval after the transaction command. but you have the full range of eval functions available to you for more complex situations.
…| eval p1start = if(searchmatch(“phase1”), _time, null()) | eval p2start = if(searchmatch(“phase2”), _time, null()) Next we make the actual transactions: … | transaction id startswith=”start of event” endswith=“end of event”
Finally we calculate the duration for each transaction, using the values calculated above.
…| eval p1_duration = p2start - p1start | eval p2_duration = (_time + duration) - p2start
In this example, we calculated the time of the last event by added _time (the time of the first event) and adding duration to it. Once we knew the last event’s time, we calculated p2_duration as the difference between the last event and the start of phase2.
7.2.4 Finding the Latest Events
You need to find the latest event for each unique field value. For example, when was the last time each user logged in?
At first, you might be tempted to use the transaction or stats command. For example, this search returns, for each unique userid, the first value seen for each field:
… | stats first(*) by userid
Note that this search returns the first value of each field seen for events that have the same userid. It provides a union of all events that have that user ID, which is not what we want. What we want is the first event with a unique userid. The proper way to do that is with the dedup command:
… | dedup userid
7.2.5 Finding Repeated Events
You want to group all events with repeated occurrences of a value in order to remove noise from reports and alerts.
Suppose you have events as follows:
2012-07-22 11:45:23 code=239
2012-07-22 11:45:25 code=773
2012-07-22 11:45:26 code=-1
2012-07-22 11:45:27 code=-1
2012-07-22 11:45:28 code=-1
2012-07-22 11:45:29 code=292
2012-07-22 11:45:30 code=292
2012-07-22 11:45:32 code=-1
2012-07-22 11:45:33 code=444
2012-07-22 11:45:35 code=-1
2012-07-22 11:45:36 code=-1
Your goal is to get 7 events, one for each of the code values in a row: 239, 773, -1, 292, -1, 444, -1. You might be tempted to use the transaction command as follows:
… | transaction code
Using transaction here is a case of applying the wrong tool for the job. As long as we don’t really care about the number of repeated runs of duplicates, the more straightforward approach is to use dedup, which removes duplicates. By default, dedup will remove all duplicate events (where an event is a duplicate if it has the same values for the specified fields). But that’s not what we want; we want to remove duplicates that appear in a cluster. To do this, dedup has a consecutive=true option that tells it to remove only duplicates that are consecutive.
… | dedup code consecutive=true
7.2.6 Time Between Transactions
You want to determine the time between transactions, such as how long it’s been between user visits to your website.
Suppose we have a basic transaction search that groups all events by a given user (clientip-cookie pair), but splits the transactions when the user is inactive for 10 minutes:
… | transaction clientip, cookie maxpause=10m
Ultimately, our goal is to calculate, for each clientip-cookie pair, the difference in time between the end time of a transaction and the start time of a more recent (i.e. ‘previous’ in order of events returned) transaction.
That time difference is the gap between transactions. For example, suppose we had two pseudo transactions, returned from most recent to oldest:
T1: start=10:30 end=10:40 clientip=a cookie=x
T2: start=10:10 end=10:20 clientip=a cookie=x
The gap in time between these two transactions is the difference between the start time of T1 (10:30) and the end time of T2 (10:20), or 10 minutes. The rest of this recipe explains how to calculate these values.
First, we need to calculate the end time of each transaction, keeping in mind that the timestamp of a transaction is the time that the first event occurred and the duration is the number of seconds that elapsed between the first and last event in the transaction:
… | eval end_time = _time + duration
Next we need to add the start time from the previous (i.e., more recent) transaction to each transaction. That will allow us to calculate the difference between the start time of the previous transaction and our calculated end_time.
To do this we can use streamstats to calculate the last value of the start time (_time) seen in a sliding window of just one transaction— global=false and window=1—and to ignore the current event in that sliding window—current=false. In effect, we’re instructing streamstats to look only at the previous event’s value. Finally, note that we’re specifying that this window is only applicable to the given user (clientip-cookie pair):
… | streamstats first(_time) as prev_starttime
global=false window=1 current=false by clientip, cookie
At this point, the relevant fields might look something like this:
T1: _time=10:00:06, duration=4, end_time=10:00:10 T2: _time=10:00:01, duration=2, end_time=10:00:03 prev_starttime=10:00:06 T3: _time=10:00:00, duration=0, end_time=10:00:01 prev_starttime=10:00:01
Now, we can finally calculate the difference in time between the previous transaction’s start time (prev_starttime) and the calculated end_time. That difference is the gap between transactions, the amount of time (in seconds) passed between two consecutive transactions from the same user (clientip-cookie pair).
… | eval gap_time = prev_starttime – end_time
A detailed understanding of Splunk is available in this blog post for your perusal!