The Sumo query language can be a source of joy and pain at times. Achieving mastery is no easy path and all who set out on this path may suffer greatly until they see the light. The Log Operators Cheat Sheet is a valuable resource to learn syntax and semantics of the individual operators, but the bigger questions become “how can we tie them together” and “how can we write query language that matters?”

Fortunately, there are people who have walked this path before - and succeeded. In this post, we consolidate their queries (we’ve analyzed a 615 MB sample of query strings for this) into a set of query design patterns, so that everyone can learn from their wisdom and become a Sumo Master.

Anatomy of a Query

A Sumo query decomposes into a SearchExpression and a TargetExpression. The SearchExpression identifies the relevant log lines through search keywords, and the TargetExpression specifies how to slice and dice the data to provide meaning. A TargetExpression can have one or more clauses. Clauses combine operators and their arguments.

Typically, a SearchExpression contains search keywords like ‘error’ that select a relevant set of log lines from the log stream. The TargetExpression is then used to slice and dice the data to extract insights such as an error code. A TargetExpression is specified by a sequence of clauses. Clauses are separated by the pipe ‘|’ symbol. Each clause contains one operator that specifies its function and some arguments that are specific to the log lines under consideration. Finally, a TargetExpression may or may not end with an aggregation operator such as ‘count’ to produce a condensed view.

There are myriad sequences of operators to retrieve information from the logs. It can be challenging to find an effective sequence. There are, however, some canonical operator sequences that are used more often than others. To find these, we’ve sampled a representative set of user queries, derived the operator sequences and counted their occurrence. This led us to the five design patterns that we’ll outline below.

This table shows the probability of different operators to occur. The first table shows the occurrence probability of individual operators. The second and the third table show the occurrence of operator tuples and triples, respectively. In the top 10 operator list `parse` takes the lead, followed by `where` and `count`. These three already form a powerful trifecta. Learn all 10 operators and you are ready for most of the action. Moreover, we use this table to distill some common patterns.

The Parse Where it Counts pattern leverages the parse-where-count power triple. Variants of this are the most frequent queries that we process. A search expression might outline some kind of data. Parsing filters and extracts specific values such as numbers, status codes, or timestamps from the search result. The `where` clause enforces some condition, such as a threshold, or matches a specific string value. Finally, counting aggregates the remaining data. It can target a single number or a histogram if used with a `by` parameter.

The most common operator is the parse operator. This operator is used individually or in succession. We often see a parse operator followed by another parse operator and another one. Each parse clause may define another constraint to reduce the search result and/or parse out variables that are used in a downstream clause.

Example 2

Counting over time never gets old. This is the go-to pattern for creating a time-series from logs. It implements a bucket sort where the buckets are regular intervals and get filled with log lines. On return, the count operator yields the number of logs per bucket and we end up with a list of tuples. This time-series can easily be visualized to reveal trends or unusual behavior.

When handling JSON logs, the json parse command is a valuable ally. It makes the fields of the JSON structure accessible for downstream operators. If the log message is valid JSON, the json operator can be used directly.
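To make the Parse Where it Counts and counting-over-time patterns a bit more concrete, here is a rough Python emulation of what such pipelines compute. It is only a sketch and not how Sumo evaluates queries: the log format, the sample lines, and the names `LOGS`, `parse_where_count` and `count_over_time` are all invented for illustration.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Hypothetical sample log lines; this format is an assumption for the sketch.
LOGS = [
    "2024-01-01T10:00:05 error status=500 path=/login",
    "2024-01-01T10:00:40 error status=404 path=/home",
    "2024-01-01T10:01:10 error status=503 path=/login",
    "2024-01-01T10:01:50 info status=200 path=/home",
    "2024-01-01T10:02:20 error status=500 path=/cart",
]

STATUS_RE = re.compile(r"status=(\d+)")


def parse_where_count(lines, keyword="error", threshold=500):
    """Search by keyword, parse the status, filter on it, then count by status."""
    counts = Counter()
    for line in lines:
        if keyword not in line:          # search: keep only matching lines
            continue
        match = STATUS_RE.search(line)   # parse: extract a value from the line
        if match is None:                # lines that do not match drop out
            continue
        status = int(match.group(1))
        if status >= threshold:          # where: enforce a threshold
            counts[status] += 1          # count by status
    return dict(counts)


def count_over_time(lines, bucket_seconds=60):
    """Put each line into a fixed-width time bucket and count lines per bucket."""
    counts = Counter()
    for line in lines:
        stamp = datetime.fromisoformat(line.split(" ", 1)[0])
        # Treat the timestamps as UTC so bucket boundaries are deterministic.
        epoch = int(stamp.replace(tzinfo=timezone.utc).timestamp())
        counts[epoch - epoch % bucket_seconds] += 1
    # Return a time-ordered list of (bucket start, count) tuples.
    return [
        (
            datetime.fromtimestamp(start, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%S"),
            counts[start],
        )
        for start in sorted(counts)
    ]


if __name__ == "__main__":
    print(parse_where_count(LOGS))
    print(count_over_time(LOGS))
```

The first function mirrors the search, parse, where, count sequence one clause at a time. The second returns the list of (bucket, count) tuples that a time-series chart would be drawn from.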