In our earlier post, Making Threat Graph Extensible: Leveraging a DSL to Improve Data Ingestion (Part 1 of 2), we explored how and why CrowdStrike leverages HCL as a domain-specific language (DSL) in creating CrowdStrike Threat Graph®, our purpose-built graph database. We also reviewed our DSL specification and how it is converted to an intermediate representation for further processing.
In Part 2, we discuss how Go code is generated from the intermediate representation.
Code Generation
Code generation is the final step of the DSL implementation. Most of the validations (such as the variable and constant types defined and used in userlib functions, and the properties in vertices and edges) are checked during the intermediate representation conversion. We can therefore proceed to converting HCL and HIL into Go code by parsing the HCL and HIL AST (abstract syntax tree). Here we review a few examples of HCL and HIL AST conversion to Go code. The following HCL block ...
var "processID" {
	key = field.ProcessId
	onError = action.Error
	action = "${len(processID) == 0 || processID == badProcessID ? break : continue}"
}
… will generate the Go code below, given that badProcessID is a constant defined before this var block:
processID, err := event.GetField("ProcessId")
if err != nil {
	return err
}
if len(processID) == 0 || processID == badProcessID {
	return nil
}
The first step in generating the above code is processing the key and onError fields, as well as the first conditional expression. This is a fairly straightforward process and does not require an AST traversal.
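As a rough illustration of this first step, the sketch below turns the key and onError attributes of a var block into the field-extraction snippet shown above. The VarBlock struct and the GenerateExtraction function are hypothetical names introduced for this example, and only action.Error is handled because it is the only action that appears in this post; the real generator is not shown here.

```go
package main

import (
	"fmt"
)

// VarBlock is a hypothetical, already-validated view of a `var` block.
type VarBlock struct {
	Name    string // variable name, e.g. "processID"
	Key     string // event field name, e.g. "ProcessId"
	OnError string // error action, e.g. "Error" for action.Error
}

// GenerateExtraction emits Go source that reads the field from the event
// and applies the onError action. No AST traversal is needed for this part.
func GenerateExtraction(v VarBlock) (string, error) {
	if v.Name == "" || v.Key == "" {
		return "", fmt.Errorf("var block needs a name and a key")
	}
	var onErr string
	switch v.OnError {
	case "Error":
		onErr = "return err"
	default:
		// Other actions are not described in the post, so they are rejected.
		return "", fmt.Errorf("unsupported onError action %q", v.OnError)
	}
	return fmt.Sprintf("%s, err := event.GetField(%q)\nif err != nil {\n\t%s\n}\n",
		v.Name, v.Key, onErr), nil
}

func main() {
	src, err := GenerateExtraction(VarBlock{Name: "processID", Key: "ProcessId", OnError: "Error"})
	if err != nil {
		panic(err)
	}
	fmt.Print(src)
}
```

Because the inputs are plain strings that were validated earlier, simple string formatting is enough here; the conditional expression is the part that needs a tree walk.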
The second step is parsing the HIL AST and understanding the conditional expression; it will ensure we return an error if the field fails extraction from the event. The code for that process is below:
import (
	"fmt"
	"strings"

	"github.com/hashicorp/hcl/v2"
	"github.com/hashicorp/hcl/v2/hclsyntax"
)

// UnwrapTernaryOp extracts the condition and both branches from a
// "${cond ? a : b}" template expression.
func UnwrapTernaryOp(expr hcl.Expression) (*TernaryOperator, error) {
	op := &TernaryOperator{}
	switch e := expr.(type) {
	case *hclsyntax.TemplateWrapExpr:
		wrapped := e.Wrapped
		switch conditional := wrapped.(type) {
		case *hclsyntax.ConditionalExpr:
			condition := conditional.Condition
			trueStmt := conditional.TrueResult
			falseStmt := conditional.FalseResult
			goCond, err := ToGoCondition(condition)
			if err != nil {
				return nil, err
			}
			// ToGoCondition returns the condition as Go source text
			op.GoCondition = goCond
			op.TrueStmt = trueStmt.(*hclsyntax.ScopeTraversalExpr).Traversal.RootName()
			op.FalseStmt = falseStmt.(*hclsyntax.ScopeTraversalExpr).Traversal.RootName()
		default:
			return nil, fmt.Errorf("unknown expression in ternary op - %T", conditional)
		}
	}
	return op, nil
}
// ToGoCondition recursively converts an HCL expression into Go source text.
func ToGoCondition(condition hclsyntax.Expression) (string, error) {
	stmt := ""
	switch s := condition.(type) {
	case *hclsyntax.LiteralValueExpr:
		var err error
		// cty is a dynamic type library used by HCL
		stmt, _, err = ctyValueToGo(s.Val)
		if err != nil {
			return "", err
		}
	case *hclsyntax.ScopeTraversalExpr:
		// Variable or constant reference, such as processID or badProcessID
		stmt = s.Traversal.RootName()
	case *hclsyntax.FunctionCallExpr:
		args := make([]string, len(s.Args))
		for i, arg := range s.Args {
			var err error
			args[i], err = ToGoCondition(arg)
			if err != nil {
				return "", err
			}
		}
		stmt = fmt.Sprintf("%s(%s)", s.Name, strings.Join(args, ", "))
	case *hclsyntax.BinaryOpExpr:
		op := ""
		switch s.Op {
		case hclsyntax.OpEqual:
			op = "=="
		case hclsyntax.OpNotEqual:
			op = "!="
		case hclsyntax.OpGreaterThan:
			op = ">"
		case hclsyntax.OpGreaterThanOrEqual:
			op = ">="
		case hclsyntax.OpLessThan:
			op = "<"
		case hclsyntax.OpLessThanOrEqual:
			op = "<="
		case hclsyntax.OpModulo:
			op = "%"
		case hclsyntax.OpLogicalOr:
			op = "||"
		case hclsyntax.OpLogicalAnd:
			op = "&&"
		case hclsyntax.OpLogicalNot:
			op = "!"
		default:
			return "", fmt.Errorf("unknown operator '%s' in ternary expression", s.Op.Type.GoString())
		}
		lhs, err := ToGoCondition(s.LHS)
		if err != nil {
			return "", err
		}
		rhs, err := ToGoCondition(s.RHS)
		if err != nil {
			return "", err
		}
		stmt = fmt.Sprintf("%s %s %s", lhs, op, rhs)
	case *hclsyntax.TemplateExpr:
		// Concatenate the Go text of every template part
		val := ""
		for _, p := range s.Parts {
			out, err := ToGoCondition(p)
			if err != nil {
				return "", err
			}
			val += out
		}
		stmt = val
	case *hclsyntax.UnaryOpExpr:
		op := ""
		switch s.Op {
		case hclsyntax.OpNegate:
			op = "-"
		case hclsyntax.OpLogicalNot:
			op = "!"
		}
		out, err := ToGoCondition(s.Val)
		if err != nil {
			return "", err
		}
		stmt = fmt.Sprintf("%s%s", op, out)
	default:
		return "", fmt.Errorf("unknown expression type: %T", s)
	}
	return stmt, nil
}
Although the code appears complex, it is fairly simple: it traverses the AST and appends pieces of code as it goes in order to form the final conditional statement.
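To make the traversal easier to follow without the HCL library, here is a minimal, self-contained sketch of the same idea. The Var, Lit, Call and Binary types are hypothetical stand-ins for the hclsyntax nodes, not the real API, and the mapping of a `break : continue` ternary to an early `return nil` is inferred from the generated code shown earlier rather than taken from the actual generator.

```go
package main

import (
	"fmt"
	"strings"
)

// Expr is a hypothetical stand-in for an hclsyntax expression node.
type Expr interface{}

// Var is a variable or constant reference, e.g. processID.
type Var struct{ Name string }

// Lit is an integer literal, e.g. 0.
type Lit struct{ Value int }

// Call is a function call, e.g. len(processID).
type Call struct {
	Name string
	Args []Expr
}

// Binary is a binary operation, e.g. lhs == rhs.
type Binary struct {
	Op       string
	LHS, RHS Expr
}

// ToGo walks the tree depth-first and joins the Go text of each node.
func ToGo(e Expr) (string, error) {
	switch n := e.(type) {
	case Var:
		return n.Name, nil
	case Lit:
		return fmt.Sprintf("%d", n.Value), nil
	case Call:
		args := make([]string, len(n.Args))
		for i, a := range n.Args {
			s, err := ToGo(a)
			if err != nil {
				return "", err
			}
			args[i] = s
		}
		return fmt.Sprintf("%s(%s)", n.Name, strings.Join(args, ", ")), nil
	case Binary:
		lhs, err := ToGo(n.LHS)
		if err != nil {
			return "", err
		}
		rhs, err := ToGo(n.RHS)
		if err != nil {
			return "", err
		}
		return fmt.Sprintf("%s %s %s", lhs, n.Op, rhs), nil
	default:
		return "", fmt.Errorf("unknown expression type: %T", e)
	}
}

// RenderTernary turns `cond ? break : continue` into an early return.
// Any other pair of branches is rejected in this sketch.
func RenderTernary(cond Expr, trueStmt, falseStmt string) (string, error) {
	c, err := ToGo(cond)
	if err != nil {
		return "", err
	}
	if trueStmt != "break" || falseStmt != "continue" {
		return "", fmt.Errorf("unsupported ternary branches %q / %q", trueStmt, falseStmt)
	}
	return fmt.Sprintf("if %s {\n\treturn nil\n}\n", c), nil
}

func main() {
	// len(processID) == 0 || processID == badProcessID
	cond := Binary{
		Op:  "||",
		LHS: Binary{Op: "==", LHS: Call{Name: "len", Args: []Expr{Var{Name: "processID"}}}, RHS: Lit{Value: 0}},
		RHS: Binary{Op: "==", LHS: Var{Name: "processID"}, RHS: Var{Name: "badProcessID"}},
	}
	src, err := RenderTernary(cond, "break", "continue")
	if err != nil {
		panic(err)
	}
	fmt.Print(src)
}
```

Like the code above, this sketch joins operands with plain string formatting and adds no parentheses, so it relies on the HCL and Go operators having compatible precedence for the expressions it is given.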
The final code-generated handler for the above DSL looks like the below:
type ProcessHandler struct {
	graphStore *GraphStore // API to interact with Threat Graph
}

// Constructors and helper funcs removed to keep context of blog relevant

func (h *ProcessHandler) ProcessEvent(ctx context.Context, event Event) error {
	if event == nil {
		return nil
	}
	const (
		badProcessID = -1
	)
	processID, err := event.GetField("ProcessId")
	if err != nil {
		return err
	}
	if len(processID) == 0 || processID == badProcessID {
		return nil
	}
	host, err := event.GetField("Hostname")
	if err != nil {
		return nil
	}
	user, err := userlib.GetUser()
	if err != nil {
		return err
	}
	processVertex := NewVertex("Process", processID)
	processVertex.AddProperty("process_id", processID)
	processVertex.AddProperty("timestamp", time.Now())
	userVertex := NewVertex("User", user)
	userVertex.AddProperty("timestamp", time.Now())
	hostVertex := NewVertex("Host", host)
	hostVertex.AddProperty("timestamp", time.Now())
	userToHostOutEdge := NewEdge(userVertex, UserHostEdge, DirectionOut, hostVertex)
	userToHostOutEdge.AddProperty("timestamp", time.Now())
	userToHostInEdge := NewEdge(userVertex, HostUserEdge, DirectionIn, hostVertex)
	userToHostInEdge.AddProperty("timestamp", time.Now())
	userToProcessOutEdge := NewEdge(userVertex, UserProcessEdge, DirectionOut, processVertex)
	userToProcessOutEdge.AddProperty("timestamp", time.Now())
	err = h.graphStore.SaveVertices(userVertex, hostVertex, processVertex)
	if err != nil {
		return err
	}
	err = h.graphStore.SaveEdges(userToHostOutEdge, userToHostInEdge, userToProcessOutEdge)
	if err != nil {
		return err
	}
	return nil
}
Below is a visual representation of graph mutations we get from the above DSL spec.
And here is the data representation of the graph:
Vertex ID | Type | Time | Adjacent ID | Properties |
proc:1234 | ProcessVertex | 2021-01-01T07:11:00Z | | <binary blob of props> |
user:jsmith | UserVertex | 2021-01-01T05:10:00Z | | <binary blob of props> |
user:jsmith | UserHostEdge | 2021-01-01T05:09:00Z | host:DC-123 | <binary blob of props> |
user:jsmith | UserProcessEdge | 2021-01-01T05:08:00Z | proc:1234 | <binary blob of props> |
host:DC-123 | HostVertex | 2021-01-01T05:08:00Z | | <binary blob of props> |
host:DC-123 | HostUserEdge | 2021-01-01T05:08:00Z | user:jsmith | <binary blob of props> |
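One way to read the table is that every vertex and every edge becomes a row keyed by a vertex ID, with edges stored under their source vertex and pointing at the adjacent one. The sketch below models that reading in memory. The Row struct and the VertexRow, EdgeRow, Adjacent and SampleRows functions are hypothetical names for this example; how Threat Graph actually stores these rows is not described in this post.

```go
package main

import (
	"fmt"
	"time"
)

// Row is a hypothetical in-memory form of one line of the table.
type Row struct {
	VertexID   string
	Type       string
	Time       time.Time
	AdjacentID string // empty for vertex rows
	Props      []byte // opaque, serialized properties
}

// VertexRow builds the row that stores a vertex itself.
func VertexRow(id, typ string, ts time.Time, props []byte) Row {
	return Row{VertexID: id, Type: typ, Time: ts, Props: props}
}

// EdgeRow builds the row that stores an edge under its source vertex.
func EdgeRow(from, typ string, ts time.Time, to string, props []byte) Row {
	return Row{VertexID: from, Type: typ, Time: ts, AdjacentID: to, Props: props}
}

// Adjacent returns the IDs reachable from id over edges of the given type.
func Adjacent(rows []Row, id, edgeType string) []string {
	var out []string
	for _, r := range rows {
		if r.VertexID == id && r.Type == edgeType && r.AdjacentID != "" {
			out = append(out, r.AdjacentID)
		}
	}
	return out
}

// SampleRows reproduces the rows of the table; property blobs are left nil.
func SampleRows() []Row {
	at := func(h, m int) time.Time { return time.Date(2021, 1, 1, h, m, 0, 0, time.UTC) }
	return []Row{
		VertexRow("proc:1234", "ProcessVertex", at(7, 11), nil),
		VertexRow("user:jsmith", "UserVertex", at(5, 10), nil),
		EdgeRow("user:jsmith", "UserHostEdge", at(5, 9), "host:DC-123", nil),
		EdgeRow("user:jsmith", "UserProcessEdge", at(5, 8), "proc:1234", nil),
		VertexRow("host:DC-123", "HostVertex", at(5, 8), nil),
		EdgeRow("host:DC-123", "HostUserEdge", at(5, 8), "user:jsmith", nil),
	}
}

func main() {
	rows := SampleRows()
	fmt.Println(Adjacent(rows, "user:jsmith", "UserHostEdge"))
	fmt.Println(Adjacent(rows, "host:DC-123", "HostUserEdge"))
}
```

Storing the user-to-host relationship as two rows, one under each vertex, is what lets a lookup start from either end without scanning the other vertex's rows.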
Migration
All new handlers written in Threat Graph will use our DSL. However, existing handlers need to be migrated to be represented by the DSL to make the process consistent and maintainable. One way to validate that code generated from the DSL will produce the same graph mutations as handwritten handlers is by conducting tests that pass the same events with the same values to each of these handlers. These mutations generated by the handlers can then be recorded and compared. This gives us the confidence that we have full parity before we cut over to leveraging the handlers generated from the DSL.
Challenges
As happens with many big projects, our team encountered several challenges during this process. These included:
- Migration: When migrating from handwritten handlers to HCL handlers, it is important to consider the parity (functionality) of graph mutations, as well as the performance of the generated code. The latter plays an equal or even greater role as a deciding factor in deploying the DSL-generated handler. To that end, we collected a baseline performance profile of the handwritten handler and compared it with the DSL-generated handlers. This allowed our team to generate a 30% performance gain through improved processing time and memory allocations.
- Performance and scalability: When operating at scale, adding a toolset to the arsenal should improve team velocity, as well as enhance overall system scalability, performance and maintenance. By standardizing the code generation, we were able to create a stable code base that offers our team greater predictability and enhanced performance.
- Tuning: Our team continuously tunes the generated code in order to reduce the memory allocation and improve processing times on the handler. One of the major changes we made is caching the known event fields and properties on the event. This allows us to create them in the generated code and avoid looping them at run time.
- Balance of features in a DSL: Finding the balance between the available feature set of the DSL and the necessary features is important to maintaining the overall functionality of the tool. Adding too many features may result in a new language, which would require the team to learn a new DSL or limit inputs from other teams.
- Uncertainty: Our decision to use HCL as our DSL came after careful consideration. As part of our process, we built prototypes in YAML and JSON and went through our review process to evaluate simplicity, readability and ease of use. Ultimately, HCL offered the best overall performance and the least amount of toolchain modifications.
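The parity check described under Migration (pass the same events to both handlers, record the mutations, compare them) can be sketched roughly as follows. The Event and Handler types, the Record and Diff helpers, and the two toy handlers are hypothetical and exist only for this example; mutations are reduced to strings to keep the comparison simple.

```go
package main

import (
	"fmt"
	"sort"
)

// Event is a hypothetical flat event: field name to value.
type Event map[string]string

// Handler processes an event and reports each graph mutation via record.
type Handler func(e Event, record func(mutation string))

// Record runs a handler over the events and returns its mutations, sorted
// so that ordering differences between handlers do not count as a mismatch.
func Record(h Handler, events []Event) []string {
	var out []string
	for _, e := range events {
		h(e, func(m string) { out = append(out, m) })
	}
	sort.Strings(out)
	return out
}

// Diff returns the mutations produced by only one of the two handlers.
func Diff(a, b []string) (onlyA, onlyB []string) {
	count := map[string]int{}
	for _, m := range a {
		count[m]++
	}
	for _, m := range b {
		if count[m] > 0 {
			count[m]--
		} else {
			onlyB = append(onlyB, m)
		}
	}
	for _, m := range a {
		if count[m] > 0 {
			onlyA = append(onlyA, m)
			count[m]--
		}
	}
	return onlyA, onlyB
}

// handwritten stands in for an existing, manually written handler.
func handwritten(e Event, record func(string)) {
	record("vertex Process " + e["ProcessId"])
	record("vertex Host " + e["Hostname"])
}

// generated stands in for the DSL-generated handler; it emits the same
// mutations in a different order.
func generated(e Event, record func(string)) {
	record("vertex Host " + e["Hostname"])
	record("vertex Process " + e["ProcessId"])
}

func main() {
	events := []Event{{"ProcessId": "1234", "Hostname": "DC-123"}}
	onlyA, onlyB := Diff(Record(handwritten, events), Record(generated, events))
	fmt.Println("parity:", len(onlyA) == 0 && len(onlyB) == 0)
}
```

Reporting the mutations unique to each side, rather than a single pass/fail flag, makes it easier to see which vertex or edge a migrated handler dropped or added.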
Success
To recap, below are the three main things our team was able to achieve through a DSL:
- Simplify the process of new contributions to the Threat Graph data model
- Create a Threat Graph handler that is intuitive, performant and safe
- Reduce the time and resources needed from the Threat Graph team to validate extensions of the graph