Tracee and Falco are two tools that use eBPF to detect system events; Falco can additionally detect Kubernetes-specific events.
Tracee
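Running the tracee container here means running its two components: tracee-ebpf (a Linux tracing and forensics program that uses eBPF) and tracee-rules (a runtime security rule detection engine). By default, rules are read from the /tracee/rules/ directory inside the container. The --rules-dir flag selects a different directory to load rules from. All rules in that directory are loaded by default; to load only some of them, use the --rules flag. You can also pass -v /root/my_rule:/tracee/rules to the tracee container to use the rules in /root/my_rule on the host.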
Tracee-Rules consists of three basic parts:
- Inputs - Event sources to be processed. Currently only Tracee-eBPF is a supported event source.
- Rules (a.k.a Signatures) - The particular behavioral pattern to detect in the input source. Signatures can be authored in Golang, or Rego (OPA).
- Outputs - How to communicate detections. Print to stdout, post to a webhook, or integrate with external systems.
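Outputs are easy to understand: they define the format and destination of detections. What does the input (the --input-tracee flag) mean? The input is the event source for tracee-rules, and it is supplied by tracee-ebpf: tracee-ebpf prints events to standard output or to a file, and tracee-rules reads events from that input and processes them.
tracee-rules offers two ways to write custom rules: rule text written in the Rego language (.rego files), or rules implemented against the Go interface.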
The tracee full container will compile an eBPF object during startup, if you do not have one already cached in /tmp/tracee.
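Rego rule example
How do you write a custom rule that catches suspicious system behavior? Once the rule below is loaded by tracee, running the command 'strace ls' on the host where the container runs triggers an event.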
package tracee.TRC_2

__rego_metadoc__ := {
    "id": "TRC-2",
    "version": "0.1.0",
    "name": "Anti-Debugging",
    "description": "Process uses anti-debugging technique to block debugger",
    "tags": ["linux", "container"],
    "properties": {
        "Severity": 3,
        "MITRE ATT&CK": "Defense Evasion: Execution Guardrails",
    }
}

tracee_selected_events[eventSelector] {
    eventSelector := {
        "source": "tracee",
        "name": "ptrace"
    }
}

tracee_match {
    input.eventName == "ptrace"
    arg := input.args[_]
    arg.name == "request"
    arg.value == "PTRACE_TRACEME"
}
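In eventSelector, source is normally set to tracee (in every example in the official repository, source is tracee). The name field is the name of the system event and usually matches input.eventName in tracee_match. A single rego file may contain several tracee_match rules; for example, code_injection.rego under https://github.com/aquasecurity/tracee/blob/v0.7.0/signatures/rego/ has several, some of which return a value and some of which do not. With these rules loaded, running strace ls on the host, or trying to write to a /proc/xx/mem file, triggers the corresponding alert. As the output below shows, when a tracee_match returns a value, that value is written to the Data field.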
*** Detection ***
Time: 2022-05-11T03:07:15Z
Signature ID: TRC-2
Signature: Anti-Debugging
Data: map[]
Command: strace
Hostname: test-virtual-mach
*** Detection ***
Time: 2022-05-11T03:06:09Z
Signature ID: TRC-3
Signature: Code injection
Data: map[file flags:O_WRONLY|O_CREAT|O_LARGEFILE file path:/proc/23/mem]
Command: vim
Hostname: test-virtual-mach
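To make the matching logic of the TRC-2 rule concrete, here is a minimal Go sketch of the same check. The Event and Argument types are simplified, hypothetical stand-ins and not Tracee's real types; only the event name, the argument name, and the argument value are taken from the rego rule above.

```go
package main

import "fmt"

// Argument and Event are hypothetical, simplified stand-ins for the parsed
// events emitted by tracee-ebpf. They are NOT Tracee's real types.
type Argument struct {
	Name  string
	Value interface{}
}

type Event struct {
	EventName string
	Args      []Argument
}

// matchAntiDebugging mirrors the body of tracee_match in TRC-2: the event
// must be "ptrace" and carry an argument request == "PTRACE_TRACEME".
func matchAntiDebugging(e Event) bool {
	if e.EventName != "ptrace" {
		return false
	}
	// The loop plays the role of `arg := input.args[_]` in Rego.
	for _, arg := range e.Args {
		if arg.Name == "request" && arg.Value == "PTRACE_TRACEME" {
			return true
		}
	}
	return false
}

func main() {
	traceme := Event{EventName: "ptrace", Args: []Argument{{Name: "request", Value: "PTRACE_TRACEME"}}}
	attach := Event{EventName: "ptrace", Args: []Argument{{Name: "request", Value: "PTRACE_ATTACH"}}}
	fmt.Println(matchAntiDebugging(traceme), matchAntiDebugging(attach))
}
```

The Rego version needs no explicit loop because `input.args[_]` iterates over all arguments implicitly; the rule is true as soon as any argument satisfies both conditions.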
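In addition, illegitimate_shell.go shows how to restrict a rule to events raised by specific processes. Finally, these rego files frequently call functions such as helpers.get_tracee_argument("pathname") and helpers.is_elf_file(xxx). The helpers functions are defined in /tracee/signatures/rego/helpers.rego in the tracee repository. The argument passed to get_tracee_argument can also be bytes, alert, type, and so on. To find out which arguments are available for an event, start only tracee-ebpf with the command below and inspect the event arguments it prints:
docker run --name tracee --rm -it --pid=host --cgroupns=host --privileged -v /etc/os-release:/etc/os-release-host:ro -e LIBBPFGO_OSRELEASE_FILE=/etc/os-release-host -e TRACEE_EBPF_ONLY=1 aquasec/tracee:0.7.0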
Rules written against the Go interface
package main

import (
	"fmt"
	"strconv"
	"strings"

	"github.com/aquasecurity/tracee/types/detect"
	"github.com/aquasecurity/tracee/types/protocol"
	"github.com/aquasecurity/tracee/types/trace"
)

// counter is a simple demo signature that counts towards a target
type counter struct {
	cb     detect.SignatureHandler
	target int
	count  int
}

// Init implements the Signature interface by resetting internal state
func (sig *counter) Init(cb detect.SignatureHandler) error {
	sig.cb = cb
	sig.count = 0
	return nil
}

// GetMetadata implements the Signature interface by declaring information about the signature
func (sig *counter) GetMetadata() (detect.SignatureMetadata, error) {
	return detect.SignatureMetadata{
		Version: "0.1.0",
		Name:    "count to " + strconv.Itoa(sig.target),
	}, nil
}

// GetSelectedEvents implements the Signature interface by declaring which events this signature subscribes to
func (sig *counter) GetSelectedEvents() ([]detect.SignatureEventSelector, error) {
	return []detect.SignatureEventSelector{{
		Source: "tracee",
		//Name: "execve",
	}}, nil
}

// OnEvent implements the Signature interface by handling each Event passed by the Engine. this is the business logic of the signature
func (sig *counter) OnEvent(event protocol.Event) error {
	ee, ok := event.Payload.(trace.Event)
	if !ok {
		return fmt.Errorf("failed to cast event's payload")
	}
	if ee.ArgsNum > 0 && ee.Args[0].Name == "pathname" && strings.HasPrefix(ee.Args[0].Value.(string), "yo") {
		sig.count++
	}
	if sig.count == sig.target {
		m, _ := sig.GetMetadata()
		sig.cb(detect.Finding{
			Data: map[string]interface{}{
				"count":    sig.count,
				"severity": "HIGH",
			},
			Event:       event,
			SigMetadata: m,
		})
		sig.count = 0
	}
	return nil
}

// OnSignal implements the Signature interface by handling lifecycle events of the signature
func (sig *counter) OnSignal(signal detect.Signal) error {
	source, sigcomplete := signal.(detect.SignalSourceComplete)
	if sigcomplete && source == "tracee" {
		sig.cb(detect.Finding{
			Data: map[string]interface{}{
				"message": "done",
			},
		})
	}
	return nil
}
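Besides printing detections to standard output or a file, tracee can also raise alerts through external systems such as Prometheus by posting to a webhook, as shown below: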
tracee-rules --webhook http://my.webhook/endpoint \
--webhook-template /path/to/my.tmpl \
--webhook-content-type application/json
vagrant@ubuntu-impish:/vagrant$ sudo ./dist/tracee-ebpf \
--output=format:gob \
--output=option:parse-arguments \
| ./dist/tracee-rules \
--input-tracee=file:stdin \
--input-tracee=format:gob
Loaded 14 signature(s): [TRC-1 TRC-13 TRC-2 TRC-14 TRC-3 TRC-11 TRC-9 TRC-4 TRC-5 TRC-12 TRC-8 TRC-6 TRC-10 TRC-7]
*** Detection ***
Time: 2022-03-26T18:48:00Z
Signature ID: TRC-2
Signature: Anti-Debugging
Data: map[]
Command: strace
Hostname: ubuntu-impish
For example, the command sudo tracee-ebpf -o format:gob | tracee-rules --input-tracee file:stdin --input-tracee format:gob
runs as follows:
- Start tracee-ebpf with the default tracing mode (see Tracee-eBPF's help for more info).
- Configure Tracee-eBPF to output events into stdout as gob format, and add a terminating event to signal end of stream.
- Start tracee-rules with all built-in rules enabled.
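The input value can be file:stdin, file:/root/custom-file, format:gob, or format:json. gob is an output format in the same sense as JSON or XML. It is Go's native binary serialization (the encoding/gob package; the name is short for "Go binary") and is a more efficient representation than JSON when both ends are Go programs. It is not the wire format used by gRPC, which is Protocol Buffers by default.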
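As a rough illustration of what the gob format is, the sketch below encodes a value with Go's standard encoding/gob package and decodes it again. The sampleEvent type and its fields are hypothetical and far simpler than the events tracee-ebpf actually emits; only the encoding/gob calls are real.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// sampleEvent is a hypothetical, simplified event used only for this demo.
// gob only encodes exported fields, so every field name is capitalized.
type sampleEvent struct {
	EventName string
	ProcessID int
	Args      map[string]string
}

// roundTripGob encodes the event into a gob byte stream, decodes it back,
// and reports how many bytes the encoded form occupied.
func roundTripGob(in sampleEvent) (sampleEvent, int, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return sampleEvent{}, 0, err
	}
	size := buf.Len()

	var out sampleEvent
	if err := gob.NewDecoder(&buf).Decode(&out); err != nil {
		return sampleEvent{}, 0, err
	}
	return out, size, nil
}

func main() {
	in := sampleEvent{
		EventName: "ptrace",
		ProcessID: 23,
		Args:      map[string]string{"request": "PTRACE_TRACEME"},
	}
	out, size, err := roundTripGob(in)
	if err != nil {
		panic(err)
	}
	fmt.Printf("encoded size: %d bytes, decoded: %+v\n", size, out)
}
```

A gob stream sends the type description once and then only the values, which is why it suits a long-running pipe between two Go programs such as tracee-ebpf and tracee-rules.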
Falco
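Falco is more rule-centric. It currently supports mainly two event engines, sysdig and k8saudit; official plugins additionally support AWS CloudTrail and Okta as event sources. The system calls Falco supports are listed at https://falco.org/docs/rules/supported-events/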
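What if sysdig, libsinsp, or libscap are not fast enough to keep up with the stream of events coming from the kernel? Will sysdig slow the system down the way strace does? No. In that case the event buffer fills up and sysdig-probe starts dropping incoming events. You lose some trace information, but the machine and the other processes running on it are not slowed down. (From bottom to top, the sysdig architecture is: sysdig-probe (tracepoints), libscap, libsinsp, sysdig. The libscap and libsinsp libraries provide support for reading, decoding, and parsing events. Specifically, libscap provides trace file management, while libsinsp provides sophisticated state tracking (for example, you can use a file name instead of an FD number), as well as filtering, event decoding, a Lua JIT compiler to run chisels, and more. Finally, sysdig is a simple wrapper around these libraries.)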
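The "drop instead of block" behavior described above can be modeled in a few lines of Go. This is only a conceptual sketch: the real buffer is a kernel-side ring buffer, not a Go channel, and the names offer and produce are made up for illustration.

```go
package main

import "fmt"

// offer tries to enqueue an event without blocking. It returns false when
// the buffer is full, in which case the event is simply dropped and the
// producer carries on at full speed.
func offer(buf chan int, ev int) bool {
	select {
	case buf <- ev:
		return true
	default:
		return false
	}
}

// produce pushes n events into a buffer of the given capacity while no
// consumer is reading, and counts how many were kept and how many dropped.
func produce(capacity, n int) (kept, dropped int) {
	buf := make(chan int, capacity)
	for ev := 0; ev < n; ev++ {
		if offer(buf, ev) {
			kept++
		} else {
			dropped++
		}
	}
	return kept, dropped
}

func main() {
	kept, dropped := produce(4, 10)
	fmt.Printf("kept=%d dropped=%d\n", kept, dropped) // prints "kept=4 dropped=6"
}
```

The design choice is the same in both cases: the producer (the traced system) never waits for the consumer (the tracer), so a slow tracer costs completeness of the trace rather than performance of the machine.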
Setting the environment variable FALCO_BPF_PROBE="" (an empty value) tells Falco to use its eBPF probe, so Falco can use eBPF with minimal configuration changes.
Consider the following requirements at run time:
- Your kernel has CONFIG_BPF_JIT enabled
- net.core.bpf_jit_enable is set to 1 (enable the BPF JIT Compiler)
- On kernels <5.8, Falco requires CAP_SYS_ADMIN, CAP_SYS_RESOURCE and CAP_SYS_PTRACE
- On kernels >=5.8, CAP_BPF and CAP_PERFMON were separated out of CAP_SYS_ADMIN, so the required capabilities are CAP_BPF, CAP_PERFMON, CAP_SYS_RESOURCE, CAP_SYS_PTRACE. Unfortunately, Docker does not yet support adding the two newly introduced capabilities with the --cap-add option. For this reason, we continue using CAP_SYS_ADMIN, given that it still allows performing the same operations granted by CAP_BPF and CAP_PERFMON. In the near future, Docker will support adding these two capabilities and we will be able to replace CAP_SYS_ADMIN.
Falco rules
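A Falco rules file is a YAML file containing three kinds of elements:
- Rules: the most important part. A rule is the condition under which an alert is generated. A rule is normally accompanied by a descriptive output string that is sent with the alert. Rules can use macros and lists.
- Macros: rule condition snippets that can be reused inside rules and even inside other macros. Macros provide a way to name common patterns and to factor out redundancy in rules. "Factoring out redundancy" simply means that a condition shared by several rules is written once as a macro and reused.
- Lists: unlike rules and macros, lists cannot be parsed as filtering expressions. A list is a collection of items that can be included in rules, macros, or other lists.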
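The following condition detects that bash was executed inside a container:
evt.type = execve and evt.dir = < and container.id != host and proc.name = bash
Here evt.type is the event type and evt.dir is the event direction (">" is an enter event, "<" is an exit event). container.id != host means the event did not happen on the host, that is, it happened inside a container, and proc.name is the process name. Falco supports further condition syntax; see https://falco.org/docs/rules/conditions/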
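Read as a boolean expression, the condition above is a conjunction of four field comparisons. The following Go sketch spells that out on a hypothetical event struct; the struct and the function name are illustrative only, and Falco itself evaluates conditions with its own filter engine rather than code like this.

```go
package main

import "fmt"

// event is a hypothetical struct carrying only the four fields that the
// condition refers to. The comments give the matching Falco field names.
type event struct {
	Type        string // evt.type
	Dir         string // evt.dir: ">" enter, "<" exit
	ContainerID string // container.id, "host" when not in a container
	ProcName    string // proc.name
}

// bashInContainer evaluates:
// evt.type = execve and evt.dir = < and container.id != host and proc.name = bash
func bashInContainer(e event) bool {
	return e.Type == "execve" &&
		e.Dir == "<" &&
		e.ContainerID != "host" &&
		e.ProcName == "bash"
}

func main() {
	inContainer := event{Type: "execve", Dir: "<", ContainerID: "796130fc3acc", ProcName: "bash"}
	onHost := event{Type: "execve", Dir: "<", ContainerID: "host", ProcName: "bash"}
	fmt.Println(bashInContainer(inContainer), bashInContainer(onHost))
}
```

Matching on the exit event ("<") rather than the enter event is common for execve, because the new process name is only known once the call has returned.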
After starting a Falco container or process, look at its rules file /etc/falco/falco_rules.yaml. It contains many rules; two of them are analyzed below:
- rule: Write below root
  desc: an attempt to write to any file directly below / or /root
  condition: >
    root_dir and evt.dir = < and open_write
    and proc_name_exists
    and not fd.name in (known_root_files)
    and not fd.directory pmatch (known_root_directories)
    and not exe_running_docker_save
    and not gugent_writing_guestagent_log
    and not dse_writing_tmp
    and not zap_writing_state
    and not airflow_writing_state
    and not rpm_writing_root_rpmdb
    and not maven_writing_groovy
    and not chef_writing_conf
    and not kubectl_writing_state
    and not cassandra_writing_state
    and not galley_writing_state
    and not calico_writing_state
    and not rancher_writing_root
    and not runc_writing_exec_fifo
    and not mysqlsh_writing_state
    and not known_root_conditions
    and not user_known_write_root_conditions
    and not user_known_write_below_root_activities
  output: "File below / or /root opened for writing (user=%user.name user_loginuid=%user.loginuid command=%proc.cmdline parent=%proc.pname file=%fd.name program=%proc.name container_id=%container.id image=%container.image.repository)"
  priority: ERROR
  tags: [filesystem, mitre_persistence]
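This rule fires when a file in the directories defined by the macro root_dir is opened for writing (macro open_write), the event is an exit event (evt.dir = <), the process name is not empty, and none of the exception macros that follow match. The alert is then emitted in the format given by output. The priority of the alert is ERROR, and tags are labels attached to the rule. For the other rule fields see https://falco.org/docs/rules/ and for the meaning of every field usable in a condition see https://falco.org/docs/rules/supported-fields/ . It is sometimes handy to find a rule by searching the rules file for keywords from the alert message. The > symbol after condition (and after output in the next rule) is YAML's folded block scalar indicator: the indented lines that follow are joined into a single string, with each line break replaced by a space.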
Now look at another rule:
- rule: Terminal shell in container
  desc: A shell was used as the entrypoint/exec point into a container with an attached terminal.
  condition: >
    spawned_process and container
    and shell_procs and proc.tty != 0
    and container_entrypoint
    and not user_expected_terminal_shell_in_container_conditions
  output: >
    A shell was spawned in a container with an attached terminal (user=%user.name user_loginuid=%user.loginuid %container.info
    shell=%proc.name parent=%proc.pname cmdline=%proc.cmdline terminal=%proc.tty container_id=%container.id image=%container.image.repository)
  priority: NOTICE
  tags: [container, shell, mitre_execution]
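This rule raises an alert when a shell matched by shell_procs is started inside a container with a terminal attached, for example by running either of the following commands: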
docker exec -ti 796130fc3acc /bin/bash
docker exec -ti 796130fc3acc sh /root/test.sh
Falco then prints the following:
2022-05-06T03:37:09.417552054+0000: Notice A shell was spawned in a container with an attached terminal (user=root user_loginuid=-1 unruffled_sinoussi (id=796130fc3acc) shell=bash parent=runc cmdline=bash terminal=34816 container_id=796130fc3acc image=citizenstig/dvwa)
2022-05-06T03:38:01.430140657+0000: Notice A shell was spawned in a container with an attached terminal (user=root user_loginuid=-1 unruffled_sinoussi (id=796130fc3acc) shell=sh parent=runc cmdline=sh /root/test.sh terminal=34816 container_id=796130fc3acc image=citizenstig/dvwa)
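In the condition, proc.tty != 0 means the process is attached to a terminal; 0 means there is no terminal. You can use ps -ef to see the TTY of each process. Many processes show a question mark in the TTY column, which means the process has no controlling terminal (daemons, for example).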
Custom rules go into /etc/falco/falco_rules.local.yaml inside the Falco container. If a custom rule has the same name as a rule in /etc/falco/falco_rules.yaml, the custom rule overrides the one in /etc/falco/falco_rules.yaml.
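The override behavior can be pictured as a map keyed by rule name that is filled file by file, so a later file wins. The Go sketch below models only that idea; the rule struct, the mergeRules function, and the sample conditions are hypothetical and do not reflect how Falco's loader is implemented.

```go
package main

import "fmt"

// rule is a hypothetical, minimal representation of a Falco rule.
type rule struct {
	Name      string
	Condition string
	Source    string // file the rule came from, kept for illustration
}

// mergeRules processes rule files in load order. A rule that appears later
// with the same name replaces the earlier definition.
func mergeRules(files ...[]rule) map[string]rule {
	merged := make(map[string]rule)
	for _, file := range files {
		for _, r := range file {
			merged[r.Name] = r
		}
	}
	return merged
}

func main() {
	defaults := []rule{
		{Name: "Terminal shell in container", Condition: "default condition", Source: "falco_rules.yaml"},
		{Name: "Write below root", Condition: "default condition", Source: "falco_rules.yaml"},
	}
	local := []rule{
		{Name: "Terminal shell in container", Condition: "custom condition", Source: "falco_rules.local.yaml"},
	}
	merged := mergeRules(defaults, local)
	fmt.Println(merged["Terminal shell in container"].Source)
	fmt.Println(merged["Write below root"].Source)
}
```

In this model the order of the arguments stands in for the order in which the rules files are loaded, with the local file loaded last.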