• Proj THUDBFuzz Paper Reading: Ankou: Guiding Greybox Fuzzing towards Combinatorial Difference



    Abstract

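    P1: Introduces greybox fuzzing. Shortcoming: existing fitness functions cannot distinguish different program executions that reach the same coverage, so the fuzzer easily gets stuck in a local optimum (the problem is that current fitness functions only consider a union of data, but not their combination).
    To solve this problem and escape local optima, the paper proposes Ankou.
    Characteristics: greybox; able to recognize different combinations of execution information.
    Experiments:
    Competing tools: AFL, Angora
    Result: 1.94x-8.0x more effective in finding bugs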

    1. Intro

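    P1: Introduces fuzzing
    P2: Seeds; the fitness function (measures the quality of a test case)
    P3: The mainstream fitness function: code coverage
    P4: Shortcoming of code coverage: some test cases explore valuable execution paths but are ignored because they cover no new basic block. For example, buffer overflow bugs often do not show up the first time the code is covered; a loop has to be executed a number of times before the bug manifests.
    P5: A fitness function needs to satisfy:
    C1: Informative: able to quantify the difference between program executions
    C2: Fast to compute
    C3: Should not accept too many seeds, so that they can be handled in a practical manner
    P6: C1: fitness functions usually cannot do both of the following at once: 1. decide whether a seed should be selected; 2. decide whether one seed deserves selection more than other seeds
    P7: C2
    P8: C3
    P9: Distance-based fuzzing:
    C1: distance-based fitness functions
    C2: dynamic PCA
    C3: adaptive seed pool update
    P10: Distance-based fitness function: scores the behavioral similarity of two executions by measuring the combination of branches exercised in each of them
    P11: Introducing the distance-based fitness function slows fuzzer execution down by 13.22x; dynamic PCA is used to address this
    P12,13: PCA, dynamic PCA: make the PCA computation incremental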
    P13: we can compare test cases based on their fitness to actively decide the sensitivity of the pool update function

    2. Background

    2.1 Fitness and Local Optimum Problem

    P1: we say we have reached a local optimum as we cannot obtain any more test cases that fulfill our fitness criterion even though we have not yet tested all possible executions of the PUT.
    P2: An example of the limitation of coverage
    P3: AFL branch-hit-count state
    P4: An example of the limitation of AFL coverage
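
    The limitation noted in P3 and P4 can be illustrated with a small sketch. It assumes a simplified model of AFL: hit counts are bucketed as 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+, and a test case is kept only if some branch lands in a bucket never seen for that branch. The names `UnionFitness` and the traces are hypothetical, not taken from the paper or from AFL's source.

```python
from collections import Counter

# AFL-style hit-count buckets: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+
BUCKET_UPPER_BOUNDS = [1, 2, 3, 7, 15, 31, 127]


def bucket(hit_count):
    """Map a raw hit count to a bucket index (0 means the branch was not taken)."""
    if hit_count == 0:
        return 0
    for index, upper in enumerate(BUCKET_UPPER_BOUNDS, start=1):
        if hit_count <= upper:
            return index
    return len(BUCKET_UPPER_BOUNDS) + 1  # 128 hits or more


def hit_count_state(trace):
    """Branch-hit-count state of one execution: branch -> bucket index."""
    return {branch: bucket(n) for branch, n in Counter(trace).items()}


class UnionFitness:
    """Simplified model (assumption): keep a test case only if it shows a
    (branch, bucket) pair that no earlier execution has shown."""

    def __init__(self):
        self.seen = set()

    def is_interesting(self, trace):
        pairs = set(hit_count_state(trace).items())
        new_pairs = pairs - self.seen
        self.seen |= pairs
        return bool(new_pairs)


fitness = UnionFitness()
trace_x = ["A", "B", "B"]        # A hit once, B hit twice
trace_y = ["A", "A", "B"]        # A hit twice, B hit once
trace_z = ["A", "A", "B", "B"]   # A twice AND B twice: a new combination
results = [fitness.is_interesting(t) for t in (trace_x, trace_y, trace_z)]
print(results)
```

    In this model `trace_z` is rejected even though no earlier execution hit both branches twice, because each individual (branch, bucket) pair was already recorded. This is the "union, not combination" weakness the notes refer to.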

    2.2 PCA

    As the heading says (background on PCA).

    3. Distance-based Fuzzing Fitness

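    P1: The paper argues that AFL's branch-hit-count states already provide enough information to judge a test case's potential as a future seed
    P2: Two executions with the same coverage but different AFL coverage should have different vector representations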

    3.1 Fitness as Distance between Vectors

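    The Euclidean distance is used as the distance between two branch-hit-count executions. The novelty of the current test case is its minimum distance to all seeds already selected into the seed pool.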
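
    A minimal sketch of this fitness, assuming each execution has already been turned into a fixed-length vector of hit counts (one component per branch, in a fixed order). The function names and the sample vectors are illustrative.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length hit-count vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def novelty(candidate, seed_pool):
    """Minimum distance from the candidate to any seed already in the pool.

    An empty pool makes every candidate infinitely novel.
    """
    if not seed_pool:
        return math.inf
    return min(euclidean_distance(candidate, seed) for seed in seed_pool)


# Hypothetical hit-count vectors over three branches.
seed_pool = [[1, 2, 0], [2, 1, 0]]
print(novelty([2, 2, 0], seed_pool))  # new combination of known counts
print(novelty([1, 2, 0], seed_pool))  # identical to an existing seed
```

    Each call compares the candidate against every seed and every vector component, which is where the cost discussed in the next subsection comes from.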

    3.2 Impracticality of Distance based Fitness

    The O(mn) complexity makes this distance measure impractical.
    Remedies:

    1. M-tree
    2. PCA
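
    To show why PCA helps with the cost, the sketch below projects hit-count vectors onto a few principal components and measures distances in the reduced space. It is a plain batch PCA computed by power iteration in pure Python, chosen only to keep the example self-contained; it is not the paper's dynamic PCA, and all function names and the sample data are assumptions.

```python
import math


def mean_vector(rows):
    return [sum(column) / len(rows) for column in zip(*rows)]


def covariance_matrix(rows, mean):
    dim = len(mean)
    centered = [[x - m for x, m in zip(row, mean)] for row in rows]
    return [[sum(r[i] * r[j] for r in centered) / len(rows) for j in range(dim)]
            for i in range(dim)]


def mat_vec(matrix, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_components(cov, k, iterations=200):
    """Up to k leading eigenvectors of cov, via power iteration with deflation."""
    matrix = [row[:] for row in cov]
    dim = len(matrix)
    components = []
    for _ in range(k):
        vec = normalize([1.0 + 0.1 * i for i in range(dim)])  # deterministic start
        for _step in range(iterations):
            nxt = mat_vec(matrix, vec)
            if math.sqrt(sum(x * x for x in nxt)) < 1e-12:
                return components  # no variance left to explain
            vec = normalize(nxt)
        eigenvalue = sum(a * b for a, b in zip(vec, mat_vec(matrix, vec)))
        components.append(vec)
        # Deflate: remove the direction just found before looking for the next.
        matrix = [[matrix[i][j] - eigenvalue * vec[i] * vec[j] for j in range(dim)]
                  for i in range(dim)]
    return components


def project(vector, mean, components):
    centered = [x - m for x, m in zip(vector, mean)]
    return [sum(c * x for c, x in zip(comp, centered)) for comp in components]


def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical 4-branch hit-count vectors that vary along a single direction.
rows = [[1, 1, 0, 0], [2, 2, 0, 0], [3, 3, 0, 0], [4, 4, 0, 0]]
mean = mean_vector(rows)
components = top_components(covariance_matrix(rows, mean), k=1)
full = distance(rows[0], rows[3])
reduced = distance(project(rows[0], mean, components),
                   project(rows[3], mean, components))
print(full, reduced)
```

    Because all the variance in this toy data lies along one direction, a single component preserves the distance while each comparison touches one number instead of four. Real hit-count data would lose some information in the projection.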

    4. Dynamic PCA


    5. Distance-based fuzzing

    5.1 Adaptive Seed Pool Update



    The threshold is the global minimum distance.
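
    One way to read this rule is sketched below. It assumes "global minimum distance" means the smallest distance between any two seeds currently in the pool, and that a candidate is accepted only when its novelty exceeds that value. The `max_size` cap and the eviction rule (drop one seed of the closest pair) are illustrative assumptions added to show how the threshold can adapt; they are not taken from the paper.

```python
import math
from itertools import combinations


def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


class SeedPool:
    def __init__(self, max_size=3):
        self.seeds = []
        self.max_size = max_size  # hypothetical cap on the pool size

    def threshold(self):
        """Smallest distance between any two seeds (0.0 with fewer than two)."""
        if len(self.seeds) < 2:
            return 0.0
        return min(distance(u, v) for u, v in combinations(self.seeds, 2))

    def novelty(self, candidate):
        if not self.seeds:
            return math.inf
        return min(distance(candidate, seed) for seed in self.seeds)

    def offer(self, candidate):
        """Accept the candidate only if it is farther from the pool than the
        two closest seeds are from each other."""
        if self.novelty(candidate) <= self.threshold():
            return False
        self.seeds.append(candidate)
        if len(self.seeds) > self.max_size:
            self._evict_from_closest_pair()
        return True

    def _evict_from_closest_pair(self):
        # Assumed eviction rule: drop the older seed of the closest pair,
        # which can only raise (never lower) the threshold.
        older, _newer = min(combinations(self.seeds, 2),
                            key=lambda pair: distance(*pair))
        self.seeds.remove(older)


pool = SeedPool(max_size=3)
candidates = [(0, 0), (1, 0), (1.5, 0), (0, 3), (4, 0), (2, 0)]
decisions = [pool.offer(c) for c in candidates]
print(decisions, pool.seeds, pool.threshold())
```

    In this run the threshold starts at 1.0 once two seeds exist and rises to 3.0 after the eviction, so later candidates must be more distinct to enter the pool.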

    5.2 Ankou Architecture

  • Original post: https://www.cnblogs.com/xuesu/p/14644167.html