DBPD: A Dynamic Birthmark-based Software Plagiarism Detection Tool

Zhenzhou Tian

zztian@stu.xjtu.edu.cn

MOE Key Lab for Intelligent Networks and Network Security

Xi’an Jiaotong University, China

23/4/20

Introduction

Software plagiarism has been a serious threat to the healthy development of the software industry:

• Licenses are violated for commercial interests, or unwittingly
• Weak code protection awareness
• Powerful automated code obfuscation tools
• Software is distributed in binary form

Introduction

Many software birthmark based techniques have been proposed:

• Static birthmarks: CVFV, SMC, IS, UC…
• Dynamic birthmarks: WPP, SCSSB, SCDG, DKISB…

Few tools are publicly available.

Dynamic birthmarks are believed to perform better than static birthmarks.

Tool           Static/Dynamic   Language
Sandmark       Static           Java bytecode
Stigmata       Static           Java bytecode
Birthmarking   Dynamic          Java bytecode
JPlag          Static           Source code

Framework of DBPD

Software Birthmark: a set of characteristics extracted from a program that reflects intrinsic properties of the program and can be used to identify the program uniquely.

Design Overview

[Figure: DBPD architecture. Components shown: Input (Plaintiff Binary, Defendant Binary); Dynamic Analysis Module; Birthmark Generator (DKISB Generator, SODB Generator, SCSSB Generator); Similarity Calculator & Decision Maker.]
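
The slides name a Similarity Calculator & Decision Maker but do not say which metric or threshold it uses. As a minimal sketch only, the Python below assumes each birthmark is a set of short sequences, scores two birthmarks with Jaccard similarity, and flags the pair when the score reaches a cutoff. The names `jaccard` and `decide`, the example birthmarks, and the 0.7 threshold are all hypothetical and are not taken from DBPD.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b| (assumed metric, not DBPD's)."""
    union = a | b
    if not union:
        # Two empty birthmarks carry no evidence; treat them as dissimilar.
        return 0.0
    return len(a & b) / len(union)


def decide(similarity: float, threshold: float = 0.7) -> bool:
    """Return True when the score reaches the (hypothetical) threshold."""
    return similarity >= threshold


# Hypothetical birthmarks for the plaintiff and defendant binaries.
plaintiff = {("open", "read"), ("read", "write"), ("write", "close")}
defendant = {("open", "read"), ("read", "write"), ("write", "fsync")}

score = jaccard(plaintiff, defendant)
suspected = decide(score)
print(score, suspected)
```

In this example the two sets share two of four distinct elements, so the score falls below the assumed cutoff and the pair is not flagged.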

Three Dynamic Birthmarks

Three birthmark approaches are implemented:

• DKISB: Dynamic Key Instruction Sequence Birthmark. Generated using the k-gram algorithm from dynamic key instructions (instructions that are both value-updating and input-correlated).
• SCSSB: System Call Short Sequence Birthmark. Extracted by splitting the system call sequence into short sub-sequences.
• SODB: Stack Operation Dynamic Birthmark. Generated by analyzing the behavior of stack operations, using the push and pop patterns of the call stack to uniquely identify a program.
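
The k-gram step mentioned for DKISB, and the splitting into short sub-sequences mentioned for SCSSB, can both be pictured as a sliding window over a recorded trace. The Python below is a rough sketch under that assumption: `k_grams` is a hypothetical helper, the trace is made up, and how DBPD actually records, filters, or weights the windows is not described in the slides.

```python
from typing import Hashable, Sequence, Set, Tuple


def k_grams(seq: Sequence[Hashable], k: int) -> Set[Tuple[Hashable, ...]]:
    """Return the set of all contiguous length-k windows of seq."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A trace shorter than k yields no windows (empty range).
    return {tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)}


# Hypothetical system call trace recorded during one run of a program.
trace = ["open", "read", "read", "write", "close"]

birthmark = k_grams(trace, 3)
for gram in sorted(birthmark):
    print(gram)
```

Using a set discards how often each window occurs; a frequency-aware variant would count windows instead, which is a design choice the slides leave open.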

Demonstration

• Independently implemented software with similar functionalities
• Plagiarism using different compilers and optimization levels
• Plagiarism using specific obfuscation tools
• Cross-platform plagiarism scenario

Some Definitions
