目录

Thrift对象的序列化与反序列化

目录

对象的序列化与序列化,可能大家更多接触的是谷歌的protobuf。

Thrift作为一个跨语言的RPC代码生成引擎,也具备此功能。

本文要说的是如何使用Thrift实现对象的序列化与反序列化,其实就是,如何以protobuf的方式使用Thrift。

Thrift描述文件:

1
2
3
4
5
# filename: demo.thrift
struct Node {
    1: string host
    2: i32 port
}

以生成的Python代码为例,Thrift生成的类型提供了两个关键方法:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
class Node:
  """
  Attributes:
   - host
   - port
  """
  thrift_spec = (
    None, # 0
    (1, TType.STRING, 'host', None, None, ), # 1
    (2, TType.I32, 'port', None, None, ), # 2
  )
  
  def __init__(self, host=None, port=None,):
    self.host = host
    self.port = port

  def read(self, iprot):
    ...

  def write(self, oprot):
    ...

read/write方法按照指定协议传输对象,所以需要一个TProtocol对象。

TProtocol对象构造时需要传入一个TTransport对象,即传输层,所以还需要一个TTransport对象。

由于数据已经准备完毕,要做的只是反序列化。

好,TMemoryBuffer满足需求。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
class TMemoryBuffer(TTransportBase, CReadableTransport):
  """Wraps a cStringIO object as a TTransport.

  NOTE: Unlike the C++ version of this class, you cannot write to it
        then immediately read from it.  If you want to read from a
        TMemoryBuffer, you must either pass a string to the constructor.
  TODO(dreiss): Make this work like the C++ version.
  """

  def __init__(self, value=None):
    """value -- a value to read from for stringio

    If value is set, this will be a transport for reading,
    otherwise, it is for writing"""
    if value is not None:
      self._buffer = StringIO(value)
    else:
      self._buffer = StringIO()

TMemoryBuffer继承TTransportBase,也属于一种TTransport,内部封装了一个StringIO对象。

利用目标数据构造一个TMemoryBuffer对象,然后调用read/write方法实现反序列化和序列化。

需要注意的是,Python在初始化TMemoryBuffer对象时必须指定value。

序列化/反序列化的示例代码:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
#! /usr/bin/env python
# -*- coding: utf-8 -*-

import sys
sys.path.append('gen-py')

from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
from demo.ttypes import *

def serialize(th_obj):
    """ Serialize. 
    """
    tmembuf = TTransport.TMemoryBuffer()
    tbinprot = TBinaryProtocol.TBinaryProtocol(tmembuf)
    th_obj.write(tbinprot)
    return tmembuf.getvalue()

def deserialize(val, th_obj_type):
    """ Deserialize. 
    """
    th_obj = th_obj_type()
    tmembuf = TTransport.TMemoryBuffer(val)
    tbinprot = TBinaryProtocol.TBinaryProtocol(tmembuf)
    th_obj.read(tbinprot)
    return th_obj

if __name__ == '__main__':
    node1 = Node('localhost', 8000)
    print 'node1:', node1

    # modified
    node1.host = '127.0.0.1'
    node1.port = 9000

    val = serialize(node1)
    node2 = deserialize(val, Node)
    print 'node2:', node2

输出结果:

1
2
node1: Node(host='localhost', port=8000)
node2: Node(host='127.0.0.1', port=9000)