copy.deepcopy vs pickle

Question

The problem is that pickle+unpickle can be faster (in the C implementation) because it’s less general than deepcopy: many objects can be deepcopied but not pickled. Suppose, for example, that your class A were changed to:

class A(object):
  class B(object): pass
  def __init__(self): self.b = self.B()

Now copy1 still works fine (A’s complexity slows it down but absolutely doesn’t stop it); copy2 and copy3 break, and the end of the stack trace says:

  File "./c.py", line 20, in copy3
    return cPickle.loads(cPickle.dumps(d, -1))
PicklingError: Can't pickle <class 'c.B'>: attribute lookup c.B failed

That is, pickling always assumes that classes and functions are top-level entities in their modules, and so pickles them “by name”; deepcopying makes absolutely no such assumption.
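The difference can be reproduced with a small, self-contained sketch. It is written for Python 3, so it uses `pickle` in place of `cPickle`, and it uses a class created inside a function rather than the nested class above, since newer pickle protocols may be able to resolve a class nested in another class by its qualified name. The names `make_instance`, `Local`, `deepcopy_copy` and `pickle_copy` are hypothetical and are not the question's `copy1`, `copy2`, `copy3`.

```python
import copy
import pickle


def make_instance():
    # Hypothetical local class: it cannot be found by looking up a
    # module-level name, which is what pickling "by name" requires.
    class Local(object):
        def __init__(self):
            self.values = [1, 2, 3]

    return Local()


def deepcopy_copy(obj):
    return copy.deepcopy(obj)


def pickle_copy(obj):
    # -1 selects the highest pickle protocol available.
    return pickle.loads(pickle.dumps(obj, -1))


original = make_instance()

# deepcopy rebuilds the instance from the class object itself.
clone = deepcopy_copy(original)
deepcopy_ok = clone is not original and clone.values == original.values
independent = clone.values is not original.values

# The pickle round trip fails; the exact exception type varies by version.
try:
    pickle_copy(original)
    pickle_ok = True
except (pickle.PicklingError, AttributeError, TypeError) as exc:
    pickle_ok = False
    print("pickle round trip failed:", exc)

print("deepcopy ok:", deepcopy_ok, "pickle ok:", pickle_ok)
```

The exception type is caught broadly because it has differed between Python versions; the point is only that the round trip fails while deepcopy succeeds.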

So if you have a situation where the speed of “somewhat deep-copying” is absolutely crucial and every millisecond matters, AND you want to take advantage of special limitations that you KNOW apply to the objects you’re duplicating (such as those that make pickling applicable, or ones favoring yet other forms of serialization and other shortcuts), by all means go ahead. But if you do, you MUST be aware that you’re constraining your system to live by those limitations forevermore, and you should document that design decision very clearly and explicitly for the benefit of future maintainers.
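If you do go down that road, it is worth measuring on your own data first and writing the limitation into the helper itself. The following is a minimal Python 3 sketch under stated assumptions: `data` is made-up sample data built only from built-in containers, and `deepcopy_copy` and `pickle_copy` are hypothetical helper names. Timings depend heavily on the data and the Python version, so no result is claimed here.

```python
import copy
import pickle
import timeit

# Hypothetical sample data: plain built-in containers, which both
# approaches can handle.
data = {"key%d" % i: list(range(20)) for i in range(200)}


def deepcopy_copy(obj):
    """General-purpose copy; works for anything deepcopy supports."""
    return copy.deepcopy(obj)


def pickle_copy(obj):
    """Copy via a pickle round trip.

    LIMITATION: only valid for objects that pickle can serialize.
    Callers must guarantee that every object reachable from obj is
    picklable; otherwise use deepcopy_copy instead.
    """
    return pickle.loads(pickle.dumps(obj, -1))


# Both helpers must produce an equal but distinct object.
assert deepcopy_copy(data) == data
assert pickle_copy(data) == data

# Time 50 copies with each approach and print the totals in seconds.
t_deepcopy = timeit.timeit(lambda: deepcopy_copy(data), number=50)
t_pickle = timeit.timeit(lambda: pickle_copy(data), number=50)
print("deepcopy: %.4fs  pickle round trip: %.4fs" % (t_deepcopy, t_pickle))
```

Keeping the restriction in the docstring of `pickle_copy` is one way to make the design decision visible to whoever maintains the code later.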

For the NORMAL case, where you want generality, use deepcopy!-)
