Create a pandas DataFrame from multiple dicts [duplicate]

Question

To make a DataFrame from a dictionary, you can pass a list of dictionaries:

>>> person1 = {'type': 01, 'name': 'Jhon', 'surname': 'Smith', 'phone': '555-1234'}
>>> person2 = {'type': 01, 'name': 'Jannette', 'surname': 'Jhonson', 'credit': 1000000.00}
>>> animal1 = {'type': 03, 'cname': 'cow', 'sciname': 'Bos....', 'legs': 4, 'tails': 1 }
>>> pd.DataFrame([person1])
   name     phone surname  type
0  Jhon  555-1234   Smith     1
>>> pd.DataFrame([person1, person2])
    credit      name     phone  surname  type
0      NaN      Jhon  555-1234    Smith     1
1  1000000  Jannette       NaN  Jhonson     1
>>> pd.DataFrame.from_dict([person1, person2])
    credit      name     phone  surname  type
0      NaN      Jhon  555-1234    Smith     1
1  1000000  Jannette       NaN  Jhonson     1

For the more fundamental issue of two differently-formatted files intermixed, and assuming the files aren’t so big that we can’t read them and store them in memory, I’d use StringIO to make an object which is sort of like a file but which only has the lines we want, and then use read_fwf (fixed-width-file). For example:

from StringIO import StringIO

def get_filelike_object(filename, line_prefix):
    s = StringIO()
    with open(filename, "r") as fp:
        for line in fp:
            if line.startswith(line_prefix):
                s.write(line)
    s.seek(0)
    return s

and then

>>> type01 = get_filelike_object("animal.dat", "01")
>>> df = pd.read_fwf(type01, names="type name surname phone credit".split(), 
                     widths=[2, 10, 10, 8, 11], header=None)
>>> df
   type      name  surname     phone     credit
0     1      Jhon    Smith  555-1234        NaN
1     1  Jannette  Jhonson       NaN  100000000

should work. Of course you could also separate the files into different types before pandas ever sees them, which might be easiest of all.

Leave a Comment Cancel reply