Your benchmark is actually too small to show the real difference.
Appending, copies EACH time, so you are actually doing copying a size N memory space N*(N-1) times. This is horribly inefficient as the size of your dataframe grows. This certainly might not matter in a very small frame. But if you have any real size this matters a lot. This is specifically noted in the docs here, though kind of a small warning.
In [97]: df = DataFrame(np.random.randn(100000,20))
In [98]: df['B'] = 'foo'
In [99]: df['C'] = pd.Timestamp('20130101')
In [103]: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 100000 entries, 0 to 99999
Data columns (total 22 columns):
0 100000 non-null float64
1 100000 non-null float64
2 100000 non-null float64
3 100000 non-null float64
4 100000 non-null float64
5 100000 non-null float64
6 100000 non-null float64
7 100000 non-null float64
8 100000 non-null float64
9 100000 non-null float64
10 100000 non-null float64
11 100000 non-null float64
12 100000 non-null float64
13 100000 non-null float64
14 100000 non-null float64
15 100000 non-null float64
16 100000 non-null float64
17 100000 non-null float64
18 100000 non-null float64
19 100000 non-null float64
B 100000 non-null object
C 100000 non-null datetime64[ns]
dtypes: datetime64[ns](1), float64(20), object(1)
memory usage: 17.5+ MB
Appending
In [85]: def f1():
....: result = df
....: for i in range(9):
....: result = result.append(df)
....: return result
....:
Concat
In [86]: def f2():
....: result = []
....: for i in range(10):
....: result.append(df)
....: return pd.concat(result)
....:
In [100]: f1().equals(f2())
Out[100]: True
In [101]: %timeit f1()
1 loops, best of 3: 1.66 s per loop
In [102]: %timeit f2()
1 loops, best of 3: 220 ms per loop
Note that I wouldn’t even bother trying to pre-allocate. Its somewhat complicated, especially since you are dealing with multiple dtypes (e.g. you could make a giant frame and simply .loc and it would work). But pd.concat is just dead simple, works reliably, and fast.
And timing of your sizes from above
In [104]: df = DataFrame(np.random.randn(2500,40))
In [105]: %timeit f1()
10 loops, best of 3: 33.1 ms per loop
In [106]: %timeit f2()
100 loops, best of 3: 4.23 ms per loop