As of 2018, perf has no support for reading Python stack frames (cf. a 2014 Python mailing list discussion).
Python 3.6 has some support for DTrace and SystemTap.
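Whether the DTrace/SystemTap markers are usable depends on how your interpreter was built. A quick way to check, assuming a CPython build that exposes the `WITH_DTRACE` config variable (the helper name `has_dtrace_support` is just for illustration):

```python
import sysconfig


def has_dtrace_support():
    """Return True if this interpreter was configured with --with-dtrace.

    WITH_DTRACE is unset (None) or 0 on builds without the markers.
    """
    return bool(sysconfig.get_config_var("WITH_DTRACE"))


if __name__ == "__main__":
    print("DTrace/SystemTap markers compiled in:", has_dtrace_support())
```

If this prints `False`, the static markers are not available in that build and you need one of the sampling approaches instead.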
An alternative is Pyflame, a sampling (stochastic) profiler for Python that captures Python call stacks via ptrace(). In contrast to DTrace/SystemTap, you don’t need extra permissions, and it also works with Python builds that were compiled without instrumentation support.
When you use the --threads option with Pyflame, you see the Python lines that call into C/C++ extensions, although the stack trace stops at the last Python frame. This may be sufficient for your use case.
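To illustrate what "sampling Python call stacks" means, here is a rough in-process sketch. Note that this is not how Pyflame works internally: Pyflame reads the stacks from outside the process via ptrace(), whereas this sketch uses `sys._current_frames()` from a second thread inside the same process. The function names `sample_stacks` and `busy_work` are made up for the example.

```python
import collections
import sys
import threading
import time
import traceback


def sample_stacks(thread_id, interval=0.005, duration=0.2):
    """Periodically snapshot one thread's Python stack and count each stack."""
    counts = collections.Counter()
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            # Outermost call first, innermost call last.
            stack = tuple(f.name for f in traceback.extract_stack(frame))
            counts[stack] += 1
        time.sleep(interval)
    return counts


def busy_work(stop_event):
    """CPU-bound loop that stands in for the code being profiled."""
    while not stop_event.is_set():
        sum(i * i for i in range(1000))


if __name__ == "__main__":
    stop = threading.Event()
    worker = threading.Thread(target=busy_work, args=(stop,))
    worker.start()
    try:
        samples = sample_stacks(worker.ident)
    finally:
        stop.set()
        worker.join()
    for stack, hits in samples.most_common(3):
        print(hits, " > ".join(stack))
```

As with Pyflame, only Python frames show up here: time spent inside a C/C++ extension is attributed to the Python line that called into it.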
Edit: Pyflame was abandoned around the end of 2019. A Hacker News thread mentions the following alternatives:
- https://github.com/P403n1x87/austin
- https://github.com/benfred/py-spy
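For example, py-spy can attach to an already running process and write a flame graph. The sketch below only builds the command line; `py_spy_record_cmd` is a hypothetical helper, and it assumes py-spy is installed and on your PATH. The `--native` flag asks py-spy to also include frames from native extensions, which addresses the "stops at the last Python frame" limitation mentioned above (as far as I know it is not supported on every platform).

```python
def py_spy_record_cmd(pid, output="profile.svg", native=False):
    """Build the argv for `py-spy record` against a running process."""
    cmd = ["py-spy", "record", "--pid", str(pid), "--output", output]
    if native:
        # Also sample stack frames from C/C++ extensions.
        cmd.append("--native")
    return cmd


if __name__ == "__main__":
    # 12345 is a placeholder PID; pass the result to subprocess.run() to profile.
    print(" ".join(py_spy_record_cmd(12345, native=True)))
```

Depending on your system settings, attaching to another process may still require elevated privileges.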