How to check if a PDF is a scanned image or contains text

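The code below extracts the text layer from a PDF using PyMuPDF. A page that is only a scanned image has no text layer, so get_text() returns an empty string for it. A PDF that yields no text at all is therefore most likely scanned.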

import fitz  # PyMuPDF

text = ""
path = "Your_scanned_or_partial_scanned.pdf"

doc = fitz.open(path)
for page in doc:
    # get_text() returns the page's text layer as a plain string
    text += page.get_text()

You can refer to this link for more information.

If you don’t have the fitz module, install PyMuPDF, which provides it:

pip install --upgrade pymupdf
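
To answer the question in the title more directly, the extracted text can be checked page by page. The following is a minimal sketch, not a definitive method. The helper names (classify_pages, summarize, check_pdf) and the min_chars threshold of 20 characters are assumptions made for illustration, and the threshold may need tuning for your documents.

```python
def classify_pages(page_texts, min_chars=20):
    """Label each page 'text' or 'scanned' by how much text was extracted.

    min_chars is an assumed threshold, not a PyMuPDF setting.
    """
    return [
        "text" if len(t.strip()) >= min_chars else "scanned"
        for t in page_texts
    ]


def summarize(labels):
    """Collapse per-page labels into one verdict for the whole PDF."""
    if not labels:
        return "empty"
    kinds = set(labels)
    if kinds == {"text"}:
        return "searchable"
    if kinds == {"scanned"}:
        return "scanned"
    return "mixed"


def check_pdf(path, min_chars=20):
    """Open a PDF with PyMuPDF and report searchable / scanned / mixed."""
    import fitz  # PyMuPDF; imported here so the helpers above work without it

    doc = fitz.open(path)
    try:
        texts = [page.get_text() for page in doc]
    finally:
        doc.close()
    return summarize(classify_pages(texts, min_chars))
```

Calling check_pdf("Your_scanned_or_partial_scanned.pdf") would return one of "searchable", "scanned", "mixed" or "empty". Note that a scanned PDF that has already been run through OCR carries a hidden text layer, so this check would report it as searchable.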
