Why is the same SQLite query being 30 times slower when fetching only twice as many results?

Question

The execution time geometrically proportional to the number of rows in each table rather than arithmetically e.g.

3 tables with 10 rows each => 1,000 comparision

3 tables with 10, 10 and 40 rows => 4,000 comparisons

3 tables with 20 rows each => 8,000 comparisons

You could probably re-factor the query to avoid some of the joins/cursors – when do you need an answer?

Could you do something like this:

SELECT precursor_id, feature_table_id 
FROM MSMS_precursor
INNER JOIN 

    (
        SELECT mzMin, mzMax, rtMin, rtMax, spectrum_id, feature_table_id, msrun_msrun_id

        FROM spectrum
        INNER JOIN 

           (select feature_table_id, mzMin, mzMax, rtMin, rtMax, msrun_msrun_id
            from feature
            where feature.msrun_msrun_id = 'value'
           ) subquery 

        ON subquery.msrun_msrun_id = spectrum.msrun_msrun_id
        WHERE 
            spectrum.scan_start_time BETWEEN subquery.rtMin AND subquery.rtMax 
    ) subquery

    ON subquery.spectrum_id = MSMS_precursor.spectrum_spectrum_id 

WHERE 
    MSMS_precursor.ion_mz BETWEEN subquery.mzMin AND subquery.mzMax

Using a subquery enables you to reduce the number of comparisons between the tables – you can quickly filter out the unwanted features, then the un-related spectra before searching for suitable precursors.

I don’t use SQLLite – but the principle should still apply.

UPDATED : fixed bug in SQL

Notes:

You don’t have to worry about the ANDs, you’ll only get:

features where feature.msrun_msrun_id = ‘value’
spectra for those features and where spectrum.scan_start_time BETWEEN
subquery.rtMin AND subquery.rtMax
precursors for those spectrs and where MSMS_precursor.ion_mz BETWEEN subquery.mzMin AND
subquery.mzMax

UPDATE 18/May:

It’s the indexing!!! you have indexes on the search fields, but not on the fields participating in the joins – foreign key indices really boost performance:

CREATE INDEX `fk_msrun_msrun_id_feature` ON `feature` (`msrun_msrun_id` ASC); 
CREATE INDEX `fk_spectrum_spectrum_id_feature` ON `feature` (`msrun_msrun_id` ASC); 
CREATE INDEX `fk_spectrum_spectrum_id_MSMS_precursor` ON `MSMS_precursor` (`spectrum_spectrum_id` ASC);

Leave a Comment Cancel reply