Overnormalization

In the general sense, I think a database is overnormalized when you need so many JOINs to retrieve data that they cause notable performance penalties and deadlocks, even after you’ve tuned the heck out of your indexes. Obviously, for huge applications and sites like MySpace or eBay, de-normalization is a scaling … Read more

Data Standardization vs Normalization vs Robust Scaler

Am I right to say that Standardization also gets affected negatively by the extreme values? Indeed you are; the scikit-learn docs themselves clearly warn about such a case: However, when data contains outliers, StandardScaler can often be misled. In such cases, it is better to use a scaler that is robust against outliers. … Read more
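To make the outlier effect concrete, here is a minimal sketch that re-implements the two ideas by hand with Python's standard library instead of calling scikit-learn: mean/standard-deviation scaling (the idea behind StandardScaler) and median/interquartile-range scaling (the idea behind RobustScaler). The helper names `standardize` and `robust_scale` and the sample data are assumptions made for illustration, and the quantile method is a choice of this sketch, not something taken from the answer.

```python
import statistics


def standardize(values):
    """Scale with mean and (population) standard deviation, as in z-scoring."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def robust_scale(values):
    """Scale with median and interquartile range, which outliers barely move."""
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


# Hypothetical sample: five ordinary values and one extreme outlier.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]

z = standardize(data)
r = robust_scale(data)

# The outlier inflates the mean and standard deviation, so the five ordinary
# values end up squeezed into a narrow band after standardization.
print("standardized:", [round(v, 3) for v in z])
# Median and IQR ignore the outlier, so the ordinary values keep their spread.
print("robust      :", [round(v, 3) for v in r])
```

In this sample the ordinary values stay well separated only under the median/IQR scaling, which is the behaviour the quoted warning points at.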

When to use Unicode Normalization Forms NFC and NFD?

The FAQ is somewhat misleading, starting from its use of “should” followed by the inconsistent use of “requirement” about the same thing. The Unicode Standard itself (cited in the FAQ) is more accurate. Basically, you should not expect programs to treat canonically equivalent strings as different, but neither should you expect all programs to treat … Read more
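A small sketch of what "canonically equivalent" means in practice, using Python's standard `unicodedata` module. The helper name `canonically_equal` and the sample word are assumptions chosen for illustration; the point is only that a program wanting to treat equivalent strings as equal has to normalize both sides to the same form before comparing.

```python
import unicodedata


def canonically_equal(a, b):
    """Compare two strings after normalizing both to the same form (NFC here)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "caf\u00e9"     # 'é' as one precomposed code point (NFC style)
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent (NFD style)

# A plain comparison sees different code point sequences.
print(composed == decomposed)
# After normalization the two spellings compare equal.
print(canonically_equal(composed, decomposed))
```

Normalizing to NFD on both sides would work just as well for the comparison; what matters is that both strings go through the same form.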

SQL Joins vs Single Table : Performance Difference?

Keep the database normalised until you have discovered a bottleneck. Only then, after careful profiling, should you denormalise. In most instances, a good covering set of indexes and up-to-date statistics will solve most performance and blocking issues without any denormalisation. Using a single table could lead to worse performance if there are … Read more
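As a rough illustration of what a covering index buys you, the sketch below uses an in-memory SQLite database as a stand-in (the answer does not name an engine, so this is an assumption, as are the `orders` table, its columns, and the index name). It asks SQLite for its query plan for two queries against the same normalised table: one that the index can answer on its own, and one that still has to visit the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, total REAL)"
)
# Hypothetical index: it contains every column the first query below needs.
conn.execute(
    "CREATE INDEX idx_orders_customer_status ON orders (customer_id, status)"
)


def plan(connection, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


# Filter and selected column are both in the index, so no table lookup is needed.
covered = plan(conn, "SELECT status FROM orders WHERE customer_id = ?", (1,))
# 'total' is not in the index, so the table still has to be read for each match.
not_covered = plan(conn, "SELECT total FROM orders WHERE customer_id = ?", (1,))

print(covered)
print(not_covered)
```

Other engines expose the same idea through their own plan tools, so profiling there would look different in detail.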

File.listFiles() mangles unicode names with JDK 6 (Unicode Normalization issues)

Using Unicode, there is more than one valid way to represent the same letter. The characters you’re using in your Tricky Name are a “latin small letter i with circumflex” and a “latin small letter a with ring above”. You say “Note the %CC versus %C3 character representations”, but looking closer what you see are … Read more
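The %CC versus %C3 difference can be reproduced with a short sketch using Python's standard library. The two characters are the ones named above; the variable names are assumptions for illustration, and Python is used here only to show the byte-level difference, not to reproduce the JDK behaviour.

```python
import unicodedata
from urllib.parse import quote

# "latin small letter i with circumflex" and "latin small letter a with ring above"
name = "\u00ee\u00e5"

nfc = unicodedata.normalize("NFC", name)  # precomposed: one code point per letter
nfd = unicodedata.normalize("NFD", name)  # decomposed: base letter + combining mark

# Percent-encode the UTF-8 bytes of each form.
print(quote(nfc))  # %C3%AE%C3%A5
print(quote(nfd))  # i%CC%82a%CC%8A
```

Both strings render as the same two letters; only the underlying code point sequence, and therefore the encoded bytes, differ.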

Database design for apps using “hashtags”

I would advise going with a typical many-to-many relationship between messages and tags. That would mean you need 3 tables: Messages (columns Id, UserId and Content); Tags (columns Id and TagName); and TagMessageRelations (columns MessageId and TagId, to make the connections between messages and tags via foreign keys pointing to Messages.Id / Tags.Id). That way … Read more
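The three-table layout can be sketched with an in-memory SQLite database. The table and column names follow the answer; the column types, constraints, sample rows, and the helper `messages_for_tag` are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE Messages (
        Id      INTEGER PRIMARY KEY,
        UserId  INTEGER NOT NULL,
        Content TEXT NOT NULL
    );
    CREATE TABLE Tags (
        Id      INTEGER PRIMARY KEY,
        TagName TEXT NOT NULL UNIQUE
    );
    -- Junction table: one row per (message, tag) pair.
    CREATE TABLE TagMessageRelations (
        MessageId INTEGER NOT NULL REFERENCES Messages(Id),
        TagId     INTEGER NOT NULL REFERENCES Tags(Id),
        PRIMARY KEY (MessageId, TagId)
    );
    """
)

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO Messages (Id, UserId, Content) VALUES (?, ?, ?)",
    [(1, 10, "Learning #sql and #python"), (2, 11, "More #sql tips")],
)
conn.executemany(
    "INSERT INTO Tags (Id, TagName) VALUES (?, ?)",
    [(1, "sql"), (2, "python")],
)
conn.executemany(
    "INSERT INTO TagMessageRelations (MessageId, TagId) VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1)],
)


def messages_for_tag(connection, tag_name):
    """Return the content of every message carrying the given tag."""
    rows = connection.execute(
        """
        SELECT m.Content
        FROM Messages AS m
        JOIN TagMessageRelations AS r ON r.MessageId = m.Id
        JOIN Tags AS t ON t.Id = r.TagId
        WHERE t.TagName = ?
        ORDER BY m.Id
        """,
        (tag_name,),
    ).fetchall()
    return [row[0] for row in rows]


print(messages_for_tag(conn, "sql"))
```

The composite primary key on the junction table keeps the same tag from being attached to a message twice.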
