Sunday 8 October 2017

Database Design and Performance


While designing a database, the following ideas help to achieve better performance for SELECT queries.


  • Compromise with Denormalization: If a significant number of your queries require joins of more than five or six tables, you should consider denormalization.
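As a minimal sketch of this idea (using SQLite and hypothetical table names of my own; a real case would involve many more tables), a denormalized copy lets a frequent report skip the join at read time:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized design: the report needs a join (names are hypothetical).
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(customer_id),
                         amount REAL);
    INSERT INTO customers VALUES (1, 'Pune');
    INSERT INTO orders VALUES (100, 1, 250.0);

    -- Denormalized copy: city is repeated on each order row, so the
    -- frequently run report no longer joins at read time.
    CREATE TABLE orders_denorm AS
        SELECT o.order_id, o.amount, c.city
        FROM orders o JOIN customers c ON c.customer_id = o.customer_id;
""")

row = conn.execute(
    "SELECT city, amount FROM orders_denorm WHERE order_id = 100"
).fetchone()
print(row)
```

The trade-off is that the duplicated data must now be kept in sync whenever the source tables change.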
  • Use Computed Columns: Instead of computing values at read time, it is better to add a computed column whose value is calculated during DML. For example, suppose the Orders table has the columns Qty, Price, and Discount, and we want the total dollar amount for each order. First we would have to determine the dollar amount for each product:
         SELECT "Order ID", SUM("Unit Price" * Quantity * (1.0 - Discount))
         FROM Orders
         GROUP BY "Order ID"


        For a large set of orders, the query can take a long time to run. The alternative is to calculate the dollar amount of the order at the time it is placed, and then store that amount in a column within the Orders table.
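As a hypothetical sketch of this write-time approach (using SQLite; the table and column names are my own, not from the reference), the dollar amount is computed once during the INSERT and simply summed at read time:

```python
import sqlite3

# In-memory database for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_id   INTEGER,
        unit_price REAL,
        quantity   INTEGER,
        discount   REAL,
        line_total REAL   -- computed once at INSERT time (DML)
    )
""")

def insert_order_line(order_id, unit_price, quantity, discount):
    # Compute the dollar amount while writing, not while reading.
    line_total = unit_price * quantity * (1.0 - discount)
    conn.execute(
        "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
        (order_id, unit_price, quantity, discount, line_total),
    )

insert_order_line(1, 10.0, 4, 0.25)   # line_total = 30.0
insert_order_line(1, 5.0, 2, 0.0)     # line_total = 10.0

# Reads now avoid the per-row arithmetic entirely.
total = conn.execute(
    "SELECT SUM(line_total) FROM orders WHERE order_id = 1"
).fetchone()[0]
print(total)  # 40.0
```

Many engines (SQL Server included) can also declare this as a persisted computed column in the schema itself, so the value stays consistent automatically.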


  • Decide Between Variable- and Fixed-length Columns: Fixed-length columns always take the maximum space defined by the schema, even when the actual value is empty. The downside of variable-length columns is that some operations on them are not as efficient as those on fixed-length columns. For example, if a variable-length column starts small and an UPDATE causes it to grow significantly, the record might have to be relocated. Additionally, frequent updates cause data pages to become more fragmented over time. Therefore, we should use fixed-length columns when data lengths do not vary much and when frequent updates are performed.
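The space trade-off above can be sketched conceptually as follows. This is not a real storage engine, only a back-of-the-envelope model in which a CHAR(20) column pads every value to 20 bytes while a VARCHAR stores the actual bytes plus an assumed 2-byte length prefix:

```python
# Country codes: lengths barely vary, so a fixed-length column fits well.
values = ["IN", "US", "DE"]

fixed_cost = sum(20 for _ in values)             # CHAR(20): always 20 bytes
variable_cost = sum(len(v) + 2 for v in values)  # VARCHAR(20): data + prefix

print(fixed_cost, variable_cost)  # 60 12
```

For short, near-constant-length values like these, the padding overhead of CHAR is small and the update behavior is more predictable; for long, widely varying values, VARCHAR's space savings dominate.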


  • Use Smaller Key Lengths: An index is an ordered subset of the table on which it is created. It permits fast range lookup and sort order. Smaller index keys take less space and are more effective than larger keys. It is a particularly good practice to make the primary key compact because it is frequently referenced as a foreign key in other tables. If there is no natural compact primary key, you can use an identity column implemented as an integer instead.
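As a small sketch of the identity-column suggestion (SQLite syntax; table names are my own), an INTEGER PRIMARY KEY behaves like an auto-assigned identity, and child tables repeat only that compact integer rather than a wide natural key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Compact surrogate key: in SQLite, INTEGER PRIMARY KEY is an
# auto-assigned rowid, analogous to an identity column.
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,   -- small, compact key
        full_name   TEXT NOT NULL
    )
""")
# The child table stores only the integer key as its foreign key.
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
    )
""")

cur = conn.execute("INSERT INTO customers (full_name) VALUES ('Ada Lovelace')")
cid = cur.lastrowid                 # identity value assigned by the engine
conn.execute("INSERT INTO orders (customer_id) VALUES (?)", (cid,))
print(cid)  # 1
```

In SQL Server the equivalent would be an `INT IDENTITY` column; the benefit compounds because every nonclustered index also carries the clustering key.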
Reference: https://technet.microsoft.com/en-us/library/ms172432(v=sql.110).aspx

Saturday 7 October 2017

Data Modeling Best Practices Article by Dale Anderson


Data Modeling Best Practices 


 I found a very good article on data modeling best practices, written by Dale Anderson on May 5, 2017. The complete article is available at the link below:

https://www.talend.com/blog/2017/05/05/data-model-design-best-practices-part-1/#comment-1439

I have copied a few of the highlights below.

  • Adaptability – creating schemas that withstand enhancement or correction
  • Expandability – creating schemas that grow beyond expectations
  • Fundamentality – creating schemas that deliver on features and functionality
  • Portability – creating schemas that can be hosted on disparate systems
  • Exploitation – creating schemas that maximize a host technology
  • Efficient Storage – creating optimized schema disk footprint
  • High Performance – creating optimized schemas that excel
Things to avoid while designing a data model:

  • Composite Primary Keys: avoid them; they are rarely effective or appropriate, though there are some exceptions depending upon the data model
  • Bad Primary Keys: datetime and/or string columns (except a GUID or hash) are usually inappropriate
  • Bad Indexing: either too few or too many indexes
  • Column Datatypes: when you only need an Integer, don't use a Long (or Big Integer), especially on a primary key
  • Storage Allocation: inconsiderate of data size and growth potential
  • Circular References: where table A has a relationship with table B, table B has a relationship with table C, and table C has a relationship with table A; this is simply bad design (IMHO)
