Monthly Archives: April 2016

Machine Learning – A Trend or a Fad

Big Data and Machine Learning are the two buzzwords in the industry today. The pace at which companies in various sectors are adopting a data science approach to improve their top-line and bottom-line numbers has been phenomenal.

With the huge popularity that data science has achieved in the recent past, it’s high time to take a step back and understand how different organizations have fared when they employed ML or other advanced techniques to improve their business operations. In the latter half of 2015, Gartner published an article presenting its Hype Cycle for emerging technologies.

http://www.gartner.com/newsroom/id/3114217

What is interesting here is that “Machine Learning” sits at the “Peak of Inflated Expectations” on the Hype Cycle, which indicates that there have been successes (often owing to the initial publicity) accompanied by scores of failures. This may look scary at first, since Big Data and Advanced Analytics are supposed to lift the business. Ideally, we would want Machine Learning to be on the “Plateau of Productivity”. Technologies on this plateau are the ones that have been adopted by mainstream businesses and deliver a positive ROI. It’s promising to see that both “Machine Learning” and “Advanced Analytics” are expected to reach the plateau in 2-5 years’ time.

Gartner’s Hype Cycle shows that the benefits of Machine Learning and Advanced Analytics have not yet reached their true potential. It also suggests that now is the best time to be in this space. As the article indicates, companies that adopt these technologies now stand to reap the benefits within the next five years.

Neverware – Revive your old Windows PC as a Chromebook

Do you have an old Windows laptop lying around that you refuse to touch because of how slowly it runs Windows (Vista/7) today? If so, a new product called Neverware CloudReady may be able to help. CloudReady is free for home use.

Neverware’s CloudReady will help you repurpose that laptop into a Chromebook that your young child can use. Neverware publishes a list of models that are certified to run CloudReady without any issues: http://go.neverware.com/certifiedmodels

The Verge has a great article below on how Neverware is helping resuscitate these old PCs for schools and other educational institutions.
http://www.theverge.com/2016/2/17/11030406/neverware-google-chromebook-chromium-os-education-microsoft

Have you tried CloudReady already? If so, chime in with your comments and experience. I am planning on trying it out soon myself!

SSIS DATA PROFILING

SQL Server Integration Services offers a useful tool to analyze data before you bring it into your Data Warehouse. The Data Profiling Task stores its analysis in an XML file, which you can view using the Data Profile Viewer. Before we review how to use the task, let’s take a look at the eight types of profiles that it can generate.

  • Candidate Key Profile Request
    • Use this profile to identify the columns that make up a key in your data
  • Column Length Distribution Profile Request
    • This profile reports the distinct lengths of string values in selected columns and the percentage of rows in the table that each length represents. Use this profile to identify invalid data, for example a United States state code column with more than two characters.
  • Column Null Ratio Profile Request
    • As the name implies, this profile will report the percentage of null values in selected columns. Use this profile to identify unexpectedly high ratios of null values.
  • Column Pattern Profile Request
    • Reports a set of regular expressions that cover the specified percentage of values in a string column. Use this profile to identify invalid strings in your data, such as ZIP Codes or postal codes that do not fit a specific format.
  • Column Statistics Profile Request
    • Reports statistics such as minimum, maximum, average and standard deviation for numeric columns, and minimum and maximum for datetime columns. Use this profile to look for out-of-range values, such as a column of historical dates with a maximum date in the future.
  • Column Value Distribution Profile Request
    • This profile reports all the distinct values in selected columns and the percentage of rows in the table that each value represents. It can also report values that represent more than a specified percentage in the table. This profile can help you identify problems in your data such as an incorrect number of distinct values in a column. For example, it can tell you if you have more than 50 distinct values in a column that contains United States state codes.
  • Functional Dependency Profile Request
    • The Functional Dependency Profile reports the extent to which the values in one column (the dependent column) depend on the values in another column or set of columns (the determinant column). This profile can also help you identify problems in your data, such as values that are not valid. For example, you profile the dependency between a column of United States Zip Codes and a column of states in the United States. The same Zip Code should always have the same state, but the profile discovers violations of this dependency.
  • Value Inclusion Profile Request
    • The Value Inclusion Profile computes the overlap in the values between two columns or sets of columns. This profile can also determine whether a column or set of columns is appropriate to serve as a foreign key between the selected tables. This profile can also help you identify problems in your data such as values that are not valid. For example, you profile the ProductID column of a Sales table and discover that the column contains values that are not found in the ProductID column of the Products table.
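
SSIS computes these profiles itself, so no code is needed to use them. Still, it can help to see what a few of them actually measure. The following is only a minimal Python sketch of four of the profiles (null ratio, length distribution, functional dependency and value inclusion), run against made-up rows. The function names, column names and sample data are all hypothetical and are not part of SSIS; the percentages here are computed over non-null values, which may differ from how the task reports them.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows; column names are illustrative only.
customers = [
    {"id": 1, "state": "WA", "zip": "98101"},
    {"id": 2, "state": "WA", "zip": "98101"},
    {"id": 3, "state": "OR", "zip": "98101"},   # same ZIP, different state
    {"id": 4, "state": "ORE", "zip": "97201"},  # state code too long
    {"id": 5, "state": None, "zip": "97201"},   # missing state
]
orders = [{"customer_id": 1}, {"customer_id": 3}, {"customer_id": 9}]


def null_ratio(rows, column):
    """Fraction of rows whose value in `column` is None (Column Null Ratio)."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r[column] is None) / len(rows)


def length_distribution(rows, column):
    """Share of non-null values having each string length (Column Length Distribution)."""
    lengths = Counter(len(r[column]) for r in rows if r[column] is not None)
    total = sum(lengths.values())
    return {length: count / total for length, count in lengths.items()}


def dependency_violations(rows, determinant, dependent):
    """Determinant values that map to more than one dependent value (Functional Dependency)."""
    seen = defaultdict(set)
    for r in rows:
        if r[determinant] is not None and r[dependent] is not None:
            seen[r[determinant]].add(r[dependent])
    return {key: values for key, values in seen.items() if len(values) > 1}


def missing_values(child_rows, child_col, parent_rows, parent_col):
    """Child values that have no match in the parent column (Value Inclusion)."""
    parent_values = {r[parent_col] for r in parent_rows}
    return {r[child_col] for r in child_rows} - parent_values


print(null_ratio(customers, "state"))                         # share of missing states
print(length_distribution(customers, "state"))                # a length of 3 hints at a bad code
print(dependency_violations(customers, "zip", "state"))       # ZIPs seen with several states
print(missing_values(orders, "customer_id", customers, "id")) # orphaned customer ids
```

In this sample, the three-character state code, the ZIP Code that appears with two different states, and the order that points at a non-existent customer are the kinds of problems the corresponding SSIS profiles are meant to surface.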

SQL SERVER DATABASE ENCRYPTION STATUS

Securing your data is very important, and database encryption is just one part of that landscape. Taking advantage of SQL Server database encryption can be a daunting task, but once it is configured it is also important to monitor it.

This query is intended to give the database administrator information about the status of encryption on their systems. It can be very handy for audits, or to ensure that your databases are in the state you expect them to be in.

For those databases in your environment that require Transparent Data Encryption (TDE), this script will be invaluable for monitoring their encryption states.

 

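-- Report the TDE status of every database on the instance.
-- Databases without an encryption key have no row in
-- sys.dm_database_encryption_keys, so their encryption_state is NULL.
SELECT
    [db].name,
    [db].is_encrypted,
    [dek].encryption_state,
    CASE [dek].encryption_state
        WHEN 0 THEN 'Not Encrypted'  -- no database encryption key present
        WHEN 1 THEN 'Unencrypted'
        WHEN 2 THEN 'Encryption in progress'
        WHEN 3 THEN 'Encrypted'
        WHEN 4 THEN 'Key change in progress'
        WHEN 5 THEN 'Decryption in progress'
        WHEN 6 THEN 'Protection change in progress'
        ELSE 'Not Encrypted'         -- NULL: no key row for this database
    END AS [Desc]
FROM
    sys.databases AS [db]
LEFT JOIN
    sys.dm_database_encryption_keys AS [dek]
    ON [dek].database_id = [db].database_id;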
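
For audits, the output of a query like this can be checked automatically. The sketch below is a hedged illustration in Python, not part of the original script: it assumes the first three columns of the result set have already been fetched as tuples, and the `required` set of database names, the sample rows and both function names are hypothetical.

```python
# Labels for the encryption_state values reported by
# sys.dm_database_encryption_keys.
STATE_LABELS = {
    0: "Not Encrypted",
    1: "Unencrypted",
    2: "Encryption in progress",
    3: "Encrypted",
    4: "Key change in progress",
    5: "Decryption in progress",
    6: "Protection change in progress",
}
ENCRYPTED = 3


def describe_state(state):
    """Return the label for an encryption_state value; None means no key row."""
    return STATE_LABELS.get(state, "Not Encrypted")


def find_unprotected(rows, required):
    """Return (name, label) for each required database that is not fully encrypted.

    rows: iterable of (name, is_encrypted, encryption_state) tuples.
    required: set of database names that policy says must use TDE (hypothetical).
    """
    return [
        (name, describe_state(state))
        for name, _is_encrypted, state in rows
        if name in required and state != ENCRYPTED
    ]


# Made-up result set for illustration only.
sample_rows = [
    ("master", 0, None),
    ("Sales", 1, 3),
    ("HR", 1, 2),
    ("Archive", 0, None),
]
print(find_unprotected(sample_rows, required={"Sales", "HR", "Archive"}))
```

With this sample data, HR (still encrypting) and Archive (no encryption key) are reported, while Sales passes and master is ignored because it is not in the required set.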