In a nutshell, when data virtualization is enabled, an abstraction layer hides from applications most of the technical aspects of how and where data is stored. Applications do not need to know where the data is physically stored, how it should be integrated, where the data store server runs, what the required APIs are, or which language to use to access the data.
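As a rough illustration of that abstraction layer (all class, entity, and source names here are hypothetical, not from any specific DV product), it can be sketched as a facade that routes a logical query to whichever physical source holds the data:

```python
# Minimal sketch of a data virtualization facade (hypothetical names).
# Applications call query() with a logical entity name; the layer decides
# which physical source to read, so callers never see connection details,
# storage locations, or source-specific APIs.

class DataVirtualizationLayer:
    def __init__(self):
        # Map logical entity -> callable that fetches rows from the real source.
        self._sources = {}

    def register(self, entity, fetcher):
        """Register a physical source (database, API, file) behind a logical entity."""
        self._sources[entity] = fetcher

    def query(self, entity):
        """Applications only name the entity; location and access method stay hidden."""
        if entity not in self._sources:
            raise KeyError(f"No source registered for {entity!r}")
        return self._sources[entity]()


# Two very different "physical" sources behind one uniform interface:
layer = DataVirtualizationLayer()
layer.register("customers", lambda: [{"id": 1, "name": "Ada"}])          # e.g. a SQL table
layer.register("tweets", lambda: [{"user": "@acme", "text": "hello"}])   # e.g. a REST API

rows = layer.query("customers")  # the caller never knows where this came from
```

In a real deployment the fetchers would wrap JDBC/ODBC connections or web-service calls, but the application-facing interface stays the same.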
Advantages –
- Users can work with more timely data
- Less need for creating derived data stores
- Faster time to market for new reports

Disadvantages –
- Transformations are executed repeatedly, on every query
- Complex transformations can take too long
- The production system overwrites old data when new data is entered
Before implementing, identify a good test project: several to millions of rows in one data source, several to 100 columns, and a low volume of concurrent users.
In the traditional process, we use ETL to move data into an application-specific database and use that data for the application or to build reports. In some cases, by the time you have moved the data, the report requirements have changed. Here the DV layer allows an application to access shared enterprise data services without physically replicating data into its own application schema. With DV, the data stays in the source system, and any application can use it without copying it over.

If we go beyond structured and internal data, you can use DV to connect to unstructured data (Facebook, Twitter) and external data (third-party-owned data) without owning it in your own infrastructure. Having said that, DV is a complementary tool that you would want to have, not a replacement for what you already have in your technology stack.
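To make the ETL-versus-DV contrast concrete, here is a small sketch (table and column names are assumed for illustration, using Python's built-in sqlite3 for both the "source" and the application store): the ETL copy is frozen at load time, while the virtual view reads the source live, so later changes are visible without re-running a load.

```python
import sqlite3

# One in-memory "source system" (names are illustrative).
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
src.execute("INSERT INTO orders VALUES (1, 100.0)")

# --- Traditional ETL: copy rows into an application-specific store. ---
app_db = sqlite3.connect(":memory:")
app_db.execute("CREATE TABLE orders_copy (id INTEGER, amount REAL)")
app_db.executemany("INSERT INTO orders_copy VALUES (?, ?)",
                   src.execute("SELECT id, amount FROM orders").fetchall())

# --- Data virtualization style: query the source directly, no copy. ---
def virtual_orders():
    """Reads live from the source system on every call."""
    return src.execute("SELECT id, amount FROM orders").fetchall()

# The source changes after the ETL load ran:
src.execute("INSERT INTO orders VALUES (2, 250.0)")

copied = app_db.execute("SELECT COUNT(*) FROM orders_copy").fetchone()[0]
live = len(virtual_orders())
print(copied)  # 1 -> the ETL copy is stale
print(live)    # 2 -> the virtual view sees the new row
```

The trade-off in the lists above falls out of this directly: the virtual view is always current, but its query (and any transformation in it) re-executes every time it is called.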