Big data workloads, where the volume of data fluctuates widely and where security requirements are less stringent than for, say, financial transactions, are well suited to the public cloud.
Most enterprises that have embraced big data run an on-premises Hadoop cluster, a NoSQL database, or an enterprise data warehouse. Rather than operate these in-house, it often makes sense to migrate such services to public clouds.
However, it is not worthwhile to migrate everything to the cloud. Only applications that can exploit the elasticity, performance, scalability, cost-effectiveness and reliability the cloud offers are worth moving; an application that cannot leverage these advantages is better run in-house, where the enterprise retains greater control over it.
Apart from enterprise data and applications already hosted in a public cloud, the components of big data applications best suited to the public cloud are:
- Short turnaround projects, where the need for storage and bandwidth is temporary. The flexible plans offered by public cloud providers let the enterprise scale storage or bandwidth up or down as required.
- External data sources. When a large quantity of unstructured data arrives from external sources and requires pre-processing, it may make sense to direct it to the cloud rather than strain in-house storage and bandwidth resources (see the sketch after this list).
- New applications that the in-house big data platform cannot accommodate. For instance, it may make sense to hive off social media analytics to the cloud rather than invest in upgrading the existing in-house platform.
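As a concrete illustration of the external-data case above, here is a minimal sketch that lands incoming raw files in elastic cloud object storage instead of on in-house disks. It assumes AWS S3 accessed through the boto3 library; the bucket name and staging directory are hypothetical placeholders, and a real ingestion pipeline would add error handling and authentication configuration.

```python
"""Sketch: land external unstructured data in elastic cloud object
storage, so in-house storage need not be pre-provisioned for it.
Assumes AWS S3 via boto3; names below are hypothetical examples."""
import pathlib

import boto3

BUCKET = "example-bigdata-landing"              # hypothetical bucket
STAGING = pathlib.Path("/tmp/external_feeds")   # hypothetical local staging dir

s3 = boto3.client("s3")


def land_external_feed(staging_dir: pathlib.Path, bucket: str) -> None:
    """Push raw external files to object storage for cloud-side
    pre-processing; capacity in the bucket grows on demand."""
    for path in staging_dir.glob("*.json"):
        key = f"raw/{path.name}"
        s3.upload_file(str(path), bucket, key)
        path.unlink()  # free local space once the object is durably stored


if __name__ == "__main__":
    land_external_feed(STAGING, BUCKET)
```

Because object storage is billed per gigabyte stored, the same elasticity works in reverse: when a short turnaround project ends, the data can simply be deleted or archived and the cost drops accordingly.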
Optimizing big data analysis requires treading a fine line between deploying the various big data components in-house and in the public cloud. The decision on whether to migrate to the cloud must rest on suitability under the specific circumstances; the generally touted benefits of the cloud do not apply at all times and in all situations.