The speed at which IT organizations must respond to market trends and customer demands is increasing, which shortens the time available to bring products and services to market. This requires the enterprise IT landscape to be underpinned by a strong technology platform that enables true business agility. At appveen we have been advocating the Datahub as a data management approach to build a strong tech platform and leverage it to deliver digital initiatives across the organization. This blog covers the capabilities that go into building a Datahub, the kinds of enterprise-wide digital initiatives that can be handled effectively with this approach, and how it differs from data lakes and data warehouses.
Datahub vs Data Lake vs Data Warehouse?
Unlike Datahub, data lakes and data warehouses are large, general-purpose collections of data. Data lakes tend to store the entire set of enterprise data, with fundamentally different structures and semantics, without a view of future reporting requirements. This makes the data inherently difficult to govern and manage, and tends to limit data lakes to data-science use cases.
Unlike Datahub, data warehouses can’t handle unstructured data, and their consumers are limited to analytical workloads. Moreover, data warehouses are typically characterized by high latency, making them ineffective for real-time digital use cases.
Deliver enterprise-wide digital initiatives using the Datahub approach
- A lightweight, nimble, outcome-based enterprise data management system that can be delivered in a few months using Datahub, as opposed to a conventional master data management (MDM) implementation that typically takes a year or more to complete. Moreover, conventional MDM implementations tend to be very expensive, and many have failed or are struggling.
- Centralized reference data management and application data management to handle the exponential increase in the number of data assets — reference and app data. This helps streamline the flow of critical data across the enterprise.
- A 360-degree view of a customer that helps in creating a personalized customer experience. This needs seamless integration with several customer data sources and timely access to governed customer data.
- Real-time integration with partners and suppliers, with a reduced turnaround time to onboard new partners.
- Corporate-to-bank integration and payments hub that streamlines and simplifies the exchange of information (payments, etc.) between a bank and a corporate.
- A lightweight workflow automation platform enabling businesses to rapidly automate critical internal processes. For example, automating the collection of survey or IoT data and distributing it to downstream applications, processes, etc.
- APIfication of core legacy data — ERP, CBS, and other core systems’ data — to make it consumable via REST APIs.
Building Datahub capabilities:
- Works with a variety of data — It brings together and transforms all forms of data — structured or unstructured
- Consumer-centric design — The underlying architecture enables the seamless distribution of data in real-time with internal/external applications, external partners, 3rd party apps, reporting tools, etc.
- Handling data at web scale — Handles data at web scale, with built-in self-healing capabilities, without a non-linear increase in costs. Multi-instance deployment using containers helps achieve high throughput.
- API-first, microservices-based architecture — Data is accessible via APIs in a governed manner. One-click API generation with rich capabilities to sort, filter, etc.
- Intuitive user experience — Intuitive UX enabling business users to model data and APIs.
- Easy data management — Enterprise-level role-based access to manage data model definitions, standardized documentation, the metadata repository, etc.
- Effective data governance — Data governance needs to be the focal point, including the ability to collaborate using workflows to verify and publish data, enforce data standards on incoming data, apply rules defined as part of the data model, etc.
- Seamless data integration — Ability to integrate applications, processes, and partners and ingest data in any form.
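To make two of these capabilities more concrete (rules defined as part of the data model, and governed access with sort and filter), here is a minimal Python sketch. It is an illustration only, not appveen's implementation or API: `FieldRule`, `DataModel`, `ingest`, and `query` are hypothetical names invented for this example, and the in-memory list stands in for whatever storage a real Datahub would use.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class FieldRule:
    """One field of a data model, with the rules incoming data must satisfy (hypothetical)."""
    name: str
    type_: type
    required: bool = True
    check: Optional[Callable[[Any], bool]] = None  # optional extra rule


@dataclass
class DataModel:
    """A data model that validates records on ingestion and serves simple queries (hypothetical)."""
    name: str
    fields: list[FieldRule]
    records: list[dict[str, Any]] = field(default_factory=list)

    def validate(self, record: dict[str, Any]) -> list[str]:
        """Return the list of rule violations; an empty list means the record is valid."""
        errors = []
        for rule in self.fields:
            if rule.name not in record:
                if rule.required:
                    errors.append(f"{rule.name}: missing")
                continue
            value = record[rule.name]
            if not isinstance(value, rule.type_):
                errors.append(f"{rule.name}: expected {rule.type_.__name__}")
            elif rule.check is not None and not rule.check(value):
                errors.append(f"{rule.name}: failed rule")
        return errors

    def ingest(self, record: dict[str, Any]) -> list[str]:
        """Store the record only if it passes every rule; return any violations."""
        errors = self.validate(record)
        if not errors:
            self.records.append(record)
        return errors

    def query(self, filters: Optional[dict[str, Any]] = None,
              sort_by: Optional[str] = None, descending: bool = False) -> list[dict[str, Any]]:
        """Filter by exact field match, then optionally sort by a field present on every row."""
        wanted = filters or {}
        rows = [r for r in self.records
                if all(r.get(k) == v for k, v in wanted.items())]
        if sort_by is not None:
            rows = sorted(rows, key=lambda r: r[sort_by], reverse=descending)
        return rows


# Example data model: customers with a simple email rule.
customers = DataModel(
    name="customer",
    fields=[
        FieldRule("id", int),
        FieldRule("email", str, check=lambda v: "@" in v),
        FieldRule("segment", str, required=False),
    ],
)

customers.ingest({"id": 2, "email": "b@example.com", "segment": "retail"})
customers.ingest({"id": 1, "email": "a@example.com", "segment": "retail"})
rejected = customers.ingest({"id": 3, "email": "not-an-email"})  # violates the email rule

retail_by_id = customers.query(filters={"segment": "retail"}, sort_by="id")
```

In this sketch the rejected record never reaches the store, which is the point of attaching rules to the model rather than to each consuming application. In a real deployment the `query` method would sit behind a generated REST endpoint with role-based access checks in front of it.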