Data Fabric, Data Mesh, And the Cloud: Data Management Architectures for the Future
While the emerging constellation of next-generation data architectures—fabric, mesh, and cloud—is extremely appealing, it’s still full of unknowns. These approaches present opportunities for greater data democratization, but also increased complexity.
Understanding the distinctions between data fabric and data mesh is also important before moving to either architecture.
Data mesh is a highly decentralized, self-service architecture in which datasets are managed and controlled by individual business units across the enterprise. Data fabric is a more centralized architecture that uses metadata to integrate multiple, disparate data platforms and pipelines and to simplify access to these assets. “Data fabric emphasizes virtualization and centralization of data to create a unified data infrastructure by integrating different data sources located in different systems and different cloud environments,” said Anil Dangol, data manager at Launch Consulting. “Data mesh, on the other hand, emphasizes decentralization of data which advocates data as a product where each team owns the data product.”
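The distinction can be made concrete with a small sketch. The Python below is illustrative only and is not drawn from any vendor's product: the class names `DataProduct` and `FabricCatalog`, the team names, and the location URIs are all hypothetical. It models the mesh idea as datasets that each carry an owning team, and the fabric idea as a single metadata catalog that resolves a logical name to wherever the data physically lives.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProduct:
    """Mesh view: a dataset published and owned by one business domain."""
    name: str
    owner_team: str
    location: str  # hypothetical URI chosen by the owning team


@dataclass
class FabricCatalog:
    """Fabric view: one central metadata layer over many platforms."""
    entries: dict[str, str] = field(default_factory=dict)

    def register(self, logical_name: str, location: str) -> None:
        # Record where a logically named dataset physically lives.
        self.entries[logical_name] = location

    def resolve(self, logical_name: str) -> str:
        # Consumers ask by logical name and never need to know the platform.
        return self.entries[logical_name]


# Mesh: each team publishes its own product and remains accountable for it.
orders = DataProduct("orders", "sales", "warehouse://sales/orders")
shipments = DataProduct("shipments", "logistics", "lake://logistics/shipments")

# Fabric: a central catalog maps logical names across different systems.
catalog = FabricCatalog()
for product in (orders, shipments):
    catalog.register(product.name, product.location)

print(catalog.resolve("shipments"))
```

In this toy model the two ideas are not mutually exclusive: ownership stays with the domain teams, while the catalog provides the unified access layer.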
Both fabric and mesh “are reasonably early in their technology evolution,” Jim Webber, chief scientist at Neo4j, pointed out. A survey of more than 200 IT leaders by Unisphere Research, a division of Information Today, Inc., finds cautious uptake of these modern data architectures. While fewer than 1 in 5 enterprises are using some variation of data mesh and fabric architectures, many are cautiously eyeing the technologies (“The Move to Modern Data Architecture: 2022 Data Delivery and Consumption Patterns Survey,” May 2022).
Cloud-based architectures, by contrast, are relatively mature and are delivering greater capabilities. “Almost all organizations that are building a new data landscape are leveraging cloud-based technology,” said Steve Jones, vice president of insights and data at Capgemini. “Even traditional extract, transform, and load-based architectures benefit from the power of cloud to enable dynamic capacity to speed up processing without requiring large-scale continual infrastructure.”
Data “is the core of digital enablement and most organizations today are moving away from a one-size-fits-all data strategy to a more modern platform that focuses on enabling data products, focused data stores, machine learning, or AI solutions, and other data services,” said Pranabesh Sarkar, data architect at Altimetrik. “The focus is on building an integrated data ecosystem leveraging a foundation built on data lake, mesh, or data fabric.”
Moving to data fabric and mesh requires extensive rethinking of data management and analytics processes, Jones explained. Companies are “still learning how to transition from passive, post-transactional data architectures to ones that support a data-driven organization. This isn’t a question of architecture as much as it is of governance and culture. Some organizations have made the technical shift of moving toward a data-mesh infrastructure, but without the associated culture change, they’ve ended up in a similar place—just using different technologies.”
Building out a data fabric may be an easier transition than mesh. Currently, “enterprises are leaning more into fabric so that they can master their data,” said Webber. “This is a common pattern we see at Neo4j with metadata knowledge graphs. Mesh seems to be for those of a more adventurous mindset with teams offering their data into the mesh for others to consume.” The most compelling use case for mesh at this time, Webber added, is helping with “discoverability and reuse of siloed data.”
In recent years, there has been a lifting and shifting of data into the cloud, which has created its own problems.
“However, with this rapid migration there were a lot of data design patterns involved which required copying the data into multiple different places,” said Dangol. “It did not fully address data as a product creating a lot of challenges, such as data governance and security, storage, quality, and data life cycle management. Data mesh and data fabric try to address those problems in their own ways.”
The appeal of data mesh stems from “treating data as a product, which pushes data ownership responsibility to the team with the domain understanding to create, catalog, and store the data,” said Mathias Golombek, CTO of Exasol. “Doing this at the data creation phase brings more visibility to the data and makes it easier to consume—and stops any human knowledge siloes forming. This opens up data democratization. Employees can focus on experimentation, innovation, and producing more value from the data. That’s the theory, anyway.”
Data fabric and mesh, built on cloud, may open up data processes in ways not possible before. “At a technical level, next-generation data architectures support better data discovery and access, leading to data democratization, simplified and agile data flows, faster time to insight and time to value, and the ability to industrialize the application of AI,” said Naveen Kamat, executive director and CTO of data and AI services at Kyndryl. “At a business level, this can mean you are now enabling a whole new set of business outcomes from customer experience to productivity and revenue maximization.”
The rapid availability of data through next-generation data architectures means more rapid business responses. “Operational speed data will improve business outcomes where insight at the point of action drives better business performance,” said Jones. “This is the tip of the spear for next-generation data architecture, looking at where data-driven applications exist, as opposed to applications simply dumping data into data stores. Then look at who benefits, and therefore who is accountable for delivering those benefits. Above all, focus on the cultural change you want to achieve.”
As they advance at their individual paces, these next-generation architectures have become crucial components of budding data modernization efforts, which only about 20% of companies have completed, said Bret Greenstein, partner for data, analytics, and AI with PwC. “Initially, 3–5 years ago, companies were adopting cloud by lifting and shifting the legacy data systems they had into cloud as-is. This was a fast approach, but it didn’t do anything to enable new business outcomes or to simplify and speed the flow of data through the enterprise. However, in the last several years, the more strategic approach of data modernization has become the dominant pattern for reaching next-generation data architectures. This approach leverages data mesh principles, on cloud, designed in ways that maximize business value and usually create dramatic simplification.”
Overall, decentralized data creation “brings more visibility and makes data easier to digest and consume,” said Golombek. “It also helps to truly democratize the data because data consumers don’t have to worry about the data discovery and can focus on experimentation, innovation, and generation of new value from data. Because of the decentralized data operations and the provisioned data infrastructure as a service, data mesh results in greater agility and scalability, with teams focusing on relevant data products. It also supports the creation of a federated, global governance that enables interoperability and simplifies access to data.”
Moving to next-generation data architectures is a journey, not an overnight sprint. “They can be complex and challenging to design and implement, requiring specialized knowledge and expertise,” said Jerod Johnson, senior technology evangelist at CData. “This can make it difficult for organizations to fully understand and take advantage of the capabilities that these architectures offer.” In addition, “adopting a next-generation data architecture may require significant changes to existing systems and processes, which can be difficult and time-consuming. Integrating new technologies with legacy systems can be a challenging task and could require additional resources, expertise, and testing.”
There are funding issues and business commitments as well. “Modern architectures can be expensive to implement and maintain, requiring significant investments in hardware, software, and personnel,” Johnson said. “This can be a barrier for some organizations, especially smaller ones.”
Even at large enterprises, there aren’t “enough people with knowledge and experience to run these things appropriately,” said Grant Fritchey, product advocate for Redgate Software. “Not only could you end up with unsecure or non-functional data stores, you could lose data or, worse yet, run up unneeded costs.”
In addition, data mesh or data fabric implementations may see more success within larger enterprises but prove too elaborate for smaller companies or startups. “It might be suitable for big organizations where each team owns their specific domain, but not be applicable to smaller organizations which have limited IT staff managing and owning all the data for a company,” said Dangol.
Additional challenges with mesh include data consistency, data governance, data quality, complexities, and interoperability, Dangol added. With data fabric, challenges include “complex data integration for various source systems, ensuring proper data governance in centralized environments, maintaining scalable infrastructure, data quality, and cost to maintain data in a central environment.”
Data security is also an issue. “New systems can also introduce new security risks, especially when dealing with large amounts of sensitive data, such as cloud-based data architecture,” said Johnson. “Organizations will need to invest in security measures to protect against these risks, which can be costly and complex. Some organizations may have regulatory requirements that must be met, and next-generation data architectures may not be able to comply with these regulations. This can make it difficult for organizations in certain industries, such as finance or healthcare, to adopt these architectures.”
Data governance also needs to be stepped up as these next-generation architectures come into play. “Data is an incredible asset, but if it’s not managed well, it can also be an incredible liability, especially around compliance,” Webber said. “While mesh infrastructure helps with this, at the end of the day you have a people problem. People publish data from their team in good faith only to find it really shouldn’t be shared. Governance can hinder fabric as well. Mastering an enterprise’s data, even with such tools, remains a difficult task with often multiple owners who think their data is authoritative. Cutting that Gordian knot is hard because it’s a people problem that involves many people.”
There is demand on the part of businesses to “be accountable for data, with actual business KPIs linked to data effectiveness and where failure to deliver is seen as a business risk and cost,” said Jones. In addition, he added, new approaches need to overcome “a data team aligned to IT that is still wedded to traditional post-transactional, reporting-centric approaches for data. This means that the need for operational control and accuracy is replaced with a focus on data-quality pipelines and manual cleanup. The challenge in shifting from a reporting-centric, post-transactional data warehouse to a business-owned, operational speed and insight-driven data mesh is a cultural one—the technology just industrializes that transformation.”
STEPS FOR SUCCESS
As with any groundbreaking technology, the needs of the business come first. “Getting started on next generation data architectures almost always requires starting with a clear understanding of your business—strategic goals, key stakeholders, pain points in the current environment, financial constraints, and costs of the current environment,” said Greenstein. “These are the essential ingredients in defining the right target to start an architecture and roadmap for your business.”
Open communication and collaboration are paramount. “Ensure that all the relevant stakeholders are aware of the changes and that the teams that will be working with the new architecture are properly trained,” said Johnson. “This will help to minimize disruptions to your organization’s business and will help to ensure a smooth transition to the new architecture.”
The move also requires addressing a range of questions and concerns. “You need to address your culture and identify your tech challenges as decentralization versus centralization, scalability, product-oriented data mindset, and cultural shift,” said Dangol. “Data mesh or data fabric isn’t a tool or software, but a new way of thinking about managing data. In data mesh, we need to map out the departmental ownership of data to find the right balance. Once the groundwork has been laid, you need to find the right tools and ensure you have the right architecture and quality control over your data.”
Building a next-generation data architecture “requires diligence and good planning,” said Sarkar. “Many organizations are still struggling to move away from the traditional approach of central data team building and managing the entire data platform. In the new scheme of things, the core team is tasked to build and manage the core platform with reusable components and common frameworks to ingest, transform, and work with the data, which is then leveraged by other teams to build and manage their data products. Every organization’s journey will look different. However, the key tenets and principles will mostly remain the same.”
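The division of labor Sarkar describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any organization's actual platform: the helper names `drop_incomplete`, `rename_field`, and `build_pipeline`, and the sample order records, are all hypothetical. The core team supplies small reusable steps, and a domain team composes them into its own data product.

```python
from typing import Callable, Iterable

Record = dict[str, object]
Step = Callable[[Iterable[Record]], Iterable[Record]]


# --- Reusable components owned by the core platform team (hypothetical) ---

def drop_incomplete(required: list[str]) -> Step:
    """Return a step that removes records missing any required field."""
    def step(records: Iterable[Record]) -> Iterable[Record]:
        return [r for r in records
                if all(r.get(key) is not None for key in required)]
    return step


def rename_field(old: str, new: str) -> Step:
    """Return a step that renames one field in every record."""
    def step(records: Iterable[Record]) -> Iterable[Record]:
        return [{(new if k == old else k): v for k, v in r.items()}
                for r in records]
    return step


def build_pipeline(*steps: Step) -> Callable[[Iterable[Record]], list[Record]]:
    """Chain steps in order into a single callable pipeline."""
    def pipeline(records: Iterable[Record]) -> list[Record]:
        for step in steps:
            records = step(records)
        return list(records)
    return pipeline


# --- A domain team composes the shared steps into its own data product ---

sales_orders = build_pipeline(
    drop_incomplete(["order_id", "amt"]),
    rename_field("amt", "amount"),
)

raw = [
    {"order_id": 1, "amt": 9.5},
    {"order_id": None, "amt": 3.0},  # dropped: missing order_id
]
result = sales_orders(raw)
print(result)
```

The design choice worth noting is that the domain team writes no ingestion or transformation logic of its own here; it only decides which shared steps apply to its data and in what order.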
The original article can be found at: https://www.dbta.com/Editorial/Think-About-It/Data-Fabric-Data-Mesh-And-the-Cloud-Data-Management-Architectures-for-the-Future-157252.aspx