There has been a lot of talk lately about the concept of the data lake, variously known as a data refinery, data factory, etc. I find it interesting that we now hear logical architectural terms that speak to the concepts and purpose of big data technologies such as Hadoop/HDFS and Apache distributed database technologies such as HBase and Cassandra.
This may be indicative of a shift. What I am not sure of is what it means. Does it mean that this suite of open source technologies has reached a level of maturity? Could it point to the fact that these technologies have practical applications that solve enterprise-scale problems? Or does it show that enterprises have realized that they can no longer deal with just "structured data", and that the vast majority of information lies in the space of "unstructured content", leaving them no choice but to venture into the realm of big data technologies? Not really sure!
The fact remains: when the big-name software vendors start getting into the business of marketing big data technologies and start publishing white papers with cool-sounding names, then there is something going on! I look at data lake, data refinery, data factory, etc. as synonymous terms for what in the information science realm we call data aggregation. I could be totally off base here and would love to have more of a conceptual/architectural debate on this topic.
I would love to hear from others actively leveraging these technologies about how they are applying these concepts and technologies.