Monday, July 18, 2016

AT&T Cloud Business Intelligence Project


The article below touches on AT&T's cloud virtualization initiative and the exciting machine learning aspects of AT&T's services and capabilities, which will likely continue to grow.
Our team has been supporting the business intelligence side of the 2020 initiative by developing the capability to provide insight and actionable intelligence around the virtualized infrastructure to the organization.
Truly interesting and rewarding work with multiple teams and a great group of people!
John K.

Thursday, November 1, 2012

Master Data Management (MDM) Landscape

yeah, yeah, reusing the same picture...it is MDM time again!
(and I don't mean Mobile Device Management)

MDM definition 

"Master Data Management comprises a set of processes, governance, policies, standards and tools that consistently defines and manages master data which excludes transactional data but may include reference data." Wikipedia.


MDM trends for 2012-2013
  • Market Consolidation (see mega vendors below)
  • Tool Convergence (aka: multi-entity, multi-domain...yep called this 4 years ago)
  • BPM for MDM (more governance automation as expected)
  • Data Virtualization / MDM as a Service (have been writing about the MDM SOA intersection since 2007)
    • Data Quality as a Service
      • Address standardization / cleansing 
      • Cleanse/Dedup/Merge across multiple Salesforce installations in the cloud
    • Data Profiling as a Service
  • IMDBs & Big Data (see two prior posts on in memory databases)
  • Forrester trends blog here: http://blogs.forrester.com/rob_karel/12-01-05-mdm_in_2012_what_was_what_will_be_and_what_wont_be 
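The cleanse/dedup/merge flavor of Data Quality as a Service mentioned above can be sketched in a few lines of Python. This is a minimal illustration only, not any vendor's implementation; the record fields, standardization rules, and similarity threshold are all hypothetical, and fuzzy matching uses the standard library's difflib:

```python
from difflib import SequenceMatcher

# Hypothetical customer records pulled from two Salesforce orgs
records = [
    {"id": 1, "name": "Acme Corp.", "city": "Chicago"},
    {"id": 2, "name": "ACME Corporation", "city": "Chicago"},
    {"id": 3, "name": "Globex Inc.", "city": "Dallas"},
]

def standardize(rec):
    # Trivial standardization: uppercase, strip punctuation and legal suffixes
    name = rec["name"].upper().replace(".", "").replace(",", "")
    for suffix in (" CORPORATION", " CORP", " INC"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
    return {**rec, "name": name.strip()}

def dedup(records, threshold=0.85):
    # Merge records whose standardized names are similar enough and cities match
    survivors = []
    for rec in map(standardize, records):
        for s in survivors:
            ratio = SequenceMatcher(None, rec["name"], s["name"]).ratio()
            if ratio >= threshold and rec["city"] == s["city"]:
                # Survivor becomes the "golden record"; keep the merged ids
                s.setdefault("merged_ids", []).append(rec["id"])
                break
        else:
            survivors.append(rec)
    return survivors

golden = dedup(records)
print([r["name"] for r in golden])  # → ['ACME', 'GLOBEX']
```

A real MDM hub would add survivorship rules, match scoring across many attributes, and stewardship workflows on top of this basic match/merge loop.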


MDM mega vendors  (and their associated product offerings)
  • IBM 
    • InfoSphere MDM Server (formerly DWL and Trigo)
    • Initiate Master Data Service (formerly Initiate)
  • Informatica 
    • MDM (formerly Siperian)
  • Oracle
    • Customer, Product, Supplier, and Site Hubs (Oracle and Siebel)
    • Data Relationship Management (formerly Hyperion)
  • SAP
    • NetWeaver MDM

Gartner Magic Quadrant leadership is very closely correlated to the mega vendors above, and not much changed from 2011 to 2012. Only Informatica and IBM continue to show as leaders in all three areas below: 
  • 2012 MDM (customer data solutions): IBM, Oracle, Informatica
  • 2012 Data Integration: Informatica, IBM, SAP, Oracle, SAS
  • 2012 Data Quality: Informatica, SAS, Trillium, IBM, SAP
  • 2011 MDM (customer data solutions): IBM, Oracle, Informatica
  • 2011 Data Integration: Informatica, IBM, SAP, Oracle, SAS
  • 2011 Data Quality: SAS, Informatica, Trillium, IBM, SAP

According to Forrester, Informatica (Siperian) is the MDM leader, followed by IBM (Initiate Systems), and per the link below, the same two players are leading on data virtualization.




Without having a personal preference on a specific technology (I hold an IBM / Initiate Systems certification and have experience with Siperian and Oracle PIM), it seems that Informatica is moving toward taking the gold here, and (somewhat) coincidentally it will be the product of choice for my next CDI POC, which will concentrate on hierarchy management, data stewardship and data governance. Below are a few more Informatica reference links supporting the leadership claims:

Nice to see how far Informatica has come from the old ETL days over a decade ago...
John K

Wednesday, October 10, 2012

Starting with SAP HANA



SAP has made it quite easy to get started with SAP HANA. You can get a 30 day trial access to a hosted HANA test and evaluation environment here: http://scn.sap.com/docs/DOC-28191 

At the same site, there is documentation, starting guides and a plethora of tutorial videos: http://scn.sap.com/community/developer-center/hana

Below are the high-level steps required to get started with a simple example in SAP HANA. First, use a data source and Data Services to pull master and transaction data into SAP HANA. Then create a model, and finally expose the model through Explorer. Ok...seems too simple.
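For illustration only, the load → model → expose flow can be mimicked with an in-memory SQLite database in Python. The table, column, and view names here are made up; HANA itself would use Data Services for the load and its own modeling tools, this just shows the shape of the three steps:

```python
import sqlite3

# Step 1: "pull" master and transaction data into an in-memory store
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE sales (product_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO product VALUES (?, ?)",
                 [(1, "Widget"), (2, "Gadget")])
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(1, 10.0), (1, 15.0), (2, 7.5)])

# Step 2: create a "model" -- a view joining master and transaction data
conn.execute("""CREATE VIEW sales_by_product AS
                SELECT p.name, SUM(s.amount) AS total
                FROM sales s JOIN product p ON p.id = s.product_id
                GROUP BY p.name""")

# Step 3: "expose" the model -- the consumer simply queries the view
rows = conn.execute(
    "SELECT name, total FROM sales_by_product ORDER BY name").fetchall()
print(rows)  # → [('Gadget', 7.5), ('Widget', 25.0)]
```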


And here is the link for developers "The Road to HANA" with great information to get you going: http://scn.sap.com/docs/DOC-31723

Happy in-memory computing!
John K. 

Tuesday, October 9, 2012

IMDBs (In Memory Databases) gaining popularity



As hardware and memory prices continue to decline, in-memory databases (IMDBs) are gaining traction for real-time analytics, applications, and platforms. There are good reasons for the accelerated adoption of IMDBs for big data, the most obvious being performance: very fast delivery of large data sets, and on-the-fly aggregations of multi-dimensional data delivered quickly to the end user.

From an architecture perspective, looking at HANA as an example, there are a few key differences that power such enhanced performance: 
  • Memory vs disk as primary storage (much faster performance)
  • Column and row store vs tables (no indices, materialized views, or cubes - aggregations on the fly)
  • Data compression
  • Parallelism (both on hardware and processing)
These changes are redefining the roles of data modelers, database developers, and DBAs, which are far more blurred in IMDB environments than in traditional databases.
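The column-store difference in particular is easy to illustrate: storing each attribute as its own array means a query touches only the columns it needs, and aggregations can be computed on the fly with no precomputed cubes. A toy sketch in plain Python (the data and names are illustrative, not from any product):

```python
# Row store: one record per row -- a scan touches every field of every row
row_store = [
    {"region": "East", "year": 2012, "revenue": 100},
    {"region": "West", "year": 2012, "revenue": 250},
    {"region": "East", "year": 2011, "revenue": 80},
]

# Column store: one array per attribute -- a scan touches only the columns
# it needs, and long runs of repeated values compress well
col_store = {
    "region":  ["East", "West", "East"],
    "year":    [2012, 2012, 2011],
    "revenue": [100, 250, 80],
}

def revenue_by_region(cols):
    # On-the-fly aggregation: no index, materialized view, or cube required
    totals = {}
    for region, revenue in zip(cols["region"], cols["revenue"]):
        totals[region] = totals.get(region, 0) + revenue
    return totals

print(revenue_by_region(col_store))  # → {'East': 180, 'West': 250}
```

In an IMDB these column scans run over compressed, memory-resident data in parallel across cores, which is what makes skipping precomputed aggregates practical.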

Use cases for IMDBs include: real-time analytics on operational databases, data warehousing, and predictive analysis on big data, as well as real-time applications and platforms (mobile, cloud).

Drawbacks for IMDBs include: the maturity of most product offerings, cost, and the availability of skills/expertise.

Enterprise IMDB Vendors and Products


  • Microsoft (updated per announcement on Oct 24th, 2012)
  • Open source
    • Project Serengeti, an open-source effort from VMware, is coming right behind Microsoft
    • MySQL and other smaller players here
    • Feedback welcome on open-source IMDB enterprise solutions you have had success with

I expect the adoption of IMDBs, as well as product maturity, to continue to accelerate and the vendor battle to heat up as large enterprise clients climb on board. At this time everyone seems to be chasing SAP HANA.


Tuesday, January 31, 2012

Hadoop and MapReduce

Hadoop and MapReduce continue to pick up steam and generate "noise" in periodicals and online publications. As an Information Management Architect I get asked about the value of Hadoop and the impact to existing information and architecture roadmaps.

As such, here is a short entry to clarify the purpose of Hadoop, which I view as a complementary rather than a competing solution (assuming the majority of the data to be managed is structured). It addresses the need for highly distributed, parallel cloud computing over unstructured and structured data in high volumes requiring fast performance; hence the success stories and implementations at the Yahoos, Facebooks, and Googles of the world.
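Hadoop's programming model is easy to show in miniature: a mapper emits key/value pairs, the framework shuffles and groups them by key across the cluster, and a reducer folds each group into a result. Here is a single-process sketch of the canonical word-count example in Python (local sort + groupby stands in for the distributed shuffle):

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    # Emit (word, 1) for every word in the input line
    for word in line.lower().split():
        yield (word, 1)

def reducer(word, counts):
    # Fold all counts for one word into a total
    return (word, sum(counts))

def map_reduce(lines):
    # Shuffle/sort phase: group mapper output by key (Hadoop does this
    # across the cluster; here it is just a local sort + groupby)
    pairs = sorted(kv for line in lines for kv in mapper(line))
    return [reducer(word, [c for _, c in group])
            for word, group in groupby(pairs, key=itemgetter(0))]

lines = ["big data big deal", "data moves fast"]
print(map_reduce(lines))
# → [('big', 2), ('data', 2), ('deal', 1), ('fast', 1), ('moves', 1)]
```

The value of the real framework is everything around this skeleton: partitioning the input across HDFS, scheduling mappers near the data, and recovering from node failures.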

Some of the drawbacks as of the day of this post include the maturity and support of the open-source components (HDFS, HBase, MapReduce, Pig, Hive, Avro, Sqoop, Chukwa, and ZooKeeper), security, backup/recovery, integration with other systems, weak consistency, the cost of specialized resource skills, and the retraining of existing resources.

Based on the expanding market, many of the enterprise Business Intelligence vendors (Informatica, SAP, etc.) are building plug-ins to ensure support for Hadoop in order to protect and expand their client base.

Thursday, November 18, 2010

ITIL in 30 days!

The month of November turned out to be a small detour from Information Management activities and very much in the weeds of IT governance, as I led the definition of an ITIL (Information Technology Infrastructure Library) initiative for my current client. Insane to think that anyone can do justice to, and actually implement, ITIL in 30 days, right? My last run-in with ITIL at Cisco was as large an initiative as they come.

Well, truth be told, the current organization, much smaller in size, was already on board with ITIL as the preferred way to manage IT processes and practices. Furthermore, the initiative excluded the CMDB / technology implementation component. So in effect, we tackled IT governance under the ITIL framework and dug into documenting policies, processes, people (roles & responsibilities), and physical assets under management...on a shoestring. With a top-down approach, tackling breadth first and depth as applicable, bi-weekly brainstorming and artifact review sessions, and management support of the initiative, we were able to create the executive overview, framework, documents, process charts, policies, RACIs, and publish the ITIL governance solution for corporate consumption...yes, within 30 days!

More emphasis (depth) was placed on Service Support (Incident Management & Service Desk, Problem Management, Change Management, Release & Deployment Management, Asset & Configuration Management) as well as Security Management, while Service Delivery (Service Level Management, Availability Management, Capacity Management, IT Financial Management, and IT Service Continuity Management) was framed so that internal resources can continue decomposition of processes, policies and responsibilities, without the need for help from outside consultants. To be fair, there is always more work that can be done, but implementing a structure, in quick fashion, that allows a client to continue execution independently is invaluable and very cost effective.

Now, ITIL might initially seem quite far from Architecture and Information Management; heavy-governance MDM initiatives, however, overlap many of the same areas of interest that must be tackled (policy, procedures, roles, ownership, management support, executive sponsorship, metrics and reporting, knowledge management, etc...). Time well spent, which should translate to cost savings due to the reduction of issues and rework, improved efficiency of operations, and all the "-ility" benefits of Service Delivery that come with ITIL.

It seems like December might include some time off to hit the slopes or visit Chicago with the family, so next entry will most likely be next year and who knows on what….the exciting life of consulting!

Thursday, March 4, 2010

SQL Server 2008 Master Data Services

Last week I installed the SQL Server 2008 R2 Enterprise MDS customer, product, and chart of accounts sample domain models on a virtual x64 Windows Server 2008 sandbox. The same virtual machine houses Report Server 2008 with Report Builder 3.0, as well as SQL Server 2005 with Reporting Services. The install was completed in minutes with no issues. Coincidentally, Report Builder 3.0 has had a makeover and seems more modular and flexible at first pass. I will be looking into report model migration since we are currently using Report Builder 2.0 at this client.

MDS did not strike me as the most intuitive tool to work with, and I did not find as much breadth as I would like. The jury is still out, though, since this is the first release since Microsoft's purchase of the product, and I only had a limited amount of time to devote to this initial evaluation. If anyone out there has success stories to share on MDS, shoot me a note/link.

Time for zzzz's.