Machine Learning for the Enterprise.

In my last blog, “Business differentiation through Machine Learning”, I introduced and described the concepts of machine learning. We traced its origins from a computer science research project to Watson winning on the Jeopardy! TV quiz show, and looked at its real-world use across numerous industries, including health care.

We concluded that machine learning has the potential to help make the world a better and safer place.  However (yep…there’s always a ‘however’), for that to become a reality, machine learning in all its forms has to be enterprise ready.

 

Enterprise Requirements.

When I use the word “Enterprise” I mean organizations that have business-critical requirements.  It’s not just about organization size.  Volume of transactions and data, velocity of interactions, variety of data – yes, it’s those “Vs” of big data again – are all key factors that might shape an organization’s machine learning requirements.  There is also the collaboration across the data scientists, engineers and developers who create, test, train and deploy machine learning models, and the different levels at which different audiences want to be exposed to machine learning. So let’s look at some of these and other factors that make machine learning from IBM truly enterprise ready.

 

Collaboration

Large enterprises tend to have a significant data science team, often with multiple data scientists engaged on a single project. Collaboration across these data scientists, and possibly other personas, is required to maximize productivity, agility and effectiveness.  Today, as part of the IBM Data Science Experience, we bring in the concept of the “project”, wherein various personas and users can safely collaborate to build, test, use and deploy many artefacts as a group. Our machine learning technologies adopt the same concept, able to share all the analytic artefacts (notebook, pipeline, model, etc.).  For example, one person can do curation and transformation in a notebook and then hand over to another person to create algorithms and test and train the model. Other team members can then evaluate the model and deploy it.  Each individual user can be authenticated separately and authorized through the roles defined in the project, limiting or granting access to parts of the overall process accordingly.
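To make the role-based idea concrete, here is a minimal sketch of per-project authorization. This is purely illustrative – the role names, permitted actions and `Project` class are my own invention, not the actual Data Science Experience API:

```python
# Illustrative sketch of role-based access within a project.
# Role names and artifact actions are hypothetical.

ROLE_PERMISSIONS = {
    "data_engineer": {"curate", "transform"},
    "data_scientist": {"curate", "transform", "train", "evaluate"},
    "deployer": {"evaluate", "deploy"},
}

class Project:
    def __init__(self, name):
        self.name = name
        self.members = {}  # user -> role

    def add_member(self, user, role):
        if role not in ROLE_PERMISSIONS:
            raise ValueError(f"unknown role: {role}")
        self.members[user] = role

    def authorize(self, user, action):
        """True only if the user's role in this project grants the action."""
        role = self.members.get(user)
        return role is not None and action in ROLE_PERMISSIONS[role]

project = Project("churn-model")
project.add_member("alice", "data_engineer")
project.add_member("bob", "data_scientist")
project.add_member("carol", "deployer")

print(project.authorize("alice", "transform"))  # True
print(project.authorize("alice", "deploy"))     # False
print(project.authorize("carol", "deploy"))     # True
```

The point of the hand-over workflow described above is that each step is gated by a role check like this, so Alice can shape the data but cannot push a model to production.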

 

Consumption

Not everyone is a data scientist, nor wants or needs to know about model design, statistical theory or training the model.  Developers, for example, may have varying levels of need.  They may just want to take a known model that works well and deploy it in their app.  Figure #1 below shows how IBM has designed a workspace that allows application developers to not only choose and deploy a model but actually create a pipeline via a step-by-step process.   Taking that one step further, higher-level developers may want to choose from a collection of pre-packaged machine learning services such as fraud detection, weather prediction, manufacturing models, sentiment analysis and emotional analysis.  IBM provides these today through its Bluemix services, which are integrated as part of the Data Science Experience.


Figure #1  Integrated workspace – creating a pipeline.

 

Commoditizing, Automating Machine Learning

Machine learning in enterprise environments can be challenging.  It starts with the assumption that a model becomes stale the minute you stop training it. Over time the accuracy of the models can worsen, and it can take significant time to understand what is happening and why, and then to retrain existing models and deploy new versions. It comes down to revenue: some enterprises may have a hard time adopting the necessary discipline because they cannot gauge the impact on the bottom line. Many machine learning use cases are also not very intuitive, in the sense that you cannot set clear control points and flows that people can logically relate to. So the very term “Machine Learning” may send some of the less scientific people in our communities running in the opposite direction.

Often data scientists perform a number of tedious and time-consuming steps to derive insight from a raw data set. The process can involve data ingestion, cleaning and transformation (e.g. outlier removal, missing value imputation), then proceed to model building, and finally a presentation of predictions that align with the end users’ objectives and preferences.  It can be a long, complex and sometimes artful process requiring substantial time and effort, especially because of the combinatorial explosion in choices of algorithms (and platforms), their parameters and their compositions. Tools that help automate steps in this process have the potential to accelerate the time-to-delivery of useful results, expand the reach of data science to non-experts and offer a more systematic exploration of the available options.

Cognitive Automation of Data Science (CADS) integrates learning, planning, composition and orchestration techniques to automatically and efficiently build models, deploying analytic flows that interactively support would-be data scientists in their tasks.
CADS also provides the capability to run multiple predefined algorithms in parallel and identify the one best suited to a particular use case. In short, CADS selects the best algorithm for the given use case. Click here to read an IBM paper on CADS.
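The core idea – try several candidate algorithms and keep the one that scores best on held-out data – can be sketched in a few lines. This is a toy illustration in the spirit of CADS, not the actual CADS implementation; the two candidate "algorithms" here are a mean predictor and a one-feature least-squares fit:

```python
# Toy automated algorithm selection: fit each candidate on training
# data, score on a validation split, keep the lowest-error model.
import statistics

def fit_mean(xs, ys):
    m = statistics.mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    # ordinary least squares for a single feature
    mx, my = statistics.mean(xs), statistics.mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def mse(model, xs, ys):
    return statistics.mean((model(x) - y) ** 2 for x, y in zip(xs, ys))

def select_best(candidates, train, valid):
    scored = []
    for name, fit in candidates:
        model = fit(*train)
        scored.append((mse(model, *valid), name, model))
    scored.sort(key=lambda t: t[0])
    return scored[0]  # (score, name, model) with the lowest error

# y is roughly 2x + 1 with a little noise
train = ([1, 2, 3, 4], [3.1, 4.9, 7.2, 9.0])
valid = ([5, 6], [11.1, 12.8])
score, name, model = select_best(
    [("mean", fit_mean), ("linear", fit_linear)], train, valid)
print(name)  # linear
```

A real system like CADS additionally searches each algorithm's parameters and can run the candidates in parallel, but the selection principle is the same.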

 

Training, Tuning, Model Optimization

There can be times when the model or algorithm used in machine learning becomes too good to be true on the training data: the model predicts the training data very well but performs poorly on new data.  This is known as overfitting. An extreme case occurs with rote learning, where the model achieves 100% performance on data it has already seen but probably won’t do any better than a random guess on new data. Imagine what this could do to a business: it could upset customers, set the wrong price point and miss many business opportunities.  IBM machine learning helps counter this through a clean separation of training data from holdout data used to evaluate model performance, as well as careful use of cross-validation techniques.
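The rote-learning extreme is easy to demonstrate, and it also shows why a holdout set is essential: scoring only on training data would report a perfect model. A minimal sketch (the "even/odd" labelling task is invented for illustration):

```python
# A rote learner memorizes (input, label) pairs. It is perfect on data
# it has seen and falls back to a majority guess on anything new --
# the extreme form of overfitting.
from collections import Counter

class RoteLearner:
    def fit(self, xs, labels):
        self.memory = dict(zip(xs, labels))
        self.default = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, x):
        return self.memory.get(x, self.default)

def accuracy(model, xs, labels):
    hits = sum(model.predict(x) == y for x, y in zip(xs, labels))
    return hits / len(xs)

# True rule: label numbers "even" or "odd". The rote learner never
# learns the rule; it only remembers these specific numbers.
train_x = [2, 3, 4, 8, 11, 12]
train_y = ["even", "odd", "even", "even", "odd", "even"]
hold_x = [5, 6, 9, 10]
hold_y = ["odd", "even", "odd", "even"]

model = RoteLearner().fit(train_x, train_y)
print(accuracy(model, train_x, train_y))  # 1.0 -- perfect on seen data
print(accuracy(model, hold_x, hold_y))    # 0.5 -- no better than chance
```

Only the holdout score reveals the problem, which is exactly why training data must be kept cleanly separate from the data used to evaluate the model.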

 

Data Sovereignty and Isolation

Some enterprise organizations have concerns, psychological or otherwise, when it comes to putting their data and applications on hardware, storage and network infrastructures that are shared with other organizations. IBM’s Cloud First strategy provides the necessary sovereignty, multi-tenancy and isolation to help ensure that their data and applications are managed privately across IBM’s worldwide data centers.

 

Variety  – All types of Data

There used to be a time when data was simpler: structured relational or hierarchical data stored in databases.  Big data effectively means “all data”, which includes volumes of raw content, some structured in one form or another, the rest unstructured. Add the Internet of Things sending massive amounts of sensor data, and the world is not so simple.  The IBM data strategy is to Make Data Simple.  Our machine learning capabilities leverage this strategy, processing structured, semi-structured and unstructured data sets through many connectors and abstracting complexity by exploiting the Spark, R and Python runtimes for the machine learning. IBM provides 20+ different data sources (connectors) from which an organization can ingest data.
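As a small illustration of what handling varied data means in practice, here the same kind of record arrives once as structured CSV and once as semi-structured JSON, and both are normalized into one common shape before any model sees them. The feeds and field names are made up for illustration; a connector would do this at much larger scale:

```python
# Normalize structured (CSV) and semi-structured (JSON) feeds into one
# common record shape, ready for downstream machine learning.
import csv
import io
import json

csv_feed = "id,name,spend\n1,Ada,120.5\n2,Grace,87.0\n"
json_feed = '[{"id": 3, "name": "Alan", "extras": {"spend": 42.0}}]'

def from_csv(text):
    for row in csv.DictReader(io.StringIO(text)):
        yield {"id": int(row["id"]), "name": row["name"],
               "spend": float(row["spend"])}

def from_json(text):
    for rec in json.loads(text):
        # semi-structured: the spend field may be nested or missing
        yield {"id": rec["id"], "name": rec["name"],
               "spend": rec.get("extras", {}).get("spend", 0.0)}

records = list(from_csv(csv_feed)) + list(from_json(json_feed))
print(records[2])  # {'id': 3, 'name': 'Alan', 'spend': 42.0}
```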

 

Large compute power

Enterprises need high compute power, since they process ever increasing workloads of data, transactions and processes. Because our Spark service is a single multi-tenant cluster, resources are not wasted: compute power can be repurposed across tenants. Scaling out and scaling down are capabilities built into the service, and our machine learning is able to enable this scale-out/scale-down capability transparently.
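In Spark itself, this kind of elastic scale-out/scale-down is typically expressed through dynamic allocation. The property names below are standard Spark settings; the values are illustrative only, and a managed service would tune and apply them on the tenant’s behalf:

```properties
# Let Spark grow and shrink the executor pool with the workload
spark.dynamicAllocation.enabled        true
spark.dynamicAllocation.minExecutors   2
spark.dynamicAllocation.maxExecutors   50
# Required so shuffle data survives executor removal
spark.shuffle.service.enabled          true
```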

 

Information governance

IBM has a strong heritage in information governance, from managing data over its lifecycle, data cleansing and quality, and data wrangling and shaping, through to data security and privacy.  The information governance catalogue uses policies that help ensure that only the right people can see, access and execute data and services. IBM can also help provide real-time monitoring, threat detection, prevention and intervention, as well as forensics, compliance, detailed audit, obfuscation/masking of data, encryption and more.  All of these can be applied to machine learning and training data.
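One of those controls, masking sensitive fields before data reaches a training set, can be sketched simply. The field names and masking rules below are illustrative inventions, not the governance catalogue’s actual policies:

```python
# Mask sensitive fields in a record before it is used as training data.
# Which fields count as sensitive, and how they are masked, would come
# from governance policy; here both are hard-coded for illustration.

SENSITIVE = {"email", "ssn"}

def mask_value(field, value):
    if field == "email":
        user, _, domain = value.partition("@")
        return user[0] + "***@" + domain
    if field == "ssn":
        return "***-**-" + value[-4:]
    return value

def mask_record(record):
    return {k: (mask_value(k, v) if k in SENSITIVE else v)
            for k, v in record.items()}

row = {"name": "Ada", "email": "ada@example.com", "ssn": "123-45-6789"}
print(mask_record(row))
# {'name': 'Ada', 'email': 'a***@example.com', 'ssn': '***-**-6789'}
```

Applied at ingestion time, a rule like this lets models train on realistic records without ever seeing the raw sensitive values.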

 

Machine Learning is not a One-Trick Pony – Machine Learning as a Service (MLaaS)

There are many forms of machine learning, from the SystemML that IBM donated to the Apache Software Foundation to natural language processing, vision, personality and emotional insights, customer sentiment, retrieve and rank and more.  IBM makes it simple for an enterprise of any size to pick and choose from a number of predefined machine learning services from the Bluemix tiles shown below in Figure #2.   I ran my earlier blog through the “personality analyser” – I knew I was a really nice guy, but it was reassuring to hear it from a machine learning service.  Don’t believe me?  Try it here.


Figure #2 Machine Learning as a Bluemix service.

 

Using Machine Learning to Reduce Costs and Risks
There are many customers across different industries that have used our machine learning capabilities to help reduce costs, improve customer service and reduce risks.

The Vermont Electric Power Company (VELCO) worked with IBM Research to develop an integrated weather forecasting system to help deliver reliable, clean, affordable power to its consumers while integrating renewable energy into the grid.  The solution combines high-resolution weather data with multiple forecasting tools based on machine learning.  The machine learning models are trained on hindcasts of weather correlated with historical energy production and historical net demand.

The results are some of the most precise and accurate wind and solar generation forecasts in the world. This powerful tool turns multiple streams of data – transmission telemetry, distribution meter data, generation production, highly precise forecast models – into actionable information using leading-edge analytics. A collaborative achievement involving dozens of in-state and regional partners and the formidable intellectual resources of IBM Research, VWAC’s results are significant and its value already demonstrated, even as further benefits continue to emerge. To find out more, watch the video on the VELCO website. (Courtesy of the Vermont Electric Power Company website and video.)

 

Conclusion – Breadth, Depth and Enterprise Ready

While many vendors may provide one aspect of machine learning, often restricted to a particular runtime or platform, IBM provides many forms of machine learning, covering generic machine learning, natural language processing, vision, personality and emotional insights, sentiment, retrieve and rank and many others.  IBM has been exploiting many of these machine learning capabilities for years as part of its Watson Analytics portfolio, helping to take organizations on their cognitive journey. Combine this breadth and depth of capability with the enterprise readiness discussed above and with our cognitive strategy and execution, and it becomes clear why so many of the biggest and most business-critical organizations in the world choose IBM.

For more information on IBM’s cognitive strategy and machine learning capabilities, visit ibm.com/outthink.

 

Dinesh Nirmal, 

Vice President, Development, Next Generation Platforms, Big Data & Analytics 

Follow me on Twitter @DineshNirmalIBM

 

TRADEMARK DISCLAIMER: Apache, Apache Hadoop, Hadoop, Apache Spark, Spark are trademarks of The Apache Software Foundation.


