A Short PMML Tutorial by LatentView
Ramesh Hariharan
PMML Tutorial
www.LatentView.com
This presentation is solely for the use of LatentView. No part of this presentation may be circulated, quoted, or reproduced for distribution without prior written approval from LatentView.
12-Feb-2009
www.latentview.com/blog
Agenda
• PMML Overview
• Constructing a PMML
• XSD Overview
• Reading the PMML Specification
• Next Steps
LatentView Analytics Pvt. Ltd (Confidential)
PMML Overview
PMML – Predictive Model Markup Language
• Used for model scoring
• XML document
• Owned by the DMG (Data Mining Group), a consortium led by SPSS, SAS, IBM, Microsoft, Oracle and others
• Currently at version 3.2

Advantages of PMML
• Portability of models
• Metadata standardization
• Model once, score anywhere (MOSA ☺)

Drawbacks of PMML
• Least common denominator
• Potential loss of precision
• Lack of support for complex transformations
• Lack of support from tools

Some of the Model Types Supported
• Association Rules, Clustering, General Regression, Naïve Bayes, Neural Networks, Support Vector Machines

Capabilities of PMML
• Model composition – model sequencing & model selection
• Built-in and user-defined functions
• Usual data types – date, numbers, category
• Model verification – sample results for testing
• Output field – create output tables based on the models
• Extension mechanisms
PMML in the Decision Management Architecture
[Diagram: decision management architecture. Labels: Analytics Data Backbone; Enterprise Data (Product Data, Channel Data, Customer Data, Payment History Data, Interaction Data); Analytic Modeling and Model Development by LatentView Analysts; Model Repository; Decision Models; Business Rules; Business Rules formulation and Create Rules by Client Managers; Enterprise Decision Engine; Requests; Scores and Decisions; Operational Systems (Sales & Marketing, Customer Management, Risk Management, Other Applications).]
Constructing a PMML
<?xml version="1.0"?>
<PMML version="3.2"
      xmlns="http://www.dmg.org/PMML-3_2"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Header copyright="Example.com"/>
  <DataDictionary> ... </DataDictionary>
  ... a model ...
</PMML>

References:
www.dmg.org
http://dmg.org/v3-2/GeneralStructure.html
http://dmg.org/v3-2/pmml-3-2.xsd
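As a rough illustration of how a consumer might inspect this skeleton, the Python sketch below parses it with the standard library's xml.etree.ElementTree and reports the version, namespace, and top-level children. The function name describe_pmml is an assumption made for this example, and the "..." placeholders are kept as plain text exactly as on the slide.

```python
import xml.etree.ElementTree as ET

# The skeleton from the slide; the "..." placeholders are ordinary text nodes.
SKELETON = """<?xml version="1.0"?>
<PMML version="3.2"
      xmlns="http://www.dmg.org/PMML-3_2"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Header copyright="Example.com"/>
  <DataDictionary> ... </DataDictionary>
  ... a model ...
</PMML>"""


def describe_pmml(text):
    """Return the root name, namespace, version and child element names."""
    root = ET.fromstring(text)
    # ElementTree writes qualified names as "{namespace}local".
    namespace, _, local = root.tag[1:].partition("}")
    children = [child.tag.partition("}")[2] for child in root]
    return {
        "root": local,
        "namespace": namespace,
        "version": root.get("version"),
        "children": children,
    }


info = describe_pmml(SKELETON)
print(info)
```

Checking the namespace and version first is a cheap way to reject a file written for a different PMML release before attempting to score with it.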
XSD Overview
XSD – XML Schema Definition
The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD.
An XML Schema:
• defines elements that can appear in a document
• defines attributes that can appear in a document
• defines which elements are child elements
• defines the order of child elements
• defines the number of child elements
• defines whether an element is empty or can include text
• defines data types for elements and attributes
• defines default and fixed values for elements and attributes
A First Example
Look at this simple XML document called "note.xml":
<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
Now look at the XML Schema for the same document:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
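To make the link between the schema and the document concrete, here is a small Python sketch that reads the child names declared in the xs:sequence and compares them with the children of the note document. It is a simplified check and not an XSD validator: it compares local names only and ignores namespaces and data types. The helper names declared_sequence and child_names are made up for this example.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

NOTE_XML = """<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

NOTE_XSD = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def declared_sequence(schema_text, element_name):
    """List the child element names declared in the element's xs:sequence."""
    schema = ET.fromstring(schema_text)
    for element in schema.findall(XS + "element"):
        if element.get("name") == element_name:
            sequence = element.find(XS + "complexType/" + XS + "sequence")
            return [child.get("name") for child in sequence.findall(XS + "element")]
    raise KeyError(element_name)


def child_names(xml_text):
    """List the local names of the root element's children, in document order."""
    root = ET.fromstring(xml_text)
    return [child.tag.rpartition("}")[2] for child in root]


expected = declared_sequence(NOTE_XSD, "note")
actual = child_names(NOTE_XML)
print(expected == actual)
```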
Simple Elements
<xs:element name="xxx" type="yyy"/>
where xxx is the name of the element and yyy specifies the data type of the element.
XML Schema has a lot of built-in data types. The most common types are:
• xs:string
• xs:decimal
• xs:integer
• xs:boolean
• xs:date
• xs:time
Example
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>
The corresponding element definitions:
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
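The effect of the declared types can be sketched in Python by converting each element's text with a parser chosen from its declared type. The mapping below is an assumption for illustration: Python's int, Decimal and date.fromisoformat only approximate the XSD lexical rules (for instance, they do not handle time zones on xs:date), and xs:boolean and xs:time are left out.

```python
from datetime import date
from decimal import Decimal

# Approximate Python counterparts for a few built-in XSD types.
PARSERS = {
    "xs:string": str,
    "xs:decimal": Decimal,
    "xs:integer": int,
    "xs:date": date.fromisoformat,
}

# Declared types and raw text values from the example above.
DECLARED = {"lastname": "xs:string", "age": "xs:integer", "dateborn": "xs:date"}
VALUES = {"lastname": "Refsnes", "age": "36", "dateborn": "1970-03-27"}

# Convert each raw value using the parser for its declared type.
parsed = {name: PARSERS[DECLARED[name]](text) for name, text in VALUES.items()}
print(parsed)
```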
XSD Attributes
Simple elements cannot have attributes. If an element has attributes, it is considered to be of a complex type. But the attribute itself is always declared as a simple type.
<xs:attribute name="xxx" type="yyy"/>
where xxx is the name of the attribute and yyy specifies the data type of the attribute.
XML Schema has a lot of built-in data types. The most common types are:
• xs:string
• xs:decimal
• xs:integer
• xs:boolean
• xs:date
• xs:time
Example
<lastname lang="EN">Smith</lastname>
The corresponding attribute definition:
<xs:attribute name="lang" type="xs:string"/>
Simple Elements: Restrictions
Restrictions are used to define acceptable values for XML elements or attributes. Restrictions on XML elements are called facets.
Restrictions on Values
<xs:element name="age">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="120"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
Restrictions on a Set of Values
<xs:element name="car" type="carType"/>
<xs:simpleType name="carType">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Audi"/>
    <xs:enumeration value="Golf"/>
    <xs:enumeration value="BMW"/>
  </xs:restriction>
</xs:simpleType>
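The two kinds of facet can be illustrated with a short Python sketch that reads the restriction from a simple type and tests a candidate value against it. This is a minimal sketch under stated assumptions: it understands only minInclusive, maxInclusive and enumeration, treats range bounds as integers, and the function name satisfies is invented for the example. The xmlns:xs declaration is added to each fragment so that it can be parsed on its own.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

AGE_TYPE = """<xs:simpleType xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="0"/>
    <xs:maxInclusive value="120"/>
  </xs:restriction>
</xs:simpleType>"""

CAR_TYPE = """<xs:simpleType name="carType" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Audi"/>
    <xs:enumeration value="Golf"/>
    <xs:enumeration value="BMW"/>
  </xs:restriction>
</xs:simpleType>"""


def satisfies(simple_type_text, value):
    """Check a text value against the range and enumeration facets."""
    restriction = ET.fromstring(simple_type_text).find(XS + "restriction")
    allowed = [e.get("value") for e in restriction.findall(XS + "enumeration")]
    if allowed and value not in allowed:
        return False
    low = restriction.find(XS + "minInclusive")
    high = restriction.find(XS + "maxInclusive")
    if low is not None and int(value) < int(low.get("value")):
        return False
    if high is not None and int(value) > int(high.get("value")):
        return False
    return True


print(satisfies(AGE_TYPE, "36"), satisfies(AGE_TYPE, "121"))
print(satisfies(CAR_TYPE, "Audi"), satisfies(CAR_TYPE, "Volvo"))
```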
Complex Elements
<employee>
  <firstname>John</firstname>
  <lastname>Smith</lastname>
</employee>
The "employee" element can refer to a named complex type:
<xs:element name="employee" type="personinfo"/>
<xs:complexType name="personinfo">
  <xs:sequence>
    <xs:element name="firstname" type="xs:string"/>
    <xs:element name="lastname" type="xs:string"/>
  </xs:sequence>
</xs:complexType>
Alternatively, the complex type can be declared inline:
<xs:element name="employee">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
More Complex Elements
You can also base a complex element on an existing complex element and add some elements, like this:
<xs:element name="employee" type="fullpersoninfo"/>
<xs:complexType name="personinfo">
  <xs:sequence>
    <xs:element name="firstname" type="xs:string"/>
    <xs:element name="lastname" type="xs:string"/>
  </xs:sequence>
</xs:complexType>
<xs:complexType name="fullpersoninfo">
  <xs:complexContent>
    <xs:extension base="personinfo">
      <xs:sequence>
        <xs:element name="address" type="xs:string"/>
        <xs:element name="city" type="xs:string"/>
        <xs:element name="country" type="xs:string"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
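One way to see what the extension produces is to resolve the full list of child elements for fullpersoninfo, with the inherited elements first. The Python sketch below does this for the schema above, wrapped in an xs:schema root so that it parses on its own. The function name element_names is hypothetical, and the code assumes every type uses a plain xs:sequence, as in this example.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="employee" type="fullpersoninfo"/>
  <xs:complexType name="personinfo">
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="fullpersoninfo">
    <xs:complexContent>
      <xs:extension base="personinfo">
        <xs:sequence>
          <xs:element name="address" type="xs:string"/>
          <xs:element name="city" type="xs:string"/>
          <xs:element name="country" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>"""


def element_names(schema, type_name):
    """Return child element names for a type, base type elements first."""
    complex_type = next(
        t for t in schema.findall(XS + "complexType") if t.get("name") == type_name
    )
    extension = complex_type.find(XS + "complexContent/" + XS + "extension")
    if extension is None:
        inherited = []
        own = complex_type.find(XS + "sequence")
    else:
        # Elements of the base type come before the elements added here.
        inherited = element_names(schema, extension.get("base"))
        own = extension.find(XS + "sequence")
    return inherited + [e.get("name") for e in own.findall(XS + "element")]


schema = ET.fromstring(SCHEMA)
full = element_names(schema, "fullpersoninfo")
print(full)
```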
XSD Indicators
Indicators control how elements are to be used in documents. There are seven indicators:
Order indicators:
• all
• choice
• sequence
Occurrence indicators:
• maxOccurs
• minOccurs
Group indicators:
• group name
• attributeGroup name
Complex Type: Example
Let's have a look at this XML document called "shiporder.xml":
<?xml version="1.0" encoding="ISO-8859-1"?>
<shiporder orderid="889923"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:noNamespaceSchemaLocation="shiporder.xsd">
  <orderperson>John Smith</orderperson>
  <shipto>
    <name>Ola Nordmann</name>
    <address>Langgt 23</address>
    <city>4000 Stavanger</city>
    <country>Norway</country>
  </shipto>
  <item>
    <title>Empire Burlesque</title>
    <note>Special Edition</note>
    <quantity>1</quantity>
    <price>10.90</price>
  </item>
  <item>
    <title>Hide your heart</title>
    <quantity>1</quantity>
    <price>9.90</price>
  </item>
</shiporder>
Complex Type: Example Solution
The XSD for the file:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="stringtype">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:simpleType name="inttype">
    <xs:restriction base="xs:positiveInteger"/>
  </xs:simpleType>
  <xs:simpleType name="dectype">
    <xs:restriction base="xs:decimal"/>
  </xs:simpleType>
  <xs:simpleType name="orderidtype">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{6}"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="shiptotype">
    <xs:sequence>
      <xs:element name="name" type="stringtype"/>
      <xs:element name="address" type="stringtype"/>
      <xs:element name="city" type="stringtype"/>
      <xs:element name="country" type="stringtype"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="itemtype">
    <xs:sequence>
      <xs:element name="title" type="stringtype"/>
      <xs:element name="note" type="stringtype" minOccurs="0"/>
      <xs:element name="quantity" type="inttype"/>
      <xs:element name="price" type="dectype"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="shipordertype">
    <xs:sequence>
      <xs:element name="orderperson" type="stringtype"/>
      <xs:element name="shipto" type="shiptotype"/>
      <xs:element name="item" maxOccurs="unbounded" type="itemtype"/>
    </xs:sequence>
    <xs:attribute name="orderid" type="orderidtype" use="required"/>
  </xs:complexType>
  <xs:element name="shiporder" type="shipordertype"/>
</xs:schema>
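Some of the rules this schema expresses can be spot-checked by hand. The Python sketch below checks three of them on a trimmed copy of the ship order document: the six-digit orderid pattern, the occurrence rule that at least one item is present (maxOccurs="unbounded" with the default minOccurs of 1), and that note is optional while quantity is a positive integer. It is only an illustration, not a replacement for a schema validator; the function name check_shiporder is made up, and the XML declaration and xsi attributes are dropped for brevity.

```python
import re
import xml.etree.ElementTree as ET

SHIPORDER = """<shiporder orderid="889923">
  <orderperson>John Smith</orderperson>
  <shipto>
    <name>Ola Nordmann</name>
    <address>Langgt 23</address>
    <city>4000 Stavanger</city>
    <country>Norway</country>
  </shipto>
  <item>
    <title>Empire Burlesque</title>
    <note>Special Edition</note>
    <quantity>1</quantity>
    <price>10.90</price>
  </item>
  <item>
    <title>Hide your heart</title>
    <quantity>1</quantity>
    <price>9.90</price>
  </item>
</shiporder>"""


def check_shiporder(text):
    """Return a list of problems found by a few hand-written checks."""
    root = ET.fromstring(text)
    problems = []
    # XSD patterns must match the whole value, hence fullmatch.
    if not re.fullmatch(r"[0-9]{6}", root.get("orderid", "")):
        problems.append("orderid must be six digits")
    items = root.findall("item")
    if len(items) < 1:
        problems.append("at least one item is required")
    for item in items:
        if len(item.findall("note")) > 1:
            problems.append("an item may have at most one note")
        quantity = item.findtext("quantity", "")
        if not re.fullmatch(r"[0-9]+", quantity) or int(quantity) < 1:
            problems.append("quantity must be a positive integer")
    return problems


problems = check_shiporder(SHIPORDER)
print(problems)
```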
PMML: Headers
<Header copyright="Copyright (c) 2009 LatentView" description="LatentView Logit Model v1.0">
  <Extension name="timestamp" value="2009-01-19 19:38:13" extender="Rattle" />
  <Extension name="description" value="Administrator" extender="Rattle" />
  <Application name="Rattle/PMML" version="1.2.0" />
</Header>
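A scoring application might read the header to record which tool produced the model and when. The Python sketch below extracts that information from the header shown above. The fragment is parsed without a namespace, as on the slide, and the timestamp format string is an assumption based on the single value shown.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

HEADER = """<Header copyright="Copyright (c) 2009 LatentView"
        description="LatentView Logit Model v1.0">
  <Extension name="timestamp" value="2009-01-19 19:38:13" extender="Rattle"/>
  <Extension name="description" value="Administrator" extender="Rattle"/>
  <Application name="Rattle/PMML" version="1.2.0"/>
</Header>"""

header = ET.fromstring(HEADER)

# Collect the name/value pairs carried by the Extension elements.
extensions = {e.get("name"): e.get("value") for e in header.findall("Extension")}
application = header.find("Application")

summary = {
    "description": header.get("description"),
    "application": application.get("name") + " " + application.get("version"),
    "timestamp": datetime.strptime(extensions["timestamp"], "%Y-%m-%d %H:%M:%S"),
}
print(summary)
```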
PMML: Data Dictionary
<DataDictionary numberOfFields="23">
  <DataField name="ind_Sale" optype="continuous" dataType="double" />
  …
  <DataField name="STATE" optype="categorical" dataType="string" />
</DataDictionary>
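The data dictionary is effectively a lookup table from field name to optype and dataType. The Python sketch below builds that table and checks numberOfFields against the number of DataField elements. Because the slide elides most of the 23 fields, the example uses a reduced dictionary containing only the two fields shown, with numberOfFields set to 2; the function name read_fields is invented for the example.

```python
import xml.etree.ElementTree as ET

# Reduced example: only the two fields visible on the slide.
DATA_DICTIONARY = """<DataDictionary numberOfFields="2">
  <DataField name="ind_Sale" optype="continuous" dataType="double"/>
  <DataField name="STATE" optype="categorical" dataType="string"/>
</DataDictionary>"""


def read_fields(text):
    """Map each field name to its (optype, dataType) pair."""
    dictionary = ET.fromstring(text)
    fields = {
        field.get("name"): (field.get("optype"), field.get("dataType"))
        for field in dictionary.findall("DataField")
    }
    declared = dictionary.get("numberOfFields")
    if declared is not None and int(declared) != len(fields):
        raise ValueError("numberOfFields does not match the DataField count")
    return fields


fields = read_fields(DATA_DICTIONARY)
print(fields)
```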
PMML Transformations
PMML defines various kinds of simple data transformations:
• Normalization: map values to numbers; the input can be continuous or discrete.
• Discretization: map continuous values to discrete values.
• Value mapping: map discrete values to discrete values.
• Functions: derive a value by applying a function to one or more parameters.
• Aggregation: summarize or collect groups of values, e.g., compute average.
Value Mapping
<DerivedField name="ETHNICGROUPCODE_02" optype="ordinal" dataType="integer">
  <MapValues outputColumn="derived" defaultValue="0" mapMissingTo="0">
    <FieldColumnPair field="ETHNICGROUPCODE" column="original" />
    <InlineTable>
      <row><original>02</original><derived>1</derived></row>
    </InlineTable>
  </MapValues>
</DerivedField>
Built-in Function
<DerivedField name="I1EXACTAGE_dr" optype="continuous" dataType="double">
  <Apply function="sum">
    <FieldRef field="I1EXACTAGE"/>
    <FieldRef field="I1ESTIMATEDAGE"/>
  </Apply>
</DerivedField>
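What a scoring engine does with these two derived fields can be sketched in a few lines of Python. The sketch below is a simplified reading of the semantics, not a PMML engine: evaluate_map_values handles a single FieldColumnPair and returns the mapped text, and evaluate_apply supports only the "sum" function over FieldRef arguments. Both function names and the sample records are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

MAP_FIELD = """<DerivedField name="ETHNICGROUPCODE_02" optype="ordinal" dataType="integer">
  <MapValues outputColumn="derived" defaultValue="0" mapMissingTo="0">
    <FieldColumnPair field="ETHNICGROUPCODE" column="original"/>
    <InlineTable>
      <row><original>02</original><derived>1</derived></row>
    </InlineTable>
  </MapValues>
</DerivedField>"""

SUM_FIELD = """<DerivedField name="I1EXACTAGE_dr" optype="continuous" dataType="double">
  <Apply function="sum">
    <FieldRef field="I1EXACTAGE"/>
    <FieldRef field="I1ESTIMATEDAGE"/>
  </Apply>
</DerivedField>"""


def evaluate_map_values(map_values, record):
    """Look up the input value in the inline table (single pair only)."""
    pair = map_values.find("FieldColumnPair")
    value = record.get(pair.get("field"))
    if value is None:
        return map_values.get("mapMissingTo")
    for row in map_values.iter("row"):
        if row.findtext(pair.get("column")) == value:
            return row.findtext(map_values.get("outputColumn"))
    return map_values.get("defaultValue")


def evaluate_apply(apply, record):
    """Evaluate an Apply element; only the built-in 'sum' is supported."""
    if apply.get("function") != "sum":
        raise NotImplementedError(apply.get("function"))
    return sum(float(record[ref.get("field")]) for ref in apply.findall("FieldRef"))


map_values = ET.fromstring(MAP_FIELD).find("MapValues")
apply = ET.fromstring(SUM_FIELD).find("Apply")

flag = int(evaluate_map_values(map_values, {"ETHNICGROUPCODE": "02"}))
total = evaluate_apply(apply, {"I1EXACTAGE": 30, "I1ESTIMATEDAGE": 32})
print(flag, total)
```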
PMML: Mining Schema
<MiningSchema>
  <MiningField name="ind_Sale" usageType="predicted" missingValueReplacement="-1" missingValueTreatment="asValue" />
  <MiningField name="I1ESTIMATEDAGE" usageType="active" missingValueReplacement="-1" missingValueTreatment="asValue"/>
  <MiningField name="I2ESTIMATEDAGE" usageType="active" missingValueReplacement="-1" missingValueTreatment="asValue"/>
  …
  <MiningField name="I1EXACTAGE" usageType="active" missingValueReplacement="-1" missingValueTreatment="asValue"/>
</MiningSchema>
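The mining schema tells the scoring engine which fields are inputs and what to substitute when an input is missing. The Python sketch below applies missingValueReplacement to an incoming record for the active fields shown above; the elided fields are not reproduced. The function name prepare_record and the sample record are assumptions, and values are converted to float purely for this example.

```python
import xml.etree.ElementTree as ET

MINING_SCHEMA = """<MiningSchema>
  <MiningField name="ind_Sale" usageType="predicted"
               missingValueReplacement="-1" missingValueTreatment="asValue"/>
  <MiningField name="I1ESTIMATEDAGE" usageType="active"
               missingValueReplacement="-1" missingValueTreatment="asValue"/>
  <MiningField name="I2ESTIMATEDAGE" usageType="active"
               missingValueReplacement="-1" missingValueTreatment="asValue"/>
  <MiningField name="I1EXACTAGE" usageType="active"
               missingValueReplacement="-1" missingValueTreatment="asValue"/>
</MiningSchema>"""


def prepare_record(schema_text, record):
    """Keep the active fields and fill missing inputs with the replacement."""
    schema = ET.fromstring(schema_text)
    prepared = {}
    for field in schema.findall("MiningField"):
        # usageType defaults to "active" when the attribute is absent.
        if field.get("usageType", "active") != "active":
            continue
        name = field.get("name")
        value = record.get(name)
        if value is None:
            value = field.get("missingValueReplacement")
        prepared[name] = None if value is None else float(value)
    return prepared


prepared = prepare_record(MINING_SCHEMA, {"I1ESTIMATEDAGE": 41})
print(prepared)
```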
Next Steps
• Create a PMML file from your models: one each for logistic regression, clustering, and decision tree models
• Build a PMML file manually and validate it using an XML editor such as XMLFox (a syntactically valid PMML file may not be logically valid)
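The difference between syntactic and logical validity can be shown with one simple consistency check: every MiningField should name a field declared in the DataDictionary. The Python sketch below runs that check on a stripped-down, hypothetical PMML fragment that parses cleanly yet refers to an undeclared field. The fragment, its field names, and the function name undeclared_mining_fields are all made up for illustration, and the namespace is omitted to keep the lookups short.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: well-formed XML, but "AGE" is never declared.
PMML_FRAGMENT = """<PMML version="3.2">
  <DataDictionary numberOfFields="1">
    <DataField name="STATE" optype="categorical" dataType="string"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="STATE"/>
      <MiningField name="AGE"/>
    </MiningSchema>
  </RegressionModel>
</PMML>"""


def undeclared_mining_fields(pmml_text):
    """Return MiningField names that have no matching DataField."""
    root = ET.fromstring(pmml_text)
    declared = {field.get("name") for field in root.iter("DataField")}
    return sorted(
        field.get("name")
        for field in root.iter("MiningField")
        if field.get("name") not in declared
    )


missing = undeclared_mining_fields(PMML_FRAGMENT)
print(missing)
```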
Thank You!
JVL Plaza, Ground Floor, 626 Anna Salai, Teynampet, Chennai – 600 018
Phone: +91-44-4509 4039/40
80, Broad Street, 5th Floor, New York, NY 10004
Phone: +1-212-837-7874