www.ijsret.org
International Journal of Scientific Research Engineering & Technology (IJSRET), ISSN 2278 – 0882
Volume 4, Issue 9, September 2015
A comparative study on Machine Learning for Computational
Learning Theory
Madhur Aggarwal#1, Anuj Bhatia#2
#1 B.Tech (IT) from Bharati Vidyapeeth's College of Engineering; Software Developer at Plumslice Labs Pvt. Ltd.
#2 B.Tech (ECE) from Graphic Era University; ATG Developer at Accordion Systems Pvt. Ltd.
Abstract — For the past two decades, machine learning has been one of the mainstays of information technology and, with that, a rather central, albeit usually hidden, part of our lives. By generalizing from examples, machine learning algorithms can figure out how to perform important tasks. This is cost-effective and often feasible where manual programming is not. In this paper we provide a comprehensive analysis of various approaches to machine learning across different domains, with their pros and cons. A brief comparison is made between the different techniques based on certain parameters.

Keywords — machine learning, computational learning theory
I. Introduction
Machine learning systems automatically learn programs from data. This is often a very attractive alternative to constructing them manually, and in the last decade the use of machine learning has spread rapidly throughout computer science and beyond. Machine learning is used in web search, spam filters, recommender systems, ad placement, credit scoring, fraud detection, stock trading, drug design, and many other applications. A recent report from the McKinsey Global Institute asserts that machine learning (a.k.a. data mining or predictive analytics) will be the driver of the next big wave of innovation [1]. Several fine textbooks are available to interested practitioners and researchers, e.g. [2] [3]. However, much of the "folk knowledge" needed to develop machine learning applications successfully is not readily available in them. As a result, many machine learning projects take much longer than necessary or end up producing less-than-ideal results, even though much of this folk knowledge is fairly easy to communicate. Machine learning has become a mature scientific discipline, yet effective communication of its ideas remains an art.
The idea of a formal study of machine learning is by no means new to computer science. For example, research in the fields known as inductive inference and applied pattern recognition has typically addressed the problem of inferring a good rule from given data; surveys and highlights of these rich and diverse fields are given in [4] [5] [6] [7], and a number of ideas from these older areas have proved relevant to the present study. What is new is that the demand for computational efficiency is now a definite and central concern. Inductive inference models usually seek learning algorithms that perform exact identification in the limit, and the classes of functions considered there are typically so large that stronger complexity results do not appear attainable; where complexity results do arise in the pattern recognition literature, computational efficiency is generally a secondary concern.
Research in computational learning theory clearly has some connection with the empirical machine learning research conducted in computer science. As might be expected, this connection varies in strength and relevance from problem to problem. Ideally the two fields would complement one another in a meaningful way, with experimental research suggesting new theorems to be proved, and vice versa. Many of the problems tackled by artificial intelligence, however, appear so complex, and are so poorly understood in their biological incarnation, that they are presently beyond mathematical formalization.
II. Phases of Machine Learning
Representation: A classifier must be represented in some formal language that the computer can handle [8]. Conversely, selecting a representation for a learner is equivalent to choosing the set of classifiers that it can possibly learn. This set is called the hypothesis space of the learner: if a classifier is not in the hypothesis space, it cannot be learned. A related question, which we address in a later section, is how to represent the input, i.e., which features to use.
Evaluation: An evaluation function (also known as an objective function or scoring function) is needed to distinguish good classifiers from bad ones. The evaluation function used internally by the algorithm
may differ from the external one that we want the classifier to optimize, both for ease of optimization (see below) and because of the issues discussed in the next section.
Optimization: Finally, we need a method to search among the classifiers in the language for the highest-scoring one. The choice of optimization technique [8] is key to the efficiency of the learner, and also helps determine the classifier produced if the evaluation function has more than one optimum. It is common for new learners to start out with off-the-shelf optimizers, which are later replaced by custom-designed ones.
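To make the three phases concrete, the following is a minimal sketch using the perceptron (one of the algorithms listed in Section IV) as the running example; the function names and toy data are ours, not the paper's. The representation is the space of linear separators, the evaluation function is training accuracy, and the optimizer is the classic mistake-driven update rule.

```python
import random

def predict(w, b, x):
    # Representation: the hypothesis space is linear separators (w, b).
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

def accuracy(w, b, data):
    # Evaluation: the fraction of examples classified correctly.
    return sum(predict(w, b, x) == y for x, y in data) / len(data)

def perceptron_train(data, epochs=100):
    # Optimization: mistake-driven search through the hypothesis space.
    data = list(data)
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        random.shuffle(data)
        for x, y in data:
            if predict(w, b, x) != y:  # a mistake triggers an update
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b

# Toy linearly separable data: label is the sign of x1 + x2 - 1.
train = [((0.0, 0.0), -1), ((2.0, 1.0), 1), ((1.5, 1.5), 1), ((0.2, 0.3), -1)]
w, b = perceptron_train(train)
print(accuracy(w, b, train))  # 1.0 once a separator is found
```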
III. Issues and Challenges
Statistical Predicate Invention: Predicate invention in ILP and hidden variable discovery in statistical learning are really two faces of the same problem. Researchers in both communities generally agree that this is a key (if not the key) open problem for machine learning. Without predicate invention, learning will always remain shallow: every word in the dictionary is an invented predicate, with many layers of invention between it and the sensory percepts on which it is ultimately based. Unfortunately, progress to date has been limited. The consensus seems to be that the problem is simply too hard, and it is unclear what to do about it.
Generalizing across Domains: Machine learning has traditionally been defined as generalizing across tasks from the same domain, and in the past few decades we have learned to do this quite successfully. However, the glaring difference between machine learners and people is that people generalize across domains with ease. For instance, Wall Street hires many physicists who at first know nothing about finance; they do know a lot about physics and the mathematics it requires, and somehow this transfers quite well to pricing options and forecasting the stock market. Machine learners can do nothing of the sort: if the predicates describing two domains are different, there is simply nothing a learner can do in the new domain with what it learned in the old one.
Learning Many Levels of Structure: So far in statistical relational learning (SRL) we have developed sophisticated algorithms for learning from structured inputs and structured outputs, but not for learning structured internal representations. Models in both ILP and statistical learning generally have only two levels of structure. For example, in support vector machines the two levels are the kernel and the linear combination, and in ILP they are the clauses and their conjunction. While two levels are in principle sufficient to represent any function of interest, they are an extremely inefficient way to represent most functions; by having many levels and reusing structure, we can often obtain representations that are exponentially more compact.
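As an illustration of the two-level structure just mentioned, a kernel machine's decision function first evaluates a fixed kernel against each support vector (level one) and then takes a linear combination of the results (level two). This is a sketch only: the weights alphas and the bias are assumed to come from some SVM trainer, which is not shown, and all identifiers are ours.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    # Level one: a fixed nonlinear kernel applied to each support vector.
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def svm_decision(x, support_vectors, labels, alphas, bias):
    # Level two: a linear combination of the kernel evaluations.
    score = sum(a * y * rbf_kernel(sv, x)
                for sv, y, a in zip(support_vectors, labels, alphas))
    return 1 if score + bias > 0 else -1
```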
Deep Combination of Learning and Inference: Inference is central to structured learning, but research on the two has so far been largely separate. This has led to a paradoxical state of affairs in which we spend a great deal of data and CPU time learning powerful models, but then have to run approximate inference over them, losing some (possibly much) of that power. Inference must be efficient, so efficient inference should itself be the bias: we should design our learners from scratch to learn the most powerful models they can, subject to the constraint that inference over them is always efficient (ideally real time).
Learning to Map between Representations: An application area where structure learning can have a great deal of impact is representation mapping. Three major problems in this area are entity resolution (matching objects), schema matching (matching predicates), and ontology alignment (matching concepts). We have algorithms for solving each of these problems individually, assuming the others have already been solved. In most real applications, however, they all arise at the same time, and none of the "one piece" algorithms work. This is a problem of great practical significance, because integration is where organizations spend most of their information technology budget, and without solving it the "automated Web" (Web services, the Semantic Web, etc.) will never really take off.
Learning in the Large: Structured learning is most likely to pay off in large domains, because in small ones it is usually not too difficult to hand-engineer a "good enough" set of propositional features. So far, for the most part, we have worked on micro-problems (e.g., identifying promoter regions in DNA); our emphasis should shift increasingly to macro-problems (e.g., modeling an entire metabolic network in a cell). We need to learn "in the large," and this does not simply mean large datasets. It has several dimensions: learning in rich domains with many interrelated theories; learning with a lot of knowledge, a lot of data, or both; taking large systems and replacing the traditional pipeline architecture with joint inference and learning; learning models with millions of factors rather than a handful; continuous, open-ended learning; etc.
Structured Prediction with Intractable Inference: Max-margin training of structured models like HMMs and PCFGs has become fashionable in recent years. One of its attractive features is that when inference is tractable, learning is also tractable; this contrasts with maximum-likelihood and Bayesian methods, which can remain intractable. Yet most interesting AI problems involve intractable inference. How do we optimize margins when inference is approximate? How does approximate inference interact with the optimizer? Can we adapt current optimization algorithms to make them robust to inference errors, or do we need to develop new ones? We need to answer these questions if max-margin methods are to break out of the narrow range of structures they can currently handle efficiently.
Reinforcement Learning with Structured Time: The Markov assumption is good for controlling the complexity of sequential decision problems, but it is also a straitjacket. In the real world, systems have memory, some interactions are fast and some are slow, and long uneventful periods alternate with bursts of activity. We need to learn at multiple time scales simultaneously, and with a rich structure of actions and durations. This is more complex, but it may also make reinforcement learning much more efficient: at coarse scales rewards are almost immediate and RL is easy, while at fine scales rewards are distant; by propagating rewards across scales, we may be able to greatly speed up learning.
Expanding SRL to Statistical Relational AI: We must reach out to other subfields of AI, because they have the same problems we do: they have logical and statistical approaches, each solves only part of the problem, and what is really needed is a combination of the two. We want to apply learning to larger and larger pieces of a complete AI system. For instance, natural language processing involves a large number of subtasks (parsing, reference resolution, word sense disambiguation, semantic role labelling, etc.). So far, learning has been applied mostly to each one in isolation, ignoring their interactions. We need to drive towards a solution to the whole problem.
Learning to Debug Programs: Machine learning is making inroads into other fields of computer science: systems, networking, software engineering, databases, architecture, graphics, HCI, etc. This is a great opportunity to have impact, and a good source of rich problems to drive the field. One area that looks ripe for progress is automated debugging. Debugging is extremely time-consuming, and it was one of the early applications of ILP. In those early days, however, there was no data for learning to debug, and learners could not get very far. Today we have the Web and large repositories of source code. Even better, we can leverage mass collaboration: every time a programmer fixes a bug, we potentially gain a piece of training data. If programmers let us automatically record their fixes, debugging traces, compiler messages, etc., and contribute them to a central repository, we will soon have a large corpus of bugs and bug fixes.
IV. Methods
A. Basic Algorithms
Several basic algorithms can be used to solve a binary classification problem, namely:
Naive Bayes,
Nearest Neighbors,
the Perceptron,
K-means.
A minimal sketch of one of these follows.
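As a concrete example of one of the baselines listed above, here is a minimal sketch of a 1-nearest-neighbor binary classifier; the function names and toy data are illustrative, not from the paper.

```python
def euclidean(a, b):
    # Straight-line distance between two feature vectors.
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

def nn_classify(train, x):
    # Predict the label of the single closest training example.
    return min(train, key=lambda ex: euclidean(ex[0], x))[1]

train = [((0.0, 0.0), 0), ((1.0, 1.0), 1), ((0.9, 1.2), 1)]
print(nn_classify(train, (0.8, 0.9)))  # -> 1
```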
B. Algorithms in Computational Learning
Pitt et al. [9] observe that the representation classes k-term-DNF and k-clause-CNF are properly contained in the classes k-CNF and k-DNF respectively; consequently, the class k-term-DNF is polynomially learnable by k-CNF, and the class k-clause-CNF is polynomially learnable by k-DNF. The same authors [9] proved that for any fixed k >= 2, learning k-term-DNF by k-term-DNF and learning k-clause-CNF by k-clause-CNF are NP-hard problems.
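For intuition, here is a minimal sketch (our identifiers, not the notation of [9]) of the classic elimination idea behind such positive results: to learn k-CNF from positive examples, start with the conjunction of all clauses of at most k literals, whose number is polynomial in n for fixed k, and delete every clause that some positive example falsifies.

```python
from itertools import combinations, product

def all_clauses(n, k):
    # Every clause of at most k literals over n Boolean variables.
    # A literal is (index, sign): sign 1 means x_i, sign 0 means not x_i.
    clauses = []
    for size in range(1, k + 1):
        for idxs in combinations(range(n), size):
            for signs in product((0, 1), repeat=size):
                clauses.append(tuple(zip(idxs, signs)))
    return clauses

def satisfies(clause, x):
    # A clause (a disjunction) is satisfied if any literal agrees with x.
    return any(x[i] == sign for i, sign in clause)

def learn_kcnf(positives, n, k):
    # Elimination: keep only clauses no positive example falsifies;
    # the hypothesis is the conjunction of the surviving clauses.
    return [c for c in all_clauses(n, k)
            if all(satisfies(c, x) for x in positives)]

# Target concept (x0 or x1) over 3 variables; all examples are positive.
positives = [(1, 0, 0), (0, 1, 1), (1, 1, 0)]
hypothesis = learn_kcnf(positives, n=3, k=2)
```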
The results of [9] are important in that they show the tremendous computational advantage that can be gained by a judicious change of hypothesis representation. This can be seen as a limited but compelling validation of the rule of thumb in AI that representation matters: by moving to a more powerful hypothesis class H, rather than insisting on the more natural choice H = C, we move from an NP-hard problem to a polynomial-time solution.
Further positive results for polynomial-time learning include the algorithm of Haussler [10] for learning the class of internal disjunctive Boolean formulae. His algorithm is notable for the fact that its time complexity depends linearly on the size of the target formula but only logarithmically on the total number of variables n; thus, even if there are many irrelevant attributes, the time required remains modest. This shows that the distribution-free model needs no explicit focusing mechanism for identifying the variables relevant to a learning algorithm; rather, this task can be folded into the algorithms themselves. Similar results were given for linearly separable classes by Littlestone [11], and more recently a model of learning in the presence of infinitely many irrelevant attributes was proposed by Blum [12].
Rivest [13] considered k-decision lists and gave a polynomial-time algorithm for learning k-DL by k-DL for any constant k; he also proved that k-DL properly contains both k-CNF and k-DNF. Ehrenfeucht and Haussler [14] studied decision trees and defined a measure of how balanced a decision tree is, called its rank. For decision trees of a fixed rank r, they give a polynomial-time learning algorithm that always outputs a rank-r decision tree.
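The following sketch shows Rivest's greedy method [13] specialized to k = 1, where every test is a single literal; the identifiers are ours. Repeatedly find a literal whose covered remaining examples all share one label, emit that rule, discard the covered examples, and continue until none remain.

```python
def covers(literal, x):
    i, sign = literal
    return x[i] == sign

def learn_1dl(examples, n):
    # Greedily emit (literal, label) rules: pick any literal whose
    # covered examples all share one label, then discard them.
    literals = [(i, s) for i in range(n) for s in (0, 1)]
    rules, remaining = [], list(examples)
    while remaining:
        for lit in literals:
            covered = [y for x, y in remaining if covers(lit, x)]
            if covered and len(set(covered)) == 1:
                rules.append((lit, covered[0]))
                remaining = [(x, y) for x, y in remaining
                             if not covers(lit, x)]
                break
        else:
            return None  # no consistent 1-decision list exists
    return rules

def predict_dl(rules, x, default=0):
    # Evaluate the list top-down; the first matching rule decides.
    for lit, label in rules:
        if covers(lit, x):
            return label
    return default
```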
Abe [15] gave a polynomial-time algorithm for learning a class of formal languages called semilinear sets. Helmbold et al. [16] give methods for learning nested differences of classes already known to be polynomially learnable; these include classes such as the class of all subsets of Z^k closed under addition and subtraction, and the class of nested differences of rectangles in the plane.
There are many efficient algorithms that learn representation classes defined over Euclidean domains. Most of these are based on the seminal work of Blumer et al. [17] on learning and the Vapnik-Chervonenkis dimension, which will be discussed in greater detail later. These algorithms demonstrate the polynomial learnability of, among others, the class of all rectangles in n-dimensional space and the intersection of n half-planes in two-dimensional space.
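For the rectangle class, the standard learner from this line of work is simple enough to sketch directly (the names below are ours): output the tightest axis-aligned box enclosing the positive examples; with enough samples, it errs only on a region of small probability.

```python
def tightest_box(positives):
    # Hypothesis: the smallest axis-aligned box containing every
    # positive example seen so far.
    dims = range(len(positives[0]))
    lo = [min(p[i] for p in positives) for i in dims]
    hi = [max(p[i] for p in positives) for i in dims]
    return lo, hi

def in_box(box, x):
    lo, hi = box
    return all(l <= xi <= h for l, xi, h in zip(lo, x, hi))

box = tightest_box([(1.0, 2.0), (3.0, 1.0), (2.0, 4.0)])
print(in_box(box, (2.0, 2.0)))  # -> True
```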
Gold [18] gave the first representation-based hardness results that apply to the distribution-free model of learning. He proved that the problem of finding the smallest deterministic finite automaton consistent with a given sample is NP-complete; the results of Haussler et al. [19] can easily be applied to Gold's result to show that learning deterministic finite automata of size n by deterministic finite automata of size n cannot be achieved in polynomial time unless RP = NP. (There are some technical difficulties involved in properly formalizing the problem of learning finite automata in the distribution-free model.) Gold's results were improved by Li et al. [20], who showed that finding an automaton only 9/8 larger than the smallest consistent automaton is still NP-complete.
Pitt et al. [21] dramatically improved the results of Gold by proving that deterministic finite automata of size n cannot be learned in polynomial time by deterministic finite automata of size n^k for any fixed value k > 0 unless RP = NP. Their results leave open the possibility of an efficient learning algorithm using still larger deterministic finite automata, or an algorithm using some completely different representation of the sets recognized by automata.
V. Conclusion
We have discussed algorithms and approaches for machine learning across different domains. Each has its strengths and weaknesses, but the aim of this work is to make machine learning less complex from the standpoint of computational learning theory, and to provide accurate learning at different points in time and under different conditions.
References
[1] J. Manyika, M. Chui, B. Brown, J. Bughin, R. Dobbs, C. Roxburgh, and A. Byers, "Big data: The next frontier for innovation, competition, and productivity," Technical report, McKinsey Global Institute, 2011.
[2] T. M. Mitchell, "Machine Learning," McGraw-Hill, New York, NY, 1997.
[3] I. Witten, E. Frank, and M. Hall, "Data Mining: Practical Machine Learning Tools and Techniques," Morgan Kaufmann, San Mateo, CA, 3rd edition, 2011.
[4] D. Angluin and C. Smith, "Inductive inference: theory and methods," ACM Computing Surveys, 15, 1983, pp. 237-269.
[5] R. Duda and P. Hart, "Pattern Classification and Scene Analysis," John Wiley and Sons, 1973.
[6] L. Devroye, "Automatic pattern recognition: a study of the probability of error," IEEE Transactions on Pattern Analysis and Machine Intelligence, 1988, pp. 530-543.
[7] V. N. Vapnik, "Estimation of Dependences Based on Empirical Data," Springer-Verlag, 1982.
[8] P. Domingos, "A Few Useful Things to Know about Machine Learning," University of Washington, Seattle, WA 98195-2350.
[9] L. Pitt and L. G. Valiant, "Computational limitations on learning from examples," Journal of the ACM, 35(4), 1988, pp. 965-984.
[10] D. Haussler, "Generalizing the PAC model: sample size bounds from metric dimension-based uniform convergence results."
[11] N. Littlestone, "Learning quickly when irrelevant attributes abound: a new linear-threshold algorithm," IEEE, 1988, pp. 120-129.
[12] M. Li and P. Vitanyi, "A theory of learning simple concepts under simple distributions and average case complexity for the universal distribution," IEEE, 1989, pp. 34-39.
[13] R. Rivest, "Learning decision lists," Machine Learning, 2(3), 1987, pp. 229-246.
[14] A. Ehrenfeucht and D. Haussler, "Learning decision trees from random examples," Workshop on Computational Learning Theory, Morgan Kaufmann, 1990, pp. 182-194.
[15] N. Abe, "Polynomial learnability of semilinear sets," Proceedings of the 1991 Workshop on Computational Learning Theory, 1991, pp. 25-40.
[16] D. Helmbold, R. Sloan, and M. Warmuth, "Learning nested differences of intersection-closed concept classes," Workshop on Computational Learning Theory, 1988.
[17] A. Blumer, A. Ehrenfeucht, D. Haussler, and M. Warmuth, "Occam's razor," Information Processing Letters, 24, 1987, pp. 377-380.
[18] E. M. Gold, "Complexity of automaton identification from given data," Information and Control, 37, 1978, pp. 302-320.
[19] S. Judd, "Learning in neural networks," Proceedings of the 1988 Workshop on Computational Learning Theory, 1988, pp. 2-8.
[20] M. Li and U. Vazirani, "On the learnability of finite automata," Proceedings of the 1988 Workshop on Computational Learning Theory, 1988, pp. 359-370.
[21] A. Blum, "An approximation algorithm for 3-coloring," Proceedings of the 21st ACM Symposium on the Theory of Computing, 1990, pp. 535-542.