Panel discussion by: Wayne Grixti Sergio Pisani Mark Vella · 2020-02-11 · Artificial...


#AImt

Contents

• A data-centric paradigm

• Concerns and challenges

• Governance

A data-centric paradigm

Artificial Intelligence (AI)

• Loosely defined:
– Algorithmic solutions to complex problems typically solved by humans
– Characteristics: learning, autonomy, adapting to their environment

• Machine Learning
– Main contributor to AI’s recent achievements
– Has been around for quite a while in the form of statistical analysis and pattern recognition
– Advances in technology served as a catalyst

• The role of the Data Scientist took center stage

Data Science for Cyber Security

[Diagram: Dataset / Data stream … → ML → Classification/Regression program]

• Example questions: Spam or ham? Ransomware, info-stealer, adware or benign? Predicted CPU util. %: anomalous?
• Workflow: Understand/Visualize/Pre-process; Select/configure/evaluate; Compute/Search an optimized…
• Models: Un/Semi/Supervised; Math/Instance-based/Deep models
• Example datasets: Spam Dataset; Network Traffic Analysis Dataset; Malware Dataset

Feature Engineering i

• Dataset -> Feature Vectors

• Spam: Bag of words
<'academic', 'academy', 'acatihdlihdpbgwgcmvmdw5kihlvdxigywnjb3vudc4gd2l0a..
<0, 16, 32, 8, 0, 0, 0, …..>

• Network traffic analysis: Connection statistics
@attribute 'duration' real
@attribute 'protocol_type' {'tcp','udp', 'icmp'}
@attribute 'service' {'aol', 'auth', 'bgp', 'courier', ..
@attribute 'flag' { 'OTH', 'REJ', 'RSTO', 'RSTOS0', 'RSTR', 'S0', 'S1', 'S2', 'S3', 'SF', 'SH' }
@attribute 'src_bytes' real
@attribute 'dst_bytes' real
<0,tcp,ftp_data,SF,491,0,0,.... >
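The two encodings above can be sketched in a few lines of Python. This is only an illustration, not the pipeline used for the slides: the vocabulary, the sample message, the record and the helper names `bag_of_words` and `one_hot` are all made up for the example, and one-hot encoding is just one common way to turn a categorical attribute such as 'protocol_type' into numbers.

```python
from collections import Counter
import re


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return [counts[word] for word in vocabulary]


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over the known categories."""
    return [1 if value == category else 0 for category in categories]


# Hypothetical vocabulary and message (not taken from the spam dataset).
vocabulary = ["academic", "academy", "account", "free"]
message = "Free academy! Verify your account, your ACCOUNT is free."
spam_vector = bag_of_words(message, vocabulary)

# Hypothetical connection record using a few of the attributes listed above.
protocols = ["tcp", "udp", "icmp"]
record = {"duration": 0, "protocol_type": "tcp", "src_bytes": 491, "dst_bytes": 0}
traffic_vector = (
    [record["duration"]]
    + one_hot(record["protocol_type"], protocols)
    + [record["src_bytes"], record["dst_bytes"]]
)

print(spam_vector)     # [0, 1, 2, 2]
print(traffic_vector)  # [0, 1, 0, 0, 491, 0]
```

Each message or connection ends up as a fixed-length list of numbers, which is the form most learning algorithms expect.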

Feature Engineering ii

• Malware: Instruction counts
<and,lea,xor,sub,jmp,mov,pop,test,add,call,ret,jne,push,je,inc,shl,or,cmp …. >
<195,348,354,177,280,2451,600,309,392,529,22,248,1620,388,214,80,98,514 … >

• Taking into account
– Biased datasets
– Missing data
– Inaccurate labels
– Insufficient, non-representative data
– Concept drift
– Attributes with a different scale
– Categorical or sparse values
– Etc ...
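As a rough sketch of the instruction-count features and of one item on the list above (attributes with a different scale), the snippet below counts opcodes in two toy disassembly listings and then min-max scales each column to [0, 1]. The listings, the short opcode list and the function names are assumptions made for illustration; a real pipeline would rely on a disassembler and a much larger opcode set.

```python
from collections import Counter

# Hypothetical, shortened opcode list; the slide's list is much longer.
OPCODES = ["mov", "push", "call", "xor", "ret"]


def instruction_counts(disassembly, opcodes=OPCODES):
    """Count the opcode (first token) of every non-empty disassembly line."""
    counts = Counter(line.split()[0].lower() for line in disassembly if line.strip())
    return [counts[op] for op in opcodes]


def min_max_scale(rows):
    """Rescale each column to [0, 1]; constant columns become 0.0."""
    scaled_columns = []
    for column in zip(*rows):
        low, high = min(column), max(column)
        span = high - low
        scaled_columns.append([(v - low) / span if span else 0.0 for v in column])
    return [list(row) for row in zip(*scaled_columns)]


# Two made-up samples, written as plain text listings.
sample_a = ["push ebp", "mov ebp, esp", "xor eax, eax", "call 0x401000", "ret"]
sample_b = ["mov eax, 1", "mov ebx, 2", "mov ecx, 3", "push eax", "push ebx", "ret"]

vectors = [instruction_counts(sample_a), instruction_counts(sample_b)]
scaled = min_max_scale(vectors)

print(vectors)  # [[1, 1, 1, 1, 1], [3, 2, 0, 0, 1]]
print(scaled)   # [[0.0, 0.0, 1.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0, 0.0]]
```

Without scaling, a frequent instruction such as mov would dominate any distance-based model simply because its counts are larger.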

Models as decision boundaries

Learning to predict ...

… using an optimization procedure
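To make "learning to predict using an optimization procedure" concrete, here is a minimal sketch that assumes logistic regression trained with batch gradient descent; the slides do not say which model or optimizer was shown. The two-feature toy data, the learning rate and the epoch count are invented for the example. The learned weights define a straight-line decision boundary between the two classes.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights w and bias b by gradient descent on the logistic loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample: p(y=1 | x) - y
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        # Step against the average gradient.
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Class 1 if x lies on the positive side of the boundary w.x + b = 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0


# Made-up, already-scaled features; label 1 = malicious, 0 = benign.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.7, 0.8], [0.8, 0.6], [0.9, 0.9]]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(X, y)
print([predict(w, b, x) for x in X])
print(predict(w, b, [0.15, 0.15]), predict(w, b, [0.85, 0.85]))
```

The optimizer never sees a rule such as "large values mean malicious"; it only adjusts w and b to reduce the error on the labelled examples.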

OK, not the Robots' Revolt

• yet!

• But numerous challenges abound

Concerns and Challenges

The application of partly autonomous algorithms in cybersecurity is not entirely new, although traditionally those systems were usually not referred to as ‘Artificial Intelligence’.

Cybersecurity controls capable of functioning autonomously and taking intelligent decisions to protect information systems and services have existed for quite some time, for instance to decide whether or not to allow a certain network communication, to autonomously filter spam messages, or to adapt to new circumstances such as the identification of previously unseen forms of cyber-attacks.

Since then, the field of cybersecurity has undergone rapid transformation due to the developments in machine learning (ML), deep learning (DL) and reinforcement learning (RL), which have resulted in notable successes in addressing computer vision, Natural Language Processing (NLP) and autonomous decision-making.

Challenges

• Clear abuse of AI systems to enhance cyber-attacks and malicious use of the technology, resulting in an increased and more extended cyber-attack vector.
• Increased difficulty in attributing attacks to a specific actor.
• The targeting of human vulnerabilities through autonomous social engineering, social media and propaganda manipulation.
• Attacks on cyber-physical systems such as autonomous vehicles, or the development of autonomous weapon systems.

• Deep fakes – forgery of voice and images
• DDoS packets created to mimic user activity
• A machine learning-based phishing attack generator that is reported to have increased penetration success from 0.3% to 15%
• Security mechanisms such as CAPTCHA are being bypassed
• Building smarter password guessers through deep learning on leaked data sets
• AI, especially DL models, already outperforms humans in multiple tasks
• Introducing empathy as an algorithm

With an increasing number of AI systems employed in cybersecurity, not only are we making progress but we are also introducing new vulnerabilities that then open the window for new types of attacks.

New Vulnerabilities

A threat to machine learning is data poisoning.

If attackers can figure out how an algorithm is set up, or where it draws its training data from, they can figure out ways to introduce misleading data that builds a counter-narrative about what content or traffic is legitimate versus malicious.
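The effect of poisoning can be illustrated on a toy scale. The sketch below assumes a very simple nearest-centroid classifier and entirely made-up two-dimensional data; it is not a model of any real detector. Mislabelled points injected into the training set drag the "legitimate" centroid towards the malicious region, so a probe that was flagged before is accepted afterwards.

```python
def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def train(samples):
    """Nearest-centroid model: one mean vector per label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(points) for label, points in by_label.items()}


def classify(model, x):
    """Return the label whose centroid is closest to x."""
    def squared_distance(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: squared_distance(model[label]))


# Made-up clean training data.
clean = [
    ([1.0, 1.0], "legitimate"), ([1.5, 1.2], "legitimate"), ([1.2, 0.8], "legitimate"),
    ([8.0, 8.5], "malicious"), ([9.0, 8.0], "malicious"), ([8.5, 9.0], "malicious"),
]

# Hypothetical attacker-supplied samples: malicious-looking traffic
# submitted with a "legitimate" label.
poison = [([7.0, 7.0], "legitimate")] * 12

probe = [7.0, 7.0]
clean_model = train(clean)
poisoned_model = train(clean + poison)

print(classify(clean_model, probe))     # malicious
print(classify(poisoned_model, probe))  # legitimate
```

This is why knowing where training data comes from, and validating it before retraining, matters as much as the choice of algorithm.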

Harnessed power - The Tool Box

• Digital Forensics

• Facial Recognition

• Predictive policing

Digital Forensics

• Use of biometric systems for investigative purposes.
– Analysis of unordered data volumes
– Text analysis
– Image
– Audio

Facial recognition

A much-awaited development at the London MET, which is meeting staunch opposition from privacy advocates even though 8 major trials have been carried out since 2016. Results indicate that only 1 in 1000 is innocently pinged. If we consider the number of criminals that will face justice, is this a matter of ‘a compromise for a greater good’?

Predictive applications

Today’s machines are interpreting data, recognising patterns and thus recommending more efficient ways to achieve desired outcomes.

Above all, we need a strategy that makes use of AI technology to prevent and counter cybercrime in order to keep pace with criminal trends and predictions.

Governance

Malta AI Strategy

• Actions to mitigate cybersecurity risks

• Intersection of AI and cybersecurity

• Framework to create trustworthy AI

MDIA AI Certification Programme

• Platform for practitioners & companies

• Based on Malta Ethical AI Framework

• Valuable recognition in the market

Malta’s AI Ethical Framework

• Governance and control practices

• Assess potential forms of attack

• Risks due to unintentional behaviour

Process Map

Conclusion

• AI will bring a lot of benefits

• Risks still need to be addressed

• Instil trust through tech assurance

Open Discussion

Credits

• Chio, C., and Freeman, D. (2018). Machine Learning and Security: Protecting Systems with Data and Algorithms. O'Reilly Media, Inc.
• Géron, A. (2017). Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O'Reilly Media, Inc.
• PoweredTemplate.com
