
Dr Stephen Hill

Finding & Investigating Digital Footprints with Open Source Intelligence Workshop

The Web Explained

Search Engines

▪ To be truly effective at online research and investigation, it is important to understand the unique and combined qualities of each search engine and to use them effectively in conjunction with each other…

Search Engines (Index)

▪ Search engines are "engines" or "robots" that crawl the web looking for new web pages

▪ These robots read the web pages and put the text (or parts of the text) into a large database or index that you can then access…

▪ Google - https://www.google.co.uk

▪ Bing - http://www.bing.com

▪ Yahoo - https://uk.yahoo.com

▪ Yandex - https://www.yandex.com

Index Search Explained

▪ Page A and Page B have equivalent location and frequency of keywords; however

▪ Page A has 20 external webpages linking to it and Page B has 40

▪ Based on the implication that Page B is more popular, it would achieve a higher page ranking within Google and Bing’s search results than Page A

▪ This information is significant to investigators as many of the webpages sought may be “hidden” or purposely forced to be “unpopular” by the owner due to the nature or intention of the site…
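The Page A / Page B comparison can be sketched as a toy ranking model. This is an illustration only: the `Page` class, the `keyword_score` field and the tie-breaking rule are hypothetical, and real search engines weigh far more signals than inbound links.

```python
from dataclasses import dataclass


@dataclass
class Page:
    name: str
    keyword_score: float  # hypothetical measure of keyword location and frequency
    inbound_links: int    # number of external webpages linking to the page


def rank(pages):
    """Order pages by keyword score, then by inbound links (most popular first)."""
    return sorted(
        pages,
        key=lambda p: (p.keyword_score, p.inbound_links),
        reverse=True,
    )


pages = [
    Page("Page A", keyword_score=1.0, inbound_links=20),
    Page("Page B", keyword_score=1.0, inbound_links=40),
]

# With equal keyword scores, the page with more inbound links is listed first.
print([p.name for p in rank(pages)])
```

A page that its owner deliberately keeps "unpopular" would sit at the bottom of such a list, which is why an investigator may need to look well beyond the first results.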

Point to Remember!

This presents a challenge when using Google and Bing as both of these search engines focus on presenting the most popular pages at the top of their search results

When using these search engines, it may be necessary to locate the least popular sites within millions of search results, proving time consuming and relatively ineffective…


Google – Index Search (Regional)

▪ https://www.google.com.au

▪ https://www.google.co.nz

▪ https://www.google.co.uk


‘Bubbling & Tracking’

Search History

Location

Browser

Browser version

Computer being used

Language being used

Time to type in a query

Time we spent on the search result page

Time between selecting different results for the same query

Operating system

Frequency of clicking on AdSense advertising on other websites

Operating system version

Resolution of computer screen

Average amount of search requests per day

Average amount of search requests per topic (to finish search)

Distribution of search services used (web / images / videos)

Average position of search results clicked on

Time of the day

Current date

Topics of ads clicked on

Frequency of clicking advertising

Frequency of searches of domains on Google

http://www.rene-pickhardt.de/google-uses-57-signals-to-filter

Google – Time Filter


Google – Cache


http://webcache.googleusercontent.com/search?q=cache:efj0Wj8fzxUJ:dfk.com/+&cd=1&hl=en&ct=clnk&gl=au


Google – Similar


Google Image Search


Google Image Search – Face Filter


Google Reverse Image Search


BEYOND GOOGLE


Bing

https://www.bing.com

Google & Bing

http://advangle.com


Search Directories

▪ Search directories are hierarchical databases with references to web sites

▪ The web sites that are included are hand-picked by individuals and classified according to the rules of that particular search service

▪ Yahoo Directory - https://business.yahoo.com

▪ BOTW - http://botw.org

▪ DMOZ - http://www.dmoz.org

DMOZ

http://www.dmoz.org

StartPage

https://startpage.com


Carrot2

http://search.carrot2.org

Yippy - Cluster Search

Formerly known as ‘Clusty’

http://www.yippy.com


DuckDuckGo

http://duckduckgo.com


DuckDuckGo Bangs

https://duckduckgo.com/bang

Semantic Search

www.cluuz.com

Qwant

https://www.qwant.com


Exalead - Advanced

http://www.exalead.com/search

Where to Find Search Engines?

www.searchenginecolossus.com


Advanced Search Techniques

▪ Phrase searching: “fraud in New Zealand”

▪ Boolean search: AND* fraud, NOT* scam

▪ Google Alternative: “fraud”, -scam

▪ Boolean search: fraud OR scam OR swindle

▪ Parentheses: ( ) also known as nesting…

* The AND and NOT keywords will not work with Google
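Because the same query is often run on several engines in conjunction, it can help to generate the search URLs in one step. The sketch below is hedged: the base URLs and the `q` parameter reflect commonly seen query-string patterns for these engines and are assumptions that may change without notice, and `search_urls` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed query-string patterns; widely used, but not guaranteed to stay stable.
ENGINES = {
    "Google": "https://www.google.co.uk/search",
    "Bing": "https://www.bing.com/search",
    "DuckDuckGo": "https://duckduckgo.com/",
}


def search_urls(query):
    """Return one search URL per engine for the same query string."""
    encoded = urlencode({"q": query})  # percent-encodes quotes, turns spaces into '+'
    return {name: f"{base}?{encoded}" for name, base in ENGINES.items()}


# Phrase search combined with the minus sign to exclude a term.
for name, url in search_urls('"fraud in New Zealand" -scam').items():
    print(f"{name}: {url}")
```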

Check the Spelling

▪ Remember that words can be spelt differently, or there may be a misspelt word or typo on the website you are looking for, which is why some search engines fail to find the word/phrase

▪ Consider spelling and typos

▪ Tyres & Tires, colour & color

▪ Stephen Hill, Steven Hill, Steve Hill

▪ Serach Engine, Fraud Invesdigation...
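Once text has been collected from a page, near-matches for a search term can be picked out with fuzzy matching. The following is a minimal sketch using Python's standard `difflib` module; the `find_variants` helper, the sample word list and the 0.8 similarity cutoff are assumptions chosen for illustration.

```python
import difflib


def find_variants(term, words, cutoff=0.8):
    """Return words that closely resemble `term`, allowing for typos and variants."""
    lowered = [w.lower() for w in words]
    return difflib.get_close_matches(term.lower(), lowered, n=10, cutoff=cutoff)


# Hypothetical words taken from a page containing typos and a spelling variant.
page_words = ["Fraud", "Invesdigation", "Serach", "Engine", "colour"]

print(find_variants("investigation", page_words))
print(find_variants("color", page_words))
```

Lowering the cutoff catches more variants at the cost of more false matches, so the value is worth tuning per case.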

Wildcards *

In most search engines and directories, a search for

investigat*

will give you pages with the words including:

investigate, investigated, investigation, investigator

Note: Google uses a process called stemming
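The effect of a trailing wildcard can be reproduced locally with shell-style pattern matching. This sketch uses the standard `fnmatch` module on a hypothetical word list; it only illustrates the idea, since each search engine implements wildcards (or stemming) in its own way.

```python
from fnmatch import fnmatchcase


def wildcard_filter(pattern, words):
    """Return the words that match a shell-style wildcard pattern such as 'investigat*'."""
    return [w for w in words if fnmatchcase(w, pattern)]


words = ["invest", "investigate", "investigated", "investigation", "investigator", "investor"]

print(wildcard_filter("investigat*", words))
# ['investigate', 'investigated', 'investigation', 'investigator']
```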


Truncation & Wildcards *

Other ways to search using the *

"* * director of HTC Parking and Security Limited" = ?

"Ms Anna Koltsova phone *" = ?

"the * population of Auckland is" = ?

Parentheses

▪ Require the terms and operations that occur inside the brackets to be searched first

▪ This is called "nesting"

“identity theft” ((organized OR organised) -crime)

▪ Parentheses MUST BE USED to group terms joined by OR when there is any other Boolean operator in the search…
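Nested queries are easy to get wrong by hand, so a few small helpers can assemble them consistently. The helper names below (`phrase`, `any_of`, `exclude`, `group`) are hypothetical and simply reproduce the nesting example above as a string.

```python
def phrase(text):
    """Wrap text in quotation marks for an exact phrase search."""
    return f'"{text}"'


def any_of(*terms):
    """Join alternatives with OR and group them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def exclude(term):
    """Prefix a term with the minus sign to exclude it."""
    return f"-{term}"


def group(*parts):
    """Nest several parts inside one pair of parentheses."""
    return "(" + " ".join(parts) + ")"


query = " ".join([
    phrase("identity theft"),
    group(any_of("organized", "organised"), exclude("crime")),
])
print(query)  # "identity theft" ((organized OR organised) -crime)
```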


Keyword Searching

Finding Archived Web Pages

https://archive.org/web

Internet Archive

http://archive.org/web
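Archived copies can also be looked up programmatically. The sketch below assumes the Internet Archive's availability endpoint at `https://archive.org/wayback/available` keeps its current form; the helper names are hypothetical, and the `sample` dictionary is made-up data shaped like a decoded JSON response rather than a real result.

```python
from urllib.parse import urlencode

AVAILABILITY_API = "https://archive.org/wayback/available"  # assumed endpoint


def availability_url(page_url, timestamp=None):
    """Build the lookup URL; timestamp is an optional YYYYMMDD string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_API}?{urlencode(params)}"


def closest_snapshot(response):
    """Return the closest snapshot URL from a decoded JSON response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Illustrative response only; the values are invented for this example.
sample = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20160101000000/http://example.com/",
            "timestamp": "20160101000000",
            "status": "200",
        }
    }
}

print(availability_url("example.com", "20160101"))
print(closest_snapshot(sample))
```

Fetching the lookup URL and decoding the JSON is left out so that the sketch runs without network access.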


News Links

http://www.onlinenewspapers.com/

http://www.world-newspapers.com/

http://www.listofnewspapers.com/

http://www.refdesk.com/paper.html

http://www.allyoucanread.com/

http://www.actualidad.com/

http://www.thepaperboy.com/newspapers-by-country.cfm

http://news.silobreaker.com/

http://www.newsola.com

Real Time News



Classifieds - A Criminal Hotspot?

People Search

https://pipl.com

Company Search

https://opencorporates.com


https://www.gov.uk/government/publications/overseas-registries/overseas-registries

Paste Sites – What Could You Find?

▪ Paste sites are websites allowing users to upload text for public viewing.

▪ Originally designed for software developers who needed a place to store large amounts of text

▪ Links would be created to the text and the user could share the link with other programmers to review the code.

▪ Many hacking groups use this area of the Internet to store compromised data.

▪ Most popular site – ‘Pastebin’

Tools for Social Media Intelligence


Facebook

Facebook Search

LinkedIn


LinkedIn Search


https://www.linkedin.com/help/linkedin/answer/76015

Twitter


Twitter Search


Social Searcher

http://www.social-searcher.com


Reverse Image & EXIF Extraction

Reverse Image Search

http://www.tineye.com


Metadata (EXIF)

▪ Exchangeable Image File Format

▪ Standard that specifies the formats for images, sound, and ancillary tags used by digital cameras (including smartphones), scanners etc

▪ Applied to JPEG & TIFF images and can include:

▪ Original image date & time, modified date & time

▪ Camera details including ‘geolocation’ settings…
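EXIF geolocation is stored as degrees, minutes and seconds with a hemisphere reference (N/S/E/W), whereas most mapping tools expect signed decimal degrees. The conversion can be sketched as follows; `dms_to_decimal` is a hypothetical helper and the coordinates are invented sample values, not taken from any real image.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus an N/S/E/W reference to decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are expressed as negative numbers in decimal notation.
    return -value if ref.upper() in ("S", "W") else value


# Hypothetical values, as they might be read from GPS latitude/longitude tags.
latitude = dms_to_decimal(36, 50, 54.6, "S")
longitude = dms_to_decimal(174, 45, 48.0, "E")

print(round(latitude, 4), round(longitude, 4))
```

The resulting pair can be pasted into a mapping service to see where a photo was taken, provided the geolocation tags were not stripped before the image was published.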


EXIF Sites to Consider

Jeffrey’s EXIF Viewer

▪ http://regex.info/exif.cgi

Others

▪ http://www.takenet.or.jp/~ryuuji/minisoft/exifread/english/

▪ http://www.impulseadventure.com/photo/jpeg-snoop.html

▪ http://www.sno.phy.queensu.ca/~phil/exiftool

Camera Trace

▪ http://cameratrace.com/trace

▪ http://www.stolencamerafinder.com

Video Metadata

▪ https://mediaarea.net/en/MediaInfo

Where Was This taken?

Tracing Location of a Photo

https://petapixel.com/assets/uploads/2012/12/fugitivemcafee.jpg



WHOIS


http://whois.domaintools.com/planethollywoodlondon.com


Hiding Your Identity Online

Disguising your ID

▪ Every time you surf the Internet, your IP address is publicly visible to everyone on target network resources

▪ It is important therefore not to leave a digital footprint...

Sock (Finger) Puppets

4 steps to create a sock puppet:

▪ Create fake ID – use name generator

▪ Create fake profiles/user accounts on Facebook etc.

▪ Fake/disguised email, phone and IP details

▪ Consider payment method – pre-paid credit card…


http://www.fakenamegenerator.com

Disguising Your Online ID

Proxy and VPN services re-route your internet traffic and change your IP

A Proxy is like a web filter

▪ Proxy will only secure traffic via the internet browser using the proxy server settings

A VPN encrypts all of your traffic

▪ VPNs route all traffic through the VPN server, including all programs and applications, so the sites you visit see the VPN server's IP address rather than the one assigned by your ISP...
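The difference in scope can be illustrated in code: a proxy is configured per application, so only the client that carries the proxy settings uses it. The sketch below uses Python's standard `urllib.request`; the proxy address `http://127.0.0.1:8080` is a hypothetical placeholder, and no connection is made until `opener.open()` is called.

```python
import urllib.request


def make_proxied_opener(proxy_address):
    """Build an opener whose HTTP and HTTPS requests go through the given proxy."""
    handler = urllib.request.ProxyHandler(
        {"http": proxy_address, "https": proxy_address}
    )
    return urllib.request.build_opener(handler)


# Hypothetical local proxy; only requests made with this opener are re-routed.
opener = make_proxied_opener("http://127.0.0.1:8080")

# Any other program on the same computer still connects directly,
# which is the gap a VPN closes by tunnelling all traffic.
```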

TOR

https://www.torproject.org


“Tor protects you by bouncing your communications around a distributed network of relays run by volunteers all around the world:

It prevents somebody watching your Internet connection from learning what sites you visit, and it prevents the sites you visit from learning your physical location.

Tor works with many of your existing applications, including web browsers, instant messaging clients, remote login, and other applications based on the TCP protocol”.

Who is using Tor?

▪ Normal people (e.g. protect their browsing records)

▪ Militaries (e.g. military field agents)

▪ Journalists and their audiences (e.g. citizen journalists encouraging social change)

▪ Law enforcement officers (e.g. for online “undercover” operations)

▪ Activists and Whistleblowers (e.g. avoid persecution while still raising a voice)

▪ Bloggers

▪ IT professionals (e.g. during development and operational testing, access internet resources while leaving security policies in place)


Tor Project

Some of the software and services under the Tor project umbrella:

▪ Torbutton

▪ Tor Browser Bundle

▪ Vidalia

▪ Orbot

▪ Tails

▪ Onionoo

▪ Metrics Portal

▪ Tor Cloud

▪ Shadow

▪ Tor2web…

Tails

https://tails.boum.org

TOR to Web

https://tor2web.org


VPN Options

https://www.privateinternetaccess.com

How Safe is your Browser?

https://panopticlick.eff.org


Public Vote on Secure Browser

Source: Sensors Tech Forum (http://sensorstechforum.com)

The users voted that the most secure browsers are:

▪ Google Chrome - 49%, or 296 votes

▪ Mozilla Firefox - 31%, or 187 votes

▪ Internet Explorer - 7%, or 43 votes

▪ Safari and Opera - 4%, or 25 votes, each

▪ Microsoft Edge - 3%, or 19 votes

▪ Maxthon - 1%, or 9 votes…

http://sensorstechforum.com/which-is-the-most-secure-browser-for-2016-firefox-chrome-internet-explorer-safari-2

Final Considerations

Other questions should also be taken into consideration in addition to securing your web browser:

▪ Do you update your browser whenever a new version is available?

▪ Have you configured your browser updates as automatic?

▪ Do you use third-party browser add-ons and plugins, and if yes, are you familiar with their developers?

▪ Do you install third-party software from unknown download pages, without paying attention to the Download Agreement?


Dr Stephen Hill

drshill@gmx.co.uk
