Selenium Sandwich Part 3: What you aren't Steven Lembark Workhorse Computing [email protected]

Selenium sandwich-3: Being where you aren't

Page 1: Selenium sandwich-3: Being where you aren't

Selenium Sandwich Part 3: What you aren't

Steven Lembark

Workhorse Computing

[email protected]

Page 2: Selenium sandwich-3: Being where you aren't

What is a Selenium Sandwich?

Tasty!!!

No really...

Page 4: Selenium sandwich-3: Being where you aren't

What is a Selenium Sandwich?

Last time we saw how to combine Selenium and Plack.

Selenium calls a page.

Plack returns a specific response.

Catch: You can't get there from here.

Or you can, which is the problem.

Page 6: Selenium sandwich-3: Being where you aren't

Getting to the server

Q: How do we get a specific page loaded?

Say a Google map, Yelp search, or *aaS dashboard?

A: Load the page from a server?

What about our static content?

Page 7: Selenium sandwich-3: Being where you aren't

Locally sourced

You want to test a Google page.

How?

Save it locally?

Only if you want to save all of it.

Page 8: Selenium sandwich-3: Being where you aren't

Trucked in

Q: How many URLs does it take to screw in a...

Page 10: Selenium sandwich-3: Being where you aren't

Trucked in

Q: How many URLs does it take to make a Google page?

A: Lots.

Banners, logos, JS libs, Java libs, ads...

Many are dynamic: they cannot be saved.

Page 11: Selenium sandwich-3: Being where you aren't

Wherefore art thou?

Many URLs are relative.

They recycle the scheme+host+port.

Page 12: Selenium sandwich-3: Being where you aren't

Relative paths

Many URLs are relative.

They recycle the scheme+host+port:

http://localhost:24680/foobar.

http://localhost:24680/<everything else>
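
How a browser fills in a relative link can be sketched with the URI module, which applies the same scheme+host+port recycling. The link names below are made up for illustration; only the base URL comes from the slide.

```perl
use strict;
use warnings;
use URI;

# the page the browser believes it loaded
my $base = 'http://localhost:24680/foobar';

# relative links as they might appear in that page (hypothetical names)
my @linkz = qw( /static/logo.png style.css );

for my $link ( @linkz )
{
    # new_abs resolves a relative reference against the base URL
    my $abs = URI->new_abs( $link, $base );

    print "$link => $abs\n";
}
```

Every resolved link points back at localhost:24680, which is why a page saved locally keeps asking the local server for everything else.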

Page 13: Selenium sandwich-3: Being where you aren't

Relative paths

Need to ask locally for a remote page.

With the browser having no idea where it came from.

In other words: We need a proxy.

Page 14: Selenium sandwich-3: Being where you aren't

HTTP Proxying

Normally for security or content filtering.

Or avoiding security and content filtering.

How?

Page 15: Selenium sandwich-3: Being where you aren't

Explicit proxy

Configure browser.

It asks the proxy for everything.

Proxy pulls content, returns it.

Proxy decides which content goes to test server.
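
In the tests the client is a browser driven by Selenium, and how that browser is configured depends on the driver. As a rough stand-in, the same idea can be sketched with LWP::UserAgent playing the client. The proxy address is an assumption that reuses the port from these slides.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# address of the local proxy (assumed; port borrowed from the slides)
my $proxy_url = 'http://localhost:24680/';

my $client = LWP::UserAgent->new;

# send plain http requests via the proxy instead of directly
$client->proxy( http => $proxy_url );

# nothing is fetched here: this only reads the setting back
print 'http goes via: ', $client->proxy( 'http' ), "\n";
```

Once the client is pointed at the proxy, the proxy sees every http request and can decide where each one really goes.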

Page 16: Selenium sandwich-3: Being where you aren't

HTTP::Proxy

Run as a daemon.

User-supplied filters.

LWP as back-end for fetching.

Slow but reliable...

Page 17: Selenium sandwich-3: Being where you aren't

Basic proxy setup

Grab a port...

and go!

use HTTP::Proxy;

my $proxy = HTTP::Proxy->new( port => 24680 );

# or...

my $proxy = HTTP::Proxy->new;

$proxy->port( 24680 );

# loop forever

$proxy->start;

Page 18: Selenium sandwich-3: Being where you aren't

Initializing HTTP::Proxy

Base class supplies “new”.

Derived class provides its own “init”.

package Mine;
use parent qw( HTTP::Proxy );

my $src_dir = '';

sub init
{
    # @args == whatever was passed to new
    # in this case a path.

    my ( undef, %argz ) = @_;

    # fall back to the current directory if no path was given.

    $src_dir = $argz{ src_dir } || '.';

    ...
}

Page 19: Selenium sandwich-3: Being where you aren't

Adding filters

HTTP::Proxy supports request and response filters.

Request filters modify outgoing content.

Response filters hack what comes back.

Our trick is to only filter some of it.

Page 20: Selenium sandwich-3: Being where you aren't

Four ways to filter content

request-headers request-body

response-headers response-body

Filters go onto a stack:

$proxy->push_filter
(
    response => $filter     # or request => ...
);
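
Filtering only some of the traffic can be sketched with the match criteria that push_filter accepts alongside the filter itself. This is a minimal sketch, not the deck's own setup: the host name is hypothetical, and HTTP::Proxy::BodyFilter::simple is used only to keep the example short.

```perl
use strict;
use warnings;
use HTTP::Proxy;
use HTTP::Proxy::BodyFilter::simple;

# the substitution itself; body filters get a reference to the data
my $fix_case = sub
{
    my ( $self, $dataref ) = @_;

    $$dataref =~ s/PERL/Perl/g;
};

my $proxy  = HTTP::Proxy->new( port => 24680 );
my $filter = HTTP::Proxy::BodyFilter::simple->new( $fix_case );

# only HTML replies from one (hypothetical) host reach the filter
$proxy->push_filter
(
    mime     => 'text/html',
    host     => 'example.com',
    response => $filter,
);

# $proxy->start;    # uncomment to start serving requests
```

Anything that does not match the criteria passes through the proxy untouched.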

Page 21: Selenium sandwich-3: Being where you aren't

Massage your body

package MyFilter;
use base qw( HTTP::Proxy::BodyFilter );

sub filter
{
    # modify content in the reply

    my ( $self, $dataref, $message, $protocol, $buffer ) = @_;

    $$dataref =~ s/PERL/Perl/g;
}

1
__END__

Page 22: Selenium sandwich-3: Being where you aren't

Fix your head

package MyFilter;
use base qw( HTTP::Proxy::HeaderFilter );

# change User-Agent header in all requests

sub filter
{
    my ( $self, $headers, $message ) = @_;

    $message->headers->header
    (
        User_Agent => 'MyFilter/1.0'
    );
    ...
}

Page 23: Selenium sandwich-3: Being where you aren't

Have to hack the request

Change:

https://whatever

to:

http://localhost:test_port/...

Or pass through to remote server.
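
The rewrite itself is a few calls on a URI object. A minimal sketch, assuming a test server on a made-up port; the function name and the sample URL are illustrative only.

```perl
use strict;
use warnings;
use URI;

# hypothetical port for the local test server
my $test_port = 24680;

sub to_test_server
{
    my $url = URI->new( shift );

    # swap scheme, host and port; path and query are left alone
    $url->scheme( 'http' );
    $url->host  ( 'localhost' );
    $url->port  ( $test_port );

    $url
}

print to_test_server( 'https://example.com/search?q=lunch' ), "\n";
```

The path and query survive the rewrite, so the test server can still tell which page was asked for.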

Page 26: Selenium sandwich-3: Being where you aren't

Timing is everything

Modifying the response is too late.

That leaves the request or agent.

Request filters can easily modify headers or body.

Not the request itself.

That leaves the agent.

Page 27: Selenium sandwich-3: Being where you aren't

Secret Agents

One choice is a new HTTP::Proxy class (is-a).

The other is replacing the agent (has-a).

For now let's try the agent.

Page 29: Selenium sandwich-3: Being where you aren't

Wrapping LWP::UserAgent

Anything LWP does, we check first.

Any path we know goes to test.

Any we don't goes to LWP.

Intercept all methods with AUTOLOAD.

Requires we have none of our own.
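
The AUTOLOAD trick can be tried without LWP at all. Below is a minimal sketch using core Perl only: Toy::Agent is a hypothetical stand-in for the agent and Toy::Wrapper for the wrapper. Neither is part of any library.

```perl
use strict;
use warnings;

package Toy::Agent;         # hypothetical stand-in for the real agent

sub new { bless {}, shift }
sub get { my ( $self, $url ) = @_; "fetched $url" }

package Toy::Wrapper;       # holds the agent (has-a)

our $AUTOLOAD = '';
our @callz    = ();

# a code ref, not a method: nothing here shadows the agent's methods
our $wrap = sub
{
    my $agent = shift or die 'Missing agent';

    bless \$agent, __PACKAGE__
};

sub AUTOLOAD
{
    my $wrapper = $_[0];
    my $name    = substr( $AUTOLOAD, 1 + rindex( $AUTOLOAD, ':' ) );

    # object teardown also lands here; there is nothing to forward
    return if $name eq 'DESTROY';

    push @callz, $name;     # the "check first" step

    my $agent   = $$wrapper;
    my $handler = $agent->can( $name )
    or die "Agent cannot '$name'";

    # swap the wrapper for the agent, then jump to the real method
    splice @_, 0, 1, $agent;

    goto &$handler
}

package main;

my $wrapped = $Toy::Wrapper::wrap->( Toy::Agent->new );

print $wrapped->get( 'http://localhost:24680/foobar' ), "\n";
print "intercepted: @Toy::Wrapper::callz\n";
```

AUTOLOAD only fires for methods Perl cannot find in the wrapper's own class or its parents, which is why the wrapper keeps its constructor in a variable instead of defining a method.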

Page 30: Selenium sandwich-3: Being where you aren't

Generic wrapper

package Wrap::LWP;

# no parent class here: inherited LWP::UserAgent methods would be
# found first and AUTOLOAD would never see the call.

use Exporter::Proxy qw( wrap_lwp handle_locally );

our $wrap_lwp
= sub
{
    my $lwp = shift or die ... ;

    my $wrapper = bless \$lwp, __PACKAGE__;

    $wrapper
};

Page 31: Selenium sandwich-3: Being where you aren't

Generic wrapper

use Exporter::Proxy qw( wrap_lwp handle_locally );
use List::MoreUtils qw( uniq );

our @localz = ();

our $handle_locally
= sub
{
    # list of URLs is on the stack.
    # could be literals, regexen, objects.
    # lacking smart match, use if-blocks.

    @localz = uniq @localz, @_;

    return
};

Page 32: Selenium sandwich-3: Being where you aren't

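Generic wrapper

use Scalar::Util qw( blessed );

our $AUTOLOAD = '';

AUTOLOAD
{
    my ( $wrapper, $request ) = @_;

    # only calls that pass a request object have a URL to check.

    if( blessed $request && $request->can( 'url' ) )
    {
        my $url  = $request->url;
        my $path = $url->path;

        # %known is keyed by URL path; populating it is not shown here.

        if( exists $known{ $path } )
        {
            # redirect this to the test server

            $url->scheme( 'http' );
            $url->host  ( 'localhost' );
            $url->port  ( 24680 );
        }
    }
    ...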

Page 33: Selenium sandwich-3: Being where you aren't

Generic wrapper

    # now re-dispatch this to the LWP object.
    # this is the same for any wrapper.
    # goto preserves the call order (e.g., croak works).

    my $i     = rindex $AUTOLOAD, ':';
    my $name  = substr $AUTOLOAD, 1+$i;
    my $agent = $$wrapper;

    # object teardown arrives here too: nothing to forward.

    $name eq 'DESTROY'
    and return;

    my $handler = $agent->can( $name )
    or die ... ;

    splice @_, 0, 1, $agent;

    goto $handler
}

Page 34: Selenium sandwich-3: Being where you aren't

Using the wrapper

use Wrap::LWP;
use HTTP::Proxy;

$handle_locally->
(
    'https://foo/bar',
    'http://bletch/blort?bim="bam"'
);

my $proxy = HTTP::Proxy->new( ... );

# create the default agent so there is one to wrap.

$proxy->init;

my $wrapper = $wrap_lwp->( $proxy->agent );
$proxy->agent( $wrapper );

$proxy->start;

Page 35: Selenium sandwich-3: Being where you aren't

TMTOWTDI

AUTOLOAD can handle known sites.

Instead of modifying the URL: just deal with it.

Upside: Skip LWP for local content.

Downside: Proxy gets more complicated.
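
"Just deal with it" could mean building the reply on the spot. A minimal sketch, assuming canned content kept in a hash; the table, its contents and the function name are all hypothetical, and hooking it into the wrapper is left out.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;

# hypothetical table of canned pages, keyed by URL path
my %canned = ( '/bar' => '<html><body>Canned lunch</body></html>' );

# build a reply without touching LWP; returns nothing for unknown paths
sub local_response
{
    my $request = shift;
    my $path    = $request->uri->path;

    exists $canned{ $path }
    or return;

    my $reply = HTTP::Response->new
    (
        200, 'OK',
        [ 'Content-Type' => 'text/html' ],
        $canned{ $path },
    );

    # record which request this reply answers
    $reply->request( $request );

    $reply
}

my $reply = local_response( HTTP::Request->new( GET => 'https://foo/bar' ) );

print $reply->status_line, "\n";
```

A caller that gets nothing back would hand the request on to LWP as before.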

Page 36: Selenium sandwich-3: Being where you aren't

Result

Known pages are handled locally.

Others are passed to the cloud.

Server & client have repeatable sequence.

The test loop is closed.

Page 37: Selenium sandwich-3: Being where you aren't

So...

When you need to be who you're not: Use a proxy.

HTTP::Proxy gives control of request, reply, & agent.

Handling LWP is easy enough.

Which gives us a nice, wrapped sandwich.