Performant Django Ara Anjargolian Co-Founder & CTO, YCharts



DESCRIPTION

http://www.hakkalabs.co/articles/performant-django-best-practices


Page 1: Performant Django - Ara Anjargolian

Performant Django

Ara Anjargolian Co-Founder & CTO, YCharts

Page 2: Performant Django - Ara Anjargolian

There are two distinct kinds of performance issues

Predictably, they are: front-end and back-end.

Handling them effectively requires very different approaches.

Page 3: Performant Django - Ara Anjargolian

First, a quick note about frontend performance

80-90% of the end-user response time is spent on the frontend.  Start there.

-Steve Souders

Page 4: Performant Django - Ara Anjargolian

Front-End Performance Work

•  Can be universally applied

•  Requires systems/tooling changes

•  Often has clear, system-independent best practices

Page 5: Performant Django - Ara Anjargolian

Best Practice: Cache static assets forever (as long as they don’t change)

Why: Download assets as infrequently as possible

Solution: Already done! (As long as you use CachedStaticFilesStorage or CachedFilesMixin with your own storage)
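A minimal settings sketch for the slide above, assuming a Django version that still ships CachedStaticFilesStorage. The STATIC_ROOT path is a made-up placeholder. Later Django releases provide ManifestStaticFilesStorage for the same purpose, and the newest ones configure storages through a STORAGES setting, so check the docs for your version.

```python
# settings.py (fragment, not a complete settings file)

STATIC_URL = "/static/"
STATIC_ROOT = "/var/www/example/static/"  # hypothetical path, adjust to your deployment

# collectstatic will write copies of each file with a content hash in the
# name (e.g. site.<hash>.css), so a changed file always gets a new URL.
STATICFILES_STORAGE = "django.contrib.staticfiles.storage.CachedStaticFilesStorage"

# Assumption: on Django versions without CachedStaticFilesStorage, use
# "django.contrib.staticfiles.storage.ManifestStaticFilesStorage" instead.
```

Because the hash changes whenever the content changes, the web server or CDN in front of these files can safely send far-future cache headers.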

Page 6: Performant Django - Ara Anjargolian

Best Practice: Bundle/minify/compress static assets

Why: Reduce # of requests, download time

Solution: Use a static-asset-manager. 2 good ones: django-pipeline, webassets.

Bonus points: Lower number of requests by using data URIs for images (which pipeline supports)

Page 7: Performant Django - Ara Anjargolian

Best Practice: Serve static files via a CDN.

Why: Less latency

Solution:

Good: Store in the filesystem and point STATIC_URL at a CDN that uses your site as its origin.

Better: Use django-storages with the STATICFILES_STORAGE setting to store files in cloud file storage (e.g. S3) and point the CDN at it.
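The two options might look roughly like this in settings. This is a sketch: the CDN hostname and bucket name are invented placeholders, and the S3 backend path assumes a django-storages release that provides the boto3-based backend.

```python
# settings.py (fragment) -- hostnames and bucket names below are hypothetical

# "Good": files stay on your own servers; the CDN fetches them from your
# site (its origin) on a cache miss, and templates emit CDN URLs.
STATIC_URL = "https://cdn.example.com/static/"

# "Better": collectstatic uploads to S3 through django-storages, and the
# CDN uses the bucket as its origin instead of your web servers.
STATICFILES_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"
AWS_STORAGE_BUCKET_NAME = "example-static-assets"
AWS_S3_CUSTOM_DOMAIN = "cdn.example.com"  # URLs are generated against this host
```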

Page 8: Performant Django - Ara Anjargolian

Best Practice: Serve more stuff as static assets.

Why: Static assets can be served faster, more efficiently than dynamic assets.

Solution: Front-end templates, static-y data structures that can be served as JSON.

All that’s required are some custom management commands.
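As a rough illustration, the core of such a command can be a plain function that writes a rarely-changing data structure to a JSON file. The function name, data, and output location are all hypothetical. In a Django project this would be called from the handle() method of a custom management command (a BaseCommand subclass), with the output path inside a directory that collectstatic picks up.

```python
import json
import os
import tempfile


def dump_static_json(data, path):
    """Write data as compact JSON so it can be served like any static file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(data, fh, separators=(",", ":"), sort_keys=True)


# Hypothetical "static-y" data; in practice this would come from a query.
countries = {"AM": "Armenia", "US": "United States"}

# A temp directory stands in for a real static files directory here.
out_path = os.path.join(tempfile.mkdtemp(), "countries.json")
dump_static_json(countries, out_path)
```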

Page 9: Performant Django - Ara Anjargolian

Back-End Performance Work

•  Can really only be done on a case-by-case basis.

•  Often only requires code changes.

•  Is very site-specific.

Page 10: Performant Django - Ara Anjargolian

OK, I lied, there are some global back-end performance to-dos.

•  Use cached sessions (contrib.sessions.backends.cache or contrib.sessions.backends.cached_db)

•  Use cached template loader

•  If you’re starting a new project, or have a ton of heavyweight templates, consider using Jinja2 as your template engine.
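The first two to-dos are settings changes. A sketch, assuming a Django version that uses the TEMPLATES setting (older versions configured loaders through TEMPLATE_LOADERS instead):

```python
# settings.py (fragment)

# Sessions are read from the cache first and fall back to the database.
# Use "django.contrib.sessions.backends.cache" for cache-only sessions.
SESSION_ENGINE = "django.contrib.sessions.backends.cached_db"

TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": [],
        "APP_DIRS": False,  # must be off when "loaders" is set explicitly
        "OPTIONS": {
            "loaders": [
                # The cached loader wraps the real loaders and keeps
                # compiled templates in memory between requests.
                (
                    "django.template.loaders.cached.Loader",
                    [
                        "django.template.loaders.filesystem.Loader",
                        "django.template.loaders.app_directories.Loader",
                    ],
                ),
            ],
        },
    },
]
```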

But on to the real stuff!

Page 11: Performant Django - Ara Anjargolian

OK, I lied, first a disclaimer

DO NOT try to “optimize” every view.

•  This is an utter waste of time, as there will be diminishing returns.

•  Optimizing on the backend often means adding complexity. And in a multi-programmer environment, complexity is expensive!

Page 12: Performant Django - Ara Anjargolian

Backend performance work starts with a profile of the “problem” view

Use a profiler middleware!

(A good one: https://gist.github.com/Miserlou/3649773)
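To show the idea without the middleware plumbing, here is a small stdlib-only sketch of what a profiler middleware does around a view: run the callable under cProfile and collect a report sorted by cumulative time. This is not the code from the linked gist. profile_call and slow_view are hypothetical names, and slow_view is only a stand-in for a real view function.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and print only the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


def slow_view(n):
    # Stand-in for the "problem" view being profiled.
    return sum(i * i for i in range(n))


result, report = profile_call(slow_view, 1000)
```

A real middleware would typically do this only when a special query parameter is present and return the report as the response body.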

Page 13: Performant Django - Ara Anjargolian

What does a profile look like?

Page 14: Performant Django - Ara Anjargolian

Understanding a profile

Things to look for:

•  Tons of time spent in SQL?

•  Particular functions where each call takes longer than you would expect, or that are being called way too often?

Page 15: Performant Django - Ara Anjargolian

What if the problem is SQL?

First, use django-debug-toolbar or django-devserver to identify the problem queries.

Is the issue one slow query? Too many queries?

Page 16: Performant Django - Ara Anjargolian

SQL Tricks, Part 1

•  select_related(): Helps avoid extra queries to grab objects referenced by foreign keys/one-to-one relationships

•  values/values_list(): Avoid Python object creation overhead when dicts/lists are good enough

•  db_index=True: If you are looking up objects by a field that’s not a primary/foreign key and does not have a uniqueness constraint on it, you might need this
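What select_related() buys you is easiest to see at the SQL level. The sketch below uses the stdlib sqlite3 module rather than the Django ORM, with made-up book and author tables. The first loop is the "one query per foreign key" pattern, and the second is the single JOIN that a call like Book.objects.select_related("author") (hypothetical model) would issue instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT,
                       author_id INTEGER REFERENCES author(id));
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book VALUES (1, 'A1', 1), (2, 'A2', 1), (3, 'B1', 2);
""")

query_count = 0


def run(sql, params=()):
    """Execute one query and count it."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# Without select_related(): 1 query for the books + 1 per book for its author.
naive = []
for title, author_id in run("SELECT title, author_id FROM book ORDER BY id"):
    (name,) = run("SELECT name FROM author WHERE id = ?", (author_id,))[0]
    naive.append((title, name))
naive_queries = query_count

# With a JOIN (what select_related() generates): everything in one query.
query_count = 0
joined = run(
    "SELECT book.title, author.name FROM book "
    "JOIN author ON author.id = book.author_id ORDER BY book.id"
)
joined_queries = query_count
```

Both approaches return the same rows. The difference is the number of round trips to the database, which grows with the number of books in the first case and stays at one in the second.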

Page 17: Performant Django - Ara Anjargolian

SQL Tricks, Part 2

•  prefetch_related(): Like select_related(), except the “join” is done in Python and thus works for M2M

•  only(): Only grab fields in the model you need (USE WITH CAUTION!)

•  defer(): Get all fields except those stated in defer()

•  bulk_create(): When writing lots of rows to same table
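The "join in Python" behind prefetch_related() can be sketched with the stdlib sqlite3 module (not the Django ORM) and hypothetical book/tag tables: one query for the main objects, one query for all related rows using IN (...), then the matching up is done in Python.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book_tag (book_id INTEGER, tag_id INTEGER);
    INSERT INTO book VALUES (1, 'Django'), (2, 'SQL');
    INSERT INTO tag VALUES (1, 'web'), (2, 'python'), (3, 'db');
    INSERT INTO book_tag VALUES (1, 1), (1, 2), (2, 3);
""")

# Query 1: the main objects.
books = conn.execute("SELECT id, title FROM book ORDER BY id").fetchall()

# Query 2: related rows for all of those objects at once.
ids = [book_id for book_id, _ in books]
placeholders = ",".join("?" * len(ids))
rows = conn.execute(
    "SELECT bt.book_id, t.name FROM book_tag bt "
    "JOIN tag t ON t.id = bt.tag_id "
    f"WHERE bt.book_id IN ({placeholders}) ORDER BY t.id",
    ids,
).fetchall()

# The "join" back to each book happens in Python, not in the database.
tags_by_book = defaultdict(list)
for book_id, name in rows:
    tags_by_book[book_id].append(name)

result = {title: tags_by_book[book_id] for book_id, title in books}
```

The query count stays at two no matter how many books there are, instead of one extra query per book.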

Page 18: Performant Django - Ara Anjargolian

What if the problem is SQL and none of the above helps?

•  raw(): Roll your own SQL that can perhaps use stuff specific to the DB, or fancier queries.

•  Denormalization: Fewer joins, precomputed data

•  NoSQL: Maybe the data you are storing in a relational database doesn’t map well to a relational database.

Page 19: Performant Django - Ara Anjargolian

What if the problem is in the Python?

Common issues:

•  Algorithmic issues like n^2 paths that don’t need to be n^2

•  Doing extra work like constantly re-evaluating a loop invariant inside a loop

•  Using if/else for error control where exceptions will do (again, most problematic inside a loop)

In general: People doing bad stuff inside loops.
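A few small before/after pairs for the issues listed above. All function names are hypothetical. Note that the exception-based version only wins when missing keys are rare, since raising an exception is itself expensive.

```python
def normalize_slow(values):
    # max(values) is a loop invariant, but it is recomputed for every
    # element, which turns an O(n) job into O(n^2).
    return [v / max(values) for v in values]


def normalize_fast(values):
    peak = max(values)  # computed once, outside the loop
    return [v / peak for v in values]


def common_slow(a, b):
    # "x in b" scans the whole list each time: O(len(a) * len(b)).
    return [x for x in a if x in b]


def common_fast(a, b):
    b_set = set(b)  # one-time cost, then O(1) average membership tests
    return [x for x in a if x in b_set]


def total_with_checks(names, prices):
    total = 0
    for name in names:
        if name in prices:  # extra lookup on every iteration
            total += prices[name]
    return total


def total_with_exceptions(names, prices):
    total = 0
    for name in names:
        try:
            total += prices[name]  # single lookup on the common path
        except KeyError:
            pass  # rare case: unknown name, skip it
    return total
```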

Page 20: Performant Django - Ara Anjargolian

What if you optimized your Python/SQL and you’re still slow?

Cache.

Then cache some more.

Page 21: Performant Django - Ara Anjargolian

Many types of caching

•  View cache

•  Template fragment cache

•  Function level cache (via package like django-cache-utils, django-cache-helper)

•  Query cache (django-cache-machine, django-cacheops)
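To make the function-level case concrete, here is a minimal stdlib-only sketch of a timeout-based caching decorator. It is not the API of any of the packages named above. It keeps results in an in-process dict, whereas a Django project would more likely store them through Django's cache framework so that all processes share them. The names cached and expensive_report are hypothetical.

```python
import functools
import time


def cached(timeout):
    """Cache a function's results per argument tuple for `timeout` seconds."""
    def decorator(func):
        store = {}  # maps args -> (expiry time, value)

        @functools.wraps(func)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args)
            store[args] = (now + timeout, value)
            return value

        return wrapper
    return decorator


calls = []


@cached(timeout=60)
def expensive_report(ticker):
    calls.append(ticker)  # records how often the real work runs
    return ticker.upper()


first = expensive_report("abc")
second = expensive_report("abc")  # served from the cache
```

The hard part of any cache is invalidation: a sketch like this only expires entries by time, so stale data can be served for up to `timeout` seconds.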

Page 22: Performant Django - Ara Anjargolian

The End

Questions?

[email protected]

http://github.com/ara818

Like solving complex performance problems?

YCharts is hiring!