http://www.hakkalabs.co/articles/performant-django-best-practices
Performant Django
Ara Anjargolian Co-Founder & CTO, YCharts
There are two distinct kinds of performance issues
Predictably, they are: front-end and back-end.
Handling them effectively requires very different approaches.
First, a quick note about frontend performance
80-90% of the end-user response time is spent on the frontend. Start there.
-Steve Souders
Front-End Performance Work
• Can be universally applied
• Requires systems/tooling changes
• Often has clear, system-independent best practices
Best Practice: Cache static assets forever (as long as they don’t change)
Why: Download assets as infrequently as possible
Solution: Already done! (As long as you use CachedStaticFilesStorage or CachedFilesMixin with your own storage)
Best Practice: Bundle/minify/compress static assets
Why: Reduce # of requests, download time
Solution: Use a static asset manager. Two good ones: django-pipeline and webassets.
Bonus points: Lower the number of requests by using data URIs for images (which django-pipeline supports)
Best Practice: Serve static files via a CDN.
Why: Less latency
Solution:
• Good: Store files on the filesystem and point STATIC_URL at a CDN that uses your site as its origin.
• Better: Use django-storages and the STATICFILES_STORAGE setting to store files in cloud file storage (e.g. S3), and point the CDN at that.
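A settings sketch of the two options, under assumptions: the CDN hostname and bucket name are placeholders, the second half requires django-storages to be installed, and the setting names should be checked against the Django and django-storages versions in use (newer Django releases configure storage through a different setting than STATICFILES_STORAGE).

```python
# settings.py (fragment). Hostname and bucket name are placeholders.

# "Good": files stay on your own servers. The CDN fetches each file from
# your site once (its origin) and serves it from the edge afterwards.
STATIC_URL = "https://cdn.example.com/static/"

# "Better" (assumes django-storages with its S3 backend is installed):
# collectstatic uploads the files to a bucket, and the CDN uses the
# bucket as its origin instead of your app servers.
STATICFILES_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"
AWS_STORAGE_BUCKET_NAME = "example-static-bucket"  # placeholder
AWS_S3_CUSTOM_DOMAIN = "cdn.example.com"           # placeholder CDN host
```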
Best Practice: Serve more stuff as static assets.
Why: Static assets can be served faster, more efficiently than dynamic assets.
Solution: Front-end templates, static-y data structures that can be served as JSON.
All that’s required are some custom management commands.
Back-End Performance Work
• Can really only be done on a case-by-case basis.
• Often only requires code changes.
• Is very site-specific.
OK, I lied, there are some global back-end performance to-dos.
• Use cached sessions (contrib.sessions.backends.cache or contrib.sessions.backends.cached_db)
• Use cached template loader
• If you’re starting a new project, or have a lot of heavyweight templates, consider using jinja2 as your template engine.
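A settings sketch for the first two items, assuming a cache backend is already configured and a Django version that reads the TEMPLATES setting (which is newer than this talk). The template directory is hypothetical.

```python
# settings.py (fragment). Assumes CACHES is configured elsewhere.

# Sessions are read from the cache first and fall back to the database.
SESSION_ENGINE = "django.contrib.sessions.backends.cached_db"

TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": ["templates"],  # hypothetical project template directory
        "OPTIONS": {
            # Wrap the usual loaders so compiled templates stay in memory.
            # APP_DIRS must be left unset when "loaders" is given explicitly.
            "loaders": [
                (
                    "django.template.loaders.cached.Loader",
                    [
                        "django.template.loaders.filesystem.Loader",
                        "django.template.loaders.app_directories.Loader",
                    ],
                ),
            ],
        },
    },
]
```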
But on to the real stuff!
OK, I lied, first a disclaimer
DO NOT try to “optimize” every view.
• This is an utter waste of time, as there will be diminishing returns.
• Optimizing on the backend often means adding complexity. And in a multi-programmer environment, complexity is expensive!
Backend performance work starts with a profile of the “problem” view
Use a profiler middleware!
(A good one: https://gist.github.com/Miserlou/3649773)
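To give a rough idea of what such a middleware does, here is a minimal sketch built on the standard library's cProfile. It is not the code from the gist above: the `prof` query parameter and the `profile_report` attribute are assumptions made for illustration, and it uses the function-style middleware signature of later Django versions.

```python
import cProfile
import io
import pstats


def profiling_middleware(get_response):
    """Profile a request when it opts in via a query parameter."""

    def middleware(request):
        # Assumption: profiling is requested with e.g. /some/view/?prof
        if "prof" not in getattr(request, "GET", {}):
            return get_response(request)

        profiler = cProfile.Profile()
        response = profiler.runcall(get_response, request)

        # Render the 20 most expensive calls, sorted by cumulative time.
        out = io.StringIO()
        stats = pstats.Stats(profiler, stream=out)
        stats.sort_stats("cumulative").print_stats(20)

        # Hypothetical: stash the report on the response. A real middleware
        # would more likely return it as the response body.
        response.profile_report = out.getvalue()
        return response

    return middleware
```

Only enable something like this in development or behind a staff-only check, since profiling slows the request and exposes internals.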
What does a profile look like?
Understanding a profile
Things to look for:
• Tons of time spent in SQL?
• Particular functions that take longer per call than you would expect, or that are called far more often than they should be?
What if the problem is SQL?
First, use django-debug-toolbar or django-devserver to identify the problem queries.
Is the issue one slow query? Too many queries?
SQL Tricks, Part 1
• select_related(): Helps avoid extra queries to grab objects referenced by foreign keys/one to one relationships
• values/values_list(): Avoid Python object creation overhead when dicts/lists are good enough
• db_index=True: If you look up objects by a field that is not a primary/foreign key and has no uniqueness constraint on it, you might need this
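The query pattern that select_related() avoids can be shown without Django. The sketch below is not ORM code: it uses the standard library's sqlite3 with a made-up author/book schema to contrast the "one query per row" pattern with a single JOIN, which is roughly what select_related() asks the database for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES author(id)
    );
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book VALUES (1, 'A1', 1), (2, 'A2', 1), (3, 'B1', 2);
""")

query_log = []  # every SQL string sent through run()


def run(sql, params=()):
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


def titles_with_authors_naive():
    # One query for the books, then one more per book for its author (N+1).
    rows = []
    for title, author_id in run("SELECT title, author_id FROM book ORDER BY id"):
        (name,) = run("SELECT name FROM author WHERE id = ?", (author_id,))[0]
        rows.append((title, name))
    return rows


def titles_with_authors_joined():
    # A single query with a JOIN returns the same rows.
    return run(
        "SELECT book.title, author.name FROM book "
        "JOIN author ON author.id = book.author_id ORDER BY book.id"
    )
```

With three books, the naive version issues four queries and the joined version issues one. The gap grows linearly with the number of rows.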
SQL Tricks, Part 2
• prefetch_related(): Like select_related(), except the “join” is done in Python, so it also works for M2M
• only(): Only grab fields in the model you need (USE WITH CAUTION!)
• defer(): Get all fields except those stated in defer()
• bulk_create(): When writing lots of rows to same table
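What "the join is done in Python" means for prefetch_related() can be sketched with plain sqlite3. This is an illustration of the idea rather than Django's implementation, and the post/tag schema is made up: one query fetches the parent rows, a second fetches all related rows with an IN clause, and a dictionary stitches them together.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post_tag (post_id INTEGER, tag_id INTEGER);
    INSERT INTO post VALUES (1, 'first'), (2, 'second');
    INSERT INTO tag VALUES (1, 'django'), (2, 'sql');
    INSERT INTO post_tag VALUES (1, 1), (1, 2), (2, 1);
""")


def posts_with_tags():
    # Query 1: the parent rows.
    posts = conn.execute("SELECT id, title FROM post ORDER BY id").fetchall()
    if not posts:
        return []

    # Query 2: every related row for those parents, in one round trip.
    post_ids = [post_id for post_id, _ in posts]
    placeholders = ", ".join("?" for _ in post_ids)
    links = conn.execute(
        "SELECT post_tag.post_id, tag.name FROM post_tag "
        "JOIN tag ON tag.id = post_tag.tag_id "
        f"WHERE post_tag.post_id IN ({placeholders}) "
        "ORDER BY post_tag.post_id, tag.id",
        post_ids,
    ).fetchall()

    # The "join" happens here, in Python, by grouping on the parent id.
    tags_by_post = defaultdict(list)
    for post_id, tag_name in links:
        tags_by_post[post_id].append(tag_name)
    return [(title, tags_by_post[post_id]) for post_id, title in posts]
```

The total is two queries regardless of how many posts there are, instead of one extra query per post.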
What if the problem is SQL and none of the above helps?
• raw(): Roll your own SQL that can perhaps use stuff specific to the DB, or fancier queries.
• Denormalization: Fewer joins, precomputed data
• NoSQL: Maybe the data you are storing in a relational database doesn’t map well to a relational database.
What if the problem is in the Python?
Common issues:
• Algorithmic issues like n^2 paths that don’t need to be n^2
• Doing extra work like constantly re-evaluating a loop invariant inside a loop
• Using if/else for error control where exceptions will do (again, most problematic inside a loop)
In general: People doing bad stuff inside loops.
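A small sketch of the first and third issues in plain Python. The function names and data are made up for illustration, and whether try/except beats an explicit check depends on how often the exceptional case actually occurs.

```python
def common_items_quadratic(a, b):
    # O(len(a) * len(b)): "x in b" scans the whole list b for every x.
    return [x for x in a if x in b]


def common_items_linear(a, b):
    # Build the set once, outside the loop. Each membership test is then
    # O(1) on average, so the whole thing is O(len(a) + len(b)).
    b_set = set(b)
    return [x for x in a if x in b_set]


def sum_prices_checked(items, prices):
    # if/else as error control: two dict lookups per known item.
    total = 0
    for item in items:
        if item in prices:
            total += prices[item]
    return total


def sum_prices_eafp(items, prices):
    # Exceptions as error control: one lookup per item, and the except
    # branch only costs something when a key is actually missing.
    total = 0
    for item in items:
        try:
            total += prices[item]
        except KeyError:
            pass
    return total
```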
What if you optimized your Python/SQL and you’re still slow?
Cache.
Then cache some more.
Many types of caching
• View cache
• Template fragment cache
• Function level cache (via package like django-cache-utils, django-cache-helper)
• Query cache (django-cache-machine, django-cacheops)
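To illustrate the function-level variety, here is a minimal sketch of a caching decorator with a timeout. It does not reproduce the API of django-cache-utils or django-cache-helper: the `cached` name and its parameters are hypothetical, and a plain dict stands in for Django's cache backend.

```python
import functools
import time


def cached(timeout, store=None, clock=time.monotonic):
    """Cache a function's results per positional arguments for `timeout` seconds."""
    store = {} if store is None else store  # stand-in for a real cache backend

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            # Arguments must be hashable to serve as part of the key.
            key = (func.__module__, func.__qualname__, args)
            now = clock()
            hit = store.get(key)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args)
            store[key] = (now + timeout, value)
            return value

        return wrapper

    return decorator
```

The hard part of every cache type in the list above is invalidation, so pick timeouts and keys with the data's update pattern in mind.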
The End
Questions?
http://github.com/ara818
Like solving complex performance problems?
YCharts is hiring!