Dave Dash

DjangoCon Testing Tutorial

2011-08-10T00:00:00+00:00

If you want to learn all you can about testing anything in your Django app, see my tutorial at DjangoCon. It’s on September 5th and will be 3 hours long. With seven sign-ups so far, it will be very hands-on.

Here’s what I think I will cover, but I may change this depending on what the audience wants:

Testing issues
- ask people to fill out etherpad with issues they’ve run into
- ask someone to rank them in order of complexity
List an outline of topics
- post them on etherpad
- have people +1 them if they are interested
Testing overview
- We started in late 2009 or early 2010
- Our largest project has 2500 tests
- Our next largest has 1100
- We have pretty good coverage
How testing works in Django
- I’m not 100% sure on this
- Test runner sets up a new database
- Test runner finds and runs tests
- Test case runs class setup
- Test case runs each of its tests
  - Loads fixtures
  - Runs setup
  - Runs the test
  - Runs teardown
- Test case runs class teardown
- You get an F if you’re bad and a . if you’re not.
- Now that you know it, you can hack it.
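The hook order in the outline above can be observed with a minimal sketch. It assumes plain `unittest` rather than Django's test runner, so database creation and fixture loading are not shown. The class name `LifecycleTest` and the helper `run_lifecycle` are made up for illustration.

```python
import io
import unittest

calls = []  # records the order in which the hooks fire


class LifecycleTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        calls.append("setUpClass")

    @classmethod
    def tearDownClass(cls):
        calls.append("tearDownClass")

    def setUp(self):
        calls.append("setUp")

    def tearDown(self):
        calls.append("tearDown")

    def test_one(self):
        calls.append("test_one")

    def test_two(self):
        calls.append("test_two")


def run_lifecycle():
    """Run the test case quietly and return the recorded hook order."""
    calls.clear()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LifecycleTest)
    unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return list(calls)
```

Calling `run_lifecycle()` shows class setup and teardown wrapping the whole case, with setup and teardown wrapping each individual test.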
How we’ve hacked testing
- 2500 tests is a lot
- We no longer recreate the database when you run the test suite
- In each test case we just load the fixtures once.
- We rearrange the tests so things with the same fixture set run together
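A rough sketch of the "same fixture set runs together" idea, assuming each test class lists its fixtures in a `fixtures` attribute. The class names and fixture names are hypothetical, and this is not the code we actually run, just the grouping step in isolation.

```python
from itertools import groupby


# Hypothetical test classes; only the fixtures attribute matters here.
class AddonTests:
    fixtures = ["base/addons", "base/users"]


class SearchTests:
    fixtures = ["base/search"]


class UserTests:
    fixtures = ["base/users", "base/addons"]  # same set, different order


class PlainTests:
    pass  # no fixtures at all


def fixture_key(test_class):
    """Order-insensitive key describing a class's fixture set."""
    return tuple(sorted(getattr(test_class, "fixtures", ())))


def group_by_fixtures(test_classes):
    """Return (fixture_key, [classes]) pairs so each set is loaded once."""
    ordered = sorted(test_classes, key=fixture_key)  # sorted() is stable
    return [(key, list(group)) for key, group in groupby(ordered, key=fixture_key)]
```

Once classes sharing a key sit next to each other, a runner only has to load that fixture set when the key changes.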
Testing tools that we use at Mozilla
- nose/django_nose
- nose plugins
  - nicedots
  - progressive
- coverage
  - git + whatchangedpy
Testing everything, no excuses
- 100% Coverage isn’t important
- 80% is nice
- Good coverage on tricky things is important
- Some coverage on everything is important
- External
- If you start depending on APIs, Search or different tools you need to be able to test for them.
- Writing these test cases will take less time than this tutorial
- It will save you so much headache in the future.
- The same headaches you save yourself by writing “normal” tests
- Mock easy things
  - use a decorator on any test/view that might use redis
  - if redis isn’t setup, use the mock client
  - mock client doesn’t support everything, just what I need to get my tests running
    - feel free to extend it if you use it
  - Testing Redis
- Setup/Teardown for complicated tools
  - Good for search and APIs
  - Raise SkipTest (nose) if the developer doesn’t want to run these tests
  - Non realtime tools
    - Testing Sphinx search
    - SetupClass
      - load fixtures
      - run indexer
      - run server
    - Sphinx server now available for all tests in your test case
    - Teardown
      - stop server
  - Real time tools
    - Nicer, data can be added in post_save signals or elsewhere in your app
    - Testing LDAP
      - Setup
        - Remove LDAP files
        - Load an ldif
        - Start slapd
      - Your code can now touch LDAP
    - Testing ElasticSearch
      - We leave ES running all the time.
      - Setup
        - Checks for ES support or SkipTest
        - Deletes index
        - Creates index
      - You can now read/write to ES
      - Teardown
        - Deletes index
- Fixtures
  - Fixture Magic
  - Model Maker
- pitfalls
  - dates
  - using PDB
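
Two ideas from the outline above, the mock-client fallback and skipping tests when an external tool is missing, can be sketched roughly as follows. This is an illustration under assumptions: it uses `unittest.SkipTest` from the standard library instead of nose's SkipTest, and `MockRedis`, `connect_to_redis`, `with_redis`, and `search_is_available` are hypothetical names rather than code from our projects.

```python
import functools
import io
import unittest


class MockRedis:
    """In-memory stand-in that supports only the commands the tests need."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)


def connect_to_redis():
    """Hypothetical hook: return a real client, or None if Redis isn't set up."""
    return None


def with_redis(func):
    """Hand the test a real client when available, otherwise the mock."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        client = connect_to_redis() or MockRedis()
        return func(*args, client=client, **kwargs)
    return wrapper


def search_is_available():
    """Hypothetical check for a search server on the developer's machine."""
    return False


class SearchServerTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        if not search_is_available():
            raise unittest.SkipTest("search server not available")
        # Load fixtures, run the indexer, and start the server here.

    def test_query(self):
        self.fail("only reached when the search server is available")


class CounterTests(unittest.TestCase):
    @with_redis
    def test_set_and_get(self, client):
        client.set("hits", 1)
        self.assertEqual(client.get("hits"), 1)


def run_sketch():
    """Run both test cases quietly and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(SearchServerTests))
    suite.addTests(loader.loadTestsFromTestCase(CounterTests))
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Raising the skip in class setup means the whole test case is skipped in one place, so individual tests don't need their own availability checks.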

Test Driven Confidence

2010-04-20T00:00:00+00:00

If you’re already testing your web applications, you can skip this post.

One of the bugs I am working on for AMO involves porting a small but moderately complicated checkbox from our PHP site and rewriting it for Django.

I decided to look at the existing implementation and found that it did not work correctly at all. This was frustrating, especially since I had verified that my own code worked, and QA had verified it as well.

This is frustrating on many levels. Chances are some minor assumption I made changed and broke this functionality. Discovering regressions is never fun, and fixing them can be long and tedious if you can’t automatically verify that everything is working correctly.

Lucky for me, coming up with tests is easy: you do what you would do to verify that the code satisfies the requirements, and then you code it. Sometimes the tests can take longer than writing the actual code, but ultimately you can ship with confidence. You can be confident that your feature won’t break in the future without immediate notice, and you can be confident that your new code won’t break anything else.

Making our tests run thrice as fast

2010-03-16T00:00:00+00:00

I’ve written a faster version of TransactionTestCase and packaged it with test_utils. It’s MySQL-specific since it relies on SET FOREIGN_KEY_CHECKS=0 to flush the database.

The long story…

Why speed matters

We’re closing in on 300 tests for Zamboni. As of yesterday, running our entire test suite took approximately 5 minutes. If you run tests before a code review, during a code review, and before you push to master, you’ve spent about 15 minutes running tests for a single feature or bug fix. We have about 5 developers, so this cycle happens many times in a work day. In that time many sandwiches can be made and consumed.

Even shortcuts, like running a subset of tests, will only go so far, and ultimately we do want to validate that all our tests pass for any code change.

Testing Sphinx search with `TransactionTestCase`

Django recently sped up testing by running tests in a transaction. However, this means that data never gets committed to the database and therefore external tools, like the Sphinx indexer, will never see any of that data. So we resort to TransactionTestCase which will commit the data.

Unfortunately, TransactionTestCase is painfully slow. The accepted practice is to only use TestCase if you want your tests to be fast. So I decided to complain to one of our new hires, and he and I decided to tinker in MySQL to figure out what was slow. We discovered the following:

delete from [table] is slow
truncate [table] is slow
… unless you SET FOREIGN_KEY_CHECKS=0

So we decided we should do our own tear down. After some tinkering with cProfile I discovered that TransactionTestCase does a (slow) database flush on setup for a test case. This wouldn’t do.

Making our own `TransactionTestCase`

I decided to make our own TransactionTestCase and it would just run SET FOREIGN_KEY_CHECKS=0 and TRUNCATE on each table at tear down time. It would also not do a flush on set up.
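
The tear-down step can be sketched as a small helper. This is a minimal illustration, not the code packaged in test_utils: the function names and the table names in the usage note are made up, `cursor` is assumed to be any DB-API cursor connected to MySQL, and turning the foreign key checks back on afterwards is my assumption about sensible cleanup.

```python
def fast_flush_statements(table_names):
    """Build the SQL needed to empty the given tables quickly on MySQL."""
    statements = ["SET FOREIGN_KEY_CHECKS=0"]
    statements += ["TRUNCATE TABLE `%s`" % name for name in table_names]
    # Assumption: restore the checks so later queries behave normally.
    statements.append("SET FOREIGN_KEY_CHECKS=1")
    return statements


def fast_flush(cursor, table_names):
    """Run the flush statements on a DB-API cursor at tear-down time."""
    for sql in fast_flush_statements(table_names):
        cursor.execute(sql)
```

A test case would call something like `fast_flush(cursor, ["addons", "versions"])` from its tear down instead of relying on a flush during setup.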

We write our tests with the idea that they clean up after themselves, rather than cleaning up after the previous test. This is a requirement for us since django-nose doesn’t reorder tests (nor should it) and a standard django.test.TestCase assumes a clean database.

For a single test, test_sphinx_indexer, django.test.TransactionTestCase took ~30 seconds. Using our new TransactionTestCase, it takes ~4 seconds!

Fast tests are good

We can now run our 275 tests in ~100 seconds versus the ~300 seconds it used to take. Furthermore, skipping our Sphinx tests (which are the only tests that use TransactionTestCase) only saves us ~10 seconds. That’s not a lot of overhead for better coverage.

This took me the better part of a day, but solving it now means we’ll usually run our Sphinx tests rather than skip them. Our QA team will assure you that search is probably the most regression-prone part of our site, so running these tests is vital to quality.

If you need to use TransactionTestCase in MySQL, give ours a try.

django-fixture-magic: Testing issues with real data.

2010-03-05T00:00:00+00:00

I just released Fixture Magic.

When dealing with legacy data, you’ll run into all kinds of edge cases. Perhaps an object won’t display correctly unless it has the right parameters, or won’t display at all if it has null parameters. So when testing Django, it’s nice to actually use non-dummy data.

Luckily Django has a way of pulling real data out of your database using django.core.serializers:

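from django.core.serializers import serialize
from addons.models import Addon
# Fetch one real object from the database.
a = Addon.objects.get(id=3615)
# serialize() expects an iterable of model instances, so wrap the object in a list.
jsonize = lambda objs: serialize("json", objs, indent=4)
jsonize([a])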

This solution runs well in a Django shell and can be lots of fun for the whole family… until things get complicated.

Serializing alone isn’t enough.

Serializing a fixture with foreign keys means you’ll have an un-loadable fixture unless you serialize the dependent fixtures. Even for one or two foreign keys, this can be a pain. For addons.mozilla.org, we have a spidery web of dependencies: Files need a Version which needs an Addon which needs Translations.

Thus begat the dump_object management command. Give it an app, model name and a pk and it will give you not only a serialized JSON of that object, but all the objects that it requires.

Example:

./manage.py dump_object files.file 64874 64876 > my_new_fixture.json

This looks for the File model in the files app and pulls out of the database File instances with pks of 64874 and 64876. It then recursively searches for any required objects.
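
The recursive search can be pictured with a small sketch. It is not the dump_object implementation: it assumes the foreign key relationships are already available as a plain dictionary, and apart from the two File pks every pk below is made up for illustration.

```python
def collect_with_dependencies(start_keys, requires):
    """Return the start objects plus everything they require, dependencies first.

    start_keys: (model_label, pk) tuples to dump.
    requires: dict mapping a (model_label, pk) tuple to the tuples it points at.
    """
    ordered = []
    seen = set()

    def visit(key):
        if key in seen:
            return  # already collected, which also guards against cycles
        seen.add(key)
        for dependency in requires.get(key, ()):
            visit(dependency)
        ordered.append(key)  # appended only after its dependencies

    for key in start_keys:
        visit(key)
    return ordered


# Hypothetical relationships: File -> Version -> Addon -> Translation.
REQUIRES = {
    ("files.file", 64874): [("versions.version", 1)],
    ("files.file", 64876): [("versions.version", 1)],
    ("versions.version", 1): [("addons.addon", 1)],
    ("addons.addon", 1): [("translations.translation", 1)],
}
```

Emitting dependencies before the objects that need them keeps the resulting fixture loadable, and the `seen` set ensures the shared Version and Addon appear only once within a single dump.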

Too much serial

If you create a lot of fixtures, you’ll eventually have overlapping serialized objects. In addons.mozilla.org we have Addons, Versions (which depend on Addons) and AddonCategorys (which depend on Addons and Categorys). If we wanted to serialize a specific Addon along with its dependent Versions and AddonCategorys, it makes sense to start by running dump_object on the related Version and then on the AddonCategory. Both dump_object commands will fetch the Addon in question, resulting in duplicated data.

To combat this we can use merge_fixtures to dedupe our fixtures:

./manage.py dump_object versions.version 64874 > 1.json
./manage.py dump_object categories.addoncategory > 2.json
./manage.py merge_fixtures 1.json 2.json > happy_fixture.json

This should make creating test data slightly less painful. So give it a try.
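
For the curious, deduplicating fixtures boils down to keeping the first copy of each (model, pk) pair. The sketch below shows that idea on Django's JSON fixture format. It is an illustration rather than the command's actual source, and the function name `merge_fixture_texts` and the sample objects are made up.

```python
import json


def merge_fixture_texts(*fixture_texts):
    """Merge JSON fixture strings, keeping the first copy of each (model, pk)."""
    seen = set()
    merged = []
    for text in fixture_texts:
        for obj in json.loads(text):
            key = (obj["model"], obj["pk"])
            if key not in seen:
                seen.add(key)
                merged.append(obj)
    return merged


# Two hypothetical fixtures that both include the same Addon.
FIXTURE_ONE = json.dumps([
    {"model": "addons.addon", "pk": 1, "fields": {}},
    {"model": "versions.version", "pk": 10, "fields": {"addon": 1}},
])
FIXTURE_TWO = json.dumps([
    {"model": "addons.addon", "pk": 1, "fields": {}},
    {"model": "categories.addoncategory", "pk": 5, "fields": {"addon": 1}},
])
```

Merging `FIXTURE_ONE` and `FIXTURE_TWO` yields three objects, with the shared Addon appearing once and still ahead of the objects that reference it.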

Resolving Django dumpdata errors

2009-09-13T00:00:00+00:00

Recently I received this wonderful piece of news when I ran ./manage.py dumpdata for the first time:

Error: Unable to serialize database: User matching query does not exist.

I knew this might not work out since I was dealing with a legacy database, but the resolution is quite simple. First I had to narrow it down to which app was causing this. Naturally I assumed it was one of the two apps I had, either common or restaurant. So I ran: ./manage.py dumpdata common and ./manage.py dumpdata restaurant. The latter had no problem whatsoever.

This made sense, since my common application was the only one that made any reference to a User. By looking in my models.py for that application, I narrowed it down to my Profile object. Sure enough, commenting it out meant I could get my data.

It ended up being a foreign key mismatch between the profile and user tables. Since this is legacy data, this mismatch made sense. A simple SELECT id,userid FROM profile WHERE userid NOT IN (SELECT id FROM auth_user) gave me a list of bad profiles. Removing them allowed me to create my Django fixtures.
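
To see what that query catches, here is a small self-contained sketch using an in-memory SQLite database as a stand-in for the legacy database. The table and column names come from the query above; the rows and the `find_orphaned_profiles` helper are invented for illustration.

```python
import sqlite3


def find_orphaned_profiles(conn):
    """Return (id, userid) for profiles whose userid has no matching auth_user."""
    rows = conn.execute(
        "SELECT id, userid FROM profile "
        "WHERE userid NOT IN (SELECT id FROM auth_user)"
    ).fetchall()
    return sorted(rows)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE profile (id INTEGER PRIMARY KEY, userid INTEGER);
    INSERT INTO auth_user (id, username) VALUES (1, 'alice'), (2, 'bob');
    -- Profile 3 points at user 99, which does not exist.
    INSERT INTO profile (id, userid) VALUES (1, 1), (2, 2), (3, 99);
""")
orphans = find_orphaned_profiles(conn)
```

Here `orphans` contains only the profile pointing at the missing user, which is the kind of row that has to be removed before dumpdata can serialize the app.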