Importing Data into Django Projects
13 December 2008
Importing non-Django data into Django projects can be a painful process, so I started Django Data Import @ Google Code.
Background
The general approach people suggest on IRC is to write a Python script which loops through the records in the old database and creates new Django model instances from that input. My first import script (for about 30 tables and 500,000 total records) was over a thousand lines long, contained lots of repetitive logic, and took many hours to run (and re-run, every time something choked). I vowed never to do it again.
Another approach is to simply dump the old SQL into the new database. This isn't realistic in most cases because the models have changed and because you lose the validation that Django models provide.
How this tool works
This tool links your project's models to Django-generated models of your old data (created with manage.py inspectdb). You establish the relationship like so:
# Assumes importer (this tool), User, OldUsers (generated by inspectdb)
# and blank_date are imported or defined above.
class UserImport(importer.Import):
    # new_field_name = importer.Field('old_field_name')
    username = importer.Field('username')
    email = importer.Field('email')
    first_name = importer.Field('firstname')
    last_name = importer.Field('lastname')
    is_active = importer.Field('emptype')
    # Fields with no counterpart in the old data get a fixed value.
    password = importer.Field(None, value='tmp')
    is_staff = importer.Field(None, value=False)
    is_superuser = importer.Field(None, value=False)
    last_login = importer.Field(None, value=blank_date)
    date_joined = importer.Field(None, value=blank_date)

    def clean_email(self, slave_record, cleaned_data):
        return '%s%s@old_site.com' % (slave_record.firstname.lower(), slave_record.lastname.lower())

    def clean_username(self, slave_record, cleaned_data):
        return slave_record.username.lower()

    def clean_is_active(self, slave_record, cleaned_data):
        if slave_record.emptype != "UNEMP":
            return True
        return False

    class Meta:
        master = User
        slave = OldUsers.objects.all()
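To make the mapping idea concrete, here is a minimal, self-contained sketch of how a declaration like the one above could be applied to a single old record. It is not the tool's actual implementation: the Field class, the apply_mapping helper, the UserMapping class and the plain-dictionary output are all stand-ins invented for illustration, and no Django is involved.

```python
from types import SimpleNamespace


class Field:
    """Hypothetical stand-in for importer.Field (not the tool's real class)."""

    def __init__(self, old_name, value=None):
        self.old_name = old_name  # None means "no source column in the old data"
        self.value = value        # fixed value used when there is no source column


def apply_mapping(mapping, old_record):
    """Build a dict of new-field values from one old record (illustrative only)."""
    cleaned = {}
    # Class attributes keep their definition order, so fields are processed top to bottom.
    for new_name, field in vars(type(mapping)).items():
        if not isinstance(field, Field):
            continue
        if field.old_name is None:
            value = field.value
        else:
            value = getattr(old_record, field.old_name)
        # An optional clean_<field> method overrides the raw value.
        cleaner = getattr(mapping, "clean_" + new_name, None)
        if cleaner is not None:
            value = cleaner(old_record, cleaned)
        cleaned[new_name] = value
    return cleaned


class UserMapping:
    username = Field("username")
    first_name = Field("firstname")
    is_active = Field("emptype")
    is_staff = Field(None, value=False)

    def clean_username(self, old_record, cleaned_data):
        return old_record.username.lower()

    def clean_is_active(self, old_record, cleaned_data):
        return old_record.emptype != "UNEMP"


# A fake "old" row standing in for a record from an inspectdb-generated model.
old = SimpleNamespace(username="JSmith", firstname="Jane", emptype="FULL")
new_values = apply_mapping(UserMapping(), old)
print(new_values)
```

The point of the design is that each field is declared once and any per-field cleanup lives in a small clean_ method, rather than in one long hand-written loop.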
An easier way
The classes in this tool make heavy use of get_or_create, so you can run and re-run your import class without worrying about creating duplicate records. They also provide a framework for handling almost any data situation you run into, whether you need to prepare data before it goes in, populate fields which don't exist in the old model, or establish M2M relationships after your record has been imported.
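Why get_or_create makes re-runs safe can be shown with a small stand-in. The sketch below imitates the (object, created) behaviour of Django's get_or_create using an in-memory list. The FakeManager class, the run_import helper and the sample records are invented for illustration and are not part of Django or of this tool.

```python
class FakeManager:
    """In-memory imitation of a Django manager's get_or_create (illustrative only)."""

    def __init__(self):
        self.rows = []

    def get_or_create(self, defaults=None, **lookup):
        # Return the existing row if every lookup key matches.
        for row in self.rows:
            if all(row.get(key) == value for key, value in lookup.items()):
                return row, False
        # Otherwise create a row from the lookup values plus any defaults.
        row = {**lookup, **(defaults or {})}
        self.rows.append(row)
        return row, True


def run_import(manager, old_records):
    """Import every old record and return how many rows were newly created."""
    created_count = 0
    for record in old_records:
        _, created = manager.get_or_create(
            username=record["username"].lower(),
            defaults={"is_staff": False},
        )
        created_count += created
    return created_count


users = FakeManager()
old_records = [{"username": "JSmith"}, {"username": "ADoe"}]
first_run = run_import(users, old_records)
second_run = run_import(users, old_records)  # re-running creates nothing new
print(first_run, second_run, len(users.rows))
```

Because the lookup is keyed on the cleaned username, a second pass finds the rows from the first pass instead of inserting them again, which is what lets an import be restarted after it chokes partway through.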
