How To Make A Django Base Model
In a Django project I’m working on, I have a couple of similar models. I will show how to refactor them so that the repeated code lives in one place and the project is easier to read.
Let’s assume that I’m working on indexing YouTube videos.
The Basic Model
For the basic model I need a class for a video, a playlist, and a channel. A channel can have many playlists, and each playlist can have many videos.
The first idea was simple:
from django.db import models


class Channel(models.Model):
    channel_id = models.CharField(
        max_length=200, unique=True)
    title = models.CharField(
        max_length=200, blank=True, null=True)
    description = models.TextField(
        blank=True, null=True)
    youtube_url = models.CharField(
        max_length=1234, blank=True, null=True)


class Playlist(models.Model):
    # on_delete is required since Django 2.0; CASCADE was the old default.
    channel = models.ForeignKey(
        Channel, blank=True, null=True,
        on_delete=models.CASCADE)
    playlist_id = models.CharField(
        max_length=200, unique=True)
    title = models.CharField(
        max_length=300, blank=True, null=True)
    description = models.TextField(
        blank=True, null=True)
    youtube_url = models.CharField(
        max_length=1234, blank=True, null=True)


class Video(models.Model):
    playlist = models.ForeignKey(
        Playlist, null=True,
        blank=True, related_name='videos',
        on_delete=models.CASCADE)
    channel = models.ForeignKey(
        Channel, null=True,
        blank=True, related_name='videos',
        on_delete=models.CASCADE)
    video_id = models.CharField(
        max_length=200, unique=False)
    title = models.CharField(
        max_length=200, blank=True, null=True)
    description = models.TextField(
        blank=True, null=True)
    length = models.TimeField(
        blank=True, null=True)
    youtube_url = models.CharField(
        max_length=1234, blank=True, null=True)
This is a very simple model. As you can see, there are three classes: Channel, Playlist, and Video. Each class has a special identifier (channel_id, playlist_id, video_id); these are the original YouTube ids.
Most of the fields can be blank, as they are intended to be filled in by an indexer. So a user can create a new object with only, e.g., channel_id, and the indexer will then fill in all the other fields.
Notice the youtube_url field, which at this point would also have to be filled in by the user; it would be nicer to fill it automatically.
Adding Indexer Fields
As the indexer runs, it should record the indexing state: the status, errors, and timestamps. For that, let’s add some more fields:
from django.db import models

# Create your models here.


class Channel(models.Model):
    # ONLY THE NEW FIELDS ARE SHOWN BELOW
    etag = models.CharField(
        max_length=300, blank=True, null=True)
    reindex = models.BooleanField(default=False)
    is_deleted = models.BooleanField(default=False)
    last_error = models.CharField(
        max_length=300, blank=True, null=True)
    last_indexing = models.DateTimeField(
        blank=True, null=True)
    last_error_indexing = models.DateTimeField(
        blank=True, null=True)
    last_successful_indexing = models.DateTimeField(
        blank=True, null=True)


class Playlist(models.Model):
    # ONLY THE NEW FIELDS ARE SHOWN BELOW
    etag = models.CharField(
        max_length=300, blank=True, null=True)
    reindex = models.BooleanField(default=False)
    is_deleted = models.BooleanField(default=False)
    last_error = models.CharField(
        max_length=300, blank=True, null=True)
    last_indexing = models.DateTimeField(
        blank=True, null=True)
    last_error_indexing = models.DateTimeField(
        blank=True, null=True)
    last_successful_indexing = models.DateTimeField(
        blank=True, null=True)


class Video(models.Model):
    # ONLY THE NEW FIELDS ARE SHOWN BELOW
    etag = models.CharField(
        max_length=300, blank=True, null=True)
    reindex = models.BooleanField(default=False)
    is_deleted = models.BooleanField(default=False)
    last_error = models.CharField(
        max_length=300, blank=True, null=True)
    last_indexing = models.DateTimeField(
        blank=True, null=True)
    last_error_indexing = models.DateTimeField(
        blank=True, null=True)
    last_successful_indexing = models.DateTimeField(
        blank=True, null=True)
As you can see, there is a lot of repeated code. What’s more, some fields have to be set when an indexing error occurs and others when the indexing succeeds, so it would be nice to have some helpers for that.
Add An Informative Object Name
Currently, when an object such as a Video is shown in the admin app, it is converted to a string like "Video object". All the Video objects are shown in the same way. Let’s make it more informative by adding the __str__ function. (I’m using Python 3, so I only need the __str__ function; there is no need to implement the __unicode__ one.)
class Channel(models.Model):
    def __str__(self):
        return self.title if self.title else self.channel_id


class Playlist(models.Model):
    def __str__(self):
        return self.title if self.title else self.playlist_id


class Video(models.Model):
    def __str__(self):
        return self.title if self.title else self.video_id
So there is one function for each class. The logic is simple: use the title if the indexer has already set one, otherwise use the object's YouTube id (channel_id, playlist_id, or video_id).
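The three __str__ functions differ only in the name of the id attribute, so the rule can be written once. The following is a minimal plain-Python sketch, not part of the original project: TitleOrIdMixin and its id_field attribute are hypothetical names, and the Demo classes stand in for Django models so that the example runs without a configured project.

```python
class TitleOrIdMixin:
    """Hypothetical mixin: show the title if set, otherwise the YouTube id."""

    # Name of the attribute holding the YouTube id; each subclass sets it.
    id_field = None

    def __str__(self):
        return self.title if self.title else getattr(self, self.id_field)


class DemoChannel(TitleOrIdMixin):
    id_field = "channel_id"

    def __init__(self, channel_id, title=None):
        self.channel_id = channel_id
        self.title = title


class DemoVideo(TitleOrIdMixin):
    id_field = "video_id"

    def __init__(self, video_id, title=None):
        self.video_id = video_id
        self.title = title


# No title yet, so the id is shown; with a title, the title wins.
print(str(DemoChannel("UC123")))
print(str(DemoVideo("abc", title="My video")))
```

With real models, such a mixin would presumably be listed before models.Model in the class bases, so that its __str__ takes precedence.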
Automatically Fill Youtube URL Field
Let’s make the youtube_url field fill itself automagically. For each class the URL will be different:
class Channel(models.Model):
    def save(self, *args, **kwargs):
        self.youtube_url = "https://www.youtube.com/channel/{}" \
            .format(self.channel_id)
        super(Channel, self).save(*args, **kwargs)


class Playlist(models.Model):
    def save(self, *args, **kwargs):
        # Playlists have their own URL pattern and id field.
        self.youtube_url = "https://www.youtube.com/playlist?list={}" \
            .format(self.playlist_id)
        super(Playlist, self).save(*args, **kwargs)


class Video(models.Model):
    def save(self, *args, **kwargs):
        # Videos use the watch URL and video_id.
        self.youtube_url = "https://www.youtube.com/watch?v={}" \
            .format(self.video_id)
        super(Video, self).save(*args, **kwargs)
I have overridden the save function. It sets the proper URL for an object and then calls the original save function to do the rest of the standard magic.
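Since only the URL pattern and the id differ between the classes, the URL building can be pulled out into one function and checked on its own, without touching the database. This is a small sketch, assuming the usual public YouTube URL shapes for channels, playlists, and videos; build_youtube_url and YOUTUBE_URL_TEMPLATES are hypothetical names that do not appear in the project.

```python
# Assumed public URL patterns for each kind of YouTube object.
YOUTUBE_URL_TEMPLATES = {
    "channel": "https://www.youtube.com/channel/{}",
    "playlist": "https://www.youtube.com/playlist?list={}",
    "video": "https://www.youtube.com/watch?v={}",
}


def build_youtube_url(kind, object_id):
    """Return the public URL for a YouTube object of the given kind."""
    try:
        template = YOUTUBE_URL_TEMPLATES[kind]
    except KeyError:
        raise ValueError("unknown YouTube object kind: {!r}".format(kind))
    return template.format(object_id)


print(build_youtube_url("channel", "UC123"))
print(build_youtube_url("video", "abc"))
```

Each save function could then call such a helper with its own kind and id instead of repeating the format string.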
Choose Objects For Indexer
Let’s assume that the indexer indexes one thing at a time, in one process and one thread (this is a very simplified version). So we need some logic for picking the next object to index. The obvious choice is to take objects that have not been indexed yet or that are marked with reindex. What’s more, I don’t want the indexer to build queries itself. Instead, I will add some helper functions and keep them with the class definitions.
The simplest implementation uses managers, like this:
from django.db.models import Q


class ChannelManager(models.Manager):
    def channel_for_indexing(self):
        return self.filter(
            Q(last_indexing__isnull=True)
            | Q(reindex__exact=True)).first()


class PlaylistManager(models.Manager):
    def playlist_for_indexing(self):
        return self.filter(
            Q(last_indexing__isnull=True)
            | Q(reindex__exact=True)).first()


class VideoManager(models.Manager):
    def video_for_indexing(self):
        return self.filter(
            Q(last_indexing__isnull=True)
            | Q(reindex__exact=True)).first()
I stored them in the same models.py file. I just need to use them in the model classes:
class Channel(models.Model):
    objects = ChannelManager()


class Playlist(models.Model):
    objects = PlaylistManager()


class Video(models.Model):
    objects = VideoManager()
With this mechanism, getting the next Channel for indexing looks like this:
Channel.objects.channel_for_indexing()
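To make the selection rule easy to check without a database, here is a framework-free sketch of the same condition. DemoItem, needs_indexing, and first_for_indexing are hypothetical names used only for illustration. The sketch walks a list in the order given, whereas first() on an unordered queryset orders by primary key.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class DemoItem:
    """Stand-in for a model row; only the fields the rule looks at."""
    name: str
    last_indexing: Optional[datetime] = None
    reindex: bool = False


def needs_indexing(item):
    """Mirror of Q(last_indexing__isnull=True) | Q(reindex__exact=True)."""
    return item.last_indexing is None or item.reindex is True


def first_for_indexing(items):
    """Return the first item that needs indexing, or None if there is none."""
    for item in items:
        if needs_indexing(item):
            return item
    return None


items = [
    DemoItem("done", last_indexing=datetime(2020, 1, 1)),
    DemoItem("flagged", last_indexing=datetime(2020, 1, 1), reindex=True),
    DemoItem("new"),
]
print(first_for_indexing(items).name)
```

An object that was indexed and is not flagged is skipped; a flagged or never-indexed object is picked up.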
Add Helpers For The Indexer
I will also create two helper functions; they will be called when the indexer fails or succeeds. Note that they only set the fields; the caller still has to call save() afterwards.
from datetime import datetime


class Channel(models.Model):
    def indexing_error(self, e):
        self.last_error_indexing = datetime.now()
        self.last_indexing = datetime.now()
        self.last_error = e

    def indexing_ok(self):
        self.last_indexing = datetime.now()
        self.last_successful_indexing = datetime.now()
        self.last_error = None


class Playlist(models.Model):
    def indexing_error(self, e):
        self.last_error_indexing = datetime.now()
        self.last_indexing = datetime.now()
        self.last_error = e

    def indexing_ok(self):
        self.last_indexing = datetime.now()
        self.last_successful_indexing = datetime.now()
        self.last_error = None


class Video(models.Model):
    def indexing_error(self, e):
        self.last_error_indexing = datetime.now()
        self.last_indexing = datetime.now()
        self.last_error = e

    def indexing_ok(self):
        self.last_indexing = datetime.now()
        self.last_successful_indexing = datetime.now()
        self.last_error = None
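The helpers above call datetime.now() twice, so the two timestamps written by one call can differ by a few microseconds. The plain-Python sketch below reads the clock once and also shows how an indexer might use the helpers. IndexingStateMixin, index_one, the now parameter, and the fetch callback are all hypothetical names, and truncating the message to 300 characters is an assumption based on the max_length of last_error.

```python
from datetime import datetime


class IndexingStateMixin:
    """Hypothetical plain-Python version of the two helpers."""

    last_error = None
    last_indexing = None
    last_error_indexing = None
    last_successful_indexing = None

    def indexing_error(self, e, now=None):
        # Read the clock once so that both timestamps are identical.
        now = now or datetime.now()
        self.last_error_indexing = now
        self.last_indexing = now
        # Store the message as text, cut to the assumed field length.
        self.last_error = str(e)[:300]

    def indexing_ok(self, now=None):
        now = now or datetime.now()
        self.last_indexing = now
        self.last_successful_indexing = now
        self.last_error = None


def index_one(obj, fetch):
    """Run fetch(obj) and record the outcome on obj; return True on success."""
    try:
        fetch(obj)
    except Exception as e:
        obj.indexing_error(e)
        return False
    obj.indexing_ok()
    return True


class DemoVideo(IndexingStateMixin):
    pass


def failing_fetch(obj):
    raise RuntimeError("quota exceeded")


video = DemoVideo()
print(index_one(video, failing_fetch), video.last_error)
print(index_one(video, lambda obj: None), video.last_error)
```

With real models, index_one would also need to call save() on the object after recording the outcome.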
The Huge Mess
Now the whole models.py looks like this:
from datetime import datetime

from django.db import models
from django.db.models import Q


class ChannelManager(models.Manager):
    def channel_for_indexing(self):
        return self.filter(
            Q(last_indexing__isnull=True)
            | Q(reindex__exact=True)).first()


class PlaylistManager(models.Manager):
    def playlist_for_indexing(self):
        return self.filter(
            Q(last_indexing__isnull=True)
            | Q(reindex__exact=True)).first()


class VideoManager(models.Manager):
    def video_for_indexing(self):
        return self.filter(
            Q(last_indexing__isnull=True)
            | Q(reindex__exact=True)).first()


class Channel(models.Model):
    channel_id = models.CharField(
        max_length=200, unique=True)
    title = models.CharField(
        max_length=200, blank=True, null=True)
    description = models.TextField(
        blank=True, null=True)
    youtube_url = models.CharField(
        max_length=1234, blank=True, null=True)
    etag = models.CharField(
        max_length=300, blank=True, null=True)
    reindex = models.BooleanField(
        default=False)
    is_deleted = models.BooleanField(
        default=False)
    last_error = models.CharField(
        max_length=300, blank=True, null=True)
    last_indexing = models.DateTimeField(
        blank=True, null=True)
    last_error_indexing = models.DateTimeField(
        blank=True, null=True)
    last_successful_indexing = models.DateTimeField(
        blank=True, null=True)
    objects = ChannelManager()

    def __str__(self):
        return self.title if self.title else self.channel_id

    def save(self, *args, **kwargs):
        self.youtube_url = "https://www.youtube.com/channel/{}" \
            .format(self.channel_id)
        super(Channel, self).save(*args, **kwargs)

    def indexing_error(self, e):
        self.last_error_indexing = datetime.now()
        self.last_indexing = datetime.now()
        self.last_error = e

    def indexing_ok(self):
        self.last_indexing = datetime.now()
        self.last_successful_indexing = datetime.now()
        self.last_error = None


class Playlist(models.Model):
    # on_delete is required since Django 2.0; CASCADE was the old default.
    channel = models.ForeignKey(
        Channel, blank=True, null=True,
        on_delete=models.CASCADE)
    playlist_id = models.CharField(
        max_length=200, unique=True)
    title = models.CharField(
        max_length=300, blank=True, null=True)
    description = models.TextField(
        blank=True, null=True)
    youtube_url = models.CharField(
        max_length=1234, blank=True, null=True)
    etag = models.CharField(
        max_length=300, blank=True, null=True)
    reindex = models.BooleanField(
        default=False)
    is_deleted = models.BooleanField(
        default=False)
    last_error = models.CharField(
        max_length=300, blank=True, null=True)
    last_indexing = models.DateTimeField(
        blank=True, null=True)
    last_error_indexing = models.DateTimeField(
        blank=True, null=True)
    last_successful_indexing = models.DateTimeField(
        blank=True, null=True)
    objects = PlaylistManager()

    def __str__(self):
        return self.title if self.title else self.playlist_id

    def save(self, *args, **kwargs):
        # Playlists have their own URL pattern and id field.
        self.youtube_url = "https://www.youtube.com/playlist?list={}" \
            .format(self.playlist_id)
        super(Playlist, self).save(*args, **kwargs)

    def indexing_error(self, e):
        self.last_error_indexing = datetime.now()
        self.last_indexing = datetime.now()
        self.last_error = e

    def indexing_ok(self):
        self.last_indexing = datetime.now()
        self.last_successful_indexing = datetime.now()
        self.last_error = None


class Video(models.Model):
    # on_delete is required since Django 2.0; CASCADE was the old default.
    playlist = models.ForeignKey(
        Playlist, null=True, blank=True,
        related_name='videos',
        on_delete=models.CASCADE)
    channel = models.ForeignKey(
        Channel, null=True, blank=True,
        related_name='videos',
        on_delete=models.CASCADE)
    video_id = models.CharField(
        max_length=200, unique=False)
    title = models.CharField(
        max_length=200, blank=True, null=True)
    description = models.TextField(
        blank=True, null=True)
    length = models.TimeField(
        blank=True, null=True)
    youtube_url = models.CharField(
        max_length=1234, blank=True, null=True)
    etag = models.CharField(
        max_length=300, blank=True, null=True)
    reindex = models.BooleanField(
        default=False)
    is_deleted = models.BooleanField(
        default=False)
    last_error = models.CharField(
        max_length=300, blank=True, null=True)
    last_indexing = models.DateTimeField(
        blank=True, null=True)
    last_error_indexing = models.DateTimeField(
        blank=True, null=True)
    last_successful_indexing = models.DateTimeField(
        blank=True, null=True)
    objects = VideoManager()

    def __str__(self):
        return self.title if self.title else self.video_id

    def save(self, *args, **kwargs):
        # Videos use the watch URL and video_id.
        self.youtube_url = "https://www.youtube.com/watch?v={}" \
            .format(self.video_id)
        super(Video, self).save(*args, **kwargs)

    def indexing_error(self, e):
        self.last_error_indexing = datetime.now()
        self.last_indexing = datetime.now()
        self.last_error = e

    def indexing_ok(self):
        self.last_indexing = datetime.now()
        self.last_successful_indexing = datetime.now()
        self.last_error = None
As you can see, there is a lot of repeated code.
Let’s Clean It
The first thing to do is to move all the repeated code to a base class. Let’s name it BaseModel. The first simple version looks like this:
class BaseModel(models.Model):
    etag = models.CharField(
        max_length=300, blank=True, null=True)
    reindex = models.BooleanField(
        default=False)
    is_deleted = models.BooleanField(
        default=False)
    last_error = models.CharField(
        max_length=300, blank=True, null=True)
    last_indexing = models.DateTimeField(
        blank=True, null=True)
    last_error_indexing = models.DateTimeField(
        blank=True, null=True)
    last_successful_indexing = models.DateTimeField(
        blank=True, null=True)
    youtube_url = models.CharField(
        max_length=1234,
        blank=True, null=True)

    def indexing_error(self, e):
        self.last_error_indexing = datetime.now()
        self.last_indexing = datetime.now()
        self.last_error = e

    def indexing_ok(self):
        self.last_indexing = datetime.now()
        self.last_successful_indexing = datetime.now()
        self.last_error = None
There are just two things left: use it as the base class of the models,
class Channel(BaseModel):
    # and remove the fields moved to BaseModel
    ...


class Playlist(BaseModel):
    # and remove the fields moved to BaseModel
    ...


class Video(BaseModel):
    # and remove the fields moved to BaseModel
    ...
and make it abstract, so that the migrations mechanism will not try to create a table for it:
class BaseModel(models.Model):
    class Meta:
        abstract = True
The Final Code
Here is the code after all the changes. The redundant code has been moved to the BaseModel class, and the other models are much cleaner.
from datetime import datetime

from django.db import models
from django.db.models import Q


class ChannelManager(models.Manager):
    def channel_for_indexing(self):
        return self.filter(
            Q(last_indexing__isnull=True)
            | Q(reindex__exact=True)).first()


class PlaylistManager(models.Manager):
    def playlist_for_indexing(self):
        return self.filter(
            Q(last_indexing__isnull=True)
            | Q(reindex__exact=True)).first()


class VideoManager(models.Manager):
    def video_for_indexing(self):
        return self.filter(
            Q(last_indexing__isnull=True)
            | Q(reindex__exact=True)).first()


class BaseModel(models.Model):
    etag = models.CharField(max_length=300, blank=True, null=True)
    reindex = models.BooleanField(default=False)
    is_deleted = models.BooleanField(default=False)
    last_error = models.CharField(max_length=300,
                                  blank=True, null=True)
    last_indexing = models.DateTimeField(blank=True, null=True)
    last_error_indexing = models.DateTimeField(blank=True, null=True)
    last_successful_indexing = models.DateTimeField(blank=True, null=True)
    youtube_url = models.CharField(max_length=1234,
                                   blank=True, null=True)

    def indexing_error(self, e):
        self.last_error_indexing = datetime.now()
        self.last_indexing = datetime.now()
        self.last_error = e

    def indexing_ok(self):
        self.last_indexing = datetime.now()
        self.last_successful_indexing = datetime.now()
        self.last_error = None

    class Meta:
        abstract = True
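

class Channel(BaseModel):
    channel_id = models.CharField(max_length=200, unique=True)
    title = models.CharField(max_length=200, blank=True, null=True)
    description = models.TextField(blank=True, null=True)
    objects = ChannelManager()

    def __str__(self):
        return self.title if self.title else self.channel_id

    # The save override from the earlier section, kept per model.
    def save(self, *args, **kwargs):
        self.youtube_url = "https://www.youtube.com/channel/{}" \
            .format(self.channel_id)
        super(Channel, self).save(*args, **kwargs)


class Playlist(BaseModel):
    # on_delete is required since Django 2.0; CASCADE was the old default.
    channel = models.ForeignKey(Channel, blank=True, null=True,
                                on_delete=models.CASCADE)
    playlist_id = models.CharField(max_length=200, unique=True)
    title = models.CharField(max_length=300, blank=True, null=True)
    description = models.TextField(blank=True, null=True)
    objects = PlaylistManager()

    def __str__(self):
        return self.title if self.title else self.playlist_id

    def save(self, *args, **kwargs):
        self.youtube_url = "https://www.youtube.com/playlist?list={}" \
            .format(self.playlist_id)
        super(Playlist, self).save(*args, **kwargs)


class Video(BaseModel):
    playlist = models.ForeignKey(Playlist, null=True, blank=True,
                                 related_name='videos',
                                 on_delete=models.CASCADE)
    channel = models.ForeignKey(Channel, null=True, blank=True,
                                related_name='videos',
                                on_delete=models.CASCADE)
    video_id = models.CharField(max_length=200, unique=False)
    title = models.CharField(max_length=200, blank=True, null=True)
    description = models.TextField(blank=True, null=True)
    length = models.TimeField(blank=True, null=True)
    objects = VideoManager()

    def __str__(self):
        return self.title if self.title else self.video_id

    def save(self, *args, **kwargs):
        self.youtube_url = "https://www.youtube.com/watch?v={}" \
            .format(self.video_id)
        super(Video, self).save(*args, **kwargs)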
The Future Work
This part was easy. The harder part was refactoring the admin configuration so that the common admin field declarations live in one place. I will describe that in a future post.