When I grow up I want to be a Database Administrator (said no-one ever)

Instead of just sharing the slides from the keynote talk that I gave at DjangoCon Europe 2024 in Vigo this month, I'm trying a new approach, inspired by Simon Willison's annotated presentations. This post contains the slides together with the transcript, links etc. so it should be much more useful.

I'll add a link to the video as soon as it's available.

This was my third time speaking at DjangoCon Europe, and my first ever keynote. I hope to be able to attend many more because, as I said at the conference, spending time with the Django community is way up there on my list of favourite things. I was so excited to be asked to speak to a room full of Django developers about my all-time favourite topic - databases in general, and PostgreSQL in particular.

Title slide

The talk is aimed at developers who are somewhat forced into looking after databases, whether they like it or not. It looks at some of the things that a developer actually needs to know about databases in their role as a developer and tries to encourage them to love databases just a little bit.

Head shot of Karen in bike helmet and sunglasses

I live in a small village in the French Alps, and when I'm not playing with or talking about databases I love getting out on one of my bikes - road, gravel or mountain bike depending on how I’m feeling.

As well as being a Senior Solutions Architect, which I came to from a DBA (database administrator) background, I’m on the PostgreSQL Europe Board of directors, I’m leading the (very) recently established PostgreSQL Europe diversity task force, and I write and present talks (about databases) at PostgreSQL and Developer conferences.

timeline of Karen's career to date: 1998 Junior DBA, 2010 Senior DBA, 2010 Database Expert, 2020 Senior Database Consultant, 2022 Senior Solutions Architect

When asked “What do you want to be when you grow up?” I doubt any kid ever said “A database administrator”. In fact, I asked the audience "Who, in this room, said they wanted to be a DBA when they grew up?" and unsurprisingly not a single hand went up.

I’ve spent my whole career working with databases, as illustrated by this diagram of my roles over the last 25 years. I’ve had such "wide-ranging" job titles as Junior Database Administrator”, “Senior Database Administrator”, “Database Expert” and “Senior Database Consultant”.

My current job title is the only one I’ve ever had that doesn’t contain the word “database”, although I still work exclusively with databases, and even I didn’t want to be a DBA when I grew up!

picture of a toy dinosaur

Although I think at one point I wanted to be a dinosaur, so maybe I didn’t know too much about career options.

3 colourful bar charts showing lists of jobs children want to do when they grow up

Anyway, just in case I was wrong, and kids do in fact say they want to be database administrators when they grow up, I thought should do a bit of actual research. Fortunately, lots of people have already done the hard work for me; they’ve been out and asked a bunch of children what they want to be when they grow up, and they’ve asked adults what they wanted to be back when they were growing up [1] [2] [3].

And you’ll actually be surprised to hear that database administrator

...

didn’t feature in any of the lists.

photo of a soft toy doy with stethoscope and other doctor's equipment

Vet or Doctor was a popular choice,

photo of a child playing football

as was professional footballer,

coloured doodles of astonauts and other space-related objects

astronaut, obviously,

photo of a young woman engineer working at a computer and surrounded by electronic equipment

and even engineer featured in most of the lists,

poorly drawn picture of a woman sitting on top of a database, working on a laptop and with pages of data floating around her head

but not database administrator.

I even had to inflict my drawing "skills" on you to be able to include a picture of a DBA because it’s not so easy to find useful images.

So I have to concede that maybe (just maybe) not everyone is as passionate about databases as I am.

meme of a cat saying "please don't interrupt me while I'm ignoring you"

Most of the developers I speak to just want the database to quietly do its thing in the background so they can concentrate on developing their application, which doesn’t seem like an unreasonable request.

But the world of databases is changing. The traditional DBA role is becoming less common, and developers are often expected to manage their own databases. Even if you don’t have to actually look after your own database, it often forms the backbone of your application, so it’s helpful to know how to interact with it.

So I wanted to talk about how you, as a developer, can navigate this uncomfortable reality, and what you actually need to know about databases.

Line drawing representing a PostgreSQL database in a heart

Obviously, because I love databases so much, I think everyone should want to know everything about databases.

If you don’t want to talk constantly about Boyce-Codd normal form, then why not?

If your party trick isn’t to recite the ACID properties of transactions, and the different transaction isolation modes available in each of the database management systems, and discuss how your chosen RDBMS either conforms to or differs from the SQL standard, then are you even really living?

Don't hesitate to get in touch if you need other ideas for sparkling conversation starters for your next dinner party.

Photo of Daniele Procida at PyCon Italia 2024 giving his keynote talk "The Attentive Programmer"

I was in the audience for Daniele Procida’s “The Attentive Programmer” at PyCon Italia in May, and he also gave the talk at DjangoCon Europe. If you didn't manage to see it, I recommend catching the recording.

Daniele talked about the idea of writing a love program. I love the ode to his favourite camera that he wrote in Python, and it got me think about how I would express love in database terms.

Line diagram showing a simple entity relationship diagram containing Department and Employee entities with crows-foot notation depicting one to many relationship between them. Also a key showing how different relationships can be modelled in crows-foot notation

My chosen love language would be the entity relationship diagram. I think it’s a beautiful way to represent almost any real-world objects in terms of their attributes and the relationships between them.

I love the fact that there are rules, and syntax, and conventions. I like the simplicity of the crows-foot notation, and the fact that so much information can be conveyed by just a few little lines.

Just from the tiny snippet above, we know that one department can have one or more employees, and that an employee belongs to one and only one department.

I like the way the colouring and spacing and organisation of the elements in the diagram can help even a non-technical reader to see how the different concepts and pieces of data fit together.

I’ve been known to lose myself and spend far too long rearranging an ER diagram to look balanced, pretty, untangled, and ordered, and to make sure it fits on to one sheet of paper (albeit sometimes a very large sheet of paper).

Simple entity relationship diagram, partially depicting the relationships between the different elements of a PostgreSQL cluster

What better thing to use my love language for than to describe one of my favourite things – a Postgres cluster!

The process always seems quite simple to start with:

  • I modelled my Postgres cluster, which comprises one or more databases.
  • I can define one or more roles or users for my cluster.
  • I can create one or more tablespaces.
  • Each of my databases can contain one or more schemas,
  • and each of my tables will belong to one of those schemas.
  • The tables can also belong to one of the tablespaces.
  • An index can be created on one or more of my table columns,
  • but equally, a column could belong to one or more indices.


I could model the permissions granted to the roles, and I could model the fact that certain databases have access to certain tablespaces. I need to think about how changes over time will be modelled; do I want to record the history?

I could add in other object types like views, functions, stored procedures, extensions...

I’ve not even started to add in the column names or the primary and foreign key columns.

And I had to force myself to stop there, because the goal of the excersie wasn’t to come up with a realistic model of my database cluster, but to show how I could lovingly "draw" my databases in this way!

Different definitions of a database administrator with the following key words and phrases highlighted: "create, organise and look after computer systems that store data", "maintaining, securing, and operating databases", "use specialist software to organise and maintain a secure database", "managing computer systems that store and organise data", "securing data and identifying areas for improvement with the infrastructure".

To go back to the actual subject of the talk, What even is a DBA?

Well, according to various different definitions from careers services, database management system vendors and job sites, there seems to be a general consensus that a DBA is someone who uses specialist software to manage and secure computer systems that store data.

question mark

Which tells us pretty much nothing at all about what a DBA actually does all day

Text: DBA Responsibilities 1/2: Backup and Recovery, Security, Monitoring, DB Design/Data Modelling, Support/Troubleshooting, DB Software install/upgrade

Text: DBA Responsibilities 2/2: Database Expertise, Performance Tuning, Capacity Planning, Database Creation, Database maintenance, Data Protection / GDPR

So, what does a DBA actually do?

I compiled a list of DBA responsibilities based on various definitions and sample job adverts. It turns out to be a pretty long list.

Apparently, a DBA is expected to do some or all of the following things:

  • Design, implement and maintain backup and recovery procedures
  • Design and implement security procedures, manage database access
  • Monitor the database’s availability, performance, security, space etc
  • Logical and physical data modelling
  • 24/7 support and troubleshooting
  • Plan, perform and test database software install and upgrades
  • Provide database expertise, advice and support to other teams
  • Fix performance problems, generally improve database performance
  • Plan for future database size/resource needs
  • Create databases
  • Design and implement database maintenance procedures
  • Make sure data protection/GDPR rules are obeyed


Meme: cartoon dog drinking tea with flames around him saying "this is fine"

No problem!

bar chart showing most frequently requested skills for DBA roles in UK for 6 months to January 2023. Social Skills in 16th position is circled in red

And then there’s the list of skills that a DBA is expected to have.

I created this chart last year, based on data for the UK from IT Jobs Watch. It ranks the required skills and capabilities for DBA job roles based on how often they appeared in DBA job adverts.

It boils down to:

  • SQL skills, procedural languages, data integration & ORMs
  • Knowledge of one or more database management systems
    (SQL Server, Oracle, Postgres)
  • Knowledge of various operating systems
    including Linux, Windows and various cloud environments
  • Performance Tuning
  • Database Migration
  • Disaster Recovery
  • Social Skills
    I’m not sure what they’re trying to say about DBAs there, and when I did a quick check this week to make sure the list was still up to date, social skills ranked even higher in the list: in 5th place instead of down in 16th!
  • High Availability
  • Replication
  • Clustering
  • DevOps and automation methodologies and tools


sketch of a woman whose brain appears to be exploding

That’s a lot to know and do!

Venn diagram depicting "DBA" and containing overlapping "Production DBA" and "Development DBA" ovals

Oh, and to make things even less clear, there are various different “flavours” of DBA.

A traditional split is between the Production DBA and Development DBA, where the development DBA works closely with developers, and focuses on building and maintaining a database environment that supports the application development. Then usually throws things over the fence to the production DBA, who makes sure the databases in your production environment are kept up and running, focusing on things like availability, performance and security.

Maybe you are the development DBA in your organisation.

2 Venn diagrams, each depicting DBA, the first containing overlapping "Production DBA" and "Development DBA" ovals, the second containing overlapping "Application DBA" and "System DBA" ovals

In some organisations, the split is between application DBA (concerned with the logical, application-related aspects of the database) and system DBA (responsible for the underlying software and physical infrastructure).

In this situation, you’re probably the application DBA.

Venn Diagram depicting "DBA Tasks" and containing multiple overlapping ovals with labelled with different types of DBA

As well as production DBAs and development DBAs, application DBAs and System DBAs, you’ll hear of

  • Datawarehouse DBAs
  • Cloud DBAs
  • Database architects or even
  • Replication DBAs or
  • Backup and recovery DBAs
    who just look after one aspect of database admin.


These different types of DBA role will differ from one organisation to another, or even from one team to the next. The division of responsibilities isn’t necessarily clear-cut, and the roles often overlap.

Picture of yellow diamond sign saying "don't panic"

Fortunately, you really don’t need to be an expert in all of that.

Text "Who does the work" and photo and slide from Katie McLauchlin's "What should you have to worry about" talk from DjangoCon Europe 2022, Porto

I really enjoyed Katie McLaughlin’s keynote “What should you have to worry about” from DjangoCon Europe 2022 in Porto. Katie pointed out that although you absolutely can run Django by creating a server, installing Django and Postgres and nginx and making it available to the world, there are then all sorts of things to worry about, including:

  • Server and network availability.
  • OS, Django, nginx and Postgres updates.
  • Database migrations.
  • Database administration.
  • Backups.


"Are you a Django Developer or are you a combination Django developer and database administrator and systems administrator and network administrator and full stack engineer and grossly, grossly underpaid?” Katie McLaughlin: What should you have to worry about,  DjangoCon Europe 2022 Porto

and if you’re the one who's worrying about these things, Katie wondered

"Are you a Django Developer, or are you a combination Django developer and database administrator and systems administrator and network administrator and full stack engineer and grossly, grossly underpaid?"

You don’t need to be a database expert

You absolutely don’t need to know everything about databases. I certainly don’t know everything about app development, and even that statement makes it sounds as though I know much more than I do.

I know just enough to be able to help developers out with their database-related questions. Likewise, you need to know just enough about databases to help you out with your app development.

Bullet list: Set Memory & Performance Parameters Schedule (and test) Backups Implement High Availability Configure Connections Put in place Logging, Monitoring & Alerting

This is the agenda from a talk I gave called “How to keep your database Happy” at PyCon UK last September. My top 5 tips for things you can put in place, without too much effort, to make sure you have a robust, performant database environment

  1. Check that a few key configuration parameters are set correctly for your environment and workload
  2. Make sure you take regular backups of your database (and test your recovery process)
  3. Put a high availability architecture in place
  4. Make sure the right users/applications can connect to and interact with your database
  5. Make sure you know what's going on and can react quickly if something goes wrong


I gave about a 3-minute overview of how to do each of those things. Funnily enough, that wasn't enough time to go into the details of everything you need to know to look after your database, i.e. everything I've spent the last 25 years learning.

And that was just the things that relate to the production / system side of things. We didn't even start to think about the application side of things: database design, queries design and performance tuning, connection management, stored procedures, transaction management and more.

But you don’t need to know these things in detail. The main thing is to know that these ideas and concepts exist, and to know where to look and who to ask for more information if and when you need it.

I always include links to the relevant documentation or other useful resources when I do that type of presentation, because I know I can’t cover everything in a 30-minute talk, and that even if I did, people wouldn’t remember most of what I said.

If, when you have to implement something, or when you have a problem, you think

"Oh, I vaguely remember Karen saying something about that.”

and you can go and read up on it, then I feel as though I’ve done something useful.

Meme: Yoda saying "Delegate you must or crazy you will go"

If you don’t have the time, skills, inclination or infrastructure available to do all of the things you need to do to look after a database, there are various options, including managed services, that will do it all for you.

The big cloud providers provide managed database services, the company I work for (Crunchy Data) has a managed PostgreSQL service called Crunchy Bridge, and there are plenty of others out there.

And even if do you have the skills and the inclination, it might just be a better use of your time and expertise as an application developer to outsource some of these tasks.

Developer Responsibilities 1/2 Backup and recovery, DB Design/Data modelling and Troubleshooting circled in red

So, going back to the list of DBA responsibilities, which of those tasks do you actually have to know about or be able to do yourself?

  • It's definitely useful to have some knowledge of how database backup and recovery works. You might need to do it yourself for your development environments, you may want to make sure that sensible backup processes are in place for your chosen managed database service, and you definitely want to test out recovering a database in case something goes very wrong.
  • You may not be the one who designs and enforces database security policies, but it's helpful to have in the back of your mind some of the best practices - for example principle of least privilege & separation of duties - when deciding who should own and manipulate database objects, which users should connect to the database and how etc.
  • You probably won't be responsible for putting database monitoring in place, but at least having an idea of what some of the metrics shown on your database dashboard mean can be a valuable debugging tool.
  • Data modelling is a must from my point of view. Having that vision of what the data in your system looks like and how it fits together can be really helpful when looking at the best ways to write code that accesses and updates that data.
  • Hopefully, you won't have to do 24/7 support, but being able to do some basic database troubleshooting could save you a lot of time - debugging connection errors, figuring out why the database went down, understanding why (and resolving) performance of a certain query suddenly dropped off a cliff.
  • Hopefully, the software install and upgrades can be left to someone else, especially if you're using a managed service.


Developer Responsibilities 2/2 with Performance tuning and Data Protection/GDPR circled in red
  • Although being a "database expert" is unlikely to be part of your remit, it will probably give you a boost if you manage to understand some databasey-stuff and explain it to the rest of your team
  • It's definitely important to be able to work out why a particular query or part of your application is performing badly and be able to make improvements to performance. Even better is learning how to write code that performs well against the database in the first place.
  • Capacity planning may not be in your remit, but it's good to be aware of the resources being used for the database, and how they're expected to grow so you can design your application to be able to cope.
  • >A lot of organisations are putting automated processes in place to create database clusters, and individual databases. If you're using a managed service, there's likely an API for that. Even if that's the case, it's helpful to have a basic understanding of some of the most common parameters to make sure they're set to appropriate values for your use case when the database is created.
  • Hopefully, "someone else" will deal with general database maintenance tasks, especially if you're using a managed service. Again, it's helpful to be familiar with the basics of things like autovacuum, rebuilding indexes, refreshing materialized views, keeping statistics up to date etc. so you're aware of the impact that the processes could have on your application (whether in terms of negative impact of not keeping on top of it, or the potential impact of running certain processes).
  • It can be a real pain, but it's everyone's responsibility to make sure data protection/GDPR rules are obeyed. This one's not a database administration specific task, but it's tightly related to the database.


Text: What skills does a DBA need? Followed by a list: SQL, procedural languages, Database management systems, Operating systems,Performance Tuning, Database Migration, Disaster Recovery, Social Skills, High Availability, Replication, Clustering, DevOps, Automation

Going back to the list of DBA skills and capabilities; which of them to you actually need?

Text: What DB skills does a developer need? followed by a list of skills, replicated in the body of the post
  • I know, I know, everyone uses ORMs. But if you want to understand what the ORM’s doing, and potentially why a query seems to be misbehaving, there’s no better way than to understand and to be able to write your own SQL queries.
  • You don’t need to know about lots of different database technologies - just the one that you’re actually using. Obviously, I’m going to recommend PostgreSQL for pretty much any situation, but you might have your reasons for using something else.
  • I already touched on performance tuning and disaster recovery.
  • Having spent time with the Django community, I’d say you’re generally fine on the social skills front!
  • And you probably already know at least as much about DevOps and Automation than the average DBA.


Text: What do I want you to know? followed by list of talk titles, replicated in the body of the post

You can get an idea of some of the things that I think it’s useful for you to know by looking at the titles of some of the talks that I've written for developer events:

  • How to keep your database happy (as mentioned already)
  • How to tune PostgreSQL to work even better talks about some of the most useful PostgreSQL configuration parameters and how to set them to the right values for your use-case.
  • Tuning your database for Analytics looks at the implications of running a mixed workload on your database, and how you can minimise the performance impact of that.
  • Everything you wanted to know about databases but were too afraid to ask your DBA answers all sorts of questions from "what's a database instance?" and "what's the difference between a view and a materialized view?" to "How do I read a query execution plan?".
  • Database troubleshooting for developers explains how to recognise and fix the most common database issues you’ll come across.
  • How Postgres is misused and abused in the wild was written for the people who develop PostgreSQL itself. The aim was to encourage reflection about what we as a project can do to make the process of learning about and using Postgres as straightforward as possible.


Where can you find more information? List to resources, duplicated in the body of the post


Outline wording saying "DBA: or how I learned to stop worrying and love the database"

Hopefully, once you've learnt about some of those things, you'll be able to proudly talk about "how I learned to stop worrying and love the database".

thank you slide with QR code to slides and contact details

I've included a link to the non-annotated, pdf version of the slides, and some contact details so you can get in touch with me, or follow me for more database chat.

Mastodon: @karenhjex@mastodon.online

Twitter/X: @karenhjex

References:

[1] fatherly.com: What kids want to be when they grow up

[2] statista.com: Childhood aspirations in China/US/UK

[3] Moneypeny.com: What did you want to be?

[4] reed.co.uk: What your kids really want to be when they grow up

[5] itjobswatch.co.uk: DBA roles