Managing OS users with Ansible across multiple environments

Ivan Tuzhilkin
Published in Level Up Coding
May 10, 2021

I’ve been using Ansible to manage OS users for the last five years or so. Ansible’s built-in user module is quite handy and easy to use, and there are many ready-to-use roles on Ansible Galaxy and GitHub with mostly the same functionality. But Ansible itself implies an imperative way of doing things. Creating a user is an easy task, but managing users across multiple environments with hundreds or thousands of servers can be tricky. Tools like Puppet or Salt may sound like a better idea for managing users in a huge enterprise with a complex hierarchy. But maybe that is not necessary, and there is a way to adapt our roles for large-scale environments.

How it starts

Let’s brush up on how to manage users in a small start-up company. I’ll describe a hypothetical case that is not directly related to any company I’ve worked for. Typically there is a list of users with a few groups; in a very common case these are users with full administrative access (admins) and users with basic system access (developers). Additionally, elevated privileges are managed through sudoers, but I don’t want to touch on this topic yet.

An example users list could look like this:

users:
  - name: 'John D'
    username: 'john'
    groups:
      - 'admin'
  - name: 'Jane D'
    username: 'jane'
    groups:
      - 'developers'

Okay, we have a list of users and a list of hosts (inventory) with a dozen servers. All we need to do is run the playbook against our inventory, and that's it. We don’t care too much about security, observability, etc. Easy.

ansible-playbook -i inventory playbooks/users.yml
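
For reference, here is a minimal sketch of what playbooks/users.yml could contain at this stage. It assumes the users list above is loaded from group_vars; the group-creation task, the become setting and the comment field are illustrative assumptions, not part of the original setup.

```yaml
# playbooks/users.yml (hypothetical contents, for illustration only)
- name: "Manage OS users"
  hosts: all
  become: true
  tasks:
    # Assumption: groups are created first so that user creation does not fail
    - name: "Ensure groups exist"
      group:
        name: "{{ item }}"
        state: present
      loop: "{{ users | map(attribute='groups') | flatten | unique }}"

    - name: "Ensure user accounts exist"
      user:
        name: "{{ item.username }}"
        comment: "{{ item.name }}"
        groups: "{{ item.groups }}"
        append: true
        state: present
      loop: "{{ users }}"
```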

On a daily basis our operations flow looks like this:
Adding a new server: update inventory, run playbook.
Adding new users: update group_vars/vars, run playbook.
Deleting a user account: update group_vars/vars, run playbook.
Easy, no complexity, no issues.

As the company grows, we get a bunch of environments/projects and an additional department, such as analytics.

users:
  - name: 'John D'
    username: 'john'
    groups:
      - 'admin'
  - name: 'Dave K'
    username: 'dave'
    groups:
      - 'admin'
  - name: 'Bob O'
    username: 'bob'
    groups:
      - 'admin'
  - name: 'Jane D'
    username: 'jane'
    groups:
      - 'developers'
  - name: 'Mark S'
    username: 'mark'
    groups:
      - 'developers'
  - name: 'Sven J'
    username: 'sven'
    groups:
      - 'developers'
  - name: 'Alice C'
    username: 'alice'
    groups:
      - 'analytics'

Now we do almost the same:
Adding a new server: update inventory, run playbook.
Adding new users: update group_vars/vars, say three magic words, run the playbook, say one swear word, run the playbook again. Like this:

ansible-playbook -i hosts playbooks/users.yml -l dev -e '@dev.yaml'
ansible-playbook -i hosts playbooks/users.yml -l analytics -e '@analytics.yaml'
ansible-playbook -i hosts playbooks/users.yml -l prod

Deleting a user account: update group_vars/vars, …, run playbook again.

Technically, there are many ways to accomplish this task. Whichever one we pick, it will involve a few steps, such as managing the list of users and probably several additional lists.

Not that complex: a few more steps, and no issues if the process is well documented.

The next level

At some point the company reaches a new level: the staff grows to more than a hundred engineers in a dozen departments. Now it is time for the team to either improve the existing tooling to meet the growth expectations or adopt some well-known enterprise-level software. However, there is still no need to give every user access to the servers, so the list is still not that long, it is auditable, and it makes sense to stick with the IaC paradigm.

What makes sense to achieve at this stage:

  • Keep it simple and maintainable: one shared users list for all the environments/sites/etc;
  • Visible process: one job that runs against all the servers and shows when and how it was done.

If the team wants to apply changes in a more declarative way, without extra steps, it is essential to define where users are created and how. Let’s say the team wants to create admin users on all the machines, developer accounts on dev servers only, analytics accounts on data servers, and so on. For this we can add a new parameter to the user, target_hosts, which is a list of the host groups the user belongs to.

It may look like this:

users:
  - name: 'John D'
    username: 'john'
    groups:
      - 'admin'
    # We do not declare target_hosts for admins because we want them on all hosts
  ...
  - name: 'Jane D'
    username: 'jane'
    groups:
      - 'developers'
    # target_hosts lists the Ansible inventory host groups where the user should be added
    target_hosts:
      - dev
  ...
  - name: 'Alice C'
    username: 'alice'
    groups:
      - 'analytics'
    target_hosts:
      - data
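
The names under target_hosts refer to inventory groups. A minimal inventory sketch in Ansible's YAML inventory format could look like the following; the file name and all hostnames are hypothetical, and only the group names dev and data come from the list above.

```yaml
# inventory/hosts.yml (hypothetical file name and hostnames)
all:
  children:
    dev:
      hosts:
        dev-app-01.example.com:
        dev-app-02.example.com:
    data:
      hosts:
        data-db-01.example.com:
    prod:
      hosts:
        prod-app-01.example.com:
```

With such an inventory, jane would be targeted at the two dev hosts, alice at the data host, and john at every host.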

That’s it, it looks simple. But how do we achieve that?

Maybe this is not obvious, but all we need to do here is add an extra step before the set of tasks in a play and a special conditional.

- name: "Determine target hosts"
  set_fact:
    do_run: True
  with_subelements:
    - "{{ users }}"
    - target_hosts
    - skip_missing: True
  when:
    - item.1 in group_names

Here we iterate over the target_hosts lists and check whether at least one of the groups is in group_names, an Ansible special variable that contains the list of groups the current host belongs to. If the condition is true, we set a fact (variable) do_run to True.

Now we do an extra check inside the task:
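
If it is unclear what group_names holds on a given machine, a throwaway debug task can print it. This is only a troubleshooting sketch and is not needed for the role to work.

```yaml
# Prints the inventory groups of the host the task is running on
- name: "Show the groups of the current host"
  debug:
    var: group_names
```

On a host that belongs to the dev inventory group, the printed list would contain dev.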

- name: "Manage user accounts"
  user:
    name: "{{ item.username }}"
    ...
    state: "{{ item.state }}"
  loop: "{{ users }}"
  when:
    - users is iterable
    - do_run is defined or item.target_hosts is not defined

Here we add an extra check: whether do_run is defined or target_hosts is not defined. We do not define target_hosts for admins, to make sure that their accounts exist on all the machines, and we declare target_hosts for developers and other account types when we need such accounts only on a certain set of machines. Also note that the state argument is required: we don’t want to maintain additional lists like users_retired, so we explicitly declare the state as present or absent.
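
One thing to keep in mind: do_run is a host-level fact, so on a host where it has been set the condition holds for every item in the loop, including users whose target_hosts point at other groups. If the check should be made per user, a possible variant is to compare each user's target_hosts with group_names directly. The sketch below is an assumption about how that could be written, not the code from the role; other user parameters are omitted, and the default for state is added only so that the example works with entries that have no state key.

```yaml
- name: "Manage user accounts (per-user target check)"
  user:
    name: "{{ item.username }}"
    # Assumption: fall back to 'present' when an entry has no state key
    state: "{{ item.state | default('present') }}"
  loop: "{{ users }}"
  # Run for users without target_hosts, or for users whose target_hosts
  # share at least one group with the current host
  when: >-
    item.target_hosts is not defined
    or (item.target_hosts | intersect(group_names) | length > 0)
```

With this condition, alice would be handled only on hosts in the data group, while john, who has no target_hosts, would still be handled everywhere.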

So now we can control where users are created from one list and by running only one automated job. A target_hosts group could also be a tag in a dynamic cloud inventory (AWS/GCP/WhateverCloud), an availability zone, a region, or a project.

Please check out my users role on GitHub for the details. I hope it is useful. Thanks for reading!
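
As an illustration of the dynamic inventory case, the amazon.aws.aws_ec2 inventory plugin can build groups from instance tags through keyed_groups. The sketch below assumes instances carry a tag named Environment with values such as dev or data; the tag name, the region and the file name are assumptions chosen for the example.

```yaml
# inventory/aws_ec2.yml (a sketch; tag name and region are assumptions)
plugin: amazon.aws.aws_ec2
regions:
  - eu-west-1
keyed_groups:
  # One group per value of the Environment tag, e.g. "dev" or "data";
  # an empty prefix and separator keep the group name equal to the tag value
  - key: tags.Environment
    prefix: ""
    separator: ""
```

The resulting group names can then be used in target_hosts in the same way as static inventory groups.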

Experienced problem solver with more than 10 years of hands-on experience in automating and optimizing mission-critical deployments.