A dependency chain is where a number of functions each need to complete part of a piece of work in order to fully deliver it. Each function finishes its part and then passes the work along to the next.

In my last article I showed why your IT requests were likely taking so much longer to service than you expected. But what if you have multiple dependencies chained together? How can you get a rough idea of how long something is going to take when you have a disconnected dependency chain rather than an end-to-end view of the system? In this article, I’ll explain how to get a rough statistical idea of how long something will take when you have multiple dependencies all linked together. I’ll work through an R script, but the concept is easily transferable.

First things first, here are the libraries I’m using. I only include them here for completeness.

```
library(truncnorm) # truncated normal distributions for the cycle times
library(ggplot2)   # histograms
library(dplyr)     # the pipe operator
```

We need to establish some test data before I can get to the, quite simple, method. Feel free to skip the next section if you’d prefer to just get the bottom line.

## Setting up our dependency chain data

We’ll be modelling IT delays as a function of utilisation and discrete processing time for work. So first things first, let’s generate some utilisation. I’m going to assume a fairly standard range from a healthy-ish 80% up to just shy of 100% (at exactly 100% utilisation the wait formula we’ll use later divides by zero). It doesn’t feel right to just pluck numbers out of the air for everything in this script, so I wrote a function.

### Setting Utilisation

```
get_rand_utilisation <- function(lower = 80, upper = 99) {
  # Capped at 99: at 100% utilisation the wait formula divides by zero
  sample(lower:upper, 1) / 100
}
```

I’m going to assume a rather simplistic view where we have 3 functions completing work. They don’t talk to each other very well and we don’t have a joined-up view, but we do have an idea of how long things take in each function.

Let’s assign some utilisation.

```
analysis_util <- get_rand_utilisation()
dev_util <- get_rand_utilisation()
test_util <- get_rand_utilisation()
```

I’ve ended up with:

- Analysis: 90%
- Development: 80%
- Test: 83%

### Creating a cycle time distribution

I’ve chosen to assign some arbitrary values for mean work processing time. As the delays take up such a vast amount of the total time, I didn’t feel bad about this unscientific method.

```
analysis_effort_days <- 1
dev_effort_days <- 3
test_effort_days <- 2
```

I then defined a function to generate a cycle time distribution. The expected wait time is calculated using the utilisation formula from the previous post: expected wait = (utilisation / (1 - utilisation)) * processing time.

This was used to create a distribution of 50 items, with a minimum value of 1 and a mean of the expected wait time. The **total elapsed time** for an item is the **sum** of the **processing time** and the **expected waiting time** based on the utilisation value; since the wait dwarfs the processing time here, the wait is what anchors the distribution. I could have made the data more realistic by introducing an appropriate standard deviation, but there seemed to be little value in that for this exercise.

```
get_cycle_sample <- function(utilisation_perc, processing_time, n = 50) {
  # Expected wait from the utilisation formula
  wait_time <- (utilisation_perc / (1 - utilisation_perc)) * processing_time
  # Truncated normal: minimum of 1 day, centred on the expected wait
  rtruncnorm(n = n, a = 1, mean = wait_time)
}
```
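To make the arithmetic concrete, here’s a one-off call using the analysis figures from above. At 90% utilisation, a 1-day piece of work picks up an expected wait of 0.9 / (1 - 0.9) * 1 = 9 days, so the sampled values centre around 9:

```
# Quick illustrative draw; your values will differ on every run,
# but they will cluster around the 9-day expected wait
get_cycle_sample(utilisation_perc = 0.9, processing_time = 1, n = 5)
```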

So now I could create my distributions for each function. I’ve cast each output to a data frame and piped it into a histogram so you can see what we’re working with. Yes, I should have created a function; there’s a sketch of one after the three plots.

#### Analysis Cycle Time Distribution

```
analysis <- get_cycle_sample(utilisation_perc = analysis_util, processing_time = analysis_effort_days)
data.frame(analysis) %>% ggplot(aes(x = analysis)) + geom_histogram()
```

#### Development Cycle Time Distribution

```
dev <- get_cycle_sample(utilisation_perc = dev_util, processing_time = dev_effort_days)
data.frame(dev) %>% ggplot(aes(x = dev)) + geom_histogram()
```

#### Test Cycle Time Distribution

```
test <- get_cycle_sample(utilisation_perc = test_util, processing_time = test_effort_days)
data.frame(test) %>% ggplot(aes(x = test)) + geom_histogram()
```
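For what it’s worth, here’s a minimal sketch of that missing function. The name `plot_cycle_hist` is my own invention, not something from the original script:

```
# Hypothetical helper to avoid repeating the sample-then-plot pattern
plot_cycle_hist <- function(utilisation_perc, processing_time, label) {
  cycle_times <- get_cycle_sample(utilisation_perc, processing_time)
  data.frame(cycle_times) %>%
    ggplot(aes(x = cycle_times)) +
    geom_histogram() +
    labs(title = label, x = "cycle time (days)")
}

# e.g. plot_cycle_hist(test_util, test_effort_days, "Test Cycle Time Distribution")
```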

## Simulating a dependency chain

So from here it’s depressingly straightforward, I’m afraid. I simply used the **replicate** function in R to run a function a set number of times and generate a sample of possible states from our underlying distributions.

The function I used samples each of the dependencies in the chain and sums them to provide a total processing time for the entire dependency chain. This is much like plucking a random work item from the past for each function but doing it 100 times.

```
build_sample <- function() {
  # Pull one random cycle time from each function and sum them
  sample(analysis, 1) + sample(dev, 1) + sample(test, 1)
}

samples <- replicate(100, build_sample())
samples_df <- as.data.frame(samples)
colnames(samples_df)[1] <- "lead_times"
```
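A quick numeric summary gives a feel for the sample before we plot it; the exact figures will differ on every run:

```
summary(samples_df$lead_times)
```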

We now have a list of possible futures based on our historic data, so we can see what the distribution looks like and begin to query it for some answers. But let’s note a few things first.

- We expect the work to take a total of 6 days (1 + 3 + 2); the sanity check below shows what the waiting adds
- We are using very realistic utilisation figures; there are many places where 80% utilisation would be a luxury
- This is a relatively simple context where there are only 3 disconnected functions
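Before looking at the simulated distribution, we can sanity-check the expectation analytically from the utilisation formula and the values above:

```
# Expected wait per function: (u / (1 - u)) * processing time
analysis_wait <- (0.90 / (1 - 0.90)) * 1 # 9 days
dev_wait      <- (0.80 / (1 - 0.80)) * 3 # 12 days
test_wait     <- (0.83 / (1 - 0.83)) * 2 # ~9.8 days
analysis_wait + dev_wait + test_wait     # ~30.8 days of waiting for 6 days of work
```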

Here’s the distribution of expected lead times for our dependency chain.

`samples_df %>% ggplot(aes(x = lead_times)) + geom_histogram()`

## Analysing the dependency chain data

The spread is a little narrow as we didn’t apply sensible standard deviations to the individual distributions, but I hope the point is clear.

We were supposed to have 6 days of effort, but now we’re looking at a mean of over 30 days. It’s actually around 33 days if we want 85% confidence, which we do. The mean is the most likely single outcome, but if we plan to it we’re also going to be late half the time.
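If you want to pull those figures out of the sample yourself, base R makes each one a one-liner; your exact numbers will vary with the random draws:

```
mean(samples_df$lead_times)           # mean lead time: just over 30 days in my run
quantile(samples_df$lead_times, 0.85) # 85th percentile: around 33 days in my run
```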

Even in this rather simple chain, we’re looking at around five times the processing time being wasted in delays. I wonder how many dollars, pounds or euros that adds up to across your entire IT organisation?

## Conclusion

This simple example has highlighted the dangers of dependency chains. A single dependency is bad enough, as we’ve proven, but chaining them together is even worse. We only discussed 3 here, but I have seen dependency maps where there are 15, 20, or even 30 different functions required to perform work before something can get to a customer. No wonder we have software releases that take months or years to get out of the door.

Eliminate your dependencies, optimise for value, and give this article a share if you found it helpful.