14 September 2023

Rethinking Project Estimation with ChatGPT

Can AI improve predictability in software delivery? Waracle's Pete Gordon takes the reins for this week's blog, looking at how useful ChatGPT can be as a toolkit for project estimation. Join him on his voyage of discovery!


Diverse People, Diverse Perspectives

Waracle is an inclusive, inspiring & developmental home to the most talented and diverse people in our industry. The perspectives offered in our insights represent the views and opinions of the individual authors and not necessarily an official position of Waracle.

In my latest adventure with OpenAI’s ChatGPT, I explored using the Monte Carlo algorithm to help improve my project estimates. It wasn’t the simplest or most straightforward journey, and I hit more than a few roadblocks, but I also learnt more about the opportunities this cutting-edge technology can bring to my project estimation work.

Join me on this journey of discovery.

The spark of an idea: Seeking precision in project estimation

As a digital solutions consultant, my typical day is punctuated with the same questions from clients: “How long will it take?” and “How much will it cost?” To deliver answers confidently, I’ve learned to leverage data, experience, and a bit of magic.

Many years ago, I was introduced to the seminal book, “The Clean Coder,” by Robert C. Martin, a leading figure in software development. Martin uses Monte Carlo simulations to add another level of data-driven decision-making to project planning. Using this approach, we’d calculate the probability of a project completing by a particular deadline.

I was particularly intrigued by the concept of “accuracy through randomness.” By introducing random variables and observing their effects across multiple simulations, the Monte Carlo method promised a more holistic view of potential project outcomes.

Excited and apprehensive, I embarked on this new journey, eager to see, first, whether ChatGPT could produce working code and, second, whether the Monte Carlo simulation approach could be the game-changer I was looking for.

Analysis of the Idea: Diving into Data

I started by asking ChatGPT what a Monte Carlo simulation was. It explained the method and the approach in detail. It was clear to me that I needed task data from past projects as input, but I didn’t have a clue what the columns would be.

ChatGPT to the rescue. I just asked GPT-3.5 for the expected input format, and it generated a sample input file. This was useful because, as it happens, I already had a CSV detailing each task with its minimum and maximum expected durations. We regularly collect this information from pointing sessions with developers when estimating new projects or features with our teams.
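To make that input shape concrete, here is a minimal sketch of reading such a file in Python. The column names (task, min_days, max_days) and the sample rows are assumptions invented for this illustration; they are not the format ChatGPT actually generated.

```python
import csv
import io

# Hypothetical sample data: the column names and values are invented for
# illustration and are not taken from the real project file.
SAMPLE_CSV = """task,min_days,max_days
Design login screen,2,5
Build API client,3,8
Write integration tests,1,4
"""


def load_tasks(csv_text):
    """Parse CSV text into a list of (task, min_days, max_days) tuples."""
    tasks = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        low, high = float(row["min_days"]), float(row["max_days"])
        if low > high:
            raise ValueError(f"min exceeds max for task {row['task']!r}")
        tasks.append((row["task"], low, high))
    return tasks


tasks = load_tasks(SAMPLE_CSV)
for name, low, high in tasks:
    print(f"{name}: {low:g} to {high:g} days")
```

Checking that each minimum does not exceed its maximum at load time is a small design choice that catches data-entry slips before they skew a simulation.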

I asked it to produce sample outputs as well. Again, I was pleasantly surprised that ChatGPT could do this, and it gave me confidence that it knew what I was trying to achieve. Here is an example of the output to give you the idea:

50th Percentile: 546 days
75th Percentile: 557 days
90th Percentile: 568 days
95th Percentile: 575 days

This says there is a 50% chance that the project will be completed within 546 days and a 95% chance that it will be completed within 575 days. This puts some real data behind the project length we communicate as a team to the business and shows how we came up with those estimates.
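Percentile figures of this kind could come from a simulation along the following lines. This is a sketch under stated assumptions, not the code ChatGPT produced: the task values are invented, tasks are assumed to run one after another, and each duration is assumed to be uniformly distributed between its minimum and maximum.

```python
import math
import random

# Hypothetical tasks as (min_days, max_days); the values are invented.
TASKS = [(2, 5), (3, 8), (1, 4), (5, 13), (2, 3)]


def simulate_totals(tasks, runs=10_000, seed=42):
    """Return the sorted total durations of `runs` simulated projects.

    Assumes sequential tasks and a uniform distribution between each
    task's minimum and maximum duration.
    """
    rng = random.Random(seed)
    totals = [
        sum(rng.uniform(low, high) for low, high in tasks)
        for _ in range(runs)
    ]
    totals.sort()
    return totals


def percentile(sorted_values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of runs at or below it."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def chance_of_finishing_by(sorted_values, deadline):
    """Fraction of simulated runs that finished on or before the deadline."""
    return sum(1 for t in sorted_values if t <= deadline) / len(sorted_values)


totals = simulate_totals(TASKS)
for pct in (50, 75, 90, 95):
    print(f"{pct}th Percentile: {percentile(totals, pct):.0f} days")
print(f"Chance of finishing within 25 days: {chance_of_finishing_by(totals, 25):.0%}")
```

The same sorted list answers both questions: a percentile gives the duration for a chosen confidence level, and the deadline function gives the confidence level for a chosen duration.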

My First Dance Partner: GPT-3.5

With data in hand, I started working on the code. I had high hopes. However, as I soon discovered, while GPT-3.5 is a marvel in many areas, it fell short in this instance.

It was able to produce code, but not the right code. Not code that worked and produced the output I wanted it to. One or the other but not both at the same time.

I found myself stuck in a seemingly endless loop: GPT would produce code, I’d run it, and then I’d copy the resulting error back into GPT. Once I got through the errors, the output wasn’t what we’d previously discussed and didn’t answer my question about how long a project should take. When I pasted the output back into ChatGPT and pointed out the problems, it would produce more buggy code, which I’d have to get fixed before I could tell that the output still wasn’t what I’d hoped for.

The experience was like talking to a forgetful but helpful service robot. GPT was always super happy to help you do what you asked right now but would “forget” the big picture of what you were trying to do overall. It was like playing whack-a-mole with problems. As soon as I’d fixed one, another popped up, and when I’d sorted that one, the original problem was back in the code. I tried re-stating my goal, but that didn’t seem to help. I’d go around in cycles fixing the same issues repeatedly, only to have all that progress ignored when fixing the next bug.

Hitting the proverbial wall

After about three days of trying different prompts, googling and going round the code, error, and fix cycle, I’d had enough. I was at a loss. Doubt crept in. Was it me? Was I not getting the prompts right? What was I doing wrong? Had I made a mistake in choosing the Monte Carlo approach in the first place? Maybe it’s all hype, and these OpenAI models aren’t all they are cracked up to be.

The path forward seemed murky. I was about to give up.

Feeling somewhat defeated, I turned to my colleague, Gary Crawford, our Chief Innovation Officer and resident AI Guru at Waracle. Over a therapeutic cup of coffee, Gary listened to my woes and suggested: “Why not try GPT-4 with Advanced Data Analysis?”

A quick Google revealed that according to the OpenAI site, Advanced Data Analysis is:

“an experimental ChatGPT model that can use Python, handle uploads and downloads. We provide our models with a working Python interpreter in a sandboxed, firewalled execution environment, along with some ephemeral disk space. Code run by our interpreter plugin is evaluated in a persistent session that is alive for the duration of a chat conversation (with an upper-bound timeout), and subsequent calls can build on top of each other. We support uploading files to the current conversation workspace and downloading the results of your work.”

It was worth a shot. I already had the input file and output format developed, so why not?

A New Hope with GPT-4 and Advanced Data Analysis

Advanced Data Analysis was only available using the paid-for GPT-4, and you had to turn it on in your account settings manually.

The difference was night and day! I was immediately impressed with the upgrade. Somehow, ChatGPT managed to understand what I was looking for. Instead of wrestling towards an answer, it felt much more like I was working with a colleague.

This excerpt from our conversation shows just how far ahead this upgrade really is. In it you can see how GPT-4 asks me clarification questions to help it understand my goal.

After I answered all GPT’s questions, I asked it to write the Python code.

Not only could Advanced Data Analysis write the Python code I was looking for, it could run it! Previously, I’d had to download the code, run it and tell ChatGPT what I’d found. Advanced Data Analysis could handle this loop itself, getting immediate feedback on the code it was producing. This was really useful: it automatically fixed the failing code and reran it until it was correct! It was doing it all itself.

It turned out that ChatGPT hadn’t just understood my goal. It had understood it better than I had.

I wanted to understand the output and present it to clients. ChatGPT had added code to create a histogram and exciting details about what you can tell from it. This was informative, and I learned something about how to read and understand the data I’d asked for. Something I’d never have thought to do myself.

The code came with comments and a pip command to install the Python modules it needed. It ran locally and also produced the histogram. It was everything I could have hoped for, in about an hour.
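To give a feel for what such a histogram shows, here is a small text-based sketch that uses only the Python standard library. It is an assumption-laden stand-in, not the generated code: the original relied on installed Python modules, while this version prints bars of # characters, uses invented task values, and assumes sequential tasks with uniformly distributed durations.

```python
import random
from collections import Counter

# Hypothetical tasks as (min_days, max_days); the values are invented.
TASKS = [(2, 5), (3, 8), (1, 4), (5, 13), (2, 3)]


def simulate_totals(tasks, runs=5_000, seed=1):
    """Total duration of each simulated project (uniform draws, sequential tasks)."""
    rng = random.Random(seed)
    return [
        sum(rng.uniform(low, high) for low, high in tasks)
        for _ in range(runs)
    ]


def bucket_counts(totals, bucket_days=2):
    """Count runs per bucket; each key is the bucket's lower bound in days."""
    return Counter(int(t // bucket_days) * bucket_days for t in totals)


def render_histogram(counts, bucket_days=2, width=40):
    """Render one line per bucket, scaled so the fullest bucket is `width` wide."""
    peak = max(counts.values())
    lines = []
    for start in sorted(counts):
        bar = "#" * max(1, round(counts[start] / peak * width))
        lines.append(f"{start:>3}-{start + bucket_days:<3} days | {bar}")
    return "\n".join(lines)


counts = bucket_counts(simulate_totals(TASKS))
print(render_histogram(counts))
```

The tallest bars mark the most likely total durations, and the thin bars at either end show how rarely every task lands near its best or worst case at the same time.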

Frankly, I was gobsmacked!

My Conclusion

I don’t know if applying the Monte Carlo simulation can revolutionise project estimation; I suspect not. Project and programme management are much more about working with people. Project end dates get committed to for all sorts of interesting reasons. However, that’s not the point of this story.

I was able to take a complicated idea that I didn’t know enough about to tackle on my own and still achieve my goal. By working with GPT-4 and Advanced Data Analysis, I also learned more about what I was actually asking for.

The AI was more knowledgeable about statistics and the specifics of the Monte Carlo algorithm than I would ever care to be. Being able to run and fix code was game-changing and significantly sped up the process of generating and testing code.

I’m struck by the immense potential of combining forces with advanced AI technologies to help explore solutions like this. Advanced Data Analysis wasn’t perfect, but it felt more like a pairing partner than a tool, more so than anything else I’ve ever used.

I’m excited to see what we’ll do in the future together.



Pete Gordon
Principal Technical Consultant