How to capture AI-friendly Pipe Inspection Footage

As VAPAR’s CTO, it’s safe to say I’ve got a good familiarity with which inspection footage works well (and which doesn’t) for automated pipe inspections using artificial intelligence (AI).

Over the last few years, the capability of image recognition AI models has improved significantly, making automation a serious time-saver for many organisations looking to optimise or streamline their image-based assessments.

Although the accuracy of artificial intelligence has improved over this time, the results AI models can produce are sometimes limited by the characteristics of the inspection footage they are fed. If contractors are looking to maximise the results they can achieve for themselves and their clients using AI, there are some recommendations I’ve observed which should be followed.

As different AI vendors may have different ways of handling challenges and developing solutions, I’ve tried to cover each point with a generalist approach. Many of these challenges would also apply to a person trying to provide a condition assessment based on the footage alone.

Challenges and Limitations

Firstly, to get some better context around the recommendations, I’ll outline the main challenges and limitations of AI for automated CCTV coding I’ve observed during my time with VAPAR.


Generally, pipe inspection standards define a number of codes that require granular detail, and that detail is not reliably achievable for operators or software without quantitative computer vision and tracking of camera telemetry.

Sizing of Features

Determining the size of features within millimetre accuracy is a challenging task for software and human operators alike. As an alternative criterion for software, categorisation of defect severity could be undertaken using relative categories, such as ‘small’, ‘medium’ and ‘large’, that are aligned with quantitative ranges.
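
As a rough illustration of the relative categories described above, the sketch below maps an approximate feature size to ‘small’, ‘medium’ or ‘large’. The threshold values and the names `SEVERITY_BOUNDS_MM` and `severity_category` are assumptions made for this example only; real ranges would need to come from the relevant inspection standard or the asset owner.

```python
from bisect import bisect_right

# Hypothetical upper bounds in millimetres (assumed for illustration only).
# Sizes below 10 mm are 'small', 10 mm up to below 50 mm are 'medium',
# and 50 mm or more are 'large'.
SEVERITY_BOUNDS_MM = [10.0, 50.0]
SEVERITY_LABELS = ["small", "medium", "large"]


def severity_category(estimated_size_mm: float) -> str:
    """Map an approximate feature size to a relative severity category."""
    if estimated_size_mm < 0:
        raise ValueError("size must be non-negative")
    # bisect_right counts how many bounds are <= the size, which is the label index.
    return SEVERITY_LABELS[bisect_right(SEVERITY_BOUNDS_MM, estimated_size_mm)]
```

Because only the range matters, an estimate that is off by a few millimetres will usually still land in the same category, which is the point of using relative categories.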

‘Clock’ Positioning

Using 12 segments (named to align with clock references) can be challenging depending on the amount of panning, tilting and zooming that the operator undertakes during the inspection. Quadrants or eighths would likely yield more consistent results from both manual and automated assessment.
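
To show what coarser positioning could look like, here is a minimal sketch that converts a clock reference into a quadrant or an eighth. The function name `clock_to_sector` and the convention that sector 0 is centred on 12 o’clock are assumptions for this example, not part of any inspection standard.

```python
def clock_to_sector(clock_hour: int, sectors: int = 4) -> int:
    """Map a clock reference (1-12) to a coarser sector index.

    Sector 0 is centred on 12 o'clock and indices increase clockwise
    (an assumed convention for this sketch).
    """
    if not 1 <= clock_hour <= 12:
        raise ValueError("clock_hour must be between 1 and 12")
    if sectors not in (4, 8):
        raise ValueError("sectors must be 4 (quadrants) or 8 (eighths)")
    angle = (clock_hour % 12) * 30.0   # degrees clockwise from 12 o'clock
    width = 360.0 / sectors            # angular width of one sector
    # Shift by half a sector so that sector 0 straddles 12 o'clock.
    return int(((angle + width / 2) % 360) // width)
```

With quadrants, 11, 12 and 1 o’clock all map to sector 0, so small disagreements between operators (or between an operator and a model) about the exact clock position no longer change the recorded result.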

Soil Through Defect

Currently, distinguishing between soil visible through a defect, debris sitting inside a pipe, and roots can be a difficult task for AI.

Start & Finish Nodes

Start nodes may not always be present in footage captured by CCTV contractors. Furthermore, the type of maintenance hole used to access pipes can be difficult for AI to ascertain. Inspection footage is typically started from the centreline of the maintenance hole pointed directly down the barrel of the pipe to be inspected. These nodes are typically evident to the CCTV operator as they require entry to perform the inspection. 

Continuous Defects

It can be difficult to determine whether defects are discrete or continuous when a CCTV camera is moving through a pipe. This is due to the capture of the defects jumping in and out of frame during camera operation.

Multiple Assets in a Single Video

Where a CCTV camera travels through more than one asset, AI will need a way of identifying this distinction and handling the condition assessment of the assets separately. Otherwise the defects detected would all be assumed to be part of a single pipe asset which is incorrect.

Multiple inspection time frames captured in a Single Video 

Where a camera operator approaches an issue that needs to be immediately resolved (such as a blockage), they can stop the recording of the footage, clear the issue, and resume recording again. Where the halted inspection footage and completed inspection footage for the same asset are in a single video, AI needs a way of distinguishing the previous or ‘abandoned’ footage from the ‘completed’ footage, and then overriding the abandoned condition assessment with that of the completed footage.
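
The overriding step can be sketched as follows, assuming the video has already been split into segments and each segment has been matched to an asset (which is the hard part, and is not shown here). The `Segment` class, its fields and `latest_assessment_per_asset` are hypothetical names for this illustration, and the rule that the later segment is the completed one is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    asset_id: str        # asset the segment was matched to (assumed known)
    start_frame: int     # first frame of the segment within the video
    defects: List[str]   # defect codes detected in this segment


def latest_assessment_per_asset(segments: List[Segment]) -> Dict[str, List[str]]:
    """Keep, for each asset, only the defects from its last recorded segment."""
    latest: Dict[str, List[str]] = {}
    for seg in sorted(segments, key=lambda s: s.start_frame):
        # A later segment for the same asset overrides the earlier (abandoned) one.
        latest[seg.asset_id] = seg.defects
    return latest
```

Keying the result by asset also keeps the assessments of different assets in the same video separate.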

Shape or Dimensions Change

Where pipe shape or dimensions change, quantifying the extent of this change can be difficult to determine when using visual inspection footage alone.


Now that I’ve outlined the core problems we’ve encountered with AI for automated CCTV coding, let’s cover some tips to ensure you’re capturing AI-friendly pipe inspection footage:


There are a number of standard procedures that operators can apply to ensure inspection footage is optimised for use with AI pipe assessments. Areas where standardised procedure can be introduced to great effect are:

  • Standardising the asset information block at the start of footage capture
  • Standardising the chainage on-screen display positioning.
  • Standardising a requirement for the CCTV camera head to be centred within the pipe and field of view also centred (to see equally the top and bottom of the pipe).


There are also a number of procedural restrictions which CCTV operators can observe in order to create footage optimised for AI-based pipe assessments. These include:

  • Restriction of cleaning inspection footage (i.e. CCTV capture during jetting, where the jetting head is visible throughout the footage and obscures the field of view) being used for condition assessment.
  • Restriction on reversing significant distances through the pipe – this can cause offsets in the chainage measurement and also cause problems for the AI, which will duplicate the detection of defects and features.
  • Restriction on zooming whilst moving (either driving forward or panning), as this can make this camera movement difficult to track.
  • Restriction on stopping and starting the capture of footage within a single video: where cleaning is performed or the camera is moved without recording, the inspection should be taken in a single pass.

These recommendations are some of the main components we’ve identified that have the ability to impact the post processing of video files – either by AI or by an inspector.


Deep learning solutions in CCTV pipe inspections (and how we used them)

Recently, I wrote a piece describing some of the machine learning challenges which I’d encountered during my time working with stormwater and sewer pipe inspection footage at VAPAR. Like any other industry, pipe infrastructure brings its fair share of issues which need to be resolved if valid and accurate results are to be obtained from AI models and provided to clients.

In that piece, I promised to also outline the deep learning challenges (which I’ll do here) and the computer vision challenges (which I’ll be doing soon) that I’ve encountered during my VAPAR tenure so far.

So, with no further delay, let’s get into the deep learning struggles!

Correct Defect Identification (and Mixtures of Defects)

In total, we identify and categorise 80+ kinds of defects in pipe infrastructure – these are guided by the regional standard defect codes, such as those from Australia, NZ and UK. Since many of these defects are extremely similar, or define levels of severity within a defect type, distinguishing between them is not an easy task. Even an experienced, specialised asset engineer won’t enjoy a high rate of success (anecdotally, the average accuracy industry-wide is 50%). With this in mind, it’s clear that performing this function accurately is also going to be challenging for an AI-based system.

Detecting the Proper Scale of a Defect

When a contractor records a defect, they will sometimes zoom in to inspect the defect more closely. Since our AI models identify and classify defects directly from this footage, zooming can lead these models to classify defects as being larger than they actually are. In turn, this affects the condition scores and repair recommendations which we provide to a client.

Localization of defects

Another deep learning challenge which we encountered related to the localisation of identified defects. Localisation refers to the precise physical location of an identified defect within a pipe. Since the footage provided to us by clients does not contain any data relating to localisation or telemetry, we are immediately presented with a challenge – providing information relating to defect location through an AI model.

So, how did we overcome these problems?

Firstly, let’s discuss overcoming the challenges associated with correct defect identification, as well as instances of multiple defects within a single frame.

Solutions – Correct Defect Identification (and Mixtures of Defects)

For pipe inspections, VAPAR uses various defect classifications, initially identified by a pre-trained deep learning model. However, based on my previous experience, I knew that we could not achieve optimal results using transfer learning and fine-tuning techniques alone.

Features in the final layer of our pre-trained model (before the classification layer) usually have small dimensions, which is not suitable for our application. Since we deal with a large range of defect types, the core problem to solve was establishing how to alter the layers in our pre-trained convolutional neural network to achieve optimum results. We eventually managed to achieve this by combining our domain and application-specific knowledge with our understanding of deep learning and convolutional neural networks.

An example of the imbalances in data which we successfully accounted for

Imbalances in data were another issue when aiming to optimize our defect identification. The graph above illustrates the distribution of training data which our deep learning model utilised. In this graph, it’s evident that certain classifications are far more prevalent within the data set, which biases the results of the model toward the categories most commonly represented in the data set – potentially skewing classification when performing inspections for clients.

To resolve this imbalance, researchers and experts typically use two techniques – balanced sampling and a weighted loss function. In this instance, we combined these two techniques to help us get the most out of our model and improve performance by around 25 percent.
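
For readers unfamiliar with the two techniques, the sketch below shows one common way of setting them up in plain Python. It is a generic illustration rather than VAPAR’s implementation: the inverse-frequency weighting formula, the function names and the example labels are all assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def class_weights(labels: List[str]) -> Dict[str, float]:
    """Inverse-frequency loss weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def sample_weights(labels: List[str]) -> List[float]:
    """Per-sample draw probabilities that make every class equally likely."""
    counts = Counter(labels)
    k = len(counts)
    return [1.0 / (k * counts[label]) for label in labels]


def weighted_cross_entropy(true_class_probs: List[float],
                           labels: List[str],
                           weights: Dict[str, float]) -> float:
    """Cross-entropy where each sample's loss is scaled by its class weight.

    true_class_probs[i] is the probability the model gave to the correct
    class of sample i. The result is normalised by the sum of the weights.
    """
    total = sum(weights[l] * -math.log(p) for p, l in zip(true_class_probs, labels))
    return total / sum(weights[l] for l in labels)
```

For balanced sampling, the output of `sample_weights` can be passed to the standard library’s `random.choices(samples, weights=..., k=batch_size)` so that rare defect classes are drawn as often as common ones. The weighted loss then penalises mistakes on rare classes more heavily during training.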

To combat the issue of having multiple defects in one frame (see the image below for an example), we combined the results of our deep learning model with our machine learning model, and developed an AI-based algorithm to effectively account for instances of multiple defects.

An example of multiple defects found within a single frame

Solutions – Detecting the Proper Scale of a Defect

To correctly determine the scale of identified defects, we defined and developed a new AI model that draws on three important AI techniques (computer vision, machine learning and deep learning). First we developed a deep learning-based solution for measuring the scale of the defect in each relevant frame. Then we utilized a machine learning solution to find similarities between different frames. Within the machine learning model, we utilized computer vision techniques to provide the data required for the model. Successfully executing this solution allowed us to deliver strong performance in accounting for camera zoom and correctly capturing the scale of defects.

Solutions – Localization of defects

Personally, I found solving the issues around localisation to be the most satisfying resolution of all those which I’ve outlined so far.

With available deep learning and image segmentation techniques, along with the right dataset, localization is an achievable task. However, most industrial projects (like ours) carry huge time and cost requirements if this data is to be provided.

This left us at a crossroads – do we abandon this functionality for our clients? Perhaps an ordinary team might, but I’m proud to say our innovative and driven team managed to come up with a fantastic solution, using the latest state-of-the-art innovation in the deep learning discipline.

The images below illustrate some of the results obtained with our solution.

Original image / Localisation data

For me, the coolest part of our innovative solution is that we can classify and localize the defect at the same time, without any extra memory or time cost.


Interested in learning about the computer vision challenges in CCTV pipe inspections (and how we’ve overcome them)? Stay tuned for future blogs!

Alternatively, check out the piece we already completed relating to machine learning challenges.


Saeed Amirgholipour PhD. is an AI Architect, Full Stack Data Scientist, and Data Science/ AI Lead Trainer with over 10 years of industry experience, including CSIRO’s Data61, Australia’s leading data innovation group. His experience spans across end-to-end large-scale innovative AI, Data Science, and analytics solutions. Saeed has a passion for solving complex business problems utilizing Machine Learning (ML) and Deep Learning (DL) models.

Pipes bring unique machine learning challenges. How can we overcome these?

I joined VAPAR as their Lead Data Scientist in May, meaning that I’ve just passed my first six months as a ‘Vaparino’. So far, it’s been a really interesting journey; the water industry brings some unique challenges that have forced me to learn and solve uncommon problems in my role.

Firstly, some context – VAPAR is an Australia-based start-up which provides end-to-end AI-based solutions for assessments of assets; specifically sewer and stormwater pipes. VAPAR’s platform performs automated inspections of these pipes to find defects, report the exact locations of the defect and provide repair recommendations to our clients. 

For asset owners like Councils and Water Utilities, time and money are hugely important, connected considerations. Because of the massive lengths of networks these asset owners are responsible for managing, pipes will sometimes go decades without receiving an inspection. This means that asset owners are at risk of not addressing critical issues on time, increasing the risk of service disruption dramatically. That’s where VAPAR can improve the process; we provide time-efficient, automated assessments that allow asset owners to save on pipe repairs, and also protect against unplanned repairs.

As a Data Scientist, the most crucial lesson to remember for a new use-case is to get acquainted with your data as deeply as possible. Once you’ve got a solid understanding of the data, you’re able to imagine solutions to your problems with far greater ease. Being new to the water industry, there were two key challenges related to machine learning that needed to be overcome. 

Duplicate Defect Reporting

When a contractor is recording pipe inspection footage, they will move the camera through the pipe at inconsistent speeds, including instances where the camera is stationary for a couple of seconds. They will also tilt the camera head around to get a better look at the inside of the pipe.

Due to the significant computational cost involved, our platform samples frames rather than analysing every frame from a piece of footage. Because of this, combined with the inconsistent speed of camera movement and operation, it’s possible for a defect to be observed initially from a distance, before being reported a second time while still in the field of view.

Such duplication of defects can have a huge effect on the data we use to provide advice to our clients. If left unaddressed, this could lead to inaccurate, misleading repair recommendations which could result in clients assigning budget for repairs to pipes which didn’t require them, creating huge inefficiencies.

Start and end node detection

Start and end nodes, in the context of pipe infrastructure, refer to the beginning and end of each piece of pipe infrastructure (where a maintenance cover will be located). For stormwater pipes, these might be the grates where stormwater runs off from the street and into the pipes.

Because start and end nodes are the points at which cameras are inserted and retrieved, they will often appear in the pipe inspection footage, which can cause defects to be identified that aren’t relevant to the pipe condition assessment. A common example is when the inspection camera points directly upwards at the vertical well leading to the surface, or pans around to capture the other pipe connections to the node.

If we were to leave reported defects from start and end nodes in our data without adjustment, we would be reporting large volumes of irrelevant defects in our pipe condition assessments for our clients. 

How we approached these problems

In order to develop models which could account for these problems, we developed a process which would allow our machine learning models to identify instances where they occurred, and take appropriate action to prevent these problems from impacting the results and recommendations which we provided to our clients.

When a client uploads footage to our platform, it provides initial defect detection as an output. If this data were to be provided to the client immediately, we simply wouldn’t be providing accurate advice or recommendations.

Instead, the VAPAR team took the data that was initially output by our platform, and performed some further automated analysis and data preparation steps on it, taking in additional factors about the footage.

Some of our exploratory analysis relating to data distribution among features in cases of duplicate defects

Once preparation was finalised, we fed this pre-processed data into a machine learning model, where, combined with the distance information from the dataset, we were able to define a model which could automatically identify duplicate frames.
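
To make the role of the distance information concrete, here is a deliberately simple, rule-based stand-in for duplicate handling. It is not the machine learning model described above: it just treats two detections of the same defect code within a small chainage tolerance as one defect. The `Detection` class, `merge_duplicates` and the 0.5 m tolerance are assumptions for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    code: str           # defect code reported for the frame
    chainage_m: float   # distance along the pipe, in metres
    confidence: float   # model confidence for this detection


def merge_duplicates(detections: List[Detection],
                     tolerance_m: float = 0.5) -> List[Detection]:
    """Collapse same-code detections that lie within tolerance_m of each other.

    Within each group, the most confident detection is kept.
    """
    kept: List[Detection] = []
    for det in sorted(detections, key=lambda d: (d.code, d.chainage_m)):
        last = kept[-1] if kept else None
        if (last is not None and last.code == det.code
                and det.chainage_m - last.chainage_m <= tolerance_m):
            # Same defect seen again: keep whichever detection is more confident.
            if det.confidence > last.confidence:
                kept[-1] = det
        else:
            kept.append(det)
    return sorted(kept, key=lambda d: d.chainage_m)
```

A fixed tolerance like this breaks down when the camera is stationary, tilting or zooming, which is one reason a learned model that takes additional factors about the footage into account is the more robust choice.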

The results for duplicate defect detection after the development of the new AI model

Using this same methodology, we were also able to define a model which would exclude defects found in start and end nodes from the analysis and recommendations provided to our clients.
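
As a simplified picture of what excluding node defects means, the sketch below drops any defect reported within a small margin of either end of the pipe. Again, this is a rule-based illustration rather than the model we built: it assumes the chainage of each defect and the pipe length are known, and the function name and the 0.3 m margin are hypothetical.

```python
from typing import List, Tuple


def exclude_node_defects(defects: List[Tuple[str, float]],
                         pipe_length_m: float,
                         node_margin_m: float = 0.3) -> List[Tuple[str, float]]:
    """Drop (code, chainage_m) defects that fall inside the start or end node zone."""
    return [
        (code, chainage)
        for code, chainage in defects
        if node_margin_m <= chainage <= pipe_length_m - node_margin_m
    ]
```

For example, on a 30 m pipe, a defect reported at 0.1 m or 29.9 m would be removed, while one at 12.0 m would be kept.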

The results for start and end node detection after the development of the new AI model

Without a deep understanding of the data and the challenges it poses, these solutions would have been very challenging to identify, test and implement. Domain knowledge of your use case is key to developing robust, dynamic solutions to provide the best possible outcomes.


Interested in learning about the computer vision and deep learning problems in CCTV pipe inspections (and how we’ve overcome them)? Stay tuned for future blogs!


Saeed Amirgholipour PhD. is an AI Architect, Full Stack Data Scientist, and Data Science/ AI Lead Trainer with over 10 years of industry experience, including CSIRO’s Data61, Australia’s leading data innovation group. His experience spans across end-to-end large-scale innovative AI, Data Science, and analytics solutions. Saeed has a passion for solving complex business problems utilizing Machine Learning (ML) and Deep Learning (DL) models.

Microsoft roundtable discussion with Satya Nadella, Global CEO of Microsoft

This week, VAPAR was honoured to be one of eight companies involved in the Microsoft round table discussion with none other than Satya Nadella himself! Alongside his passion for Microsoft solutions was his message to the Partner network: ensure our customers succeed, even if a competitive solution to Microsoft is a better fit for the project. This was a great reminder to keep the customer at the heart of what we all do to continue to add value.

We’d like to thank Microsoft Australia for the opportunity as it allowed us to obtain such an inspirational takeaway.


Automation and why it’s not taking your job

Automation is a hot topic and is reaching new applications every day, but this should be embraced instead of feared.

Automation used to be seen as some sort of wizardry which was confined to the world of IT and software engineers. The intent of automation is to take over the data-intensive, risky or precision-critical tasks, removing error, repetitive strain and risk of injury for the people who had to do these jobs.

Machines and programs are literally manufactured to bear this load. Humans only have one body and, in some industries, it only takes one incident to change your life. Also, does anyone actually enjoy doing the same mind-numbing task over and over again for a year let alone as a career?

What separates people from machines is our creativity, analytical skills and problem-solving skills. Current applications show that when automation is done well, it frees people from the jobs they hate and gives them time to focus on the more cognitively difficult processes that require human decision making. This is significantly more fulfilling than “going through the motions” on a process that is repetitive and easily automatable.

On top of this, people are always going to be needed where automation is introduced. Every automation process has “exceptions”, which are cases that don’t fit the bill. These exceptions are traditionally complex or confusing and require detailed analysis. Automation takes away the pressure to get through more cases while allowing you to focus on the cases that really need your attention.

As automation gets introduced to each industry, the industry evolves and expands, and the same people can take on far more fulfilling, value-adding tasks.

If you’re interested in hearing more, there is a great TED talk you can watch here:

When is automation the answer?

Managing large infrastructure asset databases can be time and resource intensive.

But is automation the answer?

Automation is best done for tasks that are:

  1. Manual
  2. Repetitive
  3. High frequency

These types of tasks have ongoing implications for operational expenditure if not automated.

What are some real-world applications for automation in your water business?

If your business manages underground stormwater or sewer pipes, there is now a product to automate the condition assessment and mapping from inspection footage – learn more.

CCTV review – what is it currently costing me?

Engineers spend hundreds of hours a year reviewing, interpreting and creating work orders from the inspection footage results.

The workflow is manual, repetitive and (depending on level of compliance to best practice asset management standards) is done at a high frequency.

The cost per metre is related to the internal operational costs for reviewing the footage, extracting the key data and adding the information to various enterprise databases (like GIS, ERP systems, etc).

If the inspection footage is reviewed, interpreted and recorded in a timely manner then there are further savings and benefits in terms of:

  • Identification and verification of defects that operators miss – picking up issues before they become more serious (and expensive!) problems.
  • Better forecasting in long term financial budgets for maintenance based on actual, unbiased existing asset condition.
  • Better planning of workforce requirements for short term and long term maintenance.

Is there an automated solution for this work?

Yes! Now there is finally a way to review, interpret and record the condition of underground pipes, based on the CCTV inspection footage. It’s all done through a web platform.

Find out more about the web platform and pricing here.