# Whitepaper Research: Leveraging Node Graphs for CI/CD Pipeline Automation

# Introduction

I carried out this research during my sixth semester at Fontys University of Applied Sciences, and I decided to share it here as well!

# Purpose

The point of this research is to explore an alternative to the way CI/CD pipelines are normally defined, which is in a text format. The main idea is to make it easier for developers who are not infrastructure professionals not just to use these pipelines, but also to create them.

# Objectives

# Research Questions

# Main question

How does the utilization of node graphs impact the automation of CI/CD pipelines compared to traditional methods?

# Sub-questions

  1. What specific advantages do node graphs offer for automating CI/CD pipelines?
  2. What factors influence the usability and acceptance of node graph interfaces for CI/CD automation?
  3. How do node graph-based CI/CD pipelines compare in terms of development time and maintenance efforts to conventional automation approaches?

# Introduction

# Background

CI/CD (continuous integration and continuous delivery/deployment) is an automated process, usually integrated with your project's VCS of choice, that is used during software development to:

  1. Increase the quality of the software (usually done via automatic testing)
  2. Automatically deploy new versions of the software as they come out

These pipelines are generally created and maintained by DevOps engineers, and used by the software developers who write the actual software.

What is CI/CD

# Significance of the Study

The main contribution of this research to the field is that it addresses a gap in expertise: the usual way of creating these pipelines requires knowledge that software developers (especially juniors) may not have. This research proposes a method to make the creation and maintenance of CI/CD pipelines more accessible to the average developer, and to reduce friction for DevOps engineers.

# Methodology

The methodology used for this research is based on methods from the DOT framework, namely:

  • Available product analysis
  • Community research
  • A/B testing

# Results

To understand the advantages of node graphs, we first need to understand how they are applied to software development in general, since a CI/CD pipeline is itself a type of software. When node graphs are used for development, the source code of the application is organized into atomic functional units called nodes.

Since the main research question is answered by answering the sub-questions, we will start with those.

# What specific advantages do node graphs offer for automating CI/CD pipelines?

# Easier understanding of complex tasks

Each node can have inputs and outputs, and the output of one node can be connected to the input of another. When a node executes, it retrieves its inputs by following its connections to the data output by other nodes. The node then performs its operation on these inputs to produce its own outputs. Linking nodes together in this way allows complex tasks or problems to be broken down into smaller steps, which makes them easier to understand.
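The execution model described above can be illustrated with a minimal Python sketch. This is not taken from any particular node graph tool: the `Node` and `NodeGraph` classes, and the `checkout`, `build`, and `test` node names, are hypothetical and exist only for illustration. The sketch assumes each node produces a single output and that the graph contains no cycles.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class Node:
    """An atomic unit of work with named inputs and a single output."""
    name: str
    operation: Callable[..., Any]
    # Maps an input name to the name of the upstream node that feeds it.
    connections: Dict[str, str] = field(default_factory=dict)


class NodeGraph:
    """Hypothetical container that evaluates nodes on demand."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self._cache: Dict[str, Any] = {}

    def add(self, node: Node) -> None:
        self.nodes[node.name] = node

    def evaluate(self, name: str) -> Any:
        """Run a node after resolving its inputs through its connections.

        Assumes the graph is acyclic; a cycle would recurse forever.
        """
        if name in self._cache:
            return self._cache[name]
        node = self.nodes[name]
        # Follow each connection to retrieve the output of the upstream node.
        inputs = {inp: self.evaluate(src) for inp, src in node.connections.items()}
        result = node.operation(**inputs)
        self._cache[name] = result
        return result


graph = NodeGraph()
graph.add(Node("checkout", lambda: "source tree"))
graph.add(Node("build", lambda source: f"binary built from {source}", {"source": "checkout"}))
graph.add(Node("test", lambda artifact: f"tests passed on {artifact}", {"artifact": "build"}))

print(graph.evaluate("test"))
# prints "tests passed on binary built from source tree"
```

Each node only knows its own operation and where its inputs come from, which is what allows a larger task to be read one small step at a time.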

# Visualization of the software

Normally, software is visualized as text. Visualizing it as a series of nodes gives a person a better, more visual way to understand a task. This helps because people are very good at visual pattern recognition.

The user interface of the software application used to define the node graph will often visually display the node graph to the user. Nodes are often drawn as rectangles, and connections between nodes are drawn with lines or splines.

# More intuitive for novice engineers

Because node graphs are visual, learning to use them is easier and more intuitive for people who have limited expertise in CI/CD.

Although not directly related, the games industry faced a very similar problem for quite a while. Studios needed a way for artists with limited technical expertise to make changes to higher-level systems in a game. A common solution to this problem was to use node graphs.

An example of this can be found in Unreal Engine. Instead of learning C++, artists only have to interact with the Blueprints Visual Scripting system.

Another piece of software that uses node graphs to let artists represent complex behavior is SideFX's Houdini, which is generally used for visual effects.

When defining CI/CD pipelines, we are faced with a similar problem. We would like software engineers, who are generally not experienced in infrastructure, to be able to work with these pipelines without needing to learn another discipline. To do that, we would like to provide them with a simpler, visual interface.

Node graph architecture

Unreal Engine Blueprints

SideFX Houdini

# What factors influence the usability and acceptance of node graph interfaces for CI/CD automation?

The main factor is scalability to more complex systems. The Achilles heel of node graphs is that truly complex systems end up turning their advantages into disadvantages. Take a look at the following examples.

Simple node graph

This example is from Unreal Engine, as their node graph system is a good example of what I would like to end up with. As you can see, when the complexity of the graph is low, readability and maintainability are quite high, because it is easy to grasp what is going on.

Complex node graph

However, if your system ends up being too complex for this approach, you could end up with something similar to the image above.

The good news is that a single pipeline is rarely as complex as the second example. Take a look at this example, which is a CI/CD pipeline for a .NET application.

```yaml
trigger:
  branches:
    include:
      - main

variables:
  solution: '**/*.sln'
  buildPlatform: 'Any CPU'
  buildConfiguration: 'Release'

stages:
  - build
  - test
  - publish
  - deploy

build:
  stage: build
  script:
    - dotnet restore
    - dotnet build --configuration $buildConfiguration

test:
  stage: test
  script:
    - dotnet test --configuration $buildConfiguration --logger trx --results-directory TestResults

publish:
  stage: publish
  script:
    - dotnet publish --configuration $buildConfiguration --output ./publish

deploy:
  stage: deploy
  script:
    - scp -r ./publish user@server:/var/www/html/
```

This resolves to the following graph (generated with https://www.plantuml.com):

CI/CD graph

As you can see, what I would call an "average complexity" pipeline does not look too messy when converted to a graph. Ultimately, you should use the right tool for the job. If you have a complex CI/CD system, people without an infrastructure background shouldn't work on it. On the other hand, if you have something simple, node graphs can be a great way to let developers work with CI/CD themselves, while infrastructure personnel focus on the things that developers can't do.
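To make the mapping from stages to a graph concrete, here is a small Python sketch. It is an illustration only, not the script that produced the figure: the `pipeline` dictionary is a hand-written stand-in for the four stages of the .NET example, and `execution_order` and `to_plantuml` are hypothetical helper names. It uses `graphlib.TopologicalSorter` from the standard library (Python 3.9+) to derive a valid run order, and emits PlantUML state-diagram text as one possible way to render the result.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each stage maps to the set of stages it depends on.
# Hand-written to mirror the build -> test -> publish -> deploy example.
pipeline: Dict[str, Set[str]] = {
    "build": set(),
    "test": {"build"},
    "publish": {"test"},
    "deploy": {"publish"},
}


def execution_order(graph: Dict[str, Set[str]]) -> List[str]:
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())


def to_plantuml(graph: Dict[str, Set[str]]) -> str:
    """Render the graph as PlantUML state-diagram text (one edge per line)."""
    lines = ["@startuml"]
    for stage in execution_order(graph):
        dependencies = sorted(graph[stage])
        if not dependencies:
            # Stages without dependencies start from the initial state.
            lines.append(f"[*] --> {stage}")
        for dependency in dependencies:
            lines.append(f"{dependency} --> {stage}")
    lines.append("@enduml")
    return "\n".join(lines)


print(execution_order(pipeline))
# prints ['build', 'test', 'publish', 'deploy']
print(to_plantuml(pipeline))
```

A linear pipeline like this one yields a single chain of edges, which is why the resulting graph stays readable. A pipeline with many branching dependencies would produce far more edges, which is where the scalability concern described earlier comes in.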

# Conclusion

# Summary of Findings:

# Advantages of Node Graphs

  1. Easier Understanding of Complex Tasks: Node graphs break down complex tasks into smaller, manageable units, making them easier to understand.
  2. Visualization: Node graphs provide a visual representation of software, enhancing comprehension through visual pattern recognition.
  3. Intuitive for Novice Engineers: The visual nature of node graphs makes them more accessible to engineers with limited CI/CD expertise.

# Usability and Acceptance

  1. Scalability: Node graphs are effective for simpler systems but can become cumbersome for highly complex systems.
  2. Developer Accessibility: Node graphs enable developers to work with CI/CD pipelines without needing deep infrastructure knowledge.

# Practical Implications

# Tool Selection

The complexity of the CI/CD system should guide the decision on whether to use node graphs. Simple systems benefit more from node graphs, while complex systems may require traditional methods.

# Resource Allocation

Node graphs allow developers to handle CI/CD tasks, freeing up infrastructure personnel for more specialized work.