Graphical Editing

This week we’ll be looking at data types and using them to build some tools that manipulate images. The basis for this application has already been written. We’ll be writing command-line applications this week.

Background

Consider a pixel. It consists of three components corresponding to three colours of light (red, green and blue). By varying the brightness of these three components, we can generate any colour your screen can represent. This is the additive colour model.

Figure 1. RGB additive colour

By DemonDays64 - Own work made in Blender with Cycles; replacement for File:RGB illumination.jpg which was low-resolution. Later edited slightly with w:Paint.NET's default brightness/contrast tool., CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=118929579

Commonly, we use 256 levels of brightness for each of these components.

Colour Example

Q: What size of integer value is most suitable for storing a value between 0 and 255?

A: Byte. This is the smallest integer type available; it uses 8 bits, so it can store 2⁸ (256) different values.

Hexadecimal representation

Although it’s quite easy to write this as a tuple (r,g,b), where r is the value for red, g is the value for green and b is the value for blue, it’s more common in the web world to convert each of these values into its hexadecimal representation and concatenate them.

Convert the value (64, 0, 0) into hexadecimal and concatenate the results. Using this representation as a string isn’t great to deal with from a software development perspective. Usually, it’s just converted to this for presentation purposes. Later in this session, we’ll look at how we can do this using the formatting tools that we explored last week.

Libraries

Installing Libraries

We can install libraries using NuGet, the package manager built into Visual Studio. You can also use the nuget command-line utility if you have that installed. For now, I’m going to assume we will use the graphical way, but for those of you who are interested, this is how you install it using the command-line utility:

Installing SkiaSharp on the command line

nuget install SkiaSharp

The Graphical Way

See the guide on using NuGet.

(advanced) API exploration

Warning

This section is incorrect: we need to use SKBitmap so we can update the pixels later, not SKPixmap, which is read-only.

When we actually implement such an algorithm, we do need to make such a choice. In our case, we are using a library to access the pixel data, so it’s likely that will influence our decision. Let’s look at the API documentation for the library we are using.

There are a few different classes used to represent images in the documentation. We’re trying to load an image after all. In object-oriented programming, we have (user-definable) types called classes. Each class represents a theoretical concept. These might map to some physical thing (like a robotic arm, a gun in a game, or a webpage), or something more abstract, something which helps us think about the problem.

An individual copy of a concept is referred to as an instance. In other words, the class Image represents the concept of an image - something that is made of pixels, and which has a width and a height, and so on. An instance of the Image class is a specific image, such as the one of your breakfast, or that one of the cat you took yesterday, or so on.

Images in this library are represented by the SKImage class. However, the designer of the library has separated out different aspects of the idea of images. To them, an SKImage is "An abstraction for drawing a rectangle of pixels". When we look at the things we are allowed to do to images (the methods), we can see a whole bunch of stuff about reading and writing whole images or groups of pixels. This is useful, but it’s not what we want to do. We want to modify individual pixels.

From this image class, we have access to another class named SKBitmap. This class represents the concept of "Pairs SKImageInfo with actual pixels and rowbytes", which (apart from pointing at a potential issue in the documentation and the way I’m phrasing this) points to the general idea of giving us access to the actual pixel data and rows of pixels. Indeed, this class gives us access to a method (a thing we are allowed to do to the class) called GetPixel, giving us access to an individual pixel, represented as an SKColor, which in turn represents the concept of a colour.

I know this description has seemed a bit wordy, and probably doesn’t make a whole lot of sense at the moment, but this mindset of dividing a problem into concepts that represent our world is one of the core ideas behind object-oriented programming. As humans, we describe the world in terms of things (a classroom, a book, a sandwich) and intangible concepts (the colour blue, important dates, game rules) which help us abstract away detail to make solving the problem easier (much in the way we did with pseudocode earlier in this session). We can think of a method as a thing we can do to this concept, and we can think of a property (variable) as a thing that is part of this concept (like its colour or weight).

Image Manipulation

There is a great deal of complexity to how we encode images. Conceptually, you can think of an image as a 2D array of pixels, where the two dimensions correspond to the x and y in the image. We can iterate over and manipulate these pixels to suit our needs.

Note	You can actually do this using a one-dimensional array and column-first or row-first ordering. I’ll show you that as an extension task.
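As a rough sketch of the one-dimensional idea in the note above, the following assumes row-first ordering, where each row of pixels is stored straight after the previous one. The class name FlatImage, the helper IndexOf and the 4 by 3 image size are made up for illustration and are not part of SkiaSharp.

```csharp
using System;

public static class FlatImage
{
    // Row-first ordering: rows are stored one after another,
    // so the pixel at (x, y) lives at index y * width + x.
    public static int IndexOf(int x, int y, int width)
    {
        return y * width + x;
    }

    public static void Main()
    {
        int width = 4;
        int height = 3;

        // One brightness value per pixel, all held in a single 1D array.
        byte[] pixels = new byte[width * height];

        // Set the pixel at (x=2, y=1) to full brightness.
        pixels[IndexOf(2, 1, width)] = 255;

        Console.WriteLine(IndexOf(2, 1, width)); // 1 * 4 + 2 = 6
        Console.WriteLine(pixels[6]);            // 255
    }
}
```

Column-first ordering would swap the roles of the two coordinates and use x * height + y instead.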

Loading Images

As I previously mentioned, there is a lot of complexity to loading image files (see PNG and JPEG as prime examples). As a result, we’ll be using a library to convert the 'on disk format' to something closer to the conceptual model I just presented.

Note	In theory, you could save this array directly to disk as-is. However, that would require quite a lot of space. We use image formats to save space on disk, at the expense of the time taken to load the format. This is one of the many trade-offs you will encounter in our field - space vs time.

We are going to be using the Free and Open Source SkiaSharp library for loading images. This is because the System.Drawing.Common extension is not cross-platform. ImageSharp is another option, but there are some strange licence terms (where the work is implied to be under an open source licence, but has a field of endeavour restriction, which is actually not open source). It also gives us a chance to talk about the concept of libraries and how to use them.

You can use this image as a test: http://fal.fosslab.uk/comp101/guild/graphics/wk03/img/images/test.png

Streams

Many programming languages abstract away the concept of 'data that we access from start to finish' as a stream, which conceptually we can think of as a sequence of data that we can read or write in order. We can get the next data from a stream by reading from it, and we can add new data to the current position by writing to it. Streams occur in many places in programming. The Console we used last week is effectively a stream of text. We can also treat network connections and files as streams.
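To make the idea of reading a stream in order a little more concrete, here is a small sketch that uses a MemoryStream (a stream whose data lives in memory rather than in a file) so that it runs without needing any file on disk. The class name StreamDemo, the helper SumBytes and the three byte values are invented for this illustration.

```csharp
using System;
using System.IO;

public static class StreamDemo
{
    // Reads every remaining byte from the stream, in order, and returns their sum.
    public static int SumBytes(Stream stream)
    {
        int total = 0;
        int next;
        // ReadByte returns the next byte as an int, or -1 at the end of the stream.
        while ((next = stream.ReadByte()) != -1)
        {
            total += next;
        }
        return total;
    }

    public static void Main()
    {
        // A MemoryStream can be read like a file, but needs no file on disk.
        Stream stream = new MemoryStream(new byte[] { 10, 20, 30 });
        Console.WriteLine(SumBytes(stream)); // 60
        stream.Close();
    }
}
```

Because SumBytes only asks for a Stream, the same helper would work if it were handed a stream that came from a file instead.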

(advanced) On static methods and API design

In C# we can get a FileStream from the File class directly. This might seem a bit strange considering what I was saying before about classes and instances. Wouldn’t it make more sense to get it from an instance of a File?

Yes - and many languages (such as Python and C++) work like that. However, the language designers of C# decided that rather than express Files in this way, they instead would make everything work on the concept of File instead. Rather than create a File instance and perform operations on that, you 'ask' the concept of a file (via a static method) directly - providing the file you are talking about as an argument.

Arguably, this is poor OO design - I say arguably because there are people who would disagree. It 'muddies the waters', but it avoids the need to create many short-lived file instances. Many older languages (like C) used this approach as well (C is not an 'OO' language). Many languages (including Python and C++) also have these non-OO ways of opening and dealing with files in a C-like way, for historic and performance reasons.

There are trade-offs for designing code in such a way. In this case, trading execution speed and avoiding memory usage for confusing my first year students!

Note

You may also see static described in the literature as meaning 'there is only one copy of this thing', and this is also a correct way of thinking about this idea. There is only one copy of the concept of a file, so naturally there is only one copy of a static method (or static variable). Reasoning about it from the OO perspective of concepts and instances makes more sense to me conceptually, but feel free to disagree.

The static method which allows us to create a FileStream is called OpenRead (we’ll see OpenWrite later when we want to save data). To create a FileStream the method needs to know one piece of information - what file we’d like to access. We pass this in as an argument to the function, as a string.

Loading images from Disk

FileStream src = File.OpenRead("D:/<username>/test.png");

// TODO read image data - see below

// we should close the file when we are done
src.Close();

(advanced) Resource blocks/using/with

It’s fairly common to use a resource (such as a file) for a short time, then need to close it. If you forget to close it (or something bad happens and the program crashes), the resource can be left open or in an incomplete state. Because of this, many language designers have developed ways to ensure that resources are closed when they are no longer needed. In C#, this is the using block:

using( FileStream src = File.OpenRead("D:/<username>/test.png") ) {
	// process the file here
} // FileStream is closed automatically here

You can reason about this in the same way you reason about variable scopes from last week. The stream is closed at the closing curly bracket that ends its scope. The same concept is present in Python as the with statement.

Accessing Pixel Data

We can use an SKBitmap to get access to the individual pixels in the image. To make sure this works, we’re going to ask for the pixel located at (0,0) and output its red, green, and blue components.

accessing and outputting component values

// this snippet assumes 'using SkiaSharp;' at the top of the file
// TODO open the file stream here
SKBitmap pixelData = SKBitmap.Decode(src);
// TODO close the file stream here

// get the colour at (0,0) and output its values to the screen
SKColor colour = pixelData.GetPixel(0,0);
Console.WriteLine("Red: {0}, Green: {1}, Blue: {2}", colour.Red, colour.Green, colour.Blue);

We can use a for loop to access each pixel in the image. GetPixel expects both the x and the y coordinates to be provided. We can access the width and the height of the image using the Width and Height properties. You can combine the for loop example below with the example above to output every colour in the image, rather than just the first one.

//TODO create a variable of type, `SKBitmap` and populate it from the image.

// Go through every pixel, column by column until we have iterated the whole image
for (int x=0; x<pixelData.Width; x++) {
	for (int y=0; y<pixelData.Height; y++) {
		// TODO get the pixel located at (x,y) and output it using the Console
	}
}

Extension: Formatting Arguments

I mentioned earlier that it’s fairly common to see colours written in hexadecimal format. We can use a format specifier to show these values as hexadecimal. If we use X as the type for the red argument, we should write: {0:X}.

Example 1. Showing hexadecimal

Replace all three arguments with their hexadecimal versions.

For the value (255, 0, 0), the output should be:

Red: FF, Green: 0, Blue: 0

However, this means we don’t have the decimal version anymore. We are allowed to use the same argument twice: if we write {0} in one place and {0:X} in another in the same string, the first will be shown in decimal and the second in hexadecimal.

Example 2. Showing both

Rewrite the output to be: "Red: [value in decimal], Green: [value in decimal], Blue: [value in decimal], #[Red in hex][green in hex][blue in hex]"

For the value (255,0,0) the output should be:

Red: 255, Green: 0, Blue: 0, #FF0000
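Notice that the two expected outputs above differ: {0:X} shows a green of 0 as a single digit, whereas #FF0000 needs two digits per component. One way to get that, sketched below, is the X2 format specifier, which pads the hexadecimal value with leading zeros to at least two digits. The class name HexDemo, the helper ToHex and the colour (64, 128, 5) are made up for this illustration; treat it as one possible approach rather than the expected solution.

```csharp
using System;

public static class HexDemo
{
    // Builds a web-style colour string such as "#408005" from three components.
    public static string ToHex(byte r, byte g, byte b)
    {
        // X2 means: hexadecimal, padded with leading zeros to at least 2 digits.
        return string.Format("#{0:X2}{1:X2}{2:X2}", r, g, b);
    }

    public static void Main()
    {
        byte value = 5;
        // The same argument shown three ways: decimal, hex, zero-padded hex.
        Console.WriteLine("{0} {0:X} {0:X2}", value); // 5 5 05
        Console.WriteLine(ToHex(64, 128, 5));         // #408005
    }
}
```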

Calculating Greyscale

The first algorithm for modifying images that I am going to introduce is converting an image into a greyscale representation of itself. If you play with the sliders above, you may notice that when all three of the sliders are in the same place (or very similar places), we end up with a shade of grey. We will use this to convert our images to greyscale.

There are actually a few different ways of doing this, the way I’m going to show you is based on the average of the pixel values. We will build up to that though, section by section.

Average colour of a pixel

\[P_{0..2} = \frac{\sum_{i=0}^{2}{P_i}}{3}\]

This might look scary, but it just means, "sum the three components together, divide by the number of components (3), and set each pixel component to that value."
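As a worked example of the averaging step, take a made-up pixel with components (200, 100, 30); these values are chosen purely for illustration:

\[\frac{200 + 100 + 30}{3} = \frac{330}{3} = 110\]

Each component is then set to 110, so the pixel becomes (110, 110, 110), which is a shade of grey.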

Pseudocode

If we were to express this in pseudocode, it might look something like this:

Converting images to greyscale

FOREACH pixel IN image
	value = ( getComponent(pixel, 'R')
		+ getComponent(pixel, 'G')
		+ getComponent(pixel, 'B')
		) / 3
	setComponent(pixel, 'R', value)
	setComponent(pixel, 'G', value)
	setComponent(pixel, 'B', value)
NEXT pixel

Note	This code is not valid program code. I have abstracted away much of the implementation detail, as you will see. This just gives you the general 'gist' of the approach, and not the (implementation) details. This is what separates pseudocode from 'real' code.

I’ve 'abstracted away' two concepts in this code behind functions. The first, getComponent, is meant to fetch the value of the pixel component named by the second argument. The other function, setComponent, is meant to set the component named by the second argument (i.e. R, G, or B) to the value passed in as the third argument. Likewise, although I’ve used a foreach loop in this code, there may be reasons why you’d actually implement this as a for loop, a pair of for loops, or similar. Conceptually though, we are iterating through every pixel in the image (and we don’t care about its physical location).

The implementation detail in this case would be how we actually store and update those values. It’s not important to the algorithm if we are using an array, a 2D array, a list, or any other structure (such as an int array with byte-packed values, or a float array). Any implementation would work, and as we don’t care, it doesn’t make sense to complicate our description of the algorithm with this detail.

We’ve already talked about how we would implement getComponent in the previous section. We get the colour from the bitmap using GetPixel and access its Red, Green and Blue properties. We’ll talk about how we’d implement setComponent in a bit, so let’s implement the greyscale part now. We can output to the console, as we have been doing until this point, to check it worked.

Example 3. Outputting greyscale values

Calculate and display (using Console) the expected greyscale value for each pixel.

You will need to create a variable of a suitable type (byte) to contain the value.
Addition can be performed using + (a + b means add a to b).
Division can be performed using / (a / b means a divided by b).
Remember order of operations; you may need brackets.
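One wrinkle with the hints above is that adding bytes together in C# produces an int, so the result has to be converted back before it can be stored in a byte. The sketch below shows one way this might look; the class name GreyDemo, the helper Average and the sample values are invented for illustration, and your own solution may reasonably be structured differently.

```csharp
using System;

public static class GreyDemo
{
    // Returns the average of three colour components as a single byte.
    public static byte Average(byte r, byte g, byte b)
    {
        // byte + byte produces an int in C#, so the sum (at most 765) fits easily.
        int sum = r + g + b;
        // Integer division discards any remainder. The result is always between
        // 0 and 255, so casting it back to a byte is safe.
        return (byte)(sum / 3);
    }

    public static void Main()
    {
        Console.WriteLine(Average(200, 100, 30));  // 110
        Console.WriteLine(Average(255, 255, 255)); // 255
        Console.WriteLine(Average(10, 10, 11));    // 10 (31 / 3, remainder dropped)
    }
}
```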

Updating Pixel Data

We can use SetPixel in a similar way to GetPixel to update the colour of a pixel. You will need a colour to update the new pixel to. I’m going to create a new variable of type, SKColor to store this. In the following example, I’m going to set this to (255, 0, 255) - in other words a combination of red and blue.

Updating a single pixel

SkiaSharp.SKColor newColour = new SkiaSharp.SKColor(255, 0, 255);
pixelData.SetPixel(0, 0, newColour);

Example 4. Updating your image

Using what we learnt about loops, adapt the above code example so that your code sets the pixel colour, rather than (or as well as) outputting the value to the screen.

Remember that we need to set the red, green and blue values to all be the same value (the average of the original components). You won’t be able to see the result just yet. I’m going to show you how to save your image in the next step.

Saving our image

Just like when loading, when saving we need to save our idealised representation of the image (an array of colours) to a suitable image format (like PNG, or JPG). This process is called encoding, and we have the Encode method to help us with this.

Arguments:

Stream
SKEncodedImageFormat
Quality

https://learn.microsoft.com/en-us/dotnet/api/skiasharp.skbitmap.encode?view=skiasharp-2.88#skiasharp-skbitmap-encode(system-io-stream-skiasharp-skencodedimageformat-system-int32)

FileStream dst = File.OpenWrite("D:/<username>/test_bw.png");
pixelData.Encode( dst, SKEncodedImageFormat.Png, 10);
dst.Close();

Tip	You will need to create the folder first.

Channel Extraction

We’ve just seen how to take all three channels (red, green and blue) and set them to the same value. What if instead you extracted just the red, green or blue channels?

Test image for channel extraction

Example 5. Channel Extraction

Create a method which can extract just the red, green or blue channel. Save the results as test_red.png, test_green.png and test_blue.png.

You can do this by zeroing the channel(s) that you are not interested in and keeping the channel you are interested in.

Channel Swapping

Swap two of the channels (eg, red and green) and save that as <image>_swapped.png.

(advanced) Functions

Implement functions for each of these, so you can reuse and combine them. What would the input for these be?

Note	If you pass the pixel data directly to the function, it will modify it directly rather than modifying a copy. This means your function does not need to return a value.
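The behaviour described in the note can be seen with a plain array, which (like SKBitmap) is a reference type in C#. This is only an illustrative sketch: the class name ReferenceDemo and the helper Clear are hypothetical and stand in for whatever image function you write.

```csharp
using System;

public static class ReferenceDemo
{
    // Hypothetical helper: zeroes every value in the array it is given.
    // It returns nothing, because it changes the caller's array in place.
    public static void Clear(byte[] values)
    {
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = 0;
        }
    }

    public static void Main()
    {
        byte[] data = { 10, 20, 30 };
        Clear(data);
        // The original array was modified, not a copy of it.
        Console.WriteLine(data[0]); // 0
    }
}
```

If you would rather keep the original image untouched, one option is to make a copy first and pass the copy to your function.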