This week we’ll be looking at data types, and using them to build some tools that manipulate images. The basis for this application has already been written. We’ll be writing command-line applications this week.
Background
Consider a pixel. It consists of three components corresponding to three colours of light (red, green and blue). By varying the brightness of these three components, we can generate any colour your screen can represent. This is the additive colour model.
By DemonDays64 - Own work made in Blender with Cycles; replacement for File:RGB illumination.jpg which was low-resolution. Later edited slightly with w:Paint.NET's default brightness/contrast tool., CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=118929579
Commonly, we use 256 levels of brightness for each of these components.
Colour Example
Q: What size of integer value is most suitable for storing a value between 0 and 255?
A: Byte. This is the smallest integer type supported. It uses 8 bits, so it can store 2^8 (256) distinct values.
Hexadecimal representation
Although it’s quite easy to write this as a tuple (r,g,b), where r is the value for red, g is the value for green and b is the value for blue, it’s more common in the web world to convert each of these values into its hexadecimal representation and concatenate them.
Convert the value (64, 0, 0) into hexadecimal and concatenate the results. Using this representation as a string isn’t great to deal with from a software development perspective. Usually, it’s just converted to this for presentation purposes. Later in this session, we’ll look at how we can do this using the formatting tools that we explored last week.
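To make the conversion concrete, here is a minimal sketch of turning one component into two hexadecimal digits by hand. The helper name ToHexPair and the example colour (255, 128, 0) are my own choices for illustration, and deliberately not the exercise value.

```csharp
using System;

// Hypothetical helper: converts one component (0-255) into two hexadecimal digits.
string ToHexPair(byte value)
{
    const string digits = "0123456789ABCDEF";
    int high = value / 16; // how many whole sixteens fit into the value
    int low = value % 16;  // what is left over
    return $"{digits[high]}{digits[low]}";
}

// (255, 128, 0) is an arbitrary example colour (an orange).
string hex = ToHexPair(255) + ToHexPair(128) + ToHexPair(0);
Console.WriteLine(hex); // FF8000
```

Each component always produces exactly two digits, which is why the concatenated form is always six characters long.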
Libraries
Installing Libraries
We can install libraries using NuGet, the package manager built into Visual Studio. You can also use the nuget command-line utility if you have that installed. For now, I’m going to assume we will use the graphical way, but for those of you that are interested, this is how you install it using the command-line utility:
nuget install SkiaSharp
The Graphical Way
See the guide on using NuGet.
(advanced) API exploration
Warning
This section is incorrect: we need to use SKBitmap so we can update the pixels later, not SKPixmap, which is read-only.
When we actually implement such an algorithm, we do need to make such a choice. In our case, we are using a library to access the pixel data, so it’s likely that will influence our decision. Let’s look at the API documentation for the library we are using.
There are a few different classes used to represent images in the documentation. We’re trying to load an image after all. In object-oriented programming, we have (user-definable) types called classes. Each class represents a theoretical concept. These might map to some physical thing (like a robotic arm, a gun in a game, or a webpage), or something more abstract, something which helps us think about the problem.
An individual copy of a concept is referred to as an instance. In other words, the class Image represents the concept of an image - something that is made of pixels, which has a width and a height, and so on. An instance of the Image class is a specific image, such as the one of your breakfast, or that one of the cat you took yesterday.
Images in this library are represented by the SKImage class. However, the designer of the library has separated out different aspects of the idea of images. To them, an SKImage is "An abstraction for drawing a rectangle of pixels". When we look at the things we are allowed to do to images (the methods), we can see a whole bunch of stuff about reading and writing whole images or groups of pixels. This is useful, but it’s not what we want to do. We want to modify individual pixels.
From this image class, we have access to another class named SKBitmap. This class represents the concept of "Pairs SKImageInfo with actual pixels and rowbytes", which (apart from pointing at a potential issue in the documentation and the way I’m phrasing this) points at the general idea of giving us access to the actual pixel data and rows of pixels. Indeed, this class gives us access to a method (a thing we are allowed to do to the class) called GetPixel, giving us access to an individual pixel, represented as an SKColor, which in turn represents the concept of a colour.
I know this description has seemed a bit wordy, and probably doesn’t make a whole lot of sense at the moment, but this mindset of dividing a problem into concepts that represent our world is one of the core ideas behind object-oriented programming. As humans we describe the world in terms of things (a classroom, a book, a sandwich) and intangible concepts (the colour blue, important dates, game rules) which help us abstract away detail to make solving the problem easier (much in the way we did with pseudocode earlier in this session). We can think of a method as a thing we can do to this concept, and we can think of properties (variables) as things that are part of this concept (like its colour or weight).
Image Manipulation
There is a great deal of complexity to how we encode images. Conceptually, you can think of an image as a 2D array of pixels, where the two dimensions correspond to the x and y coordinates in the image. We can iterate over and manipulate these pixels to suit our needs.
Note
You can actually do this using a one-dimensional array and column-first or row-first ordering. I’ll show you that as an extension task.
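As a rough sketch of the two layouts mentioned above, the following stores the same pixel in a 2D array and in a 1D array using row-first ordering. The sizes, the values, and the use of a single byte per pixel are simplifying assumptions of mine; a real image would hold a colour at each position.

```csharp
using System;

int width = 3;
int height = 2;

// Conceptual model: a 2D array indexed by (x, y). Each pixel here is a single byte.
byte[,] grid = new byte[width, height];
grid[2, 1] = 200; // set the pixel at x=2, y=1

// The same image as a 1D array using row-first ordering:
// all of row 0, then all of row 1, and so on.
byte[] flat = new byte[width * height];
int index = 1 * width + 2; // y * width + x
flat[index] = 200;

Console.WriteLine(grid[2, 1]); // 200
Console.WriteLine(index);      // 5
```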
Loading Images
As I previously mentioned, there is a lot of complexity to loading image files (see PNG and JPEG as prime examples). As a result, we’ll be using a library to convert the 'on disk format' to something closer to the conceptual model I just presented.
Note
In theory, you could save this array directly to disk as-is. However, that would require quite a lot of space. We use image formats to save space on disk, at the expense of the time taken to load the format. This is one of the many trade-offs you will encounter in our field: space vs time.
We are going to be using the Free and Open Source SkiaSharp library for loading images. This is because the System.Drawing.Common extension is not cross-platform. ImageSharp is another option, but there are some strange licence terms (the work is implied to be under an open source licence, but has a field of endeavour restriction, which is actually not open source). Using SkiaSharp also gives us a chance to talk about the concept of libraries and how to use them.
You can use this image as a test: http://fal.fosslab.uk/comp101/guild/graphics/wk03/img/images/test.png
Streams
Many programming languages abstract away the concept of 'data that we access from start to finish' as a stream, which conceptually we can think of as a sequence of data that we can read or write in order. We can get the next data from a stream by reading from it, and we can add new data to the current position by writing to it. Streams occur in many places in programming. The Console we used last week is effectively a stream of text. We can also treat network connections and files as streams.
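To illustrate reading a stream in order, here is a small sketch using MemoryStream, which behaves like a file but keeps its data in memory. The three byte values are made up for the example.

```csharp
using System;
using System.IO;

// A MemoryStream behaves like a file, but its data lives in memory.
MemoryStream stream = new MemoryStream(new byte[] { 10, 20, 30 });

int total = 0;
int next = stream.ReadByte(); // returns -1 once there is nothing left to read
while (next != -1)
{
    Console.WriteLine(next);
    total += next;
    next = stream.ReadByte();
}

stream.Close();
```

Each call to ReadByte moves the current position forward by one, which is what 'access from start to finish' means in practice.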
(advanced) On static methods and API design
In C# we can get a FileStream from the File class directly. This might seem a bit strange considering what I was saying before about classes and instances. Wouldn’t it make more sense to get it from an instance of a File?
Yes - and many languages (such as Python and C++) work like that. However, the language designers of C# decided that, rather than express files in this way, they would make everything work on the concept of a File instead. Rather than create a File instance and perform operations on that, you 'ask' the concept of a file (via a static method) directly - providing the file you are talking about as an argument.
Arguably, this is poor OO design - I say arguably because there are people that would disagree. It 'muddies the waters', but it avoids the need to create many short-lived file instances. Many older languages (like C) used this approach as well (C is not an 'OO' language). Many languages (including Python and C++) also have these non-OO ways of opening and dealing with files in a C-like way, for historic and performance reasons.
There are trade-offs for designing code in such a way. In this case, trading execution speed and avoiding memory usage for confusing my first year students!
Note
You may also see static described in the literature as meaning 'there is only one copy of this thing', and this is also a correct way of thinking about this idea. There is only one copy of the concept of a file, so naturally there is only one copy of a static method (or static variable). Reasoning about it from the OO perspective of concepts and instances makes more sense to me conceptually, but feel free to disagree.
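The difference between asking a class and asking an instance can be seen in a short sketch. I have picked Path.GetExtension and string.EndsWith purely as convenient examples from the standard library. The file name is made up and no file needs to exist.

```csharp
using System;
using System.IO;

// Static method: we ask the Path class itself, passing the file name as an argument.
string extension = Path.GetExtension("test.png");

// Instance method: we ask one particular string instance about itself.
string fileName = "test.png";
bool isPng = fileName.EndsWith(".png");

Console.WriteLine(extension); // .png
Console.WriteLine(isPng);     // True
```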
The static method which allows us to create a FileStream is called OpenRead (we’ll see OpenWrite later when we want to save data). To create a FileStream, the method needs to know one piece of information - which file we’d like to access. We pass this in as an argument to the method, as a string.
FileStream src = File.OpenRead("D:/<username>/test.png");
// TODO read image data - see below
// we should close the file when we are done
src.Close();
(advanced) Resource blocks/using/with
It’s fairly common to use a resource (such as a file) for a short time, then need to close it. If you forget to close it (or something bad happens and the program crashes), the resource can be left open or in an incomplete state. Because of this, many language designers have developed ways to ensure that resources are closed when they are no longer needed. In C#, this is the using block:
using( FileStream src = File.OpenRead("D:/<username>/test.png") ) {
// process the file here
} // FileStream is closed automatically here
You can reason about this in the same way you reason about variable scopes from last week. The stream is closed at the closing curly bracket that ends its scope. The same concept is present in Python as the with statement.
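To check that a using block really does close the resource, here is a small sketch using a MemoryStream in place of a file. The stream is declared outside the block only so that we can inspect it afterwards. In normal code you would declare it inside the brackets, as in the example above.

```csharp
using System;
using System.IO;

MemoryStream stream = new MemoryStream(new byte[] { 1, 2, 3 });
bool openInside;

using (stream)
{
    openInside = stream.CanRead; // the stream is usable inside the block
}

bool openAfter = stream.CanRead; // the block closed the stream for us

Console.WriteLine(openInside); // True
Console.WriteLine(openAfter);  // False
```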
Accessing Pixel Data
We can use an SKBitmap to get access to the individual pixels in the image. To make sure this works, we’re going to ask for the pixel located at (0,0) and output its red, green, and blue components.
// TODO open the file stream here
SKBitmap pixelData = SKBitmap.Decode(src);
// TODO close the file stream here
// get the colour at (0,0) and output its values to the screen
SkiaSharp.SKColor colour = pixelData.GetPixel(0,0);
Console.WriteLine("Red: {0}, Green: {1}, Blue: {2}", colour.Red, colour.Green, colour.Blue);
We can use a for loop to access each pixel in the image. The GetPixel method expects both the x and the y coordinates to be provided. We can access the width and the height of the image using the Width and Height properties. You can combine the for loop example below with the example above to output every colour in the image, rather than just the first one.
// TODO create a variable of type SKBitmap and populate it from the image.
// Go through every pixel, column by column, until we have iterated the whole image
for (int x=0; x<pixelData.Width; x++) {
    for (int y=0; y<pixelData.Height; y++) {
        // TODO get the pixel located at (x,y) and output it using the Console
    }
}
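If you want to try the loop before you have an image file to hand, one option is a small in-memory bitmap. The sketch below assumes the SkiaSharp package is installed. The 3x2 size and the fill colour are arbitrary choices of mine, and Erase is used only to give every pixel a known starting colour.

```csharp
using System;
using SkiaSharp;

// A small in-memory bitmap stands in for a decoded image file.
SKBitmap pixelData = new SKBitmap(3, 2);
pixelData.Erase(new SKColor(64, 0, 0)); // fill every pixel with a dark red

int visited = 0;
for (int x = 0; x < pixelData.Width; x++)
{
    for (int y = 0; y < pixelData.Height; y++)
    {
        SKColor colour = pixelData.GetPixel(x, y);
        Console.WriteLine("({0},{1}) Red: {2}, Green: {3}, Blue: {4}",
            x, y, colour.Red, colour.Green, colour.Blue);
        visited++;
    }
}
```

Because x is the outer loop, the pixels are visited one column at a time: (0,0), (0,1), then (1,0), and so on.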
Extension: Formatting Arguments
I mentioned earlier that it’s fairly common to see colours written in hexadecimal format. We can use a format specifier to show these values as hexadecimal. If we use X as the type, then for the red argument we should write: {0:X}.
Replace all three arguments with their hexadecimal versions.
For the value (255, 0, 0), the output should be:
Red: FF, Green: 0, Blue: 0
However, this means we don’t have the decimal version anymore. We are allowed to use the same argument twice: if we write {0} in one place and {0:X} in another in the same string, the first will be decimal and the second will be hexadecimal. To pad each hexadecimal value to two digits (so that 0 is shown as 00), write {0:X2}.
Rewrite the output to be: "Red: [value in decimal], Green: [value in decimal], Blue: [value in decimal], #[Red in hex][green in hex][blue in hex]"
For the value (255,0,0) the output should be:
Red: 255, Green: 0, Blue: 0, #FF0000
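One detail worth checking when building the combined hexadecimal string is padding. The sketch below compares the X and X2 format specifiers on an arbitrary example colour (the values are my own, not from the exercise).

```csharp
using System;

byte red = 255, green = 128, blue = 0; // arbitrary example colour

string plain = string.Format("#{0:X}{1:X}{2:X}", red, green, blue);
string padded = string.Format("#{0:X2}{1:X2}{2:X2}", red, green, blue);

Console.WriteLine(plain);  // #FF800
Console.WriteLine(padded); // #FF8000
```

With plain X, a component of 0 becomes a single digit, so the result is no longer six digits long. X2 always produces two digits per component.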
Calculating Greyscale
The first algorithm for modifying images that I am going to introduce is converting an image into the greyscale representation of itself. If you play with the sliders above, you may notice that when all three of the sliders are in the same place (or very similar places), we end up with a shade of grey. We will use this to convert our images to greyscale.
There are actually a few different ways of doing this. The way I’m going to show you is based on the average of the pixel values. We will build up to that though, section by section.
Written as a formula, this is: grey = (R + G + B) / 3. This might look scary, but it just means, "sum the three components together, divide by the count of the components (3), and set each pixel component to that value."
Pseudocode
If we were to express this in pseudocode, it might look something like this:
FOREACH pixel IN image
    value = ( getComponent(pixel, 'R')
            + getComponent(pixel, 'G')
            + getComponent(pixel, 'B')
            ) / 3
    setComponent(pixel, 'R', value)
    setComponent(pixel, 'G', value)
    setComponent(pixel, 'B', value)
NEXT pixel
Note
This code is not valid program code. I have abstracted away much of the implementation detail, as you will see. This just gives you the general 'gist' of the approach, and not the (implementation) details. This is what separates pseudocode from 'real' code.
I’ve 'abstracted away' two concepts in this code behind functions. The first, getComponent, is meant to fetch the value of the pixel component named by the second argument. The other function, setComponent, is meant to set the component named by the second argument (ie, R, G, or B) to the value passed in as the third argument. Likewise, although I’ve used a foreach loop in this code, there may be reasons why in an implementation you’d actually implement this as a for loop, a pair of for loops, or similar. Conceptually though, we are iterating through every pixel in the image (and we don’t care about its physical location).
The implementation detail in this case would be how we actually store and update those values. It’s not important to the algorithm if we are using an array, a 2D array, a list, or any other structure (such as an int array with byte-packed values, or a float array). Any implementation would work, and as we don’t care, it doesn’t make sense to complicate our description of the algorithm with this detail.
We’ve already talked about how we would implement getComponent in the previous section. We get the colour from the bitmap using GetPixel and access its Red, Green and Blue properties. We’ll talk about how we’d implement setComponent in a bit, so let’s implement the greyscale part now. We can output to the console, as we have been doing until this point, to check it worked.
Calculate and display (using Console) the expected greyscale value for each pixel.
- You will need to create a variable of a suitable type (byte) to contain the value
- addition can be performed using + (a + b means add a to b)
- division can be performed using / (a / b means a divided by b)
- remember order of operations, you may need brackets.
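One thing the hints above do not spell out is what happens to the types during the calculation. The following is a sketch of one possible approach, where Average is a hypothetical helper name of my own and the input values are made up.

```csharp
using System;

// Hypothetical helper: the average of three colour components.
byte Average(byte r, byte g, byte b)
{
    // Adding bytes in C# produces an int, so the sum (up to 765) cannot overflow.
    int sum = r + g + b;
    // The average is always between 0 and 255, so the cast back to byte is safe.
    return (byte)(sum / 3);
}

Console.WriteLine(Average(255, 0, 0));    // 85
Console.WriteLine(Average(200, 100, 50)); // 116
```

The division here is integer division, so any fractional part is discarded (350 / 3 gives 116, not 116.67).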
Updating Pixel Data
We can use SetPixel in a similar way to GetPixel to update the colour of a pixel. You will need a colour to update the pixel to. I’m going to create a new variable of type SKColor to store this. In the following example, I’m going to set this to (255, 0, 255) - in other words, a combination of red and blue.
SkiaSharp.SKColor newColour = new SkiaSharp.SKColor(255, 0, 255);
pixelData.SetPixel(0, 0, newColour);
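Since you can’t see the image yet, one way to convince yourself the update worked is to read the pixel back. This sketch assumes the SkiaSharp package is installed and uses a small in-memory bitmap (my own stand-in) rather than a loaded file.

```csharp
using System;
using SkiaSharp;

SKBitmap pixelData = new SKBitmap(2, 2); // small in-memory stand-in for a loaded image

SKColor newColour = new SKColor(255, 0, 255);
pixelData.SetPixel(0, 0, newColour);

// Reading the pixel back should give us the colour we just wrote.
SKColor readBack = pixelData.GetPixel(0, 0);
Console.WriteLine("Red: {0}, Green: {1}, Blue: {2}", readBack.Red, readBack.Green, readBack.Blue);
```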
Using what we learnt about loops, use the above code example to change your code to set the pixel colour, rather than (or as well as) outputting the value to the screen.
Remember that we need to set the red, green and blue values to all be the same value (the average of the original components). You won’t be able to see the result just yet. I’m going to show you how to save your image in the next step.
Saving our image
Just like when loading, when saving we need to save our idealised representation of the image (an array of colours) to a suitable image format (like PNG or JPG). This process is called encoding, and we have the Encode method to help us with this.
Arguments:
- Stream
- SKEncodedImageFormat
- Quality
FileStream dst = File.OpenWrite("D:/<username>/test_bw.png");
pixelData.Encode( dst, SKEncodedImageFormat.Png, 10);
dst.Close();
Tip
You will need to create the folder first.
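As a quick sanity check of the encoding step, the sketch below encodes a tiny in-memory bitmap to PNG and decodes it again. It assumes the SkiaSharp package is installed. Writing to a MemoryStream instead of a file, the 2x2 size, and the quality value of 100 are all my own choices for illustration.

```csharp
using System;
using System.IO;
using SkiaSharp;

SKBitmap pixelData = new SKBitmap(2, 2);
pixelData.Erase(new SKColor(255, 0, 255));

// Encode into memory rather than a file, so the sketch has no path to get wrong.
MemoryStream dst = new MemoryStream();
bool saved = pixelData.Encode(dst, SKEncodedImageFormat.Png, 100);
dst.Close();

// Decode the PNG bytes again to confirm they describe the same image.
SKBitmap reloaded = SKBitmap.Decode(dst.ToArray());
Console.WriteLine("{0} {1}x{2}", saved, reloaded.Width, reloaded.Height);
```

When writing to a real file, be aware that File.OpenWrite does not shorten a file that already exists, so overwriting a larger file can leave old bytes at the end. File.Create starts from an empty file.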
Channel Extraction
We’ve just seen how to take all three channels (red, green and blue) and set them to the same value. What if instead you extracted just the red, green or blue channels?
Create a method which can extract just the red, green or blue channel. Save the results as test_red.png, test_green.png and test_blue.png.
You can do this by zeroing the channel(s) that you are not interested in and keeping the channel you are interested in.
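The zeroing idea can be sketched on a single colour without involving the image library at all. Here ExtractChannel is a hypothetical helper of my own that works on a plain (r, g, b) tuple. Applying it to every pixel of an image is left as part of the task.

```csharp
using System;

// Hypothetical helper working on a plain (r, g, b) tuple rather than a library type.
// channel is 'R', 'G' or 'B'; the other two components are set to zero.
(byte r, byte g, byte b) ExtractChannel((byte r, byte g, byte b) colour, char channel)
{
    switch (channel)
    {
        case 'R': return (colour.r, 0, 0);
        case 'G': return (0, colour.g, 0);
        case 'B': return (0, 0, colour.b);
        default: throw new ArgumentException("channel must be 'R', 'G' or 'B'");
    }
}

var redOnly = ExtractChannel((200, 100, 50), 'R');
Console.WriteLine(redOnly); // (200, 0, 0)
```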
Channel Swapping
Swap two of the channels (eg, red and green) and save that as <image>_swapped.png.
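For a single colour, a swap is just a matter of building a new colour with the components in a different order. The sketch below uses a hypothetical helper, SwapRedGreen, on a plain (r, g, b) tuple. The input values are made up.

```csharp
using System;

// Hypothetical helper: swaps the red and green components of an (r, g, b) tuple.
(byte r, byte g, byte b) SwapRedGreen((byte r, byte g, byte b) colour)
{
    return (colour.g, colour.r, colour.b);
}

var swapped = SwapRedGreen((200, 100, 50));
Console.WriteLine(swapped); // (100, 200, 50)
```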
(advanced) Functions
Implement functions for each of these, so you can reuse and combine them. What would the input for these be?
Note
If you pass the pixel data directly to the function, it will modify it directly rather than modifying a copy. This means your function does not need to return a value.
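The same behaviour can be seen with an ordinary array, which (like SKBitmap) is a reference type. This is a minimal sketch, and ClearAll is a hypothetical helper name of my own.

```csharp
using System;

// Hypothetical helper: sets every value in the array to zero, in place.
void ClearAll(byte[] data)
{
    for (int i = 0; i < data.Length; i++)
    {
        data[i] = 0;
    }
}

byte[] pixels = { 10, 20, 30 };
ClearAll(pixels); // no return value, yet the caller's array has changed

Console.WriteLine(pixels[0]); // 0
```

The function receives a reference to the caller's array rather than a copy, so changes made inside the function are visible outside it.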