Wyyme Software: C# Data Structures, Collections and Generics

C# Data Structures

Array and Linked List are most basic Data Structure which can be used to create other Data Structures in Computer Science.

Array

An array is a collection of elements stored so that you can access each element through an index. Each element is accessed through a simple mathematical formula (index * element length).

Uses of Arrays

You can use arrays to create other data types. You could use an array to create a stack, a queue, and a circular buffer, but you can also use an array to create lists and other collection types, as well as strings and non collection data types.

Arrays can also represent mathematical concepts such as vectors and matrices. You can create mathematical formulas to perform multiplications, additions, dot product, and so on using arrays that represent matrices and vectors.

You can easily calculate the size and element indices of an array at run time. This makes arrays useful for streaming values to disk or the network and even for passing between dynamic-link libraries (DLLs) created by different programming languages. Libraries are available for quickly copying the data in arrays because the values are stored continuously in memory.

Consider using an array when the array contents, but not the properties of the array (such as the size of the array), change frequently. Arrays are very efficient when performing indexing operations, and can quickly access the elements that need to be changed. That speed is mitigated by the overhead required for item removals, insertions, and additions, but some of the overhead can be reduced by implementing different types of arrays.

Advantages of Arrays

The biggest advantage of arrays is that you can access any item in them using the same mathematical formula instead of having to traverse the collection to find the index you are looking for. Theoretically, it takes the same amount of time to access item 1 as it does to access item 2, 10, 10,000, or 2,300,000. Arrays also require the least amount of memory of all the collection types.

Disadvantages of Arrays

All items in an array must be laid out in memory so that they can be accessed using a mathematical algorithm. When a user decides to add or remove an item from the array, the new item (or remaining items) must still be accessible using the same mathematical algorithm. This creates a performance penalty for the following reasons:

When you add an item, you must create a new array and copy the old values to the new array.

When you remove or insert an item, you must shift up or down all items after the location operated on.

Linked List

A linked list is a collection of nodes, each of which references the next and/or previous node in a sequence. Items in the list do not have to be unique, and they do not have to be placed into the list in a sequence order; they can be added to the list in any order. Linked lists are normally accessed through a node mechanism, whereas arrays are normally accessed through indexes.

Uses of Linked Lists

Like arrays, linked lists are typically used to implement other data structures. You could use a linked list to implement a queue and a stack, but linked lists can be used to create other collection types as well.

Consider using a linked list if you plan to do a lot of insertions, removals, or additions. Linked lists are not very efficient at indexing operations, but they perform pretty well when traversing the list.

Advantages of Linked Lists

Because each node in the list references the next node in the sequence, you do not get the shifting and reallocating performance hit that occurs with arrays. Deletions and insertions can be even faster if you have the node that you are removing or inserting after. Nodes can be easily rearranged by only changing their references. Doubly linked lists have an additional reference that makes adds and some removals more efficient than in a singly linked list.

Disadvantages of Linked Lists

Lists require more memory than arrays because each node references the next node. Also, items are not arranged so that a mathematical formula can be used to access each element, so elements have to be traversed in order to find the item located at the index you want. You may have to traverse the list during a removal and insertion if you have the data and not the node of what you want to operate on. Doubly linked lists also require more memory than singly linked lists.

Associative Array Overview (Internal implementation is Linked List)

An associative array, also called a dictionary, is an abstract collection type that associates a single key with a value. Each key in the collection is unique. Some languages refer to an associative array as a hash table, which generally leads to some confusion regarding the difference between an associative array and a hash table. An associative array defines how a collection should behave, such as associating a key with a value. You can implement an associative array in many ways. One possible way is by using a hash table, which is why some languages refer to an associative array as a hash table.

Uses of Associative Arrays

Associative arrays are useful when you have a value you need to look up with a unique key. You should use other collection types if you never need to use the key to find the value. Say you have a Person class that contains personnel information that you need to store and access quickly whenever someone gives you a phone number. If the collection contains hundreds of instances of the Person class, you would have to constantly traverse the collection to locate the specific record. This would be a performance hit if you had to search thousands or even hundreds of records every second. An associative array that uses a good lookup implementation would be able to access the Person instance quickly through his or her phone number.

Advantages and Disadvantages of Associative Arrays

The advantages and disadvantages of an associative array are normally determined by the implementation you use for the key/value association. Some implementations, such as association lists, are very easy to implement but very inefficient for large lists. Others, such as red-black trees and hash tables, are harder to implement but a lot more efficient for larger lists, and not so efficient for smaller lists. If you are willing to do the research and invest the coding time, associative arrays can be a very efficient tool.

Queue

A queue is a First In, First Out (FIFO) collection. A FIFO is a collection that resembles people waiting in a straight line. The first person to enter the line is the first person to leave the line. When the first people leave the line, they have been dequeued, or popped off the front of the queue, and as additional people join the line, they are enqueued, or pushed onto the back of the queue.

Uses of Queues

Queues are normally used when items need to be processed in the order they are received. For example, say you have a thread that reads an ordered list of instructions from a file and another thread that processes those instructions. The reader thread needs to hold the information that it reads until the processing thread processes the items in the order they were read. A queue is the perfect collection for holding the data. The reader thread pushes the information onto the queue and the processing thread pops it off the queue.

Stack

A stack is a Last In, First Out (LIFO) collection. A LIFO is a collection that resembles a stack of plates. The last plate added is the first plate that is removed.

Uses of Stacks

A stack is normally used when you want to process first the last item received. Stacks can be used for variables in a function. When a function is called, each local variable is pushed onto the stack. After the function returns, each variable is popped off the stack, leaving the variables for the current function at the top of the stack.

Circular Buffer

A circular buffer is a fixed-size data structure that is arranged so that the end of the buffer connects to the beginning of the buffer.

Uses of Circular Buffers

Circular buffers are useful when you need a FIFO collection that can be a fixed size. In general, the size of a circular buffer does not increase as you add and remove items from it. The data you are adding may cause an increase in memory but not in the circular buffer data structure itself. So you can have an endless FIFO that does not increase in memory because items are truncated or removed from the buffer when it is full.

Older data will be overwritten if the user attempts to add data to a full buffer. That functionality is helpful if, say for example, you have a large number of messages coming in that you need to process but old messages are not very important. If your buffer size is 1000, you will start to drop messages that are more than a 1000 messages old when the buffer fills up.

Circular Buffer Implementation

Because a circular buffer is a fixed size collection, it is generally better to use an array as the internal structure instead of a linked list. If you plan on doing a lot of resizing, it may be better to use a linked list as the internal structure, but you should also consider collection types other than circular buffers as well.

C# Collections

Collection classes are specialized classes for data storage and retrieval. These classes provide support for stacks, queues, lists, and hash tables. Most collection classes implement the same interfaces.

Collection classes serve various purposes, such as allocating memory dynamically to elements and accessing a list of items on the basis of an index etc. These classes create collections of objects of the Object class, which is the base class for all data types in C#.

Various Collection Classes and Their Usage

The following are the various commonly used classes of the System.Collection namespace. Click the following links to check their detail.

ArrayList

It represents ordered collection of an object that can be indexed individually.

It is basically an alternative to an array. However unlike array you can add and remove items from a list at a specified position using an index and the array resizes itself automatically. It also allows dynamic memory allocation, add, search and sort items in the list.

Hash table

It uses a key to access the elements in the collection.

A hash table is used when you need to access elements by using key, and you can identify a useful key value. Each item in the hash table has a key/value pair. The key is used to access the items in the collection.

Sorted List

It uses a key as well as an index to access the items in a list.

A sorted list is a combination of an array and a hash table. It contains a list of items that can be accessed using a key or an index. If you access items using an index, it is an ArrayList, and if you access items using a key , it is a Hashtable. The collection of items is always sorted by the key value. The SortedList class represents a collection of key-and-value pairs that are sorted by the keys and are accessible by key and by index.

Stack

It represents a last-in, first out collection of object.

It is used when you need a last-in, first-out access of items. When you add an item in the list, it is called pushing the item and when you remove it, it is called popping the item.

Queue

It represents a first-in, first out collection of object.

It is used when you need a first-in, first-out access of items. When you add an item in the list, it is called enqueue and when you remove an item, it is called dequeue.

BitArray

It represents an array of the binary representation using the values 1 and 0.

It is used when you need to store the bits but do not know the number of bits in advance. You can access items from the BitArray collection by using an integer index, which starts from zero.

C# Generics - Why Generics?

Generics allow you to delay the specification of the data type of programming elements in a class or a method, until it is actually used in the program. In other words, generics allow you to write a class or method that can work with any data type.

The Buffer Problem

Circular Buffer is commonly used for streaming data, like streaming audio data. If the reader doesn't keep up with the values in the buffer, and the writer keeps filling it up, then you'll lose some data and the audio is garbled, but at least you didn't fill up memory with all this audio data and crash a program entirely. I do have some unit tests to prove that this works and this last test is the most interesting one. It shows that if I create a Circular Buffer with a capacity of 3, and then I write 5 values into it, it will only remember the last 3 values because that's the capacity.

If I write in 1.0, 2.0, 3.0, 4.0, 5.0 it will remember 3.0, 4.0, and 5.0 when I read it out. It's the last 3 values and then it will be empty. Now you don't have to know about the code for the Circular Buffer, but I will just quickly demonstrate that it works in a command line application. I can enter values into the buffer like 2.1, 3.14, 1.7, and then 6, 7, 8, 9, second 9, but when I end the program, it only remembers the last 3 values 8, 9, and 9. And it does that because again I created a buffer with a capacity of 3. I could have created one with a capacity of 1,000, but I'm trying to keep it small for this demo. Now you don't have to know the code for the Circular Buffer, it's only going to be here to prove a point. I did implement it with an old trick that I learned in college with the C programming language.

I essentially have a start and an end pointer and I rearranged those pointers to index into an array as items are read and written to the buffer, and it works for my needs, it's not fancy, it doesn't have a lot of validation code, and it does work for simple cases. But, here's my problem, it only stores doubles. I cannot use it for bytes or strings or integers or any other type of object, only doubles, because that's the type of array I have. And the Write method only accepts doubles and the Read method only returns doubles. Now I want to use this buffer for different types of objects. I want to be able to store strings. How am I going to implement that? If you already know generics, pretend that you don't, there's a couple possible solutions.

Object Solution

One solution to my problem is to allow circular buffer to work with objects instead of doubles. That is change the API, change the internal implementation, change everything to take object instead of double, because we are in C#, and everything we work with derives from system.object. So I should be able to go into the code, hit Shift+Ctrl+F, and say that I want to replace everywhere you find the word double, let's just replace that with object, but let's only do that in this Current Document. And I'll say Replace All and we replaced 4 occurrences. Essentially now instead of having an array of double I have an array of object. Instead of having a Write method that takes a double, I take an object reference and Read returns an object reference. And what's amazing about this if I come into my unit tests, and Run All of my unit tests, they all still work and they still pass and this is great, that means I now have a CircularBuffer that works with doubles or strings, if I wanted to use strings.

In fact let me make that check. Here inside the unit test project I'll set value1 to be a string instead of a double, value2 also, run all my tests again, and they all still pass. And this seems like I'm off to a pretty good start, but there are two problems. One will become very obvious when I try to do something interesting with a buffer, not just write out the contents or verify things in unit test, and the other problem is a hidden problem, that might cause my application to perform badly. Let's look at the first problem. Here inside of the program, a Console mode application, essentially all I'm doing is taking input from a user, parsing into a double, sticking it into the buffer, and when they're finished entering items, I just display all the content of the buffer of what's inside. So I don't really care what's inside, I just want it to be a string so I can display it on the screen.

If I really was working with a buffer of doubles, I'd probably want to do some math on the numbers. I'd want to average an audio signal or sum a list of measurements. So let's say instead of dumping the buffer out like this, I actually wanted to compute a sum of numbers that are inside the buffer. So I'll say sum = 0, while buffer is not empty sum += buffer.Read. But this starts to demonstrate one of the first problems. Read no longer returns a double, it returns an object reference. To make that into a double, one thing I could do is a coercion, a typecast. I could say yes I know it's returning an object reference, but let's just force it into being a double. And now add these things up for me and at the end here let me write out the sum that we compute. But looking at this now it's not as nice as it could be because if Read actually returned a double instead of an object reference, I wouldn't need this cast and the code would be a little bit easier to read. I'm also not guaranteed to get a double since the Write method accepts just about anything, it takes an object reference. Someone could say buffer.Write, and just put in a string like "hello". And that means when the application runs, and I try to put in 1.2 and a 2.1, well when I actually run it it's going to explode with an invalid cast exception, because I just tried to take a string and cast it into a double. So yes, my buffer does now work with doubles, and integers, and any other type of object, but it's no longer typesafe.

It's not as easy to work with. What I'll have to do now is more than just a typecast, I'm going to have to do some sort of check to make sure that that actually is a double. So I could say buffer.Read and then if value is a double, only then will we try to cast something. Let me run this version of the program, and type in a 1.2 and a 2.1, and yes the sum is a 3.3, which is good. But I have to write a little more code, so it's not as nice to work with. There's also the hidden problem that I alluded to. Let me go to the command prompt for Visual Studio and open up a tool that's always installed with Visual Studio, it's called IL DASM. That's short for Intermediate Language Disassembler and with this tool I can browse to any EXE or DLL, any type of assembly compiled for .NET. So my program will currently be in projects Data Structures, bin, Debug, I'll open up Data Structures.exe. And one of the many things that IL DASM will show me, if I dig around inside of here, will be the intermediate language instructions in the assembly. These are the OpCodes, the instructions for the virtual machine that the common language runtime provides, and here inside of the main method of the application, I can see there is a line of code that is boxing a double, before calling the right method of buffer.

If you're not familiar with the boxing operation, go back and watch my C# Fundamentals course, the module on Types and Assemblies. That will give you a lot more details on Boxing Value types. But the short version of the story is that when we take a value type and doubles, longs, ints, dates, times; those are all value types, when we take a value type and try to interact with it through an object reference, the C# compiler is forced to take the value, allocate space on the heap, and copy that value to the heap. So again for a more detailed explanation, watch C# Fundamentals part 1. But boxing operations are not inherently bad. The problem is most people using a Circular Buffer like this will have a large buffer, buffers with the capacity for 10,000 or 100,000 items.

And when you start boxing large amounts of data like that, a large number of items, then you can run into performance issues, because it takes additional time to allocate the memory, copy the value to the heap, and then eventually you'll want to work with the data. And that's going to require an unboxing operation, taking that value off the heap and making it a double that I can sum again. So all in all, we would like to avoid boxing if we could, if there's an easier way to do it. So let's forget about this approach of using objects inside of the Circular Buffer.

Copy and Paste For Victory

A sophisticated architectural pattern known as copy and paste. So I'll take the CircularBuffer class that I have now, copy it, and then I'm just going to paste it here inside of the same file. I should make a new file and only have one class per .cs file, but like I say, I'm in a hurry. And now I'll change the name of this class, let me call it CircularBufferString. And now I just need to make it work with string, so I want this to be a string array. I need to make sure that the constructor names are the same as my class name, but this is going to be again a string array. I want the Write method to take only strings, I want the Read to always return a string, and voila! I now have a CircularBuffer that works with strings. But, I just remembered I also need one that works with integers too. Well fortunately with today's powerful computers, I can copy/paste as many classes as I need. Except copy/paste is like bad karma, it always comes back to haunt you.

What I really need to do is keep all my logic for managing a CircularBuffer. All this logic that manipulates the start and end pointers, I want to reuse that to manage buffers for doubles, for strings, for custom objects, just buffer everything. And if you remember what I did here was paste in some new code. And essentially I needed to go through the new class and just change the main type of that class, search and replace the word double, and replace it with string, and that is exactly what generics can do for me. They can allow me to parameterize my data type so I don't have to commit to a specific type here at compile time like string or double, instead I'll have a placeholder. I'm going to call it T for type and I'll figure out the actual type some time later after compilation.

Generic Type Parameters

Working with Generic Collections

The array is a data structure. You can use to hold multiple object references, and it can be strongly typed. So, here I have an array of employees, two employees to be precise. I'm creating it using the collection initializer syntax in C#. When I do that, I don't even need to explicitly specify the size of the array. The C# complier can figure out that this array needs to be sized to hold two employee references. And once I have an array, I can do all sorts of wonderful things. I can foreach through an array. So, I could say foreach employee that is in this employees collection, let's Console.WriteLine the employee's Name. That's one way to iterate through and array, and I can also write a for loop. So, I can set up a variable I to go from 0 to less than the length of this array and also write out the employees name because I can index into that array.

So, employees You have almost direct access to every element inside of there. And there are other various things that I can do with an array both with instance members that are available on the employees object itself and also through the link extension method to let me do things like order the items, group the items, project the items, page through the items, all sorts of functionality available there, but the one place where the array falls down is when I need to add new things. So, there's no add method, there's no insert method, there's no append method. An array has a fixed size. In this case, I have an array that stores two employees. If I somehow pick up a third employee, there's no easy way for me to get that into the array. Now, there is a Resize method, a static method, where I pass in the Array by reference. I specify what the new size of the Array is going to be. In this case, I want to store 10 employees now, but that code is a bit of a pain to write if you need to consistently resize the array.

It can also lead to a bit of a performance problem if I'm constantly resizing this array because behind the scenes the array is one contiguous block of memory, and every time I resize it, that block of memory needs to be copied around into the new array. It would be much nicer if there was a data structure that could grow for me, that could automatically resize, but still have some of the nice features of an array like the ability to index in to directly get to an item in a fixed amount of time, and that's where the List comes in.

So, the List is a generic type as you can see in the IntelliSense window. I need to specify a type argument, and I could always say that I want a List that stores any type of object, but in this case, I do want to store employees, so I want a List of Employee. And now I just need to change around the constructor, also make that a new List of Employee, but I can still use the initializer syntax, I can still write a foreach loop, I can still index into that List, but I do need to change the Length property. A List doesn't have a Length property. It has a Count instead.

And now that I have a List, I still have pretty much all the capabilities that an array had, but I also have the ability to add new items. And behind the scenes, a List will automatically expand. How does it expand? When does it expand? Well, those are good questions to ask, and what we could do just for fun is a little bit of an experiment. So, let me comment out the code that I have or maybe just delete it entirely, and let's create a list of numbers, so a new List that will hold integers. And, again, that generic type parameter int, that is used to parameterize different methods that are available on this list like the Add method. So, the Add method will only take an int. The list that we had before, the List of employee, I could only take an employee. So, you can imagine List of T, kind of like our circular buffer of T where that T is used throughout the class definition for parameter types, for return types, and so forth. Well, let's use this List inside an infinite loop and try to figure out when it's expanding itself, when is it growing? And we can kind of figure that out because a List has a property called Capacity

The System.Collections.Generic namespace contains interfaces and classes that define generic collections, which allow users to create strongly typed collections that provide better type safety and performance than non-generic strongly typed collections. For quite some time now, I am involved in the development of application software using the .NET Framework version 1.1. But there was one thing that .NET 1.1 really lacked. The support for something like 'Templates' found in the good old(?) C++. The support for the concept of type parameters, which makes it possible to design classes which take in a generic type and determine the actual type later on.

This means that by using a generic type parameter T, you can write a single class MyList<T> and the client code can use it as MyList<int>, MyList<string> or MyList<MyClass> without any risk of runtime casts or boxing operations. My dear friends let me introduce you to the concept of 'Generics', which is included in .NET Framework version 2.0, and which can be considered very close to Templates in C++.

Version 2.0 of the .NET Framework introduces a new namespace viz. System.Collections.Generic, which contains the classes that support this concept, like the List, Queue, Stack, LinkedList and many more which you can use effectively in your programs.

Advantages of Generics

In the earlier versions, before .NET 2.0, generalization was accomplished by casting types to and from the universal base type System.Object. Generics provide a solution to this limitation in the common language runtime and the C# language. This limitation can be demonstrated with the help of the ArrayList collection class from the .NET Framework base class library. ArrayList is a highly convenient collection class that can be used without any modifications to store any reference or value type.

// The .NET Framework 1.1 way of creating a list

    ArrayList list1 = new ArrayList();

    list1.Add(3);

    list1.Add(105);

    //...

    ArrayList list2 = new ArrayList();

    list2.Add("First item.");

    list2.Add("Second item");

    //...

But this convenience comes at a cost. Any reference or value type that is added to an ArrayList is implicitly typecast to System.Object. If the items are value types, they must be boxed when added to the list, and unboxed when they are retrieved. The casting, boxing and unboxing operations degrade performance; the effect of boxing and unboxing can be quite significant in scenarios where you must iterate through large collections.

So, what we need is, flexibility of the ArrayList but it should be more efficient and should provide some ways of type checking by the compiler, without sacrificing on the reusability of that class with different data types. An ArrayList with a type parameter? That is precisely what generics provide. Generics would eliminate the need for all items to be type cast to Object and would also enable the compiler to do some type checking.

In the generic List<T> collection, in System.Collections.Generic namespace, the same operation of adding items to the collection looks like this:

// The .NET Framework 2.0 way of creating a list

    List<int> list1 = new List<int>();

    list1.Add(3); // No boxing, no casting

    list1.Add("First item"); // Compile-time error

For the client code, the only added syntax with List<T> compared to ArrayList is the type argument in the declaration and in instantiation. In return for this slightly greater coding complexity, you can create a list that is not only safer than ArrayList, but also significantly faster, especially when the list items are value types.

Little Experiment about Array (Concern about What data type you added )

Below code could lead the performance problem

To avoid this performance draw back of array, List comes into the picture

Little experiment of Generic List (Concern about What data type you added )

Little experiment of Generic Queue (Concern about When you added )

Peek method in queue will not remove the item from queue instead just display the item

Convert Queue to Array which will have additional capability on list as Array

Little experiment of Generic Stack (Concern about When you added )

Little Experiment of SET Things which means HASH Set (Concern about What you added)

Hash set will maintain the Unique set of items in the List

Other operation we could perform on HashSet

Here, HashSet doesn’t know about object ref. which can be used to find uniqueness of items in HashSet

Linked List (Doubly Linked List)

Sometimes you need to frequently rearrange items. For example, if you think about the implementation of a word processor, there's quite a bit of inserting, deleting, copying, pasting, and moving text around. It's total chaos at least when I'm writing. In these scenarios where there are many little modifications, a list can be inefficient because it has to copy items around when you insert things. So instead, you might want to look at the LinkedList class provided by the .NET Framework. The LinkedList in the .NET Framework is what we call a doubly LinkedList because each item that I'm going to add to the list will contain a Next and Previous pointer.

Map Things (Dictionary Data Structure)

The next data structure that I want to talk about is a popular data structure that sees quite a bit of use in many applications. This is the Dictionary data structure. Just like a real dictionary, the .NET Dictionary allows you to find information quickly because it orders information inside in a way that allows for efficient searches. In the case of a real dictionary, the words are in alphabetical order so you know if you were looking for the word bloviate, it's going to be towards the front of the dictionary. You just need to turn pages until you get to the right location. Not that many people use paper dictionaries anymore, but hopefully you can remember reading about paper books on the internet. In a .NET Dictionary, you can store more than just words.

You can store any type of object just like you can with the List of T, but there are two parts to every item that you store in a dictionary. The first part is known as the key. This is like the word of a dictionary. It's the key that you'll use to search and find information. When you find the key, you can look at the value. That's that second part of every item you store in a dictionary. The value is like the definition of a word in the dictionary. So, there's the key, and there's the value, and we call these the key value pair. The whole idea of a dictionary is to store different values using keys that make it easy to find those values. So, let's use a dictionary because it's a little bit different than the other generic types we've been working with. When I go to create a Dictionary, I need to specify not one, but two generic type parameters.

Sorting of Dictionary items

Items before Sorting

Items after Sorting with SortingDictionary

Generic Classes and Interfaces

Generic Classes

Generic classes encapsulate operations that are not specific to any particular data type. The most common use for generic classes is with the collections like linked lists, hash tables, stacks, queues, trees and so on where operations such as adding and removing items from the collection are performed in more or less the same way regardless of the type of the data being stored.

    public class Node <T>

        T head;

        T next;

Here, T is the type parameter. We can pass any data type as parameter.

This class can be instantiated like this:

    Node<string> node = new Node<string>();

This will tell the compiler that the properties, head and next are of type string. Instead of string, you can substitute this with any data type.

Write your own Generic class

The following example demonstrates how you can write your own generic classes.

The example shown below is a simple generic linked list class for demonstration purpose:

using System;

using System.Collections.Generic;

public class MyList<T> //type parameter T in angle brackets

    private Node head;

    // The nested class is also generic on T.

    private class Node

        private Node next;

        //T as private member data type:

        private T data;

        //T used in non-generic constructor:

        public Node(T t)

            next = null;

            data = t;

        public Node Next

            get { return next; }

            set { next = value; }

        //T as return type of property:

        public T Data

            get { return data; }

            set { data = value; }

    public MyList()

        head = null;

    //T as method parameter type:

    public void AddHead(T t)

        Node n = new Node(t);

        n.Next = head;

        head = n;

    public IEnumerator<T> GetEnumerator()

        Node current = head;

        while (current != null)

            yield return current.Data;

            current = current.Next;

Notice the declaration of the above class :

    public class MyList<T>

T is the parameter type. Throughout the above code, the data type for the Node is T rather than any specific types like int or string or any other class. This gives flexibility to the programmer to use this class with any data type he wishes to use.

The following code example shows how the client code uses the generic MyList<T> class to create a list of integers. Simply by changing the type argument, the code below can be easily modified to create lists of strings or any other custom type:

    class Program

        static void Main(string[] args)

            //int is the type argument.

            MyList<int> list = new MyList<int>();

            for (int x = 0; x < 10; x++)

                list.AddHead(x);

            foreach (int i in list)

                Console.WriteLine(i);

            Console.WriteLine("Done");

Okay. I think you have got a hang of the generics by now, right? Anybody who has worked with templates in C++ would find this almost similar.

Generic Interfaces

CircularBuffer is working well, but we've got some new requirements from the business. They want this program to behave a little bit differently. There are some scenarios where they do not want any information to be lost. So, if we receive gigabytes of data from a user, we need to buffer gigabytes of data. And since the CircularBuffer is designed to throw data away once it reaches some capacity, this means we need a new type of buffer. We still need the CircularBuffer for some configurations. Perhaps the user gets to decide if they will sacrifice data loss for performance, so we'll keep the CircularBuffer around, but we need yet another buffer.

And it's in these types of situations where the business and the application require some flexibility where we usually try to avoid coding against a concrete class like CircularBuffer<T>. We want to raise the level of abstraction just a bit, introduce a layer of indirection, and code against an interface instead because we can hide the implementation of different types of buffers behind an interface. And if I pass an interface to the other pieces of the application that need a buffer, like in this simple application that's just the method to ProcessBuffer and ProcessInput, then I can swap out different buffer implementations behind the scenes without the other pieces of the application even knowing that.

Demo of Generic Interfaces

Another Interface example

Circling Back with Generic Interface with Inheritance

The Great IEnumerable Implementation

This interface will give iterating facility to our collection classes

Comparing Employee objects

IEqualityComparer with Sorted Dictionary

IComparer with SortedSet

Cleaning Up Generics

Generic Methods and Delegates

Using Extension method in Generic

Using Delegates in Generic

Delegates in Normal Implementation

Delegates in Generic Way

Every Day Delegate Implementation

Action Delegate

Delegates(Action, Func and Predicate) with Lambda Expression

Implementation of Convertor Operator

Implementation of EVENT Handler in Generic

Constraints, Covariance, and Contravariance

Generics So Far

Here, we have problem when we check the incoming objects identity

Demo of Model Objects

Example of Class Constraints

The Interface Constraints

Covariance and contra variance

Many programming language type systems support subtyping. For instance, if Cat is subtype of Animal, then an expression of type Cat can be used whenever an expression of type Animal could. Variance refers to how subtyping between more complex types (list of Cats versus list of Animals, function returning Cat versus function returning Animal, ...) relates to subtyping between their components. Depending on the variance of the type constructor, the subtyping relation may be either preserved, reversed, or ignored. For example, in C# :

IEnumerable<Cat> is a subtype of IEnumerable<Animal>, because the IEnumerable<T> type constructor is covariant. Note how original subtyping of the interface's type argument is preserved.

Action<Cat> is a supertype of Action<Animal>, because the Action<T> type constructor is contravariant (An Action<T> represents a first-class function expecting an argument of type T or sub-T). Note how the subtyping of T is reversed while encapsulated by an Action, but preserved once supplied as an argument to the underlying function.

Neither IList<Cat> or IList<Animal> is a subtype of the other, because the IList<T> type constructor is invariant.

A programming language designer will consider variance when devising typing rules for e.g. arrays, inheritance, and generic data types. By making type constructors covariant or contravariant instead of invariant, more programs will be accepted as well-typed. On the other hand, programmers often find contravariance unintuitive, and accurately tracking variance to avoid runtime type errors can lead to complex typing rules. In order to keep the type system simple and allow useful programs, a language may treat a type constructor as invariant even if it would be safe to consider it variant, or treat it as covariant even when that can violate type safety.

Within the type system of a programming language, a typing rule or a type constructor is:

covariant if it preserves the ordering, ≤, of types, which orders types from more specific to more generic;
contravariant if it reverses this ordering;
invariant if neither of these applies.

Arrays example of Covariance and Contravariance

First consider the array type constructor: from the type Animal we can make the type Animal[] ("array of animals"). Should we treat this as

Covariant: a Cat[] is a Animal[]
Contravariant: a Animal[] is a Cat[]
or neither (invariant)?

If we wish to avoid type errors, and the array supports both reading and writing elements, then only the third choice is safe. Clearly, not every Animal[] can be treated as if it were a Cat[], since a client reading from the array will expect a Cat, but an Animal[] may contain e.g. a Dog. So the contravariant rule is not safe.

Conversely, a Cat[] can not be treated as a Animal[]. It should always be possible to put a Dog into a Animal[]. With covariant arrays this can not be guaranteed to be safe, since the backing store might actually be an array of cats. So the covariant rule is also not safe—the array constructor should be invariant. Note that this is only an issue for mutable arrays; the covariant rule is safe for immutable (read-only) arrays.

This illustrates a general phenomenon. Read-only data types (sources) can be covariant; write-only data types (sinks) can be contravariant. Mutable data types which act as both sources and sinks should be invariant.

Covariance in Generic

Covariance only works in Interfaces and Delegates

IRepository have all the readonly and writable methods, so exposing these writable methods very dangerous. Below Covariance solution will help them out.

Contravariance in Generic

Contravariance is just opposite of Covariance in Generic parameter. Contravariance will be used with “IN” modifier.

Generic Odds & Ends

The Enum Problem

Special Class Constrains cannot be used in Generic. We cannot use generic and Enumeration together

The Math Problem

Generic <T> will not work with Operator of C#

The problem with Base type

Solution for above problem

The problem with Static Members of Class

Solution for this above problem

Blogger Labels: Data,Structures,Collections,Generics,Array,List,Structure,Computer,Science,collection,index,formula,length,Uses,Arrays,concepts,vectors,matrices,formulas,multiplications,additions,product,size,indices,disk,libraries,DLLs,languages,memory,Consider,properties,item,removals,insertions,Advantages,advantage,Disadvantages,items,algorithm,user,performance,location,nodes,references,node,sequence,mechanism,indexes,Lists,Deletions,reference,Also,removal,insertion,Associative,Overview,Internal,implementation,dictionary,Some,difference,Person,personnel,information,instances,instance,association,implementations,trees,tool,Queue,FIFO,Queues,example,instructions,reader,Stack,Last,LIFO,Stacks,Circular,Buffer,Buffers,Older,messages,storage,retrieval,Most,interfaces,purposes,basis,Object,Various,Classes,Usage,System,Click,ArrayList,allocation,Hash,combination,Hashtable,SortedList,BitArray,representation,bits,integer,zero,specification,method,Problem,writer,unit,demo,college,pointer,pointers,validation,integers,Write,Read,solutions,Solution,Shift,Ctrl,Current,Document,Replace,occurrences,Instead,CircularBuffer,fact,Here,Console,mode,math,measurements,coercion,exception,version,Visual,Studio,DASM,Intermediate,Language,Disassembler,Debug,OpCodes,machine,Fundamentals,module,Types,Assemblies,Value,explanation,amounts,Copy,Paste,Victory,computers,Except,karma,logic,custom,compilation,Generic,Type,Parameters,employees,syntax,employee,WriteLine,Name,extension,Resize,IntelliSense,argument,Count,capabilities,parameter,definition,users,development,Framework,Templates,concept,MyList,client,MyClass,LinkedList,versions,generalization,limitation,library,modifications,Second,convenience,cost,scenarios,Compile,error,declaration,Little,Experiment,Concern,Peek,Convert,Unique,HashSet,Sometimes,processor,text,chaos,Previous,Just,paper,dictionaries,SortingDictionary,purpose,member,AddHead,IEnumerator,GetEnumerator,Notice,Throughout,programmer,Program,Main,Okay,Anybody,requirements,configurations,situations,abstraction,layer,interface,ProcessBuffer,ProcessInput,Another,Back,Inheritance,Great,IEnumerable,IEqualityComparer,IComparer,SortedSet,Methods,Delegates,Normal,Delegate,Action,Func,Predicate,Lambda,Expression,Convertor,Operator,EVENT,Handler,Constraints,Covariance,Contravariance,Model,Objects,Class,variance,Many,systems,Animal,Cats,Animals,components,relation,Note,Neither,IList,programmers,errors,Within,Should,Covariant,Contravariant,phenomenon,Mutable,IRepository,modifier,Odds,Ends,Enum,Special,Constrains,Enumeration,Base,Static,Members,hundreds,variables,namespace,doesn,runtime,compiler,constructor,parameterize,initializer,foreach,third,gigabytes,subtype,invariant,writable

Tuesday, 12 November 2013

C# Data Structures, Collections and Generics

C# Data Structures

Array

Linked List

Associative Array Overview (Internal implementation is Linked List)

Queue

Stack

Circular Buffer

C# Collections

Various Collection Classes and Their Usage

ArrayList

Hash table

Sorted List

Stack

Queue

BitArray

C# Generics - Why Generics?

The Buffer Problem

Object Solution

Copy and Paste For Victory

Generic Type Parameters

Working with Generic Collections

Advantages of Generics

Little Experiment about Array (Concern about What data type you added )

Below code could lead the performance problem

To avoid this performance draw back of array, List comes into the picture

Little experiment of Generic List (Concern about What data type you added )

Little experiment of Generic Queue (Concern about When you added )

Peek method in queue will not remove the item from queue instead just display the item

Convert Queue to Array which will have additional capability on list as Array

Little experiment of Generic Stack (Concern about When you added )

Little Experiment of SET Things which means HASH Set (Concern about What you added)

Hash set will maintain the Unique set of items in the List

Other operation we could perform on HashSet

Linked List (Doubly Linked List)

Map Things (Dictionary Data Structure)

Sorting of Dictionary items

Items before Sorting

Items after Sorting with SortingDictionary

Generic Classes and Interfaces

Generic Classes

Write your own Generic class

Generic Interfaces

Demo of Generic Interfaces

Another Interface example

Circling Back with Generic Interface with Inheritance

The Great IEnumerable Implementation

Comparing Employee objects

IEqualityComparer with Sorted Dictionary

IComparer with SortedSet

Cleaning Up Generics

Generic Methods and Delegates

Using Extension method in Generic

Using Delegates in Generic

Delegates in Normal Implementation

Delegates in Generic Way

Every Day Delegate Implementation

Action Delegate

Delegates(Action, Func and Predicate) with Lambda Expression

Implementation of Convertor Operator

Implementation of EVENT Handler in Generic

Constraints, Covariance, and Contravariance

Generics So Far

Here, we have problem when we check the incoming objects identity

Demo of Model Objects

Example of Class Constraints

The Interface Constraints

Covariance and contra variance

Arrays example of Covariance and Contravariance

Covariance in Generic

Contravariance in Generic

Generic Odds & Ends

The Enum Problem

The Math Problem

The problem with Base type

Solution for above problem

The problem with Static Members of Class

Solution for this above problem

No comments: