Algorithms and Data Structures
Of the many subfields within computer science,
algorithms and data structures may be the most
fundamental, meaning it
characterizes computer science like perhaps no other. What is involved
in the study of algorithms and data structures?
Importance
Subfields of
Computer Science
include:
Algorithms and Data Structures,
Computation Theory,
Computer Architecture and Communications,
Artificial Intelligence and Robotics,
Database and Information Retrieval,
Graphics and Human-Computer Communication,
Numerical and Symbolic Computation,
Operating Systems,
Programming Languages, and
Software Engineering.
We study data structures and algorithms because:
- Computing systems are concerned with the storage and
retrieval of information.
- For systems to be economical the data must be organized
(into data structures) in such a way as to support
efficient manipulation (by algorithms).
- Choosing the wrong algorithms and
data structures makes a program slow at best
and unmaintainable and
insecure at worst.
Topics
Algorithms and data structures are usually studied together
with introductory software engineering concepts; the major topics
are generally
- Data — data types, abstract data types, data structures
(arrays and links), classes, APIs, collections, OOP.
- Algorithms — algorithmic patterns and paradigms (recursion,
backtracking, search, etc.), complexity analysis.
- Software development — abstraction, efficiency, robustness,
classification, inheritance and composition, polymorphism,
software life cycle, methodology, modeling languages (e.g. UML),
design patterns, and testing.
Data Types
During the design of a system, before the algorithms are
written and before the data structures are sketched out,
we must first identify the kinds of objects
we need and how they should behave.
A type is a collection of values, like
Z = {..., -2, -1, 0, 1, 2, ...}
N = {0, 1, 2, 3, 4, ...}
B = {false, true}
C = the set of characters
R = the set of all real numbers
PrimaryColor = {RED, GREEN, BLUE}
Suit = {SPADES, HEARTS, DIAMONDS, CLUBS}
ComplexNumber = R × R
Card = {1..13} × Suit
Deck = the set of all permutations of Card
SmallReal = {x | x ∈ R and 0.0 ≤ x and x ≤ 1.0}
Color = PrimaryColor → SmallReal
IntegerList = Z*
List of α = α*
A data type is a type together with operations.
An abstract data type (ADT) is a data type in which the
internal structure and internal workings of the objects of the type
are unknown to users of the type. Users can only see the effects
of the operations.
Types and operations can be given in pictures...

... or with signatures
CLUBS : Suit
HEARTS : Suit
DIAMONDS : Suit
SPADES : Suit
ACE : Rank
TWO : Rank
THREE : Rank
FOUR : Rank
FIVE : Rank
SIX : Rank
SEVEN : Rank
EIGHT : Rank
NINE : Rank
TEN : Rank
JACK : Rank
QUEEN : Rank
KING : Rank
makeCard : Suit × Rank → Card
getSuit : Card → Suit
getRank : Card → Rank
newDeck : Deck
positionOf : Deck × Card → int
cardAt : Deck × int → Card
topCard : Deck → Card
shuffle : Deck → Deck
split : Deck → Deck
Exercise: Draw a picture, and enumerate operations, for the type of integer lists. An integer list is a sequence of zero or more integers with operations such as obtaining the length, adding an item to a given position, removing the nth item, reversing, finding the position of a given item, etc.
Collections
A collection is a group of elements treated a single object.
Most ADTs seem to be collection types. Collection types can be
- Homogeneous or heterogeneous
- Unordered or ordered
- Linear, Hierarchical, Networked, or Non-Positional
The latter classification looks like this:
| Classification |
Topology |
Examples |
SET (NON-POSITIONAL) |
 |
Unrelated items, such as people in
a class or a hand of cards. |
SEQUENCE (LINEAR) |
 |
One-to-one relationship, as in a printer
queue, ring LAN, or a line at the deli. |
TREE (HIERARCHICAL) |
 |
One-to-many relationship, as in a family tree,
organization chart, expression tree, or folders and files
in a file system. |
GRAPH (NETWORK) |
 |
Many-to-many relationship, as in a road map,
schedule, belief network, or communication net. |
Operations common to most collections include
- Creation and destruction
- Adding, removing, replacing, and searching for elements
- Getting the size, or figuring out if it is empty or full
- Enumeration (traversal)
- Equality testing
- Cloning
- Serializing
Data Structures
A data structure is an arrangement of data for
the purpose of being able to store and retrieve information.
Usually a data structure is some physical representation of an
ADT.
Example: A list (data type) can be represented by an array (data structure) or by attaching a link from each element to its successor (another kind of data structure).
Building Blocks
Data structures are built with
- Records (a.k.a. structs, classes, hashes), which group
entities into a single unit with named fields
- Arrays, which store fixed size entities into indexed,
contiguous slots in memory
- Links (a.k.a. references or pointers)
Choosing the Right Data Structure
The data structure chosen for a data type must meet the resource
constraints of the problem. Figure out the constraints before
you start coding!
Example: For an abstract data type for an ATM,
- Creation and deletion of a new account can be slow
- Lookup and update of the balance of a given account must be extremely fast.
These are the extremely important questions you must ask
to determine acceptable tradeoffs:
- Can the data structure be completely filled at the beginning, or
will there be insertions intermingled with deletions, lookups,
updates, and other operations?
- Can items be deleted from the structure?
- Will the items always be processed in a well-defined order, or
will random-access have to be supported?
Algorithms
We design algorithms to carry out the desired behavior
of a software system. Good programmers tend to use stepwise refinement
when developing algorithms, and understand the tradeoffs between
different algorithms for the same problem.
Classifying Algorithms
There are several classification schemes for algorithms:
- By problem domain: numeric, text processing
(matching, compression, sequencing), cryptology, fair division,
sorting, searching, computational geometry, networks (routing,
connectivity, flow, span), computer vision, machine learning.
- By design strategy: divide and conquer, greedy,
algebraic transformation, dynamic programming, linear programming,
brute force (exhaustive search), backtracking, branch
and bound, heuristic search, genetic algorithms, simulated annealing,
particle swarm, Las Vegas, Monte Carlo, quantum algorithms.
- By complexity: constant, logarithmic, linear, linearithmic,
quadratic, cubic, polynomial, exponential.
- Around the central data structures and their behavior:
sets, linear structures, hierarchical structures, network
structures.
- By one of many implementation dimensions: sequential vs. parallel,
recursive vs. iterative, deterministic vs. non-deterministic,
exact vs. approximate, ...
Got some free time? Check out
Wikpedia's List of Algorithms.
Kinds of Problems
This classification has proved useful:
| Problem Type | Description |
| Decision | Given f and x, is it true that f(x)? |
| Functional | Given f and x, find f(x) |
| Constraint | Given f and y, find an x such that f(x)=y |
| Optimization | Given f, find the x such that f(x) is
less than or equal to f(y) for all y. |
Choosing the Right Algorithm
Choosing the right algorithm is both
- an art, because cleverness, insight, ingenuity, and dumb
luck are often required to find efficient algorithms for certain
problems
- a science, because principles of algorithm analysis
and widely applicable algorithmic patterns have been developed
over time for you to use.