Data Structures and Algorithms for Engineers

CARNEGIE MELLON UNIVERSITY IN RWANDA

04-630
Data Structures and Algorithms for Engineers

Course discipline: TBD

Core

Units: 12

Lecture/Lab/Rep hours/week: 4 hours lectures/week, 1.5 hours labs/week (two sessions), 1 hour recitation/week (two sessions)

Semester: Spring

Pre-requisites: programming skills

Students are expected to be familiar with programming in at least one programming language. Formal programming language training is not required. Students may not have any formal background in algorithms, data structures, analysis, or detailed design techniques and methods.

Course description:

Many organizations today are incorporating computer hardware and software into the products that they design and build. Most of these organizations' primary competencies are not computer science or software engineering; rather, they find that automation makes their products smarter, more capable, and more appealing in the marketplace. Because deep domain knowledge is needed to build these products, these organizations often hire engineers from traditional engineering disciplines to design and build the product platform, in many cases requiring them to write software to make the product actually work. These are capable engineers from many disciplines other than software engineering, and unfortunately they usually learn software engineering on the job. This process typically involves considerable trial and error and often results in poorly designed and documented systems, defect-laden software, bloated product development costs, unmaintainable software, and missed opportunities to leverage software development investments.

Beyond mere functionality, some application domains are highly constrained and unforgiving in their quality attribute needs, such as performance, safety, and availability. These systems intimately depend upon software to provide these capabilities in addition to basic functionality. Designing software-intensive systems with these properties in a cost-effective way requires first-class computer science and software engineering expertise. While many practicing engineers have years of industrial experience writing software applications, many lack a formal background in computer science principles. These engineers may have had a few courses or technical training in specific computer languages or technologies, but in general they often lack formal training in algorithms, computing theory, data structures, and design, among other key topics. The result is that many of these engineers are not fully realizing their potential as software engineers. This course is designed to bridge these gaps in formal computer science training.

Learning objectives:

The primary objective of the course is to give engineers without formal training in computer science a solid background in the key principles of computer science in general, and of algorithms and data structures in particular. The key purpose of this course is to complement the experience that engineers may already have in writing software with formal computer science underpinnings, making those engineers more capable in developing software-intensive systems.

The course begins by considering the main phases of the software development lifecycle, from requirements elicitation to computational modelling, system specification, software design, implementation, and software quality assurance, including various forms of testing, verification, and validation. Then, building on the concept of abstract data types, the course provides an in-depth treatment of the key elements of algorithms and data structures, beginning with the fundamentals of searching, sorting, lists, stacks, and queues, but quickly progressing to more advanced topics, including trees, graphs, and algorithmic strategies. It also covers the analysis of the performance and tractability of algorithms, finishing with automata theory and computability theory. A key focus of the course is on effective implementation and good design principles.

Outcomes:

After completing this course, students should be able to:

Recognize and analyze critical computational problems, generate alternative solutions to problems, and assess their relative merits.
Understand, analyze, and characterize those factors that influence algorithmic computational performance and memory consumption.
Design, implement, and document appropriate, effective, and efficient data structures & algorithms for a variety of real-world problems.
Understand detailed software structures and their underlying strengths and weaknesses.
Perform detailed, code-level design and document the design in an understandable way.

Content details:

Refer to the Lecture Plan for information on course delivery, including lectures, labs, assignments, and exercises.

The course will cover the following topics:

Introduction: The Software Development Lifecycle
Representation of Algorithms
Algorithmic Complexity
Searching and Sorting Algorithms
Abstract Data Types (ADT)
Containers, Dictionaries, and Lists
Stacks
Queues
Trees
Heaps
Graphs
Hashing
Algorithmic Strategies
Analysis of Correctness
Automata Theory & Computability Theory
Databases
Programming Paradigms
Component-Based Software Engineering

The detailed content for each of these topics follows.

Introduction

Goals of the course
History of computer science
Topic areas
Preview of course material
Course mechanics
Overview of labs, assignments, and exercises
Assignment software development platform and tools

Fundamental Algorithmic Strategies

Definition of an algorithm
Algorithmic analysis and complexity
Classes of algorithms
Brute force, divide and conquer, branch and bound, dynamic programming, greedy algorithms, recursion, approximation, heuristics and heuristic algorithms, probabilistic algorithms

Algorithmic Representation and Analysis

Modelling software
Representation, communication, and analysis of algorithms
Relational modelling
State modelling
Pseudo code
Flow charts
Finite state machines
UML
Predicate logic
Analysis

Correctness Analysis

Types of software defects
Code module design
Syntactic, semantic, logical defects
(Semi-)formal verification: partial vs. total correctness
Invariant assertion method
Simple proof strategies: by contradiction, counterexample, induction
Dynamic testing: unit tests, test harness, stubs, drivers, integration testing, regression testing.
Static tests: reviews, walkthroughs, inspections, reviewing algorithms and software
Pair programming
Verification and validation strategies
Software quality assurance metrics

Measurement

Complexity analysis
Big O notation
Recursion: runtime memory implications.
Recursive vs. iterative solutions
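
As an illustration of the recursive vs. iterative trade-off, the following is a minimal sketch rather than course material. The function names are hypothetical, and factorial is chosen only because it is compact. Both versions take O(n) time, but the recursive one also uses O(n) stack space.

```cpp
#include <cassert>
#include <cstdint>

// Recursive version: each call adds a stack frame, so the depth of the
// call stack (and hence memory use) grows linearly with n.
std::uint64_t factorial_recursive(unsigned n) {
    assert(n <= 20);  // pre-condition: 21! overflows a 64-bit unsigned integer
    return n <= 1 ? 1 : n * factorial_recursive(n - 1);
}

// Iterative version: the same O(n) running time, but constant extra memory.
std::uint64_t factorial_iterative(unsigned n) {
    assert(n <= 20);  // same pre-condition as above
    std::uint64_t result = 1;
    for (unsigned i = 2; i <= n; ++i) {
        result *= i;
    }
    return result;
}
```

Calling either function with the same argument in the range 0 to 20 gives the same result; only the memory profile differs.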

ADT Introduction and Design

Vector example exercise
History of abstraction
Abstract Data Types (ADT)
Information hiding
Types and typing
Encapsulation
Efficiency
Correctness
Checks for pre-conditions and post-conditions
Design practices

Lists

Basic operations
Implementation with arrays and linked lists in pseudo-code
Singly linked lists
Doubly linked lists
Performance considerations

Stacks

Stack (LIFO): push, pop, peek, size, numItems operations
Array implementation in pseudo-code (directly and array of pointers to data)
Stack applications, including evaluation of infix, prefix, and postfix expressions
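
One of the stack applications listed above, postfix expression evaluation, can be sketched as follows. This is an illustrative example under simplifying assumptions: tokens are separated by spaces, operands are non-negative integers, and only the four basic operators are supported. The name `evaluate_postfix` is hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <stack>
#include <string>

// Evaluates a space-separated postfix expression such as "3 4 + 2 *".
// Operands are pushed; each operator pops two operands and pushes the result.
int evaluate_postfix(const std::string& expression) {
    std::stack<int> operands;
    std::istringstream tokens(expression);
    std::string token;
    while (tokens >> token) {
        if (std::isdigit(static_cast<unsigned char>(token[0]))) {
            operands.push(std::stoi(token));
        } else {
            assert(operands.size() >= 2);  // pre-condition: well-formed input
            int right = operands.top(); operands.pop();
            int left = operands.top(); operands.pop();
            switch (token[0]) {
                case '+': operands.push(left + right); break;
                case '-': operands.push(left - right); break;
                case '*': operands.push(left * right); break;
                case '/': operands.push(left / right); break;  // integer division
                default: assert(false && "unknown operator");
            }
        }
    }
    assert(operands.size() == 1);  // post-condition: exactly one result remains
    return operands.top();
}
```

For example, `evaluate_postfix("3 4 + 2 *")` computes (3 + 4) * 2.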

Queues

Queue (FIFO): enqueue, dequeue, peek, size, numItems operations
Array implementation in pseudo-code (directly and array of pointers to data)
Linked list implementation in pseudo-code
Circular queues
Performance considerations
Deque
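
A circular queue over a fixed-size array might look like the sketch below. It assumes integer items and a capacity fixed at construction. The class name `CircularQueue` and its member names are hypothetical, chosen to mirror the operations listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fixed-capacity FIFO queue. Indices wrap around the end of the array,
// so enqueue and dequeue both run in O(1) without shifting elements.
class CircularQueue {
public:
    explicit CircularQueue(std::size_t capacity)
        : buffer_(capacity), head_(0), count_(0) {}

    bool empty() const { return count_ == 0; }
    bool full() const { return count_ == buffer_.size(); }
    std::size_t size() const { return count_; }

    void enqueue(int value) {
        assert(!full());  // pre-condition: there is room for one more item
        buffer_[(head_ + count_) % buffer_.size()] = value;
        ++count_;
    }

    int dequeue() {
        assert(!empty());  // pre-condition: there is an item to remove
        int value = buffer_[head_];
        head_ = (head_ + 1) % buffer_.size();
        --count_;
        return value;
    }

    int peek() const {
        assert(!empty());
        return buffer_[head_];
    }

private:
    std::vector<int> buffer_;  // storage for the items
    std::size_t head_;         // index of the oldest item
    std::size_t count_;        // number of items currently stored
};
```

Tracking the item count alongside the head index avoids the usual ambiguity between a full and an empty queue when only head and tail indices are stored.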

List Sorting

In-place sorts: bubble sort (efficient and inefficient), selection sort, insertion sort.
Not-in-place sorts: quicksort, merge sort.
Complexity analysis
Characteristics of a good sort
Speed, consistency, keys, memory usage, length & code complexity, stability
Other sorts ordered by complexity
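
To make the stability characteristic concrete, here is a minimal insertion sort sketch. The `Record` alias and the function name are hypothetical; records are sorted by their integer key only, and records with equal keys keep their original relative order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Record = std::pair<int, std::string>;  // (key, payload)

// In-place insertion sort on the key: O(n^2) worst case, O(n) when the
// input is already sorted. The strict '>' comparison never moves a record
// past another record with an equal key, which is what makes it stable.
void insertion_sort(std::vector<Record>& records) {
    for (std::size_t i = 1; i < records.size(); ++i) {
        Record current = records[i];
        std::size_t j = i;
        while (j > 0 && records[j - 1].first > current.first) {
            records[j] = records[j - 1];  // shift larger keys one place right
            --j;
        }
        records[j] = current;
    }
    // post-condition: keys are in non-decreasing order
    assert(std::is_sorted(records.begin(), records.end(),
                          [](const Record& a, const Record& b) {
                              return a.first < b.first;
                          }));
}
```

Replacing `>` with `>=` in the inner loop would still sort the keys correctly but would lose stability.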

Trees

Concepts and terminology.
Types of tree: binary, binary search, B-tree, 2-3 tree, AVL, Red-Black
Introduction to binary trees: insertion and deletion
Tree traversals: inorder, preorder, postorder
AVL trees
Non-search trees: parse trees, array implementation, linked list implementation
Forests

Heaps

Heap basics
Types of heap: min heaps and max heaps
Heap characteristics
Heap operations: delete max/min, down heap, up heap, merge, construct, heapify; complexity of operations
Priority queues
Operating systems heaps
Implementation of heap
Heap sort (pseudo-code)
d-ary heaps
Leftist heaps
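
The down-heap, heapify, and heap sort operations listed above can be sketched on an array-based binary max heap as follows. This is an illustrative version with hypothetical function names, assuming integer items and 0-based indexing, so that the children of index i are at 2i + 1 and 2i + 2.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Restores the max-heap property for the subtree rooted at `root`,
// considering only the first `size` elements. O(log n).
void down_heap(std::vector<int>& items, std::size_t root, std::size_t size) {
    assert(size <= items.size());  // pre-condition: heap fits in the array
    while (true) {
        std::size_t largest = root;
        std::size_t left = 2 * root + 1;
        std::size_t right = left + 1;
        if (left < size && items[left] > items[largest]) largest = left;
        if (right < size && items[right] > items[largest]) largest = right;
        if (largest == root) return;  // both children are no larger: done
        std::swap(items[root], items[largest]);
        root = largest;
    }
}

// Sorts in ascending order, in place, in O(n log n).
void heap_sort(std::vector<int>& items) {
    const std::size_t n = items.size();
    // Heapify: sift down every internal node, starting from the last parent.
    for (std::size_t i = n / 2; i-- > 0;) {
        down_heap(items, i, n);
    }
    // Repeatedly move the current maximum to the end and shrink the heap.
    for (std::size_t end = n; end > 1; --end) {
        std::swap(items[0], items[end - 1]);
        down_heap(items, 0, end - 1);
    }
}
```

Building the heap bottom-up in this way takes O(n) time overall, compared with O(n log n) for inserting the items one at a time with up-heap.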

Graphs

Type and definitions
Euler's theorem
Directed and undirected graphs
Array representation
Graph traversal: breadth-first and depth-first, and their uses
Graph representation
Vertex operations and classic problems
Adjacency list representation and operations: insert edge & insert vertex
Depth-first search and maze traversal (pseudo-code)
Spanning trees and minimum spanning trees, Kruskal's algorithm (pseudo-code), Prim's algorithm (pseudo-code)
Dijkstra's shortest path algorithm (pseudo-code)
Graph problems: routes, Hamiltonian paths, network flows, covering problems, museum guard problem
Fleury's Euler circuit algorithm
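
As a small illustration of the adjacency list representation and breadth-first traversal, consider the sketch below. It assumes vertices are numbered from 0 and that `graph[v]` lists the neighbours of vertex `v`. The `Graph` alias and the function name are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

using Graph = std::vector<std::vector<std::size_t>>;  // adjacency lists

// Returns the vertices reachable from `start` in the order breadth-first
// search visits them. Runs in O(V + E) for V vertices and E edges.
std::vector<std::size_t> breadth_first_order(const Graph& graph,
                                             std::size_t start) {
    assert(start < graph.size());  // pre-condition: start is a valid vertex
    std::vector<bool> visited(graph.size(), false);
    std::vector<std::size_t> order;
    std::queue<std::size_t> frontier;
    visited[start] = true;
    frontier.push(start);
    while (!frontier.empty()) {
        std::size_t vertex = frontier.front();
        frontier.pop();
        order.push_back(vertex);
        for (std::size_t neighbour : graph[vertex]) {
            if (!visited[neighbour]) {
                visited[neighbour] = true;  // mark on enqueue to avoid duplicates
                frontier.push(neighbour);
            }
        }
    }
    return order;
}
```

Swapping the queue for a stack turns the same loop into an iterative depth-first traversal, although the visiting order then differs from the recursive formulation.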

Hashing

Using keys to address data
Mappings: injection, surjection, bijection
Map ADT
Hash functions
Hash tables: current value tables, direct access tables
Managing collisions: chaining, overflow areas, re-hashing, linear probing, quadratic probing
Evaluating hash functions: prime division, mid-square, folding, load factor
Example application: dictionaries
Generating hash functions and using hash structures
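
Collision handling by chaining, combined with a division hash function and the load factor, can be sketched as follows. This is a minimal example under stated assumptions: integer keys, string values, and a fixed number of buckets (ideally prime). The class name `ChainedHashMap` and its members are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hash table that resolves collisions by chaining: every bucket holds a
// linked list of the (key, value) entries that hash to it.
class ChainedHashMap {
public:
    explicit ChainedHashMap(std::size_t bucket_count = 101)
        : buckets_(bucket_count), size_(0) {
        assert(bucket_count > 0);  // pre-condition: at least one bucket
    }

    // Inserts a new entry, or overwrites the value of an existing key.
    void insert(int key, const std::string& value) {
        auto& chain = buckets_[index_of(key)];
        for (auto& entry : chain) {
            if (entry.first == key) {
                entry.second = value;
                return;
            }
        }
        chain.emplace_back(key, value);
        ++size_;
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const std::string* find(int key) const {
        const auto& chain = buckets_[index_of(key)];
        for (const auto& entry : chain) {
            if (entry.first == key) return &entry.second;
        }
        return nullptr;
    }

    // Average chain length: number of entries divided by number of buckets.
    double load_factor() const {
        return static_cast<double>(size_) / buckets_.size();
    }

private:
    // Division method: key modulo the table size.
    std::size_t index_of(int key) const {
        return static_cast<std::size_t>(static_cast<unsigned>(key)) % buckets_.size();
    }

    std::vector<std::list<std::pair<int, std::string>>> buckets_;
    std::size_t size_;
};
```

With 7 buckets, keys 1 and 8 land in the same bucket and share a chain; lookups remain correct but slow down as the load factor grows.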

Software Design

Static, dynamic, physical structures
Abstract vs. physical structures
Architectural design: components of design, internal vs external aspects, history of design
Top down design and structured design
Yourdon Structured Analysis: data flow diagrams (DFD), data dictionaries, process specification, entity relationship (ER) diagrams, state transition diagrams
Structured vs. object-oriented design
OO programming; classes; type, operational and functional polymorphism; inheritance, attributes, methods, instantiations, abstract classes, object-oriented languages
OO design methodology: UML class diagram, composite-structure diagram, architecture diagram, package diagram, object diagram, component diagram, deployment diagram, activity diagram, sequence diagram, communication diagram, interaction diagram, timing diagram, use case diagram, state machine diagram
OO design principles: open/closed principle, design by contract principle, dependency inversion principle, other design principles, documentation

Operating Systems

Types of operating system (OS), history of OS
Computer hardware
Operating system concepts
Process model
Thread model
Scheduling: batch and interactive
Deadlock: modelling, recovery
Memory management: swapping, virtual memory, paging
Input/output: memory mapped, DMA, interrupts, device drivers
File management: disks, file structure, directory structure
Multiprocessors: synchronization, RPCs, distributed systems
Security

Secondary Storage / Files Management

Secondary storage and disk storage
Buffering techniques
Files: meta-data, flat files, indexed files, hash indexed files
Databases: relational databases, hierarchical databases, NoSQL databases
Compression strategies, dictionary algorithm, LZ algorithm
File structure strategies.

Faculty:

David Vernon

Delivery:

Face-to-face.

Student assessment:

This course includes several hands-on programming and analysis assignments. Students will program mainly in C/C++. The programming assignments include individual assignments and a capstone project completed in teams of 2-3 people. In addition to programming assignments, students will be assigned readings to support the lecture material.

Marks will be awarded as follows.

Individual Assignments: 50%
Final Capstone Project: 40% (the capstone project will be completed in 2-3 person teams)
Instructor Judgement: 10% (we will use time tracking and observation to determine this part of the grade)

Software requirements:

We will use the Microsoft Visual C++ Express compiler, version 10.0 (also known as Visual C++ 2010), and CMake running on Windows 7 (64-bit).

A complete software installation guide will be provided in due course.

Course texts:

Algorithmics: The Spirit of Computing, Third Edition, David Harel and Yishai Feldman.

Data Structures and Algorithms, Alfred V. Aho, Jeffrey D. Ullman, and John E. Hopcroft.

A selection of papers and readings will be provided to complement these required textbooks.

Acknowledgments:

This syllabus is based mainly on Course 04-630 Computer Science Principles for Practicing Engineers given by Mel Rosso-Llopart and Anthony J. Lattanze at Carnegie Mellon University.

Additional topics and teaching material have been taken from Course CS-CO-412 Algorithms and Data Structures given by David Vernon at the Innopolis University.
