\documentclass[a4paper]{article}
\usepackage[scale=0.9]{geometry}
\usepackage{multicol}

\begin{document}
\begin{multicols}{2}
The reader should be able to list the stages of the encapsulated data flow
process. The reader should also be able to compare the logical layers of
the OSI and TCP/IP networking models and identify the logical layers used
by devices on a network.

The reader should understand:
\begin{itemize}
\item The stages of the general troubleshooting process
\item The bottom-up troubleshooting approach
\item The top-down troubleshooting approach
\item The divide and conquer troubleshooting approach
\item How to select an effective troubleshooting approach based on a
  specific situation
\item The process of gathering symptoms from a network
\item Guidelines for gathering symptoms from a user
\item The process of gathering symptoms from an end-system
\end{itemize}

\section{Overview}

Troubleshooting networks is more important than ever. As time goes on,
services continue to be added to networks. With each added service come
more variables, which adds to the complexity of both the network and its
troubleshooting. Organizations increasingly depend on network
administrators and network engineers having strong troubleshooting skills.
Troubleshooting begins by looking at a methodology that breaks down the
process of troubleshooting into manageable pieces. This permits a
systematic approach, minimizes confusion, and cuts down on time otherwise
wasted with trial and error troubleshooting.

Network engineers, administrators, and support personnel realize that
troubleshooting is the process that takes the greatest percentage of their
time. One of the primary goals in this module is to present efficient
troubleshooting techniques that shorten overall troubleshooting time when
working in a production environment.

Two extreme approaches to troubleshooting almost always result in
disappointment, delay, or failure. On one extreme is the theorist, or
rocket scientist, approach. On the other is the practical, or caveman,
approach. Since both of these approaches are extremes, the better approach
is somewhere in the middle using elements of both.

The rocket scientist analyzes and re-analyzes the situation until the exact
cause at the root of the problem has been identified and corrected with
surgical precision. This sometimes requires taking a high-end protocol
analyzer and collecting a huge sample, possibly megabytes, of the network
traffic, while the problem is present. The sample is then inspected in
minute detail. While this process is fairly reliable, few companies can
afford to have their networks down for the hours, or days, it can take for
this exhaustive analysis.

The caveman's first instinct is to start swapping cards, cables, hardware
and software until miraculously the network begins operating again. This
does not mean that the network is working properly, just that it is
operating. Unfortunately, the troubleshooting section in some manuals
actually recommends caveman style procedures as a way to avoid providing
more technical information. While this approach may achieve a change in
symptoms faster, this approach is not very reliable and the root cause of
the problem may still be present. In fact, the parts used for swapping may
include marginal or failed parts swapped out during prior troubleshooting
episodes.

Analyze the network as a whole rather than in a piecemeal fashion. One
technician following a logical sequence will almost always be more
successful than a gang of technicians, each with their own theories and
methods for troubleshooting.

\section{2.1  Using a Layered Architectural Model to Describe Data Flow}

\subsection{2.1.1  Encapsulating data}

Logical networking models separate network functionality into modular
layers. These modular layers are applied to the physical network to isolate
network problems and even create divisions of labor. For example, if the
symptoms of a communications problem suggest a physical connection problem,
the telephone company service person can focus on troubleshooting the T1
circuit that operates at the physical layer. The repair person does not
have to know anything about TCP/IP, which operates at the network layer and above, or
attempt to make changes to devices operating outside of the realm of the
suspected logical layer. The repair person can concentrate on the physical
circuit. If it functions properly, then either the repair person or a
different specialist looks at areas in another layer that could be causing
the problem.

The Open Systems Interconnection (OSI) model provides a common language for
network engineers. Whether the topic is a systematic approach,
documentation, or network architectures, the OSI model is pervasive in
troubleshooting networks. The model allows troubleshooting to be described
in a structured fashion. Problems are typically described in terms of a
given OSI model layer. At this stage the reader is assumed to be thoroughly
familiar with the model. Taking a quick look at the OSI model helps clarify
its role in troubleshooting methodology.

The OSI reference model describes how information from a software
application in one computer moves through a network medium to a software
application in another computer. The OSI reference model is a conceptual
model composed of seven layers, each specifying particular network
functions. The model was developed by the International Organization for
Standardization (ISO) in 1984, and it is now considered the primary
architectural model for intercomputer communications. The OSI model divides
the tasks involved with moving information between networked computers into
seven smaller, more manageable task groups. A task, or group of tasks, is
then assigned to each of the seven OSI layers. Each layer is reasonably
self-contained, so that the tasks assigned to each layer can be implemented
independently. This enables the solutions offered by one layer to be
updated without adversely affecting the other layers. The figure details
the seven layers of the Open Systems Interconnection reference model.

The OSI model provides a logical framework and a common language used by
network engineers to articulate network scenarios. The Layer 1 through
Layer 7 terminology is so common that most engineers do not think twice
about it any more.

The upper layers (5-7) of the OSI model deal with application issues and
generally are implemented only in software. The application layer is
closest to the end user. Both users and application layer processes
interact with software applications that contain a communications
component.

The lower layers (1-4) of the OSI model handle data-transport issues. The
physical layer and data-link layer are implemented in hardware and
software. The other lower layers generally are implemented only in
software. The physical layer is closest to the physical network medium,
such as the network cabling, and is responsible for actually placing
information on the medium.

When sending data from an application in one host to an application in a
second, the network software on the source host takes data from an
application and converts it as needed for transmission over a physical
network.  The process involves:
\begin{itemize}
\item Converting data into segments
\item Encapsulating segments with a header that includes logical network
  addressing information, which converts segments into packets
\item Encapsulating packets with a header that includes physical
  addressing information, which converts packets into frames
\item Encoding frames into bits
\end{itemize}

The data is now ready to travel over the physical medium as bits. The
encapsulation process as a whole represents the initial stage in
transferring data between two end systems.
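
The encapsulation steps above can be sketched in Python. This is a minimal
illustration under stated assumptions, not a real protocol stack: the header
layouts, the function names (\texttt{to\_segment}, \texttt{to\_packet},
\texttt{to\_frame}, \texttt{to\_bits}, \texttt{encapsulate}), and the example
addresses are all hypothetical and far simpler than real TCP, IP, and
Ethernet headers.

```python
import struct


def to_segment(data: bytes, src_port: int, dst_port: int) -> bytes:
    """Convert application data into a segment (toy header: two 16-bit ports)."""
    return struct.pack("!HH", src_port, dst_port) + data


def to_packet(segment: bytes, src_ip: bytes, dst_ip: bytes) -> bytes:
    """Encapsulate a segment with logical (network) addressing."""
    return src_ip + dst_ip + segment


def to_frame(packet: bytes, src_mac: bytes, dst_mac: bytes) -> bytes:
    """Encapsulate a packet with physical addressing (destination first)."""
    return dst_mac + src_mac + packet


def to_bits(frame: bytes) -> str:
    """Encode a frame as a bit string, most significant bit first (a simplification)."""
    return "".join(f"{byte:08b}" for byte in frame)


def encapsulate(data: bytes) -> str:
    """Run all four steps with made-up example ports and addresses."""
    segment = to_segment(data, src_port=1025, dst_port=80)
    packet = to_packet(segment, bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    frame = to_frame(
        packet,
        src_mac=bytes.fromhex("0000aa000001"),
        dst_mac=bytes.fromhex("0000aa000002"),
    )
    return to_bits(frame)
```

Each function adds only the control information of its own layer, which is
why a problem can later be narrowed down to the header, and so the layer,
that carries the faulty information.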

\subsection{2.1.2  Bits on the physical medium}

The Ethernet receiver derives the clock rate from the incoming data
stream. Using a direct signal encoding of 0 volts for a logic 0 value
and 5 volts for a logic 1 value could lead to timing problems.
Specifically, a long string of 1s or 0s could cause the receiver to
lose synchronization with the data. Further, the recipient would be
unable to determine the difference between an idle sender (0 voltage)
and a string of 0s (again 0 voltage).

The solution for this dilemma is found in the Ethernet encoding scheme.
Rather than transmitting the logic level directly, Manchester encoding is
used. With this technique, one transition is guaranteed for each bit cycle,
or bit time.


With a Manchester encoded signal, a binary 1 is represented by a change of
amplitude from a low to a high during the middle of a bit-time. Conversely,
a binary 0 is represented by a change of amplitude from a high to a low
during the middle of a bit-time.


However, the trade-off for this synchronization technique is that twice the
signaling bandwidth is required, since there must be two pulses for every
bit transmitted. As a result, 10-Mbps Ethernet actually signals at up to
20 Mbaud, which is often loosely described as a 20 MHz serial data signal.
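
The encoding rule described above can be sketched in Python. The function
names \texttt{manchester\_encode} and \texttt{manchester\_decode} are
hypothetical, and the signal is modelled simply as a list of half-bit levels
(0 for low, 1 for high) rather than as real voltages.

```python
from typing import List


def manchester_encode(bits: str) -> List[int]:
    """Encode a bit string as half-bit signal levels (0 = low, 1 = high)."""
    levels: List[int] = []
    for bit in bits:
        if bit == "1":
            levels.extend([0, 1])  # low to high in the middle of the bit time
        elif bit == "0":
            levels.extend([1, 0])  # high to low in the middle of the bit time
        else:
            raise ValueError(f"not a binary digit: {bit!r}")
    return levels


def manchester_decode(levels: List[int]) -> str:
    """Recover the bit string from pairs of half-bit levels."""
    if len(levels) % 2:
        raise ValueError("expected two levels per bit")
    bits = []
    for first, second in zip(levels[0::2], levels[1::2]):
        if (first, second) == (0, 1):
            bits.append("1")
        elif (first, second) == (1, 0):
            bits.append("0")
        else:
            raise ValueError("missing mid-bit transition")
    return "".join(bits)
```

Encoding the string 0000 yields the levels 1, 0, 1, 0, 1, 0, 1, 0, so even a
long run of identical bits keeps a transition in every bit time, at the cost
of two signal levels per bit.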

Data moving through the physical layer medium from the source to the
destination is the end product of the encapsulation process.

\subsection{2.1.3  Network devices utilize control information}

Layer 2 network devices use the control information within a frame to
determine where the frame is physically destined on a local network
segment. The physical address, or MAC address, of the destination network
adapter, or interface, is read so that the frame can be switched to the
appropriate port. In addition to reading addressing information, the
Layer 2 device can check the validity of the frame by recalculating the
frame check sequence (FCS) and comparing it with the FCS that was added
during encapsulation at the data-link layer.
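
The FCS comparison can be sketched in Python. This is an illustration under
assumptions rather than switch firmware: it uses the standard library
function \texttt{zlib.crc32} as the checksum, assumes the four FCS bytes are
appended in little-endian order, and the names \texttt{add\_fcs} and
\texttt{fcs\_is\_valid} are hypothetical.

```python
import zlib


def add_fcs(frame_body: bytes) -> bytes:
    """Sender side: append a 4-byte CRC-32 checksum to the frame contents."""
    fcs = zlib.crc32(frame_body) & 0xFFFFFFFF
    return frame_body + fcs.to_bytes(4, "little")


def fcs_is_valid(frame: bytes) -> bool:
    """Receiver side: recalculate the checksum and compare it with the one received."""
    if len(frame) < 4:
        return False  # too short to carry an FCS at all
    body, received_fcs = frame[:-4], frame[-4:]
    recalculated = (zlib.crc32(body) & 0xFFFFFFFF).to_bytes(4, "little")
    return recalculated == received_fcs
```

A frame whose recalculated value differs from the received FCS was damaged
in transit, which points the troubleshooter toward Layer 1 or Layer 2.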

Layer 3 network devices are responsible for determining logical paths
between networks through an internetwork. Layer 3 devices read the
networking address of a destination contained within the control
information of packets, and then forward them to an appropriate interface.
Layer 3 addressing is hierarchical so that intermediate devices need only
know which network the destination device is a member of in order to
deliver the packet to the correct location.
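
The benefit of hierarchical addressing can be sketched in Python with the
standard \texttt{ipaddress} module. The function name
\texttt{next\_interface}, the route table format, and the interface names are
hypothetical, and choosing the most specific matching network is an
assumption about how the forwarding decision is made.

```python
import ipaddress
from typing import Dict, Optional


def next_interface(routes: Dict[str, str], destination: str) -> Optional[str]:
    """Return the interface for the most specific network containing destination.

    routes maps a network prefix such as "10.1.2.0/24" to an interface name.
    """
    address = ipaddress.ip_address(destination)
    best_network = None
    best_interface = None
    for prefix, interface in routes.items():
        network = ipaddress.ip_network(prefix)
        if address in network:
            # Prefer the longest (most specific) matching prefix.
            if best_network is None or network.prefixlen > best_network.prefixlen:
                best_network, best_interface = network, interface
    return best_interface
```

The table holds networks, not individual hosts: the device only needs to
know which network a destination belongs to in order to pick an interface.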

Data flow alternates between the physical medium, which is stage two of
data flow, and Layer 2 and Layer 3 devices, which represent the third stage
in the flow of data from a source to a target end-system.

\subsection{2.1.4  Decapsulation}

When the interface of an end-system receives data from the physical
medium, frames must be extracted from the bit-stream so that the end-system
can verify that the destination physical address of the frame equals its
own. When the physical address is verified, the packet is decapsulated from
the frame control information and the packet's logical control information
is examined. Data is further decapsulated from packets as needed for use
with the target application.
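
Decapsulation can be sketched in Python as the reverse of adding headers.
The frame layout used here (destination MAC, source MAC, source IP,
destination IP, then payload) is a toy format assumed for illustration, and
the name \texttt{decapsulate} is hypothetical. Discarding a packet whose
logical address does not match is likewise an assumption about what
``examined'' means.

```python
from typing import Optional

MAC_LEN = 6  # bytes in a physical (MAC) address
IP_LEN = 4   # bytes in a logical (IPv4) address


def decapsulate(frame: bytes, my_mac: bytes, my_ip: bytes) -> Optional[bytes]:
    """Return the payload if the frame and packet are addressed to this host."""
    # Layer 2: verify the destination physical address, then strip the frame header.
    if frame[:MAC_LEN] != my_mac:
        return None
    packet = frame[2 * MAC_LEN:]

    # Layer 3: examine the logical addressing, then strip the packet header.
    if packet[IP_LEN:2 * IP_LEN] != my_ip:
        return None
    return packet[2 * IP_LEN:]
```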

This represents the fourth stage in the layered model of data flow.  Data
returned to the original sender goes through the same process:
\begin{description}
\item[Stage 1] Encapsulation
\item[Stage 2] Transmission over the physical medium
\item[Stage 3] Network devices utilizing control information to
  deliver data to the appropriate end-system
\item[Stage 4] Decapsulation of data as needed for use with the target
  application
\end{description}

\subsection{2.1.5  OSI model versus TCP/IP model}

Similar to the OSI networking model, the TCP/IP networking model divides
networking architecture into modular layers. The figure shows how the
TCP/IP networking model maps to the layers of the OSI networking model. It
is this close mapping that allows the TCP/IP suite of protocols to
successfully communicate with so many networking technologies.

The TCP/IP network access layer corresponds to the OSI physical and
data-link layers. The network access layer communicates directly with the
network media and provides an interface between the architecture of the
network and the Internet layer.

The TCP/IP Internet layer corresponds to the OSI network layer. The
Internet layer of the TCP/IP protocol model is responsible for placing
messages in a fixed format that allows devices to handle them.

The transport layers of TCP/IP and OSI directly correspond in function. The
transport layer is responsible for exchanging packets between devices on a
TCP/IP network.

The application layer in the TCP/IP suite combines the functions of three
OSI model layers: session, presentation, and application. The application
layer provides communication between applications such as FTP, HTTP, and
SMTP on separate hosts.
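
The layer mapping described in this subsection can be written down as a
small Python lookup table. The names \texttt{OSI\_LAYERS},
\texttt{OSI\_TO\_TCPIP}, and \texttt{osi\_layers\_for} are hypothetical; the
mapping itself follows the text above.

```python
from typing import List

# OSI layer number -> OSI layer name
OSI_LAYERS = {
    1: "physical",
    2: "data link",
    3: "network",
    4: "transport",
    5: "session",
    6: "presentation",
    7: "application",
}

# OSI layer number -> corresponding TCP/IP layer
OSI_TO_TCPIP = {
    1: "network access",
    2: "network access",
    3: "Internet",
    4: "transport",
    5: "application",
    6: "application",
    7: "application",
}


def osi_layers_for(tcpip_layer: str) -> List[int]:
    """Return the OSI layer numbers covered by one TCP/IP layer, lowest first."""
    return sorted(n for n, name in OSI_TO_TCPIP.items() if name == tcpip_layer)
```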

\subsection{2.1.6  Position of network devices in layered model}

The ability to identify which layers pertain to a networking device gives
a troubleshooter the ability to minimize the complexity of a problem by
dividing the problem into manageable parts.  For instance, knowing that
Layer 3 issues are of no importance to a switch, aside from multilayer
switches, defines the boundaries of a task to Layer 1 and Layer 2. Given
the fact that there is still plenty to consider at only these two layers,
this simple knowledge can prevent the wasting of time troubleshooting
irrelevant possibilities and will significantly reduce the amount of time
spent attempting to correct a problem. However, it is still important to
note that there are network applications that are part of these devices
that move into Layers 4-7.

\section{2.2  Troubleshooting Approaches}

\subsection{2.2.1  General troubleshooting process}

The stages of the general troubleshooting process are:

\begin{description}
\item[Step 1] Gather symptoms
\item[Step 2] Isolate the problem
\item[Step 3] Correct the problem
\end{description}

The stages are not mutually exclusive. At any point in the process, it may
be necessary to retrace to previous steps. For instance, it may be required
to gather more symptoms while isolating a problem. Additionally, when
attempting to correct a problem, another unidentified problem could be
created. As a result, it would be necessary to gather the symptoms,
isolate, and correct the new problem.

A troubleshooting policy should be established for each stage. A policy
will give a consistent manner in which to perform each stage. Part of the
policy should include documenting every important piece of information.

\paragraph{Gather symptoms}

In this stage of the general troubleshooting process, the troubleshooter
gathers and documents symptoms from the network, end systems, or users. In
addition, the troubleshooter determines what network components have been
affected and how the functionality of the network has changed compared to
the baseline. Symptoms may appear in many different forms. These forms
include alerts from the network management system, console messages, and
user complaints.

While gathering symptoms, questions should be used as a method of
localizing the problem to a smaller range of possibilities. However, the
problem is not truly isolated until a single problem, or a set of related
problems, is identified.

\paragraph{Isolate the problem}

In this stage, the troubleshooter identifies the characteristics of
problems at the logical layers of the network so that the most likely cause
can be selected. The troubleshooter may gather and document more symptoms
depending on the problem characteristics that are identified.

\paragraph{Correct the problem}

In this stage, the troubleshooter corrects an identified problem by
implementing, testing, and documenting a solution. If the troubleshooter
determines that the corrective action has created another problem, the
attempted solution is documented, the changes are removed, and the
troubleshooter returns to gathering symptoms and isolating the problem.

\subsection{2.2.2  Bottom-up}

When applying a bottom-up approach to troubleshooting a networking
problem, the examination starts with the physical components of the network
and works up through the layers of the OSI model until the cause of the
problem is identified. It is a good approach to use when the problem is
suspected to be physical. Most networking problems reside at the lower
layers, so the bottom-up approach is often effective.

The downside of this approach is that it requires checking every device
and interface on the network until the possible cause of the problem is
found. Each conclusion and possibility must be documented. The challenge is
to determine which devices to start with.

In many cases, problems within the first four layers can be determined by
entering a ping or traceroute command. If the connection is successful,
then the cause is likely at the application level. Otherwise, a closer look
at the lower levels will be needed to locate the problem.
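
The bottom-up procedure, narrowed by a ping result as described above, can
be sketched in Python. The function names \texttt{suspect\_layers} and
\texttt{bottom\_up} are hypothetical, and each per-layer check is assumed to
be a function supplied by the troubleshooter that returns true when the
layer works.

```python
from typing import Callable, Dict, List, Optional


def suspect_layers(ping_succeeded: bool) -> List[int]:
    """Narrow the search: a successful ping points above the first four layers."""
    return [5, 6, 7] if ping_succeeded else [1, 2, 3, 4]


def bottom_up(checks: Dict[int, Callable[[], bool]],
              layers: List[int]) -> Optional[int]:
    """Run the available checks from the lowest layer upward.

    Returns the first layer whose check fails, or None if every check passes.
    """
    for layer in sorted(layers):
        check = checks.get(layer)
        if check is not None and not check():
            return layer
    return None
```

For example, with checks in which Layer 1 passes and Layer 2 fails, a failed
ping leads \texttt{bottom\_up} to report Layer 2 without examining the upper
layers at all.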

Verify that Internet Control Message Protocol (ICMP) echo request and echo
reply are enabled on the network so that commands such as ping and
traceroute can work. If they are disabled, the failure of a ping or
traceroute command can easily be mistaken for a loss of connectivity. If
ping has been disabled on the network, that is the result of a policy
decision, so enabling it should include authorization from the network
administrator and documentation of that authorization. Document in a
station log or your personal work log that ping, or any command that was
initially disabled, was enabled for network testing and subsequently
disabled. This record is important should there be an unauthorized
intrusion into the network while you are troubleshooting.

\subsection{2.2.3  Top-down}

When applying a top-down approach to troubleshooting a networking
problem, the end-user application is examined first. The troubleshooter
then works down from the upper layers of the OSI model until the cause of
the problem has been identified. With this approach, the applications of an
end system are tested before tackling the more specific networking pieces.
A troubleshooter would most likely select this approach for simpler
problems or when the problem appears to be with a piece of software.

The disadvantage of this approach is that it requires checking every
network application until the possible cause of the problem is found. Each
conclusion and possibility must be documented. As with the bottom-up
approach, the challenge is to determine which application to start with.

\subsection{2.2.4  Divide and conquer}

When the divide and conquer approach is applied to a networking problem, a
starting layer is selected and testing proceeds in either direction from
that layer. The starting layer is chosen based on the troubleshooter's
experience and the symptoms gathered about the problem. Once the direction
of the problem is identified, troubleshooting follows that direction until
the cause of the problem is identified.

If it can be verified that a layer is functioning, it is typically a safe
assumption that the layers below it are functioning as well. If a layer is
not functioning properly, gather symptoms of the problem at that layer and
work downward to lower layers.
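
The divide and conquer rule can be sketched in Python. The function name
\texttt{divide\_and\_conquer} and the \texttt{layer\_ok} callback are
hypothetical, and the sketch assumes that a fault at one layer also makes
every layer above it fail its test.

```python
from typing import Callable, Optional


def divide_and_conquer(layer_ok: Callable[[int], bool],
                       start: int) -> Optional[int]:
    """Return the lowest failing OSI layer, or None if no tested layer fails.

    layer_ok(n) reports whether layer n (1 to 7) passes its test.
    """
    if not 1 <= start <= 7:
        raise ValueError("start must be an OSI layer number from 1 to 7")

    if layer_ok(start):
        # The layers below are assumed to work, so the search moves upward.
        for layer in range(start + 1, 8):
            if not layer_ok(layer):
                return layer
        return None

    # The starting layer fails, so the search moves downward until a layer works.
    lowest_failing = start
    for layer in range(start - 1, 0, -1):
        if layer_ok(layer):
            break
        lowest_failing = layer
    return lowest_failing
```

Starting in the middle of the stack means that, in this sketch, at most one
direction is ever searched, which is where the time saving over a strict
bottom-up or top-down pass comes from.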


\subsection{2.2.5  Guidelines for selecting an approach}

Selecting an effective troubleshooting approach usually resolves a network
problem more quickly and cost-effectively.

Consider the following when selecting a troubleshooting approach.

\subsubsection{Determine the scope of the problem}

A troubleshooting approach is often selected based on the complexity of
the problem. A bottom-up approach typically works better for complex
problems. Using a bottom-up approach for a simple problem may be overkill
and inefficient. Typically, if symptoms come from users, a top-down
approach is used. If symptoms come from the network, a bottom-up approach
will likely be more effective.

\subsubsection{Apply previous experiences}

If a particular problem has been experienced previously, then the
troubleshooter may know of a way to shorten the troubleshooting process. A
less experienced troubleshooter will likely implement a bottom-up approach,
while a skilled troubleshooter may be able to jump into a problem at a
different layer using the divide and conquer approach.

\subsubsection{Analyze the symptoms}

The more known about a problem, the better the chance that it can be
solved. It may be possible to immediately correct a problem simply by
analyzing the symptoms.

\paragraph{Example}

Two IP routers in a network have connectivity but are not exchanging
routing information. Before attempting to solve the problem, a
troubleshooting approach needs to be selected. Similar symptoms have been
seen previously, and they point to a likely protocol issue. Since there is
connectivity between the routers, the problem is not likely to be at the
physical or data link layer. Based on this past experience, the
troubleshooter decides to use the divide and conquer approach and begins
testing the TCP/IP-related functions at the network layer.
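
The selection guidelines in this subsection can be condensed into a small
Python sketch. The function name \texttt{select\_approach}, its parameters,
and the order in which the guidelines are applied are assumptions made for
illustration; the guidelines themselves are rules of thumb, not a fixed
procedure.

```python
def select_approach(symptom_source: str, seen_before: bool = False) -> str:
    """Suggest a troubleshooting approach from the guidelines in the text.

    symptom_source is "user" or "network"; seen_before is True when the
    troubleshooter has previous experience with similar symptoms.
    """
    if seen_before:
        # Experience lets the troubleshooter jump straight to a likely layer.
        return "divide and conquer"
    if symptom_source == "user":
        return "top-down"
    if symptom_source == "network":
        return "bottom-up"
    raise ValueError(f"unknown symptom source: {symptom_source!r}")
```

In the router example, the symptoms come from the network but have been seen
before, so the sketch suggests divide and conquer, matching the decision in
the text.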

\section{2.3  Gathering Symptoms}

\subsection{2.3.1  Gathering symptoms for a network problem}

Following are the stages for gathering symptoms for a network problem:


\paragraph{Stage 1}

The troubleshooter analyzes symptoms gathered from the trouble ticket,
users, or end systems affected by the problem to form a definition of
the problem.

\paragraph{Stage 2}

If the problem is in the troubleshooter's system, it will be necessary
to move on to stage 3. If the problem is outside the boundary of the
troubleshooter's control, it will be necessary to contact an
administrator for the external system before gathering additional
network symptoms.

\paragraph{Stage 3}

The troubleshooter determines whether the problem is at the core,
distribution, or access layer of the network. At the identified layer,
the troubleshooter uses an analysis of existing symptoms and knowledge of
the network topology to determine which piece or pieces of equipment are
the most likely cause.

\paragraph{Stage 4}

Using a layered troubleshooting approach, the troubleshooter gathers
hardware and software symptoms from the suspect devices. The
technician starts with the most likely possibility and uses knowledge
and experience to determine if the problem is more likely a hardware
or software configuration problem.

\paragraph{Stage 5}

Document any hardware or software symptoms. If the problem can be
solved using the documented symptoms, a troubleshooter will solve the
problem and document the solution. If the problem cannot be solved,
the technician begins the isolating phase of the general
troubleshooting process.

Be prudent with use of the debug command on a network. It generates enough
console message traffic that the performance of a network device can be
noticeably affected. Be sure to disable debugging when its capabilities are
no longer needed.

\subsection{2.3.2  Gathering symptoms from an end-user: hardware}

When gathering symptoms for perceived hardware problems, a troubleshooter
should physically inspect the devices, or ask for a physical inspection,
using the senses of hearing, sight, smell, and touch. Physical symptoms may
be related to, but are not limited to, the following:

\begin{itemize}
\item Electromagnetic interference (EMI) from radio and television
  transmitters, or the introduction of portable devices that create
  EMI to the area
\item Indicator lights of a NIC or networking device
\item Cable connections, the crimping of connectors and the physical
  state of connection sockets
\item Incorrect seating of modules and cards
\item Burning smells from insulative material which has melted, or of
  burnt out components
\item Overheating due to cooling fan malfunction
\end{itemize}

\subsection{2.3.3  Gathering symptoms from an end-user: software}

When gathering symptoms for probable software configuration problems,
a troubleshooter should start at the last known point where the
network functioned correctly. If an end-user station can successfully
ping the gateway but not the DNS server on another network segment,
then an entire set of potential problems associated with the physical
layer at the user-site can be eliminated. Effective questioning
techniques can discover this type of information without requiring a
trip to the end-user location. The commands shown in the figure can be
used to check the status of various devices and be used to determine
which configuration aspects to inspect.  The troubleshooter should use
effective questioning techniques to document the symptoms of a
problem:

\begin{itemize}
\item Ask questions that are pertinent to the problem.
\item Use each question as a means to either eliminate or discover
  possible problems.
\item Speak at a technical level that the user can understand.
\item Ask the user when the problem was first noticed.
\item Ask the user to re-create the problem, if possible.
\item Determine the sequence of events that took place before the
  problem happened.
\item Match the symptoms that the user describes with common problem
  causes.
\end{itemize}

\subsection{2.3.4  Questions to ask an end-user}

 When asking an end user questions, it is important to follow a specific
sequence to allow the troubleshooter to gain the knowledge necessary to
attain a solution. A typical format for interviewing an end user concerning
their problem is:

\begin{itemize}
\item What does not work?

\item What does work?

\item Are the things that do and do not work related?

\item Has the thing that does not work ever worked?

\item When was the problem first noticed?

\item What has changed since the last time it did work?

\item Did anything unusual happen since the last time it worked?

\item When exactly does the problem occur?

\item Can the problem be reproduced and if so, how can it be reproduced?
\end{itemize}

\paragraph{Question criteria}
\begin{itemize}
\item Ask questions that are pertinent to the problem.
\item Use questions to either eliminate or discover possible problems.
\item Speak at a technical level the user can understand.
\item Match user symptoms with common problem causes.
\end{itemize}

\paragraph{Questions for the end user}
\begin{itemize}
\item When did the user first notice the problem?
\item Can the user re-create the problem?
\item What sequence of events took place before the problem
  happened?
\end{itemize}

\subsection{2.3.5  Flow charts for gathering network and end-user symptoms}
\begin{description}
\item[Stage 1 Interview user] If possible, a troubleshooter gathers
  initial symptoms from the user and uses these symptoms as a basis
  for additional troubleshooting.

\item[Stage 2 Analyze symptoms] A troubleshooter forms a description
  of the problem by analyzing the symptoms gathered from the user.

\item[Stage 3 Determine symptoms] Using a layered troubleshooting
  approach, a troubleshooter gathers hardware and software symptoms
  from the end system, starting with the most likely cause. The
  troubleshooter should rely on previous experience, if possible, to
  decide if the problem is more likely a hardware or software problem.

\item[Stage 4 Document symptoms] Document any hardware and software
  symptoms. If the problem can be solved using the documented
  symptoms, a troubleshooter solves the problem and documents the
  solution. If the problem cannot be solved at this point, then the
  isolating phase of the general troubleshooting process is initiated.
\end{description}
\end{multicols}
\end{document}