eOliva's blog: September 2008

The basic operation of ARP is a request/response pair of transmissions on the local network.
The source transmits a broadcast containing information about the destination. The destination then responds unicast back to the source, telling the source the hardware address of the destination.

ARP Message types and address designations

There are two different messages sent in ARP, one from the source to the destination and one from the destination to the source.

The sender is the one transmitting the message and the target is the one receiving it. The identity of the sender and the target changes with each message they trade.

Request: For the initial request, the sender is the source, the device with the IP datagram to send, and the target is the destination.
Reply: The sender is the destination; it replies to the source, which becomes the target.

Each of the two devices (sender and target) has two addresses (layer two and layer three), so four different addresses are carried in each message.

Sender Hardware Address: The layer two address (MAC)
Sender Protocol Address: The layer three address (IP)
Target Hardware Address: The layer two address (MAC)
Target Protocol Address: The layer three address (IP)

ARP General Operation

The overall operation that ARP uses to resolve an address is described in the next nine steps:

Source Device Checks Cache: The source device checks its cache to see if it already has a resolution for the destination device. If it does, it can skip to the last step.
Source Device Generates ARP Request Message: It puts its own data link layer address as the Sender Hardware Address and its own IP address as the Sender Protocol Address.
It fills in the IP address of the destination as the Target Protocol Address and leaves the Target Hardware Address, which is what it must discover, blank.
Source Device Broadcasts ARP Request Message
Local Devices Process ARP Request Message: The message is received by each device on the local network, and each of them looks for a match on the Target Protocol Address.
Destination Device Generates ARP Reply Message: The device whose IP address matches will generate an ARP Reply message.
It takes the Sender Hardware Address and Sender Protocol Address fields from the ARP Request message and uses these as the values for the Target Hardware Address and Target Protocol Address of the reply.
Finally it fills the Sender Protocol Address with its IP and the Sender Hardware Address with its MAC.
Destination Device Updates ARP Cache: The destination will now add an entry to its own ARP cache containing the hardware and IP addresses of the source that sent the ARP Request.
Destination Device Sends ARP Reply Message: This is sent as unicast to the source device.
Source Device Processes ARP Reply Message
Source Device Updates ARP Cache
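
The nine steps above can be sketched as a small in-memory simulation. This is only an illustration, not a real network stack: the Device and ArpMessage classes, the blank MAC placeholder and the list standing in for the local network are all made up for this example.

```python
from dataclasses import dataclass, field
from typing import Optional

BLANK_MAC = "00:00:00:00:00:00"  # placeholder for the unknown hardware address


@dataclass
class ArpMessage:
    op: str  # "request" or "reply"
    sender_mac: str
    sender_ip: str
    target_mac: str
    target_ip: str


@dataclass
class Device:
    ip: str
    mac: str
    cache: dict = field(default_factory=dict)  # maps IP -> MAC

    def handle_request(self, req: ArpMessage) -> Optional[ArpMessage]:
        # Step 4: every local device checks the Target Protocol Address.
        if req.target_ip != self.ip:
            return None
        # Step 5: the request's sender fields become the reply's target fields.
        reply = ArpMessage("reply", self.mac, self.ip,
                           req.sender_mac, req.sender_ip)
        # Step 6: the destination caches the source's addresses.
        self.cache[req.sender_ip] = req.sender_mac
        # Step 7: the reply goes back to the source only (unicast).
        return reply

    def resolve(self, dest_ip: str, network: list) -> Optional[str]:
        # Step 1: check the cache first.
        if dest_ip in self.cache:
            return self.cache[dest_ip]
        # Step 2: build the request with the target hardware address blank.
        req = ArpMessage("request", self.mac, self.ip, BLANK_MAC, dest_ip)
        # Step 3: "broadcast" by handing the request to every other device.
        for dev in network:
            if dev is self:
                continue
            reply = dev.handle_request(req)
            if reply is not None:
                # Steps 8 and 9: process the reply and update the cache.
                self.cache[reply.sender_ip] = reply.sender_mac
                return reply.sender_mac
        return None  # nobody on the local network owns that IP


a = Device("10.0.0.1", "aa:aa:aa:aa:aa:01")
b = Device("10.0.0.2", "aa:aa:aa:aa:aa:02")
c = Device("10.0.0.3", "aa:aa:aa:aa:aa:03")
lan = [a, b, c]
resolved = a.resolve("10.0.0.2", lan)
```

After the call, both A and B hold an entry for each other, while C (which only saw a request that was not for it) has learned nothing in this basic version.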

ARP Message Format

The message format includes a field describing the type of message (its operation code, or opcode) and information on both layer two and layer three addresses.
In order to support addresses that may vary in length, the format specifies the type of protocol used at both layer two and layer three, and the length of the addresses used at each of these layers.
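
As a rough sketch, here is how such a message could be packed and unpacked for the common Ethernet plus IPv4 case. The field order and sizes follow RFC 826 (they are not spelled out in this post), and the function names are just picked for the example.

```python
import socket
import struct

# Ethernet + IPv4 layout per RFC 826:
# hardware type (2), protocol type (2), hardware length (1), protocol length (1),
# opcode (2), sender MAC (6), sender IP (4), target MAC (6), target IP (4)
ARP_FORMAT = "!HHBBH6s4s6s4s"
HTYPE_ETHERNET = 1
PTYPE_IPV4 = 0x0800
OP_REQUEST = 1
OP_REPLY = 2


def mac_to_bytes(mac: str) -> bytes:
    return bytes.fromhex(mac.replace(":", ""))


def bytes_to_mac(raw: bytes) -> str:
    return ":".join(f"{b:02x}" for b in raw)


def build_arp(oper: int, sender_mac: str, sender_ip: str,
              target_mac: str, target_ip: str) -> bytes:
    # The two length fields (6 and 4) are what let the format carry
    # addresses of other sizes on other kinds of networks.
    return struct.pack(
        ARP_FORMAT,
        HTYPE_ETHERNET, PTYPE_IPV4, 6, 4, oper,
        mac_to_bytes(sender_mac), socket.inet_aton(sender_ip),
        mac_to_bytes(target_mac), socket.inet_aton(target_ip),
    )


def parse_arp(packet: bytes) -> dict:
    htype, ptype, hlen, plen, oper, sha, spa, tha, tpa = struct.unpack(
        ARP_FORMAT, packet)
    return {
        "htype": htype, "ptype": ptype, "hlen": hlen, "plen": plen,
        "oper": oper,
        "sender_mac": bytes_to_mac(sha), "sender_ip": socket.inet_ntoa(spa),
        "target_mac": bytes_to_mac(tha), "target_ip": socket.inet_ntoa(tpa),
    }


# A request: the target hardware address is still unknown, so it is left as zeros.
request = build_arp(OP_REQUEST, "aa:bb:cc:dd:ee:01", "192.168.0.10",
                    "00:00:00:00:00:00", "192.168.0.20")
fields = parse_arp(request)
```

This only builds the ARP payload; actually putting it on the wire would also need an Ethernet frame around it and a raw socket, which is left out here.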

Proxy ARP

ARP was designed to be used by devices directly connected on a local network.
Each device should be capable of sending both unicast and broadcast transmissions directly to each other.
So if device A and device B are separated by a router, they aren't local to each other. Device A would not send directly to B or vice versa; each would send to the router at layer two, and they would be considered "two hops apart" at layer three.

Proxy ARP Operation

The router that sits between the two networks is configured to respond to device A's broadcast on behalf of device B. It does not send back to A the hardware address of device B; since they are not on the same network, A cannot send directly to B anyway.

Instead, the router sends A its own hardware address. A then sends to the router, which forwards the message to B on the other network. The router does the same thing on A's behalf for B.
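
The router's decision can be sketched in a few lines. Everything here is hypothetical (the side names, the MAC addresses, the host tables); a real router would derive this from its interface and routing configuration.

```python
from typing import Optional

# Hypothetical setup: one router interface per side, and the hosts known on each side.
ROUTER_MAC = {"net_a": "02:00:00:00:00:0a", "net_b": "02:00:00:00:00:0b"}
HOSTS = {"net_a": {"10.0.0.5"}, "net_b": {"10.0.0.200"}}


def proxy_arp_reply(arrived_on: str, target_ip: str) -> Optional[str]:
    """Return the MAC the router answers with, or None if it stays silent."""
    for side, hosts in HOSTS.items():
        # Answer only for hosts that live on the *other* side of the router.
        if side != arrived_on and target_ip in hosts:
            # The answer is the router's own MAC on the asking side,
            # never the real MAC of the target.
            return ROUTER_MAC[arrived_on]
    # The target is on the asking side (or unknown): let it answer for itself.
    return None


mac_for_a = proxy_arp_reply("net_a", "10.0.0.200")  # A asks for B
mac_for_b = proxy_arp_reply("net_b", "10.0.0.5")    # B asks for A
```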

Proxy ARP provides flexibility for networks where hosts are not all actually on the same physical network.

Remember, ARP cannot function between devices on different physical networks.

The network layer (OSI Model) or Internet layer (TCP/IP Model) is where the internetworking protocols are defined.

The data link layer (layer two) and the network layer (layer three) are intimately related. They perform different tasks but, as neighbors in the protocol stack, they must cooperate with each other.

So we need some kind of "glue" to join these two layers and let the "conversation" between them happen. The main job performed by this "glue" is address resolution, or providing mappings between layer two and layer three addresses. This resolution can be done in either direction, and is represented by the two TCP/IP protocols ARP and RARP.

Address Resolution Protocol (ARP)

Communication on an internetwork is accomplished by sending data at layer three using a network layer address, but the actual transmission of that data occurs at layer two using a data link layer address.

Every device that has a complete networking stack will have both a layer two and a layer three address. So it is necessary to define a way of linking these addresses together.
This is done by taking a network layer address (IP for example) and determining what data link layer address (MAC Address) goes with it.
This is called address resolution.

There is a lot of discussion about where address resolution fits in the OSI Model, and it's hard to explain. OSI is a model, and its rules apply except for the exceptions (:o nice uh?); address resolution is one of these.

In the OSI Model there are two layers that deal with addressing: data link layer and network layer.
The physical layer is concerned only with sending at the bit level.
The layers above network work with network layer addresses (IP).

So, why addressing at two different layers?
The different types of addresses are for different purposes.

Layer two addresses are used for local transmissions between hardware devices that can communicate directly, so it deals with directly-connected devices (on the same network).
Layer three addresses are used in internetworking, to create some kind of "virtual" network at the network layer, so it deals with indirectly-connected devices (as well as directly-connected).

The big problem is: IP addresses are too high level for the physical hardware on the network to deal with.

So, we have two methods to do this address resolution: one more efficient and less flexible, and the other less efficient and more flexible.

Direct Mapping

Network layer addresses must be resolved into data link layer addresses numerous times during the travel of each datagram across an internetwork. The easiest way to accomplish this is a direct mapping between the two types of addresses (the IP and the MAC addresses).

The idea here is to choose a scheme for layer two and layer three such that you can determine one from the other using a simple algorithm, so you can take the layer three address and convert it to a layer two address with a short procedure. Thus whenever you have the layer three address, you already have the layer two address.

The simplest example of direct mapping would be if we used the same structure and semantics for both data link and network layer addresses. Then, determining the layer two address is a simple matter of selecting a certain portion of the layer three address.

Example:

Consider a simple LAN like ARCNet, which uses a short 8-bit data link layer address.
We could set up an IP network on such a LAN by taking a class C (or /24) network and using the ARCNet data link layer address as the last octet. So with a network at 222.101.33.0/24 we could assign physical address #1 to IP 222.101.33.1 and physical address #29 to 222.101.33.29.

So to get the hardware address of a device, you just use the final 8 bits of the IP address. This is highly efficient because it requires no exchange of data on the network at all.
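
In code, the direct mapping from the example is just a bit mask. The function name is made up for this sketch, and it assumes the /24 scheme described above.

```python
import ipaddress


def arcnet_address(ip: str) -> int:
    """Direct mapping: the 8-bit hardware address is the last octet of the IP."""
    return int(ipaddress.IPv4Address(ip)) & 0xFF


hw_first = arcnet_address("222.101.33.1")
hw_other = arcnet_address("222.101.33.29")
```

No lookup table and no network traffic are involved; the hardware address is computed from the IP address alone.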

Direct Mapping is not possible with large hardware addresses

Direct mapping only works when it's possible to express the data link layer address within the network layer address. The same IP of 222.101.33.29 on an Ethernet network will have a different type of hardware address: it's a "hard-wired" address that comes with the card itself.

These addresses are 48 bits long, not 8.
This means the layer two address is bigger than the layer three address (32 bits for IPv4), so there's no way to do a direct mapping here.

Dynamic Address Resolution

This is the alternative to direct mapping resolution.

Have you seen limousine drivers waiting at the airport to pick up a person they do not know personally?

This is similar to our problem: they know the name of the person they must transport, but not the person's face. To find the person, they hold up a card bearing that person's name. Everyone other than that person ignores the card, but hopefully the individual being sought will recognize it and approach the driver.

So, let's look at the computer's world: Device A wants to send to device B but knows only device B's network layer address (its "name") and not its data link layer address (its "face"). It broadcasts a layer two frame containing the layer three address of device B - this is like holding up the card with someone's name on it. The devices other than B don't recognize this layer three address and ignore it.
Device B, which knows its own network layer address, recognizes this broadcast and sends a direct response back to device A, telling device A what device B's layer two address is.

The good thing is that there is no need for any specific relationship between the network layer address and the data link layer address.

Dynamic Address Resolution Caching

Most devices on a local network send to only a small handful of other physical devices. This is called locality of reference.

When you send a request to an Internet Web site from your office PC, it will need to go first to your company network's local router, so you'll need to resolve the router's layer two address.
If later you click a link on that site, that request will also need to go to the router.

Having to do a fresh resolution each time is stupid. It would be like having to look up the phone number of your best friend every time you want to call to say hello.

After a device's network layer address is resolved to a data link layer address, the link between the two is kept in the memory of the device for a period of time. The next time it needs the layer two address, it does a quick lookup in its cache.
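
A cache like this can be sketched as a dictionary whose entries expire. The ArpCache class and the 60-second lifetime are assumptions for the example; real systems each pick their own timeout.

```python
import time
from typing import Optional


class ArpCache:
    """IP -> MAC table whose entries are forgotten after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self.ttl = ttl_seconds      # assumed lifetime, not a standard value
        self.clock = clock          # injectable so the sketch can be tested
        self._entries = {}          # ip -> (mac, expiry time)

    def add(self, ip: str, mac: str) -> None:
        self._entries[ip] = (mac, self.clock() + self.ttl)

    def lookup(self, ip: str) -> Optional[str]:
        entry = self._entries.get(ip)
        if entry is None:
            return None             # never resolved: a fresh resolution is needed
        mac, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[ip]   # stale entry: drop it and resolve again
            return None
        return mac


cache = ArpCache()
cache.add("192.168.0.1", "aa:bb:cc:dd:ee:ff")
router_mac = cache.lookup("192.168.0.1")
```

The expiry matters because a mapping can stop being true, for example when a network card is replaced or an IP address is reassigned to another machine.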

Other enhancements to dynamic resolution

When we send a request that needs to go to our local router, we resolve its address and send it the request.

A reply comes back to the router to be sent to us, so the router needs our address and has to do a dynamic resolution on us even though we just exchanged frames. Again: stupid. Instead, we can improve efficiency through cross-resolution: when device A resolves the address of device B, device B also adds the entry for device A to its cache.

As devices on a local network are going to talk to each other fairly often, when A is resolving B's network layer address, it will broadcast a frame that devices C, D, E and so on will see.
Why not have them also update their cache tables with resolution information that they see, for future use?
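
A minimal sketch of that last idea, assuming that what the bystanders can learn from the broadcast is the sender's own pair of addresses (the function and the dictionary of caches are made up for this example):

```python
def broadcast_request(caches: dict, sender_name: str,
                      sender_ip: str, sender_mac: str) -> None:
    """Every device that sees the broadcast records the sender's mapping."""
    for name, cache in caches.items():
        if name != sender_name:       # the sender already knows itself
            cache[sender_ip] = sender_mac


# One cache (IP -> MAC) per device on the local network.
caches = {"A": {}, "B": {}, "C": {}, "D": {}}
broadcast_request(caches, "A", "10.0.0.1", "aa:aa:aa:aa:aa:01")
```

After A's single broadcast, B, C and D can all send to A later without asking first.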

So that's it, most of this was a synopsis from TCP/IP Guide, if you wanna take a look, it's worth it.

Thanks!!

Tuesday, September 23, 2008

TCP/IP Address Resolution Protocol - ARP

Friday, September 19, 2008

Understanding the Address Resolution
