Implementing an email discovery,
compliance, archiving, and retention (DCAR) solution is similar to remodeling a house. You need a “blueprint” (plan) that provides detailed
instructions for the “remodeling”
(design, components, and implementation steps of the DCAR solution).
You also need to be able to translate
the plan into reality—as a contractor
would on a remodeling job—within
the budget and schedule you've allotted for the project. Like a remodeling,
implementing a DCAR solution might
disrupt your current messaging
scheme somewhat, but in the end it
will make your job as an Exchange
administrator easier and add functionality to your messaging system.
Fortunately, Exchange provides some
built-in features—message journaling, backup and restore APIs, and message and transport security—that
provide a framework for building a
DCAR solution. We'll examine those
features as well as some related technologies in Exchange, such as event
sinks, protocol logs, and message
tracking, and explore how each fits
into a DCAR strategy. In the Web-exclusive sidebar “Third-Party Products in an Exchange DCAR Solution,”
we'll look at
some important DCAR functions that
you'll need to obtain through third-party products and what to consider
when choosing such products.
Message Journaling
You'll find a growing degree of confusion about the difference between
archiving and journaling. From a
high-level point of view, you can
successfully argue that little distinguishes the two methods; both extract messaging data from the messaging
system. However, you do need to
know the difference, not only to
implement your solution correctly but
also to evaluate whether third-party
components will meet your needs.
Archiving is the process of removing content from the messaging system for long-term storage in some
other system, usually some type of
database. Email messages are taken
from user mailboxes according to criteria, such as age. Archiving technology usually provides some kind of
Web or mailbox-based extension
mechanism that lets users continue to
access archived content as necessary.
Archiving features generally address the discovery, archiving, and retention components of DCAR. Archiving technology
is useful for mailbox management,
for reducing storage requirements by
consolidating and compressing multiple copies of the same data, and for ensuring the preservation of corporate knowledge.
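To make the archiving idea concrete, here is a minimal Python sketch of age-based selection followed by storing identical content only once. Everything in it is an assumption made for illustration: the 90-day threshold, the dictionary layout of a message, and the use of a content hash to consolidate duplicates. It doesn't describe how Exchange or any particular archiving product works.

```python
import hashlib
from datetime import datetime, timedelta

def select_for_archive(messages, now, max_age_days=90):
    """Return the messages older than the cutoff (the age criterion)."""
    cutoff = now - timedelta(days=max_age_days)
    return [m for m in messages if m["received"] < cutoff]

def store_single_instance(archive, message):
    """Store a body once per unique content hash and return its key."""
    key = hashlib.sha256(message["body"].encode("utf-8")).hexdigest()
    archive.setdefault(key, message["body"])  # no-op if already stored
    return key

# Hypothetical mailbox contents.
now = datetime(2006, 1, 1)
mailbox = [
    {"id": 1, "received": datetime(2005, 6, 1), "body": "Q2 report attached"},
    {"id": 2, "received": datetime(2005, 6, 2), "body": "Q2 report attached"},
    {"id": 3, "received": datetime(2005, 12, 20), "body": "Lunch?"},
]

archive = {}
# Map each archived message ID to the key of its stored content, so a
# stub left in the mailbox could point users back to the archived copy.
stubs = {m["id"]: store_single_instance(archive, m)
         for m in select_for_archive(mailbox, now)}
```

In this sketch, the two old messages share one stored body, and the recent message stays in the mailbox.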
Journaling is the process of creating
or capturing copies of email messages
as they enter and traverse the messaging system, ensuring that those copies
are collected in central locations.
Together, the journal copies comprise
a searchable form of documentation
that administrators and auditors can
use to see the email messages that users
are sending and receiving. Journaling generally addresses the compliance
component of DCAR and is one of
the three common mechanisms for
moving messaging data into compliance and policy solutions. Journaling
usually doesn't provide any direct
benefits to end users, but it can be a
vital part of a complete DCAR solution.
Although Exchange has no built-in archiving functionality, it has
included basic journaling capabilities
since Exchange Server 5.5 Service Pack 1 (SP1). Over the years, Microsoft has increased journaling functionality in the various Exchange
releases, service packs, and occasional hotfixes. Today, Exchange
includes three types of journaling:
- simple journaling (also known as
message-only journaling)
- blind carbon copy (Bcc) journaling
- envelope journaling
All three types work on the same basic
principle: Almost every email message
that enters the Exchange organization
is examined to see whether it's bound
for a recipient configured for journaling. If it is, the first Exchange Server
categorizer through which it passes
creates another copy of the email message (in a process called bifurcation)
that's delivered to a specified journal
mailbox or public folder. The only
messages exempted from journaling
are system messages such as Active
Directory (AD) replication messages,
public folder replication messages,
and journal messages.
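The principle can be sketched in a few lines of Python. This is a simplified model, not Exchange's categorizer: the function name bifurcate, the journal_targets mapping, and the message dictionary are all hypothetical names chosen for the illustration.

```python
# Message types the text lists as exempt from journaling.
SYSTEM_TYPES = {"ad_replication", "public_folder_replication", "journal"}

def bifurcate(message, journal_targets):
    """Return (destination, message) deliveries for one message.

    journal_targets maps each recipient configured for journaling
    to its journal mailbox (an assumed, simplified configuration).
    """
    deliveries = [(rcpt, message) for rcpt in message["recipients"]]
    if message.get("type", "normal") in SYSTEM_TYPES:
        return deliveries  # system and journal messages aren't journaled

    # Collect each distinct journal mailbox once, in first-seen order.
    journal_mailboxes = []
    for rcpt in message["recipients"]:
        target = journal_targets.get(rcpt)
        if target is not None and target not in journal_mailboxes:
            journal_mailboxes.append(target)

    # Bifurcation: create one extra copy per journal mailbox.
    for target in journal_mailboxes:
        journal_copy = dict(message, type="journal")
        deliveries.append((target, journal_copy))
    return deliveries

journal_targets = {"adam": "journal-01", "barbara": "journal-01"}
msg = {"recipients": ["adam", "barbara", "zoe"], "body": "hello"}
result = bifurcate(msg, journal_targets)
```

Marking the copy as a journal message is what keeps it from being journaled again when it passes through the same logic.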
Note: Although you can specify a
public folder as the journal destination, Microsoft recommends that you
specify a mailbox. Journal messages
delivered to public folders can't be
stamped with the full range of data
with which email messages delivered
to a mailbox can be stamped.
Although you have some control over
which recipients are configured for
journaling, you should be aware that
your ability to perform this configuration isn't very granular in any current
version of Exchange. For Exchange
Server 2003 and Exchange 2000
Server, journaling is enabled on a
per–message-store basis. All mailboxes in enabled message-store databases are journaled, and all journal
messages that mailboxes generate in
that database are sent to the same
journal mailbox (although you can
configure separate journal mailboxes
for each message-store database).
In Exchange 5.5, you can enable
journaling for an entire organization or on a per-site or per-server basis. Be
aware that Exchange will capture and
copy only email messages that are
transmitted. If someone edits a message in-mailbox, the change won't be
captured. I know of at least one lawsuit that involved lawyers being blindsided because they weren't aware that
email messages in their organization
had been changed to cover up evidence of wrongdoing. Opposing
counsel produced records of the original, unaltered messages. I'll review
the three types of journaling relative to
the goal of implementing a DCAR
solution.
Simple Journaling
Simple journaling has existed in
Exchange since Exchange 5.5 SP1.
When simple journaling is enabled,
the first Exchange categorizer to handle a given email message parses the
P2 header (the header information
contained within the actual message)
to determine whether the relevant
mailboxes are in databases with journaling enabled. For email messages
sent within the organization that use
Messaging API (MAPI), remote procedure call over HTTP Secure (RPC over
HTTPS), Microsoft Outlook Web
Access (OWA), or another form of
HTTP access, this server is the
sender's mailbox server. Otherwise, it's
the bridgehead server that receives the
message through SMTP or the
Exchange Message Transfer Agent
(MTA) service. Journal copies of the
email message are then sent to all
relevant journal mailboxes. You control simple journaling through the
Mailbox Store Properties dialog box,
as Figure 1 shows.
Let's look at an example. Imagine an
Exchange organization with four
mailbox servers, EXCH01 through
EXCH04. Each mailbox server has
two mailbox stores, one for regular
users and one for journaling. Each
regular mailbox store is configured to
deliver to the journal mailbox in the
journal mailbox store on the same server. In addition, an SMTP bridgehead server handles all incoming and
outgoing SMTP traffic.
An external email message comes
into the organization addressed to
four recipients: Adam, Barbara, Charlie, and Denise. By chance, these four
recipients are homed on separate
mailbox servers. In addition to forwarding the email message to the
actual recipient mailboxes, the
bridgehead forwards it to the four
journal mailboxes, requiring extra
bandwidth, disk I/O, and CPU in the
process. That kind of traffic multiplier
can cause a significant performance
hit in organizations with geographically dispersed servers linked by low-bandwidth WAN connections or
organizations whose servers are
already running close to their peak
performance.
You might wonder why extra
bandwidth is required, given that the
SMTP stack in Exchange 2003 and
Exchange 2000 is supposed to send
only one copy of a message between
servers even when there are multiple
recipients on the destination server. Because the
journal copy of the message has extra properties stamped on it during the bifurcation process, the journal copy
technically counts as a separate message. Be aware of this behavior when you design your solution.
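A rough way to estimate the multiplier for the example above is to count server-to-server transfers. The Python sketch below is a back-of-the-envelope model under stated assumptions: one transfer of the original per destination server, one extra transfer per journal mailbox, and one journal mailbox per mailbox server. The function name and data layout are hypothetical.

```python
def count_transfers(recipient_servers, journaling_enabled):
    """Count transfers from the bridgehead to the mailbox servers.

    recipient_servers maps each recipient to the server that hosts
    the recipient's mailbox.
    """
    destination_servers = set(recipient_servers.values())
    # The original travels once per destination server, however many
    # recipients that server hosts.
    original = len(destination_servers)
    # Each journal copy counts as a separate message, so it travels on
    # its own to the journal mailbox on each of those servers.
    journal = len(destination_servers) if journaling_enabled else 0
    return original + journal

recipients = {
    "Adam": "EXCH01",
    "Barbara": "EXCH02",
    "Charlie": "EXCH03",
    "Denise": "EXCH04",
}
without_journaling = count_transfers(recipients, journaling_enabled=False)
with_journaling = count_transfers(recipients, journaling_enabled=True)
```

Under these assumptions, journaling doubles the transfers for this message, from four to eight.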
Simple journaling has
some other limitations,
mainly because it uses
the P2 header information. Simple journaling
can't
- capture Bcc recipients. This limitation reduces
or eliminates journaling's usefulness because you
can't accurately track
every recipient of an
email message.
- capture the results of any address
rewriting you might have
configured in your organization.
- uniformly expand distribution list
(DL) membership. This limitation
could leave you with a journaled
email message that contains the
DL name instead of the list of
members. How do you go about
proving the list's membership at
the time the email message passed
through the system? What if the list
constantly has members added
and removed? And consider how
this limitation can affect a large
organization with a complicated
AD replication topology.
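The first and third limitations are easy to reproduce with any mail library, because the headers inside a message simply don't list every recipient. The Python sketch below uses the standard email package. The addresses, the DL name, and the envelope_recipients list (standing in for the recipients the transport actually delivered to) are made up for the illustration.

```python
from email.message import EmailMessage
from email.utils import getaddresses

def header_recipients(msg):
    """Return the addresses visible in the message's own headers."""
    fields = [str(v) for v in msg.get_all("To", []) + msg.get_all("Cc", [])]
    return sorted(addr for _, addr in getaddresses(fields))

msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "adam@example.com, staff-dl@example.com"
msg["Cc"] = "barbara@example.com"
msg.set_content("Quarterly numbers")

# Hypothetical delivery list: charlie was a Bcc recipient, and
# staff-dl expanded to denise and erin at delivery time.
envelope_recipients = [
    "adam@example.com",
    "barbara@example.com",
    "charlie@example.com",
    "denise@example.com",
    "erin@example.com",
]

seen_in_headers = header_recipients(msg)
missed = sorted(set(envelope_recipients) - set(seen_in_headers))
```

A journal copy built only from the headers records the DL name and misses the three recipients in missed.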
The Exchange 5.5 version of simple
journaling has an additional flaw: The
journal copy captures display names
rather than actual email addresses. In
essence, you can't prove that an email
message actually went to a particular
recipient; you can guarantee only that
the message was sent to a recipient
who had that specific display name
configured at that specific time.