Imagine a different way to program in which you specify rules and facts instead
of the usual linear set of instructions. That's the idea behind rule-based programming.
A rule engine automatically decides how to apply the rules to your facts, and
hands you the result. Rule-based systems are growing in popularity. Rule engines
are ubiquitous in the enterprise, and rule-based systems control everything
from web sites to factories.
The first step in developing any rule-based system is collecting the knowledge
the system will embody, and in this article, an excerpt from the new Manning
book Jess in Action, you'll learn how this is done.
The Tax Forms Advisor
Imagine that you're developing a simple rule-based application that recommends
United States income tax forms. The application asks the user a series of questions
and, based on the answers, tells the user which paper Internal Revenue Service
forms she will likely need. You will populate the application with enough data
to make it realistic, although you won't try to make it exhaustive. Your application
might be used in an information kiosk at the post office. In this article, we'll
use this application for our examples.
Introduction to Knowledge Engineering
Every rule-based system is concerned with some subset of all the world's collected
knowledge. This subset is called the domain of the system. The process of collecting
information about a domain for use in a rule-based system is called knowledge
engineering, and people who do this for a living are called knowledge engineers.
On small projects, the programmers themselves might do all the knowledge engineering,
whereas very large projects might include a team of dedicated knowledge engineers.
Professional knowledge engineers may have degrees in a range of disciplines:
obvious ones like computer science or psychology, and domain-related ones like
physics, chemistry, or mathematics. Obviously, it helps if the knowledge engineer
knows a lot about rule-based systems, although she doesn't have to be a programmer.
A good knowledge engineer has to be a jack of all trades, because knowledge
engineering is really just learning: the knowledge engineer must learn a lot
about the domain in which the proposed system will operate. A knowledge engineer
doesn't need to become an expert, although that sometimes happens. But the knowledge
engineer does have to learn something about the topic. In general, this information
will include:
The requirements: Looking at the problem the system needs to solve
is the first step. However, you might not fully understand the problem
until later in the process.
The principles: You need to learn the organizing principles of the
field.
The resources: Once you understand the principles, you need to know
where to go to learn more.
The frontiers: Every domain has its dark corners and dead ends. You
need to find out where the tough bits, ambiguities, and limits of human
understanding lie.
The knowledge engineer can use many potential sources of information to research
these points. Broadly, though, they fall into two categories: interviews and desk research.
In the rest of this section, we'll look at techniques for mining each of these
information sources to gather the four categories of information we just listed.
Where Do You Start?
When you're starting on a new knowledge engineering endeavor, it can be difficult
to decide what to do first. Knowledge engineering is an iterative process. You
usually can't make a road map in advance; instead you feel your way along, adjusting
your course as you go. As the saying goes, though, a journey of a thousand miles
begins with a single step, and taking that first step can be hard.
With most projects, you should first talk to the customers, the people who are
paying you to write the system. Find out what their needs are and what resources
they can make available. This isn't knowledge engineering per se, but requirements
engineering, a part of planning any software project. Even so, the customer might point
you to particular sources of technical information and help you plan your approach
to knowledge engineering. After talking to the customers, you should have a
rough idea of what the system should do and how long development is expected
to take.
Next, it's best to seek out general resources you can use to learn about the
fundamentals of the domain and do a bit of self-study. Being at least vaguely
familiar with the jargon and fundamental concepts in the domain will let you
avoid wasting the time of people you interview later. You should learn enough
about the fundamentals to have a rough idea of what kinds of knowledge the system
will need to have.
Once you've developed an understanding of the basics, you're ready to begin
the iterative process. Based on your initial research, write down a list of
questions about the domain, which, if answered, would provide knowledge in the
areas you previously identified. Seek out a cooperative subject-matter expert,
briefly explain the project to him, and ask him the questions. (Often the customer
will provide the expert; otherwise, the customer should pay the expert a consulting
fee to work with you.) Usually the answers will lead to more questions.
After the initial interview, you can try to organize the information you've
gathered into some kind of structure, perhaps a written outline or a flow chart.
As you do this, you can begin to look for what might turn out to be individual
rules. For the income tax forms advisor, an individual rule you might encounter
early in the process would be:
rule use-ez-form:
    IF
        filing status is "single" and
        user made less than $50000
    THEN
        recommend the user file Form 1040EZ
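To make the rule a little more concrete, here is a minimal Python sketch of how a rule like use-ez-form might be written down as data and checked against a user's answers. It is only an illustration under stated assumptions, not the way Jess or any particular rule engine represents rules: the names Rule, recommend, filing_status, and income are made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Facts are the user's answers, keyed by a short name (assumed naming).
Facts = Dict[str, object]


@dataclass(frozen=True)
class Rule:
    """One rule: a name, an IF part, and a THEN part (hypothetical structure)."""
    name: str
    condition: Callable[[Facts], bool]  # the IF part
    recommendation: str                 # the THEN part


def _is_single_under_50000(facts: Facts) -> bool:
    # A missing income is treated as "too high" so the rule does not fire
    # on incomplete answers (an assumption of this sketch).
    income = facts.get("income", float("inf"))
    return facts.get("filing_status") == "single" and income < 50000


use_ez_form = Rule(
    name="use-ez-form",
    condition=_is_single_under_50000,
    recommendation="Form 1040EZ",
)


def recommend(rules: List[Rule], facts: Facts) -> List[str]:
    """Return the recommendation of every rule whose condition holds."""
    return [rule.recommendation for rule in rules if rule.condition(facts)]


print(recommend([use_ez_form], {"filing_status": "single", "income": 32000}))
# prints ['Form 1040EZ']
```

Keeping the condition separate from the recommendation mirrors the IF/THEN split in the rule as written, so each rule you collect maps onto one Rule value.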
Buy a stack of white index cards and write each potential rule on one side
of an individual card. Use pencil so you can make changes easily. The cards
are useful because they let you group the rules according to function, required
inputs, or other criteria. When you have a stack of 100 cards or more, the utility
becomes obvious. You can use the reverse sides of the cards to record issues
regarding each rule. This stack of cards might be the final product of knowledge
engineering, or the cards' contents might be turned into a report. The cards
themselves are often the most useful format, though.
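If you would rather keep the cards in a file than on paper, the same idea can be sketched in a few lines of Python. This is an assumed representation, not a prescribed format: RuleCard and its fields, the "form selection" label, the sample issue, and the second placeholder card are all hypothetical. Only use-ez-form comes from the text above.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RuleCard:
    """An index card: the rule on the front, open issues on the back."""
    name: str
    text: str                          # front of the card
    function: str                      # hypothetical grouping label
    required_inputs: Tuple[str, ...]   # answers the rule needs
    issues: List[str] = field(default_factory=list)  # back of the card


def group_by_function(cards: List[RuleCard]) -> Dict[str, List[str]]:
    """Sort card names into piles by what the rule is for."""
    piles: Dict[str, List[str]] = defaultdict(list)
    for card in cards:
        piles[card.function].append(card.name)
    return dict(piles)


def group_by_input(cards: List[RuleCard]) -> Dict[str, List[str]]:
    """Sort card names into piles by each input the rule requires."""
    piles: Dict[str, List[str]] = defaultdict(list)
    for card in cards:
        for input_name in card.required_inputs:
            piles[input_name].append(card.name)
    return dict(piles)


cards = [
    RuleCard(
        name="use-ez-form",
        text='IF filing status is "single" and user made less than $50000 '
             "THEN recommend the user file Form 1040EZ",
        function="form selection",
        required_inputs=("filing status", "income"),
        issues=["hypothetical note: ask the expert to confirm this rule"],
    ),
    # Placeholder card, included only so the grouping has two cards to sort.
    RuleCard(
        name="placeholder-rule",
        text="placeholder text for a second rule",
        function="form selection",
        required_inputs=("filing status",),
    ),
]

print(group_by_input(cards))
```

Grouping by required input also gives you a quick list of the answers the system will have to collect from the user.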
After organizing the new knowledge on index cards, you may see obvious gaps
that require additional information. Develop a new set of interview questions
and meet with the expert again. The appropriate number of iterations depends
on the complexity of the system.
Knowledge engineering doesn't necessarily end when development begins. After
an initial version of a system is available, the expert should try it out as
a user and offer advice to correct its performance. If possible, a prototype
of the system should be presented to the expert at every interview, except perhaps
the first one.
Likewise, development needn't be deferred until knowledge engineering is complete.
For many small projects, the knowledge engineer is one of the developers, and
in this case you may be able to dispense with the cards and simply encode the
knowledge you collect directly into a prototype system.
Interviews
People are the best source of information about the requirements for a system.
Many projects have requirements documents: written descriptions of how a proposed
system should behave. Despite the best intentions, such documents rarely capture
the expectations for a system in enough detail to allow the system to be implemented.
Often, you can get the missing details only by talking to stakeholders: the
customers and potential users of the system.
People can also direct you to books, web sites, and other people who will help
you learn about the problem domain. These days it's common to suffer from information
overload when you try to research a topic: there are so many conflicting resources
available that it's hard to know what information to believe. The stakeholders
in the system can tell you which resources they trust and which ones they don't.
If you find conflicting information among otherwise trustworthy references
during your research, or hear conflicting statements during interviews, don't
be afraid to ask for clarification. You'll need a strategy for resolving conflicts
that hinge on matters of opinion. Sometimes you can do this by picking a specific
person as the ultimate
arbiter. Other times, especially on larger projects, it's appropriate to hold
meetings to get the stakeholders to make decisions in a group setting.