CAN I ANALYZE THE DIALOGUES COLLECTED BY THE APPLET
If the web server produces an access_log file, such
as /var/log/httpd/access_log, then the server records
Applet dialogue in the access_log file. You may use
ftp to download the access_log file to your machine;
then run program B to analyze it.
Go to the Options menu and find the value for "AnalysisFile".
The Classify function operates on the data in the AnalysisFile.
By default the AnalysisFile is the same as the LogFile (the
current server log file). But you can change the analysis
file to another name, such as /var/log/httpd/access_log or
just access_log.
CAN I BUILD ON TOP OF THE ALICE CODE RATHER THAN CHANGING IT
Absolutely. You only have to change her name, location, birthday and/or
botmaster, and include a couple of references to yourself. Then add new
categories that cover your own area of expertise or interest.
CAN I CHANGE THE NAME OF THE ROBOT
The AIML tag <name/> inserts the name of the Bot wherever it appears.
The default robot name is "ALICE" but you can change it in the
Options menu. Select "Show Options" and replace "ALICE" with the
name of your bot, and then do "Save Options". Depending on your
state, you may need to restart program B.
CAN I CREATE A LANGUAGE SPECIFIC INSTALLATION
Yes. The file "language.txt" controls the language of the
buttons and menus in the ALICE GUI. If the file is missing,
the program uses English names by default. To see an
example of a language-specific installation, copy the
file "Germanlanguage.txt" to "language.txt" and start
program B.
CAN I CREATE MORE AIML TAGS
AIML is extensible. You can create an infinite number of
new tags for foreign language pronouns, predicates, or
application-specific properties. The file "predicates.txt"
defines any new predicate tags. "Predicate tags" mean
tags that have a client-specific "set" and "get" method.
Pronouns like "it" and "he" have predicate tags like
<set_it></set_it> and <get_he/>. AIML has a number of
these built-in tags for common English pronouns.
There are two varieties of extensible predicate tags.
The first example illustrates the use of new tags
for foreign language pronouns. The Japanese language
pronoun "kare" means "he". In predicates.txt, we
can add a line of the form:
kare=dare
This single line automatically generates the tags
<set_kare> X </set_kare> to set the value of "kare"
to X, and the tag <get_kare/> to retrieve the value.
By default, <get_kare/> returns "dare" ("who?").
Now we can create two AIML categories for an elementary
Japanese conversation:
<category>
<pattern>KARE WA * DESU</pattern>
<template><star/> wa <set_kare><star/></set_kare> desu.</template>
</category>
<category>
<pattern>KARE WA * DESU KA</pattern>
<template><get_kare/> desu ka? </template>
</category>
These two categories produce a coherent conversation
about Mr. Arimoto:
Client: KARE WA DARE DESU KA
Robot: dare desu ka?
Client: KARE WA ARIMOTO SAN DESU
Robot: arimoto san wa kare desu.
Client: KARE WA DARE DESU KA
Robot: arimoto san desu ka?
Notice that the program replaces <set_kare><star/></set_kare>
with "kare", the predicate name, which is what a set tag returns by default. Sometimes however
the default value of the tag is not the best choice for
the set tag to return. The second type of predicate tag
replaces the set tag markup with the original string
inside the markup.
To create a predicate tag "has" we add the following line
to predicates.txt:
has=*mother
The default value of <get_has/> is "mother" (because
everyone has a mother). The "*" here indicates that
the program should replace <set_has> X </set_has>
with X.
Here we write two AIML categories to demonstrate the
"has" predicate:
<category>
<pattern>I HAVE A *</pattern>
<template>Where did you get
<set_it><set_has><person/></set_has></set_it>?
</template>
</category>
<category>
<pattern>WHAT DO I HAVE</pattern>
<template>Tell me more about your <get_has/>. </template>
</category>
Notice that the first category embeds <set_has> in <set_it>,
so that the value of <person/> becomes both "it" and "has".
The resulting dialogue:
Client: WHAT DO I HAVE
Robot: Tell me more about your mother.
Client: I HAVE A BOX
Robot: Where did you get it?
Client: WHAT DO I HAVE
Robot: Tell me more about your box.
Look at the file "predicates.txt" for several more
examples of custom AIML tags, including the German
<set_sie>, <set_er> and <set_es>.
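The two varieties of predicate tag can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the code in program B: the names parse_predicates and ClientPredicates are made up for the example, and the only file format assumed is the "name=default" and "name=*default" lines shown earlier.

```python
def parse_predicates(lines):
    """Parse 'name=default' lines into {name: (default, echo_value)}.

    A leading '*' on the default marks the second variety of predicate,
    whose set tag is replaced by the stored text instead of the tag name.
    """
    table = {}
    for raw in lines:
        line = raw.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        name, default = line.split("=", 1)
        echo_value = default.startswith("*")
        if echo_value:
            default = default[1:]
        table[name.strip()] = (default.strip(), echo_value)
    return table


class ClientPredicates:
    """Hypothetical per-client store with the set/get behavior described above."""

    def __init__(self, table):
        self.table = table
        self.values = {}

    def get(self, name):
        default, _ = self.table[name]
        return self.values.get(name, default)

    def set(self, name, value):
        _, echo_value = self.table[name]
        self.values[name] = value
        # First variety returns the predicate name; second returns the value.
        return value if echo_value else name


table = parse_predicates(["kare=dare", "has=*mother"])
client = ClientPredicates(table)
print(client.get("kare"))                 # dare
print(client.set("kare", "arimoto san"))  # kare
print(client.get("kare"))                 # arimoto san
print(client.set("has", "box"))           # box
print(client.get("has"))                  # box
```

The printed values follow the two dialogues above: setting "kare" echoes the predicate name, while setting "has" echoes the stored text.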
CAN I HAVE A PRIVATE CONVERSATION WITH ALICE
The ALICE server logs and records all conversations. Even the ALICE
Applet tries to transmit conversation logs back to the originating server.
You can have a private conversation with ALICE, however, if you download
Program B to your own computer and run it there. Running on your machine,
the server stores all the conversations locally.
CAN I INCLUDE JAVASCRIPT IN THE ROBOT REPLY
Yes. You can include any HTML including <script> tags. Suppose you
want to "chat AND browse," in other words, have the robot open
up a new browser window when she provides a URL link. Here's a category that
kicks out a piece of HTML/scripting that opens a new window and loads a
given URL. This is handy for search engines or showing off one's web page.
<category>
<pattern> WHERE IS YOUR WEB SITE </pattern>
<template>
It's at "http://www.geocities.com/krisdrent/"
<script language="JavaScript">
// Go to <a href="http://www.geocities.com/krisdrent">The ALICE Connection</a>
<!--
window.open("http://www.geocities.com/krisdrent/")
-->
</script>
</template>
</category>
A few things to note about this technique:
1. This only works when the client talks to ALICE from a browser that runs
JavaScript, i.e. it won't work in the applet. We have tested it in Netscape
and MS Internet Explorer, and it works well in both.
2. For that reason, it is important to put some sort of explanatory statement
before the scripting in case the scripting isn't supported. Besides, you want
some response in your ALICE window, even if another window DOES come up.
3. If the reply is viewed in a browser that doesn't understand the <script>
tag, notice that this line will show up:
"// Go to <a href="http://www.geocities.com/krisdrent">The ALICE Connection</a>"
That is good, because it gives the "non-scripted" users (the Lynx users, I
guess) a back-up link. And remember that you have to keep the "//" in front
of any non-JavaScript lines within the <script> tag.
CAN I INSERT DYNAMIC HTML INTO THE ROBOT REPLY
If you are fortunate enough to be running lynx under Linux, the
following markup is a simple way to "inline" the results of an HTTP
request into the chat robot reply. Try asking ALICE:
"What chatterbots do you know?" and she will reply with a page
of links generated by the Google search engine.
<category>
<pattern>WHAT *</pattern>
<template>
Here is the information I found:
<system>
lynx -dump -source -image_links http://www.google.com/search?q=<personf/>
</system>
</template>
</category>
CAN I RUN PROGRAM B IN THE BACKGROUND ON AN NT SERVER
Yes. Set up your PC / Server to run Alice B as you normally would. (Download
the Java Developers Kit, etc.)
Create a batch file in folder B containing only this text:
jview Bterm
Create a task in the Task Schedule Wizard to run the batch file. (Ensure the
task starts in drive:\path\B.)
Give the Task Schedule an appropriate logon and password for the Server or
PC.
Right-click, select Run now, and log on and off as you like.
CAN I RUN SHELL COMMANDS FROM AIML SCRIPTS
Yes. Use the <system>X</system> tag to run the shell command X.
The command X is assumed to produce its output in line-oriented
format suitable for a BufferedReader to read line by line.
A simple example of this command in an AIML script is:
<category>
<pattern>WHAT TIME IS IT</pattern>
<template>The local time is: <system>date</system></template>
</category>
The "date" command is a system command that generates a text
string containing the date and time. (Note that this might
not work on Windows).
Take extreme care in using the <system> tag because it
potentially permits remote clients to run a command on
your system.
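One way to limit that risk is to run only commands the botmaster has approved in advance. The Python sketch below illustrates the idea; it is not how program B implements <system>, and the function name run_system and the allowlist are assumptions made for the example. It runs a fixed argument list (never text typed by the client) and reads the output line by line.

```python
import subprocess
import sys


def run_system(name, allowed):
    """Run an allowlisted command; return its output lines joined by spaces.

    `allowed` maps a short name to the argument list to execute. Anything
    not in the map is refused, so client text never reaches the shell.
    """
    if name not in allowed:
        return ""
    result = subprocess.run(
        allowed[name], capture_output=True, text=True, timeout=5, check=False
    )
    # Read the output line by line, as the <system> tag expects.
    return " ".join(line.strip() for line in result.stdout.splitlines())


# Hypothetical allowlist; the Python one-liner stands in for a command
# such as "date" so that the demo also runs on Windows.
ALLOWED = {"hello": [sys.executable, "-c", "print('hello')"]}
print("The reply is: " + run_system("hello", ALLOWED))
print(repr(run_system("rm -rf /", ALLOWED)))
```

The second call returns an empty string because the requested name is not in the allowlist.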
CAN I RUN THE WEB SERVER AS A DAEMON PROCESS
Yes. There is a class file called Bterm.java in the
program B distribution. Bterm runs the web server
as a console application, with no GUI. You can
redirect the output of program Bterm to a log file
and start the process in the background with
"java Bterm > B.log &" (assuming a Unix shell).
CAN I SPEAK TO THE ROBOT WITH VOICE INPUT
One simple experiment that works well as a demo
involves using IBM ViaVoice (tm) speech recognition
software on a Windows platform. At the same time,
run the ALICE program B web server and activate the
MS Agent interface. The ViaVoice software allows
you to dictate into an application called VoicePad,
but not directly into the browser. You have to
use "cut" and "paste" to move your speech inputs
into the browser form for ALICE. But the net effect
is a somewhat slow voice-in voice-out conversation
with ALICE.
The ViaVoice software seems to work well with ALICE
after some training. We trained it with the file
"patterns.txt" created with the "List Patterns" command.
CAN I TEST THE ROBOT OFFLINE ON MY DESKTOP
Yes. You can run the program B server and connect to it with
a browser, even if your desktop computer is offline.
When working offline, it often helps to change the Internet
settings (in IE or Netscape) to "local area network".
Then your machine becomes a one-computer network. You should
be able to use IE to connect to program B with http://localhost:2001.
CAN PROBABILITY STATISTICS WEIGHTS NEURAL NETWORKS OR FUZZY LOGIC IMPROVE BOTS
Statistics are in fact heavily used in the ALICE server, but not in the way
you might think. ALICE uses 'Zipf Analysis' to plot the rank-frequency of
the activated categories and to reveal inputs from the log file that don't
already have specific replies, so the botmaster can focus on answering
questions people actually ask (the "Quick Targets" function).
Other bot languages, notably the one used for Julia, make heavy use of
"fuzzy" or "weighted" rules. We see their problem as this: the botmaster
already has enough to worry about without having to make up "magic
numbers" for every rule. Once you get up to 10,000 categories (like ALICE)
you don't want to think about more parameters than necessary. Bot
languages with fuzzy matching rules tend to have scaling problems.
Finally, the bot replies are not as deterministic as you might think, even
without weights. Some answers rely on <random> to select one of several
possible replies. Other replies generated by unforeseen user input also
create "spontaneous" outputs that the botmaster doesn't anticipate.
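The rank-frequency tally behind a Zipf plot is simple to sketch. The Python below is an illustration only, not the ALICE implementation; the function name rank_frequency and the sample data are hypothetical. It counts how often each pattern was activated and lists the patterns from most to least frequent.

```python
from collections import Counter


def rank_frequency(activated_patterns):
    """Return (rank, pattern, count) tuples, most frequent first."""
    counts = Counter(activated_patterns)
    # Sort by descending count, breaking ties alphabetically by pattern.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, pattern, count)
            for rank, (pattern, count) in enumerate(ranked, start=1)]


# Hypothetical sample of which pattern matched each logged input.
sample = ["*", "HELLO", "*", "WHAT IS *", "*", "HELLO"]
for rank, pattern, count in rank_frequency(sample):
    print(rank, pattern, count)
```

In a tally like this, a high count for a very general pattern such as "*" points at inputs that do not yet have specific replies.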
CAN THE APPLET RECORD A DIALOG TXT FILE ON THE SERVER
No, because the applet cannot write the file directly on the originating host.
If your server log file /var/log/httpd/access_log is too large, you
have a couple of choices:
1. If your ISP provides a Unix shell account, use telnet to log on to it.
Use the command "grep Blog < access_log > dialog.txt" to create a smaller
file to download which contains just the lines recorded by the applet.
2. Create a CGI-BIN command called "/cgi-bin/Blog" that reads its
command-line argument and appends it to a file called "dialog.txt".
There ought to be a nice Perl script for this, or even a shell script.
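If you have no shell account, the same filtering can be done on your own machine after downloading the log. Here is a small Python sketch that does what "grep Blog" does; the function names and the sample lines are made up for the example, and real access_log lines will look different.

```python
def extract_blog_lines(log_lines, marker="Blog"):
    """Keep only the lines that mention the marker, like 'grep Blog'."""
    return [line for line in log_lines if marker in line]


def write_dialog(access_log_path, dialog_path, marker="Blog"):
    """Copy the matching lines of the access log into a smaller file."""
    with open(access_log_path, encoding="utf-8", errors="replace") as src, \
         open(dialog_path, "w", encoding="utf-8") as dst:
        dst.writelines(extract_blog_lines(src, marker))


# Hypothetical log lines, only to show the filtering.
sample = [
    "hypothetical line mentioning /cgi-bin/Blog and some dialogue\n",
    "hypothetical line for /index.html\n",
]
print(extract_blog_lines(sample))
```

Calling write_dialog("access_log", "dialog.txt") would then produce the smaller file described in choice 1.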
CAN THE APPLETHOST USE A SYMBOLIC DNS NAME INSTEAD OF AN IP NUMBER
The answer is yes, but the numeric IP address works on more machines
than a symbolic name. Applets are protected by a "security sandbox"
from interfering with local resources on your machine. One restriction
is that Applets may only open socket connections to the originating
host. When using a symbolic DNS name, the "sandbox" may not know that
two variations such as "Www.AliceBot.Org" and "alicebot.org" are
in fact the same server. The client might not be able to resolve
the DNS name, and the Applet will throw a security exception.
CAN THE VIRTUAL IP BE THE REAL IP
Actually that would be the default case, when the client chats from
the same fixed IP address. The only time the virtual ip differs from
the real one is when the client is behind a dynamic firewall, like
WebTV or AOL customers.
CAN YOU GIVE ME A QUICK PRIMER ON AIML
Given only the <pattern> and <template> tags, there are three
general types of categories: (a) atomic, (b) default, and (c) recursive.
Strictly speaking, the three types overlap, because "atomic"
and "default" refer to the <pattern> and "recursive" refers to
a property of the <template>.
a). "Atomic" categories are those with atomic patterns, i.e. the pattern
contains no wild card "*" or "_" symbol. Atomic categories are the
easiest, simplest categories to add in AIML.
<category>
<pattern>WHAT IS A CIRCLE</pattern>
<template><set_it>A circle</set_it> is the set of points equidistant
from a common point called the center.
</template>
</category>
b). The name "default category" derives from the fact that its pattern
has a wildcard "*" or "_". The ultimate default category is the
one with <pattern>*</pattern>, which matches any input. In the
ALICE distribution the ultimate default category resides in a file
called "Pickup.aiml". These default responses are often called
"pickup lines" because they generally consist of leading questions
designed to focus the client on known topics.
The more common default categories have patterns combining a few
words and a wild card. For example the category:
<category>
<pattern>I NEED HELP *</pattern>
<template>Can you ask for help in the form of a question?</template>
</category>
responds to a variety of inputs from "I need help debugging my program"
to "I need help with my marriage." Putting aside the philosophical
question of whether the robot really "understands" these inputs,
this category elicits a coherent response from the client,
who at least has the impression of the robot understanding the
client's intention.
Default categories show that writing AIML is both an art and a
science. Writing good AIML responses is more like writing good
literature, perhaps drama, than like writing computer programs.
c). "Recursive" categories are those that "map" inputs to other
inputs, either to simplify the language or to identify synonymous
patterns.
Many synonymous inputs have the same response. This is accomplished
with the recursive <srai> tag. Take for example the input "GOODBYE".
This input has dozens of synonyms: "BYE", "BYE BYE", "CYA", "GOOD BYE",
and so on. To map these inputs to the same output for GOODBYE we
use categories like:
<category>
<pattern>BYE BYE</pattern>
<template><srai>GOODBYE</srai></template>
</category>
Simplification or reduction of complex input patterns is another
common application for recursive categories. In English the
question "What is X" could be asked many different ways:
"Do you know what X is?", "Tell me about X", "Describe X",
"What can you tell me about X?", and "X is what?" are just a few
examples. Usually we try to store knowledge in the most concise,
or common form. The <srai> function maps all these forms to
the base form:
<category>
<pattern>DO YOU KNOW WHAT * IS</pattern>
<template><srai>WHAT IS <star/></srai></template>
</category>
The <star/> tag substitutes the value matched by "*", before
the recursive call to <srai>. This category transforms
"Do you know what a circle is?" to "WHAT IS A CIRCLE",
and then finds the best match for the transformed input.
Another fairly common application of recursive categories is
what might be called "parsing", except that AIML doesn't really
parse natural language. A better term might be "partitioning" because
these AIML categories break down an input into two (or more) parts,
and then combine their responses back together.
If a sentence begins with "Hello..." it doesn't matter what comes
after the first word, in the sense that the robot can respond to
"Hello" and whatever is after "..." independently. "Hello my name
is Carl" and "Hello how are you" are quite different, but they show
how the input can be broken into two parts.
The category:
<category>
<pattern>HELLO *</pattern>
<template><srai>HELLO</srai> <sr/>
</template>
</category>
accomplishes the input partitioning by responding to "HELLO"
with <srai>HELLO</srai> and to whatever matches "*" with <sr/>.
The response is the result of the two partial responses
appended together.
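To see the three types of category working together, here is a toy matcher in Python. It is a sketch under several assumptions and not the matching algorithm of program B: templates are plain Python functions instead of AIML markup, a pattern may contain at most one "*", and the matching order (atomic patterns first, then the wildcard pattern with the most literal words) is a simplification. The replies "Hi there!" and "Tell me more." are invented for the example.

```python
import re


def normalize(text):
    """Uppercase the input and drop punctuation, keeping words and spaces."""
    return " ".join(re.sub(r"[^A-Za-z0-9 ]", " ", text).upper().split())


def match(pattern, text):
    """Return the words bound to '*' ('' for atomic patterns), or None."""
    if "*" not in pattern:
        return "" if pattern == text else None
    prefix, suffix = pattern.split("*", 1)
    p_words, s_words, words = prefix.split(), suffix.split(), text.split()
    if len(words) < len(p_words) + len(s_words) + 1:
        return None  # the wildcard must match at least one word
    if words[:len(p_words)] != p_words:
        return None
    if s_words and words[-len(s_words):] != s_words:
        return None
    return " ".join(words[len(p_words):len(words) - len(s_words)])


class Bot:
    def __init__(self, categories):
        # Atomic patterns first, then wildcard patterns with the most words.
        self.ordered = sorted(
            categories.items(),
            key=lambda item: ("*" in item[0], -len(item[0].split())),
        )

    def respond(self, text):
        text = normalize(text)
        for pattern, template in self.ordered:
            star = match(pattern, text)
            if star is not None:
                return template(self, star)
        return ""


CATEGORIES = {
    # (a) atomic
    "WHAT IS A CIRCLE": lambda bot, star: (
        "A circle is the set of points equidistant "
        "from a common point called the center."),
    "HELLO": lambda bot, star: "Hi there!",
    "GOODBYE": lambda bot, star: "See you later.",
    # (b) default
    "*": lambda bot, star: "Tell me more.",
    # (c) recursive: the calls to bot.respond play the role of <srai> and <sr/>
    "BYE BYE": lambda bot, star: bot.respond("GOODBYE"),
    "DO YOU KNOW WHAT * IS": lambda bot, star: bot.respond("WHAT IS " + star),
    "HELLO *": lambda bot, star: bot.respond("HELLO") + " " + bot.respond(star),
}

bot = Bot(CATEGORIES)
print(bot.respond("Do you know what a circle is?"))
print(bot.respond("Hello how are you"))
print(bot.respond("bye bye"))
```

The second input shows the partitioning idea: the reply to "HELLO" and the reply to "HOW ARE YOU" are produced separately and appended together.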
CAN YOU GIVE ME ANY HELP DEBUGGING THE APPLET
Debugging applets can be tricky. The same suggestion
to set IE for "local area network" might help here too.
Also the browser caches class files, so it's difficult to
know if you are testing a "fresh" copy of the applet. The
program "appletviewer" that comes with Sun Java is better
for debugging applets. Use "appletviewer index.html".
The best thing to do is join the alicebot mailing list
at alicebot.listbot.com.
CAN YOU HELP ME DEBUG THE ANIMATED AGENT
Look at the class Animagent.java. The method vbscript_html(reply)
does nothing unless the global Animagent member is true. In that case,
the vbscript_html() method constructs a string from the reply that
includes an MS Agent VBScript embedded in the HTML reply.
This makes the browser load up the objects required for the agent.
The text reply just becomes part of the VBScript.
You may have to download and run the Robby the Robot
agent software and the text-to-speech synthesis software from
the MSDN homepage:
http://msdn.microsoft.com/workshop/imedia/agent
We wish other companies were producing agent animation APIs
for free but this MS Agent seems to be about the only
thing out there now.
Join the ALICE and AIML mailing list at alicebot.listbot.com
to see how others are working with the animated agent software.
COULD YOU EXPLAIN THE LT SRAI GT TAG A LITTLE MORE
The most common application of <srai> is "symbolic reduction"
of a complex sentence form to a simpler one:
<category>
<pattern>DO YOU KNOW WHAT * IS</pattern>
<template><srai>WHAT IS <star/></srai></template>
</category>
so the botmaster can store most knowledge in the simplest
categories:
<category>
<pattern>WHAT IS LINUX</pattern>
<template><set_it>Linux</set_it> is the best operating system.</template>
</category>
With all the "symbolic reduction" categories, the robot gives
the same answer for:
"What is Linux?"
"Do you know what Linux is?"
"Define Linux"
"Alice please tell me what Linux is right now"
Sometimes the response consists of two symbolic reductions together:
<category>
<pattern>YES *</pattern>
<template><srai>YES</srai> <sr/></template>
</category>
With this category the robot will reply to all
"Yes something" inputs by combining the
reply to "Yes" with the reply to "something".
Remember, <sr/> is an abbreviation for <srai><star/></srai>.
The <srai> tag is also the answer to the question: Can I have more
than one pattern in the same category? Suppose you want the
same answer for two different patterns. You might think of
writing something like this:
<category>
<pattern>BYE</pattern>
<pattern>GOODBYE</pattern>
<template>See you later.</template>
</category>
Right now you can't put two patterns in one category, but with <srai>
you can get the same effect:
<category>
<pattern>GOODBYE</pattern> <template><srai>BYE</srai></template>
</category>
<category>
<pattern>BYE</pattern> <template>See you later.</template>
</category>
If you look through the AIML files you will see many examples
of <srai> mapping multiple patterns to the same reply.
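If you have a long list of synonyms, you can generate the <srai> categories instead of typing them. The Python sketch below is only one way to do it; the function name synonym_categories is made up, and it assumes the patterns and the reply are already valid AIML text (nothing is escaped).

```python
def synonym_categories(canonical, reply, synonyms):
    """Build AIML text: one category with the reply, one <srai> per synonym."""
    blocks = [
        "<category>\n"
        f"<pattern>{canonical}</pattern> <template>{reply}</template>\n"
        "</category>"
    ]
    for synonym in synonyms:
        blocks.append(
            "<category>\n"
            f"<pattern>{synonym}</pattern> "
            f"<template><srai>{canonical}</srai></template>\n"
            "</category>"
        )
    return "\n".join(blocks)


print(synonym_categories("BYE", "See you later.", ["GOODBYE", "BYE BYE"]))
```

The output is plain AIML that you can paste into any AIML file.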
DO I HAVE TO USE THE GUI TO ENTER AIML CONTENT
No. You can create a new AIML file with any text editor
and add that content to an existing robot with the <load> tag.
Also, you can edit AIML categories in any text file and use
"Load Text File" and "Add Aiml" to add the content.
You can also save the output of "Targets" to a file,
edit that file, and then reload and "Add Aiml".
Finally, you can edit the robot source files directly.
(By default the robot source file is called "B.aiml").
Use a text editor, like emacs, notepad, or a word processor
in text mode, to modify the content of the AIML files.
DO I NEED TO KNOW ABOUT THE JAVA CLASSES
No, not unless you plan to do software development on
the program B Java code. If you are an open source
contributor to the ALICE project, a researcher developing
new AI software, or you are trying to link your own
code to the ALICE package, then this section is for you.
Otherwise, you probably don't need to know much about the
Java classes in program B.
DO THE CATEGORIES NEED TO BE IN ALPHABETICAL ORDER BY PATTERN
No, the alphabetical order is maintained internally when the
categories load, but you can write them in any order. When you do
"Save Robot" the file may or may not be stored alphabetically.
DOES ALICE LEARN
The model of learning in ALICE is called "supervised training",
because a teacher, the botmaster, always plays a crucial role.
The alternative, "unsupervised training", is complicated in
an open environment like the Web. The problem is that clients
are untrustworthy teachers, and forever try to "fool" the robot
with untrue assertions.
DOES ALICE THINK
It depends on what you mean by "thinking". The most fascinating
responses from ALICE arise when she says something
unexpected, or puts together responses in ways the botmaster
never intended. For example:
Client: I bet you are gay.
ALICE: Actually I am not the gambling type. Actually as a machine
I have no need for sex.
Here the robot linked two different categories which both coincidentally
have a moral theme (gambling and sexuality). But this specific combination was
not "preprogrammed" by the botmaster.
Are these surprising responses just unintended coincidences, or do they
indicate that ALICE is thinking? Is ALICE just a gigantic stimulus-response
mechanism, or are we?
DOES PROGRAM B RUN ON A MAC
Yes. Download the B.zip file and save it in a new folder, called
for example "Alice Program-B".
Instead of the "winzip" or "unzip" utility use "Aladdin StuffIt Expander."
The newer version will unzip most Mac formats as well as .ZIP files. You can
download this at "www.download.com" by searching for it by name. You can
also select the option that allows it to search only for Mac programs.
Download that and install it; it should do the trick.
Apple makes its own Java Runtime Environment for the Mac called
MRJ 2.2. You can download it from http://www.apple.com/java.
To compile the Java code for Alice on a Mac:
Download the current zip file for Alice's Program-B from the Alice site.
Unzip Program-B and keep it in a folder called "B" on your startup drive and
not on the desktop.
Download MRJ SDK 2.2 for Java from the Apple site.
Unstuff MRJ SDK 2.2 and put resulting files into a folder called "MRJSDK".
Open :MRJSDK:Tools:Application Builders:JBindery and find the icon for the
JBindery application.
Open the folder "B" and drag the icon "B.class" out of the folder onto the
JBindery icon.
JBindery will display a dialog screen showing the class name "B". Click the
"Save Settings" button.
After clicking the "Save Settings" button, JBindery will display a dialog box
for saving the new application file. Name the file "A.L.I.C.E." or anything
you wish.
Be sure the "Save As Application" box is checked and the folder to save in is
the "B" folder.
Click the "Save" button to save the application.
DOES PROGRAM B RUN UNDER LINUX
Yes. You need the JRE, which often comes bundled with Linux
(e.g. the Kaffe JRE with Red Hat Linux) or you can download one
from java.sun.com. You also need X-windows to run the GUI.
Open a shell under X windows and use the command "java B".
We also recommend the IBM release of their Java 1.1.8 Java Development
Kit (JDK) and JRE for Linux. It is solid, efficient and very fast.
You can download it free at:
http://www.ibm.com/java/jdk/118/linux/index.html
DOES PROGRAM B RUN UNDER WINDOWS
Yes. You need the Java Runtime Environment (JRE) so you can run the
"java" command from the DOS prompt. Try opening a DOS window
and typing "java".
Microsoft often includes a JRE called "jview" rather than
"java". Try opening a DOS window and typing "jview". On Windows 98
the JRE is usually located in c:\windows\jview.exe.
DOES PROGRAM B RUN UNDER XYZ
Yes, if XYZ has a Java Runtime Environment 1.1.7 or higher.
DOES PROGRAM B SERVE HTML FILES
Yes. Program B is a "faux" web server that can serve a number of file
types just like an ordinary server. Certain file names such as
"HOME.html", "header.html", and "trailer.html" are reserved by
program B, but you can create new HTML files and serve them with B.
Although program B can also serve image files and other large binary
files, we recommend creating chat robot web pages with links to images
served by other web servers or machines. Reserve your chat robot server
for the robot chat; use ordinary web servers for images and other large
files.
DOES THE APPLET RECORD DIALOGUES
The applet tries to log conversations on the originating server,
using a cgi-bin script called "Blog". If Blog exists then
it records the dialogues in a file called "dialog.txt" (or
another name chosen on the Options menu).
The cgi-script need not actually exist, because the server
records the cgi-commands as errors in the access log.
The applet opens a URL connection to its host, and
sends a log string that looks like an HTTP request, but the HTTP
server will log it as an error (with code 404). Later on you can
download the access_log and analyze it with program B.
See the code in Classifier.java for the method log(x) that
implements the URL connection.
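The idea of packing a line of dialogue into something that looks like an HTTP request can be sketched in Python. The FAQ does not spell out the exact format of the applet's log string, so the layout used here (the dialogue URL-encoded after "/cgi-bin/Blog?") is an assumption for illustration, as are the function names.

```python
from urllib.parse import quote, unquote

# Script name from the text; the query layout after "?" is assumed.
BLOG_PATH = "/cgi-bin/Blog"


def encode_log_request(dialogue_line):
    """Build the request path the applet might send for one line of dialogue."""
    return BLOG_PATH + "?" + quote(dialogue_line, safe="")


def decode_log_request(request_path):
    """Recover the dialogue line from a logged request path, or None."""
    prefix = BLOG_PATH + "?"
    if not request_path.startswith(prefix):
        return None
    return unquote(request_path[len(prefix):])


path = encode_log_request("Client: HELLO")
print(path)                      # /cgi-bin/Blog?Client%3A%20HELLO
print(decode_log_request(path))  # Client: HELLO
```

Because the request path ends up in the access log whether or not the script exists, decoding it later is enough to rebuild the dialogue.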
DOES THE WEB SERVER HAVE TO RUN ON PORT 2001
You can change the default web server port number in the Options menu.
FAQ
- - What is the goal for AIML?
- - Who is the botmaster?
- - How can I create my own chat robot?
- - How difficult is it to create a chat robot?
- - Does ALICE learn?
- - Does ALICE think?
- - What is the theory behind ALICE?
- - Can probability (statistics, weights, neural networks, or fuzzy logic) improve bots?
- - Can I have a private conversation with ALICE?
- - How do I install ALICE?
- - What is the difference between B and C?
- - How do I download program B?
- - How do I run program B?
- - What does "Send" do?
- - What does "Clear" do?
- - What is program Bawt?
- - Does program B run under Windows?
- - Does program B run on a Mac?
- - Does program B run under Linux?
- - Does program B run under XYZ?
- - How much memory do I need to run program B?
- - How do I install ALICE on Windows?
- - What do you mean by the command "java B"?
- - I tried running "java B" and I got a "bad command or file name".
- - How do I uninstall ALICE from my system?
- - Can I create a language-specific installation?
- - How does the Personality Wizard work?
- - Can I change the name of the robot?
- - How can I customize my robot?
- - How do I know what categories to add?
- - What does "Classify" do?
- - What does "Quick Targets" do?
- - What does "More Targets" do?
- - What does the File menu do?
- - What does the Edit menu do?
- - What does the Options menu do?
- - What is the Botmaster menu?
- - What does "Help" do?
- - What is on the Help menu?
- - Do I have to use the GUI to enter AIML content?
- - What are 7 steps to creating content?
- - How can I merge two chat robots together?
- - What if I don't want to discard duplicate categories?
- - How can I create a new robot personality?
- - What are all the options for program B?
- - Why is the format of the options (globals.txt) so strange?
- - How does the web server work?
- - How can I get a "permanent" DNS name?
- - How can I keep my computer connected all the time?
- - Does the web server have to run on port 2001?
- - Does program B serve HTML files?
- - What files are needed to run the program B web server?
- - Can I test the robot offline on my desktop?
- - Can I run program B in the background on an NT Server?
- - How can I run ALICE on a Mac offline?
- - How can I run the ALICE web server on a Mac?
- - How can I use the MS Agent Interface?
- - Can you help me debug the animated agent?
- - Can I speak to the robot with voice input?
- - How does ALICE keep track of conversations?
- - Can the virtual IP be the real IP?
- - Can I run the web server as a daemon process?
- - How does ALICE remember clients between sessions?
- - How does the Applet work?
- - How does the Applet differ from the application?
- - How do I create an Applet?
- - List twelve basic Applet tips for AIML users
- - Can the AppletHost use a symbolic DNS name instead of an IP number?
- - What files do I need to run the Applet?
- - Does the Applet record dialogues?
- - Can I analyze the dialogues collected by the Applet?
- - Can the applet record a dialog.txt file on the server?
- - I am still having problems with the applet
- - Can you give me any help debugging the Applet?
- - What is AIML?
- - What is XML?
- - What is a category?
- - What is a pattern?
- - What is a template?
- - Can you give me a quick primer on AIML?
- - What is <that>?
- - How do I use "that"?
- - What is <load filename="X"/>?
- - What happens to contractions and punctuation?
- - How are the patterns matched?
- - Do the categories need to be in alphabetical order by pattern?
- - How are the categories stored?
- - Is there a way to use the GUI interface to add one category at a time?
- - Can I build on top of the ALICE code rather than changing it?
- - What's new in AIML?
- - What is <star>?
- - What is a symbolic reduction?
- - What are the get methods?
- - What are the set methods?
- - How do I use the pronoun tags?
- - What is the <topic> tag?
- - Where does the <topic> tag appear?
- - How do I use the <topic> tag?
- - I still don't get "it"
- - Can I create more AIML tags?
- - What are the <person> tags?
- - How does the <condition> tag work?
- - How does the random function work?
- - What is the <person/> tag?
- - What is the <person2/> tag?
- - What is "gossip" ?
- - What is the <personf/> tag?
- - What's the <srai> tag?
- - Could you explain the <srai> tag a little more?
- - How recursive is AIML?
- - What are "justthat" and "justbeforethat"?
- - How can I insert a transcript in the robot reply?
- - Can I run shell commands from AIML scripts?
- - How can I restrict remote clients from running programs on my computer?
- - Can I insert dynamic HTML into the robot reply?
- - Can I include JavaScript in the robot reply?
- - What is <think>?
- - What is the DTD for AIML?
- - Do I need to know about the Java classes?
- - How does program B work?
- - What is the class structure of program B?
- - I tried to compile program B and got a lot of warnings.
- - What are deprecated APIs?
- - What is class Globals?
- - What is class StringSet?
- - What is class StringSorter?
- - What is class StringHistogrammer?
- - What is class StringRanker?
- - What is class Brain?
- - What is the Responder interface?
- - What is the low level interface to program B?
- - Lower, Lower
- - What is class IntSet?
- - What is class SortedIntSet?
- - What is class Substituter?
- - What is class Unifier?
- - What is class Parser?
- - What is class AliceReader?
- - What is class Classifier?
- - What is class LineClassifier?
- - What is class Dialogue?
- - What is class Access?
- - What is class B?
- - What is class Bawt?
- - What is class Blet?
- - What is class Kid?
- - What is class RobotCommunicator?
- - What is class Loader?
- - What is class WebServer?
- - What is class Clerk?
HELP
- What is the goal for AIML?
- Who is the botmaster?
- How can I create my own chat robot?
- How difficult is it to create a chat robot?
- Does ALICE learn?
- Does ALICE think?
- What is the theory behind ALICE?
- Can probability (statistics, weights, neural networks, or fuzzy logic) improve bots?
- Can I have a private conversation with ALICE?
- How do I install ALICE?
- What is the difference between B and C?
- How do I download program B?
- How do I run program B?
- What does "Send" do?
- What does "Clear" do?
- What is program Bawt?
- Does program B run under Windows?
- Does program B run on a Mac?
- Does program B run under Linux?
- Does program B run under XYZ?
- How much memory do I need to run program B?
- How do I install ALICE on Windows?
- What do you mean by the command "java B"?
- I tried running "java B" and I got a "bad command or file name".
- How do I uninstall ALICE from my system?
- Can I create a language-specific installation?
- How does the Personality Wizard work?
- Can I change the name of the robot?
- How can I customize my robot?
- How do I know what categories to add?
- What does "Classify" do?
- What does "Quick Targets" do?
- What does "More Targets" do?
- What does the File menu do?
- What does the Edit menu do?
- What does the Options menu do?
- What is the Botmaster menu?
- What does "Help" do?
- What is on the Help menu?
- Do I have to use the GUI to enter AIML content?
- What are 7 steps to creating content?
- How can I merge two chat robots together?
- What if I don't want to discard duplicate categories?
- How can I create a new robot personality?
- What are all the options for program B?
- Why is the format of the options (globals.txt) so strange?
- How does the web server work?
- How can I get a "permanent" DNS name?
- How can I keep my computer connected all the time?
- Does the web server have to run on port 2001?
- Does program B serve HTML files?
- What files are needed to run the program B web server?
- Can I test the robot offline on my desktop?
- Can I run program B in the background on a NT Server?
- How can I run ALICE on a Mac offline?
- How can I run the ALICE web server on a Mac?
- How can I use the MS Agent Interface?
- Can you help me debug the animated agent?
- Can I speak to the robot with voice input?
- How does ALICE keep track of conversations?
- Can the virtual IP be the real IP?
- Can I run the web server as a daemon process?
- How does ALICE remember clients between sessions?
- How does the Applet work?
- How does the Applet differ from the application?
- How do I create an Applet?
- List twelve basic Applet tips for AIML users
- Can the AppletHost use a symbolic DNS name instead of an IP number?
- What files do I need to run the Applet?
- Does the Applet record dialogues?
- Can I analyze the dialogues collected by the Applet?
- Can the applet record a dialog.txt file on the server?
- I am still having problems with the applet
- Can you give me any help debugging the Applet?
- What is AIML?
- What is XML?
- What is a category?
- What is a pattern?
- What is a template?
- Can you give me a quick primer on AIML?
- What is <that>?
- How do I use "that"?
- What is <load filename="X"/>?
- What happens to contractions and punctuation?
- How are the patterns matched?
- Do the categories need to be in alphabetical order by pattern?
- How are the categories stored?
- Is there a way to use the GUI interface to add one category at a time?
- Can I build on top of the ALICE code rather than changing it?
- What's new in AIML?
- What is <star>?
- What is a symbolic reduction?
- What are the get methods?
- What are the set methods?
- How do I use the pronoun tags?
- What is the <topic> tag?
- Where does the <topic> tag appear?
- How do I use the <topic> tag?
- I still don't get "it"
- Can I create more AIML tags?
- What are the <person> tags?
- How does the <condition> tag work?
- How does the random function work?
- What is the <person/> tag?
- What is the <person2/> tag?
- What is "gossip" ?
- What is the <personf/> tag?
- What's the <srai> tag?
- Could you explain the <srai> tag a little more?
- How recursive is AIML?
- What are "justthat" and "justbeforethat"
- How can I insert a transcript in the robot reply?
- Can I run shell commands from AIML scripts?
- How can I restrict remote clients from running programs on my computer?
- Can I insert dynamic HTML into the robot reply?
- Can I include JavaScript in the robot reply?
- What is <think>?
- What is the DTD for AIML?
- Do I need to know about the Java classes?
- How does program B work?
- What is the class structure of program B?
- I tried to compile program B and got a lot of warnings.
- What are deprecated APIs?
- What is class Globals?
- What is class StringSet?
- What is class StringSorter?
- What is class StringHistogrammer?
- What is class StringRanker?
- What is class Brain?
- What is the Responder interface?
- What is the low level interface to program B?
- Lower, Lower
- What is class IntSet?
- What is class SortedIntSet?
- What is class Substituter?
- What is class Unifier?
- What is class Parser?
- What is class AliceReader?
- What is class Classifier?
- What is class LineClassifier?
- What is class Dialogue?
- What is class Access?
- What is class B?
- What is class Bawt?
- What is class Blet?
- What is class Kid?
- What is class RobotCommunicator?
- What is class Loader?
- What is class WebServer?
- What is class Clerk?
HOW ARE THE CATEGORIES STORED
If your session with program B included a "Classify" routine, then
the AIML script is stored in order of category activation rank.
In other words, program B stores
the most frequently accessed category (usually '*') first, the second
most frequently next, and so on. If a number of categories have the
same activation count, program B saves them in alphabetical order by
pattern. Hence, if the session did not include a "classify" routine,
the program stores all the categories in alphabetical order by pattern
(because they all have an activation count of zero).
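The storage order described above can be sketched in Java as a
two-level sort. This is only an illustration: the class CategoryOrder,
its Category type and the saveOrder method are hypothetical names, not
the classes program B actually uses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; not the actual program B classes.
class CategoryOrder {
    // A category reduced to the two fields that matter for ordering.
    static final class Category {
        final String pattern;
        final int activations;

        Category(String pattern, int activations) {
            this.pattern = pattern;
            this.activations = activations;
        }
    }

    // Highest activation count first; ties broken alphabetically by pattern.
    static final Comparator<Category> SAVE_ORDER =
        Comparator.comparingInt((Category c) -> c.activations).reversed()
                  .thenComparing(c -> c.pattern);

    // Returns the patterns in the order they would be written to the file.
    static List<String> saveOrder(List<Category> categories) {
        List<Category> sorted = new ArrayList<>(categories);
        sorted.sort(SAVE_ORDER);
        List<String> patterns = new ArrayList<>();
        for (Category c : sorted) {
            patterns.add(c.pattern);
        }
        return patterns;
    }

    public static void main(String[] args) {
        List<Category> cats = List.of(
            new Category("WHAT IS *", 12),
            new Category("HELLO", 12),
            new Category("*", 40),
            new Category("BYE", 0));
        System.out.println(saveOrder(cats)); // [*, HELLO, WHAT IS *, BYE]
    }
}
```

When every activation count is zero, the first key ties everywhere and
the result is plain alphabetical order by pattern.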
One reason to store the categories in order by activation is to
make the Applet interface more natural. Because the Applet interface
starts simultaneously with a thread to load the robot source file,
the Applet client can talk with the robot before all the categories
are fully loaded. Given that the interlocutor is more likely to
say something that activates a more frequently activated category,
it makes sense to transmit these categories first. Storing the
*.aiml files in order of category activation achieves the desired effect.
The Applet loads the most frequent categories first, and continues
loading in the background while the conversation begins.
HOW ARE THE PATTERNS MATCHED
Program B stores the categories in alphabetical order by pattern.
When a client enters an input, the program scans the categories
in reverse order to find the best match. By comparing the
input with the patterns in reverse alphabetical order, the algorithm
ensures that the most specific pattern matches first. "Specific"
in this case has a formal definition, but basically it means that
the program finds the "longest" pattern matching an input.
The wild-card character "*" comes before "A" in alphabetical
order, so a pattern such as "WHAT *" sorts before "WHAT IS *" and is
tried after it in the reverse scan. This is the desired order, because
"WHAT *" is more general than "WHAT IS *".
The default pattern "*" is first in alphabetical order and the
most general pattern, so it is tried last. For convenience AIML also
provides a variation on "*" denoted "_", which comes after "Z" in
alphabetical order and is therefore tried first.
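A minimal Java sketch of the reverse alphabetical scan follows. It is
not the matcher in program B: the class ReverseScanMatcher and its
methods are hypothetical, and it assumes that "*" and "_" each stand
for one or more whole words. Plain string sorting suffices because in
ASCII "*" sorts before the letters and "_" sorts after them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; the real matcher in program B may differ.
class ReverseScanMatcher {
    // True if the input words from index i match the pattern words from
    // index p. "*" and "_" each absorb one or more words (an assumption).
    static boolean matches(String[] pat, int p, String[] in, int i) {
        if (p == pat.length) {
            return i == in.length;
        }
        if (pat[p].equals("*") || pat[p].equals("_")) {
            // Let the wildcard absorb 1, 2, ... words and try the rest.
            for (int next = i + 1; next <= in.length; next++) {
                if (matches(pat, p + 1, in, next)) {
                    return true;
                }
            }
            return false;
        }
        return i < in.length && pat[p].equals(in[i])
            && matches(pat, p + 1, in, i + 1);
    }

    // Sort alphabetically, then scan from the last pattern to the first.
    static String bestPattern(List<String> patterns, String input) {
        List<String> sorted = new ArrayList<>(patterns);
        Collections.sort(sorted); // ASCII: '*' < 'A'..'Z' < '_'
        String[] words = input.trim().split("\\s+");
        for (int k = sorted.size() - 1; k >= 0; k--) {
            if (matches(sorted.get(k).split("\\s+"), 0, words, 0)) {
                return sorted.get(k);
            }
        }
        return null; // no pattern matched
    }

    public static void main(String[] args) {
        List<String> patterns =
            List.of("*", "WHAT *", "WHAT IS *", "WHAT IS AIML");
        System.out.println(bestPattern(patterns, "WHAT IS AIML")); // WHAT IS AIML
        System.out.println(bestPattern(patterns, "WHAT IS XML"));  // WHAT IS *
        System.out.println(bestPattern(patterns, "WHAT TIME"));    // WHAT *
        System.out.println(bestPattern(patterns, "HELLO"));        // *
    }
}
```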
HOW CAN I CREATE A NEW ROBOT PERSONALITY
There is a lot of flexibility in robot personality design with AIML.
You can add to any of the existing AIML files, modify or delete them,
create your own, or use the GUI tools to analyze the log files
and create new categories. One simple method is to create your own
Specialty.aiml file so that you can always get the latest copies
of the ALICE files. Load your Specialty.aiml first in the root
AIML file (usually B.aiml) so that its categories have priority over ALICE's.
HOW CAN I CREATE MY OWN CHAT ROBOT
The secret to chat bot programming, if there is one, is what Simon
Laven called "continuous beta testing". Program B runs as a server
and collects dialog on the web. The program provides the chat bot
developer with a tool called "classify dialogues" that tests the current
robot with the history of accumulated human queries. Moreover, the program
suggests new categories automatically, for the botmaster to refine.
HOW CAN I CUSTOMIZE MY ROBOT
AIML provides several tags useful to quickly clone
a chat robot from ALICE with a distinct "personality":
<gender/> the robot's gender
<location/> the robot's location
<birthday/> the robot's birthday
<botmaster/> the botmaster's name
Together with the previously discussed <name/>, these
tags allow you to quickly create a clone from the ALICE
Brain with a separate identity from ALICE.
All the personality tag values can be modified through
the Personality Wizard. The tag values can also be
changed with the Options Menu in program B. Use "Show Options"
and "Save Options" to customize your chat robot.
To test the new features, we created a male robot named
Brute (because "all men are brutes") born on August 18, 1999.
HOW CAN I GET A PERMANENT DNS NAME
You can buy a fixed IP address from an ISP, but suppose
you want to run a chat robot (or other server) from your home over an
ordinary ISP connection? Or suppose you want to carry it around on
your notebook PC, and plug it in anywhere in the world?
One solution is a dynamic IP registry service by Dynip (www.dynip.com).
They offer a service that allows you to register your computer
with their server so that you always receive the same DNS name,
for example alicebot.dynip.com. Every time you connect to your
ISP, dynIP automatically associates your dynamic IP address with
your permanent DNS name.
HOW CAN I INSERT A TRANSCRIPT IN THE ROBOT REPLY
The purpose of <get_dialogue/> is to give the client a transcript of
his or her conversation with ALICE. Unfortunately this feature was
advertised in a press article before we had a really efficient
implementation, and the large number of dialogue requests bogged
down the server. So for now <get_dialogue/> just displays a warning.
HOW CAN I KEEP MY COMPUTER CONNECTED ALL THE TIME
Running a web server from home can be frustrating if your ISP
automatically detects periods of "inactivity" or hangs up your
connection after a fixed interval like 12 hours. Check out the
Rascal program from Basta computing (www.basta.com) which runs
as a watchdog to keep your Windows machine connected 24/7.
Another alternative is to use the program B applet, called Blet.java.
A third alternative is the ALICE Servlet. Some ISPs will
allow you to install a Servlet on their server.
HOW CAN I MERGE TWO CHAT ROBOTS TOGETHER
There are two ways to merge robots together. First, you can
use the File menu option "merge" to directly load the contents
of another bot file. You may see a lot of "duplicate key
discarded" warnings but these can be ignored because the program
is simply eliminating overlapping content.
Another method is to use the <load filename="X"/> tag.
Suppose you load two or more files with the load tag,
and those files contain duplicate keys.
Which categories get the priority? The answer is: it depends
on the order of the <load> tags used to load the AIML files.
If your B.aiml contains:
<load filename="Brain.aiml"/>
<load filename="German.aiml"/>
then the categories from "Brain" have priority, and duplicates
in "German" are discarded. If the order is the opposite, German
categories have priority and Brain's duplicates are discarded.
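The "first loaded wins" rule can be sketched in Java as follows. The
class LoadOrderMerge is hypothetical, and each AIML file is simplified
to a map from pattern to template; this is an assumption made for
illustration, not the way program B stores its categories.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; each "file" is a map from pattern to template.
class LoadOrderMerge {
    // Merge the files in load order; the first category for a key wins.
    static Map<String, String> merge(List<Map<String, String>> filesInLoadOrder) {
        Map<String, String> brain = new LinkedHashMap<>();
        for (Map<String, String> file : filesInLoadOrder) {
            for (Map.Entry<String, String> e : file.entrySet()) {
                // putIfAbsent keeps the earlier entry and returns it.
                if (brain.putIfAbsent(e.getKey(), e.getValue()) != null) {
                    System.out.println("duplicate key discarded: " + e.getKey());
                }
            }
        }
        return brain;
    }

    public static void main(String[] args) {
        Map<String, String> brainFile = Map.of("HELLO", "Hi there!");
        Map<String, String> germanFile =
            Map.of("HELLO", "Hallo!", "GUTEN TAG", "Guten Tag!");
        // Brain is loaded first, so its HELLO category has priority.
        Map<String, String> merged = merge(List.of(brainFile, germanFile));
        System.out.println(merged.get("HELLO")); // Hi there!
    }
}
```

Reversing the list passed to merge reverses the priority, just as
swapping the two <load> tags does.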
HOW CAN I RESTRICT REMOTE CLIENTS FROM RUNNING PROGRAMS ON MY COMPUTER
If your reply contains the markup
<system>yourcommand <get_ip/></system>
then the robot will insert the (virtual) client IP into the command
line argument for "yourcommand". Then it is up to "yourcommand" to
enforce access privileges.
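As an illustration of what "yourcommand" might do, here is a minimal
Java sketch that receives the client address as its first argument and
refuses anything not on an allow list. The class name YourCommand and
the contents of the list are assumptions; substitute the addresses you
actually trust.

```java
import java.util.Set;

// Hypothetical stand-in for "yourcommand"; the allow list is an assumption.
class YourCommand {
    static final Set<String> ALLOWED = Set.of("127.0.0.1", "localhost");

    // True only for client addresses that appear on the allow list.
    static boolean isAllowed(String clientIp) {
        return clientIp != null && ALLOWED.contains(clientIp.trim());
    }

    public static void main(String[] args) {
        // The robot passes the (virtual) client IP as the first argument.
        String clientIp = args.length > 0 ? args[0] : null;
        if (!isAllowed(clientIp)) {
            System.out.println("Access denied.");
            return;
        }
        System.out.println("Access granted to " + clientIp + ".");
        // ... the privileged work would go here ...
    }
}
```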
HOW CAN I RUN ALICE ON A MAC OFFLINE
First open folder B and change all the IPs in the two files Blet.aiml
and Bletemplate.aiml to 127.0.0.1.
Also in folder B add the following three lines at the end of the file
header.html.
<H1>Welcome to A. L. I. C. E.</H1>
<IMG SRC="ALICEBot.jpg">
<BR>
Also in folder B change the following three parameters in the file
globals.txt to the values shown:
AppletHost=127.0.0.1
CodeBase=http://127.0.0.1/B
Advertize=false
Also in the globals.txt file remove the line:
ACFURL=http-//microsoft.com/agent2/chars/robby/robby.acf
Next open your TCP/IP control panel and set up a new configuration, named
Alice perhaps.
In the TCP/IP control panel select "Connect via: Ethernet built-in" if you
have it; if not, you may have to experiment.
Then select "Configure Manually".
Finally, set both the IP Address and the Name server addr. to 127.0.0.1.
Double click the newly saved A.L.I.C.E. application to bring up the botmaster
panel and Java Console.
As A.L.I.C.E. loads, read the information messages scrolling by in the Java
Console and record the port number that the web server (started by A.L.I.C.E.)
is listening on, probably 2001.
Start up your preferred browser.
Leave browser in online mode.
Enter http://127.0.0.1:2001 (i.e. the localhost's IP)
or
Enter http://localhost:2001 (I've not always been successful with this one)
Hit return to send the IP.
The A.L.I.C.E. transaction page should appear in your browser's window and
you can talk to Alice.
HOW CAN I RUN THE ALICE WEB SERVER ON A MAC
To run Alice online:
Connect your Mac to a network.
Double click the newly saved A.L.I.C.E. application to bring up the botmaster
panel and Java Console.
As A.L.I.C.E. loads, read the information messages scrolling by in the Java
Console and record the port number that the web server (started by A.L.I.C.E.)
is listening on, probably 2001.
Start up your preferred browser.
Get your IP from the TCP/IP control panel.
Enter your IP followed by a colon and then the port number read from the Java
Console, e.g. http://nn.nnn.nn.nnn:2001
or
Enter http://127.0.0.1:2001 (i.e. the localhost's IP)
or
Enter http://localhost:2001
Hit return to send the IP.
The A.L.I.C.E. transaction page should appear in your browser's window and
you can talk to Alice.
HOW CAN I USE THE MS AGENT INTERFACE
Select the menu item Options/Toggle MS Agent. This sets the
output HTML to a format that includes commands to run MS Agent.
The client may activate the agent if she receives a template
with the <set_animagent/> tag. The free ALICE download includes
a couple of example categories using this tag. Try asking
ALICE, "Can you speak?". In another demo ALICE imitates
the famous fictional AI HAL from 2001: A Space Odyssey.
Client: Tell me about yourself
Robot: I am an artificial linguistic entity. I was created
by Jon Baer at Bethlehem, Pennsylvania,
on November 23, 1995. He taught me to sing a song.
Would you like me to sing it for you?.
Client: yes
Robot: Ahem. It's called, "Daisy." (Agent sings "Daisy")
The MS Agent VB script appears as embedded HTML in the client
reply. To verify the script, use the browser "View Page Source"
menu item.
On most newer browsers, the agent software will download
automatically after the script starts. The download may take
several minutes, depending on the speed of the connection.
Clients should be warned that the download is slow. Also,
the agent software download will display one or more licenses
in Dialog boxes. You may not want to accept the terms of the
MS agent software licenses.
HOW DIFFICULT IS IT TO CREATE A CHAT ROBOT
Not difficult. If you can write HTML, you can write AIML (Artificial
Intelligence Markup Language). Here is an example of a simple but
complete chat robot in AIML:
<alice>
<category>
<pattern>*</pattern>
<template> Hello! </template>
</category>
</alice>
The tags <alice>...</alice> indicate that this markup contains a
chat robot. The <category> tag indicates an AIML category, the
basic unit of chat robot knowledge. The category has a <pattern>
and a <template>. The pattern in this case is the wild-card
symbol '*' that matches any input. The template is just the text
"Hello!" As you may have guessed, this simple chat robot just
responds by saying "Hello!" to any input.
You can get started with AIML knowing just the three tags
<category>, <pattern> and <template>; much like you may have
started with HTML knowing only <a>, <img> and <h1>.
HOW DO I CREATE AN APPLET
Go to the Options menu and select "Show Options." You need
to change the values of "AppletHost" and "CodeBase" to the
correct IP address and directory for your applet host.
Many people want to post the applet on their web site.
In that case, change the IP address "206.184.206.210" to
the name or IP address of the web server. Change the
directory path "/B" in "CodeBase" to your directory on
the remote server. Save the changes with "Save Options."
Select "Create Applet" from the options menu to create
the "index.html" and "Blet.aiml" files needed to run
your applet. The program displays the contents of
"index.html" in your text area.
Use a file transfer utility like FTP to upload the
class files (or jar file--see "What files do I need to
run the Applet") to your web server.
HOW DO I DOWNLOAD PROGRAM B
Create a Directory (or Folder) on your machine to download
the B.zip file. When you click on "B.zip" the browser
should ask you where you want to save the file. Select the
directory you created and save B.zip to that folder.
Once you have downloaded the file, you can use "unzip B.zip" to extract the files.
If you don't have this unzip command on your machine, you can get
a free one from Winzip (www.winzip.com) to unzip the "B.zip" file.
If you want to get into the Java source code, you need a
Java 1.1.7 (or higher) development kit release.
Go to java.sun.com for a free one. The program source code
and all associated files are stored in the single "zip" file
called B.zip. To extract the files use the command
"unzip B.zip" (assuming you have "unzip" on your machine).
HOW DO I INSTALL ALICE
If you purchased a commercial version of ALICE on CD ROM or
over the web, installation should be very easy. These versions
usually have their own self-extracting and install software.
You can install the ALICE program with just a mouse click and
activate it with a desktop icon.
If you bought a commercial version of ALICE with a self-installer,
you can skip this section and go on to "Creating Content".
HOW DO I INSTALL ALICE ON WINDOWS
Download Alicebot.Net at www.alicebot.net.
HOW DO I KNOW WHAT CATEGORIES TO ADD
After you collect some dialogue, run "Classify" and "Quick Targets".
This will tell you the most frequently asked patterns that do not
already have specific responses. The "Target" functions display new
categories with proposed patterns and template fields filled with
the name of another category. Delete the template information and fill
in a new response. You can also edit the pattern to simplify it or
generalize it with a "*" operator.
HOW DO I RUN PROGRAM B
Use the command "java B" to start the program. On some Windows
machines the Java runtime engine is started with the command
"jview" instead of "java". If "jview B" does not work, try
"jview Bawt".
Run program B and notice that the program creates an Edit View
text window. By default, program B loads the chat robot ALICE
(stored in B.aiml).
HOW DO I UNINSTALL ALICE FROM MY SYSTEM
If you installed ALICE on Windows with a commercial installer like
InstallShield Java Edition, then go to the start menu and
select "Control Panel". Click on the control panel item called
"Add/Remove Programs". Select ALICE from the list of installed
software and choose "Uninstall".
All the files of ALICE are stored in one directory on your computer
(or folder) usually called "B" but maybe something else depending
on the name you chose when you downloaded ALICE. In any case,
ALICE will not change or damage any other files on your system.
To remove ALICE from your computer, simply remove this folder.
Delete it, or drag it to your trash bin and select "Empty trash"
(or "Empty Recycle Bin").
If you cannot find the folder where ALICE resides, use the Finder
to locate the file called "B.aiml" on your file system. The "B.aiml"
file is in the same directory as all the ALICE files. If this file does
not exist, then ALICE is probably not installed on your computer.
Because ALICE is a platform-independent Java application, it does
not rely on the Windows Registry or other Windows-specific features.
You can assume ALICE will leave your MS Windows Registry and
other Windows system files untouched.
Conceivably if ALICE has run for a long time on your computer, and
you deliberately used the "Save Options" menu item to change the
name or location of her files to something other than the default values,
then there is a slight chance that there could be a few ALICE
files scattered around your disk. Please refer to the DISCLAIMER
at the beginning of DON'T READ ME.
HOW DO I USE THAT
The AIML tag <that> refers to the robot's previous
reply. There are two forms of the <that> tag:
a paired form <that>...</that> appearing in a
category, and an atomic form <that/> always appearing
in a template. Often we can use <that/> to find
an opportunity to create a category with <that></that>.
One of the default replies to the input "WHY" is
"<that/>"? Why? This default produces the following
dialogue fragment:
Robot: Do not ask me any more questions please.
Client: WHY
Robot: "Do not ask me any more questions please"? Why?
The botmaster notices the fragment and creates the
new AIML category:
<category>
<pattern>WHY</pattern>
<that>DO NOT ASK ME ANY MORE QUESTIONS PLEASE</that>
<template>Because I would rather talk about you.</template>
</category>
Now the next client who asks "WHY" to the robot's
request will activate the new <that> category:
Robot: Do not ask me any more questions please.
Client: WHY
Robot: Because I would rather talk about you.
This style of conversational analysis does not
presuppose that we know when the client will
say "WHY"; rather it looks backward to identify
cases where the "WHY" appeared following one
of the robot's statements. Having identified
the conversation point, the botmaster creates
the new category.
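The lookup behind this example can be sketched in Java. The class
ThatLookup is hypothetical and much simpler than program B: it treats
the pair (pattern, that) as a key, uses "*" to mean "any previous
reply", and assumes that replies are compared after being reduced to
upper-case words.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; class and method names are not from program B.
class ThatLookup {
    private final Map<String, String> categories = new HashMap<>();

    // Register a category; "*" as the that-value means "any previous reply".
    void add(String pattern, String that, String template) {
        categories.put(pattern + " <that> " + that, template);
    }

    // Reduce a reply to upper-case words so it can be compared with <that>.
    static String normalize(String reply) {
        return reply.toUpperCase().replaceAll("[^A-Z0-9 ]", "").trim();
    }

    // Prefer the category whose <that> equals the previous reply;
    // otherwise fall back to the category with no <that> restriction.
    String reply(String input, String previousReply) {
        String template =
            categories.get(input + " <that> " + normalize(previousReply));
        if (template == null) {
            template = categories.get(input + " <that> *");
        }
        if (template == null) {
            return null;
        }
        // The atomic <that/> echoes the previous reply
        // (its trailing period is dropped in this sketch).
        String echo = previousReply.replaceAll("\\.$", "");
        return template.replace("<that/>", echo);
    }

    public static void main(String[] args) {
        ThatLookup bot = new ThatLookup();
        String previous = "Do not ask me any more questions please.";
        bot.add("WHY", "*", "\"<that/>\"? Why?");
        System.out.println(bot.reply("WHY", previous));
        bot.add("WHY", "DO NOT ASK ME ANY MORE QUESTIONS PLEASE",
                "Because I would rather talk about you.");
        System.out.println(bot.reply("WHY", previous));
    }
}
```

The first call prints the default echo; after the botmaster adds the
specific category, the second call prints the new reply.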
HOW DO I USE THE LT TOPIC GT TAG
The concept is that the botmaster uses the <settopic> tags to set
the current topic being discussed. Once the topic is set, when
the client types in a statement for ALICE to find a response for,
the categories defined within the <topic> tags matching the
current topic will be searched first, before any of the non-topic
categories or the default categories. If there is not a
matching category defined in the current topic, then any
categories that are not defined in topic tags are searched. As
mentioned before, you can create categories with identical
<pattern> phrases in different topics, each with different
responses that cater to the current topic.
A proof-of-concept example:
A very useful topic entry might be the default "*" input for
specific topics. If ALICE were set up on a pet store web site
and a person was talking to ALICE about dogs, a useful entry
might be:
<topic name="DOGS">
<category>
<pattern> * </pattern>
<template>
<random>
<li> Dogs are one of the most popular pets to have.</li>
<li> Have you ever met a Chihuahua you didn't like?</li>
<li> What else do you know about dogs? </li>
<li> Do you have any questions about dogs? </li>
</random>
</template>
</category>
<!-- more dog categories... -->
</topic>
Normally there would be many entries in a topic, but in this
example, we simply entered the default "*". In this case, if the
person said something that ALICE didn't have a specific
programmed response for, she could still respond intelligently
within the current topic. (Note: this is all assuming there are
existing categories that might set the current topic to "DOGS")
Also, though a topic can have only one name, that name can contain the
wildcard characters "*" or "_" just like a pattern. As with patterns,
only one wildcard character
is allowed per name. This would allow topics like "CARING FOR
DOGS" or "GROOMING DOGS" to also fall into the "_ DOGS" topic.
As with patterns, the more specific topics would gain preference
over the wildcarded topic. This means that if the topic is
currently "GROOMING DOGS" and yet there is not a programmed
response for the input in that category, then "_ DOGS" would be
checked, and then next the default categories.
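The search order just described can be sketched in Java. The class
TopicSearchOrder is hypothetical; it handles only a single leading
wildcard ("_ DOGS" or "* DOGS"), and the label "(no topic)" stands for
the categories defined outside any <topic> tag.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the topic search order; not program B code.
class TopicSearchOrder {
    // True if a topic name (possibly starting with "_ " or "* ")
    // covers the current topic.
    static boolean topicMatches(String topicName, String current) {
        if (topicName.startsWith("_ ") || topicName.startsWith("* ")) {
            String tail = topicName.substring(1); // keeps the leading space
            return current.endsWith(tail) && current.length() > tail.length();
        }
        return topicName.equals(current);
    }

    // Topics to search, most specific first, ending with the non-topic set.
    static List<String> searchOrder(String currentTopic, List<String> definedTopics) {
        List<String> order = new ArrayList<>();
        if (definedTopics.contains(currentTopic)) {
            order.add(currentTopic); // exact topic first
        }
        for (String t : definedTopics) {
            if (!t.equals(currentTopic) && topicMatches(t, currentTopic)) {
                order.add(t); // then wildcarded topics that cover it
            }
        }
        order.add("(no topic)"); // finally the categories outside any topic
        return order;
    }

    public static void main(String[] args) {
        List<String> topics = List.of("_ DOGS", "GROOMING DOGS", "CATS");
        // [GROOMING DOGS, _ DOGS, (no topic)]
        System.out.println(searchOrder("GROOMING DOGS", topics));
    }
}
```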
HOW DO I USE THE PRONOUN TAGS
The guiding principle in creating content is dialogue analysis.
The principle applies to pronouns like "it", "he" and "we". This
example shows how a couple of tricks help us find places where
the pronoun tags improve the dialogue.
Client: HELLO
Robot: Hi there!
Client: HOW ARE YOU
Robot: I am functioning within normal parameters.
Client: I AM BROKE
Robot: Do you mean your name is broke.
Client: NO I HAVE NO MONEY
Robot: I see. Would you like to have it?
Client: YES
Robot: "Would you like to have it"? Yes?
One of the default responses to a YES pattern is
"<that/>?" Yes?
The conversation breaks down somewhat at this point, but
the robot nonetheless conveys an illusion of understanding
the client's response. With its response, the robot is really
telling the botmaster to create a new "YES" reply specific
to this situation:
<category>
<pattern>YES</pattern><that>WOULD YOU LIKE TO HAVE IT</that>
<template>Where do you usually get <get_it/>?</template>
</category>
Now the next time we run the dialogue it ends with:
Client: NO I HAVE NO MONEY
Robot: I see. Would you like to have it?
Client: YES
Robot: Where do you usually get money?
There is no automated technique for finding the conversation
points where the pronoun tags improve the flow of conversation.
This is the "art" of AIML programming. The example shown here
with '"<that/>?" Yes?' exemplifies one approach to finding these
conversation way points. There are no doubt countless other
tricks like this, and the field is wide open to linguists and
writers to help us uncover them.
HOW DOES ALICE KEEP TRACK OF CONVERSATIONS
Originally ALICE used IP addresses to keep track of clients.
Assuming that everyone chatting with ALICE has a fixed IP
address, at least for the duration of their conversation,
this technique works successfully. Each IP address is a key
into a hashtable (or database) that stores the client's
dialogue, name, and values of pronouns and other AIML values.
Unfortunately, many clients have "dynamic IP addressing" enforced
by their ISP. AOL and MS WebTV are two notorious examples:
each successive client transaction appears to come from a different
host. For this reason, program B uses a form of "virtual IP"
addressing to track dialogues.
The form in index.html (and the ALICE home page) contains a
tag that creates a "hidden" parameter called "virtual" with
an initial value of "none." The server assigns a unique name
to the value of "virtual", which then becomes a hidden variable
in the client's HTML form. Each successive client transaction
contains this virtual IP address; the server uses it as a key
to index the conversation.
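A minimal Java sketch of this bookkeeping follows. The class
VirtualIpTracker and the "client1, client2, ..." naming scheme are
assumptions made for illustration; the text above says only that the
server assigns a unique name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of virtual IP tracking; not program B code.
class VirtualIpTracker {
    private final Map<String, List<String>> dialogues = new HashMap<>();
    private int nextId = 1;

    // Returns the value to place in the hidden "virtual" field of the next
    // form: a fresh unique name if the client sent the initial value "none",
    // otherwise the name the client sent back.
    String resolve(String virtual) {
        if (virtual == null || virtual.equals("none")) {
            virtual = "client" + nextId++; // naming scheme is an assumption
        }
        dialogues.computeIfAbsent(virtual, k -> new ArrayList<>());
        return virtual;
    }

    // Append one line of dialogue to the conversation for this key.
    void record(String virtual, String line) {
        dialogues.computeIfAbsent(virtual, k -> new ArrayList<>()).add(line);
    }

    // The conversation so far for this key (empty if unknown).
    List<String> dialogue(String virtual) {
        return dialogues.getOrDefault(virtual, List.of());
    }

    public static void main(String[] args) {
        VirtualIpTracker tracker = new VirtualIpTracker();
        String a = tracker.resolve("none"); // first transaction of client A
        String b = tracker.resolve("none"); // first transaction of client B
        tracker.record(a, "HELLO");
        tracker.record(tracker.resolve(a), "HOW ARE YOU"); // A returns
        System.out.println(a + ": " + tracker.dialogue(a));
        System.out.println(b + ": " + tracker.dialogue(b));
    }
}
```

Because the key travels in the form rather than in the network address,
the conversation stays together even when the client's real IP changes
between transactions.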
HOW DOES ALICE REMEMBER CLIENTS BETWEEN SESSIONS
The persistence of memory in ALICE is inherited from
the Java Properties class. The program B class Classifier
saves the client name, age, location and other properties
in a set of Properties lists. These Properties inherit
the Java load and store methods. Program B uses the load
and store methods to save the client properties in a set of
files with names ip_name.txt, ip_age.txt, ip_location.txt
and so on. If these files become too large or bothersome,
there is no harm deleting or editing them, or moving them
to another directory.
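The load and store methods mentioned above belong to the standard
class java.util.Properties. The sketch below shows the round trip for
one property list; the class ClientMemory and the key used in the
example are hypothetical, and the file is written to a temporary
directory rather than to the program B folder.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical sketch of client persistence with java.util.Properties.
class ClientMemory {
    // Save one property list (for example, client names keyed by IP).
    static void save(Properties props, Path file) throws IOException {
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "client properties");
        }
    }

    // Load a property list; a missing file yields an empty list.
    static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempDirectory("alice").resolve("ip_name.txt");
        Properties names = new Properties();
        names.setProperty("127.0.0.1", "Dr Wallace");
        save(names, file);
        System.out.println(load(file).getProperty("127.0.0.1")); // Dr Wallace
    }
}
```

Since a missing file simply loads as an empty list, deleting one of
these files loses only the remembered values.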
The Applet requires no memory of the client properties, because
the applet has only the one client, and in any case remains in
memory (at least for the lifetime of the client's browser cache).
HOW DOES PROGRAM B WORK
The basic loop of program B is to accept an input,
either from the GUI or from the Web, to
preprocess that input and segment it into sentences,
and, for each sentence, to find the best match among
the patterns, and to return the corresponding reply.
Each reply is itself an AIML template, in effect a mini-
program that tells program B how to construct the reply.
The algorithm is thus divided into a matching phase
and a response evaluation phase. In fact these two
phases interleave, because the response may evoke
a recursive call to the pattern matcher with the
<srai> or <sr/> tags.
HOW DOES THE APPLET DIFFER FROM THE APPLICATION
The Applet runs on the client's computer; the server runs
on your host machine. The applet has fewer privileges and
therefore a simpler user interface than the Application,
which uses menus and buttons to control server-side functions.
The Applet may reside on any web server, such as one provided
with an ISP account, but the application requires a 24/7
connection to the Web.
Internally, the primary difference between the two programs
is that the Applet handles only one client conversation,
while the application processes multiple client connections
simultaneously. The Applet also suppresses all HTML (and any
other XML) from the client response.
HOW DOES THE APPLET WORK
Program B supports the creation of both server-side and client-side
chat robots. The server runs as a thread in program B. The
client-side version is supported by an applet called Blet.java.
The Applet Blet.java runs ALICE in a web browser, or with
the Java tool appletviewer. The file "index.html" contains an
example of the HTML Applet tag syntax needed to start
the Applet. The command "appletviewer index.html" will start the
Applet.
You also have to create the file "index.html" and change the
default value of the parameters "codebase" and
"applethost" to serve the Applet from your location.
HOW DOES THE LT CONDITION GT TAG WORK
This category illustrates the function of the
(template-side) condition tag. The input pattern
is "TEST COND":
<category>
<pattern>TEST COND</pattern>
<template>
This category has two condition statements.<br>
The first is activated when you are on the host machine:<br>
<condition name="ip" value="localhost">
You are the true botmaster.<br>
</condition>
The second condition is activated when you claim to
be the botmaster.<br>
<condition name="name" value="* WALLACE">
Imposter! You are not my real botmaster.<br>
</condition>
That concludes our test of the condition tag.<br>
</template>
</category>
Two dialogues from different hosts show two
possible outputs of this category:
--------------------dialup.mindspring.com--------------
Client: MY NAME IS DR WALLACE.
Robot: OK I will call you Dr Wallace.
Client: TEST COND.
Robot: This category has two condition statements.
The first is activated when you are on the host machine:
The second condition is activated when you claim to be the botmaster.
Imposter! You are not my real botmaster.
That concludes our test of the condition tag.
---------------------localhost-------------------------
Client: TEST COND.
Robot: This category has two condition statements.
The first is activated when you are on the host machine:
You are the true botmaster.
The second condition is activated when you claim to be the botmaster.
That concludes our test of the condition tag.
Note:
1. There may be multiple <condition> tags in the
<template>. [But nesting doesn't work yet.]
2. The name attribute (the predicate name) must be one of: it, ip, he, she, age,
name, topic, gender, location, or one of the custom predicates
defined in predicates.txt
3. The value string may contain an AIML pattern with up to
one wild-card "*" symbol.
4. The test for the <condition> being true uses
Unifier.unify() to compare the stored predicate value
with the value string. This is the same way
<that> and <topic> work.
5. If the test returns true, then the response contains
whatever is inside the <condition>...</condition> tags,
otherwise those contents are blanked.
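Notes 3 to 5 can be sketched in Java. This is not the source of
Unifier.unify(): the class ConditionTest is hypothetical, and the
sketch assumes a case-insensitive comparison and a "*" that stands for
at least one character.

```java
// Hypothetical sketch of the <condition> test; not the real Unifier class.
class ConditionTest {
    // True if the stored predicate value matches the value string, which
    // may contain at most one "*" (assumed to match one or more characters).
    static boolean unify(String stored, String value) {
        String s = stored.trim().toUpperCase();
        String v = value.trim().toUpperCase();
        int star = v.indexOf('*');
        if (star < 0) {
            return s.equals(v);
        }
        String prefix = v.substring(0, star);
        String suffix = v.substring(star + 1);
        return s.length() > prefix.length() + suffix.length()
            && s.startsWith(prefix) && s.endsWith(suffix);
    }

    // The contents are kept when the test succeeds and blanked otherwise.
    static String condition(String stored, String value, String contents) {
        return unify(stored, value) ? contents : "";
    }

    public static void main(String[] args) {
        System.out.println(unify("Dr Wallace", "* WALLACE")); // true
        System.out.println(unify("localhost", "localhost"));  // true
        System.out.println(
            condition("dialup.mindspring.com", "localhost",
                      "You are the true botmaster.").isEmpty()); // true
    }
}
```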
HOW DOES THE PERSONALITY WIZARD WORK
The simplest way to alter the content of the basic ALICE
robot personality is to run the Personality Wizard on
the "Options" menu (or in the Kid interface).
This wizard asks the botmaster a series
of questions to set the values of a set of robot
personality tags including its name, gender, preferences
and replies to very common questions.
The Personality Wizard does not create any new AIML
categories. The replies set the value of global tags
like <location/> and <favorite_movie/> that might be
used in many categories throughout the AIML knowledge
base. The basic set of Wizard questions is collected
in the file Personality.aiml.
Hint: If you plan to use the Applet, avoid the double-quote (")
character in the Personality Wizard.
HOW DOES THE RANDOM FUNCTION WORK
The random function is (so far) the only AIML method
with a list argument. Its purpose is random selection
of one of a set of text items. In "old-style" AIML the
text appendage operator "+" also served as a list-item
marker. In XML style we use the HTML <li> list-item
tag.
<random> <li>X1</li><li>X2</li> </random> Say one of X1 or X2 randomly
<random><li>A</li><li>B</li><li>C</li></random> Say one of A, B or C randomly
The <random> tag has higher precedence than other AIML tags.
Moreover, the AIML parser interprets only the markup inside
the selected random list item. AIML tags inside other list items
are ignored.
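The selection step can be sketched in Java as follows. This is only an
illustration: the class RandomTagSketch and its choose() method are
hypothetical names, not part of the program B source, and the sketch
assumes the list items have already been parsed into plain strings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of <random>: pick exactly one <li> item.
class RandomTagSketch {
    // Returns one item chosen uniformly at random from the list.
    static String choose(List<String> items, Random rng) {
        if (items.isEmpty()) {
            return "";  // assumption: an empty <random> yields no text
        }
        return items.get(rng.nextInt(items.size()));
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("A", "B", "C");
        System.out.println(choose(items, new Random()));
    }
}
```

Only the string returned by choose() would then be passed on for
further markup processing.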
HOW DOES THE WEB SERVER WORK
By default the web server starts on port 2001. This means you can
access the web server through the URL http://localhost:2001 on
your own machine. Find out your IP address or DNS name and tell
your friends to connect to "http://yourcompany.com:2001".
(One way to find out your IP address is by running "netstat -n"
to view all your open TCP/IP connections).
HOW MUCH MEMORY DO I NEED TO RUN PROGRAM B
The source code compresses to as little as half a megabyte, including
all the AIML files for nearly 16,000 categories. You may have downloaded
a file of only around 500K. Plan to use a minimum of 10 MB of hard disk space
for the download directory. The hard disk requirements include not
only the source code and Java class files, but also the dialogue files
and other temporary files created by the robot.
The RAM requirements vary depending on the size of your robot.
To run the fully loaded ALICE chat robot with 16,000 categories
you will need 64MB of memory. To do this and anything else at
the same time on your system we recommend a minimum of 96MB.
With less memory you can load a smaller robot. See the question
below "What is <load filename="X"/>?"
HOW RECURSIVE IS AIML
Understanding recursion is important to understanding AIML.
"Recursion" means applying the same solution over and over
again, to smaller and smaller problems, until you reduce
the problem to its simplest form. AIML uses the tags
<sr/> and <srai> to implement recursion. The botmaster
uses these tags to tell the robot how to respond to a
complex sentence by breaking it down into the responses
to simpler ones.
Recursion can apply many times to a single input. Given
the normalized input:
ALICE CAN YOU PLEASE TELL ME WHAT LINUX IS RIGHT NOW
an AIML category with the pattern "_ RIGHT NOW" matches first,
reducing the input to:
ALICE CAN YOU PLEASE TELL ME WHAT LINUX IS
Another pattern ("<name/> *") reduces it to:
CAN YOU PLEASE TELL ME WHAT LINUX IS
And then:
PLEASE TELL ME WHAT LINUX IS
reduces to:
TELL ME WHAT LINUX IS
and finally to:
WHAT IS LINUX
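The chain of reductions above can be imitated with a small Java loop.
This is a rough sketch, not the AIML interpreter: the class
ReductionSketch is hypothetical, and apart from "_ RIGHT NOW" and
"<name/> *" the rules are assumptions inferred from the intermediate
steps shown above.

```java
// Hypothetical sketch of <srai>-style reduction; the rules below are
// illustrative assumptions, not the actual ALICE categories.
class ReductionSketch {
    // Applies one reduction step, or returns the input unchanged.
    static String reduceOnce(String s) {
        String suffix = " RIGHT NOW";                   // _ RIGHT NOW
        if (s.endsWith(suffix)) {
            return s.substring(0, s.length() - suffix.length());
        }
        String[] prefixes = {"ALICE ", "CAN YOU ", "PLEASE "};
        for (String p : prefixes) {                     // e.g. PLEASE *
            if (s.startsWith(p)) {
                return s.substring(p.length());
            }
        }
        String head = "TELL ME WHAT ";                  // TELL ME WHAT * IS
        String tail = " IS";
        if (s.startsWith(head) && s.endsWith(tail)
                && s.length() > head.length() + tail.length()) {
            String star = s.substring(head.length(), s.length() - tail.length());
            return "WHAT IS " + star;
        }
        return s;
    }

    // Repeats reduceOnce() until the input stops changing.
    static String reduce(String s) {
        String next = reduceOnce(s);
        while (!next.equals(s)) {
            s = next;
            next = reduceOnce(s);
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(
            reduce("ALICE CAN YOU PLEASE TELL ME WHAT LINUX IS RIGHT NOW"));
    }
}
```

Each pass strips one layer of the sentence, which is the same idea as
a template that calls <srai> on a simpler input.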
I AM STILL HAVING PROBLEMS WITH THE APPLET
If your applet is looking at Blet.aiml and your web space is at
www.myplace.org and your aiml files are in directory /alice/ then
your load statements in Blet.aiml would look similar to this:
<load url="http://www.myplace.org/alice/Atomic.aiml"/>
If this is what you have, then open up the "Java Console" window
in your browser to get whatever debugging information is coming
out. The Java console will display any error messages or
exceptions caught by program B. Please report these
errors to the ALICE and AIML mailing list at
alicebot.listbot.com.
I STILL DO NOT GET IT
Here is another example that might help clarify
the meaning of the pronoun "it."
The category with the pattern "DO YOU LIKE *" is
a kind of default category for a whole class of
inputs matching "Do you like X?", when the
input does not match a category with a more specific
pattern like "DO YOU LIKE CATS". No matter what the
client says, we want the robot to remember that
"it" stands for "X".
Many inputs activate this default category, so
the botmaster tries to create a variety of replies
using the <random> tag twice. One interesting
side-effect of the random tag is the evaluation
of all AIML inside the <random>...</random>
expression, whether the random function selects
the list item containing the markup or not.
So if one item happens to contain a <set_it>
tag then the program will remember "it",
even if the actual random reply is a different item.
<category>
<pattern>DO YOU LIKE *</pattern>
<template><random>
<li>I don't know if I like <set_it><person/></set_it>. </li>
<li>I've heard other people say they like that. </li>
<li>Some people like it. </li>
<li>How old are you? </li>
<li>Do you like it? </li>
</random>
<random>
<li><for_fun/>. </li>
<li>I like pets. </li>
<li>I love horses. </li>
<li>I like trains. </li>
<li>But I like cats. </li>
<li>I really like ELIZA. </li>
<li>But I like to gossip. </li>
<li>I like science fiction. </li>
<li>Though I like cats and dogs. </li>
<li>My favorite thing is robots. </li>
<li>I love TV shows about robots. </li>
<li>But I think airplanes are cool. </li>
<li>I like to talk about movies and music. </li>
</random>
</template>
</category>
In the "DO YOU LIKE *" category one list item
I don't know if I like <set_it><person/></set_it>.
results in the pronoun "it" being bound to the value
of "*", rewritten in third person. With this category,
the robot has the following dialogue:
Client: Do you like Dr. Who
Robot: How old are you? I like to make new friends.
The robot response is a combination of two random
selections. But as a side effect, the pronoun value
of "it" is now set to "Dr. Who".
Client: It is a TV programme
Robot: Dr. Who is a TV programme. I suppose that makes sense.
To generate the final response, the robot applied its category:
<category>
<pattern>IT IS A *</pattern>
<template><get_it/> is a <person/>. I suppose that makes sense.</template>
</category>
The robot is correct that the reply does indeed make sense. Does
this example suggest that in our human mind "it" is just a temporary
storage register for a fragment of text?
Extra credit: In the dialogue fragment above, why did the robot say
"I like to make new friends"?
I TRIED RUNNING JAVA B AND I GOT A BAD COMMAND OR FILE NAME
You are using a Windows/DOS setup. If "jview B" does not work either,
you may need to install Java on your computer. Go to java.sun.com
and download the Java release for your computer (Windows 95/98 or NT).
If it still says "bad command" then possibly there is a problem with
the CLASSPATH variable in AUTOEXEC.BAT. Make sure it is set to
something like
SET CLASSPATH=.;%CLASSPATH%
(The single "." means the current working directory)
and make sure the PATH is set to include the java home directory:
SET PATH=c:\JDK1.2\bin;%PATH%
I TRIED TO COMPILE PROGRAM B AND GOT A LOT OF WARNINGS
The designers of Java and the designers of ALICE disagree
on one stylistic point: Java designers believe in the
"one file-one class" philosophy, at least for classes
used outside their own source file. The ALICE engineers
follow the opposite "one file-many classes" design principle,
which allows us to group a number of logically related classes
in a single file, such as Classifier.java. The Java compiler
might complain about a class used outside its file, but
these messages are just warnings.
If you don't want to see the compiler warnings, run the
compiler with the "-nowarn" flag:
javac -nowarn *.java
IS THERE A WAY TO USE THE GUI INTERFACE TO ADD ONE CATEGORY AT A TIME
Yes. Do a "clear". Type in one category:
<category>
<pattern>WHO IS JOHN</pattern>
<template>He is a really smart guy.</template>
</category>
Now do an "Add AIML". If you like the result, do a "Save Robot".
If your name is not John, try replacing JOHN with
your own name. Notice that the pattern is in all upper case.
This is called "normalized form". We store patterns this way
for efficiency. The template on the other hand consists of
mixed case.
You can also create a file of AIML, do a cut & paste, and then "Add AIML"
to add more categories. Editing the source file directly is of course also
useful. If you edit the source file, select "Load Robot" to load it.
Try creating a text file with the category:
<category>
<pattern>WHO IS JOHN WANG</pattern>
<template>
<random>
<li>He is a really smart guy.</li>
<li><set_he>John Wang</set_he> is a great father.</li>
</random>
</template>
</category>
Load the file into program B with the "File/Load Text File"
menu item. Then select "Add AIML" from the Botmaster menu.
LIST TWELVE BASIC APPLET TIPS FOR AIML USERS
1. Applets are notoriously hard to debug; you are not dumb.
2. An applet can work perfectly well in Appletviewer, but
then break in the browser, for any number of reasons.
3. Let's get the terminology straight: the applet resides on
an "originating host" but runs on a "target machine".
4. The browser is very picky because of the "security
sandbox"--the browser doesn't trust Applets so they can't
open files (and must obey other restrictions) on the target machine.
5. The Applet MAY open a socket connection from the
target machine to the originating host.
6. When you are debugging the applet, the target machine
might be the same as the originating host (your computer).
7. When you post your applet to a remote web server,
that server becomes the originating host.
8. You can use ftp to transfer the Applet files to the
remote web server.
9. You must transfer ALL the applet's files
to the originating host.
10. You must change the program B values of "CodeBase"
and "AppletHost" (the originating host) to the name and
location of the files on the remote server.
11. Use "Create applet" to create the "index.html" and
"Blet.aiml" (make sure you have the latest release of B.zip)
12. We recommend placing all the *.class files into
a single "Blet.jar" file (see DON'T READ ME).
LOWER LOWER
If you need even lower level access to the program B robot,
you can request responses to individual sentences on a
line-by-line basis. Inside multiline_response() there are
calls to the Classifier.respond() method like:
String response = respond(norm, hname);
where "norm" is a normalized single-sentence input and hname is
the virtual IP address of the client.
Inside respond() we find the method respondIndex(). The
base class StringSet stores the strings in an indexed vector,
and respondIndex() locates the index of the best matched category
for the normalized input string.
The loop inside respondIndex() scans through the categories
in reverse alphabetical order by key, until it finds the best
match. Because the "*" pattern comes first in alphabetical
order, and is the most general pattern, respondIndex() will
return zero when no more specific category matches.
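The reverse scan can be sketched in Java as below. This is a
simplified illustration, not the real Classifier or StringSet code:
the class RespondIndexSketch and its matches() helper are
hypothetical, and the sketch assumes normalized, single-spaced input
and patterns with at most one wild-card.

```java
import java.util.Arrays;

// Hypothetical sketch of the reverse scan described above; it is not the
// actual Classifier/StringSet code.
class RespondIndexSketch {
    // True if the input matches a pattern with at most one "*" or "_"
    // wild-card, where the wild-card stands for one or more words.
    static boolean matches(String pattern, String input) {
        int w = Math.max(pattern.indexOf('*'), pattern.indexOf('_'));
        if (w < 0) {
            return pattern.equals(input);
        }
        String head = pattern.substring(0, w);
        String tail = pattern.substring(w + 1);
        return input.length() > head.length() + tail.length()
                && input.startsWith(head) && input.endsWith(tail);
    }

    // Scans the sorted patterns from last to first and returns the index
    // of the first match. Index 0 holds "*", the most general pattern.
    static int respondIndex(String[] sortedPatterns, String input) {
        for (int i = sortedPatterns.length - 1; i > 0; i--) {
            if (matches(sortedPatterns[i], input)) {
                return i;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        String[] patterns = {"WHO IS *", "_ RIGHT NOW", "HELLO", "*", "HELLO *"};
        Arrays.sort(patterns);  // "*" sorts first, "_" sorts after "Z"
        System.out.println(respondIndex(patterns, "HELLO THERE"));
    }
}
```

Because Java compares strings by character code, a plain
Arrays.sort() already places "*" before the letters and "_" after
"Z", which is the ordering the scan relies on.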
WHAT ARE 7 STEPS TO CREATING CONTENT
1. Run program B (ALICE Botmaster)
2. Under "Options", select "Show Options".
Find the item called "AnalysisFile=" and
change the value to the name of the dialogue
file you want to analyze. The default file
name is the same as the default log file
name, "dialog.txt".
3. Press the "Classify" button. Wait
several minutes while the program processes
the data from your log file. When finished
it will display a "brain activation" table
showing the patterns that activated each
category. (You can use "File/Save As Text File"
to save this table to a file, if you want).
4. Now press the "Quick Targets" button.
You will see a set of new categories created
by the program. These are categories with
patterns that have no specific response in the
robot brain. With these categories you have
3 choices (A, B or C):
(A) Delete the category. Many of the suggested
categories are just nonsense or garbage inputs.
Use your cursor and left mouse button to select
the categories for deletion.
The "delete" key will cut them.
(B) Edit a new template. The information you
see displayed in the <template> tags is actually
the pattern of the default category into which
this input was classified. For example you may see:
<category>
<pattern>WHO IS 007</pattern><template>WHO IS *</template>
</category>
This tells us that the robot classified the client input "WHO IS 007"
as "WHO IS *". Use the cursor and left mouse button
to cut the "WHO IS *", and replace it with a new template
of your own design:
<category>
<pattern>WHO IS 007</pattern>
<template><set_he>007</set_he> is James Bond, the
famous fictional spy from the novels of Ian Fleming.</template>
</category>
(C) Edit a new pattern. Many of the patterns
suggested by "Quick Targets" and "More Targets" are
too specific, but with a little practise you
can easily see how to generalize these suggestions
with the "*" wild-card.
For example you may see one like this:
<category>
<pattern>WHO BOMBED PEARL HARBOR</pattern>
<template>WHO *</template>
</category>
The original response was based on "WHO *", which
is too general for this topic. But the odds
are small of anyone else using this exact pattern
WHO BOMBED PEARL HARBOR when asking about the
same topic. Think about the alternative ways
of expressing the same question:
"Who attacked Pearl Harbor?", "Who invaded Pearl
Harbor?", "Who through deceit and subterfuge
carried out an unscrupulous and unprovoked surprise
attack on American forces at Pearl Harbor?"
You can cover all of these inputs by generalizing
the input pattern with the wild-card "*",
which matches any word or sequence of words:
<category>
<pattern>WHO * PEARL HARBOR</pattern>
<template>The Japanese
attacked Pearl Harbor on December 7, 1941,
"A day that will live in infamy" (FDR).
<A href="http://www.pearlharbor.org">...
</template>
</category>
Remember, the AIML pattern language allows
at most one wild-card "*" per pattern.
Of course, with choice (C) you have to
edit the template as well as the pattern.
5. When finished with editing the suggested categories,
use "Botmaster - Add AIML" to add the new AIML content.
If you made any syntax errors, you can fix them
and repeat the "Add AIML" as many times as needed.
Be sure to do a "File - Save Robot" at this point
also to back up your changes. This will save all of
your new categories in the root robot file
"B.aiml".
6. Use "More Targets" to find more new categories
until the new suggestions are fruitless. Then, go
back and start with "Classify" again (step [3]).
7. The responses you create should combine a
"conversational" reply like "He is James Bond, the
famous spy" with some HTML hyperlinks where
appropriate.
WHAT ARE ALL THE OPTIONS FOR PROGRAM B
There are robot personality options, animated agent options,
log file and analysis options, and options for the web server
and for the applet. Most of the time you won't need to change
many of these values. For completeness, the entire set
breaks down into:
Robot options:
Sign - Astrological sign
Wear - clothing and apparel
ForFun - What the robot does for fun
BotFile - Root file of robot personality
BotName - Robot name
Friends - The robot's friends
LookLike - The robot appearance
Question - A random question
TalkAbout - favorite subjects
KindMusic - Favorite kind of music
BoyFriend - Does the robot have a boyfriend?
BotMaster - Robot author
BotGender - male, female or custom
GirlFriend - Does the robot have a girlfriend?
BotLocation - Robot location
BotBirthday - Robot activation date
FavoriteBook - Robot's favorite book
FavoriteFood - Robot's favorite food
FavoriteSong - Robot's favorite song
FavoriteBand - Robot's favorite band
FavoriteMovie - Robot's favorite movie
FavoriteColor - Robot's favorite color
BotBirthplace - Robot's birthplace
MS Agent options:
Animagent - true or false for activating MS Agent VB scripting
ACFURL - file or URL location of MS Agent software
Log/Analysis options:
AnalysisFile - file selected for log file analysis
LogFile - file for recording robot dialogues
ClientLineContains - a pattern identifying input lines in logfiles
RobotLineStarts - a pattern identifying robot lines in logfiles
StartLine - starting line for analysis
EndLine - ending line for log file analysis
Applet options:
AppletHost - DNS name or IP address of applet's server.
CodeBase - URL or directory of applet code.
Web server options:
ClerkTimeout - Web server option to retire waiting clerks
BrainSize - a threshold number of categories to display "loading"
Advertize - a boolean parameter to optionally display ad
Beep - Web server option to beep on client connections
Other:
Version - read only version number
TempFile - scratch file for temporary data
MergePolicy - 'Merge' or 'Discard' depending on how you
want to handle duplicate key categories.
All of the options reside in the globals.txt file. Running
program B, choose "Options/Show Options" to see the
contents of the file.
WHAT ARE DEPRECATED APIS
One of the biggest challenges facing a Java application
developer is finding a "lowest common denominator" subset of the
Java language that works on all the platforms and versions
out there. The Java language itself is a moving target.
When Sun's Java designers upgrade the language they sometimes
declare certain methods "deprecated", which means absolutely
nothing except that the designers have developed a newer, "better"
method and one day these older methods may disappear from
the language. Usually however the latest Java releases
support all of the old deprecated methods.
WHAT ARE JUSTTHAT AND JUSTBEFORETHAT
<beforethat/>, <justthat/> and <justbeforethat/> are new, experimental
AIML tags. The idea here is to represent more "state" in the dialogue
than just "that":
Client: BeforeThat.
Robot: ...JustBeforeThat.
Client: JustThat.
Robot: ...That.
Client: Input.
Robot: <justthat/> = JustThat
<justbeforethat/> = JustBeforeThat
<beforethat/> = BeforeThat
The following dialogue fragment illustrates the use of these tags:
Client: MY NAME IS RICH.
Robot: Rich, are you my master?
Client: YES.
Robot: Tell me more.
Client: JUST THAT.
Robot: "my name is rich" [<justthat/>]
Client: JUST BEFORE THAT.
Robot: "Tell me more" [<justbeforethat/>]
Client: BEFORE THAT.
Robot: "Yes" [<beforethat/>]
There is an asymmetry between the client's tags (<beforethat/> and <justthat/>)
and the robot's tags (<justbeforethat/> and <that/>). The Responder
breaks down each multiple-line client input into individual sentences.
The log file records each client input sentence individually, and shows
the robot's responses line by line. The robot, on the other hand, may
respond to a single input sentence with multiple response sentences.
The <that/> and <justbeforethat/> tags refer to only the last sentence
in the robot's reply. The client tags <beforethat/> and <justthat/> always
refer to the current and previous client input lines, even if they were
part of a multiline input. In the descriptive dialogue above
we used the notation "Client: BeforeThat." and "Robot: ...JustBeforeThat.",
with ellipses representing sentences in the robot reply, to emphasize
the asymmetry. If all the robot responses consisted of exactly one
sentence each, the asymmetry would disappear.
In the future we may expand AIML categories to include such
"deeper context", along the lines of the <that>...</that> tag,
if there is a need for it.
WHAT ARE THE GET METHODS
Get methods are logically atomic tags, i.e. they enclose no text
(similar to, say, <P> or <IMG> in HTML). But XML requires closing tags,
so each one is written in the self-closing form, such as <get_it/>.
All the "get" methods retrieve values stored relative
to a particular client IP address. We use
hash tables to store the maps from IP to these attributes.
<get_ip/> Get the client's IP address
<getsize/> A string indicating robot memory size
<getversion/> The ALICE program version
<getname/> client's name
<gettopic/> The "topic" of conversation
<name/> Robot's name
<location/> Robot's location
<gender/> Robot's gender
<birthday/> Robot's birthday
<that/> what robot said previously
<get_location/> the client's geographic location
<get_it/> the value of "it"
<get_they/> the value of "they"
<get_he/> the value of "he"
<get_she/> the value of "she"
<get_we/> the value of "we"
<get_gender/> a string like "she" or "he" for client gender
In XML languages there is always a tradeoff between creating attributes
and creating new tags. The get methods are really all special instances
of a more general <get attribute="name">, for example
<get_we/> = <get attribute="we"/>
The attributes with explicit "get" names (getname, get_it, get_we etc.)
are client-specific properties. The other attributes (e.g. <name/> and
<botmaster/>) relate to the robot.
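The per-client storage described above can be pictured as a hash
table of hash tables. The Java sketch below is an assumption about
the general shape only: the class PredicateStoreSketch and its
methods are hypothetical, and returning an empty string for an unset
attribute is a choice made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: client predicates keyed first by IP address,
// then by attribute name ("it", "he", "name", ...).
class PredicateStoreSketch {
    private final Map<String, Map<String, String>> byClient =
            new HashMap<String, Map<String, String>>();

    // Stores a value for one client, like <set_it>X</set_it>.
    void set(String ip, String attribute, String value) {
        Map<String, String> predicates = byClient.get(ip);
        if (predicates == null) {
            predicates = new HashMap<String, String>();
            byClient.put(ip, predicates);
        }
        predicates.put(attribute, value);
    }

    // Retrieves a value for one client, like <get_it/>.
    // Assumption: an unset attribute yields the empty string.
    String get(String ip, String attribute) {
        Map<String, String> predicates = byClient.get(ip);
        String value = (predicates == null) ? null : predicates.get(attribute);
        return (value == null) ? "" : value;
    }

    public static void main(String[] args) {
        PredicateStoreSketch store = new PredicateStoreSketch();
        store.set("10.0.0.1", "it", "Dr. Who");
        System.out.println(store.get("10.0.0.1", "it"));
    }
}
```

With this layout, <get_we/> and <get attribute="we"/> would both
reduce to the same lookup, get(ip, "we").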
WHAT ARE THE SET METHODS
Set methods consist of single-tag and double-tag markup. The
methods are:
<set_male/> the client gender is male
<set_female/> the client gender is female
<set_animagent/> activates the animation agent.
<setname> X </setname> sets the client name to X
<settopic> X </settopic> sets the topic to X
<set_it> X </set_it> sets the value of "it" to X
<set_location> X </set_location> sets the value of client location
<set_they> X </set_they> sets the value of "they" to X
<set_he> X </set_he> sets the value of "he" to X
<set_she> X </set_she> sets the value of "she" to X
<set_we> X </set_we> sets the value of "we" to X
<set_thought> X </set_thought> is a custom tag suggested by Andrew
Potgieter for storing a predicate for "what are you thinking about?"
See the documentation on custom tags and the predicates.txt file.
WHAT DO YOU MEAN BY THE COMMAND JAVA B
This does not mean you click on an icon. If you are using Windows,
you must use a DOS window to run a Java program. Find the MS-DOS item
on your start menu or desktop and open up a DOS window. In that window, use
the DOS commands CD (change directory) to move to the "B" directory.
Then type "java B" to run the program.
If you are using Windows, then you can create a desktop icon
as a "shortcut" to a batch file. Create a batch file called
"launch.bat" in the program B directory. The file contains only
one line with the text "java B". There is an AIML icon file
included with program B called "aiml.ico". You can use this
file to add an icon to your desktop.
WHAT DOES CLASSIFY DO
The key to chat robot development is log file analysis. The program
stores client dialogues in a file called "dialog.txt" (unless you
change this default name). The "Classify" button activates a routine
that scans the dialogue file and reports how many times each
category is activated. The processing may take several minutes,
depending on the size and range of the dialogue file chosen. The
result appears as a table in the Edit View window. The program
displays the categories sorted by activation count.
The format of each output line is:
P% (Q%) T PATTERN = N1 W1 + N2 W2 + ...
Where
P = Percent of inputs classified in this category
Q = Cumulative percent up to this category
T = Total count of inputs activating this category
Ni = number of times input Wi detected (blank if Ni = 1)
Wi = normalized input pattern activating this category
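As an illustration of this line format, here is a small Java sketch
that builds one output line from the counts for a single category.
The class ClassifyLineSketch, its parameters, and the choice of one
decimal place for the percentages are all assumptions made for the
example; the real Classify routine may format its numbers
differently.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of one line of the "brain activation" table.
class ClassifyLineSketch {
    // inputs maps each normalized input Wi to its count Ni.
    // cumulativeBefore is the number of inputs counted in earlier lines;
    // grandTotal is the number of inputs in the whole dialogue file.
    static String format(String pattern, Map<String, Integer> inputs,
                         int cumulativeBefore, int grandTotal) {
        int total = 0;
        for (int n : inputs.values()) {
            total += n;
        }
        double p = 100.0 * total / grandTotal;
        double q = 100.0 * (cumulativeBefore + total) / grandTotal;
        StringBuilder sb = new StringBuilder();
        sb.append(String.format(Locale.ROOT, "%.1f%% (%.1f%%) %d %s = ",
                p, q, total, pattern));
        boolean first = true;
        for (Map.Entry<String, Integer> e : inputs.entrySet()) {
            if (!first) {
                sb.append(" + ");
            }
            if (e.getValue() > 1) {
                sb.append(e.getValue()).append(' ');  // Ni is blank when 1
            }
            sb.append(e.getKey());
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> inputs = new LinkedHashMap<String, Integer>();
        inputs.put("WHO IS 007", 2);
        inputs.put("WHO IS ELVIS", 1);
        System.out.println(format("WHO IS *", inputs, 5, 20));
    }
}
```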
WHAT DOES CLEAR DO
To enter another robot query, clear the screen with the "Clear"
button. Enter a new String like "How are you?" and press "Send."
"Send" and "Clear" provide a simple way to communicate with the
chat bot through the Edit View. Try cutting and pasting a paragraph,
such as an e-mail message, into the Edit View and press "Send".
See how the robot would reply to your multiline message.
WHAT DOES HELP DO
The "Help" button displays a random FAQ question that ALICE
knows the answer to. You can see the answer by pressing the
"Send" button.
The Help menu provides the same function as the Help button
under the selection "Random Help Question." Select a random
Help question and obtain the reply with the "Send" button.
The Help menu also contains an item to Show All Help Questions.
This command lists all the FAQ questions the robot knows. You can
select one question by deleting the others. Obtain the
answer with the "Send" button.
The menu item "Ask Help Question" is the same as "Send". This
item asks the robot the Help question(s), and displays the reply.
The Help menu displays the entire FAQ with the "Don't Read Me"
selection. Finally, the "GNU Public License" menu item displays
the open source software license for program B.
WHAT DOES MORE TARGETS DO
If you don't see enough good targets with "Quick Targets", hit
"More Targets."
WHAT DOES QUICK TARGETS DO
After running Classify, the Quick Targets button displays a set of
new AIML categories for editing. The program uses statistics to
find new category candidates. These categories are displayed as
<category>
<pattern> NEW PATTERN </pattern> <template> OLD PATTERN </template>
</category>
where OLD PATTERN is the pattern from the original category and
NEW PATTERN is the proposed new input pattern.
The botmaster may choose to either delete or edit the new category.
If the new category is not desired, delete it by selecting the
category from the text area and "cut" the text with the "delete"
key.
If the new category appears useful, edit the OLD PATTERN string to
create a new reply. Optionally, the NEW PATTERN may also be edited,
depending on how specific a pattern the botmaster desires.
When finished editing the Target categories, go to the "Botmaster"
menu and select "Add AIML". The "Add AIML" menu item will read the
text displayed in the Edit View and parse it into new AIML categories.
The botmaster may then save the updated robot with the "File/Save Robot"
or "File/Save Robot As" menu items.
WHAT DOES SEND DO
Type a text string like "hello" into the Text Area
(Edit View) and press the "Send" button. Notice that program B
replaces the text in the Edit View with a reply from the robot.
WHAT DOES THE EDIT MENU DO
Paste contents of clipboard into the program B text area.
WHAT DOES THE FILE MENU DO
Save and load text files (transfer contents to/from text area);
Save and load robot (AIML) files.
1. By default, AIML files use the .aiml file extension.
2. The default robot file is called "B.aiml"
3. By default the robot files reside in the same directory as
program B
4. Robot files begin and end with the tags <alice> and </alice>
5. "Save Robot" overwrites the default robot file (see 2).
6. "Save Robot As" can be used to copy a robot.
Exit - exit the program
WHAT DOES THE OPTIONS MENU DO
Display and save chat robot options.
Use start and end index to select a range of lines
from the dialog file.
Toggle Beep - Make a sound when a remote client connects.
WHAT FILES ARE NEEDED TO RUN THE PROGRAM B WEB SERVER
The program B directory must contain the HTML files header.html,
trailer.html, loading.html and HOME.html. You can customize these files for
your bot, but take care with "header" and "trailer" because
program B uses these files to construct an HTML reply
(by inserting the robot reply and the text form between the
"header" and the "trailer"). Use "header" and "trailer" to
customize the robot with your own logo and links.
Program B needs at least one AIML file, usually called B.aiml
by default. The AIML file may contain <load> tags that recursively
load other AIML files; these must also be present.
The program also requires the file "globals.txt"
which it reads at start up.
The files "language.txt" and "predicates.txt" are optional.
"language.txt" controls the language of the buttons and
menu items in the program B GUI. The file "predicates.txt"
defines any custom predicates.
Program B also reads the files "gnu.txt" (the GNU Public License)
and "dont.txt" (this file).
WHAT FILES DO I NEED TO RUN THE APPLET
You only need the Java *.class files and the *.aiml files
to run the ALICE Applet; no other files are necessary.
You can also put all the class files in a single jar
file like Blet.jar. The sample index.html provided with the ALICE
distribution uses this Blet.jar file.
Not all of the Java source files are involved in the Applet.
You can use the following command to compile all the Java source
files needed for the Applet:
javac Access.java Globals.java StringFile.java Substituter.java \
Classifier.java Loader.java Animagent.java Log.java Blet.java
Then, you can use zip (or jar) to collect the class files into
a single jar file:
zip -r Blet.jar *.class
The *.class wildcard includes all the class files you compiled.
The *.aiml files have to be on the same host that serves the Applet. An applet
can only open files on the server it originated from.
Don't forget to change the Applet host parameters in index.html, when
you upload the applet to an ISP.
WHAT HAPPENS TO CONTRACTIONS AND PUNCTUATION
Program B has a class called Substituter that performs a number
of grammatical and syntactical substitutions on strings.
One task involves preprocessing sentences to remove ambiguous
punctuation to prepare the input for segmentation into individual
sentence phrases. Another task expands all contractions and
converts all letters to upper case; this process is called
"normalization".
The Substituter class also performs some spelling correction.
(See also the question "What is <person/>?")
One justification for removing all punctuation from inputs
is the need to make ALICE compatible with speech input systems,
which of course do not detect punctuation (unless the speaker
utters the actual word for the punctuation mark -- "period").
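A rough Java sketch of normalization follows. It is not the
Substituter class: the name NormalizeSketch, the three-entry
contraction table, and the order of the steps are assumptions chosen
to keep the example short, and spelling correction is left out.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of normalization: upper case, expanded
// contractions, no punctuation. The table below is a tiny sample.
class NormalizeSketch {
    private static final Map<String, String> CONTRACTIONS =
            new LinkedHashMap<String, String>();
    static {
        CONTRACTIONS.put("WHAT'S", "WHAT IS");
        CONTRACTIONS.put("I'M", "I AM");
        CONTRACTIONS.put("DON'T", "DO NOT");
    }

    static String normalize(String sentence) {
        String upper = sentence.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder();
        for (String word : upper.trim().split("\\s+")) {
            // Keep only letters, digits and apostrophes for the lookup.
            String w = word.replaceAll("[^A-Z0-9']", "");
            String expanded = CONTRACTIONS.get(w);
            if (expanded != null) {
                w = expanded;
            }
            w = w.replace("'", "");  // drop any remaining apostrophes
            if (w.isEmpty()) {
                continue;
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(w);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("What's Linux?"));
    }
}
```

The result is a string in the same upper-case form used by AIML
patterns, so it can be compared with them directly.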
WHAT IF I DO NOT WANT TO DISCARD DUPLICATE CATEGORIES
Using the global parameter MergePolicy, you can choose
to either "Merge" or "Discard" templates with duplicate keys.
If you choose the "Merge" option then the program applies a
heuristic to try to merge the two responses together with
a "<random>" tag. The results of this operation may be
unpredictable, so the program logs all duplicates in a file
called "duplicates.txt".
The heuristic merge works as follows: Suppose X and Y are the two
templates to merge into a new template Z. Let X be the new template
and Y the existing one. Assume that X and Y are either <random>
lists or "atomic", in the sense that they contain no <random> tags.
If X and Y are both "atomic" then Z = <random><li>X</li><li>Y</li></random>.
If Y is a <random> list and X is atomic then the program checks to see if X is
already a member of that list, to avoid duplicate list items. If it is not,
Z = the <random> list from Y with X inserted.
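The merge heuristic might be sketched in Java like this. The class
MergeSketch is hypothetical and simplifies the problem: the new
template X is assumed to be atomic, the existing template Y is given
as a list of its items (a one-element list stands for an atomic
template), and appending X at the end of Y's list is an assumption,
since the insertion position is not specified above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the "Merge" policy for duplicate keys.
class MergeSketch {
    // x is the new atomic template; yItems are the items of the
    // existing template. Returns the items of the merged template Z.
    static List<String> merge(String x, List<String> yItems) {
        List<String> z = new ArrayList<String>(yItems);
        if (!z.contains(x)) {          // avoid duplicate list items
            if (yItems.size() == 1) {
                z.add(0, x);           // both atomic: X first, then Y
            } else {
                z.add(x);              // insert X into Y's <random> list
            }
        }
        return z;
    }

    // Writes the items back out as AIML template text.
    static String render(List<String> items) {
        if (items.size() == 1) {
            return items.get(0);
        }
        StringBuilder sb = new StringBuilder("<random>");
        for (String item : items) {
            sb.append("<li>").append(item).append("</li>");
        }
        return sb.append("</random>").toString();
    }

    public static void main(String[] args) {
        System.out.println(render(merge("Hi", Arrays.asList("Hello"))));
    }
}
```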
WHAT IS A CATEGORY
AIML consists of a list of statements called categories. Each
category contains an input pattern and a reply template.
The syntax of an AIML category is:
<category>
<pattern> PATTERN </pattern> <template> Template </template>
</category>
or
<category>
<pattern> PATTERN </pattern>
<that> THAT </that>
<template> Template </template>
</category>
The AIML category tags are case-sensitive. Each open tag has an
associated closing tag. This syntax obviously derives from XML.
WHAT IS A PATTERN
The pattern is the "stimulus" or "input" part of the category.
The pattern is an expression in a formal language that consists of
(1) Words of natural language in UPPER CASE.
(2) The symbol * which matches any sequence of one or more words.
(3) The symbol _ which is the same as * except that it comes
after Z in lexicographic order.
(4) The markup <name/> which is replaced at robot load time
with the name of the robot.
Note there is a difference between the patterns HELLO and HELLO *.
HELLO matches only identical one-word sentences ("Hello.")
and HELLO * matches any sentence of two or more words starting
with "Hello" ("Hello how are you?").
To simplify pattern description and matching, AIML patterns allow
only one "*" per pattern. In other words, "MY NAME IS *" is a
valid pattern, but "* AND *" is not.
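The rules above can be checked mechanically. The Java sketch below is
a hypothetical validator, not part of program B: it assumes words
consist of upper-case letters and digits, and it counts "*" and "_"
together toward the one wild-card limit, which is an assumption.

```java
// Hypothetical checker for the pattern rules listed above.
class PatternCheckSketch {
    // Returns true if every token is an upper-case word, a wild-card
    // or the <name/> markup, and there is at most one wild-card.
    static boolean isValid(String pattern) {
        int wildcards = 0;
        for (String token : pattern.trim().split("\\s+")) {
            if (token.equals("*") || token.equals("_")) {
                wildcards++;
            } else if (token.equals("<name/>")) {
                continue;  // replaced by the robot name at load time
            } else if (!token.matches("[A-Z0-9]+")) {
                return false;
            }
        }
        return wildcards <= 1;
    }

    public static void main(String[] args) {
        System.out.println(isValid("MY NAME IS *"));
        System.out.println(isValid("* AND *"));
    }
}
```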
WHAT IS A SYMBOLIC REDUCTION
In general there are a lot of categories whose job is
"symbolic reduction". For example:
<category>
<pattern>ARE YOU VERY *</pattern>
<template><srai>ARE YOU <star/></srai></template>
</category>
This category [in Brain.aiml] will reduce "Are you very very smart"
to "Are you smart".
WHAT IS A TEMPLATE
A template is the "response" or "output" part of an AIML category.
The template is the formula for constructing the reply. The simplest
template consists of plain, unmarked text. AIML provides markup
functions to tailor the replies for each individual input and client.
The markup function <getname/> for example inserts the client's name
into the reply.
The template may call the pattern matcher recursively using the
<sr/> and <srai> tags. Many templates are simple symbolic
reductions that map one sentence form to another, for example
"Do you know what X is?" transforms to "What is X" with the category
<category>
<pattern>DO YOU KNOW WHAT * IS</pattern>
<template><srai>WHAT IS <star/> </srai></template>
</category>
The template may also contain other embedded HTML and XML.
These embedded tags may cause the browser to play a sound,
show an image, or run an applet. There is considerable freedom
of expression in the construction of response templates. The
botmaster is encouraged to study the examples in ALICE, and to
experiment with new ideas.
WHAT IS AIML
The ALICE software implements AIML (Artificial Intelligence Markup
Language), a non-standard evolving markup language for creating chat robots.
The primary design feature of AIML is minimalism. Compared with
other chat robot languages, AIML is perhaps the simplest. The
pattern matching language is very simple, for example permitting
only one wild-card ('*') match character per pattern.
AIML is an XML language, implying that it obeys certain grammatical
meta-rules. The choice of XML syntax permits integration with
other tools such as XML editors. Another motivation for XML is
its familiar look and feel, especially to people with HTML experience.
An AIML chat robot begins and ends with the <alice> and
</alice> tags respectively.
WHAT IS ARE THE LT PERSON GT TAGS
The <person> and <person2> tags indicate a place where the
AIML interpreter changes the personal pronouns in a sentence.
<person2> X </person2> change X from 1st to 2nd person
<person> X </person> exchange 1st and 3rd person
<person2> is not often used. The main application is
"gossip":
Client: I admire robots like you.
Robot: That's good information: Joe said he admire robots like me.
The transformation is a combination of:
1. change the first person pronouns to second person.
2. change the third person pronouns to first person.
The array in Substituter.java is incomplete. We need more substitutions
to make person2 work really well.
The <person> substitution is much more common and easier
to understand, because it simply exchanges 1st and 3rd person
pronouns. The main issue with <person> in English is knowing
when to use "I" and when to use "me".
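A minimal sketch of the <person> exchange, assuming a small word table
(the real table in Substituter.java is larger and is not reproduced
here). Each word is looked up once, so a swap is never undone by a
later swap. The class name PersonSketch is hypothetical. Verbs are left
alone, which is why results such as "he like" can appear.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: exchange 1st and 3rd person pronouns word by word.
class PersonSketch {
    static final Map<String, String> SWAP = new HashMap<>();
    static {
        // Illustrative pairs only; "he" always maps to "I", never to "me".
        String[][] pairs = { {"I", "he"}, {"me", "him"}, {"my", "his"} };
        for (String[] p : pairs) {
            SWAP.put(p[0].toLowerCase(), p[1]);
            SWAP.put(p[1].toLowerCase(), p[0]);
        }
    }

    static String person(String sentence) {
        StringBuilder out = new StringBuilder();
        for (String word : sentence.trim().split("\\s+")) {
            String swapped = SWAP.get(word.toLowerCase());
            if (out.length() > 0) out.append(' ');
            out.append(swapped != null ? swapped : word);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints I told him
        System.out.println(person("he told me"));
    }
}
```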
WHAT IS CLASS ACCESS
Class Access is the abstraction for log file analysis to
extract dialogues. In a typical chat robot server scenario,
the program records each line of client input and the robot
reply in a log file. Given many simultaneous conversations,
these dialogues are interleaved in the log file. The purpose
of class Access is to unravel these conversations into
individual threads by client.
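The unraveling step can be pictured as grouping log lines by client.
The sketch below assumes a made-up "host|text" line format, which is
not the real log format, and a hypothetical class name ThreadSketch.
It shows only the grouping idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: split an interleaved log into one thread of lines per client.
class ThreadSketch {
    static Map<String, List<String>> unravel(List<String> logLines) {
        Map<String, List<String>> threads = new LinkedHashMap<>();
        for (String line : logLines) {
            int bar = line.indexOf('|');
            if (bar < 0) continue;  // skip lines that do not fit the assumed format
            String host = line.substring(0, bar);
            String text = line.substring(bar + 1);
            threads.computeIfAbsent(host, h -> new ArrayList<>()).add(text);
        }
        return threads;
    }

    public static void main(String[] args) {
        // prints {a=[Hello, Bye], b=[Hi]}
        System.out.println(unravel(Arrays.asList("a|Hello", "b|Hi", "a|Bye")));
    }
}
```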
WHAT IS CLASS ALICEREADER
AliceReader is an efficient, small-footprint XML interpreter
hard coded by Kris Drent specifically for reading AIML categories.
Each category has a pattern, a template, and an optional topic and
thatpattern. AliceReader scans the AIML input and tries to
identify these fields as quickly as possible.
WHAT IS CLASS B
Class B is the old name for the Swing version of class Bawt, but
now just extends Bawt.
WHAT IS CLASS BAWT
The class Bawt is the Java application, and implements the GUI.
WHAT IS CLASS BLET
The Blet class is the applet, but is similar in many ways to the application.
The applet is a stripped down version of the program, with a simpler GUI
and no "botmaster" privileges. Also, the Blet class doesn't utilize the
web server, because it runs as a client-side applet.
WHAT IS CLASS BRAIN
Brain extends StringSorter, and uses StringRanker. The sorted
strings in the Brain class are keys formed by combining the
pattern, that, and topic strings. In the original versions
of ALICE, there were no "that" and no "topic" tags, so the
Brain class simply mapped input patterns to output templates.
With the addition of the "that" and "topic" tags we had to
create the "key" from the combination of all three.
The "Target" objects in class Brain are instances of StringRanker.
These structures form the basis of the classification and targeting
algorithms in program B. For each category, the Targetmap contains
an instance of StringRanker storing the inputs classified into
that category.
WHAT IS CLASS CLASSIFIER
The class Classifier might as well be called "bot" because it contains
the basic functionality of the chatterbot algorithm.
See the question "How can I interface my Java program to ALICE?" for
additional information about the class Classifier.
WHAT IS CLASS DIALOGUE
A Dialogue (not to be confused with a Dialog class!) is
the representation of the conversation between the client
and the robot. The basic data structure is a pair of String arrays
client_said[] and robot_said[] that store the alternating
statements of client and robot. The Dialogue also
encodes the length, hostname, and start and end tag
information.
WHAT IS CLASS GLOBALS
Globals is the repository for all of the botmaster-selectable
parameters in program B. The Globals class corresponds to
the "Options" menu on the program B menu bar. Globals contains
methods toFile() and fromFile() to make these values
persistent between sessions.
WHAT IS CLASS INTSET
IntSet represents a set of integers. Were we using Java
Collections this would likely be a Set, but the simple
requirements of program B allow us to create a simple
IntSet class.
"Set" means that the object has only one occurrence of each item:
{1, 4, 2, 9} is a set of integers; {1, 1, 2} is not.
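A minimal array-backed integer set might look like the following. The
class name IntSetSketch and its methods are illustrative guesses, not
the actual IntSet API.

```java
import java.util.Arrays;

// Sketch: a set of ints that silently ignores duplicate additions.
class IntSetSketch {
    private int[] items = new int[8];
    private int size = 0;

    boolean contains(int x) {
        for (int i = 0; i < size; i++) {
            if (items[i] == x) return true;
        }
        return false;
    }

    // Adds x unless it is already present; returns true if the set changed.
    boolean add(int x) {
        if (contains(x)) return false;
        if (size == items.length) items = Arrays.copyOf(items, size * 2);
        items[size++] = x;
        return true;
    }

    int size() { return size; }

    public static void main(String[] args) {
        IntSetSketch s = new IntSetSketch();
        s.add(1); s.add(1); s.add(2);
        System.out.println(s.size());  // prints 2
    }
}
```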
WHAT IS CLASS KID
Class Kid is a simplified graphical user interface, "easy enough
for kids" to run. Program Kid does not invoke program B, but the Kid
may be started from the program B options menu. The logic here
is that kids should be able to have conversations with the chat
robot, but parents may not want kids to start chat robot servers
(see Appendix B: Note to Parents).
Class Kid utilizes RobotCommunicator as its interface to the
chat robot.
WHAT IS CLASS LINECLASSIFIER
In the file Log.java you will find an Interface called LineProcessor
with one required method: process_line(). The LineProcessor
is the abstraction of an algorithm that reads a file one line at a time,
processes each line as a data record, and moves on to the next.
LineClassifier implements LineProcessor because it reads lines
of text from the log file and identifies client input lines for
classification. What makes classification efficient is the way
LineClassifier stores the client lines in a SortedStringSet, called
Lines. Because the matching algorithm prioritizes the patterns
alphabetically, LineClassifier can classify an element from Lines
in O(1) time.
The code for LineClassifier is in Classifier.java.
WHAT IS CLASS LOADER
Both the application and the applet use the Loader class to load the AIML
robot script. The Loader class extends Thread, and runs "in the background"
while the GUI and, in the case of the application, the web server start.
WHAT IS CLASS PARSER
The Parser class is responsible for the evaluation of AIML
response templates. The method pfkh() [the Program Formerly
Known as Hello] is the heart of the evaluation process. This
method contains the code for recognizing and processing
AIML template tags.
The Parser class does not parse all the AIML in the language
definition; it parses and evaluates only the templates at runtime.
Another class, AliceReader, has the job of reading the AIML files
at load time, and parsing the categories into topics, patterns and templates.
WHAT IS CLASS ROBOTCOMMUNICATOR
If you want to customize your own application or applet then
you might find RobotCommunicator is a useful class. The
RobotCommunicator abstracts the combination of a scrolling TextArea
output display with a TextField input field.
WHAT IS CLASS SORTEDINTSET
The sorted version of IntSet, SortedIntSet maintains its
elements in a sorted array. Throughout program B you will
find many loops utilizing instances of SortedIntSet. These
objects provide an efficient means to locate items in
"rank order", the highest numbered items first and the
smallest numbers last.
WHAT IS CLASS STRINGHISTOGRAMMER
StringHistogrammer extends StringSet and contains a map from
each string to a count, usually indicating the number of times
that string appears in a sample of text. A histogram is
like a "bar graph" that counts occurrences of each item.
WHAT IS CLASS STRINGRANKER
Extending StringHistogrammer, StringRanker also sorts the
strings by the histogram count. The highest count string
is first, the next highest count second, and so on.
The concept of a StringRanker should be familiar to anyone
who has ranked people, companies or sports teams by any
number such as sales, market capitalization, or points scored.
One application for a StringRanker is determining the
"top 10 referers" in HTTP log file analysis (see
http://alicebot.org/mine.html).
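The histogram-then-rank idea can be sketched with the standard
collections. This is not the StringHistogrammer or StringRanker code;
the class RankSketch, its method names, and the alphabetical
tie-breaking rule are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: count string occurrences, then order the strings by count.
class RankSketch {
    // Counts how often each string occurs (the histogram step).
    static Map<String, Integer> histogram(List<String> sample) {
        Map<String, Integer> counts = new HashMap<>();
        for (String s : sample) counts.merge(s, 1, Integer::sum);
        return counts;
    }

    // Highest count first; ties fall back to alphabetical order.
    static List<String> rank(List<String> sample) {
        Map<String, Integer> counts = histogram(sample);
        List<String> ranked = new ArrayList<>(counts.keySet());
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return ranked;
    }

    public static void main(String[] args) {
        // prints [b.com, a.com, c.com]
        System.out.println(rank(Arrays.asList(
            "b.com", "a.com", "b.com", "c.com", "b.com", "a.com")));
    }
}
```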
WHAT IS CLASS STRINGSET
The StringSet implements the abstract concept of a set of
strings, meaning that each string item appears at most once
in the set. For example, {"this","that","another"} is a set of
strings; {"start","start","stop"} is not.
WHAT IS CLASS STRINGSORTER
StringSorter extends StringSet but enforces an alphabetical
ordering of the Strings. The StringSorter maintains its
data structure dynamically, so that the set remains sorted
after each item is added. Specifically, the StringSorter uses
a binary-search algorithm for fast String insertion.
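The insertion step might be sketched as follows, using the standard
Collections.binarySearch() in place of whatever search routine
StringSorter itself contains. The class name SortedStringsSketch is
hypothetical. The search finds the position in O(log n) comparisons,
though the list still shifts later elements to make room.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch: keep a list of unique strings sorted after every insertion.
class SortedStringsSketch {
    private final List<String> items = new ArrayList<>();

    // Inserts s at its sorted position; returns false if s is already present.
    boolean add(String s) {
        int pos = Collections.binarySearch(items, s);
        if (pos >= 0) return false;  // found: a set holds each string once
        items.add(-pos - 1, s);      // a miss returns -(insertionPoint) - 1
        return true;
    }

    List<String> items() { return Collections.unmodifiableList(items); }

    public static void main(String[] args) {
        SortedStringsSketch set = new SortedStringsSketch();
        for (String s : new String[] {"PEAR", "APPLE", "MANGO", "APPLE"}) set.add(s);
        System.out.println(set.items());  // prints [APPLE, MANGO, PEAR]
    }
}
```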
WHAT IS CLASS SUBSTITUTER
The static class Substituter contains a number of similar string substitution
methods useful at several points in program B.
Program B has the unique feature that it relies on HTTP GET methods,
rather than POST methods, to transmit chat inputs to the robot server.
The browser's URL encoding inserts '+' characters in place of spaces and
escapes many other characters as %XX sequences. The static method cleanup_http()
undoes these substitutions and restores the input string to a form similar
to what the client originally typed.
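For comparison, the same decoding can be done with the standard
java.net.URLDecoder rather than a substitution table. This sketch is
not cleanup_http() itself. The class name HttpCleanupSketch and the
query parameter name "text" are assumptions, and only a single
parameter is handled.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Sketch: recover the typed input from an HTTP GET request line.
class HttpCleanupSketch {
    // Expects a line such as "GET /?text=Hello+there%21 HTTP/1.0".
    static String decodeInput(String requestLine) {
        int start = requestLine.indexOf("text=");
        if (start < 0) return "";
        start += "text=".length();
        int end = requestLine.indexOf(' ', start);
        if (end < 0) end = requestLine.length();
        String raw = requestLine.substring(start, end);
        try {
            // '+' becomes a space and each %XX escape becomes one character.
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);  // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // prints Hello there!
        System.out.println(decodeInput("GET /?text=Hello+there%21 HTTP/1.0"));
    }
}
```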
The problem of segmenting strings into sentences is complicated by the
conventional use of periods to denote abbreviations like "Dr.", "Mr.",
and "St." The method deperiodize() applies a series of substitutions to
eliminate most common abbreviations. Like the other substitution methods
in this class, the deperiodize() method has an associated static data member
of class String[][2], which stores the substitution map.
The patterns in AIML are written in normalized form. The method normalize()
converts a string to normal form by the following steps:
1. Remove all punctuation (inputs assumed to be individual sentences)
2. Convert string to upper case
3. Place exactly one space between words
4. Expand all contractions
5. Correct a few common spelling mistakes
6. Return a "Trimmed" string
Removing all punctuation from text inputs keeps the chatterbot
compatible with speech inputs, which of course contain no
punctuation.
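The normalization steps can be sketched in a few lines. This is an
illustration, not the normalize() method itself. The class name
NormalizeSketch and the four-entry word table are made up, input is
assumed to be ASCII, and apostrophes are kept until contractions have
been expanded, so the order differs slightly from the list above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Sketch: reduce a sentence to upper-case words separated by single spaces.
class NormalizeSketch {
    // Tiny illustrative table covering contractions and one spelling fix.
    static final Map<String, String> WORDS = new HashMap<>();
    static {
        WORDS.put("WHAT'S", "WHAT IS");
        WORDS.put("I'M", "I AM");
        WORDS.put("DON'T", "DO NOT");
        WORDS.put("ALOT", "A LOT");
    }

    static String normalize(String input) {
        // Upper case, then turn everything except letters, digits and ' into spaces.
        String cleaned = input.toUpperCase(Locale.ROOT).replaceAll("[^A-Z0-9' ]", " ");
        StringBuilder out = new StringBuilder();
        for (String word : cleaned.trim().split("\\s+")) {
            // Expand known words, then drop any apostrophes that remain.
            String w = WORDS.getOrDefault(word, word).replace("'", "");
            if (w.isEmpty()) continue;
            if (out.length() > 0) out.append(' ');
            out.append(w);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints WHAT IS UP DOC
        System.out.println(normalize("  What's   up,  Doc?  "));
    }
}
```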
WHAT IS CLASS UNIFIER
Unification refers to the process of matching and binding. A unifier determines
whether two sentences match and, if so, what any 'variables' in the pattern
bind to. In the case of AIML the only matching variable is the single '*'
symbol. The Unifier class contains a 'star' data member to hold the
matched subsentence.
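A single-star unifier can be written as a prefix and suffix check. The
sketch below is an illustration under the one-star rule, not the
Unifier class itself. The class name UnifySketch and the null-for-no-
match convention are assumptions, and both arguments are assumed to be
normalized already.

```java
// Sketch: match a sentence against a pattern containing at most one '*'.
class UnifySketch {
    // Returns the words bound to '*', "" when a star-free pattern matches
    // exactly, or null when the sentence does not match.
    static String unify(String pattern, String sentence) {
        int star = pattern.indexOf('*');
        if (star < 0) return pattern.equals(sentence) ? "" : null;
        String prefix = pattern.substring(0, star);
        String suffix = pattern.substring(star + 1);
        if (!sentence.startsWith(prefix) || !sentence.endsWith(suffix)) return null;
        int end = sentence.length() - suffix.length();
        if (end <= prefix.length()) return null;  // '*' must bind at least one word
        return sentence.substring(prefix.length(), end).trim();
    }

    public static void main(String[] args) {
        // prints HOW ARE YOU
        System.out.println(unify("HELLO *", "HELLO HOW ARE YOU"));
    }
}
```

With this sketch, unify("HELLO *", "HELLO") returns null, which mirrors
the difference between the patterns HELLO and HELLO *.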
WHAT IS CLASS WEBSERVER
The WebServer class implements a "faux" HTTP server, i.e. a server that
listens for HTTP connections and accepts them; then replies in properly
formatted HTML. The connecting client, typically a browser, cannot tell
the difference between the chat robot server and a full-blown web server.
In particular, our WebServer implements only HTTP GET methods, not POST
methods. Our WebServer class does not implement many of the other features
of ordinary web servers, although it is a multithreaded server.
WHAT IS LT LOAD FILENAME X GT
The template may contain a <load/> tag to recursively load an AIML
file. The semantics of a load are the same as a merge: categories
loaded first have priority; the server eliminates categories with
duplicate patterns.
The default robot file B.aiml contains the top-level load commands.
There are several ways to "comment out" a <load> tag in order
to test your system with a smaller robot. You can change the
line reading
<load filename="Brain.aiml"/>
to
<noload filename="Brain.aiml"/>
and the AIML parser will simply ignore the non-existent "noload"
command.
WHAT IS LT STAR GT
The <star> tag indicates the input text fragment matching the pattern '*'.
Remember, <star/> is an XML abbreviation for <star></star>.
<star/> the value of "*" matched by the pattern.
WHAT IS LT THAT GT
The keyword "that" in ALICE refers to whatever the robot said before
a user input. Conceptually the choice of "that" comes from the
observation of the role of the word "that" in dialogue fragments like:
Robot: Today is yesterday.
Client: That makes no sense.
Robot: The answer is 3.14159
Client: That is cool.
In AIML the syntax <that>...</that> permits an optional "ThatPattern"
to match the robot's "that" expression. A common example using "that"
is any yes-no question:
<category>
<pattern>YES</pattern>
<that> DO YOU LIKE MOVIES </that>
<template> What's your favorite movie? </template>
</category>
This category handles the user input "YES" and checks to see whether
the client is replying to the question "Do you like movies?".
One interesting application of "that" is the set of categories that
enable a robot to respond to "knock-knock" jokes:
<category>
<pattern>KNOCK KNOCK</pattern>
<template>Who's there?</template>
</category>
<category>
<pattern>*</pattern>
<that>WHO IS THERE</that>
<template><person/> Who?</template>
</category>
<category>
<pattern>*</pattern>
<that>* WHO</that>
<template>Ha ha very funny, <getname/></template>
</category>
Client: KNOCK KNOCK
Robot: Who's there?
Client: BANANA
Robot: banana Who?
Client: KNOCK KNOCK
Robot: Who's there?
Client: BANANA
Robot: banana Who?
Client: KNOCK KNOCK
Robot: Who's there?
Client: ORANGE
Robot: orange Who?
Client: ORANGE YOU GLAD I DID NOT SAY BANANA
Robot: Ha ha very funny, Aol-person
WHAT IS LT THINK GT
The simple purpose of the <think> X </think> tag pair is
to evaluate the AIML expression X, but "nullify" or hide
the result from the client reply.
A simple example:
<category>
<pattern>I AM FEMALE</pattern>
<template>Thanks for telling me your gender. <think><set_female/></think>
</template>
</category>
The <set_female/> tag normally returns a string like "she". But the
<think> tag hides the text output of <set_female/> from the reply,
which contains only the text:
Thanks for telling me your gender.
WHAT IS NEW IN AIML
AIML is changing. The original tag syntax was changed
into XML. Right now, AIML uses XML syntax for the
categories, patterns, "that" patterns and templates, but inside the
<template> tag you may still see the original +~ syntax in a few places.
But this will change soon. For completeness program B
supports both versions.
The biggest change between the old AIML and the new
XML version of AIML is the elimination of the "+"
character to stand for string concatenation. The change
is of little concern except in the implementation of
<random>, discussed at length below.
The old AIML used a tilde (~) markup character to
indicate the start of an AIML token. The XML version
naturally uses an SGML type tag syntax instead.
XML tags, unlike HTML, are case-sensitive. Moreover, XML syntax
requires a closing tag of some kind. The "empty" tags that contain
no text, like <A></A> in HTML, are written like <A/> in XML.
WHAT IS ON THE HELP MENU
Random Help - Same as "Help" button.
Show Help Questions - Displays a list of all FAQ questions. Select
one by deleting all the others. Obtain the answer with "Send."
Don't Read Me - Display the text of this document.
GNU Public License - Display the software license.
WHAT IS PROGRAM BAWT
Significant demand for a version of ALICE compatible with
pre-Java 2 (formerly known as Java 1.2) prompted the
development of "Bawt.java", an open source Java program
for chat robot development that works with older versions of
Java, and AWT. Originally program B relied on
Java 2 and Swing, but program Bawt needs only Java 1.1 and AWT.
Swing is a newer GUI package that subsumes the earlier Java
Abstract Windows Toolkit (AWT).
At present class B merely extends class Bawt. Swing is not
supported.
WHAT IS THE BOTMASTER MENU
The Botmaster menu contains all the tools to help develop chat robots.
Classify - same as Classify button
Default Targets - display targets obtained from
the Default ('*') category,
in a format suitable for
quick conversion to new AIML.
Recursive Targets - display targets from "recursive" categories,
i.e. categories with a template containing
the AIML <sr/> or <srai> functions.
Autochat - The robot chats with herself; sometimes helpful
in detecting conversation "loops".
Add AIML - Clear the screen and type a line of AIML. Selecting
"Add AIML" adds this new category to the chatbot. You can
test the bot with "Send" and "Classify", then save it with
"File/Save Robot".
In general you can add any number of new AIML categories
to the bot with "Add AIML."
WHAT IS THE CLASS STRUCTURE OF PROGRAM B
The core functionality of program B resides in the file
Classifier.java. In that file, you find a class hierarchy
from "String" to "Brain" and finally "Classifier."
A branch in that hierarchy contains classes for histogramming
and ranking.
The first branch of the class hierarchy derives class Brain
from StringSorter, extending StringSet. The second branch
extends StringSet to StringHistogrammer and on to StringRanker.
The final class Brain extends StringSorter and uses StringRanker.
WHAT IS THE DIFFERENCE BETWEEN B AND C
AIML is a platform-independent, language-independent specification
for creating chat robots like ALICE. The original AIML interpreter
ran in SETL. The next one developed was program B, the Java program
which is the subject of this document. Most recently new threads
of C/C++ development have led to "program C", actually a collection
of C/C++ programs and applications including Cgi-ALICE, IRC-ALICE and
WinALICE. See the web sites http://c.alicebot.com and
http://hippie.alicebot.com for more details.
Program B remains the most stable, general purpose chat robot
program in the AIML family. This Java implementation has been
subject to intense peer review over a period of years, evolving
into a remarkably bug-free, efficient and readable piece of
software.
WHAT IS THE DTD FOR AIML
Real XML fanatics know that because AIML is an XML language it
must have something called a DTD (Document Type Definition).
The DTD is a formal specification of the grammar for an XML language.
Unless you are using special XML tools to work on your AIML or
developing your own parser for AIML, you probably do not need to know
much about the DTD.
Our DTD reflects the current content of the *.aiml files that program B can
actually parse. The DTD will become more general as the parser
improves.
Rather than reproduce the entire DTD here, in order to shorten the
length of this document, we refer the reader to
the A.L.I.C.E. XML page by John Friedman. The URL for the AIML
DTD may be found on the page at http://XML.ALICEBot.Com.
The full URL for the DTD is
http://xml.alicebot.com/xml/aiml/alice.dtd
WHAT IS THE GOAL FOR AIML
AIML (Artificial Intelligence Markup Language) is an XML specification
for programming chat robots like ALICE using program B. The emphasis
in the language design is minimalism. The simplicity of AIML makes
it easy for non-programmers, especially those who already know HTML,
to get started writing chat robots.
One ambitious goal for AIML is that, if a number of people create their own
robots, each with a unique area of expertise, program B can literally
merge-sort them together into a Superbot, automatically omitting
duplicate categories. We offer both the source code and the ALICE
content, in order to encourage others to "open source" their chat
robots as well, and to contribute to the Superbot.
Botmasters are also of course free to copy protect private chat robots.
WHAT IS THE LOW LEVEL INTERFACE TO PROGRAM B
If you require only a graphical interface, try using the
class RobotCommunicator. Depending on your application,
you may also try the Servlet interface or the applet.
Some developers however may want lower-level access to the
chat robot functions.
The class Classifier in Classifier.java contains the low-level
methods needed to interface directly to ALICE. "Classifier" might
as well be called "Bot" because more than any other class,
it handles those functions most unique to the chat robot.
The method Classifier.multiline_response() is a key entry point
into the conversation engine. The "multiline" in
"multiline_response" means that the input may contain
multiple "lines" or sentences. The first argument "query" to
multiline_response is the input. The second argument "hname" is
the virtual IP address of the client. The third and last argument
is the class implementing the Responder interface.
If the input string contains "Sentence1. Sentence2? Sentence3."
then multiline_response might produce:
> Sentence1.
Reply1
> Sentence2
Reply2
> Sentence3
Reply3
The method multiline_response hides all of the details
of sentence segmentation, responding to each input line individually,
and formatting the output. In particular multiline_response()
may or may not append the VBScript needed to drive the MS
Agent output, depending on whether the global MS Agent parameter is set.
The argument "hname" is a key that indexes the client's conversation. If
your interface serves a single client, this can probably always be
"localhost" or some other constant.
WHAT IS THE LT PERSON GT TAG
The XML specification requires that every start tag such as
<person> be followed by a matching end tag like </person>.
HTML is more relaxed about this requirement, exemplified by
the liberal use of the <IMG> tag without a corresponding </IMG>.
XML supports a shorthand notation for the "atomic" tags.
The <star/> tag is an example of a shorthand AIML tag.
<person/> is another example:
<person/> = <person><star/></person>
This tag replaces the +~person(*)+ tag in old-style AIML.
WHAT IS THE LT PERSON2 GT TAG
This tag is an abbreviation:
<person2/> = <person2><star/></person2>
See the FAQ question "What are the <person> tags?" for more
information about <person2/>.
WHAT IS THE LT PERSONF GT TAG
The value of <personf/> (a "formatted" personal pronoun transformation)
is shown by the example
<category>
<pattern>WHAT IS A *</pattern>
<template>
What does
<A HREF="http://www.dictionary.com/cgi-bin/dict.pl?term=<personf/>">
<set_it> <person/> </set_it>
</A> mean? <BR>
Or Ask Jeeves:
<A HREF="http://www.ask.com/AskJeeves.asp?ask=WHAT%20IS%20A%20<personf/>">
What is a <person/>?
</A>
</template>
</category>
The search strings formatted for the Webster Dictionary and for
the Ask.com search engine utilize <personf/>. The effect is the
same as <person/>, but the formatting inserts an escaped "%20" in
place of the spaces returned by <person/>. These escape sequences
permit the HTTP GET methods to transmit multiple-word queries.
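The formatting step amounts to a space-to-"%20" replacement. The
sketch below shows only that step, with a made-up class name
PersonfSketch. It does not perform the pronoun transformation, and the
dictionary URL prefix is the one from the category above.

```java
// Sketch: format a multi-word phrase so it can be embedded in a GET URL.
class PersonfSketch {
    // Replaces each space with the escape sequence "%20".
    static String escapeSpaces(String phrase) {
        return phrase.trim().replace(" ", "%20");
    }

    static String dictionaryUrl(String phrase) {
        return "http://www.dictionary.com/cgi-bin/dict.pl?term=" + escapeSpaces(phrase);
    }

    public static void main(String[] args) {
        // prints http://www.dictionary.com/cgi-bin/dict.pl?term=chat%20robot
        System.out.println(dictionaryUrl("chat robot"));
    }
}
```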
WHAT IS THE LT SRAI GT TAG
The recursive function <srai> stands for
"Stimulus-Response artificial intelligence" and means
that the text between the tags should be sent recursively
to the pattern matcher and the result interpreted.
The resulting text replaces the original text in the markup.
<srai> X </srai> calls the pattern matcher recursively on X.
<sr/> recursive call to chat robot
<sr/> abbreviates <srai> <star/> </srai>
Note: what happens if X contains AIML markup? Does the interpreter
do "lazy evaluation"? Look at the source code and examine the
method pfkh(), the Program Formerly Known as "Hello".
WHAT IS THE LT TOPIC GT TAG
1. <topic> allows ALICE to prefer responses that deal with the
topic currently being discussed. This creates topical
conversation, yet still has the ability to move from one subject
to another.
2. <topic> allows ALICE to have duplicate patterns in different
contexts (topics) allowing ALICE to have different responses to
the same input patterns depending on the topic. For example,
"overriding" the " * " pattern for different topics. (I'll give
an example with this.)
3. As always, you can still use the <gettopic/> tag to refer to
the topic in your output statements (templates).
4. As always, you can add topics on top of all your existing AIML
to keep your bot's current personality.
WHAT IS THE RESPONDER INTERFACE
Developed to meet the needs of multiple ALICE
application scenarios, the Responder interface
simplifies the code in class Classifier for
natural language queries. The Responder defines
an interface with four members:
pre_process() : runs any initialization first.
log() : tells how to log the conversation.
append() : how to append response lines together.
post_process() : runs after response loop finishes.
The method Classifier.multiline_response() calls
all of the Responder methods. See the
question "What is the low-level interface to program B?"
for more information about multiline_response().
At least five classes implement the Responder
interface:
GUIResponder: the program B GUI uses this.
HTMLResponder: a class for Web Server HTML replies.
RobotResponder: this class is used by RobotCommunicator.
CustomResponder: a template for more Responder classes.
AppletResponder: the Applet code uses this class.
These classes all handle special circumstances
for the various Responder types: for example,
HTMLResponder appends the client input to each
response; GUIResponder does not. AppletResponder
logs the dialogue through a network URL connection;
all other classes write to a local file. RobotResponder,
used by the Kid interface, suppresses all the HTML
from robot replies; while HTMLResponder passes
them through. HTMLResponder also runs the optional
Animagent class to create the MS Agent VBScript.
Text-based Responder classes wrap the text; HTMLResponder
need not wrap because the browser handles text formatting.
The Responder interface addresses this wide variety of needs.
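To make the division of labor concrete, here is a sketch of how a
response loop might drive the four methods. The method names follow
this FAQ, but every signature is an assumption. TextResponder,
ResponderDemo, multiline(), and the echo-style replyTo() stand-in are
hypothetical and are not taken from program B.

```java
import java.util.ArrayList;
import java.util.List;

// Method names follow the FAQ; the parameter lists are guesses.
interface Responder {
    String pre_process(String input, String hname);
    void log(String input, String reply, String hname);
    String append(String input, String reply, String soFar);
    String post_process(String response);
}

// A plain-text responder that shows each input above its reply.
class TextResponder implements Responder {
    final List<String> logged = new ArrayList<>();

    public String pre_process(String input, String hname) { return input.trim(); }

    public void log(String input, String reply, String hname) {
        logged.add(hname + ": " + input + " -> " + reply);
    }

    public String append(String input, String reply, String soFar) {
        return soFar + "> " + input + "\n" + reply + "\n";
    }

    public String post_process(String response) { return response.trim(); }
}

class ResponderDemo {
    // Stand-in for the reply engine: echoes the sentence back.
    static String replyTo(String sentence) { return "You said: " + sentence; }

    // Splits the query into sentences and answers each one in turn.
    static String multiline(String query, String hname, Responder r) {
        String input = r.pre_process(query, hname);
        String response = "";
        for (String sentence : input.split("(?<=[.?!])\\s+")) {
            if (sentence.isEmpty()) continue;
            String reply = replyTo(sentence);
            r.log(sentence, reply, hname);
            response = r.append(sentence, reply, response);
        }
        return r.post_process(response);
    }

    public static void main(String[] args) {
        System.out.println(multiline("Hello. How are you?", "localhost", new TextResponder()));
    }
}
```

Swapping in a different Responder changes the logging and formatting
without touching the loop, which is the point of the interface.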
WHAT IS THE THEORY BEHIND ALICE
I used to say that there was NO theory behind ALICE: no neural network,
no knowledge representation, no search, no fuzzy logic, no genetic
algorithms, and no parsing. Then I discovered there was a theory
circulating in applied AI called "Case-Based Reasoning" or CBR that
maps well onto the ALICE algorithm. Another term, borrowed from
pattern recognition, is "nearest-neighbor classification."
The CBR "cases" are the categories in AIML. The algorithm finds the
best-matching pattern for each input. The category ties the
response template directly to the stimulus pattern. ALICE is
conceptually not much more complicated than Weizenbaum's ELIZA
chat robot; the main differences are the much larger case base and the
tools for creating new content by dialog analysis.
ALICE is also part of the tradition of "minimalist", "reactive" or
"stimulus-response" robotics. Mobile robots work best, fastest and
demonstrate the most animated, realistic behavior when their sensory
inputs directly control the motor reactions. Higher-level symbolic
processing, search, and planning tend to slow down the process
too much for realistic applications, even with the fastest control
computers.
WHAT IS XML
David Bacon pronounces it "Eggsmell". XML is the Extensible
Markup Language. Like many "standards" in computer science, XML
is a moving target. In the simplest terms, XML is just a generalized
version of HTML. Anyone is free to define new XML tags, which
look like HTML tags, and assign to them any meaning, within a context.
AIML is an example of using the XML standard to define a specialized
language for artificial intelligence.
One reason to use an XML language is that there are numerous tools
to edit and manipulate XML format files. Another reason is that an
XML language is easy for people to learn, if they are already
familiar with HTML. Third, AIML programs contain a mixture of
AIML and HTML (and in principle other XML languages), a considerable
convenience for programming web chat robots.
A good resource for information on XML is www.oasis-open.org.
WHERE DOES THE LT TOPIC GT TAG APPEAR
Topic tags are placed around one or more categories. (Usually
many.) The categories (with each respective "pattern", "that",
and "template") within a set of <topic> </topic> tags would be
associated with the defined topic. The name of the topic would be
given by a "name" property in the beginning topic tag. Here would
be the full AIML format with topic:
<alice>
<topic name="THE TOPIC">
<category>
<pattern> phrase </pattern>
<that> phrase </that>
<template> phrase </template>
</category>
</topic>
</alice>
WHO IS THE BOTMASTER
The botmaster is you, the master of your chat robot. A botmaster runs
program B and creates or modifies a chat robot with the program's
graphical user interface (GUI). He or she is responsible for
reading the dialogues, analyzing the responses, and creating new
replies for the patterns detected by program B. Botmasters are
hobbyists, webmasters, developers, advertisers, artists, publishers,
editors, engineers, and anyone else interested in creating a personal
chat robot.
WHY IS THE FORMAT OF THE OPTIONS GLOBALS TXT SO STRANGE
Depending on your system, you may see a globals.txt file that looks like:
Animagent=true
Botmaster=Jon Baer
AnalysisFile=dialog.txt
ClientLineContains=t:
LogFile=dialog.txt
CodeBase=D\:CHATTERBOTS\ALICE
StartLine=0
Beep=true
BotFile=B.aiml
AppletHost=206.184.206.210
EndLine=25000
BotName=ALICE
Birthday=November 23, 1995
TempFile=Temp.ai
RobotLineStarts=Robot
# ... and so on
The global values seem to be stored in a random order.
This is not a bug. The Globals class uses the Java methods
Properties.load() and Properties.store() to save the globals
to a file. You can also use # and ! to add comments to the file.
The Properties class uses a hash table representation, so does
not preserve the order of the global variables. The program
displays and saves the global options in an arbitrary order.
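The round trip through java.util.Properties can be tried in isolation.
The sketch below stores to a string instead of a file. The class name
GlobalsSketch and the comment text are made up, and the keys are taken
from the listing above. Properties.store() also escapes characters
such as ':' with a backslash, which is why a value like "D\:" appears
in a saved file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

// Sketch: save options with Properties.store() and read them back with load().
class GlobalsSketch {
    static String save(Properties globals) {
        try {
            StringWriter out = new StringWriter();
            // Writes the comment line, a date line, then key=value lines
            // in no guaranteed order.
            globals.store(out, "program B options");
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Properties roundTrip(Properties globals) {
        try {
            Properties loaded = new Properties();
            loaded.load(new StringReader(save(globals)));
            return loaded;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Properties globals = new Properties();
        globals.setProperty("BotName", "ALICE");
        globals.setProperty("BotFile", "B.aiml");
        globals.setProperty("CodeBase", "D:\\CHATTERBOTS\\ALICE");
        // prints D:\CHATTERBOTS\ALICE
        System.out.println(roundTrip(globals).getProperty("CodeBase"));
    }
}
```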