Permission to reprint or excerpt is granted only if the following
      line appears at the top of the article:

       ANTIC PUBLISHING INC., COPYRIGHT 1986.  REPRINTED BY PERMISSION.



      PROFESSIONAL GEM  by Tim Oren
      Column #14 - User Interfaces, part 2


           This  issue of ST PRO GEM (#14) continues the  discussion  of
      user  interface  design which began in episode eight.   It  begins
      where  we left off,  with a further treatment of the mode problem,
      and  proceeds  into  topics such as visual  grammar  and  layered
      interfaces.

           Note  that  there  is  no  download  for  this  column.   The
      downloads  will return with the next issue,  a discussion of using
      the  GEM DOS file system within a GEM application.   Specifically,
      it  will include sample code for using the file selector,  the GEM
      form_error  alerts,  and some utilities for manipulating file  and
      path names.  There will also be a feedback section.  The following
      two columns will be devoted to "graphics potpourri",  a collection
      of  small  but useful GEM utilities such as  popup  menus,  string
      editing, and source code for drag and rubber box operations.

           MODES AGAIN.   If a program is modeless, it acts predictably,
      which turns out to be very important.   On the other hand,  a good
      definition  for  "modes"  is hard to find.   In  column  eight,  I
      suggested  that  a  mode exists when you cannot  use  all  of  the
      capabilities  of the program without performing some  intermediate
      step.   If  this  is  less  than clear,  here  are  two  alternate
      definitions offering different views of the problem.

           THE   "TWO  USER  TEST".    Consider  the  following  thought
      experiment:   Imagine  that  your ST (and GEM) had two  mice,  two
      cursors,  and  two  users.   Could  they both effectively use  the
      program at the same time?  If so, the application is modeless.  If
      there are points where one user can be "locked out" by the actions
      of  the other,  then a mode exists at that point.  Let's  consider
      some examples of this test.

           In any program which uses the GEM menu system, one user could
      stop  the  other by touching a menu hotspot and dropping  a  menu.
      This constitutes an inherent mode in the GEM architecture.

           On  the  GEM Desktop,  two users could open windows and  view
      files without interference.   However, as soon as one person tries
      to delete a file (assuming the verify option is on),  the other is
      brought  to  a halt as a dialog appears.   Thus,  we have found  a
      modal dialog.

           In  many "Paint-type" programs,  such as MacPaint,  PC Paint,
      and  GEM Paint,  two artists could co-exist quite well,  utilizing
      the  on-screen  palette  and tool  selection.   Of  course,  these
      programs  also contain modal dialogs for such operations  as  file
      and  brush  shape  selection.   In contrast,  consider  the  paint
      program  DEGAS  for  the ST.   Here,  two artists could only  work
      together as long as neither wanted to change tool or color.   Then
      the  display  would have to be flipped to  the  selection  screen,
      stopping the other user.  This is a mode in the DEGAS interface.

           (By the way, this test is not just academic.  The grand-daddy
      of all mouse based systems, NLS, demonstrated by Doug Englebart in
      1968,  had  two  mice  and two users,  one of whom was  physically
      remote.   Cooperative  techniques  such as this are still  largely
      unexplored and unexploited.)

           ONE  LINER.   Here's  a  terse definition by  Jef  Raskin:  A
      program is modeless if a given action has one and only one result.
      Again, let's run a few examples.

           The  menu  dropdowns are clearly modal  by  this  definition.
      Before  the  menu was activated,  window control points  could  be
      activated with a click.   However, when the dropdown is visible, a
      click action is interpreted as a menu selection or a dismissal  of
      the dropdown.   Similarly, dialogs are modal because the action of
      moving  the mouse into the menu bar no longer causes the  dropdown
      to appear.

           I am typing this using the First Word editor program.  It has
      a  nice desktop level box full of characters where I can click  to
      get  symbols which the ST keyboard won't produce.   However,  if I
      invoke  the  find or replace string dialog,  the  click-in-the-box
      action  doesn't  work anymore.   This is a mode in the First  Word
      interface.

           Finally, consider an "old style" menu program, the kind where
      you  type in the number of the desired action from a list.   Since
      the  number  "2" might mean "Insert the record" in one  menu,  and
      "Purge  the file" in another,  such a program is clearly modal  by
      Raskin's definition.

           These  three definitions say almost the same thing,  but from
      different  viewpoints.   Depending  on the situation,  one or  the
      other  may  be more intuitive for you.   The goal of this type  of
      analysis  is to root out unnecessary modes,  and to make sure that
      those  which remain only appear when requested by the user,  offer
      some visual cue such as a rubber line or standard dialog box,  and
      are used consistently throughout the application.

           PREDICTABILITY  FOREVER  AND  EVER  AND  EVER.   As  Raskin's
      definition  makes  clear,  when the modes go away,  the  interface
      becomes  predictable.   Predictability  leads to the formation  of
      habits   of   use.     Habits  reduce  "think  time"  and   become
      progressively faster due to the Power Law of Practice discussed in
      column eight.  This is exactly what we want!

           There is another benefit of predictability.   A habit learned
      in  one  part  of a program with a  consistent  interface  can  be
      transferred  and  used elsewhere in the application.   If  several
      programs share the same style of interface, the same habits can be
      used across a complete set of products.  Learning time for the new
      functions becomes shorter,  and the user is more likely to use the
      new feature.

           IS  A  BOGEYMAN!   Most  casual  users are  scared  silly  of
      computers  and programs.   (If you have any doubt,  eavesdrop on a
      secretary with a new word processor,  or the doctor's receptionist
      coping with an insurance data entry program.)  In most cases, they
      have  a  right to be frightened.   Even  experienced  programmers,
      prone  to  toss  the manuals and hack  away,  know  that  moderate
      paranoia  is  the best way to deal with an unknown  program.   How
      must this feel to someone whose ability to perform (or lose) their
      job depends on an unpredictable (aha!) black box.

           So here's another way in which predictability works.   But to
      produce  a truly fearless user,  we need other qualities as  well.
      One  is robustness,  meaning that the program will not crash given
      normal or even bizarre actions by the user.   Another is feedback,
      which shuts off invalid options,  reinforces correct actions,  and
      gives  reassurance  that  an  operation  is  proceeding  normally.
      Finally, we need forgiveness, in the form of inverse operations or
      Undo options, when the inevitable mistake is made.

           The  ultimate  goal is make the program  discoverable.   This
      means  the user should be able to safely "wing it" after  a  short
      session  with  the application and its interface.   This  practice
      ought to be considered the norm anyway, since the manual is always
      across  the office or missing when an esoteric and  half-forgotten
      feature  is  needed.   If it is possible to muddle through such  a
      situation  by  trial  and  error,   without  causing  damage,  the
      immediate  problem  will  be  solved,   and  the  user  will  gain
      confidence.

           GOOD GRAMMAR OR...  So exactly what are these habits that are
      supposed  to be so helpful?   One of the most useful patterns is a
      consistent  command  grammar  for the  program.   This  may  sound
      strange,   since   we  have  supposedly  abandoned  command   line
      interfaces  in the graphics world,  but in fact,  the same type of
      rules apply.   For instance, in the world of A> we might issue the
      command:

           copy a:foobar.txt b:

      By  analogy  to  English grammar,  this command contains  a  verb,
      "copy",  a  file  as  subject:  "a:foobar.txt",  and a location as
      an object: "b:".  The equivalent GEM Desktop operation is:

           - Move mouse to foobar.txt icon in a: window
           - Press mouse button
           - Move mouse to b: icon
           - Release mouse button

      The  operation  can be described as a  select-drag-drop  sequence,
      with  the select designating the subject file,  the drag  denoting
      the  operation  (copy),  and the location of the drop showing  the
      object.   A  grammar still exists,  but its "terminal symbols" are
      composed  of  mouse  actions interpreted in  the  context  of  the
      current screen display, rather than typed characters.

           One  useful way to analyze simple grammars,  including  those
      used  as  command  languages,  is to separate  them  into  prefix,
      postfix,  and infix forms.   In a prefix grammer, the operation to
      be  performed precedes its operands,  that is,  its subject(s) and
      object(s).   The  DOS copy command given above is an example of  a
      prefix  command.   LISP  is  an example of a language  which  uses
      prefix specification for its commands.

           Postfix grammars specify the action after all of the operands
      have been given.   This command pattern is familiar to many as the
      way  in  which  Hewlett-Packard calculators  work.   FORTH  is  an
      example of a language which uses a postfix grammar.

           Infix  notation  places the verb,  or operator,  between  its
      subject and object.   Conventional algebraic notation is infix, as
      are most computer languages such as C or PASCAL.   The example GEM
      command  given  above  is also infix,  since the  selection  of  a
      subject  file  preceded  the action,  which was  followed  by  the
      designation of an object.

           The  "standard" GEM command grammar,  as used in the products
      produced  by Digital Research,  is in fact infix.   This is not to
      say that GEM enforces such a convention,  or that it is rigorously
      followed.  However, when there is no pressing reason for a change,
      adoption  of an infix command grammar will make  your  application
      feel most like others which users may have seen.

           The general problem of specifying a graphic command  language
      can be difficult, but much of the problem has already been handled
      on the ST.   Part of the solution is by constraint:  the input and
      output hardware of the ST are predefined,  so most developers will
      not  need  to  worry about choosing a pointing  device  or  screen
      resolution.   The  other part of the standard solution is the  GEM
      convention for mouse usage.  I am going to review these rules, and
      then describe of the situations in which they have been bent,  and
      finally  some alternate approaches which may prove useful to  some
      developers.

           SPECIFYING  A SUBJECT.   There are really two sets of methods
      for  designating what is to be affected by an operation.   One set
      is  used when distinct objects are to be affected.   Examples  are
      file and disk icons in the Desktop and trees in the RCS.   Another
      set of designation methods is used when continuous material,  such
      as text or bit images, is being handled.

           When dealing with objects, a single mouse click (down and up)
      over the object selects it.   The application should show that the
      selection  has occurred by changing the appearance of the  object.
      The  most  common  methods are inverting the  object,  or  drawing
      "handles" around it.

           Many   operations   allow  "plural",   or  multiple   object,
      selections.  The GEM convention is that a click on an object while
      the  shift key is held down extends the selection by  adding  that
      object.   If the shift-clicked object was already selected,  it is
      deleted from the selection list.

           Another  way to select multiple objects is to use  a  "rubber
      box"  to enclose them.   This operation begins with drag on a part
      of  the  view where no object is present.   The  application  then
      animates a rubber box on the screen as long as the mouse button is
      held  down.   When the button is released,  all objects within the
      current extent of the box are selected.   A shift-drag combination
      could be used to add the objects to an existing selection list.

           Selecting  part of a text or bit plane display is  also  done
      with a rubber box.   Since there are no "objects" in the view, any
      mouse  drag  is  interpreted  as  the  beginning  of  a  selection
      operation.   In  the  simplest  case,  a bit plane,  the rectangle
      within the box when the button is released is the selected extent.

           When  the  underlying data has structure,  such as words  and
      lines  of  text,  the display should reflect this fact during  the
      selection  operation.   Typically,  text selection is indicated by
      inversion  of  the  characters  rather than  a  rubber  box.   The
      selection  extends  along the starting line so long as  the  mouse
      stays  within the line.   If the mouse move off the starting  text
      line, the implied selection is all characters between the starting
      character  and the character currently under the mouse,  which  is
      not necessarily a rectangular area.

           An  extended  "plural"  selection may be  supported  in  text
      editing.   The  use of the shift key is also conventional in  this
      application.

           ACTION.  With the subject designated, the user can now choose
      an operation.   In many cases,  this will be picked from the menu,
      in  which  case  the  entire  command  is  complete.    Some  menu
      selections will lead to dialogs,  in which the interaction methods
      are  regulated  by  the GEM form manager.   When  the  command  is
      completed,  it  is  often  helpful if the application  leaves  the
      objects  (or areas) selected and ready for another  operation.   A
      single click away from any object is interpreted as cancelling the
      selections.

           Many  operations  are indicated by gestures  on  the  screen.
      Usually,   this  is  some  variant  of  a  drag  operation.    The
      interpretation of the gesture may depend on the type and  location
      of the selected subject,  which part of it is under the mouse, and
      in what location the drag terminates.

           "Handles"  are small boxes or dot displayed around an  object
      when it is selected.   A drag beginning with the mouse on a handle
      is  usually  interpreted  as  a resizing  operation,  if  this  is
      appropriate.   The  pointing  finger  mouse form is  displayed  to
      indicate  the operation in progress,  and a rubber version of  the
      object  is animated on the screen to show the user the  result  if
      the  button  were released.   In some cases,  where an  underlying
      "snap"  grid  exists,  the  animated  object may  change  size  in
      discrete steps.

           Dragging  a non-handle area of a selected object  is  usually
      interpreted  as  the  beginning  of  a  move  function.   In  most
      applications,  a  move  of a single object may be started  without
      pre-selection.   Simply  beginning the drag on the object is taken
      to imply selection.  The spread hand, or "grabber", mouse form  is
      typically displayed during a drag operation.

           Dragging  may  denote copying or movement,  or  more  complex
      functions such as instantiation or generalization.   The operation
      implied by movement on the screen will differ among  applications,
      and  often  within  the  same  application,  depending  on  target
      location.   This  target is the recipient of the command's action,
      or its object, in an English grammar sense.

           For  example,  a  drag from window to window in  the  Desktop
      denotes a copy.   On the other hand, dragging the same icon to the
      trashcan  deletes it completely.   Dragging an object from the RCS
      partbox  to the editing view creates a new copy of that  prototype
      object.   Dragging  the  same object within the edit  view  simply
      changes its placement.

           There   are  some  mouse  actions  which   are   conventional
      "abbreviations".   A  double click on an object is interpreted  as
      both a selection and an action.   Usually, the double click action
      is the same as the Open entry in the "File" menu.

           When  the  usual interpretation of a drag is  movement,  then
      shift-drag  may be used as an enhanced varient  implying  copying.
      For  instance,  shift-dragging  an object within the  RCS  editing
      window  makes  a  copy of the object and places it  in  the  final
      location.

           To return to the beginning of this discussion, the reason for
      adopting  these conventional usages is to build an interface  that
      promotes  habits.   Particularly,  a  standard grammar for  giving
      commands helps answer the question "What comes next?".   It breaks
      the user's actions into logical phrases,  or chunks,  which may be
      thought of a whole, rather than one action at a time.

           DIFFERENT  FOLKS,   DIFFERENT  STROKES.    There  are  always
      exceptions to a rule,  or so it seems.   In this case, consistency
      of   the  interface  grammar  is  sometimes  traded  off   against
      consistency of metaphor,  preservation of screen space,  and "fast
      path" methods for experts.

           One example is the use of "tools" in Paint and Draw programs.
      In  such  programs,  an  initial  click is made on  a  tool  icon,
      denoting the operation to be applied to all following  selections.
      This is an prefix style of grammar,  and stands in contrast to the
      usual  method  of selecting subject object(s) first.   Because  of
      this contrast,  it is sometimes called "moding the cursor".   (Try
      applying the tests above to be sure it really is a mode.)

           In  these  cases,  there  are two reasons for  accepting  the
      nonstandard  method.   The first is consistency of metaphor.   The
      "user model" portrayed in the programs is an artist's work  table,
      with  tools,  palette,  and  so  on.   The cursor moding action is
      equivalent  to  picking up a working tool.   The second reason  is
      speed.   In a Paint program,  the "canvas" is often modified,  and
      speed  in  creating or changing the bits is  important.   In  more
      object  oriented applications such as Desktop or RCS,  the objects
      are more persistent.   Speed is then more essential when adding or
      changing properties of the objects.

           When  command  styles  are mixed in this  fashion,  you  must
      design very carefully to avoid conflicts or apparent  side-effects
      in  the  command language.   For example,  in GEM Draw picking  an
      action from the Edit menu cancels the current cursor mode  without
      warning.   Confusion  from  such side-effects may cancel  out  the
      benefits of the mixed grammar.

           The  subject  of command speed  deserves  further  attention.
      While  the  novice approaching a program needs  full  feedback,  a
      person who uses it day in and day out will learn the program,  and
      want  faster  ways  to get the job done,  even if  they  are  more
      arcane.  The gives rise to a "layered" style of interface.

           A layered interface is designed so that the visual grammar is
      obvious,  as  we  have discussed.   However, there are one or more
      sets of "accelerators" built into the program, which may be harder
      to find but faster to use.  One example is condensed mouse actions
      such  as the double-click.   For instance,  attempting to select a
      block  of text which extends beyond a window is  impossible  using
      the  basic metaphor.   The novice will simply do the operation  in
      pieces.   A  layered interface might put a less obvious Mark Begin
      and  Mark End option in the menus.   Another way is to take a drag
      which  extends outside the window as a request to begin  scrolling
      in that direction, while extending the current selection.

           One  of  the most common and useful  accelerator  methods  is
      function  keys.   Using  this approach,  single key equivalents to
      actions are listed in the menu.   Striking this key when an object
      is  selected  will cause the action to occur.   Note that this  is
      most  useful if some keyboard driven method of  object  selection,
      such as tabbing, is also available.  Otherwise, the time switching
      from  the  mouse,  used to select the object,  to the keyboard for
      command input, may well cancel any advantage.

           Finally,  radical  departures  from the GEM metaphor  may  be
      useful when attempting to replicate the look of another system, or
      trying  to  meet severe constraints,  such as display space.   One
      example  would  be discarding the standard GEM menus in  favor  of
      "popup"  menus which appear next to the current mouse position  in
      response  to  a click on the second button.   This method has  the
      advantage  of preserving the menu space at the top of the  screen,
      and  is potentially faster because the menu appears right next  to
      the  current mouse position.   The drawbacks are lack of a  visual
      cue for naive users trying to find the commands,  and the need for
      custom coding to build the popups.

           MORE  TO COME.   We have reached the end of the second sermon
      on  user  interface.   In a future column,  I will look at "higher
      level"  topics  relating to the design of the  application's  user
      metaphor.   These  include  issues of object  orientation,  direct
      manipulation,   and  the  construction  of  microworlds.   In  the
      meantime,  several  of  the  more practical columns  will  present
      implementions  of  techniques such as accelarator keys  and  popup
      menus which I have discussed this time.

           THANKS AND APOLOGIES to the following people whose public and
      published   remarks  have  formed  part  of  the  basis  of   this
      discussion:  Jef Raskin, Bill Buxton, Adele Goldberg, James Foley,
      and Ben Schneidermann.  As always, any errors are my own.