Monday, December 10, 2012

Hooks, local variables, and namespaces

Sometimes you want to customize variables depending on which mode you are in. The usual way to do this is through hooks: variables that hold a list of functions to call when some event occurs. Most modes have a hook that runs when the mode is enabled, which lets you customize on a per-mode basis. For example, here's how I currently customize C++ and Java modes:

(add-hook 'c++-mode-hook (lambda ()
                           (setq fill-column 80)
                           (fci-mode 1)
                           (electric-pair-mode)
                           ;; compatible with fci-mode
                           (setq whitespace-style '(face trailing))))
(add-hook 'java-mode-hook (lambda ()
                            (setq fill-column 80)
                            (fci-mode 1)
                            (electric-pair-mode)))

This means that when I load or create a C++ file, Emacs sets fill-column to 80, enables fci-mode (which displays a marker that shows when you are over the fill-column), enables electric-pair-mode (which inserts the closing paren or similar delimiter when you type the opening one), and sets whitespace-style so that trailing whitespace is visible.
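To make the mechanics concrete, here is a minimal sketch of a hook as an ordinary variable holding a list of functions. The names my/demo-hook, my/demo-log, my/demo-first and my/demo-second are hypothetical, made up for this illustration; add-hook and run-hooks are the standard functions.

```elisp
;; Hypothetical hook and helper names, for illustration only.
(defvar my/demo-hook nil
  "A made-up hook: just a variable holding a list of functions.")

(defvar my/demo-log nil
  "Records which demo functions have run, most recent first.")

(defun my/demo-first () (push 'first my/demo-log))
(defun my/demo-second () (push 'second my/demo-log))

;; add-hook puts each new function at the front of the list by default.
(add-hook 'my/demo-hook #'my/demo-first)
(add-hook 'my/demo-hook #'my/demo-second)

;; The hook variable is now the list (my/demo-second my/demo-first).
;; run-hooks calls each function in that order, with no arguments.
(run-hooks 'my/demo-hook)
```

A major mode does much the same thing with its own hook, such as c++-mode-hook, when the mode is enabled.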

To use hooks properly, you have to understand global and local variables. In elisp, variables are global by default, but there are different kinds of global variables. One is the truly global variable: if you set it in one buffer, it is set for all buffers. Some variables are local in particular buffers, so they have the same value in all buffers except the ones in which make-local-variable was called. Finally, there's the variable that automatically becomes local to a buffer whenever it is set, because it was set up with make-variable-buffer-local. There's also a similar make-variable-frame-local, though it has long been marked obsolete.

When you set a variable in a hook, you are setting it in the context of a buffer, and you want to ensure that it is set just in that buffer. So it needs to be either automatically buffer-local, or you need to make it local yourself. The easiest way to check whether a variable is local or buffer-local is to look at its help page with C-h v. If the variable is local, it will be noted there. You can check programmatically with local-variable-if-set-p, which returns a true value if the variable will be local when set. Similarly, local-variable-p returns true if the variable already has a local value in the current buffer.
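The three kinds of variable can be compared side by side. This is only a sketch: the three my/demo-... variables are hypothetical names invented for the comparison, and with-temp-buffer stands in for "some buffer in which a hook is running".

```elisp
;; All variable names here are hypothetical, for illustration only.
(defvar my/demo-global 'default)
(defvar my/demo-made-local 'default)
(defvar my/demo-auto-local 'default)

;; Mark one variable so that setting it always makes it buffer-local.
(make-variable-buffer-local 'my/demo-auto-local)

(with-temp-buffer
  ;; 1. Plain global: setq changes the value seen by every buffer.
  (setq my/demo-global 'changed)
  ;; 2. Made local by hand: only this buffer sees the new value.
  (make-local-variable 'my/demo-made-local)
  (setq my/demo-made-local 'changed)
  ;; 3. Automatically local: setq creates the local binding itself.
  (setq my/demo-auto-local 'changed))

;; Back outside the temporary buffer, only the plain global has changed.
(list my/demo-global my/demo-made-local my/demo-auto-local)
;; => (changed default default)
```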

fill-column is a variable that becomes local whenever it is set, as its documentation mentions, so it is safe to set in a hook. However, the documentation for whitespace-style mentions no such thing, and if we check manually in ielm, we can confirm that it is not automatically local:

ELISP> (local-variable-if-set-p 'fill-column)
t
ELISP> (local-variable-if-set-p 'whitespace-style)
nil

This looks like a bug in my customization code, then. Whenever I load a C++ file, I'm changing the global value of whitespace-style! We can fix this by making the variable buffer-local before we set it:

(add-hook 'c++-mode-hook (lambda ()
                           (setq fill-column 80)
                           (fci-mode 1)
                           (electric-pair-mode)
                           ;; compatible with fci-mode
                           (set (make-local-variable 'whitespace-style)
                                '(face trailing))))

This relies on the fact that make-local-variable returns the variable it was given (the symbol itself), so its result can be passed directly to set.
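Two things may be worth seeing directly here. The first form shows what make-local-variable hands back. The second is a possible alternative using setq-local, which assumes Emacs 24.3 or later; my/show-trailing-whitespace is a hypothetical name for this sketch, not part of my configuration.

```elisp
;; make-local-variable returns the variable (a symbol), so set can use it.
(with-temp-buffer
  (make-local-variable 'whitespace-style)) ; returns the symbol whitespace-style

;; Hypothetical function name; assumes Emacs 24.3 or later for setq-local.
(defun my/show-trailing-whitespace ()
  "Show only trailing whitespace, in the current buffer only."
  ;; setq-local makes the variable buffer-local and sets it in one step.
  (setq-local whitespace-style '(face trailing)))
```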

We can improve this further. If you look at the C++ and Java versions, you can see that there is considerable overlap. It's useful to put everything in named functions, so that if there's any issue, we can simply redefine the function to change the behavior. If you use a lambda instead and there's an issue with it, you have to remove it from the hook manually and re-add it after fixing.
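The difference is easy to reproduce. In this sketch the hooks my/demo-mode-hook and my/demo-named-hook and the function my/demo-setup are hypothetical names; the point is only what ends up on each hook after a "fix".

```elisp
;; Hypothetical hook and function names, for illustration only.
(defvar my/demo-mode-hook nil)

;; Evaluate a lambda version, then "fix" it and evaluate again.
(add-hook 'my/demo-mode-hook (lambda () (setq fill-column 80)))
(add-hook 'my/demo-mode-hook (lambda () (setq fill-column 100)))
;; Both lambdas are now on the hook; the stale one stays until it is
;; removed by hand.
(length my/demo-mode-hook)                      ; => 2

;; With a named function, the hook only stores the symbol.
(defvar my/demo-named-hook nil)
(defun my/demo-setup () (setq fill-column 80))
(add-hook 'my/demo-named-hook #'my/demo-setup)
(defun my/demo-setup () (setq fill-column 100)) ; redefine to fix
(add-hook 'my/demo-named-hook #'my/demo-setup)  ; no duplicate is added
(length my/demo-named-hook)                     ; => 1
```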

Here's our factored code:

(defun ash/c-like-initialization ()
  (setq fill-column 80)
  (fci-mode)
  (electric-pair-mode))

(defun ash/show-trailing-whitespace ()
  (set (make-local-variable 'whitespace-style)
       '(face trailing)))

(add-hook 'c++-mode-hook 'ash/c-like-initialization)
(add-hook 'c++-mode-hook 'ash/show-trailing-whitespace)
(add-hook 'java-mode-hook 'ash/c-like-initialization)

This is much cleaner, and now it's much easier to add new behaviors to either C++ or Java mode.
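For instance, extending the same setup to more modes could look like the sketch below. It assumes the two ash/ functions are defined as in the code above; applying both of them to c-mode-hook and java-mode-hook as well is a hypothetical choice made for illustration, not something I currently do.

```elisp
;; Assumes ash/c-like-initialization and ash/show-trailing-whitespace are
;; already defined. The list of hooks is a hypothetical example.
(dolist (hook '(c-mode-hook c++-mode-hook java-mode-hook))
  ;; add-hook skips functions that are already on the hook, so running
  ;; this after the add-hook calls above does not create duplicates.
  (add-hook hook #'ash/c-like-initialization)
  (add-hook hook #'ash/show-trailing-whitespace))
```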

You may notice the code above has functions that start with ash/, such as ash/show-trailing-whitespace. Elisp allows all sorts of characters in identifiers, including slashes. It's wise to use a personal prefix in your elisp, so that nothing you do conflicts with built-in functions or packages you may have loaded. I used to use the prefix with a dash, but I've recently seen many uses of the slash, and agree it's better. The slash makes the namespacing clear.

With the refactorings in mind, here are the key ideas to remember when working with your own configuration file:
  1. When setting variables in a hook, make sure each variable is a local variable. If not, make it a local variable in the code you are adding to the hook.
  2. Prefer functions to lambdas in hooks.
  3. Use a namespace, separated with a slash, for your named functions and variables.
