PyBeautify

A Python source code cleanup utility

Copyright © 2010, P. Lutus

Discussion | Licensing, Source | Revision History | Program Listing


Discussion

I may have mentioned that I don't take a language seriously unless I can create a beautifier for its source files, preferably written in the language itself. My Ruby beautifier has become quite popular, and writing it helped me learn many of that language's traits. I even wrote a beautifier for Bash scripts when I was writing a lot of those, but I decided against trying to write it as a Bash shell script.

I had resisted taking up Python for a long time because of one of its less desirable characteristics — whitespace is syntactically significant. I regard this as an abomination, but over time I got involved with some projects that relied on Python (Sage and Blender among others). I eventually weakened and started writing in Python, and I have decided it's worth its defects.

I wouldn't be emphasizing the whitespace issue except that PyBeautify has to work around its implications. Unlike beautifiers for other languages, PyBeautify can only change the overall indentation of a program (and a few other things). It can't use the language's block syntax tokens to control the indentation, because Python's block syntax is controlled by indentation, not by tokens.
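
To illustrate why a uniform change of scale is safe while any relative change is not, here is a small sketch that is not part of PyBeautify. It uses the standard `ast` module to compare parsed programs; the sample sources and the helper name `same_structure` are invented for the illustration.

```python
import ast

def same_structure(a, b):
    """Return True if two source strings parse to the same syntax tree."""
    # ast.dump() omits line and column positions by default,
    # so only the program structure is compared.
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))

# The same program indented by two and by four spaces per level.
two = "for i in range(3):\n  if i:\n    print(i)\n  print('x')\n"
four = "for i in range(3):\n    if i:\n        print(i)\n    print('x')\n"

# Moving one line to a different level puts it in a different block.
moved = "for i in range(3):\n  if i:\n    print(i)\n    print('x')\n"

print(same_structure(two, four))   # True: uniform rescaling keeps the structure
print(same_structure(two, moved))  # False: a relative change alters the program
```

This is why a tool of this kind can rescale indentation but has no independent way to decide which level a line ought to be on.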

Here is what PyBeautify does:

  • In pass one, PyBeautify scans a source file and determines the indentation unit the file uses (one or more spaces), taken as the smallest nonzero indentation found on any non-blank line.

  • In pass two, PyBeautify indents the program based on either PyBeautify's default indentation of two spaces or a user-entered specification. This feature can be used to reliably change a file's overall indentation from one standard to another, and any indentation between 1 and 64 spaces can be specified.

  • PyBeautify also checks the program's indentation for consistency. The assumption is that every line's indentation is a whole multiple of one basic unit, say four spaces.

  • If PyBeautify finds any indentation inconsistencies, for each one it prints a warning with a file name and a line number, but it doesn't try to change the indentation.

  • PyBeautify also expands all tabs to spaces, using eight-column tab stops, and strips trailing whitespace from each line. I think it's generally accepted that tabs should be removed from the world of computing. PyBeautify does its little part.
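
The two passes described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the program itself: the function names `detect_unit` and `reindent` are hypothetical, and lines whose indentation is not a whole multiple of the unit are simply rounded down to the nearest level, which may differ from what the full listing does.

```python
def detect_unit(lines):
    """Pass one: smallest nonzero indentation among non-blank lines, or None."""
    widths = [len(s) - len(s.lstrip()) for s in lines if s.strip()]
    widths = [w for w in widths if w > 0]
    return min(widths) if widths else None

def reindent(text, new_unit=2):
    """Pass two: rescale each line's indentation from the detected unit to new_unit."""
    # Expand tabs (eight-column tab stops by default) and drop trailing whitespace.
    lines = [s.expandtabs().rstrip() for s in text.splitlines()]
    unit = detect_unit(lines)
    out = []
    for s in lines:
        content = s.lstrip()
        width = len(s) - len(content)
        # Number of whole indentation levels (assumption: round down).
        levels = width // unit if unit else 0
        out.append(' ' * (levels * new_unit) + content)
    return '\n'.join(out) + '\n'

# A four-space source with one tab-indented line, converted to two spaces.
src = "def f(x):\n    if x:\n\treturn 1\n    return 0\n"
print(reindent(src, 2), end='')
```

In the sample, the tab expands to eight columns, which counts as two levels of the detected four-space unit, so that line ends up four spaces deep in the two-space output.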

Here is what PyBeautify won't do:

  • Make your source files beautiful (the program's name is more a tradition than a description), unless you regard removal of tabs as a move toward beauty (as I do).

  • Change the indentation of lines it thinks are errors. It will print a warning message for each one, but any changes are up to you.
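
As a minimal sketch of the report-only consistency check, assuming the indentation unit is already known, the following hypothetical helper `find_inconsistent` returns the numbers of the suspect lines and leaves the decision about what to do with them to the caller.

```python
def find_inconsistent(lines, unit):
    """Return 1-based numbers of lines whose indentation is not a multiple of unit."""
    bad = []
    for n, s in enumerate(lines, start=1):
        s = s.expandtabs().rstrip()
        width = len(s) - len(s.lstrip())
        if width % unit != 0:
            bad.append(n)
    return bad

# Line 3 is indented six spaces in a file that otherwise uses four.
sample = ["if a:", "    b = 1", "      c = 2", "d = 3"]
print(find_inconsistent(sample, 4))  # [3]
```

Blank lines have zero indentation after trailing whitespace is removed, so they never trigger a warning.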

Here's how to use PyBeautify:

  • Use as a stream filter:
    ./pybeautify.py - < input.py > output.py
                  
  • Specify an indentation other than two spaces:
    ./pybeautify.py 4 - < input.py > output.py
                  
  • Replace a file in place, specifying an indentation of 4 spaces (makes a backup copy):
    ./pybeautify.py 4 input.py
                  
  • Process all Python files in a directory in the same way:
    ./pybeautify.py 4 *.py
                  

Licensing, Source

PyBeautify is released under the GNU General Public License. Here is the plain-text source file without line numbers.

Revision History

  • Version 1.0 12/01/2010. Initial Public Release.

Program Listing

#!/usr/bin/env python
# -*- coding: utf-8 -*-

# Version 1.0 12/01/2010

# ***************************************************************************
# *   Copyright (C) 2010, Paul Lutus                                        *
# *                                                                         *
# *   This program is free software; you can redistribute it and/or modify  *
# *   it under the terms of the GNU General Public License as published by  *
# *   the Free Software Foundation; either version 2 of the License, or     *
# *   (at your option) any later version.                                   *
# *                                                                         *
# *   This program is distributed in the hope that it will be useful,       *
# *   but WITHOUT ANY WARRANTY; without even the implied warranty of        *
# *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the         *
# *   GNU General Public License for more details.                          *
# *                                                                         *
# *   You should have received a copy of the GNU General Public License     *
# *   along with this program; if not, write to the                         *
# *   Free Software Foundation, Inc.,                                       *
# *   59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.             *
# ***************************************************************************

import re, sys, shutil

class PyBeautify:

  def __init__(self):
    self.default_indent = 2

  # split line into indent and content
  def parse_line(self,s):
    indent,content = re.search(r'^(\s*)(.*)$',s).groups()
    return indent,len(indent),content

  def parse_stream(self,stream,path,indv):
    # expand tabs to spaces and strip trailing whitespace
    lines = [line.expandtabs().rstrip() for line in stream.readlines()]

    # pass 1: find the minimum indent
    mi = 1000000
    for line in lines:
      if(re.search(r'\S',line)): # only non-blank lines
        indent,li,content = self.parse_line(line)
        if(li > 0 and li < mi): mi = li

    # pass 2: create output string with specified indentation
    output = []
    for n,line in enumerate(lines):
      indent,li,content = self.parse_line(line)
      if(li % mi != 0): # if indentation is not a multiple of mi
        sys.stderr.write("Warning: inconsistent indentation in line %d of file \"%s\".\n"
        % (n+1,path))
      iv = li * indv // mi # create indent value (integer division)
      output.append("%s%s" % (' ' * iv,content))
    return '\n'.join(output) + '\n'

  def parse_file(self,path,indv):
    if (path == '-'): # stdin, stdout
      # write directly so no extra newline is appended
      sys.stdout.write(self.parse_stream(sys.stdin,path,indv))
    else: # it's a file
      try: # making a backup copy
        shutil.copyfile(path,path+"~")
      except Exception: # backup failed
        sys.stderr.write("Error: unable to create backup copy of file \"%s\", quitting.\n"
        % path)
        sys.exit(1)
      with open(path) as fh: # read the file
        output = self.parse_stream(fh,path,indv)
      with open(path,'w') as fh: # write the result
        fh.write(output)

  def process(self):
    sys.argv.pop(0) # drop program name
    if (not sys.argv): # no program arguments
      sys.stderr.write("Usage: [indent default %d] filenames or \"-\" for stream\n"
      % self.default_indent)
      sys.exit(0)
    else:
      try: # is the first argument a number?
        indent = int(sys.argv[0])
        sys.argv.pop(0) # drop the number
      except ValueError: # not a number, probably a file name
        indent = self.default_indent
      if(indent <= 0 or indent > 64): # test of acceptable indentations
        sys.stderr.write("Error: bad indent entry value: \"%d\", quitting.\n"
        % indent)
        sys.exit(1)
      for path in sys.argv:
        self.parse_file(path,indent)


if __name__ == '__main__':
  PyBeautify().process()
 
