Naming convention for actually choosing the words in Python, PEP8 compliant

I’m looking for a better way to name everything in Python. Yes, I’ve read PEP8, Spolsky’s wonderful rant, and various other articles. But I’m looking for more guidance in choosing the actual words.

And yes I know

A Foolish Consistency is the Hobgoblin of Little Minds.

But, you can keep consistent with PEP8 etc, and still not have consistent variable/method/class names which are easy to remember. By consistent, I mean that if you were presented with the same decision twice, you would produce the same name.

As an example, there are multiple PEP8 compliant ways to name the items below:

  • number of columns in the table
  • current column number
  • column object
  • sum of column

Yeah, sure, it is easy to make a decision to use something like num_col and count_col rather than col_num and col_count (or vice versa). But I would like to see an example that has seen some testing/refining over time. I often start with a given convention, and then it starts to break down as I venture into a new area.

I guess what I am looking for is not only what the prefix/root/tag/suffix should do (which was partly covered for Apps Hungarian in the Spolsky article), but (many) examples for each, or a rule for generating each.

14.10.2009 00:21:52

I believe that the need for complex variable naming conventions goes away with good object-oriented design. In the Spolsky article, much of the focus is on how variable naming helps prevent errors. I believe that those errors occur more often when you have many variables in the same scope; this can be avoided by grouping data into objects. Then a single naming context has only a few variables, which don't need combined names.

The other purpose of a naming convention is to make names easier to remember. Again, object orientation helps (by hiding much of the data from users who look from the outside); what you then need is a convention for naming methods, not data. In addition, tools that list the names available in a certain scope can help (again, those tools rely on object orientation to do their job).

In your specific example, if column is an object, I would expect that len(table) gives me the number of columns in a table, sum(column) or column.sum() gives me its sum; and the current column is just the variable in the for loop (often c or column).
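
To make that concrete, here is a minimal sketch of what such objects might look like. The Table and Column classes are hypothetical, written only for illustration; they assume a table is simply a sequence of columns and a column is a sequence of numbers.

```python
class Column:
    """A named column of numeric values (hypothetical class for illustration)."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)

    def __iter__(self):
        # Being iterable is what lets the built-in sum(column) work.
        return iter(self.values)

    def sum(self):
        return sum(self.values)


class Table:
    """A table, treated here as a sequence of columns (an assumption)."""

    def __init__(self, columns):
        self.columns = list(columns)

    def __len__(self):
        # len(table) is the number of columns.
        return len(self.columns)

    def __iter__(self):
        return iter(self.columns)


table = Table([Column("price", [1, 2, 3]), Column("qty", [10, 20])])

# The "current column" is just the loop variable; no compound name is needed.
for column in table:
    print(column.name, column.sum())
```

With this shape, none of the four items from the question needs a compound variable name: the count is len(table), the sum is column.sum() or sum(column), and the current column is the loop variable.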

14.10.2009 02:01:33
Good point about focusing on minimizing the number of variables in any given scope. It certainly helps, but as you point out, method naming is still an issue. Especially when using modules which you haven't used for a long time. A good IDE can help only so much.
nazca 14.10.2009 13:34:07

The universe is multi-dimensional.

You have at least two dimensions to each variable name.

"Total", "Count", "Of Columns", "In a Table"

"Current", "Index", "", "Of a Column"

"Current", "Column", "", ""

"Sum", "Of Something", "", "In a Column"

Rats. It's irregular.

Worse, we can pick anything as the "Primary" dimension and pick any sequence of other features as "secondary" dimensions.

Even worse, we could have a truly complex thing. "Total", "Count", "of Non-Underscore", "Columns", "In Tables", "With Even-Length Names", "From a Dictionary", "Keyed by", "Mother's Maiden Name".

Frankly, there's no possible schema for variable names that encompasses "all" knowledge in a systematic, repeatable form.

Keep trying though. It's always fun and games until someone finds a counter-example.

You can keep trying or you can simply use simple, clear names. If your scope of names is small (a small method function, for example), there's nothing to "remember". It's all perfectly visible in the 20 lines of code that make up the method function.

14.10.2009 01:57:34

Remember that in English, when two ambiguous words are next to each other, the first one becomes an adjective that describes the second one. Try to stick to this rule and always name things with two components, where the first component describes the second.

For instance col_num is a number. What kind of number? A column number.

The next rule is the word of. It is a nice short word, so please do not leave it out. The same goes for pluralization, and for past-tense endings in -ed or -d. Maybe even -ing.

For instance, num_col is a column. What kind? A number column. If you really wanted to refer to the number of columns, you should write num_of_cols. Received date is recd_date or rcvd_date, not rec_date or rcv_date.

English-speaking readers are very sensitive to -s and -d at the end of words, as well as to of in the middle of a phrase, so don't assume that such short bits of text would go unnoticed. That is very unlikely, because we are programmed from a very young age to notice a handful of word endings.
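
As a rough illustration of the rules above, the names might look like this in practice. The column list and the date are arbitrary values made up for the example; only the names col_num, num_of_cols and recd_date come from the text.

```python
from datetime import date

cols = ["id", "name", "price"]    # plural: a list of columns

num_of_cols = len(cols)           # "number of columns": "of" and the "-s" are kept

labels = []
for col_num, col in enumerate(cols):
    # col_num is a number (what kind? a column number); col is the column itself.
    labels.append(f"{col_num}:{col}")

recd_date = date(2020, 1, 31)     # "received date": the "-d" survives the abbreviation

print(num_of_cols, labels, recd_date)
```

Read aloud, each name still sounds like the phrase it stands for, which is the point of keeping the short words and endings.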

Try for consistency, and keep a glossary or data dictionary of any words or word fragments that you use. Don't use the same fragment for two different things. If you are using recd to mean received, and you need a variable name for recorded, then either write it out in full or come up with a new abbreviation and put it in the glossary. If you use a relational database, try to be consistent with the naming convention used there. Let the DBA know what you are doing and tell them where to find your glossary.
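
One possible way to keep such a glossary is as a plain dictionary that a small script can check names against. This is only a sketch: GLOSSARY, expand_name and unknown_fragments are hypothetical names invented for this example, and the entries are taken from the fragments mentioned above.

```python
# Hypothetical glossary: each fragment maps to exactly one meaning.
GLOSSARY = {
    "col": "column",
    "cols": "columns",
    "num": "number",
    "of": "of",
    "recd": "received",
    "date": "date",
}


def expand_name(name, glossary=GLOSSARY):
    """Spell out an underscore-separated name using the glossary.

    Raises KeyError if a fragment is not in the glossary.
    """
    return " ".join(glossary[fragment] for fragment in name.split("_"))


def unknown_fragments(name, glossary=GLOSSARY):
    """Return the fragments of a name that the glossary does not define."""
    return [fragment for fragment in name.split("_") if fragment not in glossary]


print(expand_name("num_of_cols"))     # number of columns
print(expand_name("recd_date"))       # received date
print(unknown_fragments("rec_date"))  # ['rec']
```

Because dictionary keys are unique, a fragment cannot quietly pick up a second meaning: adding "recorded" forces you to choose a different abbreviation or write the word out in full.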

17.10.2009 19:59:39