All of us have seen this:
private String mName;
That's because of me.
There, I said it: it's my fault.
This topic comes up again and again, and a discussion on reddit reminded me that I never explained where this notation came from, or how badly it has been misunderstood. I would therefore like to take the opportunity to clarify a few things, and I will do it in two parts:
- How m-notation appeared
- Why you probably do not understand what Hungarian notation is
M-notation
I was one of the first engineers working on Android, and I was assigned to develop a style guide for both the Android API (that is, for us, the Android team) and user code. At that time, we had few Java developers and little Java code, so it was important to have a guide in place before a large amount of code accumulated.
When it came to naming fields, I was a little biased. By then I had already written a fair amount of Java, Windows, and C++ code, and I had found that using a dedicated syntax for fields can be very useful. Microsoft uses m_ for this, while a leading underscore (such as _name) is common in C++. Ever since I started writing Java code, the fact that Java had moved away from this convention had bothered me.
But my task was to write a style guide for Java, in keeping with one of our goals from the first day of working on Android: to create a development platform where Java programmers would feel very comfortable.
So I put aside my prejudices and spent some time studying the internal Sun and Google style guides, and I came up with my own Android style guide, which was 99% what these two guides proposed, with a few very small changes.
One of the differences that I remember was related to braces. Although both style guides require curly braces everywhere, I introduced an exception for when the statement following the condition fits on one line. The idea of this exception was to accommodate the common logging idiom in Android:
if (Log.DEBUG) Log.d(tag, "Logging");
Without this exception, logging would take up a lot of screen space, which everyone agreed was undesirable.
So, this was the first version of our style guide, and it did not contain any requirements for field prefixes.
I sent the guide to the team and, to my surprise, no one liked it, precisely because it did not specify any syntax for fields. Everyone believed that field names should be standardized, and they would not accept a guide that had no such rule.
So I went back to the drawing board and considered several options for standardization.
I considered _name and m_name, as mentioned above, but rejected them because the underscore was too large a deviation from the Java standard. I came across several other, more exotic notations (for example, using the “iv” prefix for “instance variable”), but ultimately I rejected them all. Whatever I looked at, the “m” prefix kept coming back to me as the most reasonable and least verbose one.
So what was the obvious solution? Take the “m”, remove the underscore, and use camel case. Thus was born mName.
The team accepted this proposal, and we made it the official convention.
You probably don't understand Hungarian notation
Whenever there is a discussion about Hungarian notation (HN), I notice that most people seem to think that any time you add some metadata to an identifier, it is automatically HN. But this ignores the basic concept of HN and the very thoughtful design that Simonyi put into it when he came up with this notation.
First of all, there are many kinds of metadata that you can add to identifier names, and they belong to different categories. Here are the categories that I have identified so far (there may be more):
- Information about the type.
- Visibility information.
- Semantic information.
Let's take a look at them one by one.
Type information
This is perhaps the most common use of identifier metadata: naming a field so that its type can be recognized from the name. It is used throughout Win32/64 code, where you see names such as lpsz_name, meaning “Long Pointer to String with a Zero terminator.” Although this notation seems extremely verbose and difficult to read, Windows programmers decode it almost instantly in their heads, and the added information is genuinely useful for debugging the many obscure errors that can occur in the bowels of the Windows system, mainly because of the very dynamic nature of many of its APIs and its heavy dependency on C and C++.
Visibility information
This is what is used in Android: metadata that indicates which kind of variable you are dealing with: a field, a local variable, or a function parameter. It quickly became clear to me that whether a variable is a field is the most important thing to know about it, so I decided that we did not need further conventions to distinguish local variables from function parameters. Once again, note that this metadata has nothing to do with the type of the variable.
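To make the distinction concrete, here is a minimal sketch of the “m” prefix in use. The Person class and its members are hypothetical names chosen for illustration; they are not taken from the Android code base.

```java
// Hypothetical class, used only to illustrate the "m" field prefix.
class Person {
    // The "m" prefix marks an instance field.
    private String mName;

    Person(String name) {
        // The parameter and the field have different names,
        // so no "this." qualifier is needed to tell them apart.
        mName = name;
    }

    void setName(String name) {
        String trimmed = name.trim(); // local variable: no prefix
        mName = trimmed;              // field: prefixed
    }

    String getName() {
        return mName;
    }
}

class FieldPrefixDemo {
    public static void main(String[] args) {
        Person person = new Person("Ada");
        person.setName("  Grace ");
        System.out.println(person.getName());
    }
}
```

In setName, the parameter (name), the local variable (trimmed), and the field (mName) can all be told apart at a glance, and none of the names says anything about the type.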
Semantic information
This is, in fact, the least used kind of metadata and yet perhaps the most useful. It distinguishes variables that have identical or similar types, or identical or similar scopes, but different meanings.
This convention can be used when you need to distinguish variables of similar types that are used for different purposes. In most cases, a sensible name will get you there, but sometimes metadata is the only way out. For example, if you are developing a graphical interface that lets a user enter a name, you can end up with several views that could all be called “name”: an edit text (“textName”), a text view (“tvName”), buttons to confirm or cancel (“okName”, “cancelName”), and so on.
In such examples, it is important to make clear that all these identifiers relate to the same operation (editing a name) while differentiating their function (the metadata).
I hope you now have a more accurate idea of Hungarian notation, and I highly recommend reading Joel Spolsky’s article “Making Wrong Code Look Wrong” on this topic, which should help clarify all these points.
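The name-editing form described above could be sketched as follows. The TextView, EditText, and Button classes here are minimal hypothetical stand-ins so that the example compiles on its own; they are not the real Android widgets. The “m” field prefix is left out to keep the focus on the semantic prefixes.

```java
// Minimal stand-ins for GUI widgets; hypothetical, not real toolkit classes.
class TextView { String text = ""; }
class EditText { String text = ""; }
class Button {
    final String label;
    Button(String label) { this.label = label; }
}

// Hypothetical form for editing a name.
class NameForm {
    // Every identifier shares the "Name" stem (same operation);
    // the prefix says which role the widget plays in that operation.
    final TextView tvName = new TextView();    // read-only display of the saved name
    final EditText textName = new EditText();  // editable input
    final Button okName = new Button("OK");
    final Button cancelName = new Button("Cancel");

    // Pressing OK copies the edited text into the display.
    void confirm() { tvName.text = textName.text; }

    // Pressing Cancel restores the last confirmed value in the input.
    void cancel() { textName.text = tvName.text; }
}

class NameFormDemo {
    public static void main(String[] args) {
        NameForm form = new NameForm();
        form.textName.text = "Ada";
        form.confirm();
        System.out.println(form.tvName.text);
    }
}
```

Without the prefixes, four members would be competing for the single name “name”, and a line such as tvName.text = textName.text would be much harder to follow.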
So what do I think of Hungarian notation?
First of all, I think we need to stop using the term “Hungarian notation” because it is too vague. When this question comes up, I usually ask people to clarify which of the three options listed above they are talking about (and in most cases they are not sure and need time to think about it).
I simply use the term identifier metadata to describe the general idea of adding information to a plain identifier name. In general, I think this approach can have advantages in each of the cases listed. I don't think it should be used everywhere by default, but it is definitely useful, especially in the graphical interface example I described above. I run into such examples on a regular basis, and not using identifier metadata for this kind of code makes it harder to read (both for the author and for future readers) and to maintain.
I also disagree with the argument: “Today, our IDEs can distinguish all these identifiers with colors so that we no longer need to do this ourselves.” This argument is erroneous for two reasons:
- Code is often read outside an IDE (starting, ironically, with the screenshot from the reddit discussion, which has no syntax highlighting). I read code in browsers, terminals, diff tools, git tools, and so on. Most of them have no highlighting to make the code easier to analyze, so identifier metadata can help in such cases.
- Highlighting in the IDE still does not help you with ambiguous cases such as the graphical interface example described above. There are still cases where you, the developer, know more about your code than the IDE can, and adding identifier metadata is the only sensible choice you can make.
Do not listen to people who tell you that identifier metadata should never be used or should always be used. This kind of naming is just one tool in your developer toolbox, and common sense should make it relatively easy to determine when it is time to add some metadata to your identifiers.
Finally, I often see strong reactions to this issue. In the 30 years I have been writing code, I have noticed that after several days of writing code under a new style guide, you simply stop noticing it and follow it completely. There were times when I could not tolerate code that was not indented with two spaces, and after a few months on a project that used four spaces, I felt the opposite. The same thing happens with naming conventions. You will get used to anything as long as the conventions are applied consistently across the code base you are working on.