Excerpted from: http://www.cnblogs.com/WuCountry/archive/2007/02/25/656433.html
Effective C# Item 9: Understand the Relationships Among the Equality Operations (Translation)
Item 9: Understand the Relationships Among ReferenceEquals(), static Equals(), instance Equals(), and operator==
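When you create your own types (whether classes or structs), you define what equality means for those types. C# provides four different methods for determining whether two objects are equal: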
public static bool ReferenceEquals( object left, object right );
public static bool Equals( object left, object right );
public virtual bool Equals( object right );
public static bool operator==( MyClass left, MyClass right );
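The language lets you create your own versions of all four of these methods. But just because you can doesn't mean that you should. You should never redefine the first two. You will often write your own instance Equals() method to define the semantics of your type, and you will occasionally overload operator==(), but only for performance reasons in value types. Furthermore, these four methods are related, so changing one can affect the behavior of the others. Yes, needing four methods to test equality is complicated. But don't worry: you can simplify it.
Like so many of the complicated elements in C#, this one follows from the fact that C# lets you create both value types and reference types. Two variables of a reference type are equal if they refer to the same object, which is called object identity. Two variables of a value type are equal if they have the same type and the same contents. That is why equality tests need so many methods.
Let's start with the two methods you will probably never change. Object.ReferenceEquals() returns true when two variables refer to the same object, that is, when the two variables have the same object identity. Whether the types being compared are reference types or value types, this method always tests object identity, not object contents. Yes, that means ReferenceEquals() always returns false when you use it to test value types for equality. Even when you compare a value type variable to itself, it returns false. This is due to boxing, which is discussed in Item 16. (Translator's note: the parameters are object references, so calling this method with two value types boxes each argument separately, and the two resulting boxed objects are never the same reference.)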
int i = 5;
int j = 5;
if ( Object.ReferenceEquals( i, j ))
    Console.WriteLine( "Never happens." );
else
    Console.WriteLine( "Always happens." );
if ( Object.ReferenceEquals( i, i ))
    Console.WriteLine( "Never happens." );
else
    Console.WriteLine( "Always happens." );
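You will never redefine Object.ReferenceEquals(), because it already does exactly what it is supposed to do: test the object identity of two variables.
The second method you will never redefine is the static Object.Equals(). This method tests whether two variables are equal when you don't know the runtime type of the two arguments. Remember that System.Object is the ultimate base class for everything in C#. Anytime you compare two variables, they are instances of System.Object. So how does this method test two variables for equality without knowing their types, when the meaning of equality depends on the type? The answer is simple: the method delegates that responsibility to one of the types being compared. The static Object.Equals() method is implemented something like this: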
public static bool Equals( object left, object right )
{
    // Check object identity
    if (left == right )
        return true;
    // both null references handled above
    if ((left == null) || (right == null))
        return false;
    return left.Equals (right);
}
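This example code introduces two methods I have not discussed yet: operator==() and the instance Equals() method. I will explain both in detail, but I am not ready to end the discussion of the static Equals() just yet. For now, I want you to understand that the static Equals() uses the instance Equals() method of the left argument to determine whether two objects are equal.
As with ReferenceEquals(), you will never redefine the static Object.Equals() method, because it already does exactly what it needs to do: determine whether two objects are equal when you don't know their runtime types. Because the static Equals() method delegates to the left argument's instance Equals(), it uses the rules for that type.
Now you understand why you never need to redefine the static ReferenceEquals() and static Equals() methods. It is time to discuss the methods you will override. But first, let's briefly discuss the mathematical properties of an equality relation. You need to make sure that your implementation is consistent with what other programmers expect. This means keeping in mind the mathematical properties of equality: it is reflexive, symmetric, and transitive. The reflexive property means that any object is equal to itself: no matter what type is involved, a == a is always true. The symmetric property means that if a == b is true, then b == a must also be true. The transitive property means that if a == b and b == c are both true, then a == c must also be true.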
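Now it is time to discuss the instance Object.Equals() function, including when you should override it. You create your own instance version when the default behavior is inconsistent with your type. The Object.Equals() method uses object identity to determine whether two variables are equal. The default Object.Equals() function behaves exactly the same as Object.ReferenceEquals(). But note that value types are different. System.ValueType does override Object.Equals(). Remember that System.ValueType is the base class for all value types that you create (using the struct keyword). Two variables of a value type are equal if they have the same type and the same contents, and ValueType.Equals() implements that behavior. Unfortunately, ValueType.Equals() is not an efficient implementation. It is defined on the base class of all value types (translator's note: that is, the comparison is performed in the base class). To provide the correct behavior, it must compare all the member variables of any derived type without knowing that type. In C#, that means using reflection. As you will see in Item 44, reflection has many disadvantages, especially when performance is a goal.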
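Equality is one of those fundamental constructs that gets called frequently in programs, so performance is a worthy goal. In almost all cases, you can write a much faster override of Equals() for any value type. The recommendation is simple: always override ValueType.Equals() whenever you create a value type.
You should override the instance Equals() function for a reference type only when you want to change the semantics it defines. A number of classes in the .NET Framework Class Library use value semantics instead of reference semantics. Two string objects are equal if they contain the same contents. Two DataRowView objects are equal if they refer to the same DataRow. The point is that if your type should follow value semantics (comparing contents) instead of reference semantics (comparing object identity), you should write your own override of the instance Object.Equals() method.
Now that you know when to write your own override of Object.Equals(), you need to understand how to implement it. The equality relationship for value types has many implications for boxing, which is discussed in Item 17. For reference types, your instance method needs to follow the predefined behavior (translator's note: the mathematical properties of equality described earlier) to avoid surprising the users of your class. Here is the standard pattern: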
public class Foo
{
    public override bool Equals( object right )
    {
        // check null:
        // the this pointer is never null in C# methods.
        if (right == null)
            return false;
        if (object.ReferenceEquals( this, right ))
            return true;
        // Discussed below.
        if (this.GetType() != right.GetType())
            return false;
        // Compare this type's contents here:
        return CompareFooMembers( this, right as Foo );
    }
}
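First, Equals() should never throw exceptions; it doesn't make much sense. Two variables are either equal or not equal; there is not much room for other failures. Just return false for all failure conditions, such as null references or the wrong argument types. Now, let's go through this method in detail so you understand why each check is there and which checks can be left out. The first check determines whether the right-side object is null. There is no such check on the this reference: in C#, this is never null, because the CLR throws an exception before calling any instance method through a null reference. The next check determines whether the two object references are the same, by testing object identity. It is a very efficient test, and equal object identity guarantees equal contents.
The next check determines whether the two objects being compared are the same type. The exact form is important. First, notice that it does not assume that this is of type Foo; it calls this.GetType(), because the actual type might be a class derived from Foo. Second, the code checks the exact type of the objects being compared. It is not enough to ensure that you can convert the right-side parameter to the current type. That weaker test can cause two subtle bugs. Consider the following example involving a small inheritance hierarchy: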
public class B
{
    public override bool Equals( object right )
    {
        // check null:
        if (right == null)
            return false;
        // Check reference equality:
        if (object.ReferenceEquals( this, right ))
            return true;
        // Problems here, discussed below.
        B rightAsB = right as B;
        if (rightAsB == null)
            return false;
        return CompareBMembers( this, rightAsB );
    }
}
public class D : B
{
    // etc.
    public override bool Equals( object right )
    {
        // check null:
        if (right == null)
            return false;
        if (object.ReferenceEquals( this, right ))
            return true;
        // Problems here.
        D rightAsD = right as D;
        if (rightAsD == null)
            return false;
        if (base.Equals( rightAsD ) == false)
            return false;
        return CompareDMembers( this, rightAsD );
    }
}
//Test:
B baseObject = new B();
D derivedObject = new D();
// Comparison 1.
if (baseObject.Equals(derivedObject))
    Console.WriteLine( "Equals" );
else
    Console.WriteLine( "Not Equal" );
// Comparison 2.
if (derivedObject.Equals(baseObject))
    Console.WriteLine( "Equals" );
else
    Console.WriteLine( "Not Equal" );
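Under any possible circumstances, you would expect to see either Equals printed twice or Not Equal printed twice. Because of some errors, that is not the case with the previous code. The second comparison will never return true: the base object, of type B, can never be converted to a D. The first comparison, however, might return true. The derived object, of type D, can be implicitly converted to type B. If the B members of the right-side argument match the B members of the left-side argument, B.Equals() considers the objects equal. You have broken the symmetric property of equality. This construct broke because of the automatic conversions that take place up and down the inheritance hierarchy.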
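When you write this, the D object is implicitly converted to a B:
baseObject.Equals( derivedObject )
If baseObject.Equals() determines that the fields defined in its own type match, it considers the two objects equal. On the other hand, when you write this, the B object cannot be converted to a D:
derivedObject.Equals( baseObject )
The derivedObject.Equals() method therefore always returns false. If you don't check the object types exactly, you can easily end up in this situation, in which the order of the comparison matters.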
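There is another practice to follow when you override Equals(). You should call the base class version only if it is not the one provided by System.Object or System.ValueType. The previous code provides an example. Class D calls the Equals() method defined in its base class, class B. Class B, however, does not call base.Equals(). Doing so would call the version defined in System.Object, which returns true only when the two arguments refer to the same object. That is not what you want, or you would not have written your own method in the first place.
The rule is to override Equals() whenever you create a value type, and to override Equals() on a reference type when you do not want it to obey the reference semantics defined by System.Object. When you write your own Equals(), follow the implementation just outlined. Overriding Equals() means that you should also override GetHashCode(); see Item 10 for details.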
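Three down, one to go: operator==(). Anytime you create a value type, define operator==(). The reason is exactly the same as for the instance Equals() function. The default comparison for a value type, ValueType.Equals(), uses reflection to compare the contents of the two values. That is far less efficient than any implementation you would write yourself, so write your own. Follow the recommendations in Item 17 to avoid boxing when you compare value types.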
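Notice that I did not say you should write operator==() whenever you override the instance Equals(). I said to write operator==() when you create value types. The .NET Framework classes expect operator==() to keep reference semantics for reference types.
C# gives you four ways to test equality, but you need to consider providing your own definitions for only two of them. You never override the static Object.ReferenceEquals() and static Object.Equals(), because they provide the correct tests regardless of the runtime type. You always override the instance Equals() and operator==() for value types to get better performance. You override the instance Equals() for reference types when you want equality to mean something other than object identity. Simple, right?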
Item 9: Understand the Relationships Among ReferenceEquals(), static Equals(), instance Equals(), and operator==
When you create your own types (either classes or structs), you define what equality means for that type. C# provides four different functions that determine whether two different objects are "equal":
public static bool ReferenceEquals( object left, object right );
public static bool Equals( object left, object right );
public virtual bool Equals( object right );
public static bool operator==( MyClass left, MyClass right );
The language enables you to create your own versions of all four of these methods. But just because you can doesn't mean that you should. You should never redefine the first two static functions. You'll often create your own instance Equals() method to define the semantics of your type, and you'll occasionally override operator==(), but only for performance reasons in value types. Furthermore, there are relationships among these four functions, so when you change one, you can affect the behavior of the others. Yes, needing four functions to test equality is complicated. But don't worry: you can simplify it.
Like so many of the complicated elements in C#, this one follows from the fact that C# enables you to create both value types and reference types. Two variables of a reference type are equal if they refer to the same object, referred to as object identity. Two variables of a value type are equal if they are the same type and they contain the same contents. That's why equality tests need so many different methods.
Let's start with the two functions you should never change. Object.ReferenceEquals() returns true if two variables refer to the same object, that is, the two variables have the same object identity. Whether the types being compared are reference types or value types, this method always tests object identity, not object contents. Yes, that means that ReferenceEquals() always returns false when you use it to test equality for value types. Even when you compare a value type to itself, ReferenceEquals() returns false. This is due to boxing, which is covered in Item 16.
int i = 5;
int j = 5;
if ( Object.ReferenceEquals( i, j ))
    Console.WriteLine( "Never happens." );
else
    Console.WriteLine( "Always happens." );
if ( Object.ReferenceEquals( i, i ))
    Console.WriteLine( "Never happens." );
else
    Console.WriteLine( "Always happens." );
You'll never redefine Object.ReferenceEquals() because it does exactly what it is supposed to do: test the object identity of two different variables.
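To make the boxing explanation concrete, here is a minimal sketch. The class name BoxingDemo and the variables first and second are illustrative and do not come from the original text. It boxes the same int twice and then compares the resulting references.

```csharp
using System;

public static class BoxingDemo
{
    public static void Main()
    {
        int i = 5;

        // Each assignment to an object variable boxes i into a new heap object.
        object first = i;
        object second = i;

        // Two separate boxes: different identity, same contents.
        Console.WriteLine(Object.ReferenceEquals(first, second)); // False
        Console.WriteLine(first.Equals(second));                  // True

        // Passing the same box twice involves no further boxing.
        Console.WriteLine(Object.ReferenceEquals(first, first));  // True
    }
}
```

Passing i directly, as in the snippet above, boxes it once per argument, which is why even ReferenceEquals( i, i ) reports false.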
The second function you'll never redefine is static Object.Equals(). This method tests whether two variables are equal when you don't know the runtime type of the two arguments. Remember that System.Object is the ultimate base class for everything in C#. Anytime you compare two variables, they are instances of System.Object. Value types and reference types are instances of System.Object. So how does this method test the equality of two variables, without knowing their type, when equality changes its meaning depending on the type? The answer is simple: This method delegates that responsibility to one of the types in question. The static Object.Equals() method is implemented something like this:
public static bool Equals( object left, object right )
{
    // Check object identity
    if (left == right )
        return true;
    // both null references handled above
    if ((left == null) || (right == null))
        return false;
    return left.Equals (right);
}
This example code introduces both of the methods I have not discussed yet: operator==() and the instance Equals() method. I'll explain both in detail, but I'm not ready to end my discussion of the static Equals() just yet. For right now, I want you to understand that static Equals() uses the instance Equals() method of the left argument to determine whether two objects are equal.
As with ReferenceEquals(), you'll never redefine the static Object.Equals() method because it already does exactly what it needs to do: determines whether two objects are the same when you don't know the runtime type. Because the static Equals() method delegates to the left argument's instance Equals(), it uses the rules for that type.
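The delegation described above can be observed with a small sketch. The Label class is hypothetical and exists only for this illustration; it overrides instance Equals() so that the static Object.Equals() has a type-specific rule to delegate to.

```csharp
using System;

// Hypothetical reference type with value semantics (illustration only).
public sealed class Label
{
    private readonly string text;

    public Label(string text) { this.text = text; }

    public override bool Equals(object right)
    {
        // The class is sealed, so a successful 'as' cast implies the exact type.
        Label other = right as Label;
        return other != null && text == other.text;
    }

    public override int GetHashCode()
    {
        return text == null ? 0 : text.GetHashCode();
    }
}

public static class StaticEqualsDemo
{
    public static void Main()
    {
        object a = new Label("x");
        object b = new Label("x");

        Console.WriteLine(Object.Equals(a, b));          // True: delegates to Label.Equals()
        Console.WriteLine(Object.Equals(null, null));    // True: both references are null
        Console.WriteLine(Object.Equals(a, null));       // False: null is handled before delegating
        Console.WriteLine(Object.ReferenceEquals(a, b)); // False: two distinct objects
    }
}
```

The caller never needs to know that a and b are Label instances, and never needs its own null checks.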
Now you understand why you never need to redefine the static ReferenceEquals() and static Equals() methods. It's time to discuss the methods you will override. But first, let's briefly discuss the mathematical properties of an equality relation. You need to make sure that your definition and implementation are consistent with other programmers' expectations. This means that you need to keep in mind the mathematical properties of equality: Equality is reflexive, symmetric, and transitive. The reflexive property means that any object is equal to itself. No matter what type is involved, a == a is always true. The symmetric property means that order does not matter: If a == b is true, b == a is also true. If a == b is false, b == a is also false. The last property is that if a == b and b == c are both true, then a == c must also be true. That's the transitive property.
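One way to keep these three properties in view is to spot-check them in a test. The helper below is a hypothetical sketch, not part of the .NET Framework; the name EqualityContract and its methods are assumptions made for illustration, and the arguments are assumed to be non-null.

```csharp
using System;

// Hypothetical helper for spot-checking an Equals() implementation.
public static class EqualityContract
{
    // Reflexive: a equals a.
    public static bool IsReflexive(object a)
    {
        return a.Equals(a);
    }

    // Symmetric: a.Equals(b) and b.Equals(a) must agree.
    public static bool IsSymmetric(object a, object b)
    {
        return a.Equals(b) == b.Equals(a);
    }

    // Transitive: if a equals b and b equals c, then a equals c.
    public static bool IsTransitive(object a, object b, object c)
    {
        if (a.Equals(b) && b.Equals(c))
            return a.Equals(c);
        return true; // premise not met, so nothing can be violated
    }

    public static void Main()
    {
        object x = "abc", y = "abc", z = "abc";
        Console.WriteLine(IsReflexive(x));        // True
        Console.WriteLine(IsSymmetric(x, y));     // True
        Console.WriteLine(IsTransitive(x, y, z)); // True
    }
}
```

Checks like these only sample a few values, so they can reveal a broken implementation but cannot prove a correct one.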
Now it's time to discuss the instance Object.Equals() function, including when and how you override it. You create your own instance version of Equals() when the default behavior is inconsistent with your type. The Object.Equals() method uses object identity to determine whether two variables are equal. The default Object.Equals() function behaves exactly the same as Object.ReferenceEquals(). But wait: value types are different. System.ValueType does override Object.Equals(). Remember that ValueType is the base class for all value types that you create (using the struct keyword). Two variables of a value type are equal if they are the same type and they have the same contents. ValueType.Equals() implements that behavior. Unfortunately, ValueType.Equals() does not have an efficient implementation. Because it is defined once, in the base class for all value types, it must compare all the member variables in any derived type without knowing the runtime type of the object. In C#, that means using reflection. As you'll see in Item 44, there are many disadvantages to reflection, especially when performance is a goal. Equality is one of those fundamental constructs that gets called frequently in programs, so performance is a worthy goal. Under almost all circumstances, you can write a much faster override of Equals() for any value type. The recommendation for value types is simple: Always create an override of ValueType.Equals() whenever you create a value type.
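As a minimal sketch of that recommendation, the struct below overrides Equals() with direct field comparisons. Point2D, its fields, and the hash formula are hypothetical choices made for this illustration.

```csharp
using System;

// Hypothetical value type; the name and fields are illustrative.
public struct Point2D
{
    private readonly int x;
    private readonly int y;

    public Point2D(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    // Replaces the general-purpose ValueType.Equals() with direct field comparisons.
    public override bool Equals(object right)
    {
        // 'is' yields false for null and for any other type, so nothing throws.
        if (!(right is Point2D))
            return false;

        Point2D other = (Point2D)right;
        return x == other.x && y == other.y;
    }

    // Equal values must produce equal hash codes (see Item 10).
    public override int GetHashCode()
    {
        unchecked { return (x * 397) ^ y; }
    }
}

public static class Point2DDemo
{
    public static void Main()
    {
        Point2D a = new Point2D(1, 2);
        Point2D b = new Point2D(1, 2);
        Point2D c = new Point2D(3, 4);

        Console.WriteLine(a.Equals(b));    // True
        Console.WriteLine(a.Equals(c));    // False
        Console.WriteLine(a.Equals(null)); // False
    }
}
```

This override still takes an object parameter, so the argument is boxed on each call; it avoids the reflection cost, not the boxing cost.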
You should override the instance Equals() function only when you want to change the defined semantics for a reference type. A number of classes in the .NET Framework Class Library use value semantics instead of reference semantics for equality. Two string objects are equal if they contain the same contents. Two DataRowView objects are equal if they refer to the same DataRow. The point is that if your type should follow value semantics (comparing contents) instead of reference semantics (comparing object identity), you should write your own override of instance Object.Equals().
Now that you know when to write your own override of Object.Equals(), you must understand how you should implement it. The equality relationship for value types has many implications for boxing and is discussed in Item 17. For reference types, your instance method needs to follow predefined behavior to avoid strange surprises for users of your class. Here is the standard pattern:
public class Foo
{
    public override bool Equals( object right )
    {
        // check null:
        // the this pointer is never null in C# methods.
        if (right == null)
            return false;
        if (object.ReferenceEquals( this, right ))
            return true;
        // Discussed below.
        if (this.GetType() != right.GetType())
            return false;
        // Compare this type's contents here:
        return CompareFooMembers( this, right as Foo );
    }
}
First, Equals() should never throw exceptions; it doesn't make much sense. Two variables are or are not equal; there's not much room for other failures. Just return false for all failure conditions, such as null references or the wrong argument types. Now, let's go through this method in detail so you understand why each check is there and why some checks can be left out. The first check determines whether the right-side object is null. There is no corresponding check on the this reference: in C#, this is never null, because the CLR throws an exception before calling any instance method through a null reference. The next check determines whether the two object references are the same, testing object identity. It's a very efficient test, and equal object identity guarantees equal contents.
The next check determines whether the two objects being compared are the same type. The exact form is important. First, notice that it does not assume that this is of type Foo; it calls this.GetType(). The actual type might be a class derived from Foo. Second, the code checks the exact type of objects being compared. It is not enough to ensure that you can convert the right-side parameter to the current type. That test can cause two subtle bugs. Consider the following example involving a small inheritance hierarchy:
public class B
{
    public override bool Equals( object right )
    {
        // check null:
        if (right == null)
            return false;
        // Check reference equality:
        if (object.ReferenceEquals( this, right ))
            return true;
        // Problems here, discussed below.
        B rightAsB = right as B;
        if (rightAsB == null)
            return false;
        return CompareBMembers( this, rightAsB );
    }
}
public class D : B
{
    // etc.
    public override bool Equals( object right )
    {
        // check null:
        if (right == null)
            return false;
        if (object.ReferenceEquals( this, right ))
            return true;
        // Problems here.
        D rightAsD = right as D;
        if (rightAsD == null)
            return false;
        if (base.Equals( rightAsD ) == false)
            return false;
        return CompareDMembers( this, rightAsD );
    }
}
//Test:
B baseObject = new B();
D derivedObject = new D();
// Comparison 1.
if (baseObject.Equals(derivedObject))
    Console.WriteLine( "Equals" );
else
    Console.WriteLine( "Not Equal" );
// Comparison 2.
if (derivedObject.Equals(baseObject))
    Console.WriteLine( "Equals" );
else
    Console.WriteLine( "Not Equal" );
Under any possible circumstances, you would expect to see either Equals or Not Equal printed twice. Because of some errors, this is not the case with the previous code. The second comparison will never return true. The base object, of type B, can never be converted into a D. However, the first comparison might evaluate to true. The derived object, of type D, can be implicitly converted to a type B. If the B members of the right-side argument match the B members of the left-side argument, B.Equals() considers the objects equal. Even though the two objects are different types, your method has considered them equal. You've broken the symmetric property of Equals. This construct broke because of the automatic conversions that take place up and down the inheritance hierarchy.
When you write this, the D object is implicitly converted to a B:
baseObject.Equals( derivedObject )
If baseObject.Equals() determines that the fields defined in its type match, the two objects are equal. On the other hand, when you write this, the B object cannot be converted to a D object:
derivedObject.Equals( baseObject )
The derivedObject.Equals() method therefore always returns false. If you don't check the object types exactly, you can easily get into this situation, in which the order of the comparison matters.
There is another practice to follow when you override Equals(). You should call the base class only if the base version is not provided by System.Object or System.ValueType. The previous code provides an example. Class D calls the Equals() method defined in its base class, class B. However, class B does not call base.Equals(). Doing so would call the version defined in System.Object, which returns true only when the two arguments refer to the same object. That's not what you want, or you wouldn't have written your own method in the first place.
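Putting the exact type check and the base-class rule together, the hierarchy could be revised along the following lines. This is a sketch under stated assumptions: the fields id and tag are hypothetical stand-ins for whatever members the real classes compare, and the inline comparisons take the place of the CompareBMembers() and CompareDMembers() helpers.

```csharp
using System;

public class B
{
    private readonly int id; // hypothetical field

    public B(int id) { this.id = id; }

    public override bool Equals(object right)
    {
        if (right == null)
            return false;
        if (Object.ReferenceEquals(this, right))
            return true;
        // Exact type check: a D never equals a plain B, in either order.
        if (this.GetType() != right.GetType())
            return false;
        // No call to base.Equals(): the base class is System.Object.
        return id == ((B)right).id;
    }

    public override int GetHashCode() { return id; }
}

public class D : B
{
    private readonly string tag; // hypothetical field

    public D(int id, string tag) : base(id) { this.tag = tag; }

    public override bool Equals(object right)
    {
        // B.Equals() covers null, identity, exact type, and B's fields.
        if (!base.Equals(right))
            return false;
        // Safe cast: base.Equals() returned true, so right has this object's exact type.
        return tag == ((D)right).tag;
    }

    public override int GetHashCode()
    {
        return base.GetHashCode() ^ (tag == null ? 0 : tag.GetHashCode());
    }
}

public static class SymmetryDemo
{
    public static void Main()
    {
        B baseObject = new B(1);
        D derivedObject = new D(1, "x");

        Console.WriteLine(baseObject.Equals(derivedObject));    // False
        Console.WriteLine(derivedObject.Equals(baseObject));    // False
        Console.WriteLine(derivedObject.Equals(new D(1, "x"))); // True
    }
}
```

Both mixed comparisons now give the same answer, so the order of the operands no longer matters.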
The rule is to override Equals() whenever you create a value type, and to override Equals() on reference types when you do not want your reference type to obey reference semantics, as defined by System.Object. When you write your own Equals(), follow the implementation just outlined. Overriding Equals() means that you should write an override for GetHashCode(). See Item 10 for details.
Three down, one to go: operator==(). Anytime you create a value type, define operator==(). The reason is exactly the same as with the instance Equals() function. A struct does not get an operator==() automatically, so without one, callers fall back on ValueType.Equals(), which uses reflection to compare the contents of two value types. That's far less efficient than any implementation that you would write, so write your own. Follow the recommendations in Item 17 to avoid boxing when you compare value types.
Notice that I didn't say that you should write operator==() whenever you override instance Equals(). I said to write operator==() when you create value types. You should rarely override operator==() when you create reference types. The .NET Framework classes expect operator==() to follow reference semantics for all reference types.
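A minimal sketch of operator==() on a value type follows. The Money struct is hypothetical. It also implements IEquatable<Money>, a generic interface from later versions of the .NET Framework that the text above does not mention; it is used here on the assumption that a strongly typed Equals() is an acceptable way to avoid boxing.

```csharp
using System;

// Hypothetical value type used only for illustration.
public struct Money : IEquatable<Money>
{
    private readonly long cents;

    public Money(long cents) { this.cents = cents; }

    // Strongly typed comparison: no boxing and no reflection.
    public bool Equals(Money other)
    {
        return cents == other.cents;
    }

    public override bool Equals(object right)
    {
        return right is Money && Equals((Money)right);
    }

    public override int GetHashCode()
    {
        return cents.GetHashCode();
    }

    // C# requires == and != to be defined as a pair.
    public static bool operator ==(Money left, Money right)
    {
        return left.Equals(right);
    }

    public static bool operator !=(Money left, Money right)
    {
        return !left.Equals(right);
    }
}

public static class MoneyDemo
{
    public static void Main()
    {
        Money a = new Money(250);
        Money b = new Money(250);
        Money c = new Money(999);

        Console.WriteLine(a == b);            // True
        Console.WriteLine(a != c);            // True
        Console.WriteLine(a.Equals((object)b)); // True
    }
}
```

Routing both operators through the typed Equals() keeps the three entry points consistent with one another.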
C# gives you four ways to test equality, but you need to consider providing your own definitions for only two of them. You never override the static Object.ReferenceEquals() and static Object.Equals() because they provide the correct tests, regardless of the runtime type. You always override instance Equals() and operator==() for value types to provide better performance. You override instance Equals() for reference types when you want equality to mean something other than object identity. Simple, right?