Java Native Interface Specification

Official JNI documentation: https://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/jniTOC.html

Chapter 3

This chapter discusses how the JNI maps Java types to native C types.

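As Chapter 2 explained, JNI passes primitive types by copying their values. Types derived from jobject are passed to native code as local or global references, and any object a native method returns is likewise a local reference.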

Primitive Types

Table 3-1 describes Java primitive types and their machine-dependent native equivalents.

Table 3-1 Primitive Types and Native Equivalents

Java Type	Native Type	Description
boolean	jboolean	unsigned 8 bits
byte	jbyte	signed 8 bits
char	jchar	unsigned 16 bits
short	jshort	signed 16 bits
int	jint	signed 32 bits
long	jlong	signed 64 bits
float	jfloat	32 bits
double	jdouble	64 bits
void	void	N/A

The following definitions are provided for convenience.

#define JNI_FALSE  0
#define JNI_TRUE   1

The jsize integer type is used to describe cardinal indices and sizes:

typedef jint jsize;
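As a rough illustration of Table 3-1, the sketch below declares stand-in typedefs built on the fixed-width types from <stdint.h> and checks their widths at compile time. The demo_ names and both helper functions are hypothetical and exist only for this example; real code takes jboolean, jint, jsize and the rest from jni.h, whose platform header may pick different underlying types.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the jni.h typedefs (illustration only). */
typedef uint8_t   demo_jboolean;  /* unsigned 8 bits  */
typedef int8_t    demo_jbyte;     /* signed 8 bits    */
typedef uint16_t  demo_jchar;     /* unsigned 16 bits */
typedef int16_t   demo_jshort;    /* signed 16 bits   */
typedef int32_t   demo_jint;      /* signed 32 bits   */
typedef int64_t   demo_jlong;     /* signed 64 bits   */
typedef float     demo_jfloat;    /* 32 bits          */
typedef double    demo_jdouble;   /* 64 bits          */
typedef demo_jint demo_jsize;     /* indices and sizes */

#define DEMO_JNI_FALSE 0
#define DEMO_JNI_TRUE  1

/* Compile-time checks against the widths listed in Table 3-1. */
_Static_assert(sizeof(demo_jchar) * CHAR_BIT == 16, "jchar is 16 bits");
_Static_assert(sizeof(demo_jlong) * CHAR_BIT == 64, "jlong is 64 bits");
_Static_assert(sizeof(demo_jfloat) * CHAR_BIT == 32, "jfloat is 32 bits");
_Static_assert(sizeof(demo_jdouble) * CHAR_BIT == 64, "jdouble is 64 bits");

/* Maps any nonzero C value to DEMO_JNI_TRUE. A plain cast to an 8-bit
   type would turn 256 into 0, so the comparison is done first. */
static demo_jboolean to_jboolean(int condition) {
    return condition ? DEMO_JNI_TRUE : DEMO_JNI_FALSE;
}

/* Converts a C size to the signed 32-bit size type, refusing values
   that do not fit. */
static demo_jsize to_jsize(size_t n) {
    assert(n <= (size_t)INT32_MAX);
    return (demo_jsize)n;
}
```

The to_jboolean helper is there because an 8-bit boolean silently truncates: casting 256 directly yields 0, while the helper yields DEMO_JNI_TRUE.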

Reference Types

The JNI includes a number of reference types that correspond to different kinds of Java objects. JNI reference types are organized in the hierarchy shown in Figure 3-1.

The top of the hierarchy is jobject. Subclasses of jobject are jclass, jstring, jarray and jthrowable. Subclasses of jarray are jobjectArray, jbooleanArray, jbyteArray, jcharArray, jshortArray, jintArray, jlongArray, jfloatArray and jdoubleArray.

Figure 3-1 Reference Type Hierarchy

In C, all other JNI reference types are defined to be the same as jobject. For example:

typedef jobject jclass;

In C++, JNI introduces a set of dummy classes to enforce the subtyping relationship. For example:

class _jobject {};
class _jclass : public _jobject {};
...
typedef _jobject *jobject;
typedef _jclass *jclass;

Field and Method IDs

Method and field IDs are regular C pointer types:

struct _jfieldID;                     /* opaque structure */
typedef struct _jfieldID *jfieldID;   /* field IDs */

struct _jmethodID;                    /* opaque structure */
typedef struct _jmethodID *jmethodID; /* method IDs */

The Value Type

The jvalue union type is used as the element type in argument arrays. It is declared as follows:

typedef union jvalue {
    jboolean z;
    jbyte    b;
    jchar    c;
    jshort   s;
    jint     i;
    jlong    j;
    jfloat   f;
    jdouble  d;
    jobject  l;
} jvalue;
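To make the role of the union concrete, here is a minimal sketch that fills an argument array for a hypothetical method with signature (IJD)V. The demo_jvalue type and the pack_args function are stand-ins invented for this example so that it compiles without jni.h. Each member name is the lowercase form of the corresponding signature character in Table 3-2. In real JNI code, an array like this is what the Call<Type>MethodA family of functions takes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in that mirrors the jvalue layout; real code
   gets jvalue from <jni.h>. */
typedef union demo_jvalue {
    uint8_t  z;  /* boolean, signature Z */
    int8_t   b;  /* byte,    signature B */
    uint16_t c;  /* char,    signature C */
    int16_t  s;  /* short,   signature S */
    int32_t  i;  /* int,     signature I */
    int64_t  j;  /* long,    signature J */
    float    f;  /* float,   signature F */
    double   d;  /* double,  signature D */
    void    *l;  /* any reference type (objects and arrays) */
} demo_jvalue;

/* Fills the argument array for a hypothetical method whose signature
   is (IJD)V: one int, one long, one double. */
static void pack_args(demo_jvalue args[3], int32_t count, int64_t stamp,
                      double ratio) {
    assert(args != NULL);
    args[0].i = count;  /* I */
    args[1].j = stamp;  /* J */
    args[2].d = ratio;  /* D */
}
```

Only the member that matches the parameter's type should be written and later read; the union stores one value per slot, not all nine.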

Type Signatures

The Resolving Native Method Names section of Chapter 2 described how a complete native method name is formed; this section mainly introduces function signatures.

The JNI uses the Java VM’s representation of type signatures. Table 3-2 shows these type signatures.

Table 3-2 Java VM Type Signatures


Type Signature	Java Type
Z	boolean
B	byte
C	char
S	short
I	int
J	long
F	float
D	double
L fully-qualified-class ;	fully-qualified-class
[ type	type[]
( arg-types ) ret-type	method type

For example, the Java method:

long f (int n, String s, int[] arr);

has the following type signature:

(ILjava/lang/String;[I)J

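For example, in the GetMethodID call from Chapter 2, the method signature is passed as the third argument (this one describes a method double f(int, String), not the method shown above):

jmethodID mid = env->GetMethodID(cls, "f", "(ILjava/lang/String;)D");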
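Reading a signature by eye gets error-prone as parameter lists grow, so the sketch below walks a method signature according to Table 3-2 and renders it as a readable Java-style declaration. The functions append_type and describe_method are hypothetical helpers written for this example, not part of JNI. As a simplification they also accept V in parameter position, which a real validator would reject.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Appends the Java name of the type starting at *p to the
   NUL-terminated buffer out, and advances *p past that type.
   Returns 0 on success, -1 on a malformed signature or full buffer. */
static int append_type(const char **p, char *out, size_t cap) {
    int dims = 0;
    while (**p == '[') { dims++; (*p)++; }   /* each '[' adds one "[]" */
    const char *name = NULL;
    size_t len = strlen(out);
    switch (**p) {
    case 'Z': name = "boolean"; break;
    case 'B': name = "byte";    break;
    case 'C': name = "char";    break;
    case 'S': name = "short";   break;
    case 'I': name = "int";     break;
    case 'J': name = "long";    break;
    case 'F': name = "float";   break;
    case 'D': name = "double";  break;
    case 'V': name = "void";    break;
    case 'L': {                               /* L fully-qualified-class ; */
        const char *start = *p + 1;
        const char *end = strchr(start, ';');
        if (end == NULL || end == start) return -1;
        for (const char *c = start; c < end; c++) {
            if (len + 1 >= cap) return -1;
            out[len++] = (*c == '/') ? '.' : *c;
        }
        out[len] = '\0';
        *p = end;                             /* the increment below skips ';' */
        break;
    }
    default:
        return -1;                            /* unknown char or end of string */
    }
    (*p)++;
    if (name != NULL) {
        size_t n = strlen(name);
        if (len + n >= cap) return -1;
        memcpy(out + len, name, n + 1);
        len += n;
    }
    while (dims-- > 0) {
        if (len + 2 >= cap) return -1;
        memcpy(out + len, "[]", 3);
        len += 2;
    }
    return 0;
}

/* Renders "( arg-types ) ret-type" as "ret-type (arg, arg, ...)". */
static int describe_method(const char *sig, char *out, size_t cap) {
    assert(sig != NULL && out != NULL && cap > 0);
    char params[256] = "";
    char ret[128] = "";
    if (*sig++ != '(') return -1;
    int first = 1;
    while (*sig != ')') {
        if (*sig == '\0') return -1;          /* missing ')' */
        if (!first) {
            size_t len = strlen(params);
            if (len + 2 >= sizeof params) return -1;
            memcpy(params + len, ", ", 3);
        }
        if (append_type(&sig, params, sizeof params) != 0) return -1;
        first = 0;
    }
    sig++;                                    /* skip ')' */
    if (append_type(&sig, ret, sizeof ret) != 0 || *sig != '\0') return -1;
    int n = snprintf(out, cap, "%s (%s)", ret, params);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

Called on "(ILjava/lang/String;[I)J", describe_method writes "long (int, java.lang.String, int[])" into the buffer, which matches the Java method long f (int n, String s, int[] arr) given earlier.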

Modified UTF-8 Strings

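Note that JNI encodes strings as UTF-8, in the modified form described below. The rest of this section is quoted from the specification and describes that encoding format.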

The JNI uses modified UTF-8 strings to represent various string types. Modified UTF-8 strings are the same as those used by the Java VM. Modified UTF-8 strings are encoded so that character sequences that contain only non-null ASCII characters can be represented using only one byte per character, but all Unicode characters can be represented.

All characters in the range '\u0001' to '\u007F' are represented by a single byte, as follows:

0	bits 6-0

The seven bits of data in the byte give the value of the character represented.

The null character ('\u0000') and characters in the range '\u0080' to '\u07FF' are represented by a pair of bytes x and y:

x:	1 1 0	bits 10-6
y:	1 0	bits 5-0

The bytes represent the character with the value ((x & 0x1f) << 6) + (y & 0x3f).
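The two-byte form can be sketched in a few lines of C. The functions encode_two_bytes and decode_two_bytes are hypothetical names chosen for this example; the decoder applies the formula above verbatim, and the encoder builds the 1 1 0 and 1 0 prefixes from the bit layout.

```c
#include <assert.h>
#include <stdint.h>

/* Encodes '\u0000' or a character in '\u0080'..'\u07FF' as the
   byte pair x, y. */
static void encode_two_bytes(uint16_t c, uint8_t *x, uint8_t *y) {
    assert(c == 0 || (c >= 0x80 && c <= 0x7FF));
    *x = (uint8_t)(0xC0 | (c >> 6));     /* 1 1 0 followed by bits 10-6 */
    *y = (uint8_t)(0x80 | (c & 0x3F));   /* 1 0 followed by bits 5-0    */
}

/* Recovers the character value from a two-byte sequence. */
static uint16_t decode_two_bytes(uint8_t x, uint8_t y) {
    /* x must look like 110xxxxx and y like 10xxxxxx. */
    assert((x & 0xE0) == 0xC0 && (y & 0xC0) == 0x80);
    return (uint16_t)(((x & 0x1f) << 6) + (y & 0x3f));
}
```

For instance, '\u00E9' encodes to the bytes 0xC3 0xA9, and the null character encodes to 0xC0 0x80 rather than to a single zero byte.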

Characters in the range '\u0800' to '\uFFFF' are represented by 3 bytes x, y, and z:

x:	1 1 1 0	bits 15-12
y:	1 0	bits 11-6
z:	1 0	bits 5-0

The character with the value ((x & 0xf) << 12) + ((y & 0x3f) << 6) + (z & 0x3f) is represented by the bytes.
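Putting the one-, two- and three-byte rules together, the following sketch encodes a single 16-bit character and decodes the three-byte form with the formula above. The names encode_unit and decode_three_bytes are hypothetical, chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encodes one 16-bit character as modified UTF-8.
   Returns the number of bytes written to out (1, 2 or 3). */
static size_t encode_unit(uint16_t c, uint8_t out[3]) {
    assert(out != NULL);
    if (c >= 0x0001 && c <= 0x007F) {       /* single byte: 0 + bits 6-0 */
        out[0] = (uint8_t)c;
        return 1;
    }
    if (c <= 0x07FF) {                      /* includes '\u0000' */
        out[0] = (uint8_t)(0xC0 | (c >> 6));
        out[1] = (uint8_t)(0x80 | (c & 0x3F));
        return 2;
    }
    out[0] = (uint8_t)(0xE0 | (c >> 12));           /* 1 1 1 0 + bits 15-12 */
    out[1] = (uint8_t)(0x80 | ((c >> 6) & 0x3F));   /* 1 0 + bits 11-6      */
    out[2] = (uint8_t)(0x80 | (c & 0x3F));          /* 1 0 + bits 5-0       */
    return 3;
}

/* Recovers the character value from a three-byte sequence. */
static uint16_t decode_three_bytes(uint8_t x, uint8_t y, uint8_t z) {
    return (uint16_t)(((x & 0xf) << 12) + ((y & 0x3f) << 6) + (z & 0x3f));
}
```

As an example, '\u20AC' (the euro sign) encodes to 0xE2 0x82 0xAC. UTF-16 surrogate code units (0xD800 to 0xDFFF) also fall into the three-byte branch.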

Characters with code points above U+FFFF (so-called supplementary characters) are represented by separately encoding the two surrogate code units of their UTF-16 representation. Each of the surrogate code units is represented by three bytes. This means that supplementary characters are represented by six bytes, u, v, w, x, y, and z:

u:	1 1 1 0 1 1 0 1
v:	1 0 1 0	(bits 20-16) - 1
w:	1 0	bits 15-10
x:	1 1 1 0 1 1 0 1
y:	1 0 1 1	bits 9-6
z:	1 0	bits 5-0

The character with the value 0x10000 + ((v & 0x0f) << 16) + ((w & 0x3f) << 10) + ((y & 0x0f) << 6) + (z & 0x3f) is represented by the six bytes.
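The six-byte form is easiest to follow by going through the surrogate pair explicitly. The sketch below, using the hypothetical function names encode_supplementary and decode_supplementary, splits a supplementary code point into its two UTF-16 surrogates, writes each as three bytes, and decodes the result with the formula above.

```c
#include <assert.h>
#include <stdint.h>

/* Encodes a supplementary code point (U+10000..U+10FFFF) as the six
   bytes u, v, w, x, y, z. */
static void encode_supplementary(uint32_t cp, uint8_t out[6]) {
    assert(cp >= 0x10000 && cp <= 0x10FFFF);
    uint32_t offset = cp - 0x10000;                          /* 20-bit value   */
    uint16_t hi = (uint16_t)(0xD800 | (offset >> 10));       /* high surrogate */
    uint16_t lo = (uint16_t)(0xDC00 | (offset & 0x3FF));     /* low surrogate  */
    out[0] = 0xED;                                   /* u: 1 1 1 0 1 1 0 1 */
    out[1] = (uint8_t)(0x80 | ((hi >> 6) & 0x3F));   /* v */
    out[2] = (uint8_t)(0x80 | (hi & 0x3F));          /* w */
    out[3] = 0xED;                                   /* x: 1 1 1 0 1 1 0 1 */
    out[4] = (uint8_t)(0x80 | ((lo >> 6) & 0x3F));   /* y */
    out[5] = (uint8_t)(0x80 | (lo & 0x3F));          /* z */
}

/* Recovers the code point from the six bytes. */
static uint32_t decode_supplementary(const uint8_t in[6]) {
    assert(in[0] == 0xED && in[3] == 0xED);
    return 0x10000
         + ((uint32_t)(in[1] & 0x0f) << 16)
         + ((uint32_t)(in[2] & 0x3f) << 10)
         + ((uint32_t)(in[4] & 0x0f) << 6)
         + (uint32_t)(in[5] & 0x3f);
}
```

For U+1F600 the surrogates are 0xD83D and 0xDE00, so the six bytes are 0xED 0xA0 0xBD 0xED 0xB8 0x80.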

The bytes of multibyte characters are stored in the class file in big-endian (high byte first) order.

There are two differences between this format and the standard UTF-8 format. First, the null character (char)0 is encoded using the two-byte format rather than the one-byte format. This means that modified UTF-8 strings never have embedded nulls. Second, only the one-byte, two-byte, and three-byte formats of standard UTF-8 are used. The Java VM does not recognize the four-byte format of standard UTF-8; it uses its own two-times-three-byte format instead.
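The two differences show up directly in encoded lengths. As a small sketch (utf8_len and modified_utf8_len are hypothetical helper names), the functions below return how many bytes one Unicode code point occupies in each format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for one code point in standard UTF-8. */
static size_t utf8_len(uint32_t cp) {
    assert(cp <= 0x10FFFF);
    if (cp <= 0x7F)   return 1;
    if (cp <= 0x7FF)  return 2;
    if (cp <= 0xFFFF) return 3;
    return 4;
}

/* Bytes needed for one code point in modified UTF-8. */
static size_t modified_utf8_len(uint32_t cp) {
    assert(cp <= 0x10FFFF);
    if (cp == 0)     return 2;   /* null uses the two-byte format  */
    if (cp > 0xFFFF) return 6;   /* two surrogates, three bytes each */
    return utf8_len(cp);         /* otherwise the formats agree    */
}
```

The two formats differ in length only for the null character and for supplementary characters, so a buffer sized for standard UTF-8 can be too small for text that contains either.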

For more information regarding the standard UTF-8 format, see section 3.9 Unicode Encoding Forms of The Unicode Standard, Version 4.0.

Original post: https://www.cnblogs.com/hgwang/p/14437984.html